Sample records for acid sequence diversity

  1. Sequence Diversity Diagram for comparative analysis of multiple sequence alignments.

    PubMed

    Sakai, Ryo; Aerts, Jan

    2014-01-01

    The sequence logo is a graphical representation of a set of aligned sequences, commonly used to depict conservation of amino acid or nucleotide sequences. Although it effectively communicates the amount of information present at every position, this visual representation falls short when the domain task is to compare between two or more sets of aligned sequences. We present a new visual presentation called a Sequence Diversity Diagram and validate our design choices with a case study. Our software was developed using the open-source program called Processing. It loads multiple sequence alignment FASTA files and a configuration file, which can be modified as needed to change the visualization. The redesigned figure improves on the visual comparison of two or more sets, and it additionally encodes information on sequential position conservation. In our case study of the adenylate kinase lid domain, the Sequence Diversity Diagram reveals unexpected patterns and new insights, for example the identification of subgroups within the protein subfamily. Our future work will integrate this visual encoding into interactive visualization tools to support higher level data exploration tasks.
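
    The per-position "amount of information" that a sequence logo conveys is commonly computed as the maximum entropy of the alphabet minus the observed Shannon entropy of the column. The following is a minimal sketch of that calculation for two sets of aligned sequences, not the authors' Processing software. The function names and the toy alignments are hypothetical, and the small-sample correction used by some logo tools is omitted.

```python
import math
from collections import Counter


def column_information(column, alphabet_size=20):
    """Information content (bits) of one alignment column, ignoring gaps."""
    residues = [r for r in column if r != "-"]
    if not residues:
        return 0.0
    counts = Counter(residues)
    total = len(residues)
    # Shannon entropy of the observed residue frequencies
    entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
    # Maximum possible entropy minus observed entropy
    return math.log2(alphabet_size) - entropy


def alignment_information(sequences, alphabet_size=20):
    """Per-position information for equal-length aligned sequences."""
    return [column_information(col, alphabet_size) for col in zip(*sequences)]


# Hypothetical toy alignments standing in for two sets to be compared
set_a = ["GKGT", "GKGS", "GKGT"]
set_b = ["GRGT", "GKAT", "GHGT"]
info_a = alignment_information(set_a)
info_b = alignment_information(set_b)

for pos, (a, b) in enumerate(zip(info_a, info_b), start=1):
    print(f"position {pos}: set A {a:.2f} bits, set B {b:.2f} bits")
```

    Printing the two profiles side by side shows which positions are conserved in one set but variable in the other, which is the comparison task the abstract describes.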

  2. Sequence diversity and evolution of antimicrobial peptides in invertebrates.

    PubMed

    Tassanakajon, Anchalee; Somboonwiwat, Kunlaya; Amparyup, Piti

    2015-02-01

    Antimicrobial peptides (AMPs) are evolutionarily ancient molecules that act as the key components in the invertebrate innate immunity against invading pathogens. Several AMPs have been identified and characterized in invertebrates, and found to display considerable diversity in their amino acid sequence, structure and biological activity. AMP genes appear to have rapidly evolved, which might have arisen from the co-evolutionary arms race between host and pathogens, and enabled organisms to survive in different microbial environments. Here, the sequence diversity of invertebrate AMPs (defensins, cecropins, crustins and anti-lipopolysaccharide factors) is presented to provide a better understanding of the evolution pattern of these peptides that play a major role in host defense mechanisms. Copyright © 2014 Elsevier Ltd. All rights reserved.

  3. Characterization of the Genetic Diversity of Acid Lime (Citrus aurantifolia (Christm.) Swingle) Cultivars of Eastern Nepal Using Inter-Simple Sequence Repeat Markers.

    PubMed

    Munankarmi, Nabin Narayan; Rana, Neesha; Bhattarai, Tribikram; Shrestha, Ram Lal; Joshi, Bal Krishna; Baral, Bikash; Shrestha, Sangita

    2018-06-12

    Acid lime (Citrus aurantifolia (Christm.) Swingle) is an important fruit crop, which has high commercial value and is cultivated in 60 out of the 77 districts representing all geographical landscapes of Nepal. A lack of improved high-yielding varieties, infestation with various diseases and pests, as well as poor management practices might have contributed to its extremely reduced productivity, which necessitates a reliable understanding of genetic diversity in existing cultivars. Hereby, we aim to characterize the genetic diversity of acid lime cultivars cultivated at three different agro-ecological gradients of eastern Nepal, employing PCR-based inter-simple sequence repeat (ISSR) markers. Altogether, 21 polymorphic ISSR markers were used to assess the genetic diversity in 60 acid lime cultivars sampled from different geographical locations. Analysis of binary data matrix was performed on the basis of bands obtained, and principal coordinate analysis and phenogram construction were performed using different computer algorithms. ISSR profiling yielded 234 amplicons, of which 87.18% were polymorphic. The number of amplified fragments ranged from 7 to 18, with amplicon size ranging from ca. 250 to 3200 bp. The Numerical Taxonomy and Multivariate System (NTSYS)-based cluster analysis using the unweighted pair group method of arithmetic averages (UPGMA) algorithm and Dice similarity coefficient separated 60 cultivars into two major and three minor clusters. Genetic diversity analysis using Popgene ver. 1.32 revealed the highest percentage of polymorphic bands (PPB), Nei’s genetic diversity (H), and Shannon’s information index (I) for the Terai zone (PPB = 69.66%; H = 0.215; I = 0.325), and the lowest of all three for the high hill zone (PPB = 55.13%; H = 0.173; I = 0.262). Thus, our data indicate that the ISSR marker has been successfully employed for evaluating the genetic diversity of Nepalese acid lime cultivars and has furnished valuable information on
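
    Two of the quantities named in this record, the Dice similarity coefficient and the percentage of polymorphic bands (PPB), can be computed directly from a binary band matrix. The sketch below is only an illustration of those two definitions, not the NTSYS or Popgene implementation. The function names and the 3 x 5 toy matrix are hypothetical. For reference, the reported 87.18% of 234 amplicons is consistent with 204 polymorphic bands.

```python
def dice_similarity(profile_x, profile_y):
    """Dice coefficient 2a / (2a + b + c) for two 0/1 band profiles."""
    shared = sum(1 for x, y in zip(profile_x, profile_y) if x == 1 and y == 1)
    total = sum(profile_x) + sum(profile_y)  # equals 2a + b + c
    return 2 * shared / total if total else 0.0


def percent_polymorphic(band_matrix):
    """Share of band positions that are present in some, but not all, rows."""
    columns = list(zip(*band_matrix))
    polymorphic = sum(1 for col in columns if 0 < sum(col) < len(col))
    return 100.0 * polymorphic / len(columns)


# Hypothetical band matrix: rows are cultivars, columns are band positions
bands = [
    [1, 1, 0, 1, 0],
    [1, 0, 0, 1, 1],
    [1, 1, 1, 1, 0],
]

print(dice_similarity(bands[0], bands[1]))
print(percent_polymorphic(bands))
```

    A full pairwise Dice matrix built this way is the usual input to UPGMA clustering.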

  4. Sequences Of Amino Acids For Human Serum Albumin

    NASA Technical Reports Server (NTRS)

    Carter, Daniel C.

    1992-01-01

    Sequences of amino acids defined for use in making polypeptides one-third to one-sixth as large as parent human serum albumin molecule. Smaller, chemically stable peptides have diverse applications including service as artificial human serum and as active components of biosensors and chromatographic matrices. In applications involving production of artificial sera from new sequences, little or no concern about viral contaminants. Smaller genetically engineered polypeptides more easily expressed and produced in large quantities, making commercial isolation and production more feasible and profitable.

  5. Next-Generation Sequencing Reveals Significant Bacterial Diversity of Botrytized Wine

    PubMed Central

    Bokulich, Nicholas A.; Joseph, C. M. Lucy; Allen, Greg; Benson, Andrew K.; Mills, David A.

    2012-01-01

    While wine fermentation has long been known to involve complex microbial communities, the composition and role of bacteria other than a select set of lactic acid bacteria (LAB) have often been assumed either negligible or detrimental. This study served as a pilot study for using barcoded amplicon next-generation sequencing to profile bacterial community structure in wines and grape musts, comparing the taxonomic depth achieved by sequencing two different domains of prokaryotic 16S rDNA (V4 and V5). This study was designed to serve two goals: 1) to empirically determine the most taxonomically informative 16S rDNA target region for barcoded amplicon sequencing of wine, comparing V4 and V5 domains of bacterial 16S rDNA to terminal restriction fragment length polymorphism (TRFLP) of LAB communities; and 2) to explore the bacterial communities of wine fermentation to better understand the biodiversity of wine at a depth previously unattainable using other techniques. Analysis of amplicons from the V4 and V5 provided similar views of the bacterial communities of botrytized wine fermentations, revealing a broad diversity of low-abundance taxa not traditionally associated with wine, as well as atypical LAB communities initially detected by TRFLP. The V4 domain was determined as the more suitable read for wine ecology studies, as it provided greater taxonomic depth for profiling LAB communities. In addition, targeted enrichment was used to isolate two species of Alphaproteobacteria from a finished fermentation. Significant differences in diversity between inoculated and uninoculated samples suggest that Saccharomyces inoculation exerts selective pressure on bacterial diversity in these fermentations, most notably suppressing abundance of acetic acid bacteria. These results determine the bacterial diversity of botrytized wines to be far higher than previously realized, providing further insight into the fermentation dynamics of these wines, and demonstrate the utility of next

  6. Diversity of Functionally Permissive Sequences in the Receptor-Binding Site of Influenza Hemagglutinin.

    PubMed

    Wu, Nicholas C; Xie, Jia; Zheng, Tianqing; Nycholat, Corwin M; Grande, Geramie; Paulson, James C; Lerner, Richard A; Wilson, Ian A

    2017-06-14

    Influenza A virus hemagglutinin (HA) initiates viral entry by engaging host receptor sialylated glycans via its receptor-binding site (RBS). The amino acid sequence of the RBS naturally varies across avian and human influenza virus subtypes and is also evolvable. However, functional sequence diversity in the RBS has not been fully explored. Here, we performed a large-scale mutational analysis of the RBS of A/WSN/33 (H1N1) and A/Hong Kong/1/1968 (H3N2) HAs. Many replication-competent mutants not yet observed in nature were identified, including some that could escape from an RBS-targeted broadly neutralizing antibody. This functional sequence diversity is made possible by pervasive epistasis in the RBS 220-loop and can be buffered by avidity in viral receptor binding. Overall, our study reveals that the HA RBS can accommodate a much greater range of sequence diversity than previously thought, which has significant implications for the complex evolutionary interrelationships between receptor specificity and immune escape. Copyright © 2017 Elsevier Inc. All rights reserved.

  7. Composition for nucleic acid sequencing

    DOEpatents

    Korlach, Jonas [Ithaca, NY]; Webb, Watt W [Ithaca, NY]; Levene, Michael [Ithaca, NY]; Turner, Stephen [Ithaca, NY]; Craighead, Harold G [Ithaca, NY]; Foquet, Mathieu [Ithaca, NY]

    2008-08-26

    The present invention is directed to a method of sequencing a target nucleic acid molecule having a plurality of bases. In its principle, the temporal order of base additions during the polymerization reaction is measured on a molecule of nucleic acid, i.e. the activity of a nucleic acid polymerizing enzyme on the template nucleic acid molecule to be sequenced is followed in real time. The sequence is deduced by identifying which base is being incorporated into the growing complementary strand of the target nucleic acid by the catalytic activity of the nucleic acid polymerizing enzyme at each step in the sequence of base additions. A polymerase on the target nucleic acid molecule complex is provided in a position suitable to move along the target nucleic acid molecule and extend the oligonucleotide primer at an active site. A plurality of labelled types of nucleotide analogs are provided proximate to the active site, with each distinguishable type of nucleotide analog being complementary to a different nucleotide in the target nucleic acid sequence. The growing nucleic acid strand is extended by using the polymerase to add a nucleotide analog to the nucleic acid strand at the active site, where the nucleotide analog being added is complementary to the nucleotide of the target nucleic acid at the active site. The nucleotide analog added to the oligonucleotide primer as a result of the polymerizing step is identified. The steps of providing labelled nucleotide analogs, polymerizing the growing nucleic acid strand, and identifying the added nucleotide analog are repeated so that the nucleic acid strand is further extended and the sequence of the target nucleic acid is determined.

  8. Diversity and Activity of Alternative Nitrogenases in Sequenced Genomes and Coastal Environments

    PubMed Central

    McRose, Darcy L.; Zhang, Xinning; Kraepiel, Anne M. L.; Morel, François M. M.

    2017-01-01

    The nitrogenase enzyme, which catalyzes the reduction of N2 gas to NH4+, occurs as three separate isozymes that use Mo, Fe-only, or V. The majority of global nitrogen fixation is attributed to the more efficient ‘canonical’ Mo-nitrogenase, whereas Fe-only and V-(‘alternative’) nitrogenases are often considered ‘backup’ enzymes, used when Mo is limiting. Yet, the environmental distribution and diversity of alternative nitrogenases remain largely unknown. We searched for alternative nitrogenase genes in sequenced genomes and used PacBio sequencing to explore the diversity of canonical (nifD) and alternative (anfD and vnfD) nitrogenase amplicons in two coastal environments: the Florida Everglades and Sippewissett Marsh (MA). Genome-based searches identified an additional 25 species and 10 genera not previously known to encode alternative nitrogenases. Alternative nitrogenase amplicons were found in both Sippewissett Marsh and the Florida Everglades and their activity was further confirmed using newly developed isotopic techniques. Conserved amino acid sequences corresponding to cofactor ligands were also analyzed in anfD and vnfD amplicons, offering insight into environmental variants of these motifs. This study increases the number of available anfD and vnfD sequences ∼20-fold and allows for the first comparisons of environmental Mo-, Fe-only, and V-nitrogenase diversity. Our results suggest that alternative nitrogenases are maintained across a range of organisms and environments and that they can make important contributions to nitrogenase diversity and nitrogen fixation. PMID:28293220

  9. Diversity and Activity of Alternative Nitrogenases in Sequenced Genomes and Coastal Environments.

    PubMed

    McRose, Darcy L; Zhang, Xinning; Kraepiel, Anne M L; Morel, François M M

    2017-01-01

    The nitrogenase enzyme, which catalyzes the reduction of N2 gas to NH4+, occurs as three separate isozymes that use Mo, Fe-only, or V. The majority of global nitrogen fixation is attributed to the more efficient 'canonical' Mo-nitrogenase, whereas Fe-only and V-('alternative') nitrogenases are often considered 'backup' enzymes, used when Mo is limiting. Yet, the environmental distribution and diversity of alternative nitrogenases remain largely unknown. We searched for alternative nitrogenase genes in sequenced genomes and used PacBio sequencing to explore the diversity of canonical (nifD) and alternative (anfD and vnfD) nitrogenase amplicons in two coastal environments: the Florida Everglades and Sippewissett Marsh (MA). Genome-based searches identified an additional 25 species and 10 genera not previously known to encode alternative nitrogenases. Alternative nitrogenase amplicons were found in both Sippewissett Marsh and the Florida Everglades and their activity was further confirmed using newly developed isotopic techniques. Conserved amino acid sequences corresponding to cofactor ligands were also analyzed in anfD and vnfD amplicons, offering insight into environmental variants of these motifs. This study increases the number of available anfD and vnfD sequences ∼20-fold and allows for the first comparisons of environmental Mo-, Fe-only, and V-nitrogenase diversity. Our results suggest that alternative nitrogenases are maintained across a range of organisms and environments and that they can make important contributions to nitrogenase diversity and nitrogen fixation.

  10. Chip-based sequencing nucleic acids

    DOEpatents

    Beer, Neil Reginald

    2014-08-26

    A system for fast DNA sequencing by amplification of genetic material within microreactors, denaturing, demulsifying, and then sequencing the material, while retaining it in a PCR/sequencing zone by a magnetic field. One embodiment includes sequencing nucleic acids on a microchip that includes a microchannel flow channel in the microchip. The nucleic acids are isolated and hybridized to magnetic nanoparticles or to magnetic polystyrene-coated beads. Microreactor droplets are formed in the microchannel flow channel. The microreactor droplets containing the nucleic acids and the magnetic nanoparticles are retained in a magnetic trap in the microchannel flow channel and sequenced.

  11. Fatty Acid Diversity is Not Associated with Neutral Genetic Diversity in Native Populations of the Biodiesel Plant Jatropha curcas L.

    PubMed

    Martínez-Díaz, Yesenia; González-Rodríguez, Antonio; Rico-Ponce, Héctor Rómulo; Rocha-Ramírez, Víctor; Ovando-Medina, Isidro; Espinosa-García, Francisco J

    2017-01-01

    Jatropha curcas L. (Euphorbiaceae) is a shrub native to Mexico and Central America, which produces seeds with a high oil content that can be converted to biodiesel. The genetic diversity of this plant has been widely studied, but it is not known whether the diversity of the seed oil chemical composition correlates with neutral genetic diversity. The total seed oil content and the diversity of profiles of fatty acids and phorbol esters were quantified, and the genetic diversity obtained from simple sequence repeats was analyzed, in native populations of J. curcas in Mexico. Using the fatty acid profiles, a discriminant analysis recognized three groups of individuals according to geographical origin. Bayesian assignment analysis revealed two genetic groups, while the genetic structure of the populations could not be explained by isolation-by-distance. Genetic and fatty acid profile data were not correlated based on the Mantel test. Also, phorbol ester content and genetic diversity were not associated. Multiple linear regression analysis showed that total oil content was associated with altitude and seasonality of temperature. The content of unsaturated fatty acids was associated with altitude. Therefore, the cultivation planning of J. curcas should take into account chemical variation related to environmental factors. © 2017 Wiley-VHCA AG, Zurich, Switzerland.

  12. Method for sequencing nucleic acid molecules

    DOEpatents

    Korlach, Jonas; Webb, Watt W.; Levene, Michael; Turner, Stephen; Craighead, Harold G.; Foquet, Mathieu

    2006-06-06

    The present invention is directed to a method of sequencing a target nucleic acid molecule having a plurality of bases. In its principle, the temporal order of base additions during the polymerization reaction is measured on a molecule of nucleic acid, i.e. the activity of a nucleic acid polymerizing enzyme on the template nucleic acid molecule to be sequenced is followed in real time. The sequence is deduced by identifying which base is being incorporated into the growing complementary strand of the target nucleic acid by the catalytic activity of the nucleic acid polymerizing enzyme at each step in the sequence of base additions. A polymerase on the target nucleic acid molecule complex is provided in a position suitable to move along the target nucleic acid molecule and extend the oligonucleotide primer at an active site. A plurality of labelled types of nucleotide analogs are provided proximate to the active site, with each distinguishable type of nucleotide analog being complementary to a different nucleotide in the target nucleic acid sequence. The growing nucleic acid strand is extended by using the polymerase to add a nucleotide analog to the nucleic acid strand at the active site, where the nucleotide analog being added is complementary to the nucleotide of the target nucleic acid at the active site. The nucleotide analog added to the oligonucleotide primer as a result of the polymerizing step is identified. The steps of providing labelled nucleotide analogs, polymerizing the growing nucleic acid strand, and identifying the added nucleotide analog are repeated so that the nucleic acid strand is further extended and the sequence of the target nucleic acid is determined.

  13. Method for sequencing nucleic acid molecules

    DOEpatents

    Korlach, Jonas; Webb, Watt W.; Levene, Michael; Turner, Stephen; Craighead, Harold G.; Foquet, Mathieu

    2006-05-30

    The present invention is directed to a method of sequencing a target nucleic acid molecule having a plurality of bases. In its principle, the temporal order of base additions during the polymerization reaction is measured on a molecule of nucleic acid, i.e. the activity of a nucleic acid polymerizing enzyme on the template nucleic acid molecule to be sequenced is followed in real time. The sequence is deduced by identifying which base is being incorporated into the growing complementary strand of the target nucleic acid by the catalytic activity of the nucleic acid polymerizing enzyme at each step in the sequence of base additions. A polymerase on the target nucleic acid molecule complex is provided in a position suitable to move along the target nucleic acid molecule and extend the oligonucleotide primer at an active site. A plurality of labelled types of nucleotide analogs are provided proximate to the active site, with each distinguishable type of nucleotide analog being complementary to a different nucleotide in the target nucleic acid sequence. The growing nucleic acid strand is extended by using the polymerase to add a nucleotide analog to the nucleic acid strand at the active site, where the nucleotide analog being added is complementary to the nucleotide of the target nucleic acid at the active site. The nucleotide analog added to the oligonucleotide primer as a result of the polymerizing step is identified. The steps of providing labelled nucleotide analogs, polymerizing the growing nucleic acid strand, and identifying the added nucleotide analog are repeated so that the nucleic acid strand is further extended and the sequence of the target nucleic acid is determined.

  14. Prevalence, distribution, and sequence diversity of hmwA among commensal and otitis media non-typeable Haemophilus influenzae.

    PubMed

    Davis, Gregg S; Patel, May; Hammond, James; Zhang, Lixin; Dawid, Suzanne; Marrs, Carl F; Gilsdorf, Janet R

    2014-12-01

    Nontypeable Haemophilus influenzae (NTHi) are Gram-negative coccobacilli that colonize the human pharynx, their only known natural reservoir. Adherence to the host epithelium facilitates NTHi colonization and marks one of the first steps in NTHi pathogenesis. Epithelial cell attachment is mediated, in part, by a pair of high molecular weight (HMW) adhesins that are highly immunogenic, antigenically diverse, and display a wide range of amino acid diversity both within and between isolates. In this study, the prevalence of hmwA, which encodes the HMW adhesin, was determined for a collection of 170 NTHi isolates recovered from the middle ears of children with otitis media (OM isolates) or throats or nasopharynges of healthy children (commensal isolates) from Finland, Israel, and the U.S. Overall, hmwA was detected in 61% of NTHi isolates and was significantly more prevalent (P=0.004) among OM isolates than among commensal isolates; the prevalence ratio comparing hmwA prevalence among ear isolates with that of commensal isolates was 1.47 (95% CI (1.12, 1.92)). Ninety-five percent (98/103) of the hmwA-positive NTHi isolates possessed two hmw loci. To advance our understanding of hmwA binding sequence diversity, we determined the DNA sequence of the hmwA binding region of 33 isolates from this collection. The average amino acid identity across all hmwA sequences was 62%. Phylogenetic analyses of the hmwA binding revealed four distinct sequence clusters, and the majority of hmwA sequences (83%) belonged to one of two dominant sequence clusters. hmwA sequences did not cluster by chromosomal location, geographic region, or disease status. Copyright © 2014 Elsevier B.V. All rights reserved.
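
    An "average amino acid identity" such as the 62% reported here is typically a mean over all pairwise comparisons of aligned sequences. The abstract does not state how identity was calculated, so the sketch below assumes one common definition: matches divided by the number of columns in which both sequences are ungapped. The function names and the toy alignment are hypothetical.

```python
from itertools import combinations


def percent_identity(seq_a, seq_b):
    """Percent identity over columns where neither aligned sequence has a gap."""
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not pairs:
        return 0.0
    matches = sum(1 for a, b in pairs if a == b)
    return 100.0 * matches / len(pairs)


def mean_pairwise_identity(aligned):
    """Average percent identity across every pair of aligned sequences."""
    scores = [percent_identity(a, b) for a, b in combinations(aligned, 2)]
    return sum(scores) / len(scores)


# Hypothetical toy alignment (gap shown as "-")
toy_alignment = ["MKTAY", "MKSAY", "MRTA-"]
print(round(mean_pairwise_identity(toy_alignment), 1))
```

    Other denominators (alignment length, shorter sequence length) give different values, so the definition should be stated alongside any reported figure.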

  15. Insights into the diversity of eukaryotes in acid mine drainage biofilm communities.

    PubMed

    Baker, Brett J; Tyson, Gene W; Goosherst, Lindsey; Banfield, Jillian F

    2009-04-01

    Microscopic eukaryotes are known to have important ecosystem functions, but their diversity in most environments remains vastly unexplored. Here we analyzed an 18S rRNA gene library from a subsurface iron- and sulfur-oxidizing microbial community growing in highly acidic (pH < 0.9) runoff within the Richmond Mine at Iron Mountain (northern California). Phylogenetic analysis revealed that the majority (68%) of the sequences belonged to fungi. Protists falling into the deeply branching lineage named the acidophilic protist clade (APC) and the class Heterolobosea were also present. The APC group represents kingdom-level novelty, with <76% sequence similarity to 18S rRNA gene sequences of organisms from other environments. Fluorescently labeled oligonucleotide rRNA probes were designed to target each of these groups in biofilm samples, enabling abundance and morphological characterization. Results revealed that the populations vary significantly with the habitat and no group is ubiquitous. Surprisingly, many of the eukaryotic lineages (with the exception of the APC) are closely related to neutrophiles, suggesting that they recently adapted to this extreme environment. Molecular analyses presented here confirm that the number of eukaryotic species associated with the acid mine drainage (AMD) communities is low. This finding is consistent with previous results showing a limited diversity of archaea, bacteria, and viruses in AMD environments and suggests that the environmental pressures and interplay between the members of these communities limit species diversity at all trophic levels.

  16. Method for identifying and quantifying nucleic acid sequence aberrations

    DOEpatents

    Lucas, Joe N.; Straume, Tore; Bogen, Kenneth T.

    1998-01-01

    A method for detecting nucleic acid sequence aberrations by detecting nucleic acid sequences having both a first and a second nucleic acid sequence type, the presence of the first and second sequence type on the same nucleic acid sequence indicating the presence of a nucleic acid sequence aberration. The method uses a first hybridization probe which includes a nucleic acid sequence that is complementary to a first sequence type and a first complexing agent capable of attaching to a second complexing agent and a second hybridization probe which includes a nucleic acid sequence that selectively hybridizes to the second nucleic acid sequence type over the first sequence type and includes a detectable marker for detecting the second hybridization probe.

  17. Method for identifying and quantifying nucleic acid sequence aberrations

    DOEpatents

    Lucas, J.N.; Straume, T.; Bogen, K.T.

    1998-07-21

    A method is disclosed for detecting nucleic acid sequence aberrations by detecting nucleic acid sequences having both a first and a second nucleic acid sequence type, the presence of the first and second sequence type on the same nucleic acid sequence indicating the presence of a nucleic acid sequence aberration. The method uses a first hybridization probe which includes a nucleic acid sequence that is complementary to a first sequence type and a first complexing agent capable of attaching to a second complexing agent and a second hybridization probe which includes a nucleic acid sequence that selectively hybridizes to the second nucleic acid sequence type over the first sequence type and includes a detectable marker for detecting the second hybridization probe. 11 figs.

  18. Solid phase sequencing of double-stranded nucleic acids

    DOEpatents

    Fu, Dong-Jing; Cantor, Charles R.; Koster, Hubert; Smith, Cassandra L.

    2002-01-01

    This invention relates to methods for detecting and sequencing of target double-stranded nucleic acid sequences, to nucleic acid probes and arrays of probes useful in these methods, and to kits and systems which contain these probes. Useful methods involve hybridizing the nucleic acids or nucleic acids which represent complementary or homologous sequences of the target to an array of nucleic acid probes. These probes comprise a single-stranded portion, an optional double-stranded portion and a variable sequence within the single-stranded portion. The molecular weights of the hybridized nucleic acids of the set can be determined by mass spectroscopy, and the sequence of the target determined from the molecular weights of the fragments. Nucleic acids whose sequences can be determined include nucleic acids in biological samples such as patient biopsies and environmental samples. Probes may be fixed to a solid support such as a hybridization chip to facilitate automated determination of molecular weights and identification of the target sequence.

  19. Evolution of sequence-defined highly functionalized nucleic acid polymers

    NASA Astrophysics Data System (ADS)

    Chen, Zhen; Lichtor, Phillip A.; Berliner, Adrian P.; Chen, Jonathan C.; Liu, David R.

    2018-03-01

    The evolution of sequence-defined synthetic polymers made of building blocks beyond those compatible with polymerase enzymes or the ribosome has the potential to generate new classes of receptors, catalysts and materials. Here we describe a ligase-mediated DNA-templated polymerization and in vitro selection system to evolve highly functionalized nucleic acid polymers (HFNAPs) made from 32 building blocks that contain eight chemically diverse side chains on a DNA backbone. Through iterated cycles of polymer translation, selection and reverse translation, we discovered HFNAPs that bind proprotein convertase subtilisin/kexin type 9 (PCSK9) and interleukin-6, two protein targets implicated in human diseases. Mutation and reselection of an active PCSK9-binding polymer yielded evolved polymers with high affinity (KD = 3 nM). This evolved polymer potently inhibited the binding between PCSK9 and the low-density lipoprotein receptor. Structure-activity relationship studies revealed that specific side chains at defined positions in the polymers are required for binding to their respective targets. Our findings expand the chemical space of evolvable polymers to include densely functionalized nucleic acids with diverse, researcher-defined chemical repertoires.

  20. High-Throughput rRNA Gene Sequencing Reveals High and Complex Bacterial Diversity Associated with Brazilian Coffee Bean Fermentation

    PubMed Central

    Vinícius de Melo, Gilberto

    2018-01-01

    Coffee bean fermentation is a spontaneous, on-farm process involving the action of different microbial groups, including bacteria and fungi. In this study, a high-throughput sequencing approach was employed to study the diversity and dynamics of bacteria associated with Brazilian coffee bean fermentation. The total DNA from fermenting coffee samples was extracted at different time points, and the 16S rRNA gene with segments around the V4 variable region was sequenced on an Illumina high-throughput platform. Using this approach, the presence of over eighty bacterial genera was determined, many of which have been detected for the first time during coffee bean fermentation, including Fructobacillus, Pseudonocardia, Pedobacter, Sphingomonas and Hymenobacter. The presence of Fructobacillus suggests an influence of these bacteria on fructose metabolism during coffee fermentation. Temporal analysis showed a strong dominance of lactic acid bacteria with over 97% of read sequences at the end of fermentation, mainly represented by Leuconostoc and Lactococcus. Metabolism of lactic acid bacteria was associated with the high formation of lactic acid during fermentation, as determined by HPLC analysis. The results reported in this study confirm the underestimation of bacterial diversity associated with coffee fermentation. New microbial groups reported in this study may be explored as functional starter cultures for on-farm coffee processing.
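
    A dominance figure such as "over 97% of read sequences" is a relative abundance: reads assigned to a group divided by all classified reads at that time point. The sketch below shows that arithmetic only. The read counts are invented for illustration and are not data from the study, and the function names are hypothetical.

```python
def relative_abundance(read_counts):
    """Convert per-genus read counts into percentages of the total."""
    total = sum(read_counts.values())
    if total == 0:
        return {}
    return {taxon: 100.0 * n / total for taxon, n in read_counts.items()}


def group_share(read_counts, group):
    """Combined percentage of reads assigned to the genera in `group`."""
    abundance = relative_abundance(read_counts)
    return sum(pct for taxon, pct in abundance.items() if taxon in group)


# Hypothetical read counts at one time point (not data from the study)
end_of_fermentation = {
    "Leuconostoc": 6200,
    "Lactococcus": 3550,
    "Sphingomonas": 150,
    "Pedobacter": 100,
}
lab_genera = {"Leuconostoc", "Lactococcus"}
lab_share = group_share(end_of_fermentation, lab_genera)
print(f"lactic acid bacteria: {lab_share:.1f}% of reads")
```

    Repeating the calculation for each sampling time gives the kind of temporal profile the abstract refers to.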

  1. Sequence diversity within the reovirus S2 gene: reovirus genes reassort in nature, and their termini are predicted to form a panhandle motif.

    PubMed Central

    Chapell, J D; Goral, M I; Rodgers, S E; dePamphilis, C W; Dermody, T S

    1994-01-01

    To better understand genetic diversity within mammalian reoviruses, we determined S2 nucleotide and deduced sigma 2 amino acid sequences of nine reovirus strains and compared these sequences with those of prototype strains of the three reovirus serotypes. The S2 gene and sigma 2 protein are highly conserved among the four type 1, one type 2, and seven type 3 strains studied. Phylogenetic analyses based on S2 nucleotide sequences of the 12 reovirus strains indicate that diversity within the S2 gene is independent of viral serotype. Additionally, we found marked topological differences between phylogenetic trees generated from S1 and S2 gene nucleotide sequences of the seven type 3 strains. These results demonstrate that reovirus S1 and S2 genes have distinct evolutionary histories, thus providing phylogenetic evidence for lateral transfer of reovirus genes in nature. When variability among the 12 sigma 2-encoding S2 nucleotide sequences was analyzed at synonymous positions, we found that approximately 60 nucleotides at the 5' terminus and 30 nucleotides at the 3' terminus were markedly conserved in comparison with other sigma 2-encoding regions of S2. Predictions of RNA secondary structures indicate that the more conserved S2 sequences participate in the formation of an extended region of duplex RNA interrupted by a pair of stem-loops. Among the 12 deduced sigma 2 amino acid sequences examined, substitutions were observed at only 11% of amino acid positions. This finding suggests that constraints on the structure or function of sigma 2, perhaps in part because of its location in the virion core, have limited sequence diversity within this protein. PMID:8289378

  2. Self-sequencing of amino acids and origins of polyfunctional protocells

    NASA Technical Reports Server (NTRS)

    Fox, S. W.

    1984-01-01

    The role of proteins in the origin of living things is discussed. It has been experimentally established that amino acids can sequence themselves under simulated geological conditions with highly nonrandom products which accordingly contain diverse information. Multiple copies of each type of macromolecule are formed, resulting in greater power for any protoenzymic molecule than would accrue from a single copy of each type. Thermal proteins are readily incorporated into laboratory protocells. The experimental evidence for original polyfunctional protocells is discussed.

  3. Ultra-deep sequencing reveals high prevalence and broad structural diversity of hepatitis B surface antigen mutations in a global population.

    PubMed

    Gencay, Mikael; Hübner, Kirsten; Gohl, Peter; Seffner, Anja; Weizenegger, Michael; Neofytos, Dionysios; Batrla, Richard; Woeste, Andreas; Kim, Hyon-Suk; Westergaard, Gaston; Reinsch, Christine; Brill, Eva; Thu Thuy, Pham Thi; Hoang, Bui Huu; Sonderup, Mark; Spearman, C Wendy; Pabinger, Stephan; Gautier, Jérémie; Brancaccio, Giuseppina; Fasano, Massimo; Santantonio, Teresa; Gaeta, Giovanni B; Nauck, Markus; Kaminski, Wolfgang E

    2017-01-01

    The diversity of the hepatitis B surface antigen (HBsAg) has a significant impact on the performance of diagnostic screening tests and the clinical outcome of hepatitis B infection. Neutralizing or diagnostic antibodies against the HBsAg are directed towards its highly conserved major hydrophilic region (MHR), in particular towards its "a" determinant subdomain. Here, we explored, on a global scale, the genetic diversity of the HBsAg MHR in a large, multi-ethnic cohort of randomly selected subjects with HBV infection from four continents. A total of 1553 HBsAg positive blood samples of subjects originating from 20 different countries across Africa, America, Asia and central Europe were characterized for amino acid variation in the MHR. Using highly sensitive ultra-deep sequencing, we found 72.8% of the successfully sequenced subjects (n = 1391) demonstrated amino acid sequence variation in the HBsAg MHR. This indicates that the global variation frequency in the HBsAg MHR is threefold higher than previously reported. The majority of the amino acid mutations were found in the HBV genotypes B (28.9%) and C (25.4%). Collectively, we identified 345 distinct amino acid mutations in the MHR. Among these, we report 62 previously unknown mutations, which extends the worldwide pool of currently known HBsAg MHR mutations by 22%. Importantly, topological analysis identified the "a" determinant upstream flanking region as the structurally most diverse subdomain of the HBsAg MHR. The highest prevalence of "a" determinant region mutations was observed in subjects from Asia, followed by the African, American and European cohorts, respectively. Finally, we found that more than half (59.3%) of all HBV subjects investigated carried multiple MHR mutations. Together, this worldwide ultra-deep sequencing based genotyping study reveals that the global prevalence and structural complexity of variation in the hepatitis B surface antigen have, to date, been significantly underappreciated.

  4. Nonpareil 3: Fast Estimation of Metagenomic Coverage and Sequence Diversity.

    PubMed

    Rodriguez-R, Luis M; Gunturu, Santosh; Tiedje, James M; Cole, James R; Konstantinidis, Konstantinos T

    2018-01-01

    Estimations of microbial community diversity based on metagenomic data sets are affected, often to an unknown degree, by biases derived from insufficient coverage and reference database-dependent estimations of diversity. For instance, the completeness of reference databases cannot be generally estimated since it depends on the extant diversity sampled to date, which, with the exception of a few habitats such as the human gut, remains severely undersampled. Further, estimation of the degree of coverage of a microbial community by a metagenomic data set is prohibitively time-consuming for large data sets, and coverage values may not be directly comparable between data sets obtained with different sequencing technologies. Here, we extend Nonpareil, a database-independent tool for the estimation of coverage in metagenomic data sets, to a high-performance computing implementation that scales up to hundreds of cores and includes, in addition, a k-mer-based estimation as sensitive as the original alignment-based version but about three hundred times as fast. Further, we propose a metric of sequence diversity (N_d) derived directly from Nonpareil curves that correlates well with alpha diversity assessed by traditional metrics. We use this metric in different experiments demonstrating the correlation with the Shannon index estimated on 16S rRNA gene profiles and show that N_d additionally reveals seasonal patterns in marine samples that are not captured by the Shannon index and more precise rankings of the magnitude of diversity of microbial communities in different habitats. Therefore, the new version of Nonpareil, called Nonpareil 3, advances the toolbox for metagenomic analyses of microbiomes. IMPORTANCE Estimation of the coverage provided by a metagenomic data set, i.e., what fraction of the microbial community was sampled by DNA sequencing, represents an essential first step of every culture-independent genomic study that aims to robustly assess the sequence

  5. Insertion sequence diversity in archaea.

    PubMed

    Filée, J; Siguier, P; Chandler, M

    2007-03-01

    Insertion sequences (ISs) can constitute an important component of prokaryotic (bacterial and archaeal) genomes. Over 1,500 individual ISs are included at present in the ISfinder database (www-is.biotoul.fr), and these represent only a small portion of those in the available prokaryotic genome sequences and those that are being discovered in ongoing sequencing projects. In spite of this diversity, the transposition mechanisms of only a few of these ubiquitous mobile genetic elements are known, and these are all restricted to those present in bacteria. This review presents an overview of ISs within the archaeal kingdom. We first provide a general historical summary of the known properties and behaviors of archaeal ISs. We then consider how transposition might be regulated in some cases by small antisense RNAs and by termination codon readthrough. This is followed by an extensive analysis of the IS content in the sequenced archaeal genomes present in the public databases as of June 2006, which provides an overview of their distribution among the major archaeal classes and species. We show that the diversity of archaeal ISs is very great and comparable to that of bacteria. We compare archaeal ISs to known bacterial ISs and find that most are clearly members of families first described for bacteria. Several cases of lateral gene transfer between bacteria and archaea are clearly documented, notably for methanogenic archaea. However, several archaeal ISs do not have bacterial equivalents but can be grouped into Archaea-specific groups or families. In addition to ISs, we identify and list nonautonomous IS-derived elements, such as miniature inverted-repeat transposable elements. Finally, we present a possible scenario for the evolutionary history of ISs in the Archaea.

  6. Global sequence diversity of the lactate dehydrogenase gene in Plasmodium falciparum.

    PubMed

    Simpalipan, Phumin; Pattaradilokrat, Sittiporn; Harnyuttanakorn, Pongchai

    2018-01-09

    Antigen-detecting rapid diagnostic tests (RDTs) have been recommended by the World Health Organization for use in remote areas to improve malaria case management. Lactate dehydrogenase (LDH) of Plasmodium falciparum is one of the main parasite antigens employed by various commercial RDTs. It has been hypothesized that the poor detection of LDH-based RDTs is attributed in part to the sequence diversity of the gene. To test this, the present study aimed to investigate the genetic diversity of the P. falciparum ldh gene in Thailand and to construct the map of LDH sequence diversity in P. falciparum populations worldwide. The ldh gene was sequenced for 50 P. falciparum isolates in Thailand and compared with hundreds of sequences from P. falciparum populations worldwide. Several indices of molecular variation were calculated, including the proportion of polymorphic sites, the average nucleotide diversity index (π), and the haplotype diversity index (H). Tests of positive selection and neutrality tests were performed to determine signatures of natural selection on the gene. Mean genetic distance within and between species of Plasmodium ldh was analysed to infer evolutionary relationships. Nucleotide sequences of P. falciparum ldh could be classified into 9 alleles, encoding 5 isoforms of LDH. L1a was the most common allelic type and was distributed in P. falciparum populations worldwide. Plasmodium falciparum ldh sequences were highly conserved, with haplotype and nucleotide diversity values of 0.203 and 0.0004, respectively. The extremely low genetic diversity was maintained by purifying selection, likely due to functional constraints. Phylogenetic analysis inferred the close genetic relationship of P. falciparum to malaria parasites of great apes, rather than to other human malaria parasites. This study revealed the global genetic variation of the ldh gene in P. falciparum, providing knowledge for improving detection of LDH-based RDTs and supporting the candidacy of
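The two summary indices named in this record, nucleotide diversity (π) and haplotype diversity (H), can be illustrated with a minimal Python sketch. It assumes the standard Nei definitions (π as the mean number of pairwise differences per site, H as n/(n-1) times one minus the sum of squared haplotype frequencies) and equal-length, gap-free aligned sequences. The function names and the toy sequences are hypothetical and are not taken from the study.

```python
from collections import Counter
from itertools import combinations


def nucleotide_diversity(seqs):
    """Mean number of pairwise differences per site (pi).

    Assumes all sequences are aligned, gap-free and of equal length,
    and that at least two sequences are supplied.
    """
    length = len(seqs[0])
    pairs = list(combinations(seqs, 2))
    # Count mismatching sites for every unordered pair of sequences.
    diffs = sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs)
    return diffs / (len(pairs) * length)


def haplotype_diversity(seqs):
    """Haplotype diversity H = n/(n-1) * (1 - sum of squared frequencies)."""
    n = len(seqs)
    freqs = [count / n for count in Counter(seqs).values()]
    return n / (n - 1) * (1 - sum(f * f for f in freqs))


# Toy alignment (hypothetical): three identical sequences and one variant.
toy = ["AAAA", "AAAA", "AAAT", "AAAA"]
print(nucleotide_diversity(toy))
print(haplotype_diversity(toy))
```

Low values of both indices, as reported for ldh, correspond to a sample dominated by one haplotype with few variable sites.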

  7. Detection of nucleic acid sequences by invader-directed cleavage

    DOEpatents

    Brow, Mary Ann D.; Hall, Jeff Steven Grotelueschen; Lyamichev, Victor; Olive, David Michael; Prudent, James Robert

    1999-01-01

    The present invention relates to means for the detection and characterization of nucleic acid sequences, as well as variations in nucleic acid sequences. The present invention also relates to methods for forming a nucleic acid cleavage structure on a target sequence and cleaving the nucleic acid cleavage structure in a site-specific manner. The 5' nuclease activity of a variety of enzymes is used to cleave the target-dependent cleavage structure, thereby indicating the presence of specific nucleic acid sequences or specific variations thereof. The present invention further relates to methods and devices for the separation of nucleic acid molecules based on charge.

  8. Methods and compositions for efficient nucleic acid sequencing

    DOEpatents

    Drmanac, Radoje

    2006-07-04

    Disclosed are novel methods and compositions for rapid and highly efficient nucleic acid sequencing based upon hybridization with two sets of small oligonucleotide probes of known sequences. Extremely large nucleic acid molecules, including chromosomes and non-amplified RNA, may be sequenced without prior cloning or subcloning steps. The methods of the invention also solve various current problems associated with sequencing technology such as, for example, high noise to signal ratios and difficult discrimination, attaching many nucleic acid fragments to a surface, preparing many, longer or more complex probes and labelling more species.

  9. Methods and compositions for efficient nucleic acid sequencing

    DOEpatents

    Drmanac, Radoje

    2002-01-01

    Disclosed are novel methods and compositions for rapid and highly efficient nucleic acid sequencing based upon hybridization with two sets of small oligonucleotide probes of known sequences. Extremely large nucleic acid molecules, including chromosomes and non-amplified RNA, may be sequenced without prior cloning or subcloning steps. The methods of the invention also solve various current problems associated with sequencing technology such as, for example, high noise to signal ratios and difficult discrimination, attaching many nucleic acid fragments to a surface, preparing many, longer or more complex probes and labelling more species.

  10. Ultra-deep sequencing reveals high prevalence and broad structural diversity of hepatitis B surface antigen mutations in a global population

    PubMed Central

    Gencay, Mikael; Hübner, Kirsten; Gohl, Peter; Seffner, Anja; Weizenegger, Michael; Neofytos, Dionysios; Batrla, Richard; Woeste, Andreas; Kim, Hyon-suk; Westergaard, Gaston; Reinsch, Christine; Brill, Eva; Thu Thuy, Pham Thi; Hoang, Bui Huu; Sonderup, Mark; Spearman, C. Wendy; Pabinger, Stephan; Gautier, Jérémie; Brancaccio, Giuseppina; Fasano, Massimo; Santantonio, Teresa; Gaeta, Giovanni B.; Nauck, Markus; Kaminski, Wolfgang E.

    2017-01-01

    The diversity of the hepatitis B surface antigen (HBsAg) has a significant impact on the performance of diagnostic screening tests and the clinical outcome of hepatitis B infection. Neutralizing or diagnostic antibodies against the HBsAg are directed towards its highly conserved major hydrophilic region (MHR), in particular towards its “a” determinant subdomain. Here, we explored, on a global scale, the genetic diversity of the HBsAg MHR in a large, multi-ethnic cohort of randomly selected subjects with HBV infection from four continents. A total of 1553 HBsAg positive blood samples of subjects originating from 20 different countries across Africa, America, Asia and central Europe were characterized for amino acid variation in the MHR. Using highly sensitive ultra-deep sequencing, we found 72.8% of the successfully sequenced subjects (n = 1391) demonstrated amino acid sequence variation in the HBsAg MHR. This indicates that the global variation frequency in the HBsAg MHR is threefold higher than previously reported. The majority of the amino acid mutations were found in the HBV genotypes B (28.9%) and C (25.4%). Collectively, we identified 345 distinct amino acid mutations in the MHR. Among these, we report 62 previously unknown mutations, which extends the worldwide pool of currently known HBsAg MHR mutations by 22%. Importantly, topological analysis identified the “a” determinant upstream flanking region as the structurally most diverse subdomain of the HBsAg MHR. The highest prevalence of “a” determinant region mutations was observed in subjects from Asia, followed by the African, American and European cohorts, respectively. Finally, we found that more than half (59.3%) of all HBV subjects investigated carried multiple MHR mutations. Together, this worldwide ultra-deep sequencing based genotyping study reveals that the global prevalence and structural complexity of variation in the hepatitis B surface antigen have, to date, been significantly

  11. Correlation between fibroin amino acid sequence and physical silk properties.

    PubMed

    Fedic, Robert; Zurovec, Michal; Sehnal, Frantisek

    2003-09-12

    The fiber properties of lepidopteran silk depend on the amino acid repeats that interact during H-fibroin polymerization. The aim of our research was to relate repeat composition to insect biology and fiber strength. Representative regions of the H-fibroin genes were sequenced and analyzed in three pyralid species: wax moth (Galleria mellonella), European flour moth (Ephestia kuehniella), and Indian meal moth (Plodia interpunctella). The amino acid repeats are species-specific, evidently a diversification of an ancestral region of 43 residues, and include three types of regularly dispersed motifs: modifications of GSSAASAA sequence, stretches of tripeptides GXZ where X and Z represent bulky residues, and sequences similar to PVIVIEE. No concatenations of GX dipeptide or alanine, which are typical for Bombyx silkworms and Antheraea silk moths, respectively, were found. Despite different repeat structure, the silks of G. mellonella and E. kuehniella exhibit similar tensile strength as the Bombyx and Antheraea silks. We suggest that in these latter two species, variations in the repeat length obstruct repeat alignment, but sufficiently long stretches of iterated residues get superposed to interact. In the pyralid H-fibroins, interactions of the widely separated and diverse motifs depend on the precision of repeat matching; silk is strong in G. mellonella and E. kuehniella, with 2-3 types of long homogeneous repeats, and nearly 10 times weaker in P. interpunctella, with seven types of shorter erratic repeats. The high proportion of large amino acids in the H-fibroin of pyralids has probably evolved in connection with the spinning habit of caterpillars that live in protective silk tubes and spin continuously, enlarging the tubes on one end and partly devouring the other one. The silk serves as a depot of energetically rich and essential amino acids that may be scarce in the diet.

  12. Dipeptide Sequence Determination: Analyzing Phenylthiohydantoin Amino Acids by HPLC

    NASA Astrophysics Data System (ADS)

    Barton, Janice S.; Tang, Chung-Fei; Reed, Steven S.

    2000-02-01

    Amino acid composition and sequence determination, important techniques for characterizing peptides and proteins, are essential for predicting conformation and studying sequence alignment. This experiment presents improved, fundamental methods of sequence analysis for an upper-division biochemistry laboratory. Working in pairs, students use the Edman reagent to prepare phenylthiohydantoin derivatives of amino acids for determination of the sequence of an unknown dipeptide. With a single HPLC technique, students identify both the N-terminal amino acid and the composition of the dipeptide. This method yields good precision of retention times and allows use of a broad range of amino acids as components of the dipeptide. Students learn fundamental principles and techniques of sequence analysis and HPLC.

  13. Survey of duckweed diversity in Lake Chao and total fatty acid, triacylglycerol, profiles of representative strains.

    PubMed

    Tang, J; Li, Y; Ma, J; Cheng, J J

    2015-09-01

    Lemnaceae (duckweeds) are widely distributed aquatic flowering plants. Their high growth rate, starch content and suitability for bioremediation make them potential feedstock for biofuels. However, few natural duckweed resources have been investigated in China, and there is no information about total fatty acid (TFA) and triacylglycerol (TAG) composition of duckweeds from China. Here, the genetic diversity of a natural duckweed population collected from Lake Chao, China, was investigated using multilocus sequence typing (MLST). The 54 strains were categorised into four species in four genera, representing 12 distinct sequence types. Strains representing Lemna aequinoctialis and Spirodela polyrhiza were predominant. Interestingly, a surprisingly high degree of genetic diversification within L. aequinoctialis was observed. The four duckweed species revealed a uniform fatty acid composition, with three fatty acids, palmitic acid, linoleic acid and linolenic acid, accounting for more than 80% of the TFA. The TFA in biomass varied among species, ranging from 1.05% (of dry weight, DW) for L. punctata and S. polyrhiza to 1.62% for Wolffia globosa. The four duckweed species contained similar TAG contents, 0.02% mg · DW(-1). The fatty acid profiles of TAG were different from those of TFA, and also varied among the four species. The survey investigated the genetic diversity of duckweeds from Lake Chao, and provides an initial insight into TFA and TAG of four duckweed species, indicating that intraspecific and interspecific variations exist in the content and composition of both TFA and TAG in comparison with other studies. © 2015 German Botanical Society and The Royal Botanical Society of the Netherlands.

  14. 77 FR 65537 - Requirements for Patent Applications Containing Nucleotide Sequence and/or Amino Acid Sequence...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-10-29

    ... DEPARTMENT OF COMMERCE Patent and Trademark Office Requirements for Patent Applications Containing Nucleotide Sequence and/or Amino Acid Sequence Disclosures ACTION: Proposed collection; comment request... Patent applications that contain nucleotide and/or amino acid sequence disclosures must include a copy of...

  15. High speed nucleic acid sequencing

    DOEpatents

    Korlach, Jonas [Ithaca, NY]; Webb, Watt W [Ithaca, NY]; Levene, Michael [Ithaca, NY]; Turner, Stephen [Ithaca, NY]; Craighead, Harold G [Ithaca, NY]; Foquet, Mathieu [Ithaca, NY]

    2011-05-17

    The present invention is directed to a method of sequencing a target nucleic acid molecule having a plurality of bases. In its principle, the temporal order of base additions during the polymerization reaction is measured on a molecule of nucleic acid. Each type of labeled nucleotide comprises an acceptor fluorophore attached to a phosphate portion of the nucleotide such that the fluorophore is removed upon incorporation into a growing strand. Fluorescent signal is emitted via fluorescent resonance energy transfer between the donor fluorophore and the acceptor fluorophore as each nucleotide is incorporated into the growing strand. The sequence is deduced by identifying which base is being incorporated into the growing strand.

  16. The microbiology of Bandji, palm wine of Borassus akeassii from Burkina Faso: identification and genotypic diversity of yeasts, lactic acid and acetic acid bacteria.

    PubMed

    Ouoba, L I I; Kando, C; Parkouda, C; Sawadogo-Lingani, H; Diawara, B; Sutherland, J P

    2012-12-01

    To investigate physicochemical characteristics and especially genotypic diversity of the main culturable micro-organisms involved in fermentation of sap from Borassus akeassii, a newly identified palm tree from West Africa. Physicochemical characterization was performed using conventional methods. Identification of micro-organisms included phenotyping and sequencing of: 26S rRNA gene for yeasts, 16S rRNA and gyrB genes for lactic acid bacteria (LAB) and acetic acid bacteria (AAB). Interspecies and intraspecies genotypic diversities of the micro-organisms were screened respectively by amplification of the ITS1-5.8S rDNA-ITS2/16S-23S rDNA ITS regions and repetitive sequence-based PCR (rep-PCR). The physicochemical characteristics of samples were: pH: 3.48-4.12, titratable acidity: 1.67-3.50 mg KOH g(-1), acetic acid: 0.16-0.37%, alcohol content: 0.30-2.73%, sugars (degrees Brix): 2.70-8.50. Yeast included mainly Saccharomyces cerevisiae and species of the genera Arthroascus, Issatchenkia, Candida, Trichosporon, Hanseniaspora, Kodamaea, Schizosaccharomyces, Trigonopsis and Galactomyces. Lactobacillus plantarum was the predominant LAB species. Three other species of Lactobacillus were also identified as well as isolates of Leuconostoc mesenteroides, Fructobacillus durionis and Streptococcus mitis. Acetic acid bacteria included nine species of the genus Acetobacter with Acetobacter indonesiensis as predominant species. In addition, isolates of Gluconobacter oxydans and Gluconacetobacter saccharivorans were also identified. Intraspecies diversity was observed for some species of micro-organisms including four genotypes for Acet. indonesiensis, three for Candida tropicalis and Lactobacillus fermentum and two each for S. cerevisiae, Trichosporon asahii, Candida pararugosa and Acetobacter tropicalis. Fermentation of palm sap from B. akeassii involved multi-yeast-LAB-AAB cultures at genus, species and intraspecies level. First study describing microbiological and

  17. Diversity analysis of lactic acid bacteria in takju, Korean rice wine.

    PubMed

    Jin, Jianbo; Kim, So-Young; Jin, Qing; Eom, Hyun-Ju; Han, Nam Soo

    2008-10-01

    To investigate the lactic acid bacterial population in Korean traditional rice wines, biotyping was performed using cell morphology and whole-cell protein pattern analysis by SDS-PAGE, and then the isolates were identified by 16S rRNA sequencing analysis. Based on the morphological characteristics, 103 LAB isolates were detected in wine samples, characterized by whole-cell protein pattern analysis, and they were then divided into 18 patterns. By gene sequencing of 16S rRNA, the isolates were identified as Lactobacillus paracasei, Lb. arizonensis, Lb. plantarum, Lb. harbinensis, Lb. parabuchneri, Lb. brevis, and Lb. hilgardii when listed by their frequency of occurrence. It was found that the difference in bacterial diversity between rice and grape wines depends on the raw materials, especially the composition of starch and glucose.

  18. Evidence of Divergent Amino Acid Usage in Comparative Analyses of R5- and X4-Associated HIV-1 Vpr Sequences

    PubMed Central

    Antell, Gregory C.; Zhong, Wen; Kercher, Katherine; Passic, Shendra; Williams, Jean; Liu, Yucheng; James, Tony; Jacobson, Jeffrey M.; Szep, Zsofia

    2017-01-01

    Vpr is an HIV-1 accessory protein that plays numerous roles during viral replication, some of which are cell type dependent. To test the hypothesis that HIV-1 tropism extends beyond the envelope into the vpr gene, studies were performed to identify the associations between coreceptor usage and Vpr variation in HIV-1-infected patients. Colinear HIV-1 Env-V3 and Vpr amino acid sequences were obtained from the LANL HIV-1 sequence database and from well-suppressed patients in the Drexel/Temple Medicine CNS AIDS Research and Eradication Study (CARES) Cohort. Genotypic classification of Env-V3 sequences as X4 (CXCR4-utilizing) or R5 (CCR5-utilizing) was used to group colinear Vpr sequences. To reveal the sequences associated with a specific coreceptor usage genotype, Vpr amino acid sequences were assessed for amino acid diversity and Jensen-Shannon divergence between the two groups. Five amino acid alphabets were used to comprehensively examine the impact of amino acid substitutions involving side chains with similar physicochemical properties. Positions 36, 37, 41, 89, and 96 of Vpr were characterized by statistically significant divergence across multiple alphabets when X4 and R5 sequence groups were compared. In addition, consensus amino acid switches were found at positions 37 and 41 in comparisons of the R5 and X4 sequence populations. These results suggest an evolutionary link between Vpr and gp120 in HIV-1-infected patients. PMID:28620613
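A per-position Jensen-Shannon divergence between two groups of aligned sequences, the kind of comparison this record describes, might be sketched in Python as follows. This is an illustration only: the abstract does not state the log base, any pseudocounts, or the significance test used, so base-2 logarithms and raw frequencies are assumptions here, and the function names and toy sequences are hypothetical.

```python
import math
from collections import Counter


def column_frequencies(sequences, position):
    """Relative residue frequencies at one alignment column."""
    counts = Counter(seq[position] for seq in sequences)
    total = sum(counts.values())
    return {residue: n / total for residue, n in counts.items()}


def jensen_shannon_divergence(p, q):
    """Jensen-Shannon divergence in bits between two distributions (dicts)."""
    residues = set(p) | set(q)
    # Mixture distribution M = (P + Q) / 2.
    m = {r: 0.5 * (p.get(r, 0.0) + q.get(r, 0.0)) for r in residues}

    def kl_to_mixture(a):
        # Kullback-Leibler divergence of a from m; zero-probability terms are skipped.
        return sum(a[r] * math.log2(a[r] / m[r]) for r in a if a[r] > 0)

    return 0.5 * kl_to_mixture(p) + 0.5 * kl_to_mixture(q)


# Toy aligned groups (hypothetical, not Vpr data).
r5_group = ["MEQA", "MEQA", "MERA"]
x4_group = ["MEQP", "MEQP", "MEQP"]

for pos in range(4):
    jsd = jensen_shannon_divergence(
        column_frequencies(r5_group, pos),
        column_frequencies(x4_group, pos),
    )
    print(pos, round(jsd, 3))
```

With base-2 logarithms the value ranges from 0 (identical residue usage at the position) to 1 (no residue shared between the groups). A reduced amino acid alphabet could be applied by mapping residues to class labels before counting.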

  19. Sequence diversity among badnavirus isolates infecting yam (Dioscorea spp.) in Ghana, Togo, Benin and Nigeria.

    PubMed

    Eni, A O; Hughes, J d'A; Asiedu, R; Rey, M E C

    2008-01-01

    We analysed the sequence diversity in the reverse transcriptase (RT)/ribonuclease H (RNaseH) coding region of 19 badnavirus isolates infecting yam (Dioscorea spp.) in Ghana, Togo, Benin, and Nigeria. Phylogenetic analysis of the deduced amino acid sequences revealed that the isolates are broadly divided into two distinct species, each clustering with Dioscorea alata bacilliform virus (DaBV) and Dioscorea sansibarensis bacilliform virus (DsBV). Fourteen isolates had 90-96% amino acid identity with DaBV, while four isolates had 83-84% amino acid identity with DsBV. One isolate from Benin, BN4Dr, was distinct and had 77 and 75% amino acid identity with DaBV and DsBV, respectively, and may be a member of a new badnavirus species infecting yam in West Africa. Viruses of the two main species were present in Ghana, Togo and Benin and were observed to infect both D. alata and D. rotundata indiscriminately. This is the first confirmed report of DsBV infection in yam in Ghana and Togo. The results of this study demonstrate that members of two distinct species of badnaviruses infect yam in the West African yam zone and suggest a putative new species, BN4Dr. We also conclude that these species are not confined to limited geographic regions or specific for yam host species. However, the three badnavirus species are serologically related. The sequence information obtained from this study can be used to develop PCR-based diagnostics to detect members of the various species and/or strains of badnaviruses infecting yam in West Africa.
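Percent amino acid identity between two aligned sequences, the measure quoted in this record, can be sketched in Python as below. Conventions for the denominator differ between tools. This sketch assumes that columns with a gap in either sequence are excluded, which may not match the method used in the study. The function name and the example strings are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must come from the same alignment")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip gapped columns (an assumption of this sketch)
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Toy example (hypothetical sequences): one mismatch in five compared columns.
print(percent_identity("MKTLV", "MRTLV"))  # 80.0
```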

  20. Kit for detecting nucleic acid sequences using competitive hybridization probes

    DOEpatents

    Lucas, Joe N.; Straume, Tore; Bogen, Kenneth T.

    2001-01-01

    A kit is provided for detecting a target nucleic acid sequence in a sample, the kit comprising: a first hybridization probe which includes a nucleic acid sequence that is sufficiently complementary to selectively hybridize to a first portion of the target sequence, the first hybridization probe including a first complexing agent for forming a binding pair with a second complexing agent; and a second hybridization probe which includes a nucleic acid sequence that is sufficiently complementary to selectively hybridize to a second portion of the target sequence to which the first hybridization probe does not selectively hybridize, the second hybridization probe including a detectable marker; a third hybridization probe which includes a nucleic acid sequence that is sufficiently complementary to selectively hybridize to a first portion of the target sequence, the third hybridization probe including the same detectable marker as the second hybridization probe; and a fourth hybridization probe which includes a nucleic acid sequence that is sufficiently complementary to selectively hybridize to a second portion of the target sequence to which the third hybridization probe does not selectively hybridize, the fourth hybridization probe including the first complexing agent for forming a binding pair with the second complexing agent; wherein the first and second hybridization probes are capable of simultaneously hybridizing to the target sequence and the third and fourth hybridization probes are capable of simultaneously hybridizing to the target sequence, the detectable marker is not present on the first or fourth hybridization probes and the first, second, third, and fourth hybridization probes each include a competitive nucleic acid sequence which is sufficiently complementary to a third portion of the target sequence that the competitive sequences of the first, second, third, and fourth hybridization probes compete with each other to hybridize to the third portion of the

  1. HIV-1 envelope sequence-based diversity measures for identifying recent infections

    PubMed Central

    Kafando, Alexis; Fournier, Eric; Serhir, Bouchra; Martineau, Christine; Doualla-Bell, Florence; Sangaré, Mohamed Ndongo; Sylla, Mohamed; Chamberland, Annie; El-Far, Mohamed; Charest, Hugues

    2017-01-01

    Identifying recent HIV-1 infections is crucial for monitoring HIV-1 incidence and optimizing public health prevention efforts. To identify recent HIV-1 infections, we evaluated and compared the performance of 4 sequence-based diversity measures including percent diversity, percent complexity, Shannon entropy and number of haplotypes targeting 13 genetic segments within the env gene of HIV-1. A total of 597 diagnostic samples obtained in 2013 and 2015 from recently and chronically HIV-1 infected individuals were selected. From the selected samples, 249 (134 from recent versus 115 from chronic infections) env coding regions, including V1-C5 of gp120 and the gp41 ectodomain of HIV-1, were successfully amplified and sequenced by next generation sequencing (NGS) using the Illumina MiSeq platform. The ability of the four sequence-based diversity measures to correctly identify recent HIV infections was evaluated using the frequency distribution curves, median and interquartile range and area under the curve (AUC) of the receiver operating characteristic (ROC). Comparing the median and interquartile range and evaluating the frequency distribution curves associated with the 4 sequence-based diversity measures, we observed that the percent diversity, number of haplotypes and Shannon entropy demonstrated significant potential to discriminate recent from chronic infections (p<0.0001). Using the AUC of ROC analysis, only the Shannon entropy measure within three HIV-1 env segments could accurately identify recent infections at a satisfactory level. The env segments were gp120 C2_1 (AUC = 0.806), gp120 C2_3 (AUC = 0.805) and gp120 V3 (AUC = 0.812). Our results clearly indicate that the Shannon entropy measure represents a useful tool for predicting HIV-1 infection recency. PMID:29284009
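Two of the four diversity measures named in this record, number of haplotypes and Shannon entropy, can be illustrated with a short Python sketch. It assumes entropy is computed over the frequencies of distinct sequences (haplotypes) among the reads of one segment, using the natural logarithm and no normalization. The abstract does not give the exact formula, so these choices, the function names, and the toy reads are hypothetical.

```python
import math
from collections import Counter


def haplotype_counts(reads):
    """Count identical sequences (haplotypes) among the reads of one segment."""
    return Counter(reads)


def shannon_entropy(reads):
    """Shannon entropy (natural log) of the haplotype frequency distribution."""
    counts = haplotype_counts(reads)
    total = sum(counts.values())
    return -sum((n / total) * math.log(n / total) for n in counts.values())


# Toy data (hypothetical): a near-clonal sample and a more diverse sample.
near_clonal = ["ACGT"] * 9 + ["ACGA"]
diverse = ["ACGT"] * 4 + ["ACGA"] * 3 + ["TCGA"] * 3

for name, reads in (("near_clonal", near_clonal), ("diverse", diverse)):
    print(name, len(haplotype_counts(reads)), round(shannon_entropy(reads), 3))
```

Under these assumptions a near-clonal viral population yields an entropy close to zero, while a population with several haplotypes at comparable frequencies yields a higher value. A classification threshold would have to be chosen from data, for example by ROC analysis as in the record.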

  2. HIV-1 envelope sequence-based diversity measures for identifying recent infections.

    PubMed

    Kafando, Alexis; Fournier, Eric; Serhir, Bouchra; Martineau, Christine; Doualla-Bell, Florence; Sangaré, Mohamed Ndongo; Sylla, Mohamed; Chamberland, Annie; El-Far, Mohamed; Charest, Hugues; Tremblay, Cécile L

    2017-01-01

    Identifying recent HIV-1 infections is crucial for monitoring HIV-1 incidence and optimizing public health prevention efforts. To identify recent HIV-1 infections, we evaluated and compared the performance of 4 sequence-based diversity measures including percent diversity, percent complexity, Shannon entropy and number of haplotypes targeting 13 genetic segments within the env gene of HIV-1. A total of 597 diagnostic samples obtained in 2013 and 2015 from recently and chronically HIV-1 infected individuals were selected. From the selected samples, 249 (134 from recent versus 115 from chronic infections) env coding regions, including V1-C5 of gp120 and the gp41 ectodomain of HIV-1, were successfully amplified and sequenced by next generation sequencing (NGS) using the Illumina MiSeq platform. The ability of the four sequence-based diversity measures to correctly identify recent HIV infections was evaluated using the frequency distribution curves, median and interquartile range and area under the curve (AUC) of the receiver operating characteristic (ROC). Comparing the median and interquartile range and evaluating the frequency distribution curves associated with the 4 sequence-based diversity measures, we observed that the percent diversity, number of haplotypes and Shannon entropy demonstrated significant potential to discriminate recent from chronic infections (p<0.0001). Using the AUC of ROC analysis, only the Shannon entropy measure within three HIV-1 env segments could accurately identify recent infections at a satisfactory level. The env segments were gp120 C2_1 (AUC = 0.806), gp120 C2_3 (AUC = 0.805) and gp120 V3 (AUC = 0.812). Our results clearly indicate that the Shannon entropy measure represents a useful tool for predicting HIV-1 infection recency.
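    The four sequence-based measures named in this abstract can be illustrated with a minimal Python sketch. The abstract does not define the measures, so the definitions used here are assumptions for illustration: percent complexity as distinct haplotypes per read, Shannon entropy over haplotype frequencies, and percent diversity as the mean pairwise p-distance. The function name and the toy reads are hypothetical.

```python
from collections import Counter
from itertools import combinations
from math import log


def diversity_measures(reads):
    """Compute four diversity measures from equal-length, aligned reads.

    The definitions are assumptions chosen for illustration, not the
    study's own implementation.
    """
    n = len(reads)
    if n < 2:
        raise ValueError("need at least two reads")
    counts = Counter(reads)  # haplotype -> number of reads
    n_haplotypes = len(counts)
    # Percent complexity: distinct haplotypes as a share of all reads.
    pct_complexity = 100.0 * n_haplotypes / n
    # Shannon entropy of haplotype frequencies (natural logarithm).
    entropy = -sum((c / n) * log(c / n) for c in counts.values())
    # Percent diversity: mean pairwise p-distance over all read pairs.
    pairs = list(combinations(reads, 2))
    total = sum(
        sum(x != y for x, y in zip(a, b)) / len(a) for a, b in pairs
    )
    pct_diversity = 100.0 * total / len(pairs)
    return {
        "n_haplotypes": n_haplotypes,
        "percent_complexity": pct_complexity,
        "shannon_entropy": entropy,
        "percent_diversity": pct_diversity,
    }


toy_reads = ["ACGT", "ACGT", "ACGA", "TCGA"]  # hypothetical reads
measures = diversity_measures(toy_reads)
```

    In a recency assay each measure would then be compared against a cutoff chosen from the ROC curve; no cutoff is shown here because the abstract does not report one.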

  3. High-throughput sequencing reveals unprecedented diversities of Aspergillus species in outdoor air.

    PubMed

    Lee, S; An, C; Xu, S; Lee, S; Yamamoto, N

    2016-09-01

    This study used the Illumina MiSeq to analyse compositions and diversities of Aspergillus species in outdoor air. Seasonal air sampling was performed at two locations in Seoul, South Korea. The relative abundances of all Aspergillus species combined ranged from 0·20 to 18% and from 0·19 to 21% based on the number of internal transcribed spacer 1 (ITS1) and β-tubulin (BenA) gene sequences respectively. Aspergillus fumigatus was the most dominant species, with mean relative abundances of 1·2 and 5·5% based on the number of ITS1 and BenA sequences respectively. A total of 29 Aspergillus species were detected and identified down to the species rank, among which nine species were known opportunistic pathogens. Remarkably, eight of the nine pathogenic species were detected by either one of the two markers, suggesting the need to use multiple markers and/or primer pairs when assessments are based on high-throughput sequencing. Owing to the diversity of species within the genus Aspergillus, high-throughput sequencing was useful for characterizing their compositions and diversities in outdoor air, which are thought to be difficult to characterize accurately by conventional culture and/or Sanger sequencing-based techniques. Aspergillus is a diverse genus of fungi with more than 300 species reported in the literature. Aspergillus is important since some species are known allergens and opportunistic human pathogens. Traditionally, growth-dependent methods have been used to detect Aspergillus species in air. However, these methods are limited in the number of isolates that can be analysed for their identities, resulting in inaccurate characterizations of Aspergillus diversities. This study used high-throughput sequencing to explore Aspergillus diversities in outdoor air, which are thought to be difficult to characterize accurately by traditional growth-dependent techniques. © 2016 The Society for Applied Microbiology.

  4. Hybridization and sequencing of nucleic acids using base pair mismatches

    DOEpatents

    Fodor, Stephen P. A.; Lipshutz, Robert J.; Huang, Xiaohua

    2001-01-01

    Devices and techniques for hybridization of nucleic acids and for determining the sequence of nucleic acids. Arrays of nucleic acids are formed by techniques, preferably high resolution, light-directed techniques. Positions of hybridization of a target nucleic acid are determined by, e.g., epifluorescence microscopy. Devices and techniques are proposed to determine the sequence of a target nucleic acid more efficiently and more quickly through such synthesis and detection techniques.

  5. Strategies for Achieving High Sequencing Accuracy for Low Diversity Samples and Avoiding Sample Bleeding Using Illumina Platform

    PubMed Central

    Mitra, Abhishek; Skrzypczak, Magdalena; Ginalski, Krzysztof; Rowicka, Maga

    2015-01-01

    Sequencing microRNA, reduced representation sequencing, Hi-C technology and any method requiring the use of in-house barcodes result in sequencing libraries with low initial sequence diversity. Sequencing such data on the Illumina platform typically produces low quality data due to the limitations of the Illumina cluster calling algorithm. Moreover, even in the case of diverse samples, these limitations are causing substantial inaccuracies in multiplexed sample assignment (sample bleeding). Such inaccuracies are unacceptable in clinical applications, and in some other fields (e.g. detection of rare variants). Here, we discuss how both problems with quality of low-diversity samples and sample bleeding are caused by incorrect detection of clusters on the flowcell during initial sequencing cycles. We propose simple software modifications (Long Template Protocol) that overcome this problem. We present experimental results showing that our Long Template Protocol remarkably increases data quality for low diversity samples, as compared with the standard analysis protocol; it also substantially reduces sample bleeding for all samples. For comprehensiveness, we also discuss and compare experimental results from alternative approaches to sequencing low diversity samples. First, we discuss how the low diversity problem, if caused by barcodes, can be avoided altogether at the barcode design stage. Second and third, we present modified guidelines, which are more stringent than the manufacturer’s, for mixing low diversity samples with diverse samples and lowering cluster density, which in our experience consistently produces high quality data from low diversity samples. Fourth and fifth, we present rescue strategies that can be applied when sequencing results in low quality data and when there is no more biological material available. In such cases, we propose that the flowcell be re-hybridized and sequenced again using our Long Template Protocol. Alternatively, we discuss how

  6. Mouse Vk gene classification by nucleic acid sequence similarity.

    PubMed

    Strohal, R; Helmberg, A; Kroemer, G; Kofler, R

    1989-01-01

    Analyses of immunoglobulin (Ig) variable (V) region gene usage in the immune response, estimates of V gene germline complexity, and other nucleic acid hybridization-based studies depend on the extent to which such genes are related (i.e., sequence similarity) and their organization in gene families. While mouse Igh heavy chain V region (VH) gene families are relatively well-established, a corresponding systematic classification of Igk light chain V region (Vk) genes has not been reported. The present analysis, in the course of which we reviewed the known extent of the Vk germline gene repertoire and Vk gene usage in a variety of responses to foreign and self antigens, provides a classification of mouse Vk genes in gene families composed of members with greater than 80% overall nucleic acid sequence similarity. This classification differed in several aspects from that of VH genes: only some Vk gene families were as clearly separated (by greater than 25% sequence dissimilarity) as typical VH gene families; most Vk gene families were closely related and, in several instances, members from different families were very similar (greater than 80%) over large sequence portions; frequently, classification by nucleic acid sequence similarity diverged from existing classifications based on amino-terminal protein sequence similarity. Our data have implications for Vk gene analyses by nucleic acid hybridization and describe potentially important differences in sequence organization between VH and Vk genes.

  7. Overlap and diversity in antimicrobial peptide databases: compiling a non-redundant set of sequences.

    PubMed

    Aguilera-Mendoza, Longendri; Marrero-Ponce, Yovani; Tellez-Ibarra, Roberto; Llorente-Quesada, Monica T; Salgado, Jesús; Barigye, Stephen J; Liu, Jun

    2015-08-01

    The large variety of antimicrobial peptide (AMP) databases developed to date are characterized by a substantial overlap of data and similarity of sequences. Our goals are to analyze the levels of redundancy for all available AMP databases and use this information to build a new non-redundant sequence database. For this purpose, a new software tool is introduced. A comparative study of 25 AMP databases reveals the overlap and diversity among them and the internal diversity within each database. The overlap analysis shows that only one database (Peptaibol) contains exclusive data, not present in any other, whereas all sequences in the LAMP_Patent database are included in CAMP_Patent. However, the majority of databases have their own set of unique sequences, as well as some overlap with other databases. The complete set of non-duplicate sequences comprises 16 990 cases, which is almost half of the total number of reported peptides. On the other hand, the diversity analysis identifies the most and least diverse databases and proves that all databases exhibit some level of redundancy. Finally, we present a new parallel-free software, named Dover Analyzer, developed to compute the overlap and diversity between any number of databases and compile a set of non-redundant sequences. These results are useful for selecting or building a suitable representative set of AMPs, according to specific needs. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
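    The overlap analysis described in this abstract can be sketched with plain set operations. This is a minimal Python illustration that treats two sequences as duplicates only when they match exactly after upper-casing; it does not reproduce Dover Analyzer, and the function name, database names and toy peptide strings are all hypothetical.

```python
from itertools import combinations


def overlap_report(databases):
    """Summarise exact-match overlap between sequence databases.

    databases: dict mapping a database name to an iterable of sequences.
    Returns (non_redundant_set, unique_to_each_db, pairwise_overlap_counts).
    """
    sets = {
        name: {seq.strip().upper() for seq in seqs}
        for name, seqs in databases.items()
    }
    # Union of all databases: the non-duplicate set of sequences.
    non_redundant = set().union(*sets.values())
    # Sequences found in one database and in no other.
    unique_to = {}
    for name, seqs in sets.items():
        others = set().union(*(s for other, s in sets.items() if other != name))
        unique_to[name] = seqs - others
    # Number of shared sequences for every pair of databases.
    pairwise = {
        (a, b): len(sets[a] & sets[b]) for a, b in combinations(sorted(sets), 2)
    }
    return non_redundant, unique_to, pairwise


toy_dbs = {  # hypothetical databases holding made-up peptide strings
    "A": ["GIGKFLHSAK", "KWKLFKKI"],
    "B": ["kwklfkki", "FLPLIGRVLSGIL"],
    "C": ["GIGKFLHSAK"],
}
non_redundant, unique_to, pairwise = overlap_report(toy_dbs)
```

    A database such as C above, whose sequences all appear elsewhere, contributes nothing to the non-redundant set. Measuring diversity within a database would additionally need a sequence similarity measure, which this sketch leaves out.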

  8. Low Diversity in the Mitogenome of Sperm Whales Revealed by Next-Generation Sequencing

    PubMed Central

    Alexander, Alana; Steel, Debbie; Slikas, Beth; Hoekzema, Kendra; Carraher, Colm; Parks, Matthew; Cronn, Richard; Baker, C. Scott

    2013-01-01

    Large population sizes and global distributions generally associate with high mitochondrial DNA control region (CR) diversity. The sperm whale (Physeter macrocephalus) is an exception, showing low CR diversity relative to other cetaceans; however, diversity levels throughout the remainder of the sperm whale mitogenome are unknown. We sequenced 20 mitogenomes from 17 sperm whales representative of worldwide diversity using Next Generation Sequencing (NGS) technologies (Illumina GAIIx, Roche 454 GS Junior). Resequencing of three individuals with both NGS platforms and partial Sanger sequencing showed low discrepancy rates (454-Illumina: 0.0071%; Sanger-Illumina: 0.0034%; and Sanger-454: 0.0023%) confirming suitability of both NGS platforms for investigating low mitogenomic diversity. Using the 17 sperm whale mitogenomes in a phylogenetic reconstruction with 41 other species, including 11 new dolphin mitogenomes, we tested two hypotheses for the low CR diversity. First, the hypothesis that CR-specific constraints have reduced diversity solely in the CR was rejected as diversity was low throughout the mitogenome, not just in the CR (overall diversity π = 0.096%; protein-coding 3rd codon = 0.22%; CR = 0.35%), and CR phylogenetic signal was congruent with protein-coding regions. Second, the hypothesis that slow substitution rates reduced diversity throughout the sperm whale mitogenome was rejected as sperm whales had significantly higher rates of CR evolution and no evidence of slow coding region evolution relative to other cetaceans. The estimated time to most recent common ancestor for sperm whale mitogenomes was 72,800 to 137,400 years ago (95% highest probability density interval), consistent with previous hypotheses of a bottleneck or selective sweep as likely causes of low mitogenome diversity. PMID:23254394

  9. Low diversity in the mitogenome of sperm whales revealed by next-generation sequencing.

    PubMed

    Alexander, Alana; Steel, Debbie; Slikas, Beth; Hoekzema, Kendra; Carraher, Colm; Parks, Matthew; Cronn, Richard; Baker, C Scott

    2013-01-01

    Large population sizes and global distributions generally associate with high mitochondrial DNA control region (CR) diversity. The sperm whale (Physeter macrocephalus) is an exception, showing low CR diversity relative to other cetaceans; however, diversity levels throughout the remainder of the sperm whale mitogenome are unknown. We sequenced 20 mitogenomes from 17 sperm whales representative of worldwide diversity using Next Generation Sequencing (NGS) technologies (Illumina GAIIx, Roche 454 GS Junior). Resequencing of three individuals with both NGS platforms and partial Sanger sequencing showed low discrepancy rates (454-Illumina: 0.0071%; Sanger-Illumina: 0.0034%; and Sanger-454: 0.0023%) confirming suitability of both NGS platforms for investigating low mitogenomic diversity. Using the 17 sperm whale mitogenomes in a phylogenetic reconstruction with 41 other species, including 11 new dolphin mitogenomes, we tested two hypotheses for the low CR diversity. First, the hypothesis that CR-specific constraints have reduced diversity solely in the CR was rejected as diversity was low throughout the mitogenome, not just in the CR (overall diversity π = 0.096%; protein-coding 3rd codon = 0.22%; CR = 0.35%), and CR phylogenetic signal was congruent with protein-coding regions. Second, the hypothesis that slow substitution rates reduced diversity throughout the sperm whale mitogenome was rejected as sperm whales had significantly higher rates of CR evolution and no evidence of slow coding region evolution relative to other cetaceans. The estimated time to most recent common ancestor for sperm whale mitogenomes was 72,800 to 137,400 years ago (95% highest probability density interval), consistent with previous hypotheses of a bottleneck or selective sweep as likely causes of low mitogenome diversity.
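    The diversity statistic π quoted in this abstract is conventionally the average number of pairwise differences per site. A minimal Python sketch under that assumption follows; it expects equal-length aligned sequences, ignores gaps and ambiguity codes, and uses hypothetical toy sequences rather than sperm whale data.

```python
from itertools import combinations


def nucleotide_diversity(seqs):
    """Average pairwise differences per site for aligned sequences."""
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to the same length")
    pairs = list(combinations(seqs, 2))
    # Count mismatching sites over every pair of sequences.
    diffs = sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs)
    return diffs / (len(pairs) * length)


toy_alignment = ["AAAAAAAAAA", "AAAAAAAAAT", "AAAAAAAATT"]  # hypothetical
pi = nucleotide_diversity(toy_alignment)
```

    Multiplying the result by 100 gives a percentage comparable in form to the values reported in the abstract.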

  10. Trinucleotide cassettes increase diversity of T7 phage-displayed peptide library.

    PubMed

    Krumpe, Lauren R H; Schumacher, Kathryn M; McMahon, James B; Makowski, Lee; Mori, Toshiyuki

    2007-10-05

    Amino acid sequence diversity is introduced into a phage-displayed peptide library by randomizing library oligonucleotide DNA. We recently evaluated the diversity of peptide libraries displayed on T7 lytic phage and M13 filamentous phage and showed that T7 phage can display a more diverse amino acid sequence repertoire due to differing processes of viral morphogenesis. In this study, we evaluated and compared the diversity of a 12-mer T7 phage-displayed peptide library randomized using codon-corrected trinucleotide cassettes with a T7 and an M13 12-mer phage-displayed peptide library constructed using the degenerate codon randomization method. We herein demonstrate that the combination of trinucleotide cassette amino acid codon randomization and T7 phage display construction methods resulted in a significant enhancement to the functional diversity of a 12-mer peptide library. This novel library exhibited superior amino acid uniformity and order-of-magnitude increases in amino acid sequence diversity as compared to degenerate codon randomized peptide libraries. Comparative analyses of the biophysical characteristics of the 12-mer peptide libraries revealed the trinucleotide cassette-randomized library to be a unique resource. The combination of T7 phage display and trinucleotide cassette randomization resulted in a novel resource for the potential isolation of binding peptides for new and previously studied molecular targets.
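    The difference in amino acid uniformity between the two randomization methods can be illustrated with a short Python calculation. Two assumptions are made that the abstract does not state: the degenerate codon scheme is taken to be NNK (N = A/C/G/T, K = G/T), and the trinucleotide cassette mix is idealized as one codon per amino acid in equal proportions. The translation table is the standard genetic code.

```python
from collections import Counter
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def nnk_distribution():
    """Amino acid (and stop) frequencies for an assumed NNK codon."""
    codons = ["".join(c) for c in product("ACGT", "ACGT", "GT")]
    counts = Counter(CODON_TABLE[c] for c in codons)
    return {aa: n / len(codons) for aa, n in counts.items()}


nnk = nnk_distribution()
# Idealized cassette mix (assumption): 20 amino acids, equal shares, no stop.
trinuc = {aa: 1 / 20 for aa in set(AMINO) - {"*"}}

# Chance that a randomized 12-mer contains no stop codon under NNK.
full_length = (1 - nnk["*"]) ** 12
# Ratio of the most to the least frequent amino acid under NNK.
aa_freqs = [f for aa, f in nnk.items() if aa != "*"]
bias_ratio = max(aa_freqs) / min(aa_freqs)
```

    Under these assumptions the NNK scheme over-represents leucine, serine and arginine threefold relative to the rarest amino acids and loses roughly a third of 12-mers to stop codons, whereas the idealized cassette mix has neither bias.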

  11. Comparison of a High-Resolution Melting Assay to Next-Generation Sequencing for Analysis of HIV Diversity

    PubMed Central

    Cousins, Matthew M.; Ou, San-San; Wawer, Maria J.; Munshaw, Supriya; Swan, David; Magaret, Craig A.; Mullis, Caroline E.; Serwadda, David; Porcella, Stephen F.; Gray, Ronald H.; Quinn, Thomas C.; Donnell, Deborah; Eshleman, Susan H.

    2012-01-01

    Next-generation sequencing (NGS) has recently been used for analysis of HIV diversity, but this method is labor-intensive, costly, and requires complex protocols for data analysis. We compared diversity measures obtained using NGS data to those obtained using a diversity assay based on high-resolution melting (HRM) of DNA duplexes. The HRM diversity assay provides a single numeric score that reflects the level of diversity in the region analyzed. HIV gag and env from individuals in Rakai, Uganda, were analyzed in a previous study using NGS (n = 220 samples from 110 individuals). Three sequence-based diversity measures were calculated from the NGS sequence data (percent diversity, percent complexity, and Shannon entropy). The amplicon pools used for NGS were analyzed with the HRM diversity assay. HRM scores were significantly associated with sequence-based measures of HIV diversity for both gag and env (P < 0.001 for all measures). The level of diversity measured by the HRM diversity assay and NGS increased over time in both regions analyzed (P < 0.001 for all measures except for percent complexity in gag), and similar amounts of diversification were observed with both methods (P < 0.001 for all measures except for percent complexity in gag). Diversity measures obtained using the HRM diversity assay were significantly associated with those from NGS, and similar increases in diversity over time were detected by both methods. The HRM diversity assay is faster and less expensive than NGS, facilitating rapid analysis of large studies of HIV diversity and evolution. PMID:22785188

  12. Exploring fungal diversity in deep-sea sediments from Okinawa Trough using high-throughput Illumina sequencing

    NASA Astrophysics Data System (ADS)

    Zhang, Xiao-Yong; Wang, Guang-Hua; Xu, Xin-Ya; Nong, Xu-Hua; Wang, Jie; Amin, Muhammad; Qi, Shu-Hua

    2016-10-01

    The present study investigated the fungal diversity in four different deep-sea sediments from Okinawa Trough using high-throughput Illumina sequencing of the nuclear ribosomal internal transcribed spacer-1 (ITS1). A total of 40,297 fungal ITS1 sequences clustered into 420 operational taxonomic units (OTUs) at 97% sequence similarity, and 170 taxa were recovered from these sediments. Most ITS1 sequences (78%) belonged to the phylum Ascomycota, followed by Basidiomycota (17.3%), Zygomycota (1.5%) and Chytridiomycota (0.8%), and a small proportion (2.4%) belonged to unassigned fungal phyla. Compared with previous studies of fungal diversity in deep-sea sediments that used culture-dependent approaches and clone library analysis, the present results suggest that Illumina sequencing dramatically accelerates the discovery of fungal communities in deep-sea sediments. Furthermore, our results revealed that Sordariomycetes was the most diverse and abundant fungal class in this study, challenging the traditional view that the diversity of Sordariomycetes phylotypes is low in deep-sea environments. In addition, more than 12 taxa, accounting for 21.5% of sequences, have rarely been reported as deep-sea fungi, suggesting that the deep-sea sediments from Okinawa Trough harbor a plethora of fungal communities that differ from those of other deep-sea environments. To our knowledge, this study is the first exploration of the fungal diversity in deep-sea sediments from Okinawa Trough using high-throughput Illumina sequencing.
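    Clustering reads into OTUs at 97% similarity can be sketched as a greedy, abundance-ordered procedure. The abstract does not name the clustering tool, so the Python code below is an illustrative stand-in, not the study's pipeline; it compares equal-length sequences position by position instead of aligning them, and the function names and toy reads are hypothetical.

```python
from collections import Counter


def identity(a, b):
    """Fraction of matching positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def greedy_otus(reads, threshold=0.97):
    """Greedy clustering: each read joins the first centroid it matches.

    Distinct sequences are visited from most to least abundant, so
    abundant sequences become OTU centroids. Returns centroid -> reads.
    """
    otus = {}
    for seq, n in Counter(reads).most_common():
        for centroid in otus:
            if identity(seq, centroid) >= threshold:
                otus[centroid] += n
                break
        else:
            otus[seq] = n  # no centroid is close enough: found a new OTU
    return otus


base = "ACGT" * 25            # hypothetical 100-nt sequence
near = "TT" + base[2:]        # 2 mismatches to base: 98% identity
far = "T" * 10 + base[10:]    # 8 mismatches to base: 92% identity
otus = greedy_otus([base] * 5 + [near] * 2 + [far] * 3)
```

    Here the near variant is absorbed into the OTU of the abundant sequence, while the distant sequence founds its own OTU.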

  13. Estimating time of HIV-1 infection from next-generation sequence diversity

    PubMed Central

    2017-01-01

    Estimating the time since infection (TI) in newly diagnosed HIV-1 patients is challenging, but important to understand the epidemiology of the infection. Here we explore the utility of virus diversity estimated by next-generation sequencing (NGS) as a novel biomarker by using a recent genome-wide longitudinal dataset obtained from 11 untreated HIV-1-infected patients with known dates of infection. The results were validated on a second dataset from 31 patients. Virus diversity increased linearly with time, particularly at 3rd codon positions, with little inter-patient variation. The precision of the TI estimate improved with increasing sequencing depth, showing that diversity in NGS data yields superior estimates to the number of ambiguous sites in Sanger sequences, which is one of the alternative biomarkers. The full advantage of deep NGS was utilized with continuous diversity measures such as average pairwise distance or site entropy, rather than the fraction of polymorphic sites. The precision depended on the genomic region and codon position and was highest when 3rd codon positions in the entire pol gene were used. For these data, TI estimates had a mean absolute error of around 1 year. The error increased only slightly from around 0.6 years at a TI of 6 months to around 1.1 years at 6 years. Our results show that virus diversity determined by NGS can be used to estimate time since HIV-1 infection many years after the infection, in contrast to most alternative biomarkers. We provide the regression coefficients as well as a web tool for TI estimation. PMID:28968389
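    The idea of reading time since infection off a linear diversity trend can be sketched in Python. The code computes an average pairwise distance from per-site allele frequencies, fits a least-squares line to calibration samples, and inverts it for a new sample. The calibration values are invented toy numbers, not the regression coefficients the authors provide, and the direction of the fit (diversity on time, then inverted) is an assumption.

```python
def site_diversity(freqs):
    """Expected pairwise difference at one site: 1 - sum of f squared."""
    return 1.0 - sum(f * f for f in freqs)


def mean_diversity(sites):
    """Average pairwise distance over sites (each a list of frequencies)."""
    return sum(site_diversity(f) for f in sites) / len(sites)


def fit_line(x, y):
    """Ordinary least-squares slope and intercept for y = slope*x + b."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return slope, my - slope * mx


def estimate_ti(diversity, slope, intercept):
    """Invert the fitted line to get time since infection in years."""
    return (diversity - intercept) / slope


# Toy calibration set (hypothetical): years since infection vs diversity.
years = [0.5, 1.0, 2.0, 4.0]
diversity = [0.001, 0.002, 0.004, 0.008]
slope, intercept = fit_line(years, diversity)
ti = estimate_ti(0.006, slope, intercept)
```

    With real data the fit would be done per genomic region and codon position, since the abstract reports that precision depends on both.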

  14. Comparing Sanger sequencing and high-throughput metabarcoding for inferring photobiont diversity in lichens.

    PubMed

    Paul, Fiona; Otte, Jürgen; Schmitt, Imke; Dal Grande, Francesco

    2018-06-05

    The implementation of HTS (high-throughput sequencing) approaches is rapidly changing our understanding of the lichen symbiosis, by uncovering high bacterial and fungal diversity, which is often host-specific. Recently, HTS methods revealed the presence of multiple photobionts inside a single thallus in several lichen species. This differs from Sanger technology, which typically yields a single, unambiguous algal sequence per individual. Here we compared HTS and Sanger methods for estimating the diversity of green algal symbionts within lichen thalli using 240 lichen individuals belonging to two species of lichen-forming fungi. According to HTS data, Sanger technology consistently yielded the most abundant photobiont sequence in the sample. However, if the second most abundant photobiont exceeded 30% of the total HTS reads in a sample, Sanger sequencing generally failed. Our results suggest that most lichen individuals in the two analyzed species, Lasallia hispanica and L. pustulata, indeed contain a single, predominant green algal photobiont. We conclude that Sanger sequencing is a valid approach to detect the dominant photobionts in lichen individuals and populations. We discuss which research areas in lichen ecology and evolution will continue to benefit from Sanger sequencing, and which areas will profit from HTS approaches to assessing symbiont diversity.

  15. Marine protist diversity in European coastal waters and sediments as revealed by high-throughput sequencing.

    PubMed

    Massana, Ramon; Gobet, Angélique; Audic, Stéphane; Bass, David; Bittner, Lucie; Boutte, Christophe; Chambouvet, Aurélie; Christen, Richard; Claverie, Jean-Michel; Decelle, Johan; Dolan, John R; Dunthorn, Micah; Edvardsen, Bente; Forn, Irene; Forster, Dominik; Guillou, Laure; Jaillon, Olivier; Kooistra, Wiebe H C F; Logares, Ramiro; Mahé, Frédéric; Not, Fabrice; Ogata, Hiroyuki; Pawlowski, Jan; Pernice, Massimo C; Probert, Ian; Romac, Sarah; Richards, Thomas; Santini, Sébastien; Shalchian-Tabrizi, Kamran; Siano, Raffaele; Simon, Nathalie; Stoeck, Thorsten; Vaulot, Daniel; Zingone, Adriana; de Vargas, Colomban

    2015-10-01

    Although protists are critical components of marine ecosystems, they are still poorly characterized. Here we analysed the taxonomic diversity of planktonic and benthic protist communities collected in six distant European coastal sites. Environmental deoxyribonucleic acid (DNA) and ribonucleic acid (RNA) from three size fractions (pico-, nano- and micro/mesoplankton), as well as from dissolved DNA and surface sediments were used as templates for tag pyrosequencing of the V4 region of the 18S ribosomal DNA. Beta-diversity analyses split the protist community structure into three main clusters: picoplankton-nanoplankton-dissolved DNA, micro/mesoplankton and sediments. Within each cluster, protist communities from the same site and time clustered together, while communities from the same site but different seasons were unrelated. Both DNA and RNA-based surveys provided similar relative abundances for most class-level taxonomic groups. Yet, particular groups were overrepresented in one of the two templates, such as marine alveolates (MALV)-I and MALV-II that were much more abundant in DNA surveys. Overall, the groups displaying the highest relative contribution were Dinophyceae, Diatomea, Ciliophora and Acantharia. Also, well represented were Mamiellophyceae, Cryptomonadales, marine alveolates and marine stramenopiles in the picoplankton, and Monadofilosa and basal Fungi in sediments. Our extensive and systematic sequencing of geographically separated sites provides the most comprehensive molecular description of coastal marine protist diversity to date. © 2015 Society for Applied Microbiology and John Wiley & Sons Ltd.

  16. DNA-binding proteins from marine bacteria expand the known sequence diversity of TALE-like repeats.

    PubMed

    de Lange, Orlando; Wolf, Christina; Thiel, Philipp; Krüger, Jens; Kleusch, Christian; Kohlbacher, Oliver; Lahaye, Thomas

    2015-11-16

    Transcription Activator-Like Effectors (TALEs) of Xanthomonas bacteria are programmable DNA binding proteins with unprecedented target specificity. Comparative studies into TALE repeat structure and function are hindered by the limited sequence variation among TALE repeats. More sequence-diverse TALE-like proteins are known from Ralstonia solanacearum (RipTALs) and Burkholderia rhizoxinica (Bats), but RipTAL and Bat repeats are conserved with those of TALEs around the DNA-binding residue. We study two novel marine-organism TALE-like proteins (MOrTL1 and MOrTL2), the first to date of non-terrestrial origin. We have assessed their DNA-binding properties and modelled repeat structures. We found that repeats from these proteins mediate sequence specific DNA binding conforming to the TALE code, despite low sequence similarity to TALE repeats, and with novel residues around the base-specifying residue (BSR). However, MOrTL1 repeats show greater sequence discriminating power than MOrTL2 repeats. Sequence alignments show that there are only three residues conserved between repeats of all TALE-like proteins including the two new additions. This conserved motif could prove useful as an identifier for future TALE-likes. Additionally, comparing MOrTL repeats with those of other TALE-likes suggests a common evolutionary origin for the TALEs, RipTALs and Bats. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.

  17. Diversity of the Cronobacter Genus as Revealed by Multilocus Sequence Typing

    PubMed Central

    Joseph, S.; Sonbol, H.; Hariri, S.; Desai, P.; McClelland, M.

    2012-01-01

    Cronobacter (previously known as Enterobacter sakazakii) is a diverse bacterial genus consisting of seven species: C. sakazakii, C. malonaticus, C. turicensis, C. universalis, C. muytjensii, C. dublinensis, and C. condimenti. In this study, we have used a multilocus sequence typing (MLST) approach employing the alleles of 7 genes (atpD, fusA, glnS, gltB, gyrB, infB, and ppsA; total length, 3,036 bp) to investigate the phylogenetic relationship of 325 Cronobacter species isolates. Strains were chosen on the basis of their species, geographic and temporal distribution, source, and clinical outcome. The earliest strain was isolated from milk powder in 1950, and the earliest clinical strain was isolated in 1953. The existence of seven species was supported by MLST. Intraspecific variation ranged from low diversity in C. sakazakii to extensive diversity within some species, such as C. muytjensii and C. dublinensis, including evidence of gene conversion between species. The predominant species from clinical sources was found to be C. sakazakii. C. sakazakii sequence type 4 (ST4) was the predominant sequence type of cerebral spinal fluid isolates from cases of meningitis. PMID:22785185
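    The bookkeeping behind multilocus sequence typing can be sketched in Python: each distinct allele at a locus gets a number, and each distinct combination of seven allele numbers gets a sequence type (ST). The locus names come from the abstract. Everything else is an assumption for illustration: real studies take allele and ST numbers from a curated database, whereas this sketch numbers them locally in order of first appearance, and the isolates are toy data.

```python
LOCI = ("atpD", "fusA", "glnS", "gltB", "gyrB", "infB", "ppsA")


def assign_sequence_types(isolates):
    """Map each isolate to a locally numbered sequence type.

    isolates: dict of isolate name -> dict of locus -> allele sequence.
    Numbers are local to this call and are NOT official database STs.
    """
    allele_ids = {locus: {} for locus in LOCI}  # locus -> sequence -> number
    st_ids = {}                                 # allele profile -> ST number
    result = {}
    for name, alleles in isolates.items():
        profile = tuple(
            allele_ids[locus].setdefault(
                alleles[locus], len(allele_ids[locus]) + 1
            )
            for locus in LOCI
        )
        result[name] = st_ids.setdefault(profile, len(st_ids) + 1)
    return result


# Toy isolates (hypothetical): iso3 differs from the others only at fusA.
reference = {locus: "ACGT" for locus in LOCI}
toy_isolates = {
    "iso1": dict(reference),
    "iso2": dict(reference),
    "iso3": dict(reference, fusA="ACGA"),
}
sequence_types = assign_sequence_types(toy_isolates)
```

    A single changed allele is enough to produce a new ST, which is why two isolates sharing an ST match at all seven loci.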

  18. Exploiting genes and functional diversity of chlorogenic acid and luteolin biosyntheses in Lonicera japonica and their substitutes.

    PubMed

    Yuan, Yuan; Wang, Zhouyong; Jiang, Chao; Wang, Xumin; Huang, Luqi

    2014-01-25

    Chlorogenic acids (CGAs) and luteolin are active compounds in Lonicera japonica, a plant of high medicinal value in traditional Chinese medicine. This study provides a comprehensive overview of gene families involved in chlorogenic acid and luteolin biosynthesis in L. japonica, as well as its substitutes Lonicera hypoglauca and Lonicera macranthoides. The gene sequence features and gene expression patterns in various tissues and buds of the species were characterized. Bioinformatics analysis identified 14 chlorogenic acid and luteolin biosynthesis-related genes in the L. japonica transcriptome assembly. Phylogenetic analyses suggested that the functions of individual genes may have differentiated, inducing active compound diversity. Their orthologous genes were also recognized in L. hypoglauca and L. macranthoides genomic datasets, except for LHCHS1 and LMC4H2. The expression patterns of these genes differ among the tissues of L. japonica, L. hypoglauca and L. macranthoides. Results also showed that CGAs were controlled in the first step of biosynthesis, whereas both steps controlled luteolin in the bud of L. japonica. The expression of LJFNS2 exhibited positive correlation with luteolin levels in L. japonica. This study provides significant information for understanding the functional diversity of gene families involved in chlorogenic acid and luteolin biosynthesis, the active compound diversity of L. japonica and its substitutes, and the different usages of the three species. Copyright © 2012. Published by Elsevier B.V.

  19. Measuring the diversity of the human microbiota with targeted next-generation sequencing.

    PubMed

    Finotello, Francesca; Mastrorilli, Eleonora; Di Camillo, Barbara

    2016-12-26

    The human microbiota is a complex ecological community of commensal, symbiotic and pathogenic microorganisms harboured by the human body. Next-generation sequencing (NGS) technologies, in particular targeted amplicon sequencing of the 16S ribosomal RNA gene (16S-seq), are enabling the identification and quantification of human-resident microorganisms at unprecedented resolution, providing novel insights into the role of the microbiota in health and disease. Once microbial abundances are quantified through NGS data analysis, diversity indices provide valuable mathematical tools to describe the ecological complexity of a single sample or to detect species differences between samples. However, diversity is not a determined physical quantity for which a consensus definition and unit of measure have been established, and several diversity indices are currently available. Furthermore, they were originally developed for macroecology and their robustness to the possible bias introduced by sequencing has not been characterized so far. To assist the reader with the selection and interpretation of diversity measures, we review a panel of broadly used indices, describing their mathematical formulations, purposes and properties, and characterize their behaviour and critical issues as a function of the data features, using simulated data as ground truth. In addition, we make available an R package, DiversitySeq, which implements in a unified framework the full panel of diversity indices and a simulator of 16S-seq data, and thus represents a valuable resource for the analysis of diversity from NGS count data and for the benchmarking of computational methods for 16S-seq. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
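    The distinction drawn in this abstract between describing a single sample and comparing samples can be illustrated with a few widely used indices. The Python sketch below computes richness, Shannon entropy and the Gini-Simpson index for one count vector, and Bray-Curtis dissimilarity between two. It is independent of the DiversitySeq package, whose interface is not described in the abstract, and the choice of indices and the toy counts are assumptions.

```python
from math import log


def richness(counts):
    """Number of taxa observed at least once."""
    return sum(1 for c in counts if c > 0)


def shannon(counts):
    """Shannon entropy of relative abundances (natural logarithm)."""
    total = sum(counts)
    return -sum((c / total) * log(c / total) for c in counts if c > 0)


def gini_simpson(counts):
    """Probability that two reads drawn with replacement differ in taxon."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two samples over the same taxa."""
    return sum(abs(x - y) for x, y in zip(a, b)) / (sum(a) + sum(b))


# Toy count vectors over the same four taxa (hypothetical).
sample_a = [10, 10, 0, 0]
sample_b = [5, 5, 5, 5]
alpha_a = (richness(sample_a), shannon(sample_a), gini_simpson(sample_a))
beta_ab = bray_curtis(sample_a, sample_b)
```

    The two samples have the same sequencing depth but different evenness, so the within-sample indices differ while the between-sample index reports how much the compositions diverge.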

  20. Estimating and comparing microbial diversity in the presence of sequencing errors

    PubMed Central

    Chiu, Chun-Huo

    2016-01-01

    Estimating and comparing microbial diversity are statistically challenging due to limited sampling and possible sequencing errors for low-frequency counts, producing spurious singletons. The inflated singleton count seriously affects statistical analysis and inferences about microbial diversity. Previous statistical approaches to tackle the sequencing errors generally require different parametric assumptions about the sampling model or about the functional form of frequency counts. Different parametric assumptions may lead to drastically different diversity estimates. We focus on nonparametric methods which are universally valid for all parametric assumptions and can be used to compare diversity across communities. We develop here a nonparametric estimator of the true singleton count to replace the spurious singleton count in all methods/approaches. Our estimator of the true singleton count is in terms of the frequency counts of doubletons, tripletons and quadrupletons, provided these three frequency counts are reliable. To quantify microbial alpha diversity for an individual community, we adopt the measure of Hill numbers (effective number of taxa) under a nonparametric framework. Hill numbers, parameterized by an order q that determines the measures’ emphasis on rare or common species, include taxa richness (q = 0), Shannon diversity (q = 1, the exponential of Shannon entropy), and Simpson diversity (q = 2, the inverse of Simpson index). A diversity profile which depicts the Hill number as a function of order q conveys all information contained in a taxa abundance distribution. Based on the estimated singleton count and the original non-singleton frequency counts, two statistical approaches (non-asymptotic and asymptotic) are developed to compare microbial diversity for multiple communities. (1) A non-asymptotic approach refers to the comparison of estimated diversities of standardized samples with a common finite sample size or sample completeness. This
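
    The Hill numbers named in the abstract above can be illustrated with a minimal Python sketch. It computes only the plain empirical (plug-in) Hill number from observed counts; it is not the singleton-corrected nonparametric estimator the authors develop, and the function name `hill_number` is a hypothetical choice for this illustration.

```python
import math


def hill_number(counts, q):
    """Empirical Hill number (effective number of taxa) of order q.

    counts: observed abundance counts, one per taxon (zeros are ignored).
    q = 0 gives richness, q = 1 the exponential of Shannon entropy,
    q = 2 the inverse of the Simpson index.
    This is the plug-in value, with no correction for sequencing errors.
    """
    total = sum(counts)
    props = [c / total for c in counts if c > 0]
    if not props:
        raise ValueError("at least one taxon must have a positive count")
    if abs(q - 1.0) < 1e-12:
        # Order 1 is defined as the limit q -> 1: exp(Shannon entropy).
        return math.exp(-sum(p * math.log(p) for p in props))
    return sum(p ** q for p in props) ** (1.0 / (1.0 - q))


if __name__ == "__main__":
    sample = [90, 10]
    # A diversity profile: the Hill number as a function of order q.
    for q in (0, 1, 2):
        print(q, hill_number(sample, q))
```

    For a perfectly even community every order returns the number of taxa, while for an uneven community the value decreases as q grows, which is what a diversity profile displays.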

  1. Low diversity in the mitogenome of sperm whales revealed by next-generation sequencing

    Treesearch

    Alana Alexander; Debbie Steel; Beth Slikas; Kendra Hoekzema; Colm Carraher; Matthew Parks; Richard Cronn; C. Scott Baker

    2012-01-01

    Large population sizes and global distributions generally associate with high mitochondrial DNA control region (CR) diversity. The sperm whale (Physeter macrocephalus) is an exception, showing low CR diversity relative to other cetaceans; however, diversity levels throughout the remainder of the sperm whale mitogenome are unknown. We sequenced 20...

  2. The amino acid sequence of Staphylococcus aureus penicillinase.

    PubMed Central

    Ambler, R P

    1975-01-01

    The amino acid sequence of the penicillinase (penicillin amido-beta-lactamhydrolase, EC 3.5.2.6) from Staphylococcus aureus strain PC1 was determined. The protein consists of a single polypeptide chain of 257 residues, and the sequence was determined by characterization of tryptic, chymotryptic, peptic and CNBr peptides, with some additional evidence from thermolysin and S. aureus proteinase peptides. A mistake in the preliminary report of the sequence is corrected; residues 113-116 are now thought to be -Lys-Lys-Val-Lys- rather than -Lys-Val-Lys-Lys-. Detailed evidence for the amino acid sequence has been deposited as Supplementary Publication SUP 50056 (91 pages) at the British Library (Lending Division), Boston Spa, Wetherby, West Yorkshire LS23 7BQ, U.K., from whom copies may be obtained on the terms given in Biochem. J. (1975) 145, 5. PMID:1218078

  3. Low level of sequence diversity at merozoite surface protein-1 locus of Plasmodium ovale curtisi and P. ovale wallikeri from Thai isolates.

    PubMed

    Putaporntip, Chaturong; Hughes, Austin L; Jongwutiwes, Somchai

    2013-01-01

    The merozoite surface protein-1 (MSP-1) is a candidate target for the development of blood stage vaccines against malaria. Polymorphism in MSP-1 can be useful as a genetic marker for strain differentiation in malarial parasites. Although sequence diversity in the MSP-1 locus has been extensively analyzed in field isolates of Plasmodium falciparum and P. vivax, the extent of variation in its homologues in P. ovale curtisi and P. ovale wallikeri, remains unknown. Analysis of the mitochondrial cytochrome b sequences of 10 P. ovale isolates from symptomatic malaria patients from diverse endemic areas of Thailand revealed co-existence of P. ovale curtisi (n = 5) and P. ovale wallikeri (n = 5). Direct sequencing of the PCR-amplified products encompassing the entire coding region of MSP-1 of P. ovale curtisi (PocMSP-1) and P. ovale wallikeri (PowMSP-1) has identified 3 imperfect repeated segments in the former and one in the latter. Most amino acid differences between these proteins were located in the interspecies variable domains of malarial MSP-1. Synonymous nucleotide diversity (πS) exceeded nonsynonymous nucleotide diversity (πN) for both PocMSP-1 and PowMSP-1, albeit at a non-significant level. However, when MSP-1 of both these species was considered together, πS was significantly greater than πN (p<0.0001), suggesting that purifying selection has shaped diversity at this locus prior to speciation. Phylogenetic analysis based on conserved domains has placed PocMSP-1 and PowMSP-1 in a distinct bifurcating branch that probably diverged from each other around 4.5 million years ago. The MSP-1 sequences support that P. ovale curtisi and P. ovale wallikeri are distinct species. Both species are sympatric in Thailand. The low level of sequence diversity in PocMSP-1 and PowMSP-1 among Thai isolates could stem from persistent low prevalence of these species, limiting the chance of outcrossing at this locus.

  4. Low Level of Sequence Diversity at Merozoite Surface Protein-1 Locus of Plasmodium ovale curtisi and P. ovale wallikeri from Thai Isolates

    PubMed Central

    Putaporntip, Chaturong; Hughes, Austin L.; Jongwutiwes, Somchai

    2013-01-01

    Background: The merozoite surface protein-1 (MSP-1) is a candidate target for the development of blood stage vaccines against malaria. Polymorphism in MSP-1 can be useful as a genetic marker for strain differentiation in malarial parasites. Although sequence diversity in the MSP-1 locus has been extensively analyzed in field isolates of Plasmodium falciparum and P. vivax, the extent of variation in its homologues in P. ovale curtisi and P. ovale wallikeri, remains unknown. Methodology/Principal Findings: Analysis of the mitochondrial cytochrome b sequences of 10 P. ovale isolates from symptomatic malaria patients from diverse endemic areas of Thailand revealed co-existence of P. ovale curtisi (n = 5) and P. ovale wallikeri (n = 5). Direct sequencing of the PCR-amplified products encompassing the entire coding region of MSP-1 of P. ovale curtisi (PocMSP-1) and P. ovale wallikeri (PowMSP-1) has identified 3 imperfect repeated segments in the former and one in the latter. Most amino acid differences between these proteins were located in the interspecies variable domains of malarial MSP-1. Synonymous nucleotide diversity (πS) exceeded nonsynonymous nucleotide diversity (πN) for both PocMSP-1 and PowMSP-1, albeit at a non-significant level. However, when MSP-1 of both these species was considered together, πS was significantly greater than πN (p<0.0001), suggesting that purifying selection has shaped diversity at this locus prior to speciation. Phylogenetic analysis based on conserved domains has placed PocMSP-1 and PowMSP-1 in a distinct bifurcating branch that probably diverged from each other around 4.5 million years ago. Conclusion/Significance: The MSP-1 sequences support that P. ovale curtisi and P. ovale wallikeri are distinct species. Both species are sympatric in Thailand. The low level of sequence diversity in PocMSP-1 and PowMSP-1 among Thai isolates could stem from persistent low prevalence of these species, limiting the chance of outcrossing at

  5. [Study on Microbial Diversity of Peri-implantitis Subgingival by High-throughput Sequencing].

    PubMed

    Li, Zhi-jie; Wang, Shao-guo; Li, Yue-hong; Tu, Dong-xiang; Liu, Shi-yun; Nie, Hong-bing; Li, Zhi-qiang; Zhang, Ju-mei

    2015-07-01

    This study used high-throughput sequencing to characterize the microbial diversity of subgingival plaque in peri-implantitis and to investigate the microbiological etiology of the disease. Subgingival plaques were sampled from patients with peri-implantitis (D group) and from subjects without peri-implantitis (N group). The microbial diversity of the plaques was assessed by sequencing the V4 region of 16S rRNA on the Illumina MiSeq platform, and the diversity of the community structure was analyzed using Mothur software. A total of 156,507 gene sequences were detected in nine samples, and 4,402 operational taxonomic units (OTUs) were found. Selenomonas, Pseudomonas, and Fusobacterium were the dominant bacteria in the D group, while Fusobacterium, Veillonella and Streptococcus were the dominant bacteria in the N group. Differences between peri-implantitis and non-peri-implantitis bacterial communities were observed at all phylogenetic levels by LEfSe and were also found in the PCoA analysis. The occurrence of peri-implantitis is related not only to periodontitis-associated pathogenic microbes but also to changes in oral microbial community structure. Treponema, Herbaspirillum, Butyricimonas and Phaeobacte may be closely related to the occurrence and development of peri-implantitis.

  6. Novel chytrid lineages dominate fungal sequences in diverse marine and freshwater habitats

    NASA Astrophysics Data System (ADS)

    Comeau, André M.; Vincent, Warwick F.; Bernier, Louis; Lovejoy, Connie

    2016-07-01

    In aquatic environments, fungal communities remain little studied despite their taxonomic and functional diversity. To extend the ecological coverage of this group, we conducted an in-depth analysis of fungal sequences within our collection of 3.6 million V4 18S rRNA pyrosequences originating from 319 individual marine (including sea-ice) and freshwater samples from libraries generated within diverse projects studying Arctic and temperate biomes in the past decade. Among the ~1.7 million post-filtered reads of highest taxonomic and phylogenetic quality, 23,263 fungal sequences were identified. The overall mean proportion was 1.35%, but with large variability; for example, from 0.01 to 59% of total sequences for Arctic seawater samples. Almost all sample types were dominated by Chytridiomycota-like sequences, followed by moderate-to-minor contributions of Ascomycota, Cryptomycota and Basidiomycota. Species and/or strain richness was high, with many novel sequences and high niche separation. The affinity of the most common reads to phytoplankton parasites suggests that aquatic fungi deserve renewed attention for their role in algal succession and carbon cycling.

  7. NEP: web server for epitope prediction based on antibody neutralization of viral strains with diverse sequences.

    PubMed

    Chuang, Gwo-Yu; Liou, David; Kwong, Peter D; Georgiev, Ivelin S

    2014-07-01

    Delineation of the antigenic site, or epitope, recognized by an antibody can provide clues about functional vulnerabilities and resistance mechanisms, and can therefore guide antibody optimization and epitope-based vaccine design. Previously, we developed an algorithm for antibody-epitope prediction based on antibody neutralization of viral strains with diverse sequences and validated the algorithm on a set of broadly neutralizing HIV-1 antibodies. Here we describe the implementation of this algorithm, NEP (Neutralization-based Epitope Prediction), as a web-based server. The users must supply as input: (i) an alignment of antigen sequences of diverse viral strains; (ii) neutralization data for the antibody of interest against the same set of antigen sequences; and (iii) (optional) a structure of the unbound antigen, for enhanced prediction accuracy. The prediction results can be downloaded or viewed interactively on the antigen structure (if supplied) from the web browser using a JSmol applet. Since neutralization experiments are typically performed as one of the first steps in the characterization of an antibody to determine its breadth and potency, the NEP server can be used to predict antibody-epitope information at no additional experimental costs. NEP can be accessed on the internet at http://exon.niaid.nih.gov/nep. Published by Oxford University Press on behalf of Nucleic Acids Research 2014. This work is written by (a) US Government employee(s) and is in the public domain in the US.

  8. Code-Time Diversity for Direct Sequence Spread Spectrum Systems

    PubMed Central

    Hassan, A. Y.

    2014-01-01

    Time diversity is achieved in direct sequence spread spectrum by receiving different faded delayed copies of the transmitted symbols from different uncorrelated channel paths when the transmission signal bandwidth is greater than the coherence bandwidth of the channel. In this paper, a new time diversity scheme is proposed for spread spectrum systems. It is called code-time diversity. In this new scheme, N spreading codes are used to transmit one data symbol over N successive symbol intervals. The diversity order of the proposed scheme equals the number of spreading codes used, N, multiplied by the number of uncorrelated channel paths, L. The paper presents the transmitted signal model. Two demodulator structures are proposed based on the received signal models for Rayleigh flat and frequency-selective fading channels. The probability of error of the proposed diversity scheme is also calculated for the same two fading channels. Finally, simulation results are presented and compared with those of maximal ratio combiner (MRC) and multiple-input and multiple-output (MIMO) systems. PMID:24982925

  9. Detection and isolation of nucleic acid sequences using competitive hybridization probes

    DOEpatents

    Lucas, Joe N.; Straume, Tore; Bogen, Kenneth T.

    1997-01-01

    A method for detecting a target nucleic acid sequence in a sample is provided using hybridization probes which competitively hybridize to a target nucleic acid. According to the method, a target nucleic acid sequence is hybridized to first and second hybridization probes which are complementary to overlapping portions of the target nucleic acid sequence, the first hybridization probe including a first complexing agent capable of forming a binding pair with a second complexing agent and the second hybridization probe including a detectable marker. The first complexing agent attached to the first hybridization probe is contacted with a second complexing agent, the second complexing agent being attached to a solid support such that when the first and second complexing agents are attached, target nucleic acid sequences hybridized to the first hybridization probe become immobilized on to the solid support. The immobilized target nucleic acids are then separated and detected by detecting the detectable marker attached to the second hybridization probe. A kit for performing the method is also provided.

  10. Detection and isolation of nucleic acid sequences using competitive hybridization probes

    DOEpatents

    Lucas, J.N.; Straume, T.; Bogen, K.T.

    1997-04-01

    A method for detecting a target nucleic acid sequence in a sample is provided using hybridization probes which competitively hybridize to a target nucleic acid. According to the method, a target nucleic acid sequence is hybridized to first and second hybridization probes which are complementary to overlapping portions of the target nucleic acid sequence, the first hybridization probe including a first complexing agent capable of forming a binding pair with a second complexing agent and the second hybridization probe including a detectable marker. The first complexing agent attached to the first hybridization probe is contacted with a second complexing agent, the second complexing agent being attached to a solid support such that when the first and second complexing agents are attached, target nucleic acid sequences hybridized to the first hybridization probe become immobilized on to the solid support. The immobilized target nucleic acids are then separated and detected by detecting the detectable marker attached to the second hybridization probe. A kit for performing the method is also provided. 7 figs.

  11. Multi-locus and long amplicon sequencing approach to study microbial diversity at species level using the MinION™ portable nanopore sequencer

    PubMed Central

    Sanz, Yolanda

    2017-01-01

    The miniaturized and portable DNA sequencer MinION™ has demonstrated great potential in different analyses such as genome-wide sequencing, pathogen outbreak detection and surveillance, human genome variability, and microbial diversity. In this study, we tested the ability of the MinION™ platform to perform long amplicon sequencing in order to design new approaches to study microbial diversity using a multi-locus approach. After compiling a robust database by parsing and extracting the rrn bacterial region from more than 67000 complete or draft bacterial genomes, we demonstrated that the data obtained during sequencing of the long amplicon in the MinION™ device using R9 and R9.4 chemistries were sufficient to study 2 mock microbial communities in a multiplex manner and to almost completely reconstruct the microbial diversity contained in the HM782D and D6305 mock communities. Although nanopore-based sequencing produces reads with lower per-base accuracy compared with other platforms, we presented a novel approach consisting of multi-locus and long amplicon sequencing using the MinION™ MkIb DNA sequencer and R9 and R9.4 chemistries that helps to overcome the main disadvantage of this portable sequencing platform. Furthermore, the nanopore sequencing library, constructed with the latest releases of pore chemistry (R9.4) and sequencing kit (SQK-LSK108), permitted the retrieval of a higher level of 1D read accuracy, sufficient to characterize the microbial species present in each mock community analysed. Improvements in nanopore chemistry, such as minimizing base-calling errors and new library protocols able to produce rapid 1D libraries, will provide more reliable information in the near future. Such data will be useful for more comprehensive and faster specific detection of microbial species and strains in complex ecosystems. PMID:28605506

  12. Amino acid sequence of the human fibronectin receptor

    PubMed Central

    1987-01-01

    The amino acid sequence deduced from cDNA of the human placental fibronectin receptor is reported. The receptor is composed of two subunits: an alpha subunit of 1,008 amino acids which is processed into two polypeptides disulfide bonded to one another, and a beta subunit of 778 amino acids. Each subunit has near its COOH terminus a hydrophobic segment. This and other sequence features suggest a structure for the receptor in which the hydrophobic segments serve as transmembrane domains anchoring each subunit to the membrane and dividing each into a large ectodomain and a short cytoplasmic domain. The alpha subunit ectodomain has five sequence elements homologous to consensus Ca2+-binding sites of several calcium-binding proteins, and the beta subunit contains a fourfold repeat strikingly rich in cysteine. The alpha subunit sequence is 46% homologous to the alpha subunit of the vitronectin receptor. The beta subunit is 44% homologous to the human platelet adhesion receptor subunit IIIa and 47% homologous to a leukocyte adhesion receptor beta subunit. The high degree of homology (85%) of the beta subunit with one of the polypeptides of a chicken adhesion receptor complex referred to as integrin complex strongly suggests that the latter polypeptide is the chicken homologue of the fibronectin receptor beta subunit. These receptor subunit homologies define a superfamily of adhesion receptors. The availability of the entire protein sequence for the fibronectin receptor will facilitate studies on the functions of these receptors. PMID:2958481

  13. Phenolic acid esterases, coding sequences and methods

    DOEpatents

    Blum, David L.; Kataeva, Irina; Li, Xin-Liang; Ljungdahl, Lars G.

    2002-01-01

    Described herein are four phenolic acid esterases, three of which correspond to domains of previously unknown function within bacterial xylanases, from XynY and XynZ of Clostridium thermocellum and from a xylanase of Ruminococcus. The fourth specifically exemplified xylanase is a protein encoded within the genome of Orpinomyces PC-2. The amino acids of these polypeptides and nucleotide sequences encoding them are provided. Recombinant host cells, expression vectors and methods for the recombinant production of phenolic acid esterases are also provided.

  14. Identification of random nucleic acid sequence aberrations using dual capture probes which hybridize to different chromosome regions

    DOEpatents

    Lucas, Joe N.; Straume, Tore; Bogen, Kenneth T.

    1998-01-01

    A method is provided for detecting nucleic acid sequence aberrations using two immobilization steps. According to the method, a nucleic acid sequence aberration is detected by detecting nucleic acid sequences having both a first nucleic acid sequence type (e.g., from a first chromosome) and a second nucleic acid sequence type (e.g., from a second chromosome), the presence of the first and the second nucleic acid sequence type on the same nucleic acid sequence indicating the presence of a nucleic acid sequence aberration. In the method, immobilization of a first hybridization probe is used to isolate a first set of nucleic acids in the sample which contain the first nucleic acid sequence type. Immobilization of a second hybridization probe is then used to isolate a second set of nucleic acids from within the first set of nucleic acids which contain the second nucleic acid sequence type. The second set of nucleic acids are then detected, their presence indicating the presence of a nucleic acid sequence aberration.

  15. Necessary Sequencing Depth and Clustering Method to Obtain Relatively Stable Diversity Patterns in Studying Fish Gut Microbiota.

    PubMed

    Xiao, Fanshu; Yu, Yuhe; Li, Jinjin; Juneau, Philippe; Yan, Qingyun

    2018-05-25

    The 16S rRNA gene has been one of the most commonly used molecular markers for estimating bacterial diversity during the past decades. However, there is no consistency in sequencing depth (from thousands to millions of sequences per sample), and the clustering methods used to generate OTUs also differ among studies. These inconsistencies make effective comparisons among studies difficult or unreliable. This study aims to examine the sequencing depth and clustering method needed to ensure stable diversity patterns when studying fish gut microbiota. A dataset of 42 gut microbiota samples from Siniperca chuatsi (a carnivorous fish) was used to test how sequencing depth and clustering may affect the alpha and beta diversity patterns of fish intestinal microbiota. Interestingly, we found that the sequencing depth (resampling 1000-11,000 sequences per sample) and the clustering methods (UPARSE and UCLUST) did not bias the estimates of the diversity patterns during fish development from larva to adult. Although we should acknowledge that a suitable sequencing depth may differ case by case, our finding indicates that shallow sequencing, such as 1000 sequences per sample, may also be enough to reflect the general diversity patterns of fish gut microbiota. However, we have shown in the present study that strict pre-processing of the original sequences is required to ensure reliable results. This study provides evidence to help make a sound scientific choice of sequencing depth and clustering method for future studies on fish gut microbiota patterns, while reducing as much as possible the costs related to the analysis.
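
    Resampling every sample to a common depth, as described in the abstract above, can be sketched as follows. This is a generic illustration of subsampling reads without replacement from a vector of OTU counts; it is not the authors' pipeline, and the function name `rarefy` and the example counts are hypothetical.

```python
import random


def rarefy(counts, depth, seed=None):
    """Subsample `depth` reads without replacement from per-OTU counts.

    counts: number of reads observed for each OTU in one sample.
    depth:  number of reads to keep (must not exceed the sample total).
    Returns a list of the same length holding the subsampled counts.
    """
    total = sum(counts)
    if depth > total:
        raise ValueError("depth exceeds the number of reads in the sample")
    rng = random.Random(seed)
    # Expand to one OTU label per read, then draw without replacement.
    reads = [otu for otu, n in enumerate(counts) for _ in range(n)]
    picked = rng.sample(reads, depth)
    out = [0] * len(counts)
    for otu in picked:
        out[otu] += 1
    return out


if __name__ == "__main__":
    sample = [5000, 3000, 1500, 400, 100]  # made-up OTU counts
    for depth in (1000, 5000):
        print(depth, rarefy(sample, depth, seed=1))
```

    Diversity indices computed on the subsampled counts at each depth can then be compared to see whether the overall pattern stays stable.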

  16. Identification of random nucleic acid sequence aberrations using dual capture probes which hybridize to different chromosome regions

    DOEpatents

    Lucas, J.N.; Straume, T.; Bogen, K.T.

    1998-03-24

    A method is provided for detecting nucleic acid sequence aberrations using two immobilization steps. According to the method, a nucleic acid sequence aberration is detected by detecting nucleic acid sequences having both a first nucleic acid sequence type (e.g., from a first chromosome) and a second nucleic acid sequence type (e.g., from a second chromosome), the presence of the first and the second nucleic acid sequence type on the same nucleic acid sequence indicating the presence of a nucleic acid sequence aberration. In the method, immobilization of a first hybridization probe is used to isolate a first set of nucleic acids in the sample which contain the first nucleic acid sequence type. Immobilization of a second hybridization probe is then used to isolate a second set of nucleic acids from within the first set of nucleic acids which contain the second nucleic acid sequence type. The second set of nucleic acids are then detected, their presence indicating the presence of a nucleic acid sequence aberration. 14 figs.

  17. Nucleic and Amino Acid Sequences Support Structure-Based Viral Classification.

    PubMed

    Sinclair, Robert M; Ravantti, Janne J; Bamford, Dennis H

    2017-04-15

    Viral capsids ensure viral genome integrity by protecting the enclosed nucleic acids. Interactions between the genome and capsid and between individual capsid proteins (i.e., capsid architecture) are intimate and are expected to be characterized by strong evolutionary conservation. For this reason, a capsid structure-based viral classification has been proposed as a way to bring order to the viral universe. The seeming lack of sufficient sequence similarity to reproduce this classification has made it difficult to reject structural convergence as the basis for the classification. We reinvestigate whether the structure-based classification for viral coat proteins making icosahedral virus capsids is in fact supported by previously undetected sequence similarity. Since codon choices can influence nascent protein folding cotranslationally, we searched for both amino acid and nucleotide sequence similarity. To demonstrate the sensitivity of the approach, we identify a candidate gene for the pandoravirus capsid protein. We show that the structure-based classification is strongly supported by amino acid and also nucleotide sequence similarities, suggesting that the similarities are due to common descent. The correspondence between structure-based and sequence-based analyses of the same proteins shown here allows them to be used in future analyses of the relationship between linear sequence information and macromolecular function, as well as between linear sequence and protein folds. IMPORTANCE: Viral capsids protect nucleic acid genomes, which in turn encode capsid proteins. This tight coupling of protein shell and nucleic acids, together with strong functional constraints on capsid protein folding and architecture, leads to the hypothesis that capsid protein-coding nucleotide sequences may retain signatures of ancient viral evolution. We have been able to show that this is indeed the case, using the major capsid proteins of viruses forming icosahedral capsids.

  18. Nucleic and Amino Acid Sequences Support Structure-Based Viral Classification

    PubMed Central

    Sinclair, Robert M.; Ravantti, Janne J.

    2017-01-01

    Viral capsids ensure viral genome integrity by protecting the enclosed nucleic acids. Interactions between the genome and capsid and between individual capsid proteins (i.e., capsid architecture) are intimate and are expected to be characterized by strong evolutionary conservation. For this reason, a capsid structure-based viral classification has been proposed as a way to bring order to the viral universe. The seeming lack of sufficient sequence similarity to reproduce this classification has made it difficult to reject structural convergence as the basis for the classification. We reinvestigate whether the structure-based classification for viral coat proteins making icosahedral virus capsids is in fact supported by previously undetected sequence similarity. Since codon choices can influence nascent protein folding cotranslationally, we searched for both amino acid and nucleotide sequence similarity. To demonstrate the sensitivity of the approach, we identify a candidate gene for the pandoravirus capsid protein. We show that the structure-based classification is strongly supported by amino acid and also nucleotide sequence similarities, suggesting that the similarities are due to common descent. The correspondence between structure-based and sequence-based analyses of the same proteins shown here allows them to be used in future analyses of the relationship between linear sequence information and macromolecular function, as well as between linear sequence and protein folds. IMPORTANCE: Viral capsids protect nucleic acid genomes, which in turn encode capsid proteins. This tight coupling of protein shell and nucleic acids, together with strong functional constraints on capsid protein folding and architecture, leads to the hypothesis that capsid protein-coding nucleotide sequences may retain signatures of ancient viral evolution. We have been able to show that this is indeed the case, using the major capsid proteins of viruses forming icosahedral capsids

  19. Penicillium arizonense, a new, genome sequenced fungal species, reveals a high chemical diversity in secreted metabolites.

    PubMed

    Grijseels, Sietske; Nielsen, Jens Christian; Randelovic, Milica; Nielsen, Jens; Nielsen, Kristian Fog; Workman, Mhairi; Frisvad, Jens Christian

    2016-10-14

    A new soil-borne species belonging to the Penicillium section Canescentia is described, Penicillium arizonense sp. nov. (type strain CBS 141311T = IBT 12289T). The genome was sequenced and assembled into 33.7 Mb containing 12,502 predicted genes. A phylogenetic assessment based on marker genes confirmed the grouping of P. arizonense within section Canescentia. Compared to related species, P. arizonense proved to encode a high number of proteins involved in carbohydrate metabolism, in particular hemicellulases. Mining the genome for genes involved in secondary metabolite biosynthesis resulted in the identification of 62 putative biosynthetic gene clusters. Extracts of P. arizonense were analysed for secondary metabolites, and austalides, pyripyropenes, tryptoquivalines, fumagillin, pseurotin A, curvulinic acid and xanthoepocin were detected. A comparative analysis against known pathways enabled the proposal of biosynthetic gene clusters in P. arizonense responsible for the synthesis of all detected compounds except curvulinic acid. The capacity to produce biomass-degrading enzymes and the identification of a high chemical diversity in secreted bioactive secondary metabolites offer a broad range of potential industrial applications for the new species P. arizonense. The description and availability of the genome sequence of P. arizonense further provide the basis for biotechnological exploitation of this species.

  20. Penicillium arizonense, a new, genome sequenced fungal species, reveals a high chemical diversity in secreted metabolites

    PubMed Central

    Grijseels, Sietske; Nielsen, Jens Christian; Randelovic, Milica; Nielsen, Jens; Nielsen, Kristian Fog; Workman, Mhairi; Frisvad, Jens Christian

    2016-01-01

    A new soil-borne species belonging to the Penicillium section Canescentia is described, Penicillium arizonense sp. nov. (type strain CBS 141311T = IBT 12289T). The genome was sequenced and assembled into 33.7 Mb containing 12,502 predicted genes. A phylogenetic assessment based on marker genes confirmed the grouping of P. arizonense within section Canescentia. Compared to related species, P. arizonense proved to encode a high number of proteins involved in carbohydrate metabolism, in particular hemicellulases. Mining the genome for genes involved in secondary metabolite biosynthesis resulted in the identification of 62 putative biosynthetic gene clusters. Extracts of P. arizonense were analysed for secondary metabolites and austalides, pyripyropenes, tryptoquivalines, fumagillin, pseurotin A, curvulinic acid and xanthoepocin were detected. A comparative analysis against known pathways enabled the proposal of biosynthetic gene clusters in P. arizonense responsible for the synthesis of all detected compounds except curvulinic acid. The capacity to produce biomass degrading enzymes and the identification of a high chemical diversity in secreted bioactive secondary metabolites offer a broad range of potential industrial applications for the new species P. arizonense. The description and availability of the genome sequence of P. arizonense further provide the basis for biotechnological exploitation of this species. PMID:27739446

  1. Soil Parameters Drive the Structure, Diversity and Metabolic Potentials of the Bacterial Communities Across Temperate Beech Forest Soil Sequences.

    PubMed

    Jeanbille, M; Buée, M; Bach, C; Cébron, A; Frey-Klett, P; Turpault, M P; Uroz, S

    2016-02-01

    Soil and climatic conditions as well as land cover and land management have been shown to strongly impact the structure and diversity of the soil bacterial communities. Here, we addressed, under the same land cover, the potential effect of the edaphic parameters on the soil bacterial communities, excluding potential confounding factors such as climate. To do this, we characterized two natural soil sequences occurring in the Montiers experimental site. Spatially distant soil samples were collected below Fagus sylvatica tree stands to assess the effect of soil sequences on the edaphic parameters, as well as the structure and diversity of the bacterial communities. Soil analyses revealed that the two soil sequences were characterized by higher pH and calcium and magnesium contents in the lower plots. Metabolic assays based on Biolog Ecoplates highlighted higher intensity and richness in usable carbon substrates in the lower plots than in the middle and upper plots, although no significant differences occurred in the abundance of bacterial and fungal communities along the soil sequences as assessed using quantitative PCR. Pyrosequencing analysis of 16S ribosomal RNA (rRNA) gene amplicons revealed that Proteobacteria, Acidobacteria and Bacteroidetes were the most abundantly represented phyla. Acidobacteria, Proteobacteria and Chlamydiae were significantly enriched in the most acidic and nutrient-poor soils compared to the Bacteroidetes, which were significantly enriched in the soils presenting the higher pH and nutrient contents. Interestingly, aluminium, nitrogen, calcium, nutrient availability and pH appeared to be the best predictors of the bacterial community structures along the soil sequences.

  2. Exploitation of the diverse insertion sequence element content of dairy Lactobacillus helveticus starters as a rapid method to identify different strains.

    PubMed

    Kaleta, Pawel; Callanan, Michael J; O'Callaghan, John; Fitzgerald, Gerald F; Beresford, Thomas P; Ross, R Paul

    2009-10-01

    The species Lactobacillus helveticus is a commonly used thermophilic starter and/or adjunct culture for Swiss and Cheddar cheese manufacture. Its use is normally associated with flavour improvement, which is known to be linked to culture traits such as rapid autolysis and high proteolytic activity. The genome of the commercial strain, DPC4571, was recently sequenced and found to be rich in IS sequences in terms of both abundance (213 intact) and diversity (21 types). Given this unique diversity for a lactic acid bacterium, we investigated whether PCR-based IS fingerprinting could be used as a discriminatory tool to distinguish between different strains of Lb. helveticus. A set of ten primers targeting five of the most numerous groups (ISL1201, ISLhe65, ISLhe2, ISLhe15 and ISL2) of IS elements was designed. Multiplex-PCR with all primers resulted in 1-12 discrete amplicons for each strain tested. The resultant fingerprints (in the 0.5 kb-3 kb range) were found to be strain specific and reproducible. This approach thus provides a valuable method to distinguish between Lb. helveticus strains while giving some indication of the relative abundance of IS sequences in each strain.

  3. Sequence diversity and molecular evolutionary rates between buffalo and cattle.

    PubMed

    Moaeen-ud-Din, M; Bilal, G

    2015-02-01

    Identification of genes important for production traits in buffalo is impaired by a paucity of genomic resources. One option for filling this gap is to exploit the data available for cattle. The cross-species application of comparative genomics tools is a potential means of investigating the buffalo genome. However, this depends on nucleotide sequence similarity. In this study, gene diversity between buffalo and cattle was determined using 86 gene orthologues. There was approximately 3% difference in all genes in terms of nucleotide diversity and 0.267 ± 0.134 in amino acids, indicating the possibility of successfully using cross-species strategies for genomic studies. There were significantly higher non-synonymous substitutions in both cattle and buffalo; however, the difference in terms of dN-dS was similar (4.414 versus 4.745) in buffalo and cattle, respectively. The higher rate of non-synonymous substitutions at a similar level in buffalo and cattle indicated a similar positive selection pressure. Results of the relative rate test were assessed with the chi-squared test. There was no significant difference in unique mutations between cattle and buffalo lineages at synonymous sites. However, there was a significant difference in unique mutations at non-synonymous sites, indicating an ongoing mutagenic process that generates substitutional mutation at approximately the same rate at silent sites. Moreover, despite common ancestry, our results indicate different divergence times among genes of cattle and buffalo. This is the first demonstration that variable rates of molecular evolution may be present within the family Bovidae. © 2014 Blackwell Verlag GmbH.

  4. Oral treponeme major surface protein: Sequence diversity and distributions within periodontal niches.

    PubMed

    You, M; Chan, Y; Lacap-Bugler, D C; Huo, Y-B; Gao, W; Leung, W K; Watt, R M

    2017-12-01

    Treponema denticola and other species (phylotypes) of oral spirochetes are widely considered to play important etiological roles in periodontitis and other oral infections. The major surface protein (Msp) of T. denticola is directly implicated in several pathological mechanisms. Here, we have analyzed msp sequence diversity across 68 strains of oral phylogroup 1 and 2 treponemes; including reference strains of T. denticola, Treponema putidum, Treponema medium, 'Treponema vincentii', and 'Treponema sinensis'. All encoded Msp proteins contained highly conserved, taxon-specific signal peptides, and shared a predicted 'three-domain' structure. A clone-based strategy employing 'msp-specific' polymerase chain reaction primers was used to analyze msp gene sequence diversity present in subgingival plaque samples collected from a group of individuals with chronic periodontitis (n=10), vs periodontitis-free controls (n=10). We obtained 626 clinical msp gene sequences, which were assigned to 21 distinct 'clinical msp genotypes' (95% sequence identity cut-off). The most frequently detected clinical msp genotype corresponded to T. denticola ATCC 35405T, but this was not correlated to disease status. UniFrac and libshuff analysis revealed that individuals with periodontitis and periodontitis-free controls harbored significantly different communities of treponeme clinical msp genotypes (P<.001). Patients with periodontitis had higher levels of clinical msp genotype diversity than periodontitis-free controls (Mann-Whitney U-test, P<.05). The relative proportions of 'T. vincentii' clinical msp genotypes were significantly higher in the control group than in the periodontitis group (P=.018). In conclusion, our data clearly show that both healthy and diseased individuals commonly harbor a wide diversity of Treponema clinical msp genotypes within their subgingival niches. © 2017 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.

  5. Microbial diversity at the moderate acidic stage in three different sulfidic mine tailings dumps generating acid mine drainage.

    PubMed

    Korehi, Hananeh; Blöthe, Marco; Schippers, Axel

    2014-11-01

    In freshly deposited sulfidic mine tailings the pH is alkaline or circumneutral. Due to pyrite or pyrrhotite oxidation the pH is dropping over time to pH values <3 at which acidophilic iron- and sulfur-oxidizing prokaryotes prevail and accelerate the oxidation processes, well described for several mine waste sites. The microbial communities at the moderate acidic stage in mine tailings are only scarcely studied. Here we investigated the microbial diversity via 16S rRNA gene sequence analysis in eight samples (pH range 3.2-6.5) from three different sulfidic mine tailings dumps in Botswana, Germany and Sweden. In total 701 partial 16S rRNA gene sequences revealed a divergent microbial community between the three sites and at different tailings depths. Proteobacteria and Firmicutes were overall the most abundant phyla in the clone libraries. Acidobacteria, Actinobacteria, Bacteroidetes, and Nitrospira occurred less frequently. The found microbial communities were completely different to microbial communities in tailings at

  6. Diversity of thermophilic fungi in Tengchong Rehai National Park revealed by ITS nucleotide sequence analyses.

    PubMed

    Pan, Wen-Zheng; Huang, Xiao-Wei; Wei, Kang-Bi; Zhang, Chun-Mei; Yang, Dong-Mei; Ding, Jun-Mei; Zhang, Ke-Qin

    2010-04-01

    The geothermal sites near neutral and alkalescent thermal springs in Tengchong Rehai National Park were examined through a cultivation-dependent approach to determine the diversity of thermophilic fungi in these environments. Here, we collected soil samples in this area, plated them on agar media conducive to fungal growth, obtained pure cultures, and then employed internal transcribed spacer (ITS) sequencing combined with morphological analysis to identify thermophilic fungi to the species level. In total, 102 strains were isolated and identified as Rhizomucor miehei, Chaetomium sp., Talaromyces thermophilus, Talaromyces byssochlamydoides, Thermoascus aurantiacus Miehe var. levisporus, Thermomyces lanuginosus, Scytalidium thermophilum, Malbranchea flava, Myceliophthora sp. 1, Myceliophthora sp. 2, Myceliophthora sp. 3, and Coprinopsis sp. Two species, T. lanuginosus and S. thermophilum, were the dominant species, representing 34.78% and 28.26% of the sample, respectively. Our results indicated a greater diversity of thermophilic fungi in neutral and alkaline geothermal sites than in the acidic sites around hot springs reported in previous studies. Most of our strains thrived under alkaline growth conditions.

  7. Diverse bacterial PKS sequences derived from okadaic acid-producing dinoflagellates.

    PubMed

    Perez, Roberto; Liu, Li; Lopez, Jose; An, Tianying; Rein, Kathleen S

    2008-05-22

    Okadaic acid (OA) and the related dinophysistoxins are isolated from dinoflagellates of the genera Prorocentrum and Dinophysis. Bacteria of the Roseobacter group have been associated with okadaic acid producing dinoflagellates and have been previously implicated in OA production. Analysis of 16S rRNA libraries reveals that Roseobacter are the most abundant bacteria associated with OA producing dinoflagellates of the genus Prorocentrum and are not found in association with non-toxic dinoflagellates. While some polyketide synthase (PKS) genes form a highly supported Prorocentrum clade, most appear to be bacterial, but unrelated to Roseobacter or Alpha-Proteobacterial PKSs or those derived from other Alveolates Karenia brevis or Cryptosporidium parvum.

  8. Diverse Bacterial PKS Sequences Derived From Okadaic Acid-Producing Dinoflagellates

    PubMed Central

    Perez, Roberto; Liu, Li; Lopez, Jose; An, Tianying; Rein, Kathleen S.

    2008-01-01

    Okadaic acid (OA) and the related dinophysistoxins are isolated from dinoflagellates of the genera Prorocentrum and Dinophysis. Bacteria of the Roseobacter group have been associated with okadaic acid producing dinoflagellates and have been previously implicated in OA production. Analysis of 16S rRNA libraries reveals that Roseobacter are the most abundant bacteria associated with OA producing dinoflagellates of the genus Prorocentrum and are not found in association with non-toxic dinoflagellates. While some polyketide synthase (PKS) genes form a highly supported Prorocentrum clade, most appear to be bacterial, but unrelated to Roseobacter or Alpha-Proteobacterial PKSs or those derived from other Alveolates Karenia brevis or Cryptosporidium parvum. PMID:18728765

  9. Complete amino acid sequence of bovine colostrum low-Mr cysteine proteinase inhibitor.

    PubMed

    Hirado, M; Tsunasawa, S; Sakiyama, F; Niinobe, M; Fujii, S

    1985-07-01

    The complete amino acid sequence of bovine colostrum cysteine proteinase inhibitor was determined by sequencing native inhibitor and peptides obtained by cyanogen bromide degradation, Achromobacter lysylendopeptidase digestion and partial acid hydrolysis of reduced and S-carboxymethylated protein. Achromobacter peptidase digestion was successfully used to isolate two disulfide-containing peptides. The inhibitor consists of 112 amino acids with an Mr of 12787. Two disulfide bonds were established between Cys 66 and Cys 77 and between Cys 90 and Cys 110. A high degree of homology in the sequence was found between the colostrum inhibitor and human gamma-trace, human salivary acidic protein and chicken egg-white cystatin.

  10. Fatty Acid Profile and Unigene-Derived Simple Sequence Repeat Markers in Tung Tree (Vernicia fordii)

    PubMed Central

    Zhang, Lin; Jia, Baoguang; Tan, Xiaofeng; Thammina, Chandra S.; Long, Hongxu; Liu, Min; Wen, Shanna; Song, Xianliang; Cao, Heping

    2014-01-01

    Tung tree (Vernicia fordii) provides the sole source of tung oil widely used in industry. Lack of fatty acid composition data and molecular markers hinders biochemical, genetic and breeding research. The objectives of this study were to determine fatty acid profiles and develop unigene-derived simple sequence repeat (SSR) markers in tung tree. Fatty acid profiles of 41 accessions showed that the ratio of α-eleostearic acid increased continuously, in parallel with the amount of tung oil accumulation, while the ratios of other fatty acids decreased at different stages of the seeds, and that α-eleostearic acid (18:3) accounted for 77% of the total fatty acids in tung oil. Transcriptome sequencing identified 81,805 unigenes from a tung cDNA library constructed using seed mRNA and discovered 6,366 SSRs in 5,404 unigenes. The di- and tri-nucleotide microsatellites accounted for 92% of the SSRs, with AG/CT and AAG/CTT being the most abundant SSR motifs. Fifteen polymorphic genic-SSR markers were developed from 98 unigene loci tested in 41 cultivated tung accessions by agarose gel and capillary electrophoresis. A GenBank database search identified 10 of them as putatively coding for functional proteins. Quantitative PCR demonstrated that all 15 polymorphic SSR-associated unigenes were expressed in tung seeds and some of them were highly correlated with oil composition in the seeds. A dendrogram revealed that most of the 41 accessions were clustered according to geographic region. These new polymorphic genic-SSR markers will facilitate future studies on genetic diversity, molecular fingerprinting, comparative genomics and genetic mapping in tung tree. The lipid profiles in the seeds of 41 tung accessions will be valuable for biochemical and breeding studies. PMID:25167054
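
    As an illustration of how di- and tri-nucleotide SSRs such as AG and AAG can be located in a sequence, the following is a minimal Python sketch. It is not the pipeline used in the study: the function name find_ssrs and the minimum repeat counts (6 for di-, 5 for tri-nucleotide motifs) are assumptions chosen for the example.

```python
import re


def find_ssrs(seq, min_repeats=None):
    """Return sorted (start, motif, repeat_count) tuples for di-/tri-nucleotide SSRs.

    min_repeats maps motif length to the minimum number of tandem copies.
    The defaults are illustrative assumptions, not values from the study.
    """
    if min_repeats is None:
        min_repeats = {2: 6, 3: 5}
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in min_repeats.items():
        # Outer group = whole repeat tract, inner group (\2) = the repeated motif.
        pattern = re.compile(r"(([ACGT]{%d})\2{%d,})" % (unit_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(2)
            if len(set(motif)) == 1:
                # Skip homopolymer runs such as AAAAAA, which are not di/tri SSRs.
                continue
            hits.append((match.start(), motif, len(match.group(1)) // unit_len))
    return sorted(hits)


example = "TTC" + "AG" * 7 + "CCGT" + "AAG" * 5 + "TC"
ssrs = find_ssrs(example)
```

    Here ssrs lists one AG tract and one AAG tract with their 0-based start positions; stricter or looser thresholds can be passed through min_repeats.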

  11. Sequence diversity of NanA manifests in distinct enzyme kinetics and inhibitor susceptibility

    NASA Astrophysics Data System (ADS)

    Xu, Zhongli; von Grafenstein, Susanne; Walther, Elisabeth; Fuchs, Julian E.; Liedl, Klaus R.; Sauerbrei, Andreas; Schmidtke, Michaela

    2016-04-01

    Streptococcus pneumoniae is the leading pathogen causing bacterial pneumonia and meningitis. Its surface-associated virulence factor neuraminidase A (NanA) promotes the bacterial colonization by removing the terminal sialyl residues from glycoconjugates on eukaryotic cell surface. The predominant role of NanA in the pathogenesis of pneumococci renders it an attractive target for therapeutic intervention. Despite the highly conserved activity of NanA, our alignment of the 11 NanAs revealed the evolutionary diversity of this enzyme. The amino acid substitutions we identified, particularly those in the lectin domain and in the insertion domain next to the catalytic centre triggered our special interest. We synthesised the representative NanAs and the mutagenized derivatives from E. coli for enzyme kinetics study and neuraminidase inhibitor susceptibility test. Via molecular docking we got a deeper insight into the differences between the two major variants of NanA and their influence on the ligand-target interactions. In addition, our molecular dynamics simulations revealed a prominent intrinsic flexibility of the linker between the active site and the insertion domain, which influences the inhibitor binding. Our findings for the first time associated the primary sequence diversity of NanA with the biochemical properties of the enzyme and with the inhibitory efficiency of neuraminidase inhibitors.

  12. Diversity of lactic acid bacteria in suan-tsai and fu-tsai, traditional fermented mustard products of Taiwan.

    PubMed

    Chao, Shiou-Huei; Wu, Ruei-Jie; Watanabe, Koichi; Tsai, Ying-Chieh

    2009-11-15

    Fu-tsai and suan-tsai are spontaneously fermented mustard products traditionally prepared by the Hakka tribe of Taiwan. We chose 5 different processing stages of these products for analysis of the microbial community of lactic acid bacteria (LAB) by 16S rRNA gene sequencing. From 500 LAB isolates we identified 119 representative strains belonging to 5 genera and 18 species, including Enterococcus (1 species), Lactobacillus (11 species), Leuconostoc (3 species), Pediococcus (1 species), and Weissella (2 species). The LAB composition of mustard fermented for 3 days, known as the Mu sample, was the most diverse, with 11 different LAB species being isolated. We used sequence analysis of the 16S rRNA gene to identify the LAB strains and analysis of the dnaA, pheS, and rpoA genes to identify 13 LAB strains for which identification by 16S rRNA gene sequences was not possible. These 13 strains were found to belong to 5 validated known species: Lactobacillus farciminis, Leuconostoc mesenteroides, Leuconostoc pseudomesenteroides, Weissella cibaria, and Weissella paramesenteroides, and 5 possibly novel Lactobacillus species. These results revealed that there is a high level of diversity in LAB at the different stages of fermentation in the production of suan-tsai and fu-tsai.

  13. Influence of geographical origin and flour type on diversity of lactic acid bacteria in traditional Belgian sourdoughs.

    PubMed

    Scheirlinck, Ilse; Van der Meulen, Roel; Van Schoor, Ann; Vancanneyt, Marc; De Vuyst, Luc; Vandamme, Peter; Huys, Geert

    2007-10-01

    A culture-based approach was used to investigate the diversity of lactic acid bacteria (LAB) in Belgian traditional sourdoughs and to assess the influence of flour type, bakery environment, geographical origin, and technological characteristics on the taxonomic composition of these LAB communities. For this purpose, a total of 714 LAB from 21 sourdoughs sampled at 11 artisan bakeries throughout Belgium were subjected to a polyphasic identification approach. The microbial composition of the traditional sourdoughs was characterized by bacteriological culture in combination with genotypic identification methods, including repetitive element sequence-based PCR fingerprinting and phenylalanyl-tRNA synthase (pheS) gene sequence analysis. LAB from Belgian sourdoughs belonged to the genera Lactobacillus, Pediococcus, Leuconostoc, Weissella, and Enterococcus, with the heterofermentative species Lactobacillus paralimentarius, Lactobacillus sanfranciscensis, Lactobacillus plantarum, and Lactobacillus pontis as the most frequently isolated taxa. Statistical analysis of the identification data indicated that the microbial composition of the sourdoughs is mainly affected by the bakery environment rather than the flour type (wheat, rye, spelt, or a mixture of these) used. In conclusion, the polyphasic approach, based on rapid genotypic screening and high-resolution, sequence-dependent identification, proved to be a powerful tool for studying the LAB diversity in traditional fermented foods such as sourdough.

  14. The Diversity Present in 5140 Human Mitochondrial Genomes

    PubMed Central

    Pereira, Luísa; Freitas, Fernando; Fernandes, Verónica; Pereira, Joana B.; Costa, Marta D.; Costa, Stephanie; Máximo, Valdemar; Macaulay, Vincent; Rocha, Ricardo; Samuels, David C.

    2009-01-01

    We analyzed the current status (as of the end of August 2008) of human mitochondrial genomes deposited in GenBank, amounting to 5140 complete or coding-region sequences, in order to present an overall picture of the diversity present in the mitochondrial DNA of the global human population. To perform this task, we developed mtDNA-GeneSyn, a computer tool that identifies and exhaustively classifies the diversity present in large genetic data sets. The diversity observed in the 5140 human mitochondrial genomes was compared with all possible transitions and transversions from the standard human mitochondrial reference genome. This comparison showed that tRNA and rRNA secondary structures have a large effect in limiting the diversity of the human mitochondrial sequences, whereas for the protein-coding genes there is a bias toward less variation at the second codon positions. The analysis of the observed amino acid variations showed a tolerance of variations that convert between the amino acids V, I, A, M, and T. This defines a group of amino acids with similar chemical properties that can interconvert by a single transition. PMID:19426953
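
    The distinction between transitions and transversions that this comparison relies on can be sketched in a few lines of Python. This is only an illustration of the standard definition (purine to purine or pyrimidine to pyrimidine is a transition, anything else a transversion); it is not mtDNA-GeneSyn, and the function names are hypothetical.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(ref, alt):
    """Classify a single-base change as 'transition' or 'transversion'."""
    ref, alt = ref.upper(), alt.upper()
    if ref == alt:
        raise ValueError("ref and alt are identical; no substitution to classify")
    pair = {ref, alt}
    if not pair <= (PURINES | PYRIMIDINES):
        raise ValueError("only A, C, G and T are supported")
    same_class = pair <= PURINES or pair <= PYRIMIDINES
    return "transition" if same_class else "transversion"


def count_substitutions(reference, sample):
    """Count transitions and transversions between two aligned, equal-length sequences."""
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    counts = {"transition": 0, "transversion": 0}
    for r, s in zip(reference.upper(), sample.upper()):
        # Ignore matches, gaps and ambiguity codes.
        if r == s or r not in "ACGT" or s not in "ACGT":
            continue
        counts[classify_substitution(r, s)] += 1
    return counts


counts = count_substitutions("ACGTAC", "GCGAAT")
```

    In this toy pair, positions 1 and 6 differ by transitions (A to G, C to T) and position 4 by a transversion (T to A).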

  15. Microbial Diversity in Deep-sea Methane Seep Sediments Presented by SSU rRNA Gene Tag Sequencing

    PubMed Central

    Nunoura, Takuro; Takaki, Yoshihiro; Kazama, Hiromi; Hirai, Miho; Ashi, Juichiro; Imachi, Hiroyuki; Takai, Ken

    2012-01-01

    Microbial community structures in methane seep sediments in the Nankai Trough were analyzed by tag-sequencing analysis for the small subunit (SSU) rRNA gene using a newly developed primer set. The dominant members of Archaea were Deep-sea Hydrothermal Vent Euryarchaeotic Group 6 (DHVEG 6), Marine Group I (MGI) and Deep Sea Archaeal Group (DSAG), and those in Bacteria were Alpha-, Gamma-, Delta- and Epsilonproteobacteria, Chloroflexi, Bacteroidetes, Planctomycetes and Acidobacteria. Diversity and richness were examined by 8,709 and 7,690 tag-sequences from sediments at 5 and 25 cm below the seafloor (cmbsf), respectively. The estimated diversity and richness in the methane seep sediment are as high as those in soil and deep-sea hydrothermal environments, although the tag-sequences obtained in this study were not sufficient to show whole microbial diversity in this analysis. We also compared the diversity and richness of each taxon/division between the sediments from the two depths, and found that the diversity and richness of some taxa/divisions varied significantly along with the depth. PMID:22510646

  16. Assessment of antibody library diversity through next generation sequencing and technical error compensation

    PubMed Central

    Lisi, Simonetta; Chirichella, Michele; Arisi, Ivan; Goracci, Martina; Cremisi, Federico; Cattaneo, Antonino

    2017-01-01

    Antibody libraries are important resources to derive antibodies to be used for a wide range of applications, from structural and functional studies to intracellular protein interference studies to developing new diagnostics and therapeutics. Whatever the goal, the key parameter for an antibody library is its complexity (also known as diversity), i.e. the number of distinct elements in the collection, which directly reflects the probability of finding in the library an antibody against a given antigen, of sufficiently high affinity. Quantitative evaluation of antibody library complexity and quality has for a long time been inadequately addressed, due to the high similarity and length of the sequences of the library. Complexity was usually inferred from the transformation efficiency and tested by fingerprinting and/or sequencing of a few hundred random library elements. Inferring complexity from such a small sampling is, however, very rudimentary and gives limited information about the real diversity, because complexity does not scale linearly with sample size. Next-generation sequencing (NGS) has opened new ways to tackle the antibody library complexity quality assessment. However, much remains to be done to fully exploit the potential of NGS for the quantitative analysis of antibody repertoires and to overcome current limitations. To obtain a more reliable antibody library complexity estimate, here we show a new, PCR-free, NGS approach to sequence antibody libraries on the Illumina platform, coupled to a new bioinformatic analysis and software (Diversity Estimator of Antibody Library, DEAL) that allows the complexity to be estimated reliably, taking into consideration the sequencing error. PMID:28505201

  17. Assessment of antibody library diversity through next generation sequencing and technical error compensation.

    PubMed

    Fantini, Marco; Pandolfini, Luca; Lisi, Simonetta; Chirichella, Michele; Arisi, Ivan; Terrigno, Marco; Goracci, Martina; Cremisi, Federico; Cattaneo, Antonino

    2017-01-01

    Antibody libraries are important resources to derive antibodies to be used for a wide range of applications, from structural and functional studies to intracellular protein interference studies to developing new diagnostics and therapeutics. Whatever the goal, the key parameter for an antibody library is its complexity (also known as diversity), i.e. the number of distinct elements in the collection, which directly reflects the probability of finding in the library an antibody against a given antigen, of sufficiently high affinity. Quantitative evaluation of antibody library complexity and quality has for a long time been inadequately addressed, due to the high similarity and length of the sequences of the library. Complexity was usually inferred from the transformation efficiency and tested by fingerprinting and/or sequencing of a few hundred random library elements. Inferring complexity from such a small sampling is, however, very rudimentary and gives limited information about the real diversity, because complexity does not scale linearly with sample size. Next-generation sequencing (NGS) has opened new ways to tackle the antibody library complexity quality assessment. However, much remains to be done to fully exploit the potential of NGS for the quantitative analysis of antibody repertoires and to overcome current limitations. To obtain a more reliable antibody library complexity estimate, here we show a new, PCR-free, NGS approach to sequence antibody libraries on the Illumina platform, coupled to a new bioinformatic analysis and software (Diversity Estimator of Antibody Library, DEAL) that allows the complexity to be estimated reliably, taking into consideration the sequencing error.
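
    The remark that complexity does not scale linearly with sample size can be made concrete with a toy Python sketch that counts distinct sequences at increasing sampling depths. It is not the DEAL software and makes no attempt to compensate for sequencing error; the function name and the single-letter "reads" are placeholders invented for the example.

```python
def distinct_by_depth(reads, depths):
    """Return the number of distinct sequences among the first n reads, for each n in depths."""
    results = []
    for n in depths:
        if n < 0 or n > len(reads):
            raise ValueError("depth must be between 0 and the number of reads")
        results.append(len(set(reads[:n])))
    return results


# Placeholder reads: letters stand in for full antibody sequences.
toy_reads = ["A", "B", "A", "C", "A", "B", "D", "A"]
observed = distinct_by_depth(toy_reads, [2, 4, 8])
```

    In this toy data set the depth doubles at each step while the distinct count rises only from 2 to 3 to 4, which is why a count from a few hundred clones cannot simply be scaled up to the whole library.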

  18. Nucleic acid sequence detection using multiplexed oligonucleotide PCR

    DOEpatents

    Nolan, John P [Santa Fe, NM; White, P Scott [Los Alamos, NM

    2006-12-26

    Methods for rapidly detecting single or multiple sequence alleles in a sample nucleic acid are described. Provided are all of the oligonucleotide pairs capable of annealing specifically to a target allele and discriminating among possible sequences thereof, and ligating to each other to form an oligonucleotide complex when a particular sequence feature is present (or, alternatively, absent) in the sample nucleic acid. The design of each oligonucleotide pair permits the subsequent high-level PCR amplification of a specific amplicon when the oligonucleotide complex is formed, but not when the oligonucleotide complex is not formed. The presence or absence of the specific amplicon is used to detect the allele. Detection of the specific amplicon may be achieved using a variety of methods well known in the art, including without limitation, oligonucleotide capture onto DNA chips or microarrays, oligonucleotide capture onto beads or microspheres, electrophoresis, and mass spectrometry. Various labels and address-capture tags may be employed in the amplicon detection step of multiplexed assays, as further described herein.

  19. Seq2Logo: a method for construction and visualization of amino acid binding motifs and sequence profiles including sequence weighting, pseudo counts and two-sided representation of amino acid enrichment and depletion

    PubMed Central

    Thomsen, Martin Christen Frølund; Nielsen, Morten

    2012-01-01

    Seq2Logo is a web-based sequence logo generator. Sequence logos are a graphical representation of the information content stored in a multiple sequence alignment (MSA) and provide a compact and highly intuitive representation of the position-specific amino acid composition of binding motifs, active sites, etc. in biological sequences. Accurate generation of sequence logos is often compromised by sequence redundancy and low number of observations. Moreover, most methods available for sequence logo generation focus on displaying the position-specific enrichment of amino acids, discarding the equally valuable information related to amino acid depletion. Seq2logo aims at resolving these issues allowing the user to include sequence weighting to correct for data redundancy, pseudo counts to correct for low number of observations and different logotype representations each capturing different aspects related to amino acid enrichment and depletion. Besides allowing input in the format of peptides and MSA, Seq2Logo accepts input as Blast sequence profiles, providing easy access for non-expert end-users to characterize and identify functionally conserved/variable amino acids in any given protein of interest. The output from the server is a sequence logo and a PSSM. Seq2Logo is available at http://www.cbs.dtu.dk/biotools/Seq2Logo (14 May 2012, date last accessed). PMID:22638583
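
    The information content that a sequence logo displays at each alignment position can be sketched as below. This is the classic Shannon form, log2(alphabet size) minus the column entropy, written as a minimal Python illustration; it omits the sequence weighting, pseudo counts and small-sample corrections that the abstract describes, and it is not Seq2Logo's implementation. The function name and the toy alignment are assumptions.

```python
import math
from collections import Counter


def column_information(alignment, alphabet_size=20):
    """Return the information content in bits for each column of an alignment.

    alignment is a list of equal-length strings; '-' is treated as a gap and ignored.
    Use alphabet_size=20 for proteins and 4 for nucleotides.
    """
    if not alignment:
        raise ValueError("alignment is empty")
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all sequences must have the same length")
    max_bits = math.log2(alphabet_size)
    info = []
    for column in zip(*alignment):
        residues = [c for c in column if c != "-"]
        if not residues:
            info.append(0.0)  # all-gap column carries no information
            continue
        total = len(residues)
        entropy = -sum((n / total) * math.log2(n / total)
                       for n in Counter(residues).values())
        info.append(max_bits - entropy)
    return info


toy_alignment = ["ACGT", "ACGA", "ACTA", "AGTA"]
bits = column_information(toy_alignment, alphabet_size=4)
```

    For the toy nucleotide alignment, the fully conserved first column reaches the 2-bit maximum, while the evenly split third column (G, G, T, T) carries 1 bit.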

  20. Analysis of sequence diversity through internal transcribed spacers and simple sequence repeats to identify Dendrobium species.

    PubMed

    Liu, Y T; Chen, R K; Lin, S J; Chen, Y C; Chin, S W; Chen, F C; Lee, C Y

    2014-04-08

    The Orchidaceae is one of the largest and most diverse families of flowering plants. The Dendrobium genus has high economic potential as ornamental plants and for medicinal purposes. In addition, the species of this genus are able to produce large crops. However, many Dendrobium varieties are very similar in outward appearance, making it difficult to distinguish one species from another. This study demonstrated that the 12 Dendrobium species used in this study may be divided into 2 groups by internal transcribed spacer (ITS) sequence analysis. Red and yellow flowers may also be used to separate these species into 2 main groups. In particular, the deciduous characteristic is associated with the ITS genetic diversity of the A group. Of 53 designed simple sequence repeat (SSR) primer pairs, 7 pairs were polymorphic for polymerase chain reaction products that were amplified from a specific band. The results of this study demonstrate that these 7 SSR primer pairs may potentially be used to identify Dendrobium species and their progeny in future studies.

  1. Genotyping-by-sequencing (GBS) revealed molecular genetic diversity of Iranian wheat landraces and cultivars

    USDA-ARS's Scientific Manuscript database

    Genetic diversity is an essential resource for breeders to improve new cultivars with desirable characteristics. Recently genotyping-by-sequencing (GBS), a next generation sequencing (NGS) based technology that can simplify complex genomes, has been used as a high-throughput and cost-effective molec...

  2. Population diversity of Diaphorina citri (Hemiptera: Liviidae) in China based on whole mitochondrial genome sequences.

    PubMed

    Wu, Fengnian; Jiang, Hongyan; Beattie, G Andrew C; Holford, Paul; Chen, Jianchi; Wallis, Christopher M; Zheng, Zheng; Deng, Xiaoling; Cen, Yijing

    2018-04-24

    Diaphorina citri (Asian citrus psyllid; ACP) transmits 'Candidatus Liberibacter asiaticus' associated with citrus Huanglongbing (HLB). ACP has been reported in 11 provinces/regions in China, yet its population diversity remains unclear. In this study, we evaluated ACP population diversity in China using representative whole mitochondrial genome (mitogenome) sequences. Additional mitogenome sequences outside China were also acquired and evaluated. The sizes of the 27 ACP mitogenome sequences ranged from 14 986 to 15 030 bp. Along with three previously published mitogenome sequences, the 30 sequences formed three major mitochondrial groups (MGs): MG1, present in southwestern China and occurring at elevations above 1000 m; MG2, present in southeastern China and Southeast Asia (Cambodia, Indonesia, Malaysia, and Vietnam) and occurring at elevations below 180 m; and MG3, present in the USA and Pakistan. Single nucleotide polymorphisms in five genes (cox2, atp8, nad3, nad1 and rrnL) contributed most to ACP diversity. Among these genes, rrnL had the most variation. Mitogenome sequence analyses revealed two major phylogenetic groups of ACP present in China as well as a possible unique group currently present in Pakistan and the USA. The information could have significant implications for current ACP control and HLB management. © 2018 Society of Chemical Industry.

  3. Improved serial analysis of V1 ribosomal sequence tags (SARST-V1) provides a rapid, comprehensive, sequence-based characterization of bacterial diversity and community composition.

    PubMed

    Yu, Zhongtang; Yu, Marie; Morrison, Mark

    2006-04-01

    Serial analysis of ribosomal sequence tags (SARST) is a recently developed technology that can generate large 16S rRNA gene (rrs) sequence data sets from microbiomes, but there are numerous enzymatic and purification steps required to construct the ribosomal sequence tag (RST) clone libraries. We report here an improved SARST method, which still targets the V1 hypervariable region of rrs genes, but reduces the number of enzymes, oligonucleotides, reagents, and technical steps needed to produce the RST clone libraries. The new method, hereafter referred to as SARST-V1, was used to examine the eubacterial diversity present in community DNA recovered from the microbiome resident in the ovine rumen. The 190 sequenced clones contained 1055 RSTs and no fewer than 236 unique phylotypes (based on ≥ 95% sequence identity) that were assigned to eight different eubacterial phyla. Rarefaction and monomolecular curve analyses predicted that the complete RST clone library contains 99% of the 353 unique phylotypes predicted to exist in this microbiome. When compared with ribosomal intergenic spacer analysis (RISA) of the same community DNA sample, as well as a compilation of nine previously published conventional rrs clone libraries prepared from the same type of samples, the RST clone library provided a more comprehensive characterization of the eubacterial diversity present in rumen microbiomes. As such, SARST-V1 should be a useful tool applicable to comprehensive examination of diversity and composition in microbiomes and offers an affordable, sequence-based method for diversity analysis.

  4. Diversity within Italian Cheesemaking Brine-Associated Bacterial Communities Evidenced by Massive Parallel 16S rRNA Gene Tag Sequencing

    PubMed Central

    Marino, Marilena; Innocente, Nadia; Maifreni, Michela; Mounier, Jérôme; Cobo-Díaz, José F.; Coton, Emmanuel; Carraro, Lisa; Cardazzo, Barbara

    2017-01-01

    This study explored the bacterial diversity of brines used for cheesemaking in Italy, as well as their physicochemical characteristics. In this context, 19 brines used to salt soft, semi-hard, and hard Italian cheeses were collected in 14 commercial cheese plants and analyzed using a culture-independent amplicon sequencing approach in order to describe their bacterial microbiota. Large NaCl concentration variations were observed among the selected brines, with hard cheese brines exhibiting the highest values. Acidity values showed a great variability too, probably in relation to the brine use prior to sampling. Despite their high salt content, brine microbial loads ranged from 2.11 to 6.51 log CFU/mL for the total mesophilic count. Microbial community profiling assessed by 16S rRNA gene sequencing showed that these ecosystems were dominated by Firmicutes and Proteobacteria, followed by Actinobacteria and Bacteroidetes. Cheese type and brine salinity seem to be the main parameters accountable for brine microbial diversity. On the contrary, brine pH, acidity and protein concentration, correlated to cheese brine age, did not have any selective effect on the microbiota composition. Nine major genera were present in all analyzed brines, indicating that they might compose the core microbiome of cheese brines. Staphylococcus aureus was occasionally detected in brines using selective culture media. Interestingly, bacterial genera associated with a functional and technological use were frequently detected. Indeed Bifidobacteriaceae, which might be valuable probiotic candidates, and specific microbial genera such as Tetragenococcus, Corynebacterium and non-pathogenic Staphylococcus, which can contribute to sensorial properties of ripened cheeses, were widespread within brines. PMID:29163411

  5. Diversity of predominant lactic acid bacteria associated with cocoa fermentation in Nigeria.

    PubMed

    Kostinek, Melanie; Ban-Koffi, Louis; Ottah-Atikpo, Margaret; Teniola, David; Schillinger, Ulrich; Holzapfel, Wilhelm H; Franz, Charles M A P

    2008-04-01

    The fermentation of cocoa relies on a complex succession of bacteria and filamentous fungi, all of which can have an impact on cocoa flavor. So far, few investigations have focused on the diversity of lactic acid bacteria involved in cocoa fermentation, and many earlier investigations did not rely on polyphasic taxonomical approaches, which take both phenotypic and genotypic characterization techniques into account. In our study, we characterized predominant lactic acid bacteria from cocoa fermentations in Nigeria, using a combination of phenotypic tests, repetitive extragenic palindromic PCR, and sequencing of the 16S rRNA gene of representative strains for accurate species identification. Thus, of a total of 193 lactic acid bacteria (LAB) strains isolated from common media used to cultivate LAB, 40 (20.7%) were heterofermentative and consisted of either L. brevis or L. fermentum strains. The majority of the isolates were homofermentative rods (110 strains; 57% of isolates) which were characterized as L. plantarum strains. The homofermentative cocci consisted predominantly of 35 (18.1% of isolates) Pediococcus acidilactici strains. Thus, the LAB populations derived from these media in this study were accurately described. This can contribute to the further assessment of the effect of common LAB strains on the flavor characteristics of fermenting cocoa in further studies.

  6. Rapid quantification of mutant fitness in diverse bacteria by sequencing randomly bar-coded transposons

    DOE PAGES

    Wetmore, Kelly M.; Price, Morgan N.; Waters, Robert J.; ...

    2015-05-12

    Transposon mutagenesis with next-generation sequencing (TnSeq) is a powerful approach to annotate gene function in bacteria, but existing protocols for TnSeq require laborious preparation of every sample before sequencing. Thus, the existing protocols are not amenable to the throughput necessary to identify phenotypes and functions for the majority of genes in diverse bacteria. Here, we present a method, random bar code transposon-site sequencing (RB-TnSeq), which increases the throughput of mutant fitness profiling by incorporating random DNA bar codes into Tn5 and mariner transposons and by using bar code sequencing (BarSeq) to assay mutant fitness. RB-TnSeq can be used with any transposon, and TnSeq is performed once per organism instead of once per sample. Each BarSeq assay requires only a simple PCR, and 48 to 96 samples can be sequenced on one lane of an Illumina HiSeq system. We demonstrate the reproducibility and biological significance of RB-TnSeq with Escherichia coli, Phaeobacter inhibens, Pseudomonas stutzeri, Shewanella amazonensis, and Shewanella oneidensis. To demonstrate the increased throughput of RB-TnSeq, we performed 387 successful genome-wide mutant fitness assays representing 130 different bacterium-carbon source combinations and identified 5,196 genes with significant phenotypes across the five bacteria. In P. inhibens, we used our mutant fitness data to identify genes important for the utilization of diverse carbon substrates, including a putative D-mannose isomerase that is required for mannitol catabolism. RB-TnSeq will enable the cost-effective functional annotation of diverse bacteria using mutant fitness profiling. A large challenge in microbiology is the functional assessment of the millions of uncharacterized genes identified by genome sequencing. Transposon mutagenesis coupled to next-generation sequencing (TnSeq) is a powerful approach to assign phenotypes and functions to genes.
However, the current strategies for TnSeq are

  7. Rapid quantification of mutant fitness in diverse bacteria by sequencing randomly bar-coded transposons

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wetmore, Kelly M.; Price, Morgan N.; Waters, Robert J.

    Transposon mutagenesis with next-generation sequencing (TnSeq) is a powerful approach to annotate gene function in bacteria, but existing protocols for TnSeq require laborious preparation of every sample before sequencing. Thus, the existing protocols are not amenable to the throughput necessary to identify phenotypes and functions for the majority of genes in diverse bacteria. Here, we present a method, random bar code transposon-site sequencing (RB-TnSeq), which increases the throughput of mutant fitness profiling by incorporating random DNA bar codes into Tn5 and mariner transposons and by using bar code sequencing (BarSeq) to assay mutant fitness. RB-TnSeq can be used with any transposon, and TnSeq is performed once per organism instead of once per sample. Each BarSeq assay requires only a simple PCR, and 48 to 96 samples can be sequenced on one lane of an Illumina HiSeq system. We demonstrate the reproducibility and biological significance of RB-TnSeq with Escherichia coli, Phaeobacter inhibens, Pseudomonas stutzeri, Shewanella amazonensis, and Shewanella oneidensis. To demonstrate the increased throughput of RB-TnSeq, we performed 387 successful genome-wide mutant fitness assays representing 130 different bacterium-carbon source combinations and identified 5,196 genes with significant phenotypes across the five bacteria. In P. inhibens, we used our mutant fitness data to identify genes important for the utilization of diverse carbon substrates, including a putative D-mannose isomerase that is required for mannitol catabolism. RB-TnSeq will enable the cost-effective functional annotation of diverse bacteria using mutant fitness profiling. A large challenge in microbiology is the functional assessment of the millions of uncharacterized genes identified by genome sequencing. Transposon mutagenesis coupled to next-generation sequencing (TnSeq) is a powerful approach to assign phenotypes and functions to genes.
However, the current strategies for TnSeq are

  8. High-accuracy identification of incident HIV-1 infections using a sequence clustering based diversity measure.

    PubMed

    Xia, Xia-Yu; Ge, Meng; Hsi, Jenny H; He, Xiang; Ruan, Yu-Hua; Wang, Zhi-Xin; Shao, Yi-Ming; Pan, Xian-Ming

    2014-01-01

    Accurate estimates of HIV-1 incidence are essential for monitoring epidemic trends and evaluating intervention efforts. However, the long asymptomatic stage of HIV-1 infection makes it difficult to effectively distinguish incident infections from chronic ones. Current incidence assays based on serology or viral sequence diversity are both still lacking in accuracy. In the present work, a sequence clustering based diversity (SCBD) assay was devised by utilizing the fact that viral sequences derived from each transmitted/founder (T/F) strain tend to cluster together at an early stage, and that only the intra-cluster diversity is correlated with the time since HIV-1 infection. Dot-matrix pairwise alignment was used to eliminate the disproportionate impact of insertions/deletions (indels) and recombination events, and the proportion of clusterable sequences (Pc) was used as an index to identify late chronic infections with declined viral genetic diversity. Tested on a dataset containing 398 incident and 163 chronic infection cases collected from the Los Alamos HIV database (last modified 2/8/2012), our SCBD method achieved 99.5% sensitivity and 98.8% specificity, with an overall accuracy of 99.3%. Further analysis and evaluation also suggested its performance was not affected by factors such as viral subtype and transmission route. The SCBD method demonstrated the potential of sequencing based techniques to become useful for identifying incident infections. Its use may be most advantageous for settings with low to moderate incidence relative to available resources. The online service is available at http://www.bioinfo.tsinghua.edu.cn:8080/SCBD/index.jsp.
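
    The reported sensitivity, specificity and accuracy are linked by simple arithmetic on the 398 incident and 163 chronic cases. The sketch below shows that relationship. The abstract does not give the underlying counts, so the values of 396 correctly classified incident cases and 161 correctly classified chronic cases are an assumption, chosen only because they reproduce the published percentages; the function name is hypothetical.

```python
def classification_metrics(true_pos, false_neg, true_neg, false_pos):
    """Return (sensitivity, specificity, accuracy) as fractions."""
    sensitivity = true_pos / (true_pos + false_neg)
    specificity = true_neg / (true_neg + false_pos)
    accuracy = (true_pos + true_neg) / (true_pos + false_neg + true_neg + false_pos)
    return sensitivity, specificity, accuracy


# Assumed counts (not stated in the abstract): "positive" = incident infection.
# 398 incident cases -> 396 detected, 2 missed;
# 163 chronic cases  -> 161 correctly excluded, 2 misclassified.
sens, spec, acc = classification_metrics(true_pos=396, false_neg=2,
                                         true_neg=161, false_pos=2)
print(f"sensitivity {sens:.1%}, specificity {spec:.1%}, accuracy {acc:.1%}")
```

    Because incident cases outnumber chronic ones in this dataset, the overall accuracy sits closer to the sensitivity than to the specificity.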

  9. Functional diversity of microbial communities in pristine aquifers inferred by PLFA- and sequencing-based approaches

    NASA Astrophysics Data System (ADS)

    Schwab, Valérie F.; Herrmann, Martina; Roth, Vanessa-Nina; Gleixner, Gerd; Lehmann, Robert; Pohnert, Georg; Trumbore, Susan; Küsel, Kirsten; Totsche, Kai U.

    2017-05-01

    Microorganisms in groundwater play an important role in aquifer biogeochemical cycles and water quality. However, the mechanisms linking the functional diversity of microbial populations and the groundwater physico-chemistry are still not well understood due to the complexity of interactions between surface and subsurface. Within the framework of Hainich (north-western Thuringia, central Germany) Critical Zone Exploratory of the Collaborative Research Centre AquaDiva, we used the relative abundances of phospholipid-derived fatty acids (PLFAs) to link specific biochemical markers within the microbial communities to the spatio-temporal changes of the groundwater physico-chemistry. The functional diversities of the microbial communities were mainly correlated with groundwater chemistry, including dissolved O2, Fet and NH4+ concentrations. Abundances of PLFAs derived from eukaryotes and potential nitrite-oxidizing bacteria (11Me16:0 as biomarker for Nitrospira moscoviensis) were high at sites with elevated O2 concentration where groundwater recharge supplies bioavailable substrates. In anoxic groundwaters richer in Fet, PLFAs abundant in sulfate-reducing bacteria (SRB), iron-reducing bacteria and fungi increased with Fet and HCO3- concentrations, suggesting the occurrence of active iron reduction and the possible role of fungi in mediating iron solubilization and transport in those aquifer domains. In more NH4+-rich anoxic groundwaters, anammox bacteria and SRB-derived PLFAs increased with NH4+ concentration, further evidencing the dependence of the anammox process on ammonium concentration and potential links between SRB and anammox bacteria. Additional support of the PLFA-based bacterial communities was found in DNA- and RNA-based Illumina MiSeq amplicon sequencing of bacterial 16S rRNA genes, which showed high predominance of nitrite-oxidizing bacteria Nitrospira, e.g. Nitrospira moscoviensis, in oxic aquifer zones and of anammox bacteria in more NH4+-rich

  10. WEB-server for search of a periodicity in amino acid and nucleotide sequences

    NASA Astrophysics Data System (ADS)

    Frenkel, F. E.; Skryabin, K. G.; Korotkov, E. V.

    2017-12-01

    A new web server (http://victoria.biengi.ac.ru/splinter/login.php) was designed and developed to search for periodicity in nucleotide and amino acid sequences. The web server is based on a new mathematical method of searching for multiple alignments, which relies on the optimization of position weight matrices and on two-dimensional dynamic programming. This approach allows the construction of multiple alignments of weakly similar amino acid and nucleotide sequences that have accumulated more than 1.5 substitutions per amino acid or nucleotide, without performing pairwise sequence comparisons. The article examines the principles of the web server operation and two examples of studying amino acid and nucleotide sequences, as well as the information that can be obtained using the web server.

  11. Detection and isolation of nucleic acid sequences using a bifunctional hybridization probe

    DOEpatents

    Lucas, Joe N.; Straume, Tore; Bogen, Kenneth T.

    2000-01-01

    A method for detecting and isolating a target sequence in a sample of nucleic acids is provided using a bifunctional hybridization probe capable of hybridizing to the target sequence that includes a detectable marker and a first complexing agent capable of forming a binding pair with a second complexing agent. A kit is also provided for detecting a target sequence in a sample of nucleic acids using a bifunctional hybridization probe according to this method.

  12. Influence of Geographical Origin and Flour Type on Diversity of Lactic Acid Bacteria in Traditional Belgian Sourdoughs

    PubMed Central

    Scheirlinck, Ilse; Van der Meulen, Roel; Van Schoor, Ann; Vancanneyt, Marc; De Vuyst, Luc; Vandamme, Peter; Huys, Geert

    2007-01-01

    A culture-based approach was used to investigate the diversity of lactic acid bacteria (LAB) in Belgian traditional sourdoughs and to assess the influence of flour type, bakery environment, geographical origin, and technological characteristics on the taxonomic composition of these LAB communities. For this purpose, a total of 714 LAB from 21 sourdoughs sampled at 11 artisan bakeries throughout Belgium were subjected to a polyphasic identification approach. The microbial composition of the traditional sourdoughs was characterized by bacteriological culture in combination with genotypic identification methods, including repetitive element sequence-based PCR fingerprinting and phenylalanyl-tRNA synthase (pheS) gene sequence analysis. LAB from Belgian sourdoughs belonged to the genera Lactobacillus, Pediococcus, Leuconostoc, Weissella, and Enterococcus, with the heterofermentative species Lactobacillus paralimentarius, Lactobacillus sanfranciscensis, Lactobacillus plantarum, and Lactobacillus pontis as the most frequently isolated taxa. Statistical analysis of the identification data indicated that the microbial composition of the sourdoughs is mainly affected by the bakery environment rather than the flour type (wheat, rye, spelt, or a mixture of these) used. In conclusion, the polyphasic approach, based on rapid genotypic screening and high-resolution, sequence-dependent identification, proved to be a powerful tool for studying the LAB diversity in traditional fermented foods such as sourdough. PMID:17675431

  13. A dominant conformational role for amino acid diversity in minimalist protein–protein interfaces

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Gilbreth, Ryan N.; Esaki, Kaori; Koide, Akiko

    Recent studies have shown that highly simplified interaction surfaces consisting of combinations of just two amino acids, Tyr and Ser, exhibit high affinity and specificity. The high functional levels of such minimalist interfaces might thus indicate small contributions of greater amino acid diversity seen in natural interfaces. Toward addressing this issue, we have produced a pair of binding proteins built on the fibronectin type III scaffold, termed “monobodies.” One monobody contains the Tyr/Ser binary-code interface (termed YS) and the other contains an expanded amino acid diversity interface (YSX), but both bind to an identical target, maltose-binding protein. The YSX monobody bound with higher affinity, a slower off rate and a more favorable enthalpic contribution than the YS monobody. High-resolution X-ray crystal structures revealed that both proteins bound to an essentially identical epitope, providing a unique opportunity to directly investigate the role of amino acid diversity in a protein interaction interface. Surprisingly, Tyr still dominates the YSX paratope and the additional amino acid types are primarily used to conformationally optimize contacts made by tyrosines. Scanning mutagenesis showed that while all contacting Tyr side chains are essential in the YS monobody, the YSX interface was more tolerant to mutations. These results suggest that the conformational, not chemical, diversity of additional types of amino acids provided higher functionality and evolutionary robustness, supporting the dominant role of Tyr and the importance of conformational diversity in forming protein interaction interfaces.

  14. A Dominant Conformational Role for Amino Acid Diversity in Minimalist Protein-Protein Interfaces

    PubMed Central

    Gilbreth, Ryan N.; Esaki, Kaori; Koide, Akiko; Sidhu, Sachdev S.; Koide, Shohei

    2008-01-01

    Recent studies have shown that highly simplified interaction surfaces consisting of combinations of just two amino acids, Tyr and Ser, exhibit high affinity and specificity. The high functional levels of such minimalist interfaces might thus indicate small contributions of greater amino acid diversity seen in natural interfaces. Toward addressing this issue, we have produced a pair of binding proteins built on the fibronectin type III scaffold, termed “monobodies”. One monobody contains the Tyr/Ser binary-code interface (termed YS) and the other contains an expanded amino acid diversity interface (YSX), but both bind to an identical target, maltose binding protein (MBP). The YSX monobody bound with higher affinity, a slower off rate and a more favorable enthalpic contribution than the YS monobody. High-resolution x-ray crystal structures revealed that both proteins bound to an essentially identical epitope, providing a unique opportunity to directly investigate the role of amino acid diversity in a protein interaction interface. Surprisingly, Tyr still dominates the YSX paratope and the additional amino acid types are primarily used to conformationally optimize contacts made by tyrosines. Scanning mutagenesis showed that while all contacting Tyr side-chains are essential in the YS monobody, the YSX interface was more tolerant to mutations. These results suggest that the conformational, not chemical, diversity of additional types of amino acids provided higher functionality and evolutionary robustness, supporting the dominant role of Tyr and the importance of conformational diversity in forming protein interaction interfaces. PMID:18602117

  15. Genetic diversity analysis of Gossypium arboreum germplasm accessions using genotyping-by-sequencing.

    PubMed

    Li, Ruijuan; Erpelding, John E

    2016-10-01

    The diploid cotton species Gossypium arboreum possesses many favorable agronomic traits such as drought tolerance and disease resistance, which can be utilized in the development of improved upland cotton cultivars. The USDA National Plant Germplasm System maintains more than 1600 G. arboreum accessions. Little information is available on the genetic diversity of the collection, thereby limiting the utilization of this cotton species. The genetic diversity and population structure of the G. arboreum germplasm collection were assessed by genotyping-by-sequencing of 375 accessions. Using genome-wide single nucleotide polymorphism sequence data, two major clusters were inferred with 302 accessions in Cluster 1, 64 accessions in Cluster 2, and nine accessions unassigned due to their nearly equal membership to each cluster. These two clusters were further evaluated independently, resulting in the identification of two sub-clusters for the 302 Cluster 1 accessions and three sub-clusters for the 64 Cluster 2 accessions. Low to moderate genetic diversity between clusters and sub-clusters was observed, indicating a narrow genetic base. Cluster 2 accessions were more genetically diverse and the majority of the accessions in this cluster were landraces. In contrast, Cluster 1 is composed of varieties or breeding lines more recently added to the collection. The majority of the accessions had kinship values ranging from 0.6 to 0.8. Eight pairs of accessions were identified as potential redundancies due to their high kinship relatedness. The genetic diversity and genotype data from this study are essential to enhance germplasm utilization to identify genetically diverse accessions for the detection of quantitative trait loci associated with important traits that would benefit upland cotton improvement.

  16. Genetic diversity of mtDNA D-loop sequences in four native Chinese chicken breeds.

    PubMed

    Guo, H W; Li, C; Wang, X N; Li, Z J; Sun, G R; Li, G X; Liu, X J; Kang, X T; Han, R L

    2017-10-01

    1. To explore the genetic diversity of Chinese indigenous chicken breeds, a 585 bp fragment of the mitochondrial DNA (mtDNA) region was sequenced in 102 birds from the Xichuan black-bone chicken, Yunyang black-bone chicken and Lushi chicken. In addition, 30 mtDNA D-loop sequences of Silkie fowls were downloaded from NCBI. The mtDNA D-loop sequence polymorphism and maternal origin of 4 chicken breeds were analysed in this study. 2. The results showed that a total of 33 mutation sites and 28 haplotypes were detected in the 4 chicken breeds. The haplotype diversity and nucleotide diversity of these 4 native breeds were 0.916 ± 0.014 and 0.012 ± 0.002, respectively. Three clusters were formed in 4 Chinese native chickens and 12 reference breeds. Both the Xichuan black-bone chicken and Yunyang black-bone chicken were grouped into one cluster. Four haplogroups (A, B, C and E) emerged in the median-joining network in these breeds. 3. It was concluded that these 4 Chinese chicken breeds had high genetic diversity. The phylogenetic tree and median network profiles showed that native chickens from China and its neighbouring countries had at least two maternal origins, one from Yunnan, China and another from Southeast Asia or its surrounding area.
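
    Haplotype diversity and nucleotide diversity, the two summary statistics quoted above, are commonly computed with Nei's estimators. The sketch below applies those standard formulas to a hypothetical toy set of aligned sequences; the abstract does not say which software or estimator the authors used, so this is an illustration under that assumption and does not reproduce the reported values.

```python
from collections import Counter
from itertools import combinations


def haplotype_diversity(sequences):
    """Nei's haplotype (gene) diversity: n/(n-1) * (1 - sum of squared frequencies)."""
    n = len(sequences)
    if n < 2:
        raise ValueError("need at least two sequences")
    freqs = [count / n for count in Counter(sequences).values()]
    return n / (n - 1) * (1 - sum(f * f for f in freqs))


def nucleotide_diversity(sequences):
    """Average number of pairwise differences per site (pi).

    Assumption: sequences are aligned, equal in length and free of gaps.
    """
    length = len(sequences[0])
    if len(sequences) < 2 or any(len(s) != length for s in sequences):
        raise ValueError("need at least two aligned sequences of equal length")
    pairs = list(combinations(sequences, 2))
    differences = sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs)
    return differences / (len(pairs) * length)


# Hypothetical toy data: four aligned 4-bp sequences, three distinct haplotypes.
toy = ["ACGT", "ACGT", "ACGA", "TCGA"]
print(f"haplotype diversity  = {haplotype_diversity(toy):.3f}")
print(f"nucleotide diversity = {nucleotide_diversity(toy):.3f}")
```

    Haplotype diversity approaches 1 when nearly every sampled bird carries a different haplotype, whereas nucleotide diversity stays small when those haplotypes differ at only a few sites, which is why a value near 0.9 can coexist with a value near 0.01.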

  17. Diversity and distribution of culturable lactic acid bacterial species in Indonesian Sayur Asin.

    PubMed

    Mangunwardoyo, Wibowo; Abinawanto; Salamah, Andi; Sukara, Endang; Sulistiani; Dinoto, Achmad

    2016-08-01

    Lactic acid bacteria (LAB) play important roles in the processing of Sayur Asin (spontaneously fermented mustard). Unfortunately, information about LAB in Indonesian Sayur Asin prepared by traditional manufacturers, which is important as baseline data for maintaining food quality and safety, is unclear. The aim of this study was to describe the diversity and distribution of culturable lactic acid bacteria in Sayur Asin of Indonesia. Four Sayur Asin samples (fermentation liquor and fermented mustard) were collected at harvesting times (3-7 days after fermentation) from two traditional manufacturers in Tulung Agung (TA) and Kediri (KDR), East Java province, Indonesia. LAB strains were isolated using MRS agar supplemented with 1% CaCO3 and characterized morphologically. Identification of the strains was based on 16S rDNA analysis, and a phylogenetic tree was drawn to understand the phylogenetic relationships of the collected strains. Different profiles were detected in total plate count, salinity and pH of the fermenting liquor of Sayur Asin from TA and KDR. A total of 172 LAB isolates were successfully isolated and identified based on their 16S rDNA sequences. Phylogenetic analysis of 27 representative LAB strains from Sayur Asin showed that these strains belonged to 5 distinct species, namely Lactobacillus farciminis (N=32), L. fermentum (N=4), L. namurensis (N=15), L. plantarum (N=118) and L. parafarraginis (N=1). Strains D5-S-2013 and B4-S-2013 showed close phylogenetic relationships with L. composti and L. paralimentarius, respectively, whereas their sequence similarity was slightly below 99%, suggesting that they may represent novel species and need further investigation. Lactobacillus plantarum was found to be dominant in all Sayur Asin samples. Lactobacilli were recognized as the major group of lactic acid bacteria in Sayur Asin

  18. Pervasive adaptive protein evolution apparent in diversity patterns around amino acid substitutions in Drosophila simulans.

    PubMed

    Sattath, Shmuel; Elyashiv, Eyal; Kolodny, Oren; Rinott, Yosef; Sella, Guy

    2011-02-10

    In Drosophila, multiple lines of evidence converge in suggesting that beneficial substitutions to the genome may be common. All suffer from confounding factors, however, such that the interpretation of the evidence (in particular, conclusions about the rate and strength of beneficial substitutions) remains tentative. Here, we use genome-wide polymorphism data in D. simulans and sequenced genomes of its close relatives to construct a readily interpretable characterization of the effects of positive selection: the shape of average neutral diversity around amino acid substitutions. As expected under recurrent selective sweeps, we find a trough in diversity levels around amino acid but not around synonymous substitutions, a distinctive pattern that is not expected under alternative models. This characterization is richer than previous approaches, which relied on limited summaries of the data (e.g., the slope of a scatter plot), and relates to underlying selection parameters in a straightforward way, allowing us to make more reliable inferences about the prevalence and strength of adaptation. Specifically, we develop a coalescent-based model for the shape of the entire curve and use it to infer adaptive parameters by maximum likelihood. Our inference suggests that ∼13% of amino acid substitutions cause selective sweeps. Interestingly, it reveals two classes of beneficial fixations: a minority (approximately 3%) that appears to have had large selective effects and accounts for most of the reduction in diversity, and the remaining 10%, which seem to have had very weak selective effects. These estimates therefore help to reconcile the apparent conflict among previously published estimates of the strength of selection. More generally, our findings provide unequivocal evidence for strongly beneficial substitutions in Drosophila and illustrate how the rapidly accumulating genome-wide data can be leveraged to address enduring questions about the genetic basis of adaptation.

  19. Soil amino acid composition across a boreal forest successional sequence

    Treesearch

    Nancy R. Werdin-Pfisterer; Knut Kielland; Richard D. Boone

    2009-01-01

    Soil amino acids are important sources of organic nitrogen for plant nutrition, yet few studies have examined which amino acids are most prevalent in the soil. In this study, we examined the composition, concentration, and seasonal patterns of soil amino acids across a primary successional sequence encompassing a natural gradient of plant productivity and soil...

  20. Preferential amino acid sequences in alumina-catalyzed peptide bond formation.

    PubMed

    Bujdák, J; Rode, B M

    2002-05-21

    The catalytic effect of activated alumina on amino acid condensation was investigated. The readiness of amino acids to form peptide sequences was estimated on the basis of the yield of dipeptides and was found to decrease in the order glycine (Gly), alanine (Ala), leucine (Leu), valine (Val), proline (Pro). For example, approximately 15% Gly was converted to the dipeptide (Gly(2)), 5% to cyclic anhydride (cyc(Gly(2))) and small amounts of tri- (Gly(3)) and tetrapeptide (Gly(4)) were formed after 28 days. On the other hand, only trace amounts of Pro(2) were formed from proline under the same conditions. Preferential formation of certain sequences was observed in the mixed reaction systems containing two amino acids. For example, almost ten times more Gly-Val than Val-Gly was formed in the Gly+Val reaction system. The preferred sequences can be explained on the basis of an inductive effect that side groups have on the nucleophilicity and electrophilicity, respectively, of the amino and carboxyl groups. A comparison with published data of amino acid reactions in other reaction systems revealed that the main trends of preferential sequence formation were the same as those described for the salt-induced peptide formation (SIPF) reaction. The results of this work and other previously published papers show that alumina and related mineral surfaces might have played a crucial role in the prebiotic formation of the first peptides on the primitive earth.

  1. Rapid Quantification of Mutant Fitness in Diverse Bacteria by Sequencing Randomly Bar-Coded Transposons

    PubMed Central

    Wetmore, Kelly M.; Price, Morgan N.; Waters, Robert J.; Lamson, Jacob S.; He, Jennifer; Hoover, Cindi A.; Blow, Matthew J.; Bristow, James; Butland, Gareth

    2015-01-01

    Transposon mutagenesis with next-generation sequencing (TnSeq) is a powerful approach to annotate gene function in bacteria, but existing protocols for TnSeq require laborious preparation of every sample before sequencing. Thus, the existing protocols are not amenable to the throughput necessary to identify phenotypes and functions for the majority of genes in diverse bacteria. Here, we present a method, random bar code transposon-site sequencing (RB-TnSeq), which increases the throughput of mutant fitness profiling by incorporating random DNA bar codes into Tn5 and mariner transposons and by using bar code sequencing (BarSeq) to assay mutant fitness. RB-TnSeq can be used with any transposon, and TnSeq is performed once per organism instead of once per sample. Each BarSeq assay requires only a simple PCR, and 48 to 96 samples can be sequenced on one lane of an Illumina HiSeq system. We demonstrate the reproducibility and biological significance of RB-TnSeq with Escherichia coli, Phaeobacter inhibens, Pseudomonas stutzeri, Shewanella amazonensis, and Shewanella oneidensis. To demonstrate the increased throughput of RB-TnSeq, we performed 387 successful genome-wide mutant fitness assays representing 130 different bacterium-carbon source combinations and identified 5,196 genes with significant phenotypes across the five bacteria. In P. inhibens, we used our mutant fitness data to identify genes important for the utilization of diverse carbon substrates, including a putative d-mannose isomerase that is required for mannitol catabolism. RB-TnSeq will enable the cost-effective functional annotation of diverse bacteria using mutant fitness profiling. PMID:25968644

  2. Propionibacterium acnes: Disease-Causing Agent or Common Contaminant? Detection in Diverse Patient Samples by Next-Generation Sequencing

    PubMed Central

    Friis-Nielsen, Jens; Vinner, Lasse; Hansen, Thomas Arn; Richter, Stine Raith; Fridholm, Helena; Herrera, Jose Alejandro Romero; Lund, Ole; Brunak, Søren; Izarzugaza, Jose M. G.; Mourier, Tobias; Nielsen, Lars Peter

    2016-01-01

    Propionibacterium acnes is the most abundant bacterium on human skin, particularly in sebaceous areas. P. acnes is suggested to be an opportunistic pathogen involved in the development of diverse medical conditions but is also a proven contaminant of human clinical samples and surgical wounds. Its significance as a pathogen is consequently a matter of debate. In the present study, we investigated the presence of P. acnes DNA in 250 next-generation sequencing data sets generated from 180 samples of 20 different sample types, mostly of cancerous origin. The samples were subjected to either microbial enrichment, involving nuclease treatment to reduce the amount of host nucleic acids, or shotgun sequencing. We detected high proportions of P. acnes DNA in enriched samples, particularly skin tissue-derived and other tissue samples, with the levels being higher in enriched samples than in shotgun-sequenced samples. P. acnes reads were detected in most samples analyzed, though the proportions in most shotgun-sequenced samples were low. Our results show that P. acnes can be detected in practically all sample types when molecular methods, such as next-generation sequencing, are employed. The possibility of contamination from the patient or other sources, including laboratory reagents or environment, should therefore always be considered carefully when P. acnes is detected in clinical samples. We advocate that detection of P. acnes always be accompanied by experiments validating the association between this bacterium and any clinical condition. PMID:26818667

  3. 37 CFR 1.821 - Nucleotide and/or amino acid sequence disclosures in patent applications.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... 37 Patents, Trademarks, and Copyrights 1 2010-07-01 2010-07-01 false Nucleotide and/or amino acid... Biotechnology Invention Disclosures Application Disclosures Containing Nucleotide And/or Amino Acid Sequences § 1.821 Nucleotide and/or amino acid sequence disclosures in patent applications. (a) Nucleotide and...

  4. 37 CFR 1.821 - Nucleotide and/or amino acid sequence disclosures in patent applications.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... 37 Patents, Trademarks, and Copyrights 1 2011-07-01 2011-07-01 false Nucleotide and/or amino acid... Biotechnology Invention Disclosures Application Disclosures Containing Nucleotide And/or Amino Acid Sequences § 1.821 Nucleotide and/or amino acid sequence disclosures in patent applications. (a) Nucleotide and...

  5. Exploring the environmental diversity of kinetoplastid flagellates in the high-throughput DNA sequencing era

    PubMed Central

    d’Avila-Levy, Claudia Masini; Boucinha, Carolina; Kostygov, Alexei; Santos, Helena Lúcia Carneiro; Morelli, Karina Alessandra; Grybchuk-Ieremenko, Anastasiia; Duval, Linda; Votýpka, Jan; Yurchenko, Vyacheslav; Grellier, Philippe; Lukeš, Julius

    2015-01-01

    The class Kinetoplastea encompasses both free-living and parasitic species from a wide range of hosts. Several representatives of this group are responsible for severe human diseases and for economic losses in agriculture and livestock. While this group encompasses over 30 genera, most of the available information has been derived from the vertebrate pathogenic genera Leishmania and Trypanosoma. Recent studies of the previously neglected groups of Kinetoplastea indicated that the actual diversity is much higher than previously thought. This article discusses the known segment of kinetoplastid diversity and how gene-directed Sanger sequencing and next-generation sequencing methods can help to deepen our knowledge of these interesting protists. PMID:26602872

  6. Differential sequence diversity at merozoite surface protein-1 locus of Plasmodium knowlesi from humans and macaques in Thailand.

    PubMed

    Putaporntip, Chaturong; Thongaree, Siriporn; Jongwutiwes, Somchai

    2013-08-01

    To determine the genetic diversity and potential transmission routes of Plasmodium knowlesi, we analyzed the complete nucleotide sequence of the gene encoding the merozoite surface protein-1 of this simian malaria (Pkmsp-1), an asexual blood-stage vaccine candidate, from naturally infected humans and macaques in Thailand. Analysis of Pkmsp-1 sequences from humans (n=12) and monkeys (n=12) reveals five conserved and four variable domains. Most nucleotide substitutions in conserved domains were dimorphic whereas three of four variable domains contained complex repeats with extensive sequence and size variation. Besides purifying selection in conserved domains, evidence of intragenic recombination scattered across Pkmsp-1 was detected. The number of haplotypes, haplotype diversity, nucleotide diversity and recombination sites of human-derived sequences exceeded those of monkey-derived sequences. Phylogenetic networks based on concatenated conserved sequences of Pkmsp-1 displayed a character pattern that could have arisen from the sampling process or from the presence of two independent routes of P. knowlesi transmission, i.e. from macaques to humans and from human to human in Thailand. Copyright © 2013 Elsevier B.V. All rights reserved.

  7. New Tools For Understanding Microbial Diversity Using High-throughput Sequence Data

    NASA Astrophysics Data System (ADS)

    Knight, R.; Hamady, M.; Liu, Z.; Lozupone, C.

    2007-12-01

    High-throughput sequencing techniques such as 454 are straining the limits of tools traditionally used to build trees, choose OTUs, and perform other essential sequencing tasks. We have developed a workflow for phylogenetic analysis of large-scale sequence data sets that combines existing tools, such as the Arb phylogeny package and the NAST multiple sequence alignment tool, with new methods for choosing and clustering OTUs and for performing phylogenetic community analysis with UniFrac. This talk discusses the cyberinfrastructure we are developing to support the human microbiome project, and the application of these workflows to analyze very large data sets that contrast the gut microbiota with a range of physical environments. These tools will ultimately help to define core and peripheral microbiomes in a range of environments, and will allow us to understand the physical and biotic factors that contribute most to differences in microbial diversity.

  8. Intermediary metabolism in protists: a sequence-based view of facultative anaerobic metabolism in evolutionarily diverse eukaryotes.

    PubMed

    Ginger, Michael L; Fritz-Laylin, Lillian K; Fulton, Chandler; Cande, W Zacheus; Dawson, Scott C

    2010-12-01

    Protists account for the bulk of eukaryotic diversity. Through studies of gene and especially genome sequences the molecular basis for this diversity can be determined. Evident from genome sequencing are examples of versatile metabolism that go far beyond the canonical pathways described for eukaryotes in textbooks. In the last 2-3 years, genome sequencing and transcript profiling has unveiled several examples of heterotrophic and phototrophic protists that are unexpectedly well-equipped for ATP production using a facultative anaerobic metabolism, including some protists that can (Chlamydomonas reinhardtii) or are predicted (Naegleria gruberi, Acanthamoeba castellanii, Amoebidium parasiticum) to produce H(2) in their metabolism. It is possible that some enzymes of anaerobic metabolism were acquired and distributed among eukaryotes by lateral transfer, but it is also likely that the common ancestor of eukaryotes already had far more metabolic versatility than was widely thought a few years ago. The discussion of core energy metabolism in unicellular eukaryotes is the subject of this review. Since genomic sequencing has so far only touched the surface of protist diversity, it is anticipated that sequences of additional protists may reveal an even wider range of metabolic capabilities, while simultaneously enriching our understanding of the early evolution of eukaryotes. Copyright © 2010 Elsevier GmbH. All rights reserved.

  9. Intermediary Metabolism in Protists: a Sequence-based View of Facultative Anaerobic Metabolism in Evolutionarily Diverse Eukaryotes

    PubMed Central

    Ginger, Michael L.; Fritz-Laylin, Lillian K.; Fulton, Chandler; Cande, W. Zacheus; Dawson, Scott C.

    2011-01-01

    Protists account for the bulk of eukaryotic diversity. Through studies of gene and especially genome sequences the molecular basis for this diversity can be determined. Evident from genome sequencing are examples of versatile metabolism that go far beyond the canonical pathways described for eukaryotes in textbooks. In the last 2–3 years, genome sequencing and transcript profiling has unveiled several examples of heterotrophic and phototrophic protists that are unexpectedly well-equipped for ATP production using a facultative anaerobic metabolism, including some protists that can (Chlamydomonas reinhardtii) or are predicted (Naegleria gruberi, Acanthamoeba castellanii, Amoebidium parasiticum) to produce H2 in their metabolism. It is possible that some enzymes of anaerobic metabolism were acquired and distributed among eukaryotes by lateral transfer, but it is also likely that the common ancestor of eukaryotes already had far more metabolic versatility than was widely thought a few years ago. The discussion of core energy metabolism in unicellular eukaryotes is the subject of this review. Since genomic sequencing has so far only touched the surface of protist diversity, it is anticipated that sequences of additional protists may reveal an even wider range of metabolic capabilities, while simultaneously enriching our understanding of the early evolution of eukaryotes. PMID:21036663

  10. Linking secondary metabolites to gene clusters through genome sequencing of six diverse Aspergillus species

    DOE PAGES

    Kjerbolling, Inge; Vesth, Tammi C.; Frisvad, Jens C.; ...

    2018-01-09

    The fungal genus of Aspergillus is highly interesting, containing everything from industrial cell factories over model organisms to human pathogens. In particular, this group has a prolific production of bioactive secondary metabolites (SMs). In this work, four diverse Aspergillus species (A. campestris, A. novofumigatus, A. ochraceoroseus and A. steynii) have been whole genome PacBio sequenced to provide genetic references in three Aspergillus sections. Additionally, A. taichungensis and A. candidus were sequenced for SM elucidation. Thirteen Aspergillus genomes were analysed with comparative genomics to determine phylogeny and genetic diversity, showing that each new genome contains 15–27% genes not found in other sequenced Aspergilli. In particular, the new species A. novofumigatus was compared to the pathogenic species A. fumigatus. This suggests that A. novofumigatus can produce most of the same allergens, virulence and pathogenicity factors as A. fumigatus suggesting that A. novofumigatus could be as pathogenic as A. fumigatus. Furthermore, SMs were linked to gene clusters based on biological and chemical knowledge and analysis, genome sequences and predictive algorithms.

  11. Linking secondary metabolites to gene clusters through genome sequencing of six diverse Aspergillus species

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kjerbolling, Inge; Vesth, Tammi C.; Frisvad, Jens C.

    The fungal genus of Aspergillus is highly interesting, containing everything from industrial cell factories over model organisms to human pathogens. In particular, this group has a prolific production of bioactive secondary metabolites (SMs). In this work, four diverse Aspergillus species (A. campestris, A. novofumigatus, A. ochraceoroseus and A. steynii) have been whole genome PacBio sequenced to provide genetic references in three Aspergillus sections. Additionally, A. taichungensis and A. candidus were sequenced for SM elucidation. Thirteen Aspergillus genomes were analysed with comparative genomics to determine phylogeny and genetic diversity, showing that each new genome contains 15–27% genes not found in other sequenced Aspergilli. In particular, the new species A. novofumigatus was compared to the pathogenic species A. fumigatus. This suggests that A. novofumigatus can produce most of the same allergens, virulence and pathogenicity factors as A. fumigatus suggesting that A. novofumigatus could be as pathogenic as A. fumigatus. Furthermore, SMs were linked to gene clusters based on biological and chemical knowledge and analysis, genome sequences and predictive algorithms.

  12. The complete amino acid sequence of human skeletal-muscle fructose-bisphosphate aldolase.

    PubMed Central

    Freemont, P S; Dunbar, B; Fothergill-Gilmore, L A

    1988-01-01

    The complete amino acid sequence of human skeletal-muscle fructose-bisphosphate aldolase, comprising 363 residues, was determined. The sequence was deduced by automated sequencing of CNBr-cleavage, o-iodosobenzoic acid-cleavage, trypsin-digest and staphylococcal-proteinase-digest fragments. Comparison of the sequence with other class I aldolase sequences shows that the mammalian muscle isoenzyme is one of the most highly conserved enzymes known, with only about 2% of the residues changing per 100 million years. Non-mammalian aldolases appear to be evolving at the same rate as other glycolytic enzymes, with about 4% of the residues changing per 100 million years. Secondary-structure predictions are analysed in an accompanying paper [Sawyer, Fothergill-Gilmore & Freemont (1988) Biochem. J. 249, 789-793]. PMID:3355497

  13. Diversity amongst trigeminal neurons revealed by high throughput single cell sequencing

    PubMed Central

    Nguyen, Minh Q.; Wu, Youmei; Bonilla, Lauren S.; von Buchholtz, Lars J.

    2017-01-01

    The trigeminal ganglion contains somatosensory neurons that detect a range of thermal, mechanical and chemical cues and innervate unique sensory compartments in the head and neck including the eyes, nose, mouth, meninges and vibrissae. We used single-cell sequencing and in situ hybridization to examine the cellular diversity of the trigeminal ganglion in mice, defining thirteen clusters of neurons. We show that clusters are well conserved in dorsal root ganglia suggesting they represent distinct functional classes of somatosensory neurons and not specialization associated with their sensory targets. Notably, functionally important genes (e.g. the mechanosensory channel Piezo2 and the capsaicin gated ion channel Trpv1) segregate into multiple clusters and often are expressed in subsets of cells within a cluster. Therefore, the 13 genetically-defined classes are likely to be physiologically heterogeneous rather than highly parallel (i.e., redundant) lines of sensory input. Our analysis harnesses the power of single-cell sequencing to provide a unique platform for in silico expression profiling that complements other approaches linking gene-expression with function and exposes unexpected diversity in the somatosensory system. PMID:28957441

  14. Complete sequence and diversity of a maize-associated Polerovirus in East Africa

    USDA-ARS's Scientific Manuscript database

    Since 2011-2012, Maize lethal necrosis (MLN) has emerged in East Africa, causing massive yield loss and propelling research to identify viruses and virus populations present in maize. As expected, next generation sequencing (NGS) has revealed diverse and abundant viruses from the family Potyviridae,...

  15. mtDNA sequence diversity of Hazara ethnic group from Pakistan.

    PubMed

    Rakha, Allah; Fatima; Peng, Min-Sheng; Adan, Atif; Bi, Rui; Yasmin, Memona; Yao, Yong-Gang

    2017-09-01

    The present study was undertaken to investigate mitochondrial DNA (mtDNA) control region sequences of Hazaras from Pakistan, so as to generate mtDNA reference database for forensic casework in Pakistan and to analyze phylogenetic relationship of this particular ethnic group with geographically proximal populations. Complete mtDNA control region (nt 16024-576) sequences were generated through Sanger Sequencing for 319 Hazara individuals from Quetta, Baluchistan. The population sample set showed a total of 189 distinct haplotypes, belonging mainly to West Eurasian (51.72%), East & Southeast Asian (29.78%) and South Asian (18.50%) haplogroups. Compared with other populations from Pakistan, the Hazara population had a relatively high haplotype diversity (0.9945) and a lower random match probability (0.0085). The dataset has been incorporated into EMPOP database under accession number EMP00680. The data herein comprises the largest, and likely most thoroughly examined, control region mtDNA dataset from Hazaras of Pakistan. Copyright © 2017 Elsevier B.V. All rights reserved.
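
    The two summary statistics quoted above can be related with a short sketch. It assumes the commonly used definitions (random match probability as the sum of squared haplotype frequencies, and haplotype diversity as Nei's unbiased estimator n/(n-1) times one minus that sum); the abstract does not state which formulas the authors applied, and the function names and toy haplotype labels below are illustrative only.

```python
from collections import Counter


def random_match_probability(haplotypes):
    """Sum of squared haplotype frequencies (assumed definition)."""
    n = len(haplotypes)
    counts = Counter(haplotypes)
    return sum((c / n) ** 2 for c in counts.values())


def diversity_from_rmp(n, rmp):
    """Nei's unbiased haplotype diversity from sample size and match probability."""
    if n < 2:
        raise ValueError("need at least two individuals")
    return n / (n - 1) * (1.0 - rmp)


def haplotype_diversity(haplotypes):
    """Haplotype diversity computed directly from a list of haplotype labels."""
    return diversity_from_rmp(len(haplotypes), random_match_probability(haplotypes))


# Toy sample (hypothetical labels, not data from the study).
toy = ["H1", "H1", "H2", "H3"]
toy_rmp = random_match_probability(toy)
toy_h = haplotype_diversity(toy)

# Consistency check using only the figures reported in the abstract.
reported_h = diversity_from_rmp(319, 0.0085)
```

    Under these assumed definitions, 319 individuals with a match probability of 0.0085 give a diversity of about 0.9946, which agrees with the reported 0.9945 to within the rounding of the published match probability.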

  16. Genetic diversity analysis of Leuconostoc mesenteroides from Korean vegetables and food products by multilocus sequence typing.

    PubMed

    Sharma, Anshul; Kaur, Jasmine; Lee, Sulhee; Park, Young-Seo

    2018-06-01

    In the present study, 35 Leuconostoc mesenteroides strains isolated from vegetables and food products from South Korea were studied by multilocus sequence typing (MLST) of seven housekeeping genes (atpA, groEL, gyrB, pheS, pyrG, rpoA, and uvrC). The fragment sizes of the seven amplified housekeeping genes ranged in length from 366 to 1414 bp. Sequence analysis indicated 27 different sequence types (STs) with 25 of them being represented by a single strain indicating high genetic diversity, whereas the remaining 2 were characterized by five strains each. In total, 220 polymorphic nucleotide sites were detected among seven housekeeping genes. The phylogenetic analysis based on the STs of the seven loci indicated that the 35 strains belonged to two major groups, A (28 strains) and B (7 strains). Split decomposition analysis showed that intraspecies recombination played a role in generating diversity among strains. The minimum spanning tree showed that the evolution of the STs was not correlated with food source. This study indicates that multilocus sequence typing is a valuable tool to assess the genetic diversity among L. mesenteroides strains from South Korea and can be used further to monitor the evolutionary changes.
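
    The way MLST turns per-locus sequences into sequence types can be sketched as follows. This is a generic illustration rather than the authors' pipeline: each distinct sequence at a locus receives an allele number, and each distinct combination of allele numbers across loci receives an ST number. The function name, the two-locus toy data and the numbering scheme (first seen, first numbered) are assumptions made for the example.

```python
def assign_sequence_types(strain_alleles, loci):
    """Map each strain to an ST number based on its allelic profile.

    strain_alleles: {strain: {locus: sequence}}
    loci: ordered list of locus names defining the profile
    """
    allele_ids = {locus: {} for locus in loci}  # sequence -> allele number, per locus
    st_ids = {}                                  # allelic profile -> ST number
    result = {}
    for strain, seqs in strain_alleles.items():
        profile = tuple(
            allele_ids[locus].setdefault(seqs[locus], len(allele_ids[locus]) + 1)
            for locus in loci
        )
        result[strain] = st_ids.setdefault(profile, len(st_ids) + 1)
    return result


# Toy data with two loci only (hypothetical sequences, not from the study).
toy_strains = {
    "s1": {"atpA": "ACGT", "gyrB": "TTGA"},
    "s2": {"atpA": "ACGT", "gyrB": "TTGA"},
    "s3": {"atpA": "ACGA", "gyrB": "TTGA"},
}
toy_sts = assign_sequence_types(toy_strains, ["atpA", "gyrB"])
```

    In the toy data, s1 and s2 share a profile and therefore an ST, while the single substitution in atpA gives s3 a new one. The counts in the abstract are consistent with this bookkeeping: 25 single-strain STs plus 2 STs of five strains each account for all 35 strains.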

  17. Genetic diversity assessment of anoxygenic photosynthetic bacteria by distance-based grouping analysis of pufM sequences.

    PubMed

    Zeng, Y H; Chen, X H; Jiao, N Z

    2007-12-01

    The aim was to assess how completely the diversity of anoxygenic phototrophic bacteria (APB) has been sampled in natural environments. All nucleotide sequences of the APB marker gene pufM from cultures and environmental clones were retrieved from the GenBank database. A set of cutoff values (sequence distances 0.06, 0.15 and 0.48 for species, genus, and (sub)phylum levels, respectively) was established using a distance-based grouping program. Analysis of the environmental clones revealed that current efforts on APB isolation and sampling in natural environments are largely inadequate. Analysis of the average distance between each identified genus and an uncultured environmental pufM sequence indicated that the majority of cultured APB genera lack environmental representatives. The distance-based grouping method is fast and efficient for bulk analysis of functional gene sequences. The results clearly show that we are at a relatively early stage in sampling the global richness of APB species. Periodical assessment will undoubtedly facilitate in-depth analysis of potential biogeographical distribution pattern of APB. This is the first attempt to assess the present understanding of APB diversity in natural environments. The method used is also useful for assessing the diversity of other functional genes.
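
    Grouping sequences at a distance cutoff can be sketched as below. The abstract does not name the grouping program or its linkage rule, so this example assumes simple single-linkage grouping (two sequences fall in the same group if a chain of pairwise distances at or below the cutoff connects them); the function name and the toy distance matrix are hypothetical, and only the three cutoff values come from the abstract.

```python
def group_by_distance(names, dist, cutoff):
    """Single-linkage grouping of sequences at a given distance cutoff.

    names: list of sequence names
    dist: {(a, b): distance} for every pair with a listed before b in names
    """
    parent = {n: n for n in names}

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if dist[(a, b)] <= cutoff:
                parent[find(a)] = find(b)

    groups = {}
    for n in names:
        groups.setdefault(find(n), []).append(n)
    return sorted(sorted(g) for g in groups.values())


# Toy pairwise distances (hypothetical values, not from the study).
names = ["s1", "s2", "s3", "s4"]
dist = {
    ("s1", "s2"): 0.03, ("s1", "s3"): 0.10, ("s1", "s4"): 0.50,
    ("s2", "s3"): 0.12, ("s2", "s4"): 0.52, ("s3", "s4"): 0.40,
}
species_level = group_by_distance(names, dist, 0.06)
genus_level = group_by_distance(names, dist, 0.15)
phylum_level = group_by_distance(names, dist, 0.48)
```

    With the toy matrix, the 0.06 cutoff yields three groups, 0.15 yields two, and 0.48 merges everything into one, which mirrors how a looser cutoff corresponds to a higher taxonomic level. A complete-linkage or average-linkage rule would give different groups for the same distances.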

  18. Salmonella enterica Prophage Sequence Profiles Reflect Genome Diversity and Can Be Used for High Discrimination Subtyping.

    PubMed

    Mottawea, Walid; Duceppe, Marc-Olivier; Dupras, Andrée A; Usongo, Valentine; Jeukens, Julie; Freschi, Luca; Emond-Rheault, Jean-Guillaume; Hamel, Jeremie; Kukavica-Ibrulj, Irena; Boyle, Brian; Gill, Alexander; Burnett, Elton; Franz, Eelco; Arya, Gitanjali; Weadge, Joel T; Gruenheid, Samantha; Wiedmann, Martin; Huang, Hongsheng; Daigle, France; Moineau, Sylvain; Bekal, Sadjia; Levesque, Roger C; Goodridge, Lawrence D; Ogunremi, Dele

    2018-01-01

    Non-typhoidal Salmonella is a leading cause of foodborne illness worldwide. Prompt and accurate identification of the sources of Salmonella responsible for disease outbreaks is crucial to minimize infections and eliminate ongoing sources of contamination. Current subtyping tools including single nucleotide polymorphism (SNP) typing may be inadequate, in some instances, to provide the required discrimination among epidemiologically unrelated Salmonella strains. Prophage genes represent the majority of the accessory genes in bacteria genomes and have potential to be used as high discrimination markers in Salmonella. In this study, the prophage sequence diversity in different Salmonella serovars and genetically related strains was investigated. Using whole genome sequences of 1,760 isolates of S. enterica representing 151 Salmonella serovars and 66 closely related bacteria, prophage sequences were identified from assembled contigs using PHASTER. We detected 154 different prophages in S. enterica genomes. Prophage sequences were highly variable among S. enterica serovars with a median ± interquartile range (IQR) of 5 ± 3 prophage regions per genome. While some prophage sequences were highly conserved among the strains of specific serovars, few regions were lineage specific. Therefore, strains belonging to each serovar could be clustered separately based on their prophage content. Analysis of S. Enteritidis isolates from seven outbreaks generated distinct prophage profiles for each outbreak. Taken altogether, the diversity of the prophage sequences correlates with genome diversity. Prophage repertoires provide an additional marker for differentiating S. enterica subtypes during foodborne outbreaks.

  19. Comparative characterization of random-sequence proteins consisting of 5, 12, and 20 kinds of amino acids

    PubMed Central

    Tanaka, Junko; Doi, Nobuhide; Takashima, Hideaki; Yanagawa, Hiroshi

    2010-01-01

    Screening of functional proteins from a random-sequence library has been used to evolve novel proteins in the field of evolutionary protein engineering. However, random-sequence proteins consisting of the 20 natural amino acids tend to aggregate, and the occurrence rate of functional proteins in a random-sequence library is low. From the viewpoint of the origin of life, it has been proposed that primordial proteins consisted of a limited set of amino acids that could have been abundantly formed early during chemical evolution. We have previously found that members of a random-sequence protein library constructed with five primitive amino acids show high solubility (Doi et al., Protein Eng Des Sel 2005;18:279–284). Although such a library is expected to be appropriate for finding functional proteins, the functionality may be limited, because they have no positively charged amino acid. Here, we constructed three libraries of 120-amino acid, random-sequence proteins using alphabets of 5, 12, and 20 amino acids by preselection using mRNA display (to eliminate sequences containing stop codons and frameshifts) and characterized and compared the structural properties of random-sequence proteins arbitrarily chosen from these libraries. We found that random-sequence proteins constructed with the 12-member alphabet (including five primitive amino acids and positively charged amino acids) have higher solubility than those constructed with the 20-member alphabet, though other biophysical properties are very similar in the two libraries. Thus, a library of moderate complexity constructed from 12 amino acids may be a more appropriate resource for functional screening than one constructed from 20 amino acids. PMID:20162614

  20. Distribution and diversity of Verrucomicrobia methanotrophs in geothermal and acidic environments.

    PubMed

    Sharp, Christine E; Smirnova, Angela V; Graham, Jaime M; Stott, Matthew B; Khadka, Roshan; Moore, Tim R; Grasby, Stephen E; Strack, Maria; Dunfield, Peter F

    2014-06-01

    Recently, methanotrophic members of the phylum Verrucomicrobia have been described, but little is known about their distribution in nature. We surveyed methanotrophic bacteria in geothermal springs and acidic wetlands via pyrosequencing of 16S rRNA gene amplicons. Putative methanotrophic Verrucomicrobia were found in samples covering a broad temperature range (22.5-81.6°C), but only in acidic conditions (pH 1.8-5.0) and only in geothermal environments, not in acidic bogs or fens. Phylogenetically, three 16S rRNA gene sequence clusters of putative methanotrophic Verrucomicrobia were observed. Those detected in high-temperature geothermal samples (44.1-81.6°C) grouped with known thermoacidiphilic 'Methylacidiphilum' isolates. A second group dominated in moderate-temperature geothermal samples (22.5-40.1°C) and a representative mesophilic methanotroph from this group was isolated (strain LP2A). Genome sequencing verified that strain LP2A possessed particulate methane monooxygenase, but its 16S rRNA gene sequence identity to 'Methylacidiphilum infernorum' strain V4 was only 90.6%. A third group clustered distantly with known methanotrophic Verrucomicrobia. Using pmoA-gene targeted quantitative polymerase chain reaction, two geothermal soil profiles showed a dominance of LP2A-like pmoA sequences in the cooler surface layers and 'Methylacidiphilum'-like pmoA sequences in deeper, hotter layers. Based on these results, there appears to be a thermophilic group and a mesophilic group of methanotrophic Verrucomicrobia. However, both were detected only in acidic geothermal environments. © 2014 Society for Applied Microbiology and John Wiley & Sons Ltd.

  1. 37 CFR 1.822 - Symbols and format to be used for nucleotide and/or amino acid sequence data.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... for nucleotide and/or amino acid sequence data. 1.822 Section 1.822 Patents, Trademarks, and... Amino Acid Sequences § 1.822 Symbols and format to be used for nucleotide and/or amino acid sequence data. (a) The symbols and format to be used for nucleotide and/or amino acid sequence data shall...

  2. New Insight Into the Diversity of SemiSWEET Sugar Transporters and the Homologs in Prokaryotes

    PubMed Central

    Jia, Baolei; Hao, Lujiang; Xuan, Yuan Hu; Jeon, Che Ok

    2018-01-01

    Sugars will eventually be exported transporters (SWEETs) and SemiSWEETs represent a family of sugar transporters in eukaryotes and prokaryotes, respectively. SWEETs contain seven transmembrane helices (TMHs), while SemiSWEETs contain three. The functions of SemiSWEETs are less studied. In this perspective article, we analyzed the diversity and conservation of SemiSWEETs and further proposed the possible functions. 1,922 SemiSWEET homologs were retrieved from the UniProt database, which is not proportional to the sequenced prokaryotic genomes. However, these proteins are very diverse in sequences and can be classified into 19 clusters when >50% sequence identity is required. Moreover, a gene context analysis indicated that several SemiSWEETs are located in the operons that are related to diverse carbohydrate metabolism. Several proteins with seven TMHs can be found in bacteria, and sequence alignment suggested that these proteins in bacteria may be formed by the duplication and fusion. Multiple sequence alignments showed that the amino acids for sugar translocation are still conserved and coevolved, although the sequences show diversity. Among them, the functions of a few amino acids are still not clear. These findings highlight the challenges that exist in SemiSWEETs and provide future researchers the foundation to explore these uncharted areas. PMID:29872447

  3. New Insight Into the Diversity of SemiSWEET Sugar Transporters and the Homologs in Prokaryotes.

    PubMed

    Jia, Baolei; Hao, Lujiang; Xuan, Yuan Hu; Jeon, Che Ok

    2018-01-01

    Sugars will eventually be exported transporters (SWEETs) and SemiSWEETs represent a family of sugar transporters in eukaryotes and prokaryotes, respectively. SWEETs contain seven transmembrane helices (TMHs), while SemiSWEETs contain three. The functions of SemiSWEETs are less studied. In this perspective article, we analyzed the diversity and conservation of SemiSWEETs and further proposed the possible functions. 1,922 SemiSWEET homologs were retrieved from the UniProt database, which is not proportional to the sequenced prokaryotic genomes. However, these proteins are very diverse in sequences and can be classified into 19 clusters when >50% sequence identity is required. Moreover, a gene context analysis indicated that several SemiSWEETs are located in the operons that are related to diverse carbohydrate metabolism. Several proteins with seven TMHs can be found in bacteria, and sequence alignment suggested that these proteins in bacteria may be formed by the duplication and fusion. Multiple sequence alignments showed that the amino acids for sugar translocation are still conserved and coevolved, although the sequences show diversity. Among them, the functions of a few amino acids are still not clear. These findings highlight the challenges that exist in SemiSWEETs and provide future researchers the foundation to explore these uncharted areas.

  4. Amino acid sequence of a trypsin inhibitor from a Spirometra (Spirometra erinaceieuropaei).

    PubMed

    Sanda, A; Uchida, A; Itagaki, T; Kobayashi, H; Inokuchi, N; Koyama, T; Iwama, M; Ohgi, K; Irie, M

    2001-12-01

    A trypsin inhibitor that is highly homologous with bovine pancreatic trypsin inhibitor (BPTI) was co-purified along with RNase from Spirometra (Spirometra erinaceieuropaei). The amino acid sequence of this inhibitor (SETI) and the nucleotide sequence of the cDNA encoding this protein were determined by protein chemistry and gene technology. SETI contains 68 amino acid residues and has a molecular mass of 7,798 Da. SETI has 31 amino acid residues that are identical with BPTI's sequence, including 6 half-cystine and 5 aromatic amino acid residues. The active site Lys residue in BPTI is replaced by an Arg residue in SETI. SETI is an effective inhibitor of trypsin and moderately inhibits α-chymotrypsin, but inhibits elastase and subtilisin less strongly. SETI was expressed by E. coli containing a PelB vector carrying the SETI encoding cDNA; an expression yield of 0.68 mg/l was obtained. The phylogenetic relationship of SETI and the other BPTI-like trypsin inhibitors was analyzed using maximum likelihood inference methods.

  5. Twenty-one genome sequences from Pseudomonas species and 19 genome sequences from diverse bacteria isolated from the rhizosphere and endosphere of Populus deltoides.

    PubMed

    Brown, Steven D; Utturkar, Sagar M; Klingeman, Dawn M; Johnson, Courtney M; Martin, Stanton L; Land, Miriam L; Lu, Tse-Yuan S; Schadt, Christopher W; Doktycz, Mitchel J; Pelletier, Dale A

    2012-11-01

    To aid in the investigation of the Populus deltoides microbiome, we generated draft genome sequences for 21 Pseudomonas strains and 19 other diverse bacteria isolated from Populus deltoides roots. Genome sequences for isolates similar to Acidovorax, Bradyrhizobium, Brevibacillus, Caulobacter, Chryseobacterium, Flavobacterium, Herbaspirillum, Novosphingobium, Pantoea, Phyllobacterium, Polaromonas, Rhizobium, Sphingobium, and Variovorax were generated.

  6. 37 CFR 1.823 - Requirements for nucleotide and/or amino acid sequences as part of the application.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... and/or amino acid sequences as part of the application. 1.823 Section 1.823 Patents, Trademarks, and... Amino Acid Sequences § 1.823 Requirements for nucleotide and/or amino acid sequences as part of the... incorporation-by-reference of the Sequence Listing as required by § 1.52(e)(5). The presentation of the...

  7. 37 CFR 1.823 - Requirements for nucleotide and/or amino acid sequences as part of the application.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... and/or amino acid sequences as part of the application. 1.823 Section 1.823 Patents, Trademarks, and... Amino Acid Sequences § 1.823 Requirements for nucleotide and/or amino acid sequences as part of the... incorporation-by-reference of the Sequence Listing as required by § 1.52(e)(5). The presentation of the...

  8. 37 CFR 1.823 - Requirements for nucleotide and/or amino acid sequences as part of the application.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... and/or amino acid sequences as part of the application. 1.823 Section 1.823 Patents, Trademarks, and... Amino Acid Sequences § 1.823 Requirements for nucleotide and/or amino acid sequences as part of the... incorporation-by-reference of the Sequence Listing as required by § 1.52(e)(5). The presentation of the...

  9. 37 CFR 1.823 - Requirements for nucleotide and/or amino acid sequences as part of the application.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... and/or amino acid sequences as part of the application. 1.823 Section 1.823 Patents, Trademarks, and... Amino Acid Sequences § 1.823 Requirements for nucleotide and/or amino acid sequences as part of the... incorporation-by-reference of the Sequence Listing as required by § 1.52(e)(5). The presentation of the...

  10. 37 CFR 1.823 - Requirements for nucleotide and/or amino acid sequences as part of the application.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... and/or amino acid sequences as part of the application. 1.823 Section 1.823 Patents, Trademarks, and... Amino Acid Sequences § 1.823 Requirements for nucleotide and/or amino acid sequences as part of the... incorporation-by-reference of the Sequence Listing as required by § 1.52(e)(5). The presentation of the...

  11. Nucleotide sequence analysis of the gene encoding the Deinococcus radiodurans surface protein, derived amino acid sequence, and complementary protein chemical studies

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Peters, J.; Peters, M.; Lottspeich, F.

    1987-11-01

    The complete nucleotide sequence of the gene encoding the surface (hexagonally packed intermediate (HPI))-layer polypeptide of Deinococcus radiodurans Sark was determined and found to encode a polypeptide of 1036 amino acids. Amino acid sequence analysis of about 30% of the residues revealed that the mature polypeptide consists of at least 978 amino acids. The N terminus was blocked to Edman degradation. The results of proteolytic modification of the HPI layer in situ and M_r estimations of the HPI polypeptide expressed in Escherichia coli indicated that there is a leader sequence. The N-terminal region contained a very high percentage (29%) of threonine and serine, including a cluster of nine consecutive serine or threonine residues, whereas a stretch near the C terminus was extremely rich in aromatic amino acids (29%). The protein contained at least two disulfide bridges, as well as tightly bound reducing sugars and fatty acids.

  12. Determining Clostridium difficile intra-taxa diversity by mining multilocus sequence typing databases.

    PubMed

    Muñoz, Marina; Ríos-Chaparro, Dora Inés; Patarroyo, Manuel Alfonso; Ramírez, Juan David

    2017-03-14

    Multilocus sequence typing (MLST) is a highly discriminatory typing strategy; it is reproducible and scalable. There is an MLST scheme for Clostridium difficile (CD), a Gram-positive bacillus causing different pathologies of the gastrointestinal tract. This work was aimed at describing the frequency of sequence types (STs) and Clades (C) reported and evaluating the intra-taxa diversity in the CD MLST database (CD-MLST-db) using an MLSA approach. Analysis of 1778 available isolates showed that clade 1 (C1) was the most frequent worldwide (57.7%), followed by C2 (29.1%). Regarding sequence types (STs), it was found that ST-1, belonging to C2, was the most frequent. The isolates analysed came from 17 countries, mostly from the United Kingdom (UK) (1541 STs, 87.0%). The diversity of the seven housekeeping genes in the MLST scheme was evaluated, and alleles from the profiles (STs), for identifying CD population structure. It was found that adk and atpA are conserved genes allowing a limited number of clusters to be discriminated; however, different genes such as drx, glyA and particularly sodA showed high diversity indexes and grouped CD populations in many clusters, suggesting that these genes' contribution to CD typing should be revised. It was identified that CD STs reported to date have a mostly clonal population structure with foreseen events of recombination; however, one group of STs was not assigned to a clade being highly different containing at least nine well-supported clusters, suggesting a greater number of clades for CD. This study shows the usefulness of CD-MLST-db as a tool for studying CD distribution and population structure, identifying the need for reviewing the usefulness of sodA as housekeeping gene within the MLST scheme and suggesting the existence of a greater number of CD clades. The study also shows the plausible exchange of genetic material between STs, contributing towards intra-taxa genetic diversity.
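
    The abstract reports "diversity indexes" per housekeeping gene without naming the index. As an illustration only, the sketch below computes the Hunter-Gaston discriminatory index (a Simpson-type index often used for typing schemes); choosing this particular index is an assumption, and the allele labels are hypothetical.

```python
from collections import Counter
from typing import Hashable, Iterable


def hunter_gaston_index(types: Iterable[Hashable]) -> float:
    """Probability that two isolates drawn without replacement have different types.

    D = 1 - sum_j n_j (n_j - 1) / (N (N - 1))
    where n_j is the number of isolates of type j and N is the total.
    """
    counts = Counter(types)
    n = sum(counts.values())
    if n < 2:
        raise ValueError("need at least two isolates")
    same_type_pairs = sum(c * (c - 1) for c in counts.values())
    return 1.0 - same_type_pairs / (n * (n - 1))


# Hypothetical allele calls at one locus for four isolates.
d_locus = hunter_gaston_index(["allele1", "allele1", "allele2", "allele2"])
```

    A locus where every isolate carries the same allele gives 0, and a locus where every isolate carries a distinct allele gives 1, so conserved genes score low and highly variable genes score high.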

  13. Twenty-One Genome Sequences from Pseudomonas Species and 19 Genome Sequences from Diverse Bacteria Isolated from the Rhizosphere and Endosphere of Populus deltoides

    PubMed Central

    Utturkar, Sagar M.; Klingeman, Dawn M.; Johnson, Courtney M.; Martin, Stanton L.; Land, Miriam L.; Lu, Tse-Yuan S.; Schadt, Christopher W.; Doktycz, Mitchel J.

    2012-01-01

    To aid in the investigation of the Populus deltoides microbiome, we generated draft genome sequences for 21 Pseudomonas strains and 19 other diverse bacteria isolated from Populus deltoides roots. Genome sequences for isolates similar to Acidovorax, Bradyrhizobium, Brevibacillus, Caulobacter, Chryseobacterium, Flavobacterium, Herbaspirillum, Novosphingobium, Pantoea, Phyllobacterium, Polaromonas, Rhizobium, Sphingobium, and Variovorax were generated. PMID:23045501

  14. Twenty-One Genome Sequences from Pseudomonas Species and 19 Genome Sequences from Diverse Bacteria Isolated from the Rhizosphere and Endosphere of Populus deltoides

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Brown, Steven D; Utturkar, Sagar M; Klingeman, Dawn Marie

    To aid in the investigation of the Populus deltoides microbiome we generated draft genome sequences for twenty one Pseudomonas and twenty one other diverse bacteria isolated from Populus deltoides roots. Genome sequences for isolates similar to Acidovorax, Bradyrhizobium, Brevibacillus, Burkholderia, Caulobacter, Chryseobacterium, Flavobacterium, Herbaspirillum, Novosphingobium, Pantoea, Phyllobacterium, Polaromonas, Rhizobium, Sphingobium and Variovorax were generated.

  15. Characterization of relative abundance of lactic acid bacteria species in French organic sourdough by cultural, qPCR and MiSeq high-throughput sequencing methods.

    PubMed

    Michel, Elisa; Monfort, Clarisse; Deffrasnes, Marion; Guezenec, Stéphane; Lhomme, Emilie; Barret, Matthieu; Sicard, Delphine; Dousset, Xavier; Onno, Bernard

    2016-12-19

    In order to contribute to the description of sourdough LAB composition, MiSeq sequencing and qPCR methods were performed in association with cultural methods. A panel of 16 French organic bakers and farmer-bakers were selected for this work. The lactic acid bacteria (LAB) diversity of their organic sourdoughs was investigated quantitatively and qualitatively combining (i) Lactobacillus sanfranciscensis-specific qPCR, (ii) global sequencing with MiSeq Illumina technology and (iii) molecular isolates identification. In addition, LAB and yeast enumeration, pH, Total Titratable Acidity, organic acids and bread specific volume were analyzed. Microbial and physico-chemical data were statistically treated by Principal Component Analysis (PCA) and Hierarchical Ascendant Classification (HAC). Total yeast counts were 6 log10 to 7.6 log10 CFU/g while LAB counts varied from 7.2 log10 to 9.6 log10 CFU/g. Values obtained by L. sanfranciscensis-specific qPCR were estimated between 7.2 and 10.3 log10 CFU/g, except for one sample at 4.4 log10 CFU/g. HAC and PCA clustered the sixteen sourdoughs into three classes described by their variables but without links to bakers' practices. L. sanfranciscensis was the dominant species in 13 of the 16 sourdoughs analyzed by Next Generation Sequencing (NGS); by the culture-dependent method, this species was dominant in only 10 samples. Based on isolates identification, LAB diversity was higher for 7 sourdoughs with the recovery of L. curvatus, L. brevis, L. heilongjiangensis, L. xiangfangensis, L. koreensis, L. pontis, Weissella sp. and Pediococcus pentosaceus, as the most representative species. L. koreensis, L. heilongjiangensis and L. xiangfangensis were identified in traditional Asian food and here for the first time as dominant in organic sourdough. This study highlighted that L. sanfranciscensis was not the major species in 6/16 sourdough samples and that a relatively high LAB diversity can be observed in French organic

  16. Amino acid sequence of tyrosinase from Neurospora crassa.

    PubMed Central

    Lerch, K

    1978-01-01

    The amino-acid sequence of tyrosinase from Neurospora crassa (monophenol,dihydroxyphenylalanine:oxygen oxidoreductase, EC 1.14.18.1) is reported. This copper-containing oxidase consists of a single polypeptide chain of 407 amino acids. The primary structure was determined by automated and manual sequence analysis on fragments produced by cleavage with cyanogen bromide and on peptides obtained by digestion with trypsin, pepsin, thermolysin, or chymotrypsin. The amino terminus of the protein is acetylated and the single cysteinyl residue 96 is covalently linked via a thioether bridge to histidyl residue 94. The formation and the possible role of this unusual structure in Neurospora tyrosinase is discussed. Dye-sensitized photooxidation of apotyrosinase and active-site-directed inactivation of the native enzyme indicate the possible involvement of histidyl residues 188, 192, 289, and 305 or 306 as ligands to the active-site copper as well as in the catalytic mechanism of this monooxygenase. PMID:151279

  17. High levels of diversity characterize mandrill (Mandrillus sphinx) Mhc-DRB sequences.

    PubMed

    Abbott, Kristin M; Wickings, E Jean; Knapp, Leslie A

    2006-08-01

    The major histocompatibility complex (MHC) is highly polymorphic in most primate species studied thus far. The rhesus macaque (Macaca mulatta) has been studied extensively and the Mhc-DRB region demonstrates variability similar to humans. The extent of MHC diversity is relatively unknown for other Old World monkeys (OWM), especially among genera other than Macaca. A molecular survey of the Mhc-DRB region in mandrills (Mandrillus sphinx) revealed extensive variability, suggesting that other OWMs may also possess high levels of Mhc-DRB polymorphism. In the present study, 33 Mhc-DRB loci were identified from only 13 animals. Eleven were wild-born and presumed to be unrelated and two were captive-born twins. Two to seven different sequences were identified for each individual, suggesting that some mandrills may have as many as four Mhc-DRB loci on a single haplotype. From these sequences, representatives of at least six Mhc-DRB loci or lineages were identified. As observed in other primates, some new lineages may have arisen through the process of gene conversion. These findings indicate that mandrills have Mhc-DRB diversity not unlike rhesus macaques and humans.

  18. Diversity and distribution of culturable lactic acid bacterial species in Indonesian Sayur Asin

    PubMed Central

    Mangunwardoyo, Wibowo; Abinawanto; Salamah, Andi; Sukara, Endang; Sulistiani; Dinoto, Achmad

    2016-01-01

    Background and Objectives: Lactic acid bacteria (LAB) play important roles in processing of Sayur Asin (spontaneously fermented mustard). Unfortunately, information about LAB in Indonesian Sayur Asin prepared by traditional manufacturers, which is important as baseline data for maintenance of food quality and safety, is unclear. The aim of this study was to describe the diversity and distribution of culturable lactic acid bacteria in Sayur Asin of Indonesia. Materials and Methods: Four Sayur Asin samples (fermentation liquor and fermented mustard) were collected at harvesting times (3–7 days after fermentation) from two traditional manufacturers in Tulung Agung (TA) and Kediri (KDR), East Java provinces, Indonesia. LAB strains were isolated by using MRS agar method supplemented with 1% CaCO3 and characterized morphologically. Identification of the strains was performed based on 16S rDNA analysis and the phylogenetic tree was drawn to understand the phylogenetic relationship of the collected strains. Results: Different profiles were detected in total count of the plates, salinity and pH of fermenting liquor of Sayur Asin in TA and KDR provinces. A total of 172 LAB isolates were successfully isolated and identified based on their 16S rDNA sequences. Phylogenetic analysis of 27 representative LAB strains from Sayur Asin showed that these strains belonged to 5 distinct species, namely Lactobacillus farciminis (N=32), L. fermentum (N=4), L. namurensis (N=15), L. plantarum (N=118) and L. parafarraginis (N=1). Strains D5-S-2013 and B4-S-2013 showed a close phylogenetic relationship with L. composti and L. paralimentarius, respectively, whereas their sequence similarity was slightly lower than 99%, suggesting that they may be classified as novel species and need further investigation because of significant differences in their nucleotide sequences. Lactobacillus plantarum was found to be dominant in all Sayur Asin samples. Conclusion: Lactobacilli were

  19. Cytogenetic Diversity of Simple Sequences Repeats in Morphotypes of Brassica rapa ssp. chinensis

    PubMed Central

    Zheng, Jin-shuang; Sun, Cheng-zhen; Zhang, Shu-ning; Hou, Xi-lin; Bonnema, Guusje

    2016-01-01

    A significant fraction of the nuclear DNA of all eukaryotes is comprised of simple sequence repeats (SSRs). Although these sequences are widely used for studying genetic variation, linkage mapping and evolution, little attention had been paid to the chromosomal distribution and cytogenetic diversity of these sequences. In this paper, we report the distribution characterization of mono-, di-, and tri-nucleotide SSRs in Brassica rapa ssp. chinensis. Fluorescence in situ hybridization was used to characterize the cytogenetic diversity of SSRs among morphotypes of B. rapa ssp. chinensis. The proportion of different SSR motifs varied among morphotypes of B. rapa ssp. chinensis, with tri-nucleotide SSRs being more prevalent in the genome of B. rapa ssp. chinensis. We determined the chromosomal locations of mono-, di-, and tri-nucleotide repeat loci. The results showed that the chromosomal distribution of SSRs in the different morphotypes is non-random and motif-dependent, and allowed us to characterize the relative variability in terms of SSR numbers and similar chromosomal distributions in centromeric/peri-centromeric heterochromatin. The differences between SSR repeats with respect to abundance and distribution indicate that SSRs are a driving force in the genomic evolution of B. rapa species. Our results provide a comprehensive view of the SSR sequence distribution and evolution for comparison among morphotypes B. rapa ssp. chinensis. PMID:27507974

  20. Cytogenetic Diversity of Simple Sequences Repeats in Morphotypes of Brassica rapa ssp. chinensis.

    PubMed

    Zheng, Jin-Shuang; Sun, Cheng-Zhen; Zhang, Shu-Ning; Hou, Xi-Lin; Bonnema, Guusje

    2016-01-01

    A significant fraction of the nuclear DNA of all eukaryotes is comprised of simple sequence repeats (SSRs). Although these sequences are widely used for studying genetic variation, linkage mapping and evolution, little attention had been paid to the chromosomal distribution and cytogenetic diversity of these sequences. In this paper, we report the distribution characterization of mono-, di-, and tri-nucleotide SSRs in Brassica rapa ssp. chinensis. Fluorescence in situ hybridization was used to characterize the cytogenetic diversity of SSRs among morphotypes of B. rapa ssp. chinensis. The proportion of different SSR motifs varied among morphotypes of B. rapa ssp. chinensis, with tri-nucleotide SSRs being more prevalent in the genome of B. rapa ssp. chinensis. We determined the chromosomal locations of mono-, di-, and tri-nucleotide repeat loci. The results showed that the chromosomal distribution of SSRs in the different morphotypes is non-random and motif-dependent, and allowed us to characterize the relative variability in terms of SSR numbers and similar chromosomal distributions in centromeric/peri-centromeric heterochromatin. The differences between SSR repeats with respect to abundance and distribution indicate that SSRs are a driving force in the genomic evolution of B. rapa species. Our results provide a comprehensive view of the SSR sequence distribution and evolution for comparison among morphotypes B. rapa ssp. chinensis.

  1. Fungal genome sequencing: basic biology to biotechnology.

    PubMed

    Sharma, Krishna Kant

    2016-08-01

    The genome sequences provide a first glimpse into the genomic basis of the biological diversity of filamentous fungi and yeast. The genome sequence of the budding yeast, Saccharomyces cerevisiae, with a small genome size, unicellular growth, and rich history of genetic and molecular analyses was a milestone of early genomics in the 1990s. The subsequent completion of fission yeast, Schizosaccharomyces pombe and genetic model, Neurospora crassa initiated a revolution in the genomics of the fungal kingdom. In due course of time, a substantial number of fungal genomes have been sequenced and publicly released, representing the widest sampling of genomes from any eukaryotic kingdom. An ambitious genome-sequencing program provides a wealth of data on metabolic diversity within the fungal kingdom, thereby enhancing research into medical science, agriculture science, ecology, bioremediation, bioenergy, and the biotechnology industry. Fungal genomics have higher potential to positively affect human health, environmental health, and the planet's stored energy. With a significant increase in sequenced fungal genomes, the known diversity of genes encoding organic acids, antibiotics, enzymes, and their pathways has increased exponentially. Currently, over a hundred fungal genome sequences are publicly available; however, no inclusive review has been published. This review is an initiative to address the significance of the fungal genome-sequencing program and provides the road map for basic and applied research.

  2. High diversity of picornaviruses in rats from different continents revealed by deep sequencing.

    PubMed

    Hansen, Thomas Arn; Mollerup, Sarah; Nguyen, Nam-Phuong; White, Nicole E; Coghlan, Megan; Alquezar-Planas, David E; Joshi, Tejal; Jensen, Randi Holm; Fridholm, Helena; Kjartansdóttir, Kristín Rós; Mourier, Tobias; Warnow, Tandy; Belsham, Graham J; Bunce, Michael; Willerslev, Eske; Nielsen, Lars Peter; Vinner, Lasse; Hansen, Anders Johannes

    2016-08-17

    Outbreaks of zoonotic diseases in humans and livestock are not uncommon, and an important component in containment of such emerging viral diseases is rapid and reliable diagnostics. Such methods are often PCR-based and hence require the availability of sequence data from the pathogen. Rattus norvegicus (R. norvegicus) is a known reservoir for important zoonotic pathogens. Transmission may be direct via contact with the animal, for example, through exposure to its faecal matter, or indirectly mediated by arthropod vectors. Here we investigated the viral content in rat faecal matter (n=29) collected from two continents by analyzing 2.2 billion next-generation sequencing reads derived from both DNA and RNA. Among other virus families, we found sequences from members of the Picornaviridae to be abundant in the microbiome of all the samples. Here we describe the diversity of the picornavirus-like contigs including near-full-length genomes closely related to the Boone cardiovirus and Theiler's encephalomyelitis virus. From this study, we conclude that picornaviruses within R. norvegicus are more diverse than previously recognized. The virome of R. norvegicus should be investigated further to assess the full potential for zoonotic virus transmission.

  3. Nanopores and nucleic acids: prospects for ultrarapid sequencing

    NASA Technical Reports Server (NTRS)

    Deamer, D. W.; Akeson, M.

    2000-01-01

    DNA and RNA molecules can be detected as they are driven through a nanopore by an applied electric field at rates ranging from several hundred microseconds to a few milliseconds per molecule. The nanopore can rapidly discriminate between pyrimidine and purine segments along a single-stranded nucleic acid molecule. Nanopore detection and characterization of single molecules represents a new method for directly reading information encoded in linear polymers. If single-nucleotide resolution can be achieved, it is possible that nucleic acid sequences can be determined at rates exceeding a thousand bases per second.

  4. Sequence-Based Discovery Demonstrates That Fixed Light Chain Human Transgenic Rats Produce a Diverse Repertoire of Antigen-Specific Antibodies.

    PubMed

    Harris, Katherine E; Aldred, Shelley Force; Davison, Laura M; Ogana, Heather Anne N; Boudreau, Andrew; Brüggemann, Marianne; Osborn, Michael; Ma, Biao; Buelow, Benjamin; Clarke, Starlynn C; Dang, Kevin H; Iyer, Suhasini; Jorgensen, Brett; Pham, Duy T; Pratap, Payal P; Rangaswamy, Udaya S; Schellenberger, Ute; van Schooten, Wim C; Ugamraj, Harshad S; Vafa, Omid; Buelow, Roland; Trinklein, Nathan D

    2018-01-01

    We created a novel transgenic rat that expresses human antibodies comprising a diverse repertoire of heavy chains with a single common rearranged kappa light chain (IgKV3-15-JK1). This fixed light chain animal, called OmniFlic, presents a unique system for human therapeutic antibody discovery and a model to study heavy chain repertoire diversity in the context of a constant light chain. The purpose of this study was to analyze heavy chain variable gene usage, clonotype diversity, and to describe the sequence characteristics of antigen-specific monoclonal antibodies (mAbs) isolated from immunized OmniFlic animals. Using next-generation sequencing antibody repertoire analysis, we measured heavy chain variable gene usage and the diversity of clonotypes present in the lymph node germinal centers of 75 OmniFlic rats immunized with 9 different protein antigens. Furthermore, we expressed 2,560 unique heavy chain sequences sampled from a diverse set of clonotypes as fixed light chain antibody proteins and measured their binding to antigen by ELISA. Finally, we measured patterns and overall levels of somatic hypermutation in the full B-cell repertoire and in the 2,560 mAbs tested for binding. The results demonstrate that OmniFlic animals produce an abundance of antigen-specific antibodies with heavy chain clonotype diversity that is similar to what has been described with unrestricted light chain use in mammals. In addition, we show that sequence-based discovery is a highly effective and efficient way to identify a large number of diverse monoclonal antibodies to a protein target of interest.

  5. Nucleotide Sequence Diversity and Linkage Disequilibrium of Four Nuclear Loci in Foxtail Millet (Setaria italica).

    PubMed

    He, Shui-Lian; Yang, Yang; Morrell, Peter L; Yi, Ting-Shuang

    2015-01-01

    Foxtail millet (Setaria italica (L.) Beauv) is one of the earliest domesticated grains, which has been cultivated in northern China by 8,700 years before present (YBP) and across Eurasia by 4,000 YBP. Owing to a small genome and diploid nature, foxtail millet is a tractable model crop for studying functional genomics of millets and bioenergy grasses. In this study, we examined nucleotide sequence diversity, geographic structure, and levels of linkage disequilibrium at four nuclear loci (ADH1, G3PDH, IGS1 and TPI1) in representative samples of 311 landrace accessions across its cultivated range. Higher levels of nucleotide sequence and haplotype diversity were observed in samples from China relative to other sampled regions. Genetic assignment analysis classified the accessions into seven clusters based on nucleotide sequence polymorphisms. Intralocus LD decayed rapidly to half the initial value within ~1.2 kb or less.
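
    The linkage disequilibrium (LD) measurement mentioned in this abstract is commonly summarized by the squared correlation r² between alleles at two sites. The abstract does not state which LD statistic or software was used, so the sketch below is an assumption; it takes phased haplotypes at two biallelic sites, and the example haplotypes are hypothetical.

```python
def r_squared(haplotypes: list[tuple[str, str]]) -> float:
    """r^2 between two biallelic sites from phased haplotypes.

    r^2 = D^2 / (pA (1 - pA) pB (1 - pB)), with D = pAB - pA * pB.
    The reference alleles A and B are taken from the first haplotype.
    """
    if not haplotypes:
        raise ValueError("no haplotypes given")
    n = len(haplotypes)
    ref_a, ref_b = haplotypes[0]
    p_a = sum(1 for h in haplotypes if h[0] == ref_a) / n
    p_b = sum(1 for h in haplotypes if h[1] == ref_b) / n
    p_ab = sum(1 for h in haplotypes if h == (ref_a, ref_b)) / n
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both sites must be polymorphic")
    d = p_ab - p_a * p_b
    return d * d / denom


# Hypothetical haplotypes: perfectly linked sites, then independent sites.
linked = r_squared([("A", "G"), ("A", "G"), ("T", "C"), ("T", "C")])
independent = r_squared([("A", "G"), ("A", "C"), ("T", "G"), ("T", "C")])
```

    An LD decay estimate such as the one reported would then come from computing r² for many site pairs and examining how it falls with the physical distance between them.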

  6. PubDNA Finder: a web database linking full-text articles to sequences of nucleic acids.

    PubMed

    García-Remesal, Miguel; Cuevas, Alejandro; Pérez-Rey, David; Martín, Luis; Anguita, Alberto; de la Iglesia, Diana; de la Calle, Guillermo; Crespo, José; Maojo, Víctor

    2010-11-01

    PubDNA Finder is an online repository that we have created to link PubMed Central manuscripts to the sequences of nucleic acids appearing in them. It extends the search capabilities provided by PubMed Central by enabling researchers to perform advanced searches involving sequences of nucleic acids. This includes, among other features (i) searching for papers mentioning one or more specific sequences of nucleic acids and (ii) retrieving the genetic sequences appearing in different articles. These additional query capabilities are provided by a searchable index that we created by using the full text of the 176 672 papers available at PubMed Central at the time of writing and the sequences of nucleic acids appearing in them. To automatically extract the genetic sequences occurring in each paper, we used an original method we have developed. The database is updated monthly by automatically connecting to the PubMed Central FTP site to retrieve and index new manuscripts. Users can query the database via the web interface provided. PubDNA Finder can be freely accessed at http://servet.dia.fi.upm.es:8080/pubdnafinder

  7. Diverse nucleotide compositions and sequence fluctuation in Rubisco protein genes

    NASA Astrophysics Data System (ADS)

    Holden, Todd; Dehipawala, S.; Cheung, E.; Bienaime, R.; Ye, J.; Tremberger, G., Jr.; Schneider, P.; Lieberman, D.; Cheung, T.

    2011-10-01

    The Rubisco protein-enzyme is arguably the most abundant protein on Earth. The biology dogma of transcription and translation necessitates the study of the Rubisco genes and Rubisco-like genes in various species. Stronger correlation of fractal dimension of the atomic number fluctuation along a DNA sequence with Shannon entropy has been observed in the studied Rubisco-like gene sequences, suggesting a more diverse evolutionary pressure and constraints in the Rubisco sequences. The strategy of using metal for structural stabilization appears to be an ancient mechanism, with data from the porphobilinogen deaminase gene in Capsaspora owczarzaki and Monosiga brevicollis. Using the chi-square distance probability, our analysis supports the conjecture that the more ancient Rubisco-like sequence in Microcystis aeruginosa would have experienced very different evolutionary pressure and bio-chemical constraint as compared to Bordetella bronchiseptica, the two microbes occupying either end of the correlation graph. Our exploratory study would indicate that high fractal dimension Rubisco sequence would support high carbon dioxide rate via the Michaelis-Menten coefficient; with implication for the control of the whooping cough pathogen Bordetella bronchiseptica, a microbe containing a high fractal dimension Rubisco-like sequence (2.07). Using the internal comparison of chi-square distance probability for 16S rRNA (~ E-22) versus radiation repair Rec-A gene (~ E-05) in high GC content Deinococcus radiodurans, our analysis supports the conjecture that high GC content microbes containing Rubisco-like sequence are likely to include an extra-terrestrial origin, relative to Deinococcus radiodurans. Similar photosynthesis process that could utilize host star radiation would not compete with radiation resistant process from the biology dogma perspective in environments such as Mars and exoplanets.
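
    The Shannon entropy referred to in this abstract can be sketched for a DNA sequence as follows. The abstract does not say whether entropy was computed over single nucleotides, k-mers, or windows, so the single-nucleotide version here is an assumption, and the example sequences are hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(seq: str) -> float:
    """Shannon entropy (bits per symbol) of the symbol composition of a sequence.

    H = -sum_i p_i * log2(p_i), where p_i is the frequency of symbol i.
    """
    if not seq:
        return 0.0
    counts = Counter(seq.upper())
    n = sum(counts.values())
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


# Hypothetical sequences: uniform composition versus a homopolymer.
h_uniform = shannon_entropy("ACGT")
h_homopolymer = shannon_entropy("AAAA")
```

    For a four-letter nucleotide alphabet the value ranges from 0 bits (a single repeated base) to 2 bits (all four bases equally frequent).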

  8. Semiconductor Sequencing Reveals the Diversity of Bacterial Communities in an Amazon Reservoir Considered as a Methane Source

    NASA Astrophysics Data System (ADS)

    Graças, D. A.; Ramos, R. T.; Sá, P. G.; Baraúna, R. A.; Schneider, M. C.; Silva, A.

    2013-05-01

    The Amazon region has enormous hydro potential which is used for power generation. In fact, there are several hydroelectric power stations (HPS) already installed and many under construction or designed. The Tucuruí HPS, the fifth largest in the world, is located in the Amazon. The construction of this hydroelectric dam flooded an area of 2,400 km² of forest that is decomposing, releasing greenhouse gases such as methane (CH4). Methane is the most abundant organic gas in the atmosphere and the second most important greenhouse gas. In this study, we use semiconductor sequencing to assess the bacterial diversity along a water column 70 meters deep in the Tucuruí reservoir. One liter of water was collected every 10 meters along the water column for total DNA extraction. A fragment of approximately 150 base pairs of the 16S rRNA gene was amplified by polymerase chain reaction using universal primers. These fragments were then sequenced in parallel on the Ion Torrent® platform using barcodes on the 316 chip. After the quality filters, about 237 thousand reads were obtained, representing more than 300 Mbp. For bacterial diversity analysis, we used only reads longer than 100 base pairs. The taxonomic diversity was obtained from the Ribosomal Database Project Classifier and alpha diversity analysis (diversity indices and rarefaction) was performed using the RDP pyrosequencing pipeline. Although it is recommended for pyrosequencing data, that pipeline is able to process data obtained from semiconductor sequencing as long as all of them are FASTA files. Over 75% of the sequences were not classified in any phylum, which leads us to believe that there is a huge diversity in the bacterial environment whose function is still unclear. Among the sequences that could be classified, there is a predominance of proteobacteria in all layers, but in higher concentrations at the lower layers. Cyanobacteria accounted for about 3% in the layers of 0m and 10m, leading us to conclude that

  9. Genome sequence diversity and clues to the evolution of variola (smallpox) virus.

    PubMed

    Esposito, Joseph J; Sammons, Scott A; Frace, A Michael; Osborne, John D; Olsen-Rasmussen, Melissa; Zhang, Ming; Govil, Dhwani; Damon, Inger K; Kline, Richard; Laker, Miriam; Li, Yu; Smith, Geoffrey L; Meyer, Hermann; Leduc, James W; Wohlhueter, Robert M

    2006-08-11

    Comparative genomics of 45 epidemiologically varied variola virus isolates from the past 30 years of the smallpox era indicate low sequence diversity, suggesting that there is probably little difference in the isolates' functional gene content. Phylogenetic clustering inferred three clades coincident with their geographical origin and case-fatality rate; the latter implicated putative proteins that mediate viral virulence differences. Analysis of the viral linear DNA genome suggests that its evolution involved direct descent and DNA end-region recombination events. Knowing the sequences will help understand the viral proteome and improve diagnostic test precision, therapeutics, and systems for their assessment.

  10. Archaeon and archaeal virus diversity classification via sequence entropy and fractal dimension

    NASA Astrophysics Data System (ADS)

    Tremberger, George, Jr.; Gallardo, Victor; Espinoza, Carola; Holden, Todd; Gadura, N.; Cheung, E.; Schneider, P.; Lieberman, D.; Cheung, T.

    2010-09-01

    Archaea are important potential candidates in astrobiology as their metabolism includes solar, inorganic and organic energy sources. Archaeal viruses would also be expected to be present in a sustainable archaeal exobiological community. Genetic sequence Shannon entropy and fractal dimension can be used to establish a two-dimensional measure for classification and phylogenetic study of these organisms. A sequence fractal dimension can be calculated from a numerical series consisting of the atomic numbers of each nucleotide. Archaeal 16S and 23S ribosomal RNA sequences were studied. Outliers in the 16S rRNA fractal dimension and entropy plot were found to be halophilic archaea. Positive correlation (R-square ~ 0.75, N = 18) was observed between fractal dimension and entropy across the studied species. The 16S ribosomal RNA sequence entropy correlates with the 23S ribosomal RNA sequence entropy across species with R-square 0.93, N = 18. Entropy values correspond positively with branch lengths of a published phylogeny. The studied archaeal virus sequences have high fractal dimensions of 2.02 or more. A comparison of selected extremophile sequences with archaeal sequences from the Humboldt Marine Ecosystem database (Wood-Hull Oceanography Institute, MIT) suggests the presence of continuous sequence expression as inferred from distributions of entropy and fractal dimension, consistent with the diversity expected in an exobiological archaeal community.
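
    The two measures named in this abstract start from simple per-sequence quantities. The sketch below is illustrative only and is not the authors' code: it computes single-nucleotide Shannon entropy and builds the atomic-number series, assuming that each nucleotide is represented by the total atomic number of its base (derived here from the molecular formulas of adenine, cytosine, guanine and thymine). The function names are hypothetical, and the fractal-dimension step is omitted because the abstract does not state which estimator was used.

```python
import math
from collections import Counter

# Assumption: each nucleotide is represented by the sum of atomic numbers
# in its base, e.g. adenine C5H5N5 -> 5*6 + 5*1 + 5*7 = 70.
ATOMIC_NUMBER = {"A": 70, "C": 58, "G": 78, "T": 66}


def shannon_entropy(seq):
    """Shannon entropy (bits per symbol) of single-nucleotide frequencies."""
    counts = Counter(seq)
    n = len(seq)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def atomic_number_series(seq):
    """Numerical series from which a fractal dimension could be estimated."""
    return [ATOMIC_NUMBER[base] for base in seq]


print(shannon_entropy("ACGTACGT"))
print(atomic_number_series("ACGT"))
```

    A uniform mix of the four bases gives the maximum of 2 bits per symbol, and a homopolymer gives 0.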

  11. Diversity Analysis of Dairy and Nondairy Lactococcus lactis Isolates, Using a Novel Multilocus Sequence Analysis Scheme and (GTG)5-PCR Fingerprinting

    PubMed Central

    Rademaker, Jan L. W.; Herbet, Hélène; Starrenburg, Marjo J. C.; Naser, Sabri M.; Gevers, Dirk; Kelly, William J.; Hugenholtz, Jeroen; Swings, Jean; van Hylckama Vlieg, Johan E. T.

    2007-01-01

    The diversity of a collection of 102 lactococcus isolates including 91 Lactococcus lactis isolates of dairy and nondairy origin was explored using partial small subunit rRNA gene sequence analysis and limited phenotypic analyses. A subset of 89 strains of L. lactis subsp. cremoris and L. lactis subsp. lactis isolates was further analyzed by (GTG)5-PCR fingerprinting and a novel multilocus sequence analysis (MLSA) scheme. Two major genomic lineages within L. lactis were found. The L. lactis subsp. cremoris type-strain-like genotype lineage included both L. lactis subsp. cremoris and L. lactis subsp. lactis isolates. The other major lineage, with a L. lactis subsp. lactis type-strain-like genotype, comprised L. lactis subsp. lactis isolates only. A novel third genomic lineage represented two L. lactis subsp. lactis isolates of nondairy origin. The genomic lineages deviate from the subspecific classification of L. lactis that is based on a few phenotypic traits only. MLSA of six partial genes (atpA, encoding ATP synthase alpha subunit; pheS, encoding phenylalanine tRNA synthetase; rpoA, encoding RNA polymerase alpha chain; bcaT, encoding branched chain amino acid aminotransferase; pepN, encoding aminopeptidase N; and pepX, encoding X-prolyl dipeptidyl peptidase) revealed 363 polymorphic sites (total length, 1,970 bases) among 89 L. lactis subsp. cremoris and L. lactis subsp. lactis isolates with unique sequence types for most isolates. This allowed high-resolution cluster analysis in which dairy isolates form subclusters of limited diversity within the genomic lineages. The pheS DNA sequence analysis yielded two genetic groups dissimilar to the other genotyping analysis-based lineages, indicating a disparate acquisition route for this gene. PMID:17890345
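
    The count of polymorphic sites reported for the concatenated MLSA loci can be illustrated with a small sketch. It is not the authors' pipeline: it assumes the sequences are already aligned to equal length, treats "-" as a gap, and calls a column polymorphic when it holds more than one distinct non-gap character. The function name is hypothetical.

```python
def count_polymorphic_sites(alignment, gap="-"):
    """Count alignment columns with more than one distinct non-gap residue."""
    if len({len(seq) for seq in alignment}) != 1:
        raise ValueError("sequences must be non-empty and of equal length")
    count = 0
    for column in zip(*alignment):
        residues = {c for c in column if c != gap}
        if len(residues) > 1:
            count += 1
    return count


toy_alignment = ["ACGT", "ACGA", "ATGT"]
print(count_polymorphic_sites(toy_alignment))
```

    In the toy alignment the second and fourth columns vary, so two sites are polymorphic.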

  12. Diversity analysis of dairy and nondairy Lactococcus lactis isolates, using a novel multilocus sequence analysis scheme and (GTG)5-PCR fingerprinting.

    PubMed

    Rademaker, Jan L W; Herbet, Hélène; Starrenburg, Marjo J C; Naser, Sabri M; Gevers, Dirk; Kelly, William J; Hugenholtz, Jeroen; Swings, Jean; van Hylckama Vlieg, Johan E T

    2007-11-01

    The diversity of a collection of 102 lactococcus isolates including 91 Lactococcus lactis isolates of dairy and nondairy origin was explored using partial small subunit rRNA gene sequence analysis and limited phenotypic analyses. A subset of 89 strains of L. lactis subsp. cremoris and L. lactis subsp. lactis isolates was further analyzed by (GTG)(5)-PCR fingerprinting and a novel multilocus sequence analysis (MLSA) scheme. Two major genomic lineages within L. lactis were found. The L. lactis subsp. cremoris type-strain-like genotype lineage included both L. lactis subsp. cremoris and L. lactis subsp. lactis isolates. The other major lineage, with a L. lactis subsp. lactis type-strain-like genotype, comprised L. lactis subsp. lactis isolates only. A novel third genomic lineage represented two L. lactis subsp. lactis isolates of nondairy origin. The genomic lineages deviate from the subspecific classification of L. lactis that is based on a few phenotypic traits only. MLSA of six partial genes (atpA, encoding ATP synthase alpha subunit; pheS, encoding phenylalanine tRNA synthetase; rpoA, encoding RNA polymerase alpha chain; bcaT, encoding branched chain amino acid aminotransferase; pepN, encoding aminopeptidase N; and pepX, encoding X-prolyl dipeptidyl peptidase) revealed 363 polymorphic sites (total length, 1,970 bases) among 89 L. lactis subsp. cremoris and L. lactis subsp. lactis isolates with unique sequence types for most isolates. This allowed high-resolution cluster analysis in which dairy isolates form subclusters of limited diversity within the genomic lineages. The pheS DNA sequence analysis yielded two genetic groups dissimilar to the other genotyping analysis-based lineages, indicating a disparate acquisition route for this gene.

  13. Complete complementary DNA-derived amino acid sequence of canine cardiac phospholamban.

    PubMed Central

    Fujii, J; Ueno, A; Kitano, K; Tanaka, S; Kadoma, M; Tada, M

    1987-01-01

    Complementary DNA (cDNA) clones specific for phospholamban of sarcoplasmic reticulum membranes have been isolated from a canine cardiac cDNA library. The amino acid sequence deduced from the cDNA sequence indicates that phospholamban consists of 52 amino acid residues and lacks an amino-terminal signal sequence. The protein has an inferred mol wt 6,080 that is in agreement with its apparent monomeric mol wt 6,000, estimated previously by sodium dodecyl sulfate-polyacrylamide gel electrophoresis. Phospholamban contains two distinct domains, a hydrophilic region at the amino terminus (domain I) and a hydrophobic region at the carboxy terminus (domain II). We propose that domain I is localized at the cytoplasmic surface and offers phosphorylatable sites whereas domain II is anchored into the sarcoplasmic reticulum membrane. PMID:3793929

  14. The complete amino acid sequence of human erythrocyte diphosphoglycerate mutase.

    PubMed Central

    Haggarty, N W; Dunbar, B; Fothergill, L A

    1983-01-01

    The complete amino acid sequence of human erythrocyte diphosphoglycerate mutase, comprising 239 residues, was determined. The sequence was deduced from the four cyanogen bromide fragments, and from the peptides derived from these fragments after digestion with a number of proteolytic enzymes. Comparison of this sequence with that of the yeast glycolytic enzyme, phosphoglycerate mutase, shows that these enzymes are 47% identical. Most, but not all, of the residues implicated as being important for the activity of the glycolytic mutase are conserved in the erythrocyte diphosphoglycerate mutase. PMID:6313356
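
    A percent-identity figure such as the 47% quoted above depends on how the two sequences are aligned and on which columns are counted. As a minimal sketch, assuming a pairwise alignment is already available and that columns containing a gap are excluded from the denominator (one of several conventions, and not necessarily the one used in this study), identity can be computed as follows. The function name is hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of identical residues over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap or y == gap:
            continue  # skip gapped columns under this convention
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


print(percent_identity("ACDE", "ACDF"))
```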

  15. Whole-genome sequencing and analyses identify high genetic heterogeneity, diversity and endemicity of rotavirus genotype P[6] strains circulating in Africa.

    PubMed

    Nyaga, Martin M; Tan, Yi; Seheri, Mapaseka L; Halpin, Rebecca A; Akopov, Asmik; Stucker, Karla M; Fedorova, Nadia B; Shrivastava, Susmita; Duncan Steele, A; Mwenda, Jason M; Pickett, Brett E; Das, Suman R; Jeffrey Mphahlele, M

    2018-05-18

    Rotavirus A (RVA) exhibits a wide genotype diversity globally. Little is known about the genetic composition of genotype P[6] from Africa. This study investigated possible evolutionary mechanisms leading to genetic diversity of genotype P[6] VP4 sequences. Phylogenetic analyses on 167 P[6] VP4 full-length sequences were conducted, which included six porcine-origin sequences. Of the 167 sequences, 57 were newly acquired through whole genome sequencing as part of this study. The other 110 sequences were all publicly-available global P[6] VP4 full-length sequences downloaded from GenBank. The strength of association between the phenotypic features and the phylogeny was also determined. A number of reassortment and mixed infections of RVA genotype P[6] strains were observed in this study. Phylogenetic analyses demonstrated the extensive genetic diversity that exists among human P[6] strains, porcine-like strains, their concomitant clades/subclades and estimated that P[6] VP4 gene has a higher substitution rate with the mean of 1.05E-3 substitutions/site/year. Further, the phylogenetic analyses indicated that genotype P[6] strains were endemic in Africa, characterised by an extensive genetic diversity and long-time local evolution of the viruses. This was also supported by phylogeographic clustering and G-genotype clustering of the P[6] strains when Bayesian Tip-association Significance testing (BaTS) was applied, clearly supporting that the viruses evolved locally in Africa instead of spatial mixing among different regions. Overall, the results demonstrated that multiple mechanisms such as reassortment events, various mutations and possibly interspecies transmission account for the enormous diversity of genotype P[6] strains in Africa. These findings highlight the need for continued global surveillance of rotavirus diversity. Copyright © 2018 Elsevier B.V. All rights reserved.

  16. [Influence of PCR cycle number on microbial diversity analysis through next generation sequencing].

    PubMed

    An, Yunhe; Gao, Lijuan; Li, Junbo; Tian, Yanjie; Wang, Jinlong; Zheng, Xuejuan; Wu, Huijuan

    2016-08-25

    Using high-throughput sequencing technology to study the microbial diversity in complex samples has become one of the hottest issues in the field of microbial diversity research. In this study, soil and sheep rumen chyme samples were used to extract DNA, respectively. Then 25 ng of total DNA was used to amplify the 16S rRNA V3 region with 20, 25, 30 PCR cycles, and the final sequencing library was constructed by mixing equal amounts of purified PCR products. Finally, the operational taxonomic unit (OTU) amount, rarefaction curve, microbial number and species were compared through data analysis. It was found that at the same amount of DNA template, the proportion of the community composition was not the best with more numbers of PCR cycle, although the species number was much more. In all, when the PCR cycle number is 25, the number of species and proportion of the community composition were the most optimal both in soil or chyme samples.
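
    A rarefaction curve of the kind compared here plots the expected number of OTUs against subsample size. The sketch below uses the standard analytic (hypergeometric) expectation for sampling without replacement; it is an illustration under that assumption, not the software used in the study, and the function name and toy counts are hypothetical.

```python
from math import comb


def expected_otus(otu_counts, n):
    """Expected number of OTUs seen in a random subsample of n reads.

    Uses E[S_n] = sum_i (1 - C(N - N_i, n) / C(N, n)), where N is the total
    read count and N_i the read count of OTU i.
    """
    total = sum(otu_counts)
    if not 0 <= n <= total:
        raise ValueError("subsample size must be between 0 and the total read count")
    denom = comb(total, n)
    return sum(1 - comb(total - c, n) / denom for c in otu_counts)


counts = [50, 30, 15, 4, 1]  # toy OTU abundance table
curve = [(n, expected_otus(counts, n)) for n in (1, 10, 50, 100)]
print(curve)
```

    At n equal to the total read count the curve reaches the observed OTU richness, and at n = 1 it equals 1.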

  17. Quantum-Sequencing: Biophysics of quantum tunneling through nucleic acids

    NASA Astrophysics Data System (ADS)

    Casamada Ribot, Josep; Chatterjee, Anushree; Nagpal, Prashant

    2014-03-01

    Tunneling microscopy and spectroscopy have been used extensively in physical surface sciences to study quantum tunneling, to measure the electronic local density of states of nanomaterials and to characterize adsorbed species. Quantum-Sequencing (Q-Seq) is a new method based on tunneling microscopy for electronic sequencing of single molecules of nucleic acids. A major goal of third-generation sequencing technologies is to develop a fast, reliable, enzyme-free single-molecule sequencing method. Here, we present the unique "electronic fingerprints" for all nucleotides on DNA and RNA using Q-Seq along their intrinsic biophysical parameters. We have analyzed tunneling spectra for the nucleotides at different pH conditions and analyzed the HOMO, LUMO and energy gap for all of them. In addition we show a number of biophysical parameters to further characterize all nucleobases (electron and hole transition voltage and energy barriers). These results highlight the robustness of Q-Seq as a technique for next-generation sequencing.

  18. Palynological composition of a Lower Cretaceous South American tropical sequence: climatic implications and diversity comparisons with other latitudes.

    PubMed

    Mejia-Velasquez, Paula J; Dilcher, David L; Jaramillo, Carlos A; Fortini, Lucas B; Manchester, Steven R

    2012-11-01

    Reconstruction of floristic patterns during the early diversification of angiosperms is impeded by the scarce fossil record, especially in tropical latitudes. Here we collected quantitative palynological data from a stratigraphic sequence in tropical South America to provide floristic and climatic insights into such tropical environments during the Early Cretaceous. We reconstructed the floristic composition of an Aptian-Albian tropical sequence from central Colombia using quantitative palynology (rarefied species richness and abundance) and used it to infer its predominant climatic conditions. Additionally, we compared our results with available quantitative data from three other sequences encompassing 70 floristic assemblages to determine latitudinal diversity patterns. Abundance of humidity indicators was higher than that of aridity indicators (61% vs. 10%). Additionally, we found an angiosperm latitudinal diversity gradient (LDG) for the Aptian, but not for the Albian, and an inverted LDG of the overall diversity for the Albian. Angiosperm species turnover during the Albian, however, was higher in humid tropics. There were humid climates in northwestern South America during the Aptian-Albian interval contrary to the widespread aridity expected for the tropical belt. The Albian inverted overall LDG is produced by a faster increase in per-sample angiosperm and pteridophyte diversity in temperate latitudes. However, humid tropical sequences had higher rates of floristic turnover suggesting a higher degree of morphological variation than in temperate regions.

  19. Palynological composition of a Lower Cretaceous South American tropical sequence: Climatic implications and diversity comparisons with other latitudes.

    USGS Publications Warehouse

    Mejia-Velasquez, Paula J.; Dilcher, David L.; Jaramillo, Carlos A.; Fortini, Lucas B.; Manchester, Steven R.

    2012-01-01

    Premise of the study: Reconstruction of floristic patterns during the early diversification of angiosperms is impeded by the scarce fossil record, especially in tropical latitudes. Here we collected quantitative palynological data from a stratigraphic sequence in tropical South America to provide floristic and climatic insights into such tropical environments during the Early Cretaceous. Methods: We reconstructed the floristic composition of an Aptian-Albian tropical sequence from central Colombia using quantitative palynology (rarefied species richness and abundance) and used it to infer its predominant climatic conditions. Additionally, we compared our results with available quantitative data from three other sequences encompassing 70 floristic assemblages to determine latitudinal diversity patterns. Key results: Abundance of humidity indicators was higher than that of aridity indicators (61% vs. 10%). Additionally, we found an angiosperm latitudinal diversity gradient (LDG) for the Aptian, but not for the Albian, and an inverted LDG of the overall diversity for the Albian. Angiosperm species turnover during the Albian, however, was higher in humid tropics. Conclusions: There were humid climates in northwestern South America during the Aptian-Albian interval contrary to the widespread aridity expected for the tropical belt. The Albian inverted overall LDG is produced by a faster increase in per-sample angiosperm and pteridophyte diversity in temperate latitudes. However, humid tropical sequences had higher rates of floristic turnover suggesting a higher degree of morphological variation than in temperate regions.

  20. Amino acid sequence of the Amur tiger prion protein.

    PubMed

    Wu, Changde; Pang, Wanyong; Zhao, Deming

    2006-10-01

    Prion diseases are fatal neurodegenerative disorders in humans and animals associated with conformational conversion of a cellular prion protein (PrP(C)) into the pathologic isoform (PrP(Sc)). Various data indicate that the polymorphisms within the open reading frame (ORF) of PrP are associated with the susceptibility and control the species barrier in prion diseases. In the present study, partial Prnp from 25 Amur tigers (tPrnp) were cloned and screened for polymorphisms. Four single nucleotide polymorphisms (T423C, A501G, C511A, A610G) were found; the C511A and A610G nucleotide substitutions resulted in the amino acid changes Lysine171Glutamine and Alanine204Threonine, respectively. The tPrnp amino acid sequence is similar to house cat (Felis catus) and sheep, but differs significantly from two other cat Prnp sequences that were previously deposited in GenBank.
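
    The link between the nucleotide positions and the residue numbers in this abstract can be checked with simple arithmetic. The sketch below assumes that nucleotide numbering starts at the first base of the ORF (an assumption; the abstract does not state the numbering scheme) and maps a 1-based nucleotide position to its codon number and position within the codon. The function name is hypothetical.

```python
def codon_of(nt_position):
    """Map a 1-based ORF nucleotide position to (codon number, position in codon)."""
    if nt_position < 1:
        raise ValueError("positions are 1-based")
    codon_number = (nt_position - 1) // 3 + 1
    position_in_codon = (nt_position - 1) % 3 + 1
    return codon_number, position_in_codon


for snp in (423, 501, 511, 610):
    print(snp, codon_of(snp))
```

    Under this numbering, positions 511 and 610 are the first bases of codons 171 and 204, matching the two reported amino acid changes, while positions 423 and 501 fall on third codon positions.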

  1. Assessing Diversity of DNA Structure-Related Sequence Features in Prokaryotic Genomes

    PubMed Central

    Huang, Yongjie; Mrázek, Jan

    2014-01-01

    Prokaryotic genomes are diverse in terms of their nucleotide and oligonucleotide composition as well as presence of various sequence features that can affect physical properties of the DNA molecule. We present a survey of local sequence patterns which have a potential to promote non-canonical DNA conformations (i.e. different from standard B-DNA double helix) and interpret the results in terms of relationships with organisms' habitats, phylogenetic classifications, and other characteristics. Our present work differs from earlier similar surveys not only by investigating a wider range of sequence patterns in a large number of genomes but also by using a more realistic null model to assess significant deviations. Our results show that simple sequence repeats and Z-DNA-promoting patterns are generally suppressed in prokaryotic genomes, whereas palindromes and inverted repeats are over-represented. Representation of patterns that promote Z-DNA and intrinsic DNA curvature increases with increasing optimal growth temperature (OGT), and decreases with increasing oxygen requirement. Additionally, representations of close direct repeats, palindromes and inverted repeats exhibit clear negative trends with increasing OGT. The observed relationships with environmental characteristics, particularly OGT, suggest possible evolutionary scenarios of structural adaptation of DNA to particular environmental niches. PMID:24408877
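
    One of the pattern classes surveyed here, the DNA palindrome, is a segment equal to its own reverse complement. The sketch below is a brute-force scan meant only to show the definition; it is not the authors' method, does no statistical assessment against a null model, and its function names and the minimum length are hypothetical choices.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_palindromes(seq, min_len=6):
    """List (start, segment) for every even-length reverse-complement palindrome."""
    hits = []
    # An exact reverse-complement palindrome always has even length.
    first_len = min_len + (min_len % 2)
    for length in range(first_len, len(seq) + 1, 2):
        for start in range(len(seq) - length + 1):
            segment = seq[start:start + length]
            if segment == reverse_complement(segment):
                hits.append((start, segment))
    return hits


print(find_palindromes("AAGAATTCAA", min_len=6))
```

    The scan is quadratic in sequence length, so it suits short examples rather than whole genomes.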

  2. Long-read sequencing of the coffee bean transcriptome reveals the diversity of full-length transcripts

    PubMed Central

    Cheng, Bing; Furtado, Agnelo

    2017-01-01

    Polyploidization contributes to the complexity of gene expression, resulting in numerous related but different transcripts. This study explored the transcriptome diversity and complexity of the tetraploid Arabica coffee (Coffea arabica) bean. Long-read sequencing (LRS) by Pacbio Isoform sequencing (Iso-seq) was used to obtain full-length transcripts without the difficulty and uncertainty of assembly required for reads from short-read technologies. The tetraploid transcriptome was annotated and compared with data from the sub-genome progenitors. Caffeine and sucrose genes were targeted for case analysis. An isoform-level tetraploid coffee bean reference transcriptome with 95 995 distinct transcripts (average 3236 bp) was obtained. A total of 88 715 sequences (92.42%) were annotated with BLASTx against NCBI non-redundant plant proteins, including 34 719 high-quality annotations. Further BLASTn analysis against NCBI non-redundant nucleotide sequences, Coffea canephora coding sequences with UTR, C. arabica ESTs, and Rfam resulted in 1213 sequences without hits, which are potential novel genes in coffee. Longer UTRs were captured, especially in the 5′ UTRs, facilitating the identification of upstream open reading frames. The LRS also revealed more and longer transcript variants in key caffeine and sucrose metabolism genes from this polyploid genome. Long sequences (>10 kilobases) were poorly annotated. LRS technology shows the limitation of previous studies. It provides an important tool to produce a reference transcriptome including more of the diversity of full-length transcripts to help understand the biology and support the genetic improvement of polyploid species such as coffee. PMID:29048540

  3. AST: An Automated Sequence-Sampling Method for Improving the Taxonomic Diversity of Gene Phylogenetic Trees

    PubMed Central

    Zhou, Chan; Mao, Fenglou; Yin, Yanbin; Huang, Jinling; Gogarten, Johann Peter; Xu, Ying

    2014-01-01

    A challenge in phylogenetic inference of gene trees is how to properly sample a large pool of homologous sequences to derive a good representative subset of sequences. Such a need arises in various applications, e.g. when (1) accuracy-oriented phylogenetic reconstruction methods may not be able to deal with a large pool of sequences due to their high demand in computing resources; (2) applications analyzing a collection of gene trees may prefer to use trees with fewer operational taxonomic units (OTUs), for instance for the detection of horizontal gene transfer events by identifying phylogenetic conflicts; and (3) the pool of available sequences is biased towards extensively studied species. In the past, the creation of subsamples often relied on manual selection. Here we present an Automated sequence-Sampling method for improving the Taxonomic diversity of gene phylogenetic trees, AST, to obtain representative sequences that maximize the taxonomic diversity of the sampled sequences. To demonstrate the effectiveness of AST, we have tested it to solve four problems, namely, inference of the evolutionary histories of the small ribosomal subunit protein S5 of E. coli, 16 S ribosomal RNAs and glycosyl-transferase gene family 8, and a study of ancient horizontal gene transfers from bacteria to plants. Our results show that the resolution of our computational results is almost as good as that of manual inference by domain experts, hence making the tool generally useful to phylogenetic studies by non-phylogeny specialists. The program is available at http://csbl.bmb.uga.edu/~zhouchan/AST.php. PMID:24892935

  4. AST: an automated sequence-sampling method for improving the taxonomic diversity of gene phylogenetic trees.

    PubMed

    Zhou, Chan; Mao, Fenglou; Yin, Yanbin; Huang, Jinling; Gogarten, Johann Peter; Xu, Ying

    2014-01-01

    A challenge in phylogenetic inference of gene trees is how to properly sample a large pool of homologous sequences to derive a good representative subset of sequences. Such a need arises in various applications, e.g. when (1) accuracy-oriented phylogenetic reconstruction methods may not be able to deal with a large pool of sequences due to their high demand in computing resources; (2) applications analyzing a collection of gene trees may prefer to use trees with fewer operational taxonomic units (OTUs), for instance for the detection of horizontal gene transfer events by identifying phylogenetic conflicts; and (3) the pool of available sequences is biased towards extensively studied species. In the past, the creation of subsamples often relied on manual selection. Here we present an Automated sequence-Sampling method for improving the Taxonomic diversity of gene phylogenetic trees, AST, to obtain representative sequences that maximize the taxonomic diversity of the sampled sequences. To demonstrate the effectiveness of AST, we have tested it to solve four problems, namely, inference of the evolutionary histories of the small ribosomal subunit protein S5 of E. coli, 16 S ribosomal RNAs and glycosyl-transferase gene family 8, and a study of ancient horizontal gene transfers from bacteria to plants. Our results show that the resolution of our computational results is almost as good as that of manual inference by domain experts, hence making the tool generally useful to phylogenetic studies by non-phylogeny specialists. The program is available at http://csbl.bmb.uga.edu/~zhouchan/AST.php.

  5. ITS2 sequence-structure phylogeny reveals diverse endophytic Pseudocercospora fungi on poplars.

    PubMed

    Yan, Dong-Hui; Gao, Qian; Sun, Xiaoming; Song, Xiaoyu; Li, Hongchang

    2018-04-01

    To match the new fungal nomenclature, which abolishes pleomorphic names for a fungus, a genus Pseudocercospora s. str. was suggested to host holomorphic Pseudocercospora fungi. But the Pseudocercospora fungi need extra phylogenetic loci to clarify their taxonomy and diversity for their existing and coming species. Internal transcribed spacer 2 (ITS2) secondary structures have been promising in characterizing species phylogeny in plants, animals and fungi. In the present study, a conserved model of ITS2 secondary structures was confirmed on fungi in the Pseudocercospora s. str. genus using the RNAshape program. The model has a typical eukaryotic four-helix ITS2 secondary structure. But a single U base occurred in the conserved motif of U-U mismatch in Helix 2, and a UG emerged in the UGGU motif in Helix 3 in Pseudocercospora fungi. The phylogeny analyses based on the ITS2 sequence-secondary structures with compensatory base change characterizations are able to delimit more species for Pseudocercospora s. str. than phylogenetic inferences of traditional multi-loci alignments do. The model was employed to explore the diversity of endophytic Pseudocercospora fungi in poplar trees. The analysis results also showed that endophytic Pseudocercospora fungi were diverse in species and evolved a specific lineage in poplar trees. This work suggested that ITS2 sequence-structures could become additional significant loci for species phylogenetic and taxonomic studies on Pseudocercospora fungi, and that Pseudocercospora endophytes could play important roles in the evolution and ecological function of Pseudocercospora fungi.

  6. Method for high-volume sequencing of nucleic acids: random and directed priming with libraries of oligonucleotides

    DOEpatents

    Studier, F. William

    1995-04-18

    Random and directed priming methods for determining nucleotide sequences by enzymatic sequencing techniques, using libraries of primers of lengths 8, 9 or 10 bases, are disclosed. These methods permit direct sequencing of nucleic acids as large as 45,000 base pairs or larger without the necessity for subcloning. Individual primers are used repeatedly to prime sequence reactions in many different nucleic acid molecules. Libraries containing as few as 10,000 octamers, 14,200 nonamers, or 44,000 decamers would have the capacity to determine the sequence of almost any cosmid DNA. Random priming with a fixed set of primers from a smaller library can also be used to initiate the sequencing of individual nucleic acid molecules, with the sequence being completed by directed priming with primers from the library. In contrast to random cloning techniques, a combined random and directed priming strategy is far more efficient.
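
    The library sizes quoted above are small fractions of all possible primers of each length, which a short calculation makes concrete. The sketch assumes only that a primer of length k over the four bases has 4^k possible sequences; the function name is hypothetical.

```python
def library_fraction(library_size, primer_length):
    """Fraction of all 4**primer_length possible primers held in the library."""
    return library_size / 4 ** primer_length


for size, length in ((10_000, 8), (14_200, 9), (44_000, 10)):
    total = 4 ** length
    print(f"{length}-mers: {size} of {total} ({library_fraction(size, length):.1%})")
```

    Octamers number 65,536 in total, nonamers 262,144 and decamers 1,048,576, so the stated libraries cover roughly 15%, 5% and 4% of each space.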

  7. Method for high-volume sequencing of nucleic acids: random and directed priming with libraries of oligonucleotides

    DOEpatents

    Studier, F.W.

    1995-04-18

    Random and directed priming methods for determining nucleotide sequences by enzymatic sequencing techniques, using libraries of primers of lengths 8, 9 or 10 bases, are disclosed. These methods permit direct sequencing of nucleic acids as large as 45,000 base pairs or larger without the necessity for subcloning. Individual primers are used repeatedly to prime sequence reactions in many different nucleic acid molecules. Libraries containing as few as 10,000 octamers, 14,200 nonamers, or 44,000 decamers would have the capacity to determine the sequence of almost any cosmid DNA. Random priming with a fixed set of primers from a smaller library can also be used to initiate the sequencing of individual nucleic acid molecules, with the sequence being completed by directed priming with primers from the library. In contrast to random cloning techniques, a combined random and directed priming strategy is far more efficient. 2 figs.

  8. A versatile palindromic amphipathic repeat coding sequence horizontally distributed among diverse bacterial and eucaryotic microbes

    PubMed Central

    2010-01-01

    Background: Intragenic tandem repeats occur throughout all domains of life and impart functional and structural variability to diverse translation products. Repeat proteins confer distinctive surface phenotypes to many unicellular organisms, including those with minimal genomes such as the wall-less bacterial monoderms, Mollicutes. One such repeat pattern in this clade is distributed in a manner suggesting its exchange by horizontal gene transfer (HGT). Expanding genome sequence databases reveal the pattern in a widening range of bacteria, and recently among eucaryotic microbes. We examined the genomic flux and consequences of the motif by determining its distribution, predicted structural features and association with membrane-targeted proteins. Results: Using a refined hidden Markov model, we document a 25-residue protein sequence motif tandemly arrayed in variable-number repeats in ORFs lacking assigned functions. It appears sporadically in unicellular microbes from disparate bacterial and eucaryotic clades, representing diverse lifestyles and ecological niches that include host parasitic, marine and extreme environments. Tracts of the repeats predict a malleable configuration of recurring domains, with conserved hydrophobic residues forming an amphipathic secondary structure in which hydrophilic residues endow extensive sequence variation. Many ORFs with these domains also have membrane-targeting sequences that predict assorted topologies; others may comprise reservoirs of sequence variants. We demonstrate expressed variants among surface lipoproteins that distinguish closely related animal pathogens belonging to a subgroup of the Mollicutes. DNA sequences encoding the tandem domains display dyad symmetry. Moreover, in some taxa the domains occur in ORFs selectively associated with mobile elements. These features, a punctate phylogenetic distribution, and different patterns of dispersal in genomes of related taxa, suggest that the repeat may be disseminated by

  9. Computational analysis of sequence selection mechanisms.

    PubMed

    Meyerguz, Leonid; Grasso, Catherine; Kleinberg, Jon; Elber, Ron

    2004-04-01

    Mechanisms leading to gene variations are responsible for the diversity of species and are important components of the theory of evolution. One constraint on gene evolution is that of protein foldability; the three-dimensional shapes of proteins must be thermodynamically stable. We explore the impact of this constraint and calculate properties of foldable sequences using 3660 structures from the Protein Data Bank. We seek a selection function that receives sequences as input, and outputs survival probability based on sequence fitness to structure. We compute the number of sequences that match a particular protein structure with energy lower than the native sequence, the density of the number of sequences, the entropy, and the "selection" temperature. The mechanism of structure selection for sequences longer than 200 amino acids is approximately universal. For shorter sequences, it is not. We speculate on concrete evolutionary mechanisms that show this behavior.

  10. Assessment of genetic diversity among four orchids based on ddRAD sequencing data for conservation purposes.

    PubMed

    Roy, Subhas Chandra; Moitra, Kaushik; De Sarker, Dilip

    2017-01-01

    Genetic diversity was assessed in four orchid species using NGS-based ddRAD sequencing data. The assembled nucleotide sequences (fastq) were deposited in the SRA archive of the NCBI database under accession numbers SRP063543 for Dendrobium, SRP065790 for Geodorum, SRP072201 for Cymbidium and SRP072378 for Rhynchostylis. Total base pair read was 1.1 Mbp for Dendrobium sp., 553.3 Kbp for Geodorum sp., 1.6 Gbp for Cymbidium, and 1.4 Gbp for Rhynchostylis. Average GC content was 43.9% in Geodorum, 43.7% in Dendrobium, 41.2% in Cymbidium and 42.3% in Rhynchostylis. Four partial gene sequences were analysed in the DnaSP5 program to determine nucleotide diversity and phylogenetic relationships (Ycf2 gene of Dendrobium, matK gene of Geodorum, psbD gene of Cymbidium and Ycf2 gene of Rhynchostylis). Nucleotide diversity per site, Pi (π), was 0.10560 in Dendrobium, 0.03586 in Geodorum, 0.01364 in Cymbidium and 0.011344 in Rhynchostylis. Neutrality test statistics were negative in all four orchid species (Tajima's D of -2.17959 in Dendrobium, -2.01655 in Geodorum, -2.12362 in Rhynchostylis and -1.54222 in Cymbidium), indicating purifying selection. The results for these gene sequences (matK, Ycf2 and psbD) indicate that they did not evolve neutrally and that selection might have played a role in the evolution of these genes in these four groups of orchids. Phylogenetic relationships were analysed by reconstructing a dendrogram based on the matK, psbD and Ycf2 gene sequences using the maximum likelihood method in the MEGA6 program.
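
    The two statistics reported in this abstract, per-site nucleotide diversity (π) and Tajima's D, can be illustrated with a minimal Python sketch. It assumes an ungapped alignment of equal-length sequences and uses the standard Tajima (1989) constants; the function names and the toy alignment are hypothetical, and this is not the DnaSP5 implementation used by the authors. A negative D reflects an excess of rare variants, which is consistent with purifying selection but can also arise from demographic effects.

```python
from itertools import combinations
from math import sqrt


def pairwise_differences(a, b):
    """Count the positions at which two aligned sequences differ."""
    return sum(1 for x, y in zip(a, b) if x != y)


def mean_pairwise_differences(seqs):
    """Average number of differences over all pairs of sequences (k)."""
    n = len(seqs)
    total = sum(pairwise_differences(a, b) for a, b in combinations(seqs, 2))
    return total / (n * (n - 1) / 2)


def nucleotide_diversity(seqs):
    """Per-site nucleotide diversity (pi): k divided by alignment length."""
    return mean_pairwise_differences(seqs) / len(seqs[0])


def tajimas_d(seqs):
    """Tajima's D computed from k and the number of segregating sites S."""
    n = len(seqs)
    # A site is segregating when more than one base is observed in its column.
    s = sum(1 for column in zip(*seqs) if len(set(column)) > 1)
    if s == 0:
        return 0.0  # no variation, D is undefined; 0.0 is a convention here
    k = mean_pairwise_differences(seqs)
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    return (k - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))


# Hypothetical toy alignment: two singleton mutations among four sequences.
toy_alignment = ["ACGT", "ACGA", "ACGT", "TCGT"]
pi = nucleotide_diversity(toy_alignment)
d = tajimas_d(toy_alignment)
```

    In the toy alignment both variable sites are singletons, so the mean pairwise difference falls below the expectation derived from the number of segregating sites and D comes out negative.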

  11. Conservation of Shannon's redundancy for proteins. [information theory applied to amino acid sequences]

    NASA Technical Reports Server (NTRS)

    Gatlin, L. L.

    1974-01-01

    Concepts of information theory are applied to examine various proteins in terms of their redundancy in natural originators such as animals and plants. The Monte Carlo method is used to derive information parameters for random protein sequences. Real protein sequence parameters are compared with the standard parameters of protein sequences having a specific length. The tendency of a chain to contain some amino acids more frequently than others and the tendency of a chain to contain certain amino acid pairs more frequently than other pairs are used as randomness measures of individual protein sequences. Non-periodic proteins are generally found to have random Shannon redundancies except in cases of constraints due to short chain length and genetic codes. Redundant characteristics of highly periodic proteins are discussed. A degree of periodicity parameter is derived.
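
    The two randomness measures mentioned here (unequal amino acid frequencies and unequal pair frequencies) can be sketched in Python as follows. The sketch assumes one common formulation, in which D1 is the divergence from equiprobability, D2 is the divergence from independence of adjacent residues, and the redundancy is R = (D1 + D2) / log2(20). Whether the report uses exactly these definitions is an assumption, and the function names are hypothetical.

```python
from collections import Counter
from math import log2


def single_entropy(seq):
    """First-order Shannon entropy H1 of residue frequencies, in bits."""
    n = len(seq)
    return -sum((c / n) * log2(c / n) for c in Counter(seq).values())


def conditional_entropy(seq):
    """Entropy of a residue given the residue that precedes it, in bits."""
    pairs = Counter(zip(seq, seq[1:]))   # counts of adjacent pairs (a, b)
    firsts = Counter(seq[:-1])           # counts of a as the first of a pair
    total = len(seq) - 1
    return -sum((c / total) * log2(c / firsts[a]) for (a, _b), c in pairs.items())


def shannon_redundancy(seq, alphabet_size=20):
    """Redundancy R = (D1 + D2) / log2(alphabet_size)."""
    h_max = log2(alphabet_size)
    d1 = h_max - single_entropy(seq)                     # composition bias
    d2 = single_entropy(seq) - conditional_entropy(seq)  # neighbour dependence
    return (d1 + d2) / h_max


# A homopolymer is fully redundant under this definition.
r_homopolymer = shannon_redundancy("AAAAAA")
# A short chain that uses each residue once also scores as fully redundant,
# because every observed pair is unique in such a small sample.
r_short_chain = shannon_redundancy("ACDEFGHIKLMNPQRSTVWY")
```

    The second example illustrates why short chain length acts as a constraint on the measured redundancy: with few residues, pair frequencies cannot be estimated reliably.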

  12. Multilocus sequence typing (MLST) for lineage assignment and high resolution diversity studies in Trypanosoma cruzi.

    PubMed

    Yeo, Matthew; Mauricio, Isabel L; Messenger, Louisa A; Lewis, Michael D; Llewellyn, Martin S; Acosta, Nidia; Bhattacharyya, Tapan; Diosque, Patricio; Carrasco, Hernan J; Miles, Michael A

    2011-06-01

    Multilocus sequence typing (MLST) is a powerful and highly discriminatory method for analysing pathogen population structure and epidemiology. Trypanosoma cruzi, the protozoan agent of American trypanosomiasis (Chagas disease), has remarkable genetic and ecological diversity. A standardised MLST protocol that is suitable for assignment of T. cruzi isolates to genetic lineage and for higher resolution diversity studies has not been developed. We have sequenced and diplotyped nine single copy housekeeping genes and assessed their value as part of a systematic MLST scheme for T. cruzi. A minimum panel of four MLST targets (Met-III, RB19, TcGPXII, and DHFR-TS) was shown to provide unambiguous assignment of isolates to the six known T. cruzi lineages (Discrete Typing Units, DTUs TcI-TcVI). In addition, we recommend six MLST targets (Met-II, Met-III, RB19, TcMPX, DHFR-TS, and TR) for more in depth diversity studies on the basis that diploid sequence typing (DST) with this expanded panel distinguished 38 out of 39 reference isolates. Phylogenetic analysis implies a subdivision between North and South American TcIV isolates. Single Nucleotide Polymorphism (SNP) data revealed high levels of heterozygosity among DTUs TcI, TcIII, TcIV and, for three targets, putative corresponding homozygous and heterozygous loci within DTUs TcI and TcIII. Furthermore, individual gene trees gave incongruent topologies at inter- and intra-DTU levels, inconsistent with a model of strict clonality. We demonstrate the value of systematic MLST diplotyping for describing inter-DTU relationships and for higher resolution diversity studies of T. cruzi, including presence of recombination events. The high levels of heterozygosity will facilitate future population genetics analysis based on MLST haplotypes.

  13. Bile acid malabsorption after continent urinary diversion with an ileal reservoir.

    PubMed

    Olofsson, G; Fjälling, M; Kilander, A; Ung, K A; Jonsson, O

    1998-09-01

    We determine the effect of urinary diversion with a Kock ileal reservoir on bile acid absorption and bowel habits. We asked 96 patients with a Kock ileal urinary reservoir to record bowel habits and abdominal symptoms for 1 week. Data on 75 patients were further analyzed. Bile acid absorption was determined in 29 healthy control subjects, in 17 before and 6 months after continent urinary diversion, and in 21, 2 to 14 years postoperatively. Bile acid absorption was considered pathological when retention of less than 10% of an oral capsule containing selenium-75 labeled tauroselcholic acid (SeHCAT) was noted after 1 week. Mean number of defecations plus or minus standard deviation was 9.4 +/- 6.1 (75 cases). Of the patients 13% had 15 or more stools per week and 15% complained of always having loose stools. Mean value for the SeHCAT test was 32 +/- 19% preoperatively and 17 +/- 16% 6 months postoperatively (p = 0.0023). The corresponding value for healthy controls was 39 +/- 18%. Significant relationships were found between the results of the SeHCAT test postoperatively, and the number of stools per week and consistency of the feces. All patients with more than 10 defecations per week had a pathological SeHCAT test. Most patients with an ileal urinary reservoir have fairly normal bowel habits. Bile acid absorption is significantly reduced postoperatively and approximately a third of the patients have a pathological SeHCAT test. Preoperative investigation of bowel habits is recommended and a SeHCAT test should be performed in patients with frequent, loose defecations. Other types of diversion should be offered when preoperative retention is below 10 to 20% especially in patients with impaired anal control.

  14. Neotropical Bats from Costa Rica harbour Diverse Coronaviruses.

    PubMed

    Moreira-Soto, A; Taylor-Castillo, L; Vargas-Vargas, N; Rodríguez-Herrera, B; Jiménez, C; Corrales-Aguilar, E

    2015-11-01

    Bats are hosts of diverse coronaviruses (CoVs) known to potentially cross the host-species barrier. For analysing coronavirus diversity in a bat species-rich country, a total of 421 anal swabs/faecal samples from Costa Rican bats were screened for CoV RNA-dependent RNA polymerase (RdRp) gene sequences by a pancoronavirus PCR. Six families, 24 genera and 41 species of bats were analysed. The detection rate for CoV was 1%. Individuals (n = 4) from four different species of frugivorous (Artibeus jamaicensis, Carollia perspicillata and Carollia castanea) and nectivorous (Glossophaga soricina) bats were positive for coronavirus-derived nucleic acids. Analysis of 440 nt. RdRp sequences allocated all Costa Rican bat CoVs to the α-CoV group. Several CoVs sequences clustered near previously described CoVs from the same species of bat, but were phylogenetically distant from the human CoV sequences identified to date, suggesting no recent spillover events. The Glossophaga soricina CoV sequence is sufficiently dissimilar (26% homology to the closest known bat CoVs) to represent a unique coronavirus not clustering near other CoVs found in the same bat species so far, implying an even higher CoV diversity than previously suspected. © 2015 Blackwell Verlag GmbH.

  15. The diversity of the orthoreoviruses: molecular taxonomy and phylogenetic divides.

    USDA-ARS's Scientific Manuscript database

    The family Reoviridae is a diverse group of viruses with double-stranded ribonucleic acid (RNA) genomes contained within icosahedral, layered protein capsids. Within the Reoviridae, the Orthoreovirus genus includes viruses that infect reptiles, birds and mammals (including humans). Recent sequencing...

  16. Genetic diversity of the merozoite surface protein-3 gene in Plasmodium falciparum populations in Thailand.

    PubMed

    Pattaradilokrat, Sittiporn; Sawaswong, Vorthon; Simpalipan, Phumin; Kaewthamasorn, Morakot; Siripoon, Napaporn; Harnyuttanakorn, Pongchai

    2016-10-21

    An effective malaria vaccine is an urgently needed tool to fight against human malaria, the most deadly parasitic disease of humans. One promising candidate is the merozoite surface protein-3 (MSP-3) of Plasmodium falciparum. This antigenic protein, encoded by the merozoite surface protein (msp-3) gene, is polymorphic and classified according to size into the two allelic types of K1 and 3D7. A recent study revealed that both the K1 and 3D7 alleles co-circulated within P. falciparum populations in Thailand, but the extent of the sequence diversity and variation within each allelic type remains largely unknown. The msp-3 gene was sequenced from 59 P. falciparum samples collected from five endemic areas (Mae Hong Son, Kanchanaburi, Ranong, Trat and Ubon Ratchathani) in Thailand and analysed for nucleotide sequence diversity, haplotype diversity and deduced amino acid sequence diversity. The gene was also subjected to population genetic analysis (FST) and neutrality tests (Tajima's D, Fu and Li's D* and Fu and Li's F* tests) to determine any signature of selection. The sequence analyses revealed eight unique DNA haplotypes and seven amino acid sequence variants, with a haplotype and nucleotide diversity of 0.828 and 0.049, respectively. Neutrality tests indicated that the polymorphism detected in the alanine heptad repeat region of MSP-3 was maintained by positive diversifying selection, suggesting its role as a potential target of protective immune responses and supporting its role as a vaccine candidate. Comparison of MSP-3 variants among parasite populations in Thailand, India and Nigeria also inferred a close genetic relationship between P. falciparum populations in Asia. This study revealed the extent of the msp-3 gene diversity in P. falciparum in Thailand, providing the fundamental basis for the better design of future blood stage malaria vaccines against P. falciparum.
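
    Haplotype diversity, one of the summary statistics reported above, can be sketched in Python using Nei's (1987) estimator, Hd = n / (n - 1) * (1 - sum of squared haplotype frequencies). The function name and the toy labels are hypothetical; the per-haplotype counts behind the reported value of 0.828 are not given in the abstract, so no attempt is made to reproduce it.

```python
from collections import Counter


def haplotype_diversity(haplotypes):
    """Nei's haplotype diversity for a list of haplotype labels or sequences."""
    n = len(haplotypes)
    if n < 2:
        raise ValueError("at least two samples are required")
    # Sum of squared relative frequencies of each distinct haplotype.
    sum_sq = sum((count / n) ** 2 for count in Counter(haplotypes).values())
    return n / (n - 1) * (1 - sum_sq)


# Hypothetical toy sample: two haplotypes, each observed twice.
hd = haplotype_diversity(["H1", "H1", "H2", "H2"])
```

    The statistic is 0 when every sample carries the same haplotype and 1 when every sample carries a different one.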

  17. Complete Amino Acid Sequence of a Copper/Zinc-Superoxide Dismutase from Ginger Rhizome.

    PubMed

    Nishiyama, Yuki; Fukamizo, Tamo; Yoneda, Kazunari; Araki, Tomohiro

    2017-04-01

    Superoxide dismutase (SOD) is an antioxidant enzyme protecting cells from oxidative stress. Ginger (Zingiber officinale) is known for its antioxidant properties, however, there are no data on SODs from ginger rhizomes. In this study, we purified SOD from the rhizome of Z. officinale (Zo-SOD) and determined its complete amino acid sequence using N terminal sequencing, amino acid analysis, and de novo sequencing by tandem mass spectrometry. Zo-SOD consists of 151 amino acids with two signature Cu/Zn-SOD motifs and has high similarity to other plant Cu/Zn-SODs. Multiple sequence alignment showed that Cu/Zn-binding residues and cysteines forming a disulfide bond, which are highly conserved in Cu/Zn-SODs, are also present in Zo-SOD. Phylogenetic analysis revealed that plant Cu/Zn-SODs clustered into distinct chloroplastic, cytoplasmic, and intermediate groups. Among them, only chloroplastic enzymes carried amino acid substitutions in the region functionally important for enzymatic activity, suggesting that chloroplastic SODs may have a function distinct from those of SODs localized in other subcellular compartments. The nucleotide sequence of the Zo-SOD coding region was obtained by reverse-translation, and the gene was synthesized, cloned, and expressed. The recombinant Zo-SOD demonstrated pH stability in the range of 5-10, which is similar to other reported Cu/Zn-SODs, and thermal stability in the range of 10-60 °C, which is higher than that for most plant Cu/Zn-SODs but lower compared to the enzyme from a Z. officinale relative Curcuma aromatica.

  18. Protein location prediction using atomic composition and global features of the amino acid sequence

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Cherian, Betsy Sheena, E-mail: betsy.skb@gmail.com; Nair, Achuthsankar S.

    2010-01-22

    Subcellular location of protein is constructive information in determining its function, screening for drug candidates, vaccine design, annotation of gene products and in selecting relevant proteins for further studies. Computational prediction of subcellular localization deals with predicting the location of a protein from its amino acid sequence. For a computational localization prediction method to be more accurate, it should exploit all possible relevant biological features that contribute to the subcellular localization. In this work, we extracted the biological features from the full length protein sequence to incorporate more biological information. A new biological feature, distribution of atomic composition, is effectively used with multiple physicochemical properties, amino acid composition, three part amino acid composition, and sequence similarity for predicting the subcellular location of the protein. Support Vector Machines are designed for four modules and prediction is made by a weighted voting system. Our system makes prediction with an accuracy of 100, 82.47, 88.81 for self-consistency test, jackknife test and independent data test respectively. Our results provide evidence that the prediction based on the biological features derived from the full length amino acid sequence gives better accuracy than those derived from N-terminal alone. Considering the features as a distribution within the entire sequence will bring out underlying property distribution to a greater detail to enhance the prediction accuracy.
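
    The weighted voting step described above can be sketched in Python as follows. The module names, weights and location labels are hypothetical placeholders: the abstract states only that four SVM modules are combined by weighted voting and does not give the weights or the exact grouping of features.

```python
from collections import defaultdict


def weighted_vote(predictions, weights):
    """Return the label with the largest total weight across modules.

    predictions: mapping of module name -> predicted location label
    weights:     mapping of module name -> voting weight for that module
    """
    scores = defaultdict(float)
    for module, label in predictions.items():
        scores[label] += weights[module]
    return max(scores, key=scores.get)


# Hypothetical outputs of four per-feature classifiers for one protein.
module_predictions = {
    "atomic_composition": "nucleus",
    "physicochemical": "cytoplasm",
    "aa_composition": "nucleus",
    "sequence_similarity": "cytoplasm",
}
# Hypothetical weights, for example derived from each module's validation accuracy.
module_weights = {
    "atomic_composition": 0.2,
    "physicochemical": 0.4,
    "aa_composition": 0.3,
    "sequence_similarity": 0.3,
}
location = weighted_vote(module_predictions, module_weights)
```

    With these placeholder values the two modules voting for "cytoplasm" carry more total weight (0.7) than the two voting for "nucleus" (0.5), so a weighted vote can differ from a simple majority count.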

  19. Diversity of virus-host systems in hypersaline Lake Retba, Senegal.

    PubMed

    Sime-Ngando, Télesphore; Lucas, Soizick; Robin, Agnès; Tucker, Kimberly Pause; Colombet, Jonathan; Bettarel, Yvan; Desmond, Elie; Gribaldo, Simonetta; Forterre, Patrick; Breitbart, Mya; Prangishvili, David

    2011-08-01

    Remarkable morphological diversity of virus-like particles was observed by transmission electron microscopy in a hypersaline water sample from Lake Retba, Senegal. The majority of particles morphologically resembled hyperthermophilic archaeal DNA viruses isolated from extreme geothermal environments. Some hypersaline viral morphotypes have not been previously observed in nature, and less than 1% of observed particles had a head-and-tail morphology, which is typical for bacterial DNA viruses. Culture-independent analysis of the microbial diversity in the sample suggested the dominance of extremely halophilic archaea. Few of the 16S sequences corresponded to known archaeal genera (Haloquadratum, Halorubrum and Natronomonas), whereas the majority represented novel archaeal clades. Three sequences corresponded to a new basal lineage of the haloarchaea. Bacteria belonged to four major phyla, consistent with the known diversity in saline environments. Metagenomic sequencing of DNA from the purified virus-like particles revealed very few similarities to the NCBI non-redundant database at either the nucleotide or amino acid level. Some of the identifiable virus sequences were most similar to previously described haloarchaeal viruses, but no sequence similarities were found to archaeal viruses from extreme geothermal environments. A large proportion of the sequences had similarity to previously sequenced viral metagenomes from solar salterns. © 2010 Society for Applied Microbiology and Blackwell Publishing Ltd.

  20. Microbial diversity and metabolite composition of Belgian red-brown acidic ales.

    PubMed

    Snauwaert, Isabel; Roels, Sanne P; Van Nieuwerburg, Filip; Van Landschoot, Anita; De Vuyst, Luc; Vandamme, Peter

    2016-03-16

    Belgian red-brown acidic ales are sour and alcoholic fermented beers, which are produced by mixed-culture fermentation and blending. The brews are aged in oak barrels for about two years, after which mature beer is blended with young, non-aged beer to obtain the end-products. The present study evaluated the microbial community diversity of Belgian red-brown acidic ales at the end of the maturation phase of three subsequent brews of three different breweries. The microbial diversity was compared with the metabolite composition of the brews at the end of the maturation phase. Therefore, mature brew samples were subjected to 454 pyrosequencing of the 16S rRNA gene (bacteria) and the internal transcribed spacer region (yeasts) and a broad range of metabolites was quantified. The most important microbial species present in the Belgian red-brown acidic ales investigated were Pediococcus damnosus, Dekkera bruxellensis, and Acetobacter pasteurianus. In addition, this culture-independent analysis revealed operational taxonomic units that were assigned to an unclassified fungal community member, Candida, and Lactobacillus. The main metabolites present in the brew samples were L-lactic acid, D-lactic acid, and ethanol, whereas acetic acid was produced in lower quantities. The most prevailing aroma compounds were ethyl acetate, isoamyl acetate, ethyl hexanoate, and ethyl octanoate, which might be of impact on the aroma of the end-products. Copyright © 2016 Elsevier B.V. All rights reserved.

  1. Analysis of intra-host genetic diversity of Prunus necrotic ringspot virus (PNRSV) using amplicon next generation sequencing

    PubMed Central

    Constable, Fiona E.; Nancarrow, Narelle; Plummer, Kim M.; Rodoni, Brendan

    2017-01-01

    PCR amplicon next generation sequencing (NGS) analysis offers a broadly applicable and targeted approach to detect populations of both high- and low-frequency virus variants in one or more plant samples. In this study, amplicon NGS was used to explore the diversity of the tripartite genome virus, Prunus necrotic ringspot virus (PNRSV) from 53 PNRSV-infected trees using amplicons from conserved gene regions of each of PNRSV RNA1, RNA2 and RNA3. Sequencing of the amplicons from 53 PNRSV-infected trees revealed differing levels of polymorphism across the three different components of the PNRSV genome with a total number of 5040, 2083 and 5486 sequence variants observed for RNA1, RNA2 and RNA3 respectively. The RNA2 had the lowest diversity of sequences compared to RNA1 and RNA3, reflecting the lack of flexibility tolerated by the replicase gene that is encoded by this RNA component. Distinct PNRSV phylo-groups, consisting of closely related clusters of sequence variants, were observed in each of PNRSV RNA1, RNA2 and RNA3. Most plant samples had a single phylo-group for each RNA component. Haplotype network analysis showed that smaller clusters of PNRSV sequence variants were genetically connected to the largest sequence variant cluster within a phylo-group of each RNA component. Some plant samples had sequence variants occurring in multiple PNRSV phylo-groups in at least one of each RNA and these phylo-groups formed distinct clades that represent PNRSV genetic strains. Variants within the same phylo-group of each Prunus plant sample had ≥97% similarity and phylo-groups within a Prunus plant sample and between samples had ≤97% similarity. Based on the analysis of diversity, a definition of a PNRSV genetic strain was proposed. The proposed definition was applied to determine the number of PNRSV genetic strains in each of the plant samples and the complexity in defining genetic strains in multipartite genome viruses was explored. PMID:28632759

  2. Analysis of intra-host genetic diversity of Prunus necrotic ringspot virus (PNRSV) using amplicon next generation sequencing.

    PubMed

    Kinoti, Wycliff M; Constable, Fiona E; Nancarrow, Narelle; Plummer, Kim M; Rodoni, Brendan

    2017-01-01

    PCR amplicon next generation sequencing (NGS) analysis offers a broadly applicable and targeted approach to detect populations of both high- and low-frequency virus variants in one or more plant samples. In this study, amplicon NGS was used to explore the diversity of the tripartite genome virus, Prunus necrotic ringspot virus (PNRSV) from 53 PNRSV-infected trees using amplicons from conserved gene regions of each of PNRSV RNA1, RNA2 and RNA3. Sequencing of the amplicons from 53 PNRSV-infected trees revealed differing levels of polymorphism across the three different components of the PNRSV genome with a total number of 5040, 2083 and 5486 sequence variants observed for RNA1, RNA2 and RNA3 respectively. The RNA2 had the lowest diversity of sequences compared to RNA1 and RNA3, reflecting the lack of flexibility tolerated by the replicase gene that is encoded by this RNA component. Distinct PNRSV phylo-groups, consisting of closely related clusters of sequence variants, were observed in each of PNRSV RNA1, RNA2 and RNA3. Most plant samples had a single phylo-group for each RNA component. Haplotype network analysis showed that smaller clusters of PNRSV sequence variants were genetically connected to the largest sequence variant cluster within a phylo-group of each RNA component. Some plant samples had sequence variants occurring in multiple PNRSV phylo-groups in at least one of each RNA and these phylo-groups formed distinct clades that represent PNRSV genetic strains. Variants within the same phylo-group of each Prunus plant sample had ≥97% similarity and phylo-groups within a Prunus plant sample and between samples had ≤97% similarity. Based on the analysis of diversity, a definition of a PNRSV genetic strain was proposed. The proposed definition was applied to determine the number of PNRSV genetic strains in each of the plant samples and the complexity in defining genetic strains in multipartite genome viruses was explored.
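
    The 97% similarity criterion described above can be illustrated with a minimal Python sketch that computes percent identity between two aligned, equal-length variants and compares it against the threshold. The function names and toy sequences are hypothetical. The study itself derived phylo-groups from phylogenetic and haplotype network analysis, so this pairwise check is only a simplified stand-in for that procedure.

```python
def percent_identity(a, b):
    """Percent of aligned positions at which two sequences are identical."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return 100.0 * matches / len(a)


def same_phylogroup(a, b, threshold=97.0):
    """True when two variants meet the similarity threshold (default 97%)."""
    return percent_identity(a, b) >= threshold


# Hypothetical 100-nt reference variant and two derived variants.
reference = "ACGT" * 25
close_variant = "TT" + reference[2:]       # differs at 2 of 100 positions
distant_variant = "TTTTT" + reference[5:]  # differs at 4 of 100 positions

close_identity = percent_identity(reference, close_variant)
distant_identity = percent_identity(reference, distant_variant)
```

    Under this sketch the first variant would be grouped with the reference and the second would not.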

  3. Human retroviruses and AIDS 1996. A compilation and analysis of nucleic acid and amino acid sequences

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Myers, G.; Foley, B.; Korber, B.

    1997-04-01

    This compendium and the accompanying floppy diskettes are the result of an effort to compile and rapidly publish all relevant molecular data concerning the human immunodeficiency viruses (HIV) and related retroviruses. The scope of the compendium and database is best summarized by the five parts that it comprises: (1) Nucleic Acid Alignments and Sequences; (2) Amino Acid Alignments; (3) Analysis; (4) Related Sequences; and (5) Database Communications. Information within all the parts is updated throughout the year on the Web site, http://hiv-web.lanl.gov. While this publication could take the form of a review or sequence monograph, it is not so conceived. Instead, the literature from which the database is derived has simply been summarized and some elementary computational analyses have been performed upon the data. Interpretation and commentary have been avoided insofar as possible so that the reader can form his or her own judgments concerning the complex information. In addition to the general descriptions of the parts of the compendium, the user should read the individual introductions for each part.

  4. Streptococcal phosphoenolpyruvate-sugar phosphotransferase system: amino acid sequence and site of ATP-dependent phosphorylation of HPr

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Deutscher, J.; Pevec, B.; Beyreuther, K.

    1986-10-21

    The amino acid sequence of histidine-containing protein (HPr) from Streptococcus faecalis has been determined by direct Edman degradation of intact HPr and by amino acid sequence analysis of tryptic peptides, V8 proteolytic peptides, thermolytic peptides, and cyanogen bromide cleavage products. HPr from S. faecalis was found to contain 89 amino acid residues, corresponding to a molecular weight of 9438. The amino acid sequence of HPr from S. faecalis shows extended homology to the primary structure of HPr proteins from other bacteria. Besides the phosphoenolpyruvate-dependent phosphorylation of a histidyl residue in HPr, catalyzed by enzyme I of the bacterial phosphotransferase system, HPr was also found to be phosphorylated at a seryl residue in an ATP-dependent protein kinase catalyzed reaction. The site of ATP-dependent phosphorylation in HPr of S. faecalis has now been determined. [³²P]P-Ser-HPr was digested with three different proteases, and in each case, a single labeled peptide was isolated. Following digestion with subtilisin, they obtained a peptide with the sequence -(P)Ser-Ile-Met-. Using chymotrypsin, they isolated a peptide with the sequence -Ser-Val-Asn-Leu-Lys-(P)Ser-Ile-Met-Gly-Val-Met-. The longest labeled peptide was obtained with V8 staphylococcal protease. According to amino acid analysis, this peptide contained 36 out of the 89 amino acid residues of HPr. The following sequence of 12 amino acid residues of the V8 peptide was determined: -Tyr-Lys-Gly-Lys-Ser-Val-Asn-Leu-Lys-(P)Ser-Ile-Met-. Thus, the site of ATP-dependent phosphorylation was determined to be Ser-46 within the primary structure of HPr.

  5. Diagnostics based on nucleic acid sequence variant profiling: PCR, hybridization, and NGS approaches.

    PubMed

    Khodakov, Dmitriy; Wang, Chunyan; Zhang, David Yu

    2016-10-01

    Nucleic acid sequence variations have been implicated in many diseases, and reliable detection and quantitation of DNA/RNA biomarkers can inform effective therapeutic action, enabling precision medicine. Nucleic acid analysis technologies being translated into the clinic can broadly be classified into hybridization, PCR, and sequencing, as well as their combinations. Here we review the molecular mechanisms of popular commercial assays, and their progress in translation into in vitro diagnostics. Copyright © 2016 The Authors. Published by Elsevier B.V. All rights reserved.

  6. Genome Microscale Heterogeneity among Wild Potatoes Revealed by Diversity Arrays Technology Marker Sequences.

    PubMed

    Traini, Alessandra; Iorizzo, Massimo; Mann, Harpartap; Bradeen, James M; Carputo, Domenico; Frusciante, Luigi; Chiusano, Maria Luisa

    2013-01-01

    Tuber-bearing potato species possess several genes that can be exploited to improve the genetic background of the cultivated potato Solanum tuberosum. Among them, S. bulbocastanum and S. commersonii are well known for their strong resistance to environmental stresses. However, scant information is available for these species in terms of genome organization, gene function, and regulatory networks. Consequently, genomic tools to assist breeding are meager, and efficient exploitation of these species has been limited so far. In this paper, we employed the reference genome sequences from cultivated potato and tomato and a collection of sequences of 1,423 potato Diversity Arrays Technology (DArT) markers that show polymorphic representation across the genomes of S. bulbocastanum and/or S. commersonii genotypes. Our results highlighted microscale genome sequence heterogeneity that may play a significant role in functional and structural divergence between related species. Our analytical approach provides knowledge of genome structural and sequence variability that could not be detected by transcriptome and proteome approaches.

  7. Complete sequence and diversity of a maize-associated Polerovirus in East Africa.

    PubMed

    Massawe, Deogracious P; Stewart, Lucy R; Kamatenesi, Jovia; Asiimwe, Theodore; Redinbaugh, Margaret G

    2018-06-01

    Since 2011-2012, Maize lethal necrosis (MLN) has emerged in East Africa, causing massive yield loss and propelling research to identify viruses and virus populations present in maize. As expected, next generation sequencing (NGS) has revealed diverse and abundant viruses from the family Potyviridae, primarily sugarcane mosaic virus (SCMV), and maize chlorotic mottle virus (MCMV) (Tombusviridae), which are known to cause MLN by synergistic co-infection. In addition to these expected viruses, we identified a virus in the genus Polerovirus (family Luteoviridae) in 104/172 samples selected for MLN or other potential virus symptoms from Kenya, Uganda, Rwanda, and Tanzania. This polerovirus (MF974579) nucleotide sequence is 97% identical to maize-associated viruses recently reported in China, termed 'maize yellow mosaic virus' (MaYMV) and maize yellow dwarf virus (MaYMV; KU291101, KU291107, MYDV-RMV2; KT992824); and 99% identical to MaYMV (KY684356) infecting sugarcane and itch grass in Nigeria; 83% identical to a barley-associated polerovirus recently identified in Korea (BVG; KT962089); and 79% identical to the U.S. maize-infecting polerovirus maize yellow dwarf virus (MYDV-RMV; KT992824). Nucleotide sequences from ORF0 of 20 individual East African isolates collected from Kenya, Uganda, Rwanda, and Tanzania shared 98% or higher identity, and were detected in 104/172 (60.5%) of samples collected for virus-like symptoms, indicating extensive prevalence but limited diversity of this virus in East Africa. We refer to this virus as "MYDV-like polerovirus" until symptoms of the virus in maize are known.

  8. "De-novo" amino acid sequence elucidation of protein G'e by combined "top-down" and "bottom-up" mass spectrometry.

    PubMed

    Yefremova, Yelena; Al-Majdoub, Mahmoud; Opuni, Kwabena F M; Koy, Cornelia; Cui, Weidong; Yan, Yuetian; Gross, Michael L; Glocker, Michael O

    2015-03-01

    Mass spectrometric de-novo sequencing was applied to review the amino acid sequence of a commercially available recombinant protein G' with great scientific and economic importance. Substantial deviations from the published amino acid sequence (Uniprot Q54181) were found by the presence of 46 additional amino acids at the N-terminus, including a so-called "His-tag" as well as an N-terminal partial α-N-gluconoylation and α-N-phosphogluconoylation, respectively. The unexpected amino acid sequence of the commercial protein G' comprised 241 amino acids and resulted in a molecular mass of 25,998.9 ± 0.2 Da for the unmodified protein. Due to the higher mass that is caused by its extended amino acid sequence compared with the original protein G' (185 amino acids), we named this protein "protein G'e." By means of mass spectrometric peptide mapping, the suggested amino acid sequence, as well as the N-terminal partial α-N-gluconoylations, was confirmed with 100% sequence coverage. After the protein G'e sequence was determined, we were able to determine the expression vector pET-28b from Novagen with the Xho I restriction enzyme cleavage site as the best option that was used for cloning and expressing the recombinant protein G'e in E. coli. A dissociation constant (K(d)) value of 9.4 nM for protein G'e was determined thermophoretically, showing that the N-terminal flanking sequence extension did not cause significant changes in the binding affinity to immunoglobulins.

  9. Genetic Diversity and Phylogenetic Evolution of Tibetan Sheep Based on mtDNA D-Loop Sequences

    PubMed Central

    Yue, Yaojing; Guo, Xian; Guo, Tingting; Chu, Min; Wang, Fan; Han, Jilong; Feng, Ruilin; Sun, Xiaoping; Niu, Chune; Yang, Bohui; Guo, Jian; Yuan, Chao

    2016-01-01

    The molecular and population genetic evidence of the phylogenetic status of the Tibetan sheep (Ovis aries) is not well understood, and little is known about this species’ genetic diversity. This knowledge gap is partly due to the difficulty of sample collection. This is the first work to address this question. Here, the genetic diversity and phylogenetic relationship of 636 individual Tibetan sheep from fifteen populations were assessed using 642 complete sequences of the mitochondrial DNA D-loop. Samples were collected from the Qinghai-Tibetan Plateau area in China, and reference data were obtained from the six reference breed sequences available in GenBank. The length of the sequences varied considerably, between 1031 and 1259 bp. The haplotype diversity and nucleotide diversity were 0.992±0.010 and 0.019±0.001, respectively. The average number of nucleotide differences was 19.635. The mean nucleotide composition of the 350 haplotypes was 32.961% A, 29.708% T, 22.892% C, 14.439% G, 62.669% A+T, and 37.331% G+C. Phylogenetic analysis showed that all four previously defined haplogroups (A, B, C, and D) were found in the 636 individuals of the fifteen Tibetan sheep populations but that only the D haplogroup was found in Linzhou sheep. Further, the clustering analysis divided the fifteen Tibetan sheep populations into at least two clusters. The estimation of the demographic parameters from the mismatch analyses showed that haplogroups A, B, and C had at least one demographic expansion in Tibetan sheep. These results contribute to the knowledge of Tibetan sheep populations and will help inform future conservation programs about the Tibetan sheep native to the Qinghai-Tibetan Plateau. PMID:27463976
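
    As an illustration of how summary statistics of this kind are typically derived, the sketch below computes percent base composition and a haplotype diversity value from a few aligned sequences. The toy sequences are invented, the function names are hypothetical, and the estimator shown (Nei's h = n/(n-1) x (1 - sum of squared haplotype frequencies)) is an assumption; the abstract does not say which software or estimator the authors used.

```python
from collections import Counter


def base_composition(seqs):
    """Return percent A, C, G, T over all sequences (other symbols ignored)."""
    counts = Counter()
    for s in seqs:
        counts.update(b for b in s.upper() if b in "ACGT")
    total = sum(counts.values())
    return {b: 100.0 * counts[b] / total for b in "ACGT"}


def haplotype_diversity(seqs):
    """Nei's unbiased haplotype diversity; identical strings count as one haplotype."""
    n = len(seqs)
    freqs = [c / n for c in Counter(seqs).values()]
    return n / (n - 1) * (1.0 - sum(p * p for p in freqs))


# Invented toy data: four sequences, three distinct haplotypes.
toy = ["AATTCG", "AATTCG", "AATTCC", "GATTCA"]
comp = base_composition(toy)
at_content = comp["A"] + comp["T"]   # A+T percentage, as reported in the abstract
h = haplotype_diversity(toy)
```

    With real data, the same A+T and G+C sums should add up to 100%, as they do in the values reported above (62.669% + 37.331%).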

  10. Genome sequence analysis of five Canadian isolates of strawberry mottle virus reveals extensive intra-species diversity and a longer RNA2 with increased coding capacity compared to a previously characterized European isolate.

    PubMed

    Bhagwat, Basdeo; Dickison, Virginia; Ding, Xinlun; Walker, Melanie; Bernardy, Michael; Bouthillier, Michel; Creelman, Alexa; DeYoung, Robyn; Li, Yinzi; Nie, Xianzhou; Wang, Aiming; Xiang, Yu; Sanfaçon, Hélène

    2016-06-01

    In this study, we report the genome sequence of five isolates of strawberry mottle virus (family Secoviridae, order Picornavirales) from strawberry field samples with decline symptoms collected in Eastern Canada. The Canadian isolates differed from the previously characterized European isolate 1134 in that they had a longer RNA2, resulting in a 239-amino-acid extension of the C-terminal region of the polyprotein. Sequence analysis suggests that reassortment and recombination occurred among the isolates. Phylogenetic analysis revealed that the Canadian isolates are diverse, grouping in two separate branches along with isolates from Europe and the Americas.

  11. Genetic Diversity and Population Structure of F3:6 Nebraska Winter Wheat Genotypes Using Genotyping-By-Sequencing.

    PubMed

    Eltaher, Shamseldeen; Sallam, Ahmed; Belamkar, Vikas; Emara, Hamdy A; Nower, Ahmed A; Salem, Khaled F M; Poland, Jesse; Baenziger, Peter S

    2018-01-01

    The availability of information on the genetic diversity and population structure in wheat (Triticum aestivum L.) breeding lines will help wheat breeders to better use their genetic resources and manage genetic variation in their breeding program. The recent advances in sequencing technology provide the opportunity to identify tens or hundreds of thousands of single nucleotide polymorphisms (SNPs) in large genome species (e.g., wheat). These SNPs can be utilized for understanding genetic diversity and performing genome wide association studies (GWAS) for complex traits. In this study, the genetic diversity and population structure were investigated in a set of 230 genotypes (F3:6) derived from various crosses as a prerequisite for GWAS and genomic selection. Genotyping-by-sequencing provided 25,566 high-quality SNPs. The polymorphism information content (PIC) across chromosomes ranged from 0.09 to 0.37 with an average of 0.23. The distribution of SNP markers on the 21 chromosomes ranged from 319 on chromosome 3D to 2,370 on chromosome 3B. The analysis of population structure revealed three subpopulations (G1, G2, and G3). Analysis of molecular variance identified 8% variance among and 92% within subpopulations. Of the three subpopulations, G2 had the highest level of genetic diversity based on three genetic diversity indices: Shannon's information index (I) = 0.494, diversity index (h) = 0.328 and unbiased diversity index (uh) = 0.331, while G3 had the lowest level of genetic diversity (I = 0.348, h = 0.226 and uh = 0.236). This high genetic diversity identified among the subpopulations can be used to develop new wheat cultivars.

  12. Genetic Diversity and Population Structure of F3:6 Nebraska Winter Wheat Genotypes Using Genotyping-By-Sequencing

    PubMed Central

    Eltaher, Shamseldeen; Sallam, Ahmed; Belamkar, Vikas; Emara, Hamdy A.; Nower, Ahmed A.; Salem, Khaled F. M.; Poland, Jesse; Baenziger, Peter S.

    2018-01-01

    The availability of information on the genetic diversity and population structure in wheat (Triticum aestivum L.) breeding lines will help wheat breeders to better use their genetic resources and manage genetic variation in their breeding program. The recent advances in sequencing technology provide the opportunity to identify tens or hundreds of thousands of single nucleotide polymorphisms (SNPs) in large genome species (e.g., wheat). These SNPs can be utilized for understanding genetic diversity and performing genome wide association studies (GWAS) for complex traits. In this study, the genetic diversity and population structure were investigated in a set of 230 genotypes (F3:6) derived from various crosses as a prerequisite for GWAS and genomic selection. Genotyping-by-sequencing provided 25,566 high-quality SNPs. The polymorphism information content (PIC) across chromosomes ranged from 0.09 to 0.37 with an average of 0.23. The distribution of SNP markers on the 21 chromosomes ranged from 319 on chromosome 3D to 2,370 on chromosome 3B. The analysis of population structure revealed three subpopulations (G1, G2, and G3). Analysis of molecular variance identified 8% variance among and 92% within subpopulations. Of the three subpopulations, G2 had the highest level of genetic diversity based on three genetic diversity indices: Shannon’s information index (I) = 0.494, diversity index (h) = 0.328 and unbiased diversity index (uh) = 0.331, while G3 had the lowest level of genetic diversity (I = 0.348, h = 0.226 and uh = 0.236). This high genetic diversity identified among the subpopulations can be used to develop new wheat cultivars. PMID:29593779
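
    The indices named in this abstract can be sketched for a single locus as follows. This is a minimal illustration under stated assumptions: the formulas are the commonly used per-locus definitions (Shannon's I = -sum p ln p, h = 1 - sum p^2, uh = n/(n-1) x h, and the Botstein-style PIC), the function names are hypothetical, and the abstract does not confirm which software or exact definitions the authors applied.

```python
import math


def diversity_indices(freqs, n):
    """Per-locus Shannon index I, diversity h, and unbiased diversity uh.

    freqs: allele frequencies at one locus (must sum to 1)
    n:     number of sampled individuals (assumed sample-size correction)
    """
    shannon_i = -sum(p * math.log(p) for p in freqs if p > 0)
    h = 1.0 - sum(p * p for p in freqs)
    uh = n / (n - 1) * h
    return shannon_i, h, uh


def pic(freqs):
    """Polymorphism information content for one locus."""
    homo = sum(p * p for p in freqs)
    cross = sum(
        2 * freqs[i] ** 2 * freqs[j] ** 2
        for i in range(len(freqs))
        for j in range(i + 1, len(freqs))
    )
    return 1.0 - homo - cross


# A biallelic SNP with equal allele frequencies gives the maximum values.
I_max, h_max, uh_max = diversity_indices([0.5, 0.5], n=230)
pic_max = pic([0.5, 0.5])   # 0.375, the ceiling for any biallelic marker
```

    Under this definition, PIC for a biallelic SNP cannot exceed 0.375, which is consistent with the reported per-chromosome range of 0.09 to 0.37.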

  13. PCR Primers to Study the Diversity of Expressed Fungal Genes Encoding Lignocellulolytic Enzymes in Soils Using High-Throughput Sequencing

    PubMed Central

    Barbi, Florian; Bragalini, Claudia; Vallon, Laurent; Prudent, Elsa; Dubost, Audrey; Fraissinet-Tachet, Laurence; Marmeisse, Roland; Luis, Patricia

    2014-01-01

    Plant biomass degradation in soil is one of the key steps of carbon cycling in terrestrial ecosystems. Fungal saprotrophic communities play an essential role in this process by producing hydrolytic enzymes active on the main components of plant organic matter. Open questions in this field regard the diversity of the species involved, the major biochemical pathways implicated and how these are affected by external factors such as litter quality or climate changes. This can be tackled by environmental genomic approaches involving the systematic sequencing of key enzyme-coding gene families using soil-extracted RNA as material. Such an approach necessitates the design and evaluation of gene family-specific PCR primers producing sequence fragments compatible with high-throughput sequencing approaches. In the present study, we developed and evaluated PCR primers for the specific amplification of fungal CAZy Glycoside Hydrolase gene families GH5 (subfamily 5) and GH11 encoding endo-β-1,4-glucanases and endo-β-1,4-xylanases respectively as well as Basidiomycota class II peroxidases, corresponding to the CAZy Auxiliary Activity family 2 (AA2), active on lignin. These primers were experimentally validated using DNA extracted from a wide range of Ascomycota and Basidiomycota species including 27 with sequenced genomes. Along with the published primers for Glycoside Hydrolase GH7 encoding enzymes active on cellulose, the newly designed primers were shown to be compatible with the Illumina MiSeq sequencing technology. Sequences obtained from RNA extracted from beech or spruce forest soils showed a high diversity and were uniformly distributed in gene trees featuring the global diversity of these gene families. This high-throughput sequencing approach using several degenerate primers constitutes a robust method, which allows the simultaneous characterization of the diversity of different fungal transcripts involved in plant organic matter degradation and may lead to the

  14. Synthetic oligonucleotide probes deduced from amino acid sequence data. Theoretical and practical considerations.

    PubMed

    Lathe, R

    1985-05-05

    Synthetic probes deduced from amino acid sequence data are widely used to detect cognate coding sequences in libraries of cloned DNA segments. The redundancy of the genetic code dictates that a choice must be made between (1) a mixture of probes reflecting all codon combinations, and (2) a single longer "optimal" probe. The second strategy is examined in detail. The frequency of sequences matching a given probe by chance alone can be determined and also the frequency of sequences closely resembling the probe and contributing to the hybridization background. Gene banks cannot be treated as random associations of the four nucleotides, and probe sequences deduced from amino acid sequence data occur more often than predicted by chance alone. Probe lengths must be increased to confer the necessary specificity. Examination of hybrids formed between unique homologous probes and their cognate targets reveals that short stretches of perfect homology occurring by chance make a significant contribution to the hybridization background. Statistical methods for improving homology are examined, taking human coding sequences as an example, and considerations of codon utilization and dinucleotide frequencies yield an overall homology of greater than 82%. Recommendations for probe design and hybridization are presented, and the choice between using multiple probes reflecting all codon possibilities and a unique optimal probe is discussed.
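
    The first strategy mentioned in this abstract, a mixture of probes covering all codon combinations, grows quickly with peptide length. The sketch below counts how many distinct coding sequences could encode a peptide under the standard genetic code. The function name and example peptides are hypothetical, and the calculation is a simple illustration of code redundancy, not the statistical method developed in the paper.

```python
from math import prod

# Number of codons per amino acid in the standard genetic code (61 sense codons).
CODONS_PER_AA = {
    "L": 6, "S": 6, "R": 6,
    "A": 4, "G": 4, "P": 4, "T": 4, "V": 4,
    "I": 3,
    "C": 2, "D": 2, "E": 2, "F": 2, "H": 2,
    "K": 2, "N": 2, "Q": 2, "Y": 2,
    "M": 1, "W": 1,
}


def probe_degeneracy(peptide):
    """Number of distinct nucleotide sequences that encode the peptide."""
    return prod(CODONS_PER_AA[aa] for aa in peptide.upper())


low = probe_degeneracy("MWKDE")   # Met and Trp have one codon each
high = probe_degeneracy("LSR")    # Leu, Ser, Arg have six codons each
```

    Peptide stretches rich in Met and Trp keep the mixture small, whereas Leu, Ser, and Arg inflate it by a factor of six per residue.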

  15. Violation of an Evolutionarily Conserved Immunoglobulin Diversity Gene Sequence Preference Promotes Production of dsDNA-Specific IgG Antibodies

    PubMed Central

    Silva-Sanchez, Aaron; Liu, Cun Ren; Vale, Andre M.; Khass, Mohamed; Kapoor, Pratibha; Elgavish, Ada; Ivanov, Ivaylo I.; Ippolito, Gregory C.; Schelonka, Robert L.; Schoeb, Trenton R.; Burrows, Peter D.; Schroeder, Harry W.

    2015-01-01

    Variability in the developing antibody repertoire is focused on the third complementarity determining region of the H chain (CDR-H3), which lies at the center of the antigen binding site where it often plays a decisive role in antigen binding. The power of VDJ recombination and N nucleotide addition has led to the common conception that the sequence of CDR-H3 is unrestricted in its variability and random in its composition. Under this view, the immune response is solely controlled by somatic positive and negative clonal selection mechanisms that act on individual B cells to promote production of protective antibodies and prevent the production of self-reactive antibodies. This concept of a repertoire of random antigen binding sites is inconsistent with the observation that diversity (DH) gene segment sequence content by reading frame (RF) is evolutionarily conserved, creating biases in the prevalence and distribution of individual amino acids in CDR-H3. For example, arginine, which is often found in the CDR-H3 of dsDNA binding autoantibodies, is under-represented in the commonly used DH RFs rearranged by deletion, but is a frequent component of rarely used inverted RF1 (iRF1), which is rearranged by inversion. To determine the effect of altering this germline bias in DH gene segment sequence on autoantibody production, we generated mice that by genetic manipulation are forced to utilize an iRF1 sequence encoding two arginines. Over a one year period we collected serial serum samples from these unimmunized, specific pathogen-free mice and found that more than one-fifth of them contained elevated levels of dsDNA-binding IgG, but not IgM; whereas mice with a wild type DH sequence did not. Thus, germline bias against the use of arginine enriched DH sequence helps to reduce the likelihood of producing self-reactive antibodies. PMID:25706374

  16. Triazine-based sequence-defined polymers with side-chain diversity and backbone-backbone interaction motifs

    DOE PAGES

    Grate, Jay W.; Mo, Kai -For; Daily, Michael D.

    2016-02-10

    Sequence control in polymers, well-known in nature, encodes structure and functionality. Here we introduce a new architecture, based on the nucleophilic aromatic substitution chemistry of cyanuric chloride, that creates a new class of sequence-defined polymers dubbed TZPs. Proof of concept is demonstrated with two synthesized hexamers, having neutral and ionizable side chains. Molecular dynamics simulations show backbone–backbone interactions, including H-bonding motifs and pi–pi interactions. This architecture is arguably biomimetic while differing from sequence-defined polymers having peptide bonds. In conclusion, the synthetic methodology supports the structural diversity of side chains known in peptides, as well as backbone–backbone hydrogen-bonding motifs, and will thus enable new macromolecules and materials with useful functions.

  17. Triazine-Based Sequence-Defined Polymers with Side-Chain Diversity and Backbone-Backbone Interaction Motifs.

    PubMed

    Grate, Jay W; Mo, Kai-For; Daily, Michael D

    2016-03-14

    Sequence control in polymers, well-known in nature, encodes structure and functionality. Here we introduce a new architecture, based on the nucleophilic aromatic substitution chemistry of cyanuric chloride, that creates a new class of sequence-defined polymers dubbed TZPs. Proof of concept is demonstrated with two synthesized hexamers, having neutral and ionizable side chains. Molecular dynamics simulations show backbone-backbone interactions, including H-bonding motifs and pi-pi interactions. This architecture is arguably biomimetic while differing from sequence-defined polymers having peptide bonds. The synthetic methodology supports the structural diversity of side chains known in peptides, as well as backbone-backbone hydrogen-bonding motifs, and will thus enable new macromolecules and materials with useful functions. © 2016 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  18. Triazine-based sequence-defined polymers with side-chain diversity and backbone-backbone interaction motifs

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Grate, Jay W.; Mo, Kai -For; Daily, Michael D.

    Sequence control in polymers, well-known in nature, encodes structure and functionality. Here we introduce a new architecture, based on the nucleophilic aromatic substitution chemistry of cyanuric chloride, that creates a new class of sequence-defined polymers dubbed TZPs. Proof of concept is demonstrated with two synthesized hexamers, having neutral and ionizable side chains. Molecular dynamics simulations show backbone–backbone interactions, including H-bonding motifs and pi–pi interactions. This architecture is arguably biomimetic while differing from sequence-defined polymers having peptide bonds. In conclusion, the synthetic methodology supports the structural diversity of side chains known in peptides, as well as backbone–backbone hydrogen-bonding motifs, and will thus enable new macromolecules and materials with useful functions.

  19. Transcriptome sequencing of diverse peanut (Arachis) wild species and the cultivated species reveals a wealth of untapped genetic variability

    USDA-ARS's Scientific Manuscript database

    Next generation sequencing technologies and improved bioinformatics methods have provided opportunities to study sequence variability in complex polyploid transcriptomes. In this study, we used a diverse panel of twenty-two Arachis accessions representing seven Arachis hypogaea market classes, A-, B...

  20. Complete cDNA sequence and amino acid analysis of a bovine ribonuclease K6 gene.

    PubMed

    Pietrowski, D; Förster, M

    2000-01-01

    The complete cDNA sequence of a ribonuclease k6 gene of Bos taurus has been determined. It codes for a protein with 154 amino acids and contains the invariant cysteine, histidine and lysine residues as well as the characteristic motifs specific to ribonuclease active sites. The deduced protein sequence is 27 residues longer than other known ribonucleases k6 and shows amino acid exchanges which could reflect a strain specificity or polymorphism within the bovine genome. Based on sequence similarity we have termed the identified gene bovine ribonuclease k6 b (brk6b).

  1. Meteoritic Amino Acids: Diversity in Compositions Reflects Parent Body Histories

    NASA Technical Reports Server (NTRS)

    Elsila, Jamie E.; Aponte, Jose C.; Blackmond, Donna G.; Burton, Aaron S.; Dworkin, Jason P.; Glavin, Daniel P.

    2016-01-01

    The analysis of amino acids in meteorites dates back over 50 years; however, it is only in recent years that research has expanded beyond investigations of a narrow set of meteorite groups (exemplified by the Murchison meteorite) into meteorites of other types and classes. These new studies have shown a wide diversity in the abundance and distribution of amino acids across carbonaceous chondrite groups, highlighting the role of parent body processes and composition in the creation, preservation, or alteration of amino acids. Although most chiral amino acids are racemic in meteorites, the enantiomeric distribution of some amino acids, particularly of the nonprotein amino acid isovaline, has also been shown to vary both within certain meteorites and across carbonaceous meteorite groups. Large L-enantiomeric excesses of some extraterrestrial protein amino acids (up to ∼60%) have also been observed in rare cases and point to nonbiological enantiomeric enrichment processes prior to the emergence of life. In this Outlook, we review these recent meteoritic analyses, focusing on variations in abundance, structural distributions, and enantiomeric distributions of amino acids and discussing possible explanations for these observations and the potential for future work.

  2. Meteoritic Amino Acids: Diversity in Compositions Reflects Parent Body Histories

    PubMed Central

    2016-01-01

    The analysis of amino acids in meteorites dates back over 50 years; however, it is only in recent years that research has expanded beyond investigations of a narrow set of meteorite groups (exemplified by the Murchison meteorite) into meteorites of other types and classes. These new studies have shown a wide diversity in the abundance and distribution of amino acids across carbonaceous chondrite groups, highlighting the role of parent body processes and composition in the creation, preservation, or alteration of amino acids. Although most chiral amino acids are racemic in meteorites, the enantiomeric distribution of some amino acids, particularly of the nonprotein amino acid isovaline, has also been shown to vary both within certain meteorites and across carbonaceous meteorite groups. Large l-enantiomeric excesses of some extraterrestrial protein amino acids (up to ∼60%) have also been observed in rare cases and point to nonbiological enantiomeric enrichment processes prior to the emergence of life. In this Outlook, we review these recent meteoritic analyses, focusing on variations in abundance, structural distributions, and enantiomeric distributions of amino acids and discussing possible explanations for these observations and the potential for future work. PMID:27413780

  3. Lactic acid bacteria involved in cocoa beans fermentation from Ivory Coast: Species diversity and citrate lyase production.

    PubMed

    Ouattara, Hadja D; Ouattara, Honoré G; Droux, Michel; Reverchon, Sylvie; Nasser, William; Niamke, Sébastien L

    2017-09-01

    Microbial fermentation is an indispensable process for high quality chocolate from cocoa bean raw material. Lactic acid bacteria (LAB) are among the major microorganisms responsible for cocoa fermentation but their exact role remains to be elucidated. In this study, we analyzed the diversity of LAB in six cocoa producing regions of Ivory Coast. Ribosomal 16S gene sequence analysis showed that Lactobacillus plantarum and Leuconostoc mesenteroides are the dominant LAB species in these six regions. In addition, other species were identified as the minor microbial population, namely Lactobacillus curieae, Enterococcus faecium, Fructobacillus pseudoficulneus, Lactobacillus casei, Weissella paramesenteroides and Weissella cibaria. However, in each region, the LAB microbial population was composed of a restricted number of species (maximum 5 species), which varied between the different regions. LAB implication in the breakdown of citric acid was investigated as a fundamental property for a successful cocoa fermentation process. High citrate lyase producer strains were characterized by rapid citric acid consumption, as revealed by a 4-fold decrease in citric acid concentration in the growth medium within 12 h, concomitant with an increase in acetic acid and lactic acid concentration. The production of citrate lyase was strongly dependent on environmental conditions, with optimum production at acidic pH (pH<5), and moderate temperature (30-40°C), which corresponds to conditions prevailing in the early stage of natural cocoa fermentation. This study reveals that one of the major roles of LAB in the cocoa fermentation process involves the breakdown of citric acid during the early stage of cocoa fermentation through the activity of citrate lyase. Copyright © 2017 Elsevier B.V. All rights reserved.

  4. Estimates of Soil Bacterial Ribosome Content and Diversity Are Significantly Affected by the Nucleic Acid Extraction Method Employed

    PubMed Central

    Wüst, Pia K.; Nacke, Heiko; Kaiser, Kristin; Marhan, Sven; Sikorski, Johannes; Kandeler, Ellen; Daniel, Rolf

    2016-01-01

    Modern sequencing technologies allow high-resolution analyses of total and potentially active soil microbial communities based on their DNA and RNA, respectively. In the present study, quantitative PCR and 454 pyrosequencing were used to evaluate the effects of different extraction methods on the abundance and diversity of 16S rRNA genes and transcripts recovered from three different types of soils (leptosol, stagnosol, and gleysol). The quality and yield of nucleic acids varied considerably with respect to both the applied extraction method and the analyzed type of soil. The bacterial ribosome content (calculated as the ratio of 16S rRNA transcripts to 16S rRNA genes) can serve as an indicator of the potential activity of bacterial cells and differed by 2 orders of magnitude between nucleic acid extracts obtained by the various extraction methods. Depending on the extraction method, the relative abundances of dominant soil taxa, in particular Actinobacteria and Proteobacteria, varied by a factor of up to 10. Through this systematic approach, the present study allows guidelines to be deduced for the selection of the appropriate extraction protocol according to the specific soil properties, the nucleic acid of interest, and the target organisms. PMID:26896137

  5. High Diversity of Myocyanophage in Various Aquatic Environments Revealed by High-Throughput Sequencing of Major Capsid Protein Gene With a New Set of Primers.

    PubMed

    Hou, Weiguo; Wang, Shang; Briggs, Brandon R; Li, Gaoyuan; Xie, Wei; Dong, Hailiang

    2018-01-01

    Myocyanophages, a group of viruses infecting cyanobacteria, are abundant and play important roles in elemental cycling. Here we investigated the particle-associated viral communities retained on 0.2 μm filters and in sediment samples (representing ancient cyanophage communities) from four ocean and three lake locations, using high-throughput sequencing and a newly designed primer pair targeting a gene fragment (∼145-bp in length) encoding the cyanophage gp23 major capsid protein (MCP). Diverse viral communities were detected in all samples. The fragments of 142-, 145-, and 148-bp in length were most abundant in the amplicons, and most sequences (>92%) belonged to cyanophages. Additionally, different sequencing depths resulted in different diversity estimates of the viral community. Operational taxonomic units obtained from deep sequencing of the MCP gene covered the majority of those obtained from shallow sequencing, suggesting that deep sequencing exhibited a more complete picture of cyanophage community than shallow sequencing. Our results also revealed a wide geographic distribution of marine myocyanophages, i.e., higher dissimilarities of the myocyanophage communities corresponded with the larger distances between the sampling sites. Collectively, this study suggests that the newly designed primer pair can be effectively used to study the community and diversity of myocyanophage from different environments, and the high-throughput sequencing represents a good method to understand viral diversity.

  6. High Diversity of Myocyanophage in Various Aquatic Environments Revealed by High-Throughput Sequencing of Major Capsid Protein Gene With a New Set of Primers

    PubMed Central

    Hou, Weiguo; Wang, Shang; Briggs, Brandon R.; Li, Gaoyuan; Xie, Wei; Dong, Hailiang

    2018-01-01

    Myocyanophages, a group of viruses infecting cyanobacteria, are abundant and play important roles in elemental cycling. Here we investigated the particle-associated viral communities retained on 0.2 μm filters and in sediment samples (representing ancient cyanophage communities) from four ocean and three lake locations, using high-throughput sequencing and a newly designed primer pair targeting a gene fragment (∼145-bp in length) encoding the cyanophage gp23 major capsid protein (MCP). Diverse viral communities were detected in all samples. The fragments of 142-, 145-, and 148-bp in length were most abundant in the amplicons, and most sequences (>92%) belonged to cyanophages. Additionally, different sequencing depths resulted in different diversity estimates of the viral community. Operational taxonomic units obtained from deep sequencing of the MCP gene covered the majority of those obtained from shallow sequencing, suggesting that deep sequencing exhibited a more complete picture of cyanophage community than shallow sequencing. Our results also revealed a wide geographic distribution of marine myocyanophages, i.e., higher dissimilarities of the myocyanophage communities corresponded with the larger distances between the sampling sites. Collectively, this study suggests that the newly designed primer pair can be effectively used to study the community and diversity of myocyanophage from different environments, and the high-throughput sequencing represents a good method to understand viral diversity.

  7. Negative Ion In-Source Decay Matrix-Assisted Laser Desorption/Ionization Mass Spectrometry for Sequencing Acidic Peptides

    NASA Astrophysics Data System (ADS)

    McMillen, Chelsea L.; Wright, Patience M.; Cassady, Carolyn J.

    2016-05-01

    Matrix-assisted laser desorption/ionization (MALDI) in-source decay was studied in the negative ion mode on deprotonated peptides to determine its usefulness for obtaining extensive sequence information for acidic peptides. Eight biological acidic peptides, ranging in size from 11 to 33 residues, were studied by negative ion mode ISD (nISD). The matrices 2,5-dihydroxybenzoic acid, 2-aminobenzoic acid, 2-aminobenzamide, 1,5-diaminonaphthalene, 5-amino-1-naphthol, 3-aminoquinoline, and 9-aminoacridine were used with each peptide. Optimal fragmentation was produced with 1,5-diaminonaphthalene (DAN), and extensive sequence informative fragmentation was observed for every peptide except hirudin(54-65). Cleavage at the N-Cα bond of the peptide backbone, producing c' and z' ions, was dominant for all peptides. Cleavage of the N-Cα bond N-terminal to proline residues was not observed. The formation of c and z ions is also found in electron transfer dissociation (ETD), electron capture dissociation (ECD), and positive ion mode ISD, which are considered to be radical-driven techniques. Oxidized insulin chain A, which has four highly acidic oxidized cysteine residues, had less extensive fragmentation. This peptide also exhibited the only charged localized fragmentation, with more pronounced product ion formation adjacent to the highly acidic residues. In addition, spectra were obtained by positive ion mode ISD for each protonated peptide; more sequence informative fragmentation was observed via nISD for all peptides. Three of the peptides studied had no product ion formation in ISD, but extensive sequence informative fragmentation was found in their nISD spectra. The results of this study indicate that nISD can be used to readily obtain sequence information for acidic peptides.

  8. Negative Ion In-Source Decay Matrix-Assisted Laser Desorption/Ionization Mass Spectrometry for Sequencing Acidic Peptides.

    PubMed

    McMillen, Chelsea L; Wright, Patience M; Cassady, Carolyn J

    2016-05-01

    Matrix-assisted laser desorption/ionization (MALDI) in-source decay was studied in the negative ion mode on deprotonated peptides to determine its usefulness for obtaining extensive sequence information for acidic peptides. Eight biological acidic peptides, ranging in size from 11 to 33 residues, were studied by negative ion mode ISD (nISD). The matrices 2,5-dihydroxybenzoic acid, 2-aminobenzoic acid, 2-aminobenzamide, 1,5-diaminonaphthalene, 5-amino-1-naphthol, 3-aminoquinoline, and 9-aminoacridine were used with each peptide. Optimal fragmentation was produced with 1,5-diaminonaphthalene (DAN), and extensive sequence-informative fragmentation was observed for every peptide except hirudin(54-65). Cleavage at the N-Cα bond of the peptide backbone, producing c' and z' ions, was dominant for all peptides. Cleavage of the N-Cα bond N-terminal to proline residues was not observed. The formation of c and z ions is also found in electron transfer dissociation (ETD), electron capture dissociation (ECD), and positive ion mode ISD, which are considered to be radical-driven techniques. Oxidized insulin chain A, which has four highly acidic oxidized cysteine residues, had less extensive fragmentation. This peptide also exhibited the only charge-localized fragmentation, with more pronounced product ion formation adjacent to the highly acidic residues. In addition, spectra were obtained by positive ion mode ISD for each protonated peptide; more sequence-informative fragmentation was observed via nISD for all peptides. Three of the peptides studied had no product ion formation in ISD, but extensive sequence-informative fragmentation was found in their nISD spectra. The results of this study indicate that nISD can be used to readily obtain sequence information for acidic peptides.

  9. Molecular diversity of lactic acid bacteria on ileum broiler chicken fed by bran and bran fermentation

    NASA Astrophysics Data System (ADS)

    Baniyah, Laelatul; Nur Jannah, Siti; Rukmi, Isworo; Sugiharto

    2018-05-01

    Lactic acid bacteria (LAB) are digestive tract microflora that play a positive role in poultry health. The number and diversity of LAB in the digestive tract are affected by several factors, including the kind of feed. The purpose of this research was to determine the diversity of LAB in the broiler ileum after feeding with prebiotic bran or Rhizopus oryzae-fermented bran added to commercial feed. Fifteen broilers were used to determine the diversity of LAB. All broilers were fed commercial feed. The control received commercial feed with no added bran or fermented bran; the treatments received commercial feed with fermented bran or with nonfermented bran. The T-RFLP method was applied to determine LAB diversity, with Hae III and Msp I as restriction enzymes. The number of phylotypes, relative abundance, Shannon diversity index (H'), evenness (E), and dominance (D) were examined. The results indicated that adding prebiotic bran to commercial feed gave a higher diversity of LAB in the broiler ileum than either the control or the addition of Rhizopus oryzae-fermented bran. The LAB groups that dominated in the ileum were Lactobacillus sp. and L. delbrueckii subsp. bulgaricus.
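    The three indices named in this abstract can be computed from a vector of phylotype abundances such as T-RFLP peak areas. The following is a minimal sketch, not the authors' procedure: the abstract does not give its formulas, so the natural-log Shannon index, Pielou's evenness (H' / ln S), and Simpson's dominance (sum of squared proportions) are assumed, and the function name diversity_indices is hypothetical.

```python
import math


def diversity_indices(abundances):
    """Return (H', E, D) for a list of non-negative phylotype abundances.

    Assumed definitions (not stated in the abstract):
      H' = -sum(p_i * ln p_i)   Shannon diversity index
      E  = H' / ln(S)           Pielou's evenness, S = phylotypes present
      D  = sum(p_i ** 2)        Simpson's dominance
    """
    total = sum(abundances)
    if total <= 0:
        raise ValueError("at least one abundance must be positive")
    # Zero-abundance phylotypes contribute nothing and are dropped.
    props = [a / total for a in abundances if a > 0]
    shannon = -sum(p * math.log(p) for p in props)
    richness = len(props)
    # Evenness is undefined for a single phylotype; report 0.0 in that case.
    evenness = shannon / math.log(richness) if richness > 1 else 0.0
    dominance = sum(p * p for p in props)
    return shannon, evenness, dominance


# Hypothetical peak areas for two samples, for illustration only.
control = [60, 25, 10, 5]
bran = [30, 25, 25, 20]
for name, sample in (("control", control), ("bran", bran)):
    h, e, d = diversity_indices(sample)
    print(f"{name}: H'={h:.3f} E={e:.3f} D={d:.3f}")
```

    A more even profile raises H' and E and lowers D, which is the direction of change the abstract reports for the bran treatment.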

  10. T7 lytic phage-displayed peptide libraries: construction and diversity characterization.

    PubMed

    Krumpe, Lauren R H; Mori, Toshiyuki

    2014-01-01

    In this chapter, we describe the construction of T7 bacteriophage (phage)-displayed peptide libraries and the diversity analyses of random amino acid sequences obtained from the libraries. We used commercially available reagents, Novagen's T7Select system, to construct the libraries. Using a combination of biotinylated extension primer and streptavidin-coupled magnetic beads, we were able to prepare library DNA without applying gel purification, resulting in extremely high ligation efficiencies. Further, we describe the use of bioinformatics tools to characterize library diversity. Amino acid frequency and positional amino acid diversity and hydropathy are estimated using the REceptor LIgand Contacts website http://relic.bio.anl.gov. Peptide net charge analysis and peptide hydropathy analysis are conducted using the Genetics Computer Group Wisconsin Package computational tools. A comprehensive collection of the estimated number of recombinants and titers of T7 phage-displayed peptide libraries constructed in our lab is included.

  11. Genome Sequencing and Analysis of Geographically Diverse Clinical Isolates of Herpes Simplex Virus 2

    PubMed Central

    Lamers, Susanna L.; Weiner, Brian; Ray, Stuart C.; Colgrove, Robert C.; Diaz, Fernando; Jing, Lichen; Wang, Kening; Saif, Sakina; Young, Sarah; Henn, Matthew; Laeyendecker, Oliver; Tobian, Aaron A. R.; Cohen, Jeffrey I.; Koelle, David M.; Quinn, Thomas C.; Knipe, David M.

    2015-01-01

    ABSTRACT Herpes simplex virus 2 (HSV-2), the principal causative agent of recurrent genital herpes, is a highly prevalent viral infection worldwide. Limited information is available on the amount of genomic DNA variation between HSV-2 strains because only two genomes have been determined, the HG52 laboratory strain and the newly sequenced SD90e low-passage-number clinical isolate strain, each from a different geographical area. In this study, we report the nearly complete genome sequences of 34 HSV-2 low-passage-number and laboratory strains, 14 of which were collected in Uganda, 1 in South Africa, 11 in the United States, and 8 in Japan. Our analyses of these genomes demonstrated remarkable sequence conservation, regardless of geographic origin, with the maximum nucleotide divergence between strains being 0.4% across the genome. In contrast, prior studies indicated that HSV-1 genomes exhibit more sequence diversity, as well as geographical clustering. Additionally, unlike HSV-1, little viral recombination between HSV-2 strains could be substantiated. These results are interpreted in light of HSV-2 evolution, epidemiology, and pathogenesis. Finally, the newly generated sequences more closely resemble the low-passage-number SD90e than HG52, supporting the use of the former as the new reference genome of HSV-2. IMPORTANCE Herpes simplex virus 2 (HSV-2) is a causative agent of genital and neonatal herpes. Therefore, knowledge of its DNA genome and genetic variability is central to preventing and treating genital herpes. However, only two full-length HSV-2 genomes have been reported. In this study, we sequenced 34 additional HSV-2 low-passage-number and laboratory viral genomes and initiated analysis of the genetic diversity of HSV-2 strains from around the world. The analysis of these genomes will facilitate research aimed at vaccine development, diagnosis, and the evaluation of clinical manifestations and transmission of HSV-2. This information will also contribute

  12. Sensitive Next-Generation Sequencing Method Reveals Deep Genetic Diversity of HIV-1 in the Democratic Republic of the Congo.

    PubMed

    Rodgers, Mary A; Wilkinson, Eduan; Vallari, Ana; McArthur, Carole; Sthreshley, Larry; Brennan, Catherine A; Cloherty, Gavin; de Oliveira, Tulio

    2017-03-15

    As the epidemiological epicenter of the human immunodeficiency virus (HIV) pandemic, the Democratic Republic of the Congo (DRC) is a reservoir of circulating HIV strains exhibiting high levels of diversity and recombination. In this study, we characterized HIV specimens collected in two rural areas of the DRC between 2001 and 2003 to identify rare strains of HIV. The env gp41 region was sequenced and characterized for 172 HIV-positive specimens. The env sequences were predominantly subtype A (43.02%), but 7 other subtypes (33.14%), 20 circulating recombinant forms (CRFs; 11.63%), and 20 unclassified (11.63%) sequences were also found. Of the rare and unclassified subtypes, 18 specimens were selected for next-generation sequencing (NGS) by a modified HIV-switching mechanism at the 5' end of the RNA template (SMART) method to obtain full-genome sequences. NGS produced 14 new complete genomes, which included pure subtype C (n = 2), D (n = 1), F1 (n = 1), H (n = 3), and J (n = 1) genomes. The two subtype C genomes and one of the subtype H genomes branched basal to their respective subtype branches but had no evidence of recombination. The remaining 6 genomes were complex recombinants of 2 or more subtypes, including subtypes A1, F, G, H, J, and K and unclassified fragments, including one subtype CRF25 isolate, which branched basal to all CRF25 references. Notably, all recombinant subtype H fragments branched basal to the H clade. Spatial-geographical analysis indicated that the diverse sequences identified here did not expand globally. The full-genome and subgenomic sequences identified in our study population significantly increase the documented diversity of the strains involved in the continually evolving HIV-1 pandemic. IMPORTANCE Very little is known about the ancestral HIV-1 strains that founded the global pandemic, and very few complete genome sequences are available from patients in the Congo Basin, where HIV-1 expanded early in the global pandemic. By

  13. Analysis of Plasmodium falciparum diversity in natural infections by deep sequencing

    PubMed Central

    Manske, Magnus; Miotto, Olivo; Campino, Susana; Auburn, Sarah; Almagro-Garcia, Jacob; Maslen, Gareth; O’Brien, Jack; Djimde, Abdoulaye; Doumbo, Ogobara; Zongo, Issaka; Ouedraogo, Jean-Bosco; Michon, Pascal; Mueller, Ivo; Siba, Peter; Nzila, Alexis; Borrmann, Steffen; Kiara, Steven M.; Marsh, Kevin; Jiang, Hongying; Su, Xin-Zhuan; Amaratunga, Chanaki; Fairhurst, Rick; Socheat, Duong; Nosten, Francois; Imwong, Mallika; White, Nicholas J.; Sanders, Mandy; Anastasi, Elisa; Alcock, Dan; Drury, Eleanor; Oyola, Samuel; Quail, Michael A.; Turner, Daniel J.; Rubio, Valentin Ruano; Jyothi, Dushyanth; Amenga-Etego, Lucas; Hubbart, Christina; Jeffreys, Anna; Rowlands, Kate; Sutherland, Colin; Roper, Cally; Mangano, Valentina; Modiano, David; Tan, John C.; Ferdig, Michael T.; Amambua-Ngwa, Alfred; Conway, David J.; Takala-Harrison, Shannon; Plowe, Christopher V.; Rayner, Julian C.; Rockett, Kirk A.; Clark, Taane G.; Newbold, Chris I.; Berriman, Matthew; MacInnis, Bronwyn; Kwiatkowski, Dominic P.

    2013-01-01

    Malaria elimination strategies require surveillance of the parasite population for genetic changes that demand a public health response, such as new forms of drug resistance. 1,2 Here we describe methods for large-scale analysis of genetic variation in Plasmodium falciparum by deep sequencing of parasite DNA obtained from the blood of patients with malaria, either directly or after short term culture. Analysis of 86,158 exonic SNPs that passed genotyping quality control in 227 samples from Africa, Asia and Oceania provides genome-wide estimates of allele frequency distribution, population structure and linkage disequilibrium. By comparing the genetic diversity of individual infections with that of the local parasite population, we derive a metric of within-host diversity that is related to the level of inbreeding in the population. An open-access web application has been established for exploration of regional differences in allele frequency and of highly differentiated loci in the P. falciparum genome. PMID:22722859

  14. Massively parallel rRNA gene sequencing exacerbates the potential for biased community diversity comparisons due to variable library sizes

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Gihring, Thomas; Green, Stefan; Schadt, Christopher Warren

    2011-01-01

    Technologies for massively parallel sequencing are revolutionizing microbial ecology and are vastly increasing the scale of ribosomal RNA (rRNA) gene studies. Although pyrosequencing has increased the breadth and depth of possible rRNA gene sampling, one drawback is that the number of reads obtained per sample is difficult to control. Pyrosequencing libraries typically vary widely in the number of sequences per sample, even within individual studies, and there is a need to revisit the behaviour of richness estimators and diversity indices with variable gene sequence library sizes. Multiple reports and review papers have demonstrated the bias in non-parametric richness estimators (e.g. Chao1 and ACE) and diversity indices when using clone libraries. However, we found that biased community comparisons are accumulating in the literature. Here we demonstrate the effects of sample size on Chao1, ACE, CatchAll, Shannon, Chao-Shen and Simpson's estimations specifically using pyrosequencing libraries. The need to equalize the number of reads being compared across libraries is reiterated, and investigators are directed towards available tools for making unbiased diversity comparisons.
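    Equalizing read counts before comparing libraries is commonly done by rarefaction, that is, random subsampling of every library to a shared depth. The sketch below illustrates that idea together with the bias-corrected Chao1 estimator; it is not the tooling the authors point to, and the function names rarefy and chao1 and the example reads are hypothetical.

```python
import random
from collections import Counter


def rarefy(otu_labels, depth, seed=0):
    """Subsample `depth` reads without replacement; return OTU counts."""
    if depth > len(otu_labels):
        raise ValueError("depth exceeds library size")
    rng = random.Random(seed)
    return Counter(rng.sample(otu_labels, depth))


def chao1(counts):
    """Bias-corrected Chao1: S_obs + F1 * (F1 - 1) / (2 * (F2 + 1)).

    F1 and F2 are the numbers of OTUs seen exactly once and exactly twice.
    """
    s_obs = len(counts)
    f1 = sum(1 for c in counts.values() if c == 1)
    f2 = sum(1 for c in counts.values() if c == 2)
    return s_obs + f1 * (f1 - 1) / (2 * (f2 + 1))


# Two hypothetical libraries of unequal size; each read is an OTU label.
small = ["otu1"] * 40 + ["otu2"] * 8 + ["otu3", "otu4"]
large = small * 10 + ["otu5", "otu6", "otu7"]

# Compare both libraries at the depth of the smaller one.
depth = min(len(small), len(large))
for name, library in (("small", small), ("large", large)):
    counts = rarefy(library, depth, seed=42)
    print(name, "observed OTUs:", len(counts), "Chao1:", chao1(counts))
```

    Subsampling to the smallest library discards data, so the shared depth is a trade-off between comparability and sensitivity to rare OTUs.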

  15. 37 CFR 1.821 - Nucleotide and/or amino acid sequence disclosures in patent applications.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ...” means those amino acids other than “Xaa” and those nucleotide bases other than “n” defined in accordance... 37 Patents, Trademarks, and Copyrights 1 2014-07-01 2014-07-01 false Nucleotide and/or amino acid... Biotechnology Invention Disclosures Application Disclosures Containing Nucleotide And/or Amino Acid Sequences...

  16. 37 CFR 1.821 - Nucleotide and/or amino acid sequence disclosures in patent applications.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ...” means those amino acids other than “Xaa” and those nucleotide bases other than “n” defined in accordance... 37 Patents, Trademarks, and Copyrights 1 2013-07-01 2013-07-01 false Nucleotide and/or amino acid... Biotechnology Invention Disclosures Application Disclosures Containing Nucleotide And/or Amino Acid Sequences...

  17. 37 CFR 1.821 - Nucleotide and/or amino acid sequence disclosures in patent applications.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ...” means those amino acids other than “Xaa” and those nucleotide bases other than “n” defined in accordance... 37 Patents, Trademarks, and Copyrights 1 2012-07-01 2012-07-01 false Nucleotide and/or amino acid... Biotechnology Invention Disclosures Application Disclosures Containing Nucleotide And/or Amino Acid Sequences...

  18. Full genome virus detection in fecal samples using sensitive nucleic acid preparation, deep sequencing, and a novel iterative sequence classification algorithm.

    PubMed

    Cotten, Matthew; Oude Munnink, Bas; Canuti, Marta; Deijs, Martin; Watson, Simon J; Kellam, Paul; van der Hoek, Lia

    2014-01-01

    We have developed a full genome virus detection process that combines sensitive nucleic acid preparation optimised for virus identification in fecal material with Illumina MiSeq sequencing and a novel post-sequencing virus identification algorithm. Enriched viral nucleic acid was converted to double-stranded DNA and subjected to Illumina MiSeq sequencing. The resulting short reads were processed with a novel iterative Python algorithm SLIM for the identification of sequences with homology to known viruses. De novo assembly was then used to generate full viral genomes. The sensitivity of this process was demonstrated with a set of fecal samples from HIV-1 infected patients. A quantitative assessment of the mammalian, plant, and bacterial virus content of this compartment was generated and the deep sequencing data were sufficient to assemble 12 complete viral genomes from 6 virus families. The method detected high levels of enteropathic viruses that are normally controlled in healthy adults, but may be involved in the pathogenesis of HIV-1 infection and will provide a powerful tool for virus detection and for analyzing changes in the fecal virome associated with HIV-1 progression and pathogenesis.

  19. Full Genome Virus Detection in Fecal Samples Using Sensitive Nucleic Acid Preparation, Deep Sequencing, and a Novel Iterative Sequence Classification Algorithm

    PubMed Central

    Cotten, Matthew; Oude Munnink, Bas; Canuti, Marta; Deijs, Martin; Watson, Simon J.; Kellam, Paul; van der Hoek, Lia

    2014-01-01

    We have developed a full genome virus detection process that combines sensitive nucleic acid preparation optimised for virus identification in fecal material with Illumina MiSeq sequencing and a novel post-sequencing virus identification algorithm. Enriched viral nucleic acid was converted to double-stranded DNA and subjected to Illumina MiSeq sequencing. The resulting short reads were processed with a novel iterative Python algorithm SLIM for the identification of sequences with homology to known viruses. De novo assembly was then used to generate full viral genomes. The sensitivity of this process was demonstrated with a set of fecal samples from HIV-1 infected patients. A quantitative assessment of the mammalian, plant, and bacterial virus content of this compartment was generated and the deep sequencing data were sufficient to assemble 12 complete viral genomes from 6 virus families. The method detected high levels of enteropathic viruses that are normally controlled in healthy adults, but may be involved in the pathogenesis of HIV-1 infection and will provide a powerful tool for virus detection and for analyzing changes in the fecal virome associated with HIV-1 progression and pathogenesis. PMID:24695106

  20. Identification of New Cocrystal Systems with Stoichiometric Diversity of Salicylic Acid Using Thermal Methods.

    PubMed

    Zhou, Zhengzheng; Chan, Hok Man; Sung, Herman H-Y; Tong, Henry H Y; Zheng, Ying

    2016-04-01

    The purpose of this work was to develop thermal methods to identify cocrystal systems with stoichiometric diversity. Differential scanning calorimetry (DSC) and hot stage microscopy (HSM) have been applied to study the stoichiometric diversity phenomenon on cocrystal systems of the model compound salicylic acid (SA) with different coformers (CCFs). The DSC method was particularly useful in the identification of cocrystal re-crystallization, especially to improve the temperature resolution using a slower heating rate. HSM was implemented as a complementary protocol to confirm the DSC results. The crystal structures were elucidated by single-crystal X-ray diffraction (SXRD). Two new cocrystal systems consisting of salicylic acid-benzamide (SA-BZD, 1:1, 1:2) and salicylic acid-isonicotinamide (SA-ISN, 1:1, 2:1) have been identified in the present work. The chemical structures of the newly discovered cocrystals SA-BZD (1:2) and SA-ISN (2:1) have been elucidated using X-ray single crystal and powder diffraction methods. The developed thermal methods could rapidly identify cocrystal systems with stoichiometric diversity, with the potential to discover new pharmaceutical cocrystals in the future.

  1. NullSeq: A Tool for Generating Random Coding Sequences with Desired Amino Acid and GC Contents.

    PubMed

    Liu, Sophia S; Hockenberry, Adam J; Lancichinetti, Andrea; Jewett, Michael C; Amaral, Luís A N

    2016-11-01

    The existence of over- and under-represented sequence motifs in genomes provides evidence of selective evolutionary pressures on biological mechanisms such as transcription, translation, ligand-substrate binding, and host immunity. In order to accurately identify motifs and other genome-scale patterns of interest, it is essential to be able to generate accurate null models that are appropriate for the sequences under study. While many tools have been developed to create random nucleotide sequences, protein coding sequences are subject to a unique set of constraints that complicates the process of generating appropriate null models. There are currently no tools available that allow users to create random coding sequences with specified amino acid composition and GC content for the purpose of hypothesis testing. Using the principle of maximum entropy, we developed a method that generates unbiased random sequences with pre-specified amino acid and GC content, which we have developed into a Python package. Our method is the simplest way to obtain maximally unbiased random sequences that are subject to GC usage and primary amino acid sequence constraints. Furthermore, this approach can easily be expanded to create unbiased random sequences that incorporate more complicated constraints such as individual nucleotide usage or even di-nucleotide frequencies. The ability to generate correctly specified null models will allow researchers to accurately identify sequence motifs, which will lead to a better understanding of biological processes as well as more effective engineering of biological systems.
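    The general idea of drawing random coding sequences under an amino acid constraint and a GC preference can be sketched as follows. This is an illustration only and does not use or reproduce the NullSeq package or its API: it fixes the protein sequence, draws each codon from the synonymous set with weight exp(gc_bias * GC count), which is the exponential form a maximum-entropy distribution takes under a mean-GC constraint, and leaves gc_bias as a free parameter instead of solving for the value that matches a target GC content. All names here are hypothetical.

```python
import math
import random
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = ["".join(c) for c in product(BASES, repeat=3)]
CODON_TO_AA = dict(zip(CODONS, AMINO))

# Map each amino acid (and '*' for stop) to its synonymous codons.
SYNONYMS = {}
for codon, aa in CODON_TO_AA.items():
    SYNONYMS.setdefault(aa, []).append(codon)


def gc_count(seq):
    """Number of G or C bases in a sequence."""
    return sum(base in "GC" for base in seq)


def gc_fraction(seq):
    """Fraction of G or C bases in a non-empty sequence."""
    return gc_count(seq) / len(seq)


def random_coding_sequence(protein, gc_bias=0.0, seed=None):
    """Draw one codon per residue with weight exp(gc_bias * GC count).

    gc_bias = 0 gives a uniform choice among synonymous codons;
    positive values favour GC-rich codons, negative values AT-rich ones.
    """
    rng = random.Random(seed)
    chosen = []
    for aa in protein:
        codons = SYNONYMS[aa]
        weights = [math.exp(gc_bias * gc_count(c)) for c in codons]
        chosen.append(rng.choices(codons, weights=weights, k=1)[0])
    return "".join(chosen)


def translate(dna):
    """Translate a coding sequence whose length is a multiple of three."""
    return "".join(CODON_TO_AA[dna[i:i + 3]] for i in range(0, len(dna), 3))


protein = "MKLVSAGRLE"  # hypothetical peptide
for bias in (-2.0, 0.0, 2.0):
    dna = random_coding_sequence(protein, gc_bias=bias, seed=1)
    assert translate(dna) == protein  # the amino acid constraint holds
    print(bias, dna, round(gc_fraction(dna), 2))
```

    Matching a specified GC content would require tuning gc_bias, for example by bisection on the expected GC fraction; that step is omitted here.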

  2. Insights into the genetic structure and diversity of 38 South Asian Indians from deep whole-genome sequencing.

    PubMed

    Wong, Lai-Ping; Lai, Jason Kuan-Han; Saw, Woei-Yuh; Ong, Rick Twee-Hee; Cheng, Anthony Youzhi; Pillai, Nisha Esakimuthu; Liu, Xuanyao; Xu, Wenting; Chen, Peng; Foo, Jia-Nee; Tan, Linda Wei-Lin; Koo, Seok-Hwee; Soong, Richie; Wenk, Markus Rene; Lim, Wei-Yen; Khor, Chiea-Chuen; Little, Peter; Chia, Kee-Seng; Teo, Yik-Ying

    2014-05-01

    South Asia possesses a significant amount of genetic diversity due to considerable intergroup differences in culture and language. There have been numerous reports on the genetic structure of Asian Indians, although these have mostly relied on genotyping microarrays or targeted sequencing of the mitochondria and Y chromosomes. Asian Indians in Singapore are primarily descendants of immigrants from Dravidian-language-speaking states in south India, and 38 individuals from the general population underwent deep whole-genome sequencing with a target coverage of 30X as part of the Singapore Sequencing Indian Project (SSIP). The genetic structure and diversity of these samples were compared against samples from the Singapore Sequencing Malay Project and populations in Phase 1 of the 1,000 Genomes Project (1 KGP). SSIP samples exhibited greater intra-population genetic diversity and possessed higher heterozygous-to-homozygous genotype ratio than other Asian populations. When compared against a panel of well-defined Asian Indians, the genetic makeup of the SSIP samples was closely related to South Indians. However, even though the SSIP samples clustered distinctly from the Europeans in the global population structure analysis with autosomal SNPs, eight samples were assigned to mitochondrial haplogroups that were predominantly present in Europeans and possessed higher European admixture than the remaining samples. An analysis of the relative relatedness between SSIP with two archaic hominins (Denisovan, Neanderthal) identified higher ancient admixture in East Asian populations than in SSIP. The data resource for these samples is publicly available and is expected to serve as a valuable complement to the South Asian samples in Phase 3 of 1 KGP.

  3. [Complete genome sequencing of polymalic acid-producing strain Aureobasidium pullulans CCTCC M2012223].

    PubMed

    Wang, Yongkang; Song, Xiaodan; Li, Xiaorong; Yang, Sang-tian; Zou, Xiang

    2017-01-04

    This study aimed to explore the genome sequence of Aureobasidium pullulans CCTCC M2012223, analyze the key genes related to the biosynthesis of important metabolites, and provide genetic background for metabolic engineering. The complete genome of A. pullulans CCTCC M2012223 was sequenced on the Illumina HiSeq high-throughput sequencing platform. Fragment assembly, gene prediction, functional annotation, and GO/COG clustering were then analyzed in comparison with five other A. pullulans varieties. The complete genome sequence of A. pullulans CCTCC M2012223 was 30756831 bp with an average GC content of 47.49%, and 9452 genes were successfully predicted. Genome-wide analysis showed that A. pullulans CCTCC M2012223 had the largest genome assembly size. Protein sequences involved in the pullulan and polymalic acid pathways were highly conserved in all six A. pullulans varieties. Although A. pullulans CCTCC M2012223 and A. pullulans var. melanogenum are closely related, some point mutations and insertions occurred in protein sequences involved in melanin biosynthesis. The genome of A. pullulans CCTCC M2012223 was annotated and genes involved in the melanin, pullulan and polymalic acid pathways were compared, which provides a theoretical basis for genetic modification of metabolic pathways in A. pullulans.

  4. Endophytic bacterial diversity in grapevine (Vitis vinifera L.) leaves described by 16S rRNA gene sequence analysis and length heterogeneity-PCR.

    PubMed

    Bulgari, Daniela; Casati, Paola; Brusetti, Lorenzo; Quaglino, Fabio; Brasca, Milena; Daffonchio, Daniele; Bianco, Piero Attilio

    2009-08-01

    Diversity of bacterial endophytes associated with grapevine leaf tissues was analyzed by cultivation and cultivation-independent methods. In order to identify bacterial endophytes directly from metagenome, a protocol for bacteria enrichment and DNA extraction was optimized. Sequence analysis of 16S rRNA gene libraries underscored five diverse Operational Taxonomic Units (OTUs), showing best sequence matches with gamma-Proteobacteria, family Enterobacteriaceae, with a dominance of the genus Pantoea. Bacteria isolation through cultivation revealed the presence of six OTUs, showing best sequence matches with Actinobacteria, genus Curtobacterium, and with Firmicutes genera Bacillus and Enterococcus. Length Heterogeneity-PCR (LH-PCR) electrophoretic peaks from single bacterial clones were used to set up a database representing the bacterial endophytes identified in association with grapevine tissues. Analysis of healthy and phytoplasma-infected grapevine plants showed that LH-PCR could be a useful complementary tool for examining the diversity of bacterial endophytes, especially for diversity surveys of large numbers of samples.

  5. Molecular sequence data of hepatitis B virus and genetic diversity after vaccination.

    PubMed

    van Ballegooijen, W Marijn; van Houdt, Robin; Bruisten, Sylvia M; Boot, Hein J; Coutinho, Roel A; Wallinga, Jacco

    2009-12-15

    The effect of vaccination programs on transmission of infectious disease is usually assessed by monitoring programs that rely on notifications of symptomatic illness. For monitoring of infectious diseases with a high proportion of asymptomatic cases or a low reporting rate, molecular sequence data combined with modern coalescent-based techniques offer a complementary tool to assess transmission. Here, the authors investigate the added value of using viral sequence data to monitor a vaccination program that was started in 1998 and was targeted against hepatitis B virus in men who have sex with men in Amsterdam, the Netherlands. The incidence in this target group, as estimated from the notifications of acute infections with hepatitis B virus, was low; therefore, there was insufficient power to show a significant change in incidence. In contrast, the genetic diversity, as estimated from the viral sequence collected from the target group, revealed a marked decrease after vaccination was introduced. Taken together, the findings suggest that introduction of vaccination coincided with a change in the target group toward behavior with a higher risk of infection. The authors argue that molecular sequence data provide a powerful additional monitoring instrument, next to conventional case registration, for assessing the impact of vaccination.

  6. Genotyping-by-Sequencing (GBS) Revealed Molecular Genetic Diversity of Iranian Wheat Landraces and Cultivars

    PubMed Central

    Alipour, Hadi; Bihamta, Mohammad R.; Mohammadi, Valiollah; Peyghambari, Seyed A.; Bai, Guihua; Zhang, Guorong

    2017-01-01

    Background: Genetic diversity is an essential resource for breeders to improve new cultivars with desirable characteristics. Recently, genotyping-by-sequencing (GBS), a next-generation sequencing (NGS) technology that can simplify complex genomes, has been used as a high-throughput and cost-effective molecular tool for routine breeding and screening in many crop species, including species with a large genome. Results: We genotyped a diversity panel of 369 Iranian hexaploid wheat accessions including 270 landraces collected between 1931 and 1968 in different climate zones and 99 cultivars released between 1942 and 2014 using 16,506 GBS-based single nucleotide polymorphism (GBS-SNP) markers. The B genome had the highest number of mapped SNPs while the D genome had the lowest on both the Chinese Spring and W7984 references. Structure and cluster analyses divided the panel into three groups with two landrace groups and one cultivar group, suggesting a high differentiation between landraces and cultivars and between landraces. The cultivar group can be further divided into four subgroups, one of which was mostly derived from Iranian ancestor(s). Similarly, landrace groups can be further divided based on years of collection and climate zones where the accessions were collected. Molecular analysis of variance indicated that the genetic variation was larger between groups than within groups. Conclusion: Obvious genetic diversity in Iranian wheat was revealed by analysis of GBS-SNPs and thus breeders can select genetically distant parents for crossing in breeding. The diverse Iranian landraces provide rich genetic sources of tolerance to biotic and abiotic stresses, and they can be useful resources for the improvement of wheat production in Iran and other countries. PMID:28912785

  7. Diversity and distribution of unicellular opisthokonts along the European coast analyzed using high-throughput sequencing

    PubMed Central

    del Campo, Javier; Mallo, Diego; Massana, Ramon; de Vargas, Colomban; Richards, Thomas A.; Ruiz-Trillo, Iñaki

    2015-01-01

    Summary The opisthokonts are one of the major super-groups of eukaryotes. They comprise two major clades: 1) the Metazoa and their unicellular relatives and 2) the Fungi and their unicellular relatives. There is, however, little knowledge of the role of opisthokont microbes in many natural environments, especially among non-metazoan and non-fungal opisthokonts. Here we begin to address this gap by analyzing high throughput 18S rDNA and 18S rRNA sequencing data from different European coastal sites, sampled at different size fractions and depths. In particular, we analyze the diversity and abundance of choanoflagellates, filastereans, ichthyosporeans, nucleariids, corallochytreans and their related lineages. Our results show the great diversity of choanoflagellates in coastal waters as well as a relevant role of the ichthyosporeans and the uncultured marine opisthokonts (MAOP). Furthermore, we describe a new lineage of marine fonticulids (MAFO) that appears to be abundant in sediments. Therefore, our work points to a greater potential ecological role for unicellular opisthokonts than previously appreciated in marine environments, both in water column and sediments, and also provides evidence of novel opisthokont phylogenetic lineages. This study highlights the importance of high throughput sequencing approaches to unravel the diversity and distribution of both known and novel eukaryotic lineages. PMID:25556908

  8. [Microbial diversity and ammonia-oxidizing microorganism of a soil sample near an acid mine drainage lake].

    PubMed

    Liu, Ying; Wang, Li-Hua; Hao, Chun-Bo; Li, Lu; Li, Si-Yuan; Feng, Chuan-Ping

    2014-06-01

    The main physicochemical parameters of a soil sample collected near an acid mine drainage reservoir in Anhui province were analyzed. The microbial diversity and community structure were studied through the construction of bacterial and archaeal 16S rRNA gene clone libraries and an archaeal ammonia monooxygenase gene clone library. The functional groups responsible for the process of ammonia oxidation were also discussed. The results indicated that the soil sample had an extremely low pH value (pH < 3) and high ion concentrations, which were influenced by the acid mine drainage (AMD). All the 16S rRNA gene sequences of the bacterial clone library fell into 11 phyla, and Acidobacteria played the most significant role in the ecosystem, followed by Verrucomicrobia. A great number of acidophilic bacteria existed in the soil sample, such as Candidatus Koribacter versatilis and Holophaga sp. The archaeal clone library consisted of 2 phyla (Thaumarchaeota and Euryarchaeota). The abundance of Thaumarchaeota was remarkably higher than that of Euryarchaeota. The ammonia oxidation in the soil environment was probably driven by ammonia-oxidizing archaea, and new species of ammonia-oxidizing archaea existed in the soil sample.

  9. Diverse novel astroviruses identified in wild Himalayan marmots.

    PubMed

    Ao, Yuan-Yun; Yu, Jie-Mei; Li, Li-Li; Cao, Jing-Yuan; Deng, Hong-Yan; Xin, Yun-Yun; Liu, Meng-Meng; Lin, Lin; Lu, Shan; Xu, Jian-Guo; Duan, Zhao-Jun

    2017-04-01

    With advances in viral surveillance and next-generation sequencing, highly diverse novel astroviruses (AstVs) and different animal hosts had been discovered in recent years. However, the existence of AstVs in marmots had yet to be shown. Here, we identified two highly divergent strains of AstVs (tentatively named Qinghai Himalayanmarmot AstVs, HHMAstV1 and HHMAstV2), by viral metagenomic analysis in liver tissues isolated from wild Marmota himalayana in China. Overall, 12 of 99 (12.1 %) M. himalayana faecal samples were positive for the presence of genetically diverse AstVs, while only HHMAstV1 and HHMAstV2 were identified in 300 liver samples. The complete genomic sequences of HHMAstV1 and HHMAstV2 were 6681 and 6610 nt in length, respectively, with the typical genomic organization of AstVs. Analysis of the complete ORF 2 sequence showed that these novel AstVs are most closely related to the rabbit AstV, mamastrovirus 23 (with 31.0 and 48.0 % shared amino acid identity, respectively). Phylogenetic analysis of the amino acid sequences of ORF1a, ORF1b and ORF2 indicated that HHMAstV1 and HHMAstV2 form two distinct clusters among the mamastroviruses, and may share a common ancestor with the rabbit-specific mamastrovirus 23. These results suggest that HHMAstV1 and HHMAstV2 are two novel species of the genus Mamastrovirus in the Astroviridae. The remarkable diversity of these novel AstVs will contribute to a greater understanding of the evolution and ecology of AstVs, although additional studies will be needed to understand the clinical significance of these novel AstVs in marmots, as well as in humans.

  10. Complete amino acid sequence of ananain and a comparison with stem bromelain and other plant cysteine proteases.

    PubMed Central

    Lee, K L; Albee, K L; Bernasconi, R J; Edmunds, T

    1997-01-01

    The amino acid sequences of ananain (EC 3.4.22.31) and stem bromelain (EC 3.4.22.32), two cysteine proteases from pineapple stem, are similar, yet ananain and stem bromelain possess distinct specificities towards synthetic peptide substrates and different reactivities towards the cysteine protease inhibitors E-64 and chicken egg white cystatin. We present here the complete amino acid sequence of ananain and compare it with the reported sequences of pineapple stem bromelain, papain and chymopapain from papaya and actinidin from kiwifruit. Ananain is comprised of 216 residues with a theoretical mass of 23464 Da. This primary structure includes a sequence insert between residues 170 and 174 not present in stem bromelain or papain and a hydrophobic series of amino acids adjacent to His-157. It is possible that these sequence differences contribute to the different substrate and inhibitor specificities exhibited by ananain and stem bromelain. PMID:9355753

  11. Phylogenetic diversity in the genus Bacillus as seen by 16S rRNA sequencing studies

    NASA Technical Reports Server (NTRS)

    Rossler, D.; Ludwig, W.; Schleifer, K. H.; Lin, C.; McGill, T. J.; Wisotzkey, J. D.; Jurtshuk, P. Jr; Fox, G. E.

    1991-01-01

    Comparative sequence analysis of 16S ribosomal (r)RNAs or DNAs of Bacillus alvei, B. laterosporus, B. macerans, B. macquariensis, B. polymyxa and B. stearothermophilus revealed the phylogenetic diversity of the genus Bacillus. Based on the presently available data set of 16S rRNA sequences from bacilli and relatives, at least four major "Bacillus clusters" can be defined: a "Bacillus subtilis cluster" including B. stearothermophilus, a "B. brevis cluster" including B. laterosporus, a "B. alvei cluster" including B. macerans, B. macquariensis and B. polymyxa and a "B. cycloheptanicus branch".

  12. The pig gut microbial diversity: Understanding the pig gut microbial ecology through the next generation high throughput sequencing.

    PubMed

    Kim, Hyeun Bum; Isaacson, Richard E

    2015-06-12

    The importance of the gut microbiota of animals is widely acknowledged because of its pivotal roles in the health and well being of animals. The genetic diversity of the gut microbiota contributes to the overall development and metabolic needs of the animal, and provides the host with many beneficial functions including production of volatile fatty acids, re-cycling of bile salts, production of vitamin K, cellulose digestion, and development of immune system. Thus the intestinal microbiota of animals has been the subject of study for many decades. Although most of the older studies have used culture dependent methods, the recent advent of high throughput sequencing of 16S rRNA genes has facilitated in depth studies exploring microbial populations and their dynamics in the animal gut. These culture independent DNA based studies generate large amounts of data and as a result contribute to a more detailed understanding of the microbiota dynamics in the gut and the ecology of the microbial populations. Of equal importance, is being able to identify and quantify microbes that are difficult to grow or that have not been grown in the laboratory. Interpreting the data obtained from this type of study requires using basic principles of microbial diversity to understand importance of the composition of microbial populations. In this review, we summarize the literature on culture independent studies of the pig gut microbiota with an emphasis on its succession and alterations caused by diverse factors. Copyright © 2015 Elsevier B.V. All rights reserved.

  13. Diversity of DNA and RNA Viruses in Indoor Air As Assessed via Metagenomic Sequencing.

    PubMed

    Rosario, Karyna; Fierer, Noah; Miller, Shelly; Luongo, Julia; Breitbart, Mya

    2018-02-06

    Diverse bacterial and fungal communities inhabit human-occupied buildings and circulate in indoor air; however, viral diversity in these man-made environments remains largely unknown. Here we investigated DNA and RNA viruses circulating in the air of 12 university dormitory rooms by analyzing dust accumulated over a one-year period on heating, ventilation, and air conditioning (HVAC) filters. A metagenomic sequencing approach was used to determine the identity and diversity of viral particles extracted from the HVAC filters. We detected a broad diversity of viruses associated with a range of hosts, including animals, arthropods, bacteria, fungi, humans, plants, and protists, suggesting that disparate organisms can contribute to indoor airborne viral communities. Viral community composition and the distribution of human-infecting papillomaviruses and polyomaviruses were distinct in the different dormitory rooms, indicating that airborne viral communities are variable in human-occupied spaces and appear to reflect differential rates of viral shedding from room occupants. This work significantly expands the known airborne viral diversity found indoors, enabling the design of sensitive and quantitative assays to further investigate specific viruses of interest and providing new insight into the likely sources of viruses found in indoor air.

  14. [Bacterial diversity in sequencing batch biofilm reactor (SBBR) for landfill leachate treatment using PCR-DGGE].

    PubMed

    Xiao, Yong; Yang, Zhao-hui; Zeng, Guang-ming; Ma, Yan-he; Liu, You-sheng; Wang, Rong-juan; Xu, Zheng-yong

    2007-05-01

    To study the bacterial diversity and the mechanism of denitrification in a sequencing batch biofilm reactor (SBBR) treating landfill leachate, and to provide microbial evidence for technique improvements, total microbial DNA was extracted from samples collected from natural landfill leachate and from the biofilm of an SBBR that could efficiently remove NH4+-N and COD of high concentration. 16S rDNA fragments were successfully amplified from the total DNA using a pair of universal bacterial 16S rDNA primers, GC341F and 907R, and were then used for denaturing gradient gel electrophoresis (DGGE) analysis. The bands in the gel were analyzed by statistical methods and excised from the gel for sequencing, and the sequences were used for homology analysis; two phylogenetic trees were then constructed using DNAStar software. Results indicated that the bacterial diversity of the biofilm in the SBBR and the landfill leachate was abundant, and that no obvious change of community structure happened in the biofilm during running, in which most bacteria came from the landfill leachate. There may be three different modes of denitrification in the reactor because several different nitrifying bacteria, denitrifying bacteria and anaerobic ammonia oxidation bacteria coexisted in it. The results provided some valuable references for studying the microbiological mechanism of denitrification in SBBR.

  15. High levels of MHC class II allelic diversity in lake trout from Lake Superior

    USGS Publications Warehouse

    Dorschner, M.O.; Duris, T.; Bronte, C.R.; Burnham-Curtis, M. K.; Phillips, R.B.

    2000-01-01

    Sequence variation in a 216 bp portion of the major histocompatibility complex (MHC) II B1 domain was examined in 74 individual lake trout (Salvelinus namaycush) from different locations in Lake Superior. Forty-three alleles were obtained which encoded 71-72 amino acids of the mature protein. These sequences were compared with previous data obtained from five Pacific salmon species and Atlantic salmon using the same primers. Although all of the lake trout alleles clustered together in the neighbor-joining analysis of amino acid sequences, one amino acid allelic lineage was shared with Atlantic salmon (Salmo salar), a species in another genus which probably diverged from Salvelinus more than 10-20 million years ago. As shown previously in other salmonids, the level of nonsynonymous nucleotide substitution (d(N)) exceeded the level of synonymous substitution (d(S)). The level of nucleotide diversity at the MHC class II B1 locus was considerably higher in lake trout than in the Pacific salmon (genus Oncorhynchus). These results are consistent with the hypothesis that lake trout colonized Lake Superior from more than one refuge following the Wisconsin glaciation. Recent population bottlenecks may have reduced nucleotide diversity in Pacific salmon populations.

  16. Diversity analysis in Cannabis sativa based on large-scale development of expressed sequence tag-derived simple sequence repeat markers.

    PubMed

    Gao, Chunsheng; Xin, Pengfei; Cheng, Chaohua; Tang, Qing; Chen, Ping; Wang, Changbiao; Zang, Gonggu; Zhao, Lining

    2014-01-01

    Cannabis sativa L. is an important economic plant for the production of food, fiber, oils, and intoxicants. However, lack of sufficient simple sequence repeat (SSR) markers has limited the development of cannabis genetic research. Here, large-scale development of expressed sequence tag simple sequence repeat (EST-SSR) markers was performed to obtain more informative genetic markers, and to assess genetic diversity in cannabis (Cannabis sativa L.). Based on the cannabis transcriptome, 4,577 SSRs were identified from 3,624 ESTs. From there, a total of 3,442 complementary primer pairs were designed as SSR markers. Among these markers, trinucleotide repeat motifs (50.99%) were the most abundant, followed by hexanucleotide (25.13%), dinucleotide (16.34%), tetranucleotide (3.8%), and pentanucleotide (3.74%) repeat motifs, respectively. The AAG/CTT trinucleotide repeat (17.96%) was the most abundant motif detected in the SSRs. One hundred and seventeen EST-SSR markers were randomly selected to evaluate primer quality in 24 cannabis varieties. Among these 117 markers, 108 (92.31%) were successfully amplified and 87 (74.36%) were polymorphic. Forty-five polymorphic primer pairs were selected to evaluate genetic diversity and relatedness among the 115 cannabis genotypes. The results showed that 115 varieties could be divided into 4 groups primarily based on geography: Northern China, Europe, Central China, and Southern China. Moreover, the coefficient of similarity when comparing cannabis from Northern China with the European group cannabis was higher than that when comparing with cannabis from the other two groups, owing to a similar climate. This study outlines the first large-scale development of SSR markers for cannabis. These data may serve as a foundation for the development of genetic linkage, quantitative trait loci mapping, and marker-assisted breeding of cannabis.

  17. Diversity Analysis in Cannabis sativa Based on Large-Scale Development of Expressed Sequence Tag-Derived Simple Sequence Repeat Markers

    PubMed Central

    Cheng, Chaohua; Tang, Qing; Chen, Ping; Wang, Changbiao; Zang, Gonggu; Zhao, Lining

    2014-01-01

    Cannabis sativa L. is an important economic plant for the production of food, fiber, oils, and intoxicants. However, lack of sufficient simple sequence repeat (SSR) markers has limited the development of cannabis genetic research. Here, large-scale development of expressed sequence tag simple sequence repeat (EST-SSR) markers was performed to obtain more informative genetic markers, and to assess genetic diversity in cannabis (Cannabis sativa L.). Based on the cannabis transcriptome, 4,577 SSRs were identified from 3,624 ESTs. From there, a total of 3,442 complementary primer pairs were designed as SSR markers. Among these markers, trinucleotide repeat motifs (50.99%) were the most abundant, followed by hexanucleotide (25.13%), dinucleotide (16.34%), tetranucleotide (3.8%), and pentanucleotide (3.74%) repeat motifs, respectively. The AAG/CTT trinucleotide repeat (17.96%) was the most abundant motif detected in the SSRs. One hundred and seventeen EST-SSR markers were randomly selected to evaluate primer quality in 24 cannabis varieties. Among these 117 markers, 108 (92.31%) were successfully amplified and 87 (74.36%) were polymorphic. Forty-five polymorphic primer pairs were selected to evaluate genetic diversity and relatedness among the 115 cannabis genotypes. The results showed that 115 varieties could be divided into 4 groups primarily based on geography: Northern China, Europe, Central China, and Southern China. Moreover, the coefficient of similarity when comparing cannabis from Northern China with the European group cannabis was higher than that when comparing with cannabis from the other two groups, owing to a similar climate. This study outlines the first large-scale development of SSR markers for cannabis. These data may serve as a foundation for the development of genetic linkage, quantitative trait loci mapping, and marker-assisted breeding of cannabis. PMID:25329551

  18. Understanding the complex evolution of rapidly mutating viruses with deep sequencing: Beyond the analysis of viral diversity.

    PubMed

    Leung, Preston; Eltahla, Auda A; Lloyd, Andrew R; Bull, Rowena A; Luciani, Fabio

    2017-07-15

    With the advent of affordable deep sequencing technologies, detection of low frequency variants within genetically diverse viral populations can now be achieved with unprecedented depth and efficiency. The high-resolution data provided by next generation sequencing technologies is currently recognised as the gold standard in estimation of viral diversity. In the analysis of rapidly mutating viruses, longitudinal deep sequencing datasets from viral genomes during individual infection episodes, as well as at the epidemiological level during outbreaks, now allow for more sophisticated analyses such as statistical estimates of the impact of complex mutation patterns on the evolution of the viral populations both within and between hosts. These analyses are revealing more accurate descriptions of the evolutionary dynamics that underpin the rapid adaptation of these viruses to the host response, and to drug therapies. This review assesses recent developments in methods and provides informative research examples using deep sequencing data generated from rapidly mutating viruses infecting humans, particularly hepatitis C virus (HCV), human immunodeficiency virus (HIV), Ebola virus and influenza virus, to understand the evolution of viral genomes and to explore the relationship between viral mutations and the host adaptive immune response. Finally, we discuss limitations in current technologies, and future directions that take advantage of publicly available large deep sequencing datasets. Copyright © 2016 Elsevier B.V. All rights reserved.

  19. Phylogenetic Diversity of Lactic Acid Bacteria Associated with Paddy Rice Silage as Determined by 16S Ribosomal DNA Analysis

    PubMed Central

    Ennahar, Saïd; Cai, Yimin; Fujita, Yasuhito

    2003-01-01

    A total of 161 low-G+C-content gram-positive bacteria isolated from whole-crop paddy rice silage were classified and subjected to phenotypic and genetic analyses. Based on morphological and biochemical characters, these presumptive lactic acid bacterium (LAB) isolates were divided into 10 groups that included members of the genera Enterococcus, Lactobacillus, Lactococcus, Leuconostoc, Pediococcus, and Weissella. Analysis of the 16S ribosomal DNA (rDNA) was used to confirm the presence of the predominant groups indicated by phenotypic analysis and to determine the phylogenetic affiliation of representative strains. The virtually complete 16S rRNA gene was PCR amplified and sequenced. The sequences from the various LAB isolates showed high degrees of similarity to those of the GenBank reference strains (between 98.7 and 99.8%). Phylogenetic trees based on the 16S rDNA sequence displayed high consistency, with nodes supported by high bootstrap values. With the exception of one species, the genetic data was in agreement with the phenotypic identification. The prevalent LAB, predominantly homofermentative (66%), consisted of Lactobacillus plantarum (24%), Lactococcus lactis (22%), Leuconostoc pseudomesenteroides (20%), Pediococcus acidilactici (11%), Lactobacillus brevis (11%), Enterococcus faecalis (7%), Weissella kimchii (3%), and Pediococcus pentosaceus (2%). The present study, the first to fully document rice-associated LAB, showed a very diverse community of LAB with a relatively high number of species involved in the fermentation process of paddy rice silage. The comprehensive 16S rDNA-based approach to describing LAB community structure was valuable in revealing the large diversity of bacteria inhabiting paddy rice silage and enabling the future design of appropriate inoculants aimed at improving its fermentation quality. PMID:12514026

  20. Phylogenetic diversity of lactic acid bacteria associated with paddy rice silage as determined by 16S ribosomal DNA analysis.

    PubMed

    Ennahar, Saïd; Cai, Yimin; Fujita, Yasuhito

    2003-01-01

    A total of 161 low-G+C-content gram-positive bacteria isolated from whole-crop paddy rice silage were classified and subjected to phenotypic and genetic analyses. Based on morphological and biochemical characters, these presumptive lactic acid bacterium (LAB) isolates were divided into 10 groups that included members of the genera Enterococcus, Lactobacillus, Lactococcus, Leuconostoc, Pediococcus, and Weissella. Analysis of the 16S ribosomal DNA (rDNA) was used to confirm the presence of the predominant groups indicated by phenotypic analysis and to determine the phylogenetic affiliation of representative strains. The virtually complete 16S rRNA gene was PCR amplified and sequenced. The sequences from the various LAB isolates showed high degrees of similarity to those of the GenBank reference strains (between 98.7 and 99.8%). Phylogenetic trees based on the 16S rDNA sequence displayed high consistency, with nodes supported by high bootstrap values. With the exception of one species, the genetic data was in agreement with the phenotypic identification. The prevalent LAB, predominantly homofermentative (66%), consisted of Lactobacillus plantarum (24%), Lactococcus lactis (22%), Leuconostoc pseudomesenteroides (20%), Pediococcus acidilactici (11%), Lactobacillus brevis (11%), Enterococcus faecalis (7%), Weissella kimchii (3%), and Pediococcus pentosaceus (2%). The present study, the first to fully document rice-associated LAB, showed a very diverse community of LAB with a relatively high number of species involved in the fermentation process of paddy rice silage. The comprehensive 16S rDNA-based approach to describing LAB community structure was valuable in revealing the large diversity of bacteria inhabiting paddy rice silage and enabling the future design of appropriate inoculants aimed at improving its fermentation quality.

  1. Comparison of the Diversity of Basidiomycetes from Dead Wood of the Manchurian fir (Abies holophylla) as Evaluated by Fruiting Body Collection, Mycelial Isolation, and 454 Sequencing.

    PubMed

    Jang, Yeongseon; Jang, Seokyoon; Min, Mihee; Hong, Joo-Hyun; Lee, Hanbyul; Lee, Hwanhwi; Lim, Young Woon; Kim, Jae-Jin

    2015-10-01

    In this study, three different methods (fruiting body collection, mycelial isolation, and 454 sequencing) were implemented to determine the diversity of wood-inhabiting basidiomycetes from dead Manchurian fir (Abies holophylla). The three methods recovered similar species richness (26 species from fruiting bodies, 32 species from mycelia, and 32 species from 454 sequencing), but Fisher's alpha, Shannon-Wiener, Simpson's diversity indices of fungal communities indicated fruiting body collection and mycelial isolation displayed higher diversity compared with 454 sequencing. In total, 75 wood-inhabiting basidiomycetes were detected. The most frequently observed species were Heterobasidion orientale (fruiting body collection), Bjerkandera adusta (mycelial isolation), and Trichaptum fusco-violaceum (454 sequencing). Only two species, Hymenochaete yasudae and Hypochnicium karstenii, were detected by all three methods. This result indicated that Manchurian fir harbors a diverse basidiomycetous fungal community and for complete estimation of fungal diversity, multiple methods should be used. Further studies are required to understand their ecology in the context of forest ecosystems.

  2. SNBRFinder: A Sequence-Based Hybrid Algorithm for Enhanced Prediction of Nucleic Acid-Binding Residues.

    PubMed

    Yang, Xiaoxia; Wang, Jia; Sun, Jun; Liu, Rong

    2015-01-01

    Protein-nucleic acid interactions are central to various fundamental biological processes. Automated methods capable of reliably identifying DNA- and RNA-binding residues in protein sequence are assuming ever-increasing importance. The majority of current algorithms rely on feature-based prediction, but their accuracy remains to be further improved. Here we propose a sequence-based hybrid algorithm SNBRFinder (Sequence-based Nucleic acid-Binding Residue Finder) by merging a feature predictor SNBRFinderF and a template predictor SNBRFinderT. SNBRFinderF was established using the support vector machine whose inputs include sequence profile and other complementary sequence descriptors, while SNBRFinderT was implemented with the sequence alignment algorithm based on profile hidden Markov models to capture the weakly homologous template of query sequence. Experimental results show that SNBRFinderF was clearly superior to the commonly used sequence profile-based predictor and SNBRFinderT can achieve comparable performance to the structure-based template methods. Leveraging the complementary relationship between these two predictors, SNBRFinder reasonably improved the performance of both DNA- and RNA-binding residue predictions. More importantly, the sequence-based hybrid prediction reached competitive performance relative to our previous structure-based counterpart. Our extensive and stringent comparisons show that SNBRFinder has obvious advantages over the existing sequence-based prediction algorithms. The value of our algorithm is highlighted by establishing an easy-to-use web server that is freely accessible at http://ibi.hzau.edu.cn/SNBRFinder.

  3. Molecular detection and sequence characterization of diverse rhabdoviruses in bats, China.

    PubMed

    Xu, Lin; Wu, Jianmin; Jiang, Tinglei; Qin, Shaomin; Xia, Lele; Li, Xingyu; He, Biao; Tu, Changchun

    2018-01-15

    The Rhabdoviridae is among the most diverse families of RNA viruses and is currently classified into 18 genera, with some rhabdoviruses lethal to humans and other animals. Herein, we describe genetic characterization of three novel rhabdoviruses from bats in China. Of these, two viruses (Jinghong bat virus and Benxi bat virus) found in Rhinolophus bats showed a phylogenetic relationship with vesiculoviruses, and sequence analyses indicate that they represent two new species within the genus Vesiculovirus. The remaining virus, Yangjiang bat virus, found in Hipposideros larvatus bats, was only distantly related to currently known rhabdoviruses. Copyright © 2017 Elsevier B.V. All rights reserved.

  4. Deep sequencing of the Trypanosoma cruzi GP63 surface proteases reveals diversity and diversifying selection among chronic and congenital Chagas disease patients.

    PubMed

    Llewellyn, Martin S; Messenger, Louisa A; Luquetti, Alejandro O; Garcia, Lineth; Torrico, Faustino; Tavares, Suelene B N; Cheaib, Bachar; Derome, Nicolas; Delepine, Marc; Baulard, Céline; Deleuze, Jean-Francois; Sauer, Sascha; Miles, Michael A

    2015-04-01

    Chagas disease results from infection with the diploid protozoan parasite Trypanosoma cruzi. T. cruzi is highly genetically diverse, and multiclonal infections in individual hosts are common, but little studied. In this study, we explore T. cruzi infection multiclonality in the context of age, sex and clinical profile among a cohort of chronic patients, as well as paired congenital cases from Cochabamba, Bolivia and Goias, Brazil using amplicon deep sequencing technology. A 450 bp fragment of the trypomastigote TcGP63I surface protease gene was amplified and sequenced across 70 chronic and 22 congenital cases on the Illumina MiSeq platform. In addition, a second, mitochondrial target, ND5, was sequenced across the same cohort of cases. Several million reads were generated, and sequencing read depths were normalized within patient cohorts (Goias chronic, n = 43; Goias congenital, n = 2; Bolivia chronic, n = 27; Bolivia congenital, n = 20). Among chronic cases, analyses of variance indicated no clear correlation between intra-host sequence diversity and age, sex or symptoms, while principal coordinate analyses showed no clustering by symptoms between patients. Between congenital pairs, we found evidence for the transmission of multiple sequence types from mother to infant, as well as widespread instances of novel genotypes in infants. Finally, non-synonymous to synonymous (dn:ds) nucleotide substitution ratios among sequences of TcGP63Ia and TcGP63Ib subfamilies within each cohort provided powerful evidence of strong diversifying selection at this locus. Our results shed light on the diversity of parasite DTUs within each patient, as well as the extent to which parasite strains pass between mother and foetus in congenital cases. Although we were unable to find any evidence that parasite diversity accumulates with age in our study cohorts, putative diversifying selection within members of the TcGP63I gene family suggests a link between genetic diversity within this gene

  5. Deep Sequencing of the Trypanosoma cruzi GP63 Surface Proteases Reveals Diversity and Diversifying Selection among Chronic and Congenital Chagas Disease Patients

    PubMed Central

    Llewellyn, Martin S.; Messenger, Louisa A.; Luquetti, Alejandro O.; Garcia, Lineth; Torrico, Faustino; Tavares, Suelene B. N.; Cheaib, Bachar; Derome, Nicolas; Delepine, Marc; Baulard, Céline; Deleuze, Jean-Francois; Sauer, Sascha; Miles, Michael A.

    2015-01-01

    Background: Chagas disease results from infection with the diploid protozoan parasite Trypanosoma cruzi. T. cruzi is highly genetically diverse, and multiclonal infections in individual hosts are common, but little studied. In this study, we explore T. cruzi infection multiclonality in the context of age, sex and clinical profile among a cohort of chronic patients, as well as paired congenital cases from Cochabamba, Bolivia and Goias, Brazil using amplicon deep sequencing technology. Methodology/Principal Findings: A 450 bp fragment of the trypomastigote TcGP63I surface protease gene was amplified and sequenced across 70 chronic and 22 congenital cases on the Illumina MiSeq platform. In addition, a second, mitochondrial target, ND5, was sequenced across the same cohort of cases. Several million reads were generated, and sequencing read depths were normalized within patient cohorts (Goias chronic, n = 43; Goias congenital, n = 2; Bolivia chronic, n = 27; Bolivia congenital, n = 20). Among chronic cases, analyses of variance indicated no clear correlation between intra-host sequence diversity and age, sex or symptoms, while principal coordinate analyses showed no clustering by symptoms between patients. Between congenital pairs, we found evidence for the transmission of multiple sequence types from mother to infant, as well as widespread instances of novel genotypes in infants. Finally, non-synonymous to synonymous (dn:ds) nucleotide substitution ratios among sequences of TcGP63Ia and TcGP63Ib subfamilies within each cohort provided powerful evidence of strong diversifying selection at this locus. Conclusions/Significance: Our results shed light on the diversity of parasite DTUs within each patient, as well as the extent to which parasite strains pass between mother and foetus in congenital cases. Although we were unable to find any evidence that parasite diversity accumulates with age in our study cohorts, putative diversifying selection within members of the TcGP63I

  6. Diversity and dynamics of lactic acid bacteria in Atole agrio, a traditional maize-based fermented beverage from South-Eastern Mexico, analysed by high throughput sequencing and culturing.

    PubMed

    Pérez-Cataluña, Alba; Elizaquível, Patricia; Carrasco, Purificación; Espinosa, Judith; Reyes, Dolores; Wacher, Carmen; Aznar, Rosa

    2018-03-01

    The purpose of this work was to analyse the diversity and dynamics of lactic acid bacteria (LAB) throughout the fermentation process in Atole agrio, a traditional maize-based food of Mexican origin. Samples from different fermentation times were analysed using culture-dependent and -independent approaches. Identification of LAB isolates revealed the presence of members of the genera Pediococcus, Weissella, Lactobacillus, Leuconostoc and Lactococcus, and the predominance of Pediococcus pentosaceus and Weissella confusa in liquid and solid batches, respectively. High-throughput sequencing (HTS) of the 16S rRNA gene confirmed the predominance of Lactobacillaceae and Leuconostocaceae at the beginning of the process. In liquid fermentation, Acetobacteraceae dominated after 4 h as pH decreased. In contrast, Leuconostocaceae dominated the solid fermentation except at 12 h, when they were overgrown by Acetobacteraceae. Regarding LAB genera, Lactobacillus dominated the liquid fermentation except at 12 h, when Weissella, Lactococcus and Streptococcus were the most abundant. In solid fermentation, Weissella predominated throughout the process. HTS determined that Lactobacillus plantarum and W. confusa dominated in the liquid and solid batches, respectively. Two oligotypes were identified for each of the L. plantarum and W. confusa populations, differing in a single nucleotide position each. Only one of the oligotypes was detected among the isolates obtained from each species, the biological significance of which remains unclear.

  7. CDSbank: taxonomy-aware extraction, selection, renaming and formatting of protein-coding DNA or amino acid sequences.

    PubMed

    Hazes, Bart

    2014-02-28

    Protein-coding DNA sequences and their corresponding amino acid sequences are routinely used to study relationships between sequence, structure, function, and evolution. The rapidly growing size of sequence databases increases the power of such comparative analyses but it makes it more challenging to prepare high quality sequence data sets with control over redundancy, quality, completeness, formatting, and labeling. Software tools for some individual steps in this process exist but manual intervention remains a common and time consuming necessity. CDSbank is a database that stores both the protein-coding DNA sequence (CDS) and amino acid sequence for each protein annotated in Genbank. CDSbank also stores Genbank feature annotation, a flag to indicate incomplete 5' and 3' ends, full taxonomic data, and a heuristic to rank the scientific interest of each species. This rich information allows fully automated data set preparation with a level of sophistication that aims to meet or exceed manual processing. Defaults ensure ease of use for typical scenarios while allowing great flexibility when needed. Access is via a free web server at http://hazeslab.med.ualberta.ca/CDSbank/. CDSbank presents a user-friendly web server to download, filter, format, and name large sequence data sets. Common usage scenarios can be accessed via pre-programmed default choices, while optional sections give full control over the processing pipeline. Particular strengths are: extract protein-coding DNA sequences just as easily as amino acid sequences, full access to taxonomy for labeling and filtering, awareness of incomplete sequences, and the ability to take one protein sequence and extract all synonymous CDS or identical protein sequences in other species. Finally, CDSbank can also create labeled property files to, for instance, annotate or re-label phylogenetic trees.

  8. Error correction and statistical analyses for intra-host comparisons of feline immunodeficiency virus diversity from high-throughput sequencing data.

    PubMed

    Liu, Yang; Chiaromonte, Francesca; Ross, Howard; Malhotra, Raunaq; Elleder, Daniel; Poss, Mary

    2015-06-30

    Infection with feline immunodeficiency virus (FIV) causes an immunosuppressive disease whose consequences are less severe if cats are co-infected with an attenuated FIV strain (PLV). We use virus diversity measurements, which reflect replication ability and the virus response to various conditions, to test whether diversity of virulent FIV in lymphoid tissues is altered in the presence of PLV. Our data consisted of the 3' half of the FIV genome from three tissues of animals infected with FIV alone, or with FIV and PLV, sequenced by 454 technology. Since rare variants dominate virus populations, we had to carefully distinguish sequence variation from errors due to experimental protocols and sequencing. We considered an exponential-normal convolution model used for background correction of microarray data, and modified it to formulate an error correction approach for minor allele frequencies derived from high-throughput sequencing. Similar to accounting for over-dispersion in counts, this accounts for error-inflated variability in frequencies - and quite effectively reproduces empirically observed distributions. After obtaining error-corrected minor allele frequencies, we applied ANalysis Of VAriance (ANOVA) based on a linear mixed model and found that conserved sites and transition frequencies in FIV genes differ among tissues of dual and single infected cats. Furthermore, analysis of minor allele frequencies at individual FIV genome sites revealed 242 sites significantly affected by infection status (dual vs. single) or infection status by tissue interaction. All together, our results demonstrated a decrease in FIV diversity in bone marrow in the presence of PLV. Importantly, these effects were weakened or undetectable when error correction was performed with other approaches (thresholding of minor allele frequencies; probabilistic clustering of reads). 
We also queried the data for cytidine deaminase activity on the viral genome, which causes an asymmetric increase

  9. ANCAC: amino acid, nucleotide, and codon analysis of COGs--a tool for sequence bias analysis in microbial orthologs.

    PubMed

    Meiler, Arno; Klinger, Claudia; Kaufmann, Michael

    2012-09-08

    The COG database is the most popular collection of orthologous proteins from many different completely sequenced microbial genomes. Per definition, a cluster of orthologous groups (COG) within this database exclusively contains proteins that most likely achieve the same cellular function. Recently, the COG database was extended by assigning to every protein both the corresponding amino acid and its encoding nucleotide sequence resulting in the NUCOCOG database. This extended version of the COG database is a valuable resource connecting sequence features with the functionality of the respective proteins. Here we present ANCAC, a web tool and MySQL database for the analysis of amino acid, nucleotide, and codon frequencies in COGs on the basis of freely definable phylogenetic patterns. We demonstrate the usefulness of ANCAC by analyzing amino acid frequencies, codon usage, and GC-content in a species- or function-specific context. With respect to amino acids we, at least in part, confirm the cognate bias hypothesis by using ANCAC's NUCOCOG dataset as the largest one available for that purpose thus far. Using the NUCOCOG datasets, ANCAC connects taxonomic, amino acid, and nucleotide sequence information with the functional classification via COGs and provides a GUI for flexible mining for sequence-bias. Thereby, to our knowledge, it is the only tool for the analysis of sequence composition in the light of physiological roles and phylogenetic context without requirement of substantial programming-skills.
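
    The composition measures named in this abstract (GC-content and codon frequencies) can be illustrated with a minimal Python sketch. It is not ANCAC's implementation, which the abstract does not describe; the function names and the example sequence are hypothetical, and the input is assumed to be an in-frame coding sequence containing only A, C, G and T.

```python
from collections import Counter


def gc_content(cds):
    """Fraction of G and C bases in a nucleotide sequence."""
    seq = cds.upper()
    if not seq:
        raise ValueError("empty sequence")
    return (seq.count("G") + seq.count("C")) / len(seq)


def codon_frequencies(cds):
    """Relative frequency of each codon in an in-frame coding sequence."""
    seq = cds.upper()
    if len(seq) % 3 != 0:
        raise ValueError("CDS length must be a multiple of three")
    # Split the sequence into consecutive, non-overlapping triplets.
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    counts = Counter(codons)
    total = len(codons)
    return {codon: n / total for codon, n in counts.items()}


# Hypothetical four-codon example: ATG GCC GCC TAA
example = "ATGGCCGCCTAA"
print(gc_content(example))         # 7 of 12 bases are G or C
print(codon_frequencies(example))  # GCC makes up half of the codons
```

    Grouping such per-sequence values by species or by COG functional category would give the kind of species- or function-specific comparison the abstract mentions.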

  10. Draft Genome Sequence of Streptomyces clavuligerus NRRL 3585, a Producer of Diverse Secondary Metabolites

    PubMed Central

    Song, Ju Yeon; Jeong, Haeyoung; Yu, Dong Su; Fischbach, Michael A.; Park, Hong-Seog; Kim, Jae Jong; Seo, Jeong-Sun; Jensen, Susan E.; Oh, Tae Kwang; Lee, Kye Joon; Kim, Jihyun F.

    2010-01-01

    Streptomyces clavuligerus is an important industrial strain that produces a number of antibiotics, including clavulanic acid and cephamycin C. A high-quality draft genome sequence of the S. clavuligerus NRRL 3585 strain was produced by employing a hybrid approach that involved Sanger sequencing, Roche/454 pyrosequencing, optical mapping, and partial finishing. Its genome, comprising four linear replicons, one chromosome, and four plasmids, carries numerous sets of genes involved in the biosynthesis of secondary metabolites, including a variety of antibiotics. PMID:20889745

  11. High-Throughput Ligand Discovery Reveals a Sitewise Gradient of Diversity in Broadly Evolved Hydrophilic Fibronectin Domains

    PubMed Central

    Woldring, Daniel R.; Holec, Patrick V.; Zhou, Hong; Hackel, Benjamin J.

    2015-01-01

    Discovering new binding function via a combinatorial library in small protein scaffolds requires balance between appropriate mutations to introduce favorable intermolecular interactions while maintaining intramolecular integrity. Sitewise constraints exist in a non-spatial gradient from diverse to conserved in evolved antibody repertoires; yet non-antibody scaffolds generally do not implement this strategy in combinatorial libraries. Despite the fact that biased amino acid distributions, typically elevated in tyrosine, serine, and glycine, have gained wider use in synthetic scaffolds, these distributions are still predominantly applied uniformly to diversified sites. While select sites in fibronectin domains and DARPins have shown benefit from sitewise designs, they have not been deeply evaluated. Inspired by this disparity between diversity distributions in natural libraries and synthetic scaffold libraries, we hypothesized that binders resulting from discovery and evolution would exhibit a non-spatial, sitewise gradient of amino acid diversity. To identify sitewise diversities consistent with efficient evolution in the context of a hydrophilic fibronectin domain, >10^5 binders to six targets were evolved and sequenced. Evolutionarily favorable amino acid distributions at 25 sites reveal Shannon entropies (range: 0.3–3.9; median: 2.1; standard deviation: 1.1) supporting the diversity gradient hypothesis. Sitewise constraints in evolved sequences are consistent with complementarity, stability, and consensus biases. Implementation of sitewise constrained diversity enables direct selection of nanomolar affinity binders validating an efficient strategy to balance inter- and intra-molecular interaction demands at each site. PMID:26383268
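
    The sitewise Shannon entropy reported in this abstract can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function name and the toy binder sequences are hypothetical, and the use of base-2 logarithms (bits) is an assumption, since the abstract does not state the base. Under that assumption a site with all 20 amino acids equally frequent would reach log2(20), about 4.32 bits.

```python
import math
from collections import Counter


def sitewise_entropy(sequences):
    """Shannon entropy in bits for each column of equal-length sequences."""
    if not sequences:
        return []
    length = len(sequences[0])
    if any(len(s) != length for s in sequences):
        raise ValueError("all sequences must have the same length")
    entropies = []
    for i in range(length):
        # Count how often each residue occurs at position i.
        counts = Counter(s[i] for s in sequences)
        total = sum(counts.values())
        # H = sum over residues of p * log2(1 / p)
        h = sum((c / total) * math.log2(total / c) for c in counts.values())
        entropies.append(h)
    return entropies


# Hypothetical three-residue binders: site 1 conserved, site 3 fully diverse.
binders = ["YSG", "YSA", "YGD", "YTK"]
print(sitewise_entropy(binders))  # [0.0, 1.5, 2.0]
```

    Sorting sites by these values would show a conserved-to-diverse ordering of the kind the abstract calls a sitewise gradient.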

  12. Captured metagenomics: large-scale targeting of genes based on ‘sequence capture’ reveals functional diversity in soils

    PubMed Central

    Manoharan, Lokeshwaran; Kushwaha, Sandeep K.; Hedlund, Katarina; Ahrén, Dag

    2015-01-01

    Microbial enzyme diversity is key to understanding many ecosystem processes. Whole metagenome sequencing (WMG) obtains information on functional genes, but it is costly and inefficient due to the large amount of sequencing that is required. In this study, we have applied a captured metagenomics technique for functional genes in soil microorganisms, as an alternative to WMG. Large-scale targeting of functional genes, coding for enzymes related to organic matter degradation, was applied to two agricultural soil communities through captured metagenomics. Captured metagenomics uses custom-designed, hybridization-based oligonucleotide probes that enrich functional genes of interest in metagenomic libraries where only probe-bound DNA fragments are sequenced. The captured metagenomes were highly enriched with targeted genes while maintaining their target diversity, and their taxonomic distribution correlated well with the traditional ribosomal sequencing. The captured metagenomes were highly enriched with genes related to organic matter degradation; at least five times more than similar, publicly available soil WMG projects. This target enrichment technique also preserves the functional representation of the soils, thereby facilitating comparative metagenomics projects. Here, we present the first study that applies the captured metagenomics approach at large scale, and this novel method allows deep investigations of central ecosystem processes by studying functional gene abundances. PMID:26490729

  13. Genetic diversity and molecular evolution of Naga King Chili inferred from internal transcribed spacer sequence of nuclear ribosomal DNA.

    PubMed

    Kehie, Mechuselie; Kumaria, Suman; Devi, Khumuckcham Sangeeta; Tandon, Pramod

    2016-02-01

    Sequences of the Internal Transcribed Spacer (ITS1-5.8S-ITS2) of nuclear ribosomal DNAs were explored to study the genetic diversity and molecular evolution of Naga King Chili. Our study indicated the occurrence of nucleotide polymorphism and haplotypic diversity in the ITS regions. The present study demonstrated that the variability of ITS1 with respect to nucleotide diversity and sequence polymorphism exceeded that of ITS2. Sequence analysis of the 5.8S gene revealed a highly conserved region in all the accessions of Naga King Chili. However, a strong source of phylogenetic information for this species is the distinct 13 bp deletion in the 5.8S gene, which discriminated Naga King Chili from the rest of the Capsicum sp. Neutrality test results implied neutral variation, and the population seems to be evolving at drift-mutation equilibrium and free from directed selection pressure. Furthermore, mismatch analysis showed a multimodal curve indicating a demographic equilibrium. Phylogenetic relationships revealed by Median Joining Network (MJN) analysis denoted a clear discrimination of Naga King Chili from its closest sister species (Capsicum chinense and Capsicum frutescens). The absence of a star-like network of haplotypes suggested an ancient population expansion of this chili.

  14. Two Theileria parva CD8 T cell antigen genes are more variable in buffalo than cattle parasites, but differ in pattern of sequence diversity.

    PubMed

    Pelle, Roger; Graham, Simon P; Njahira, Moses N; Osaso, Julius; Saya, Rosemary M; Odongo, David O; Toye, Philip G; Spooner, Paul R; Musoke, Anthony J; Mwangi, Duncan M; Taracha, Evans L N; Morrison, W Ivan; Weir, William; Silva, Joana C; Bishop, Richard P

    2011-04-29

    Theileria parva causes an acute fatal disease in cattle, but infections are asymptomatic in the African buffalo (Syncerus caffer). Cattle can be immunized against the parasite by infection and treatment, but immunity is partially strain specific. Available data indicate that CD8(+) T lymphocyte responses mediate protection and, recently, several parasite antigens recognised by CD8(+) T cells have been identified. This study set out to determine the nature and extent of polymorphism in two of these antigens, Tp1 and Tp2, which contain defined CD8(+) T-cell epitopes, and to analyse the sequences for evidence of selection. Partial sequencing of the Tp1 gene and the full-length Tp2 gene from 82 T. parva isolates revealed extensive polymorphism in both antigens, including the epitope-containing regions. Single nucleotide polymorphisms were detected at 51 positions (∼12%) in Tp1 and in 320 positions (∼61%) in Tp2. Together with two short indels in Tp1, these resulted in 30 and 42 protein variants of Tp1 and Tp2, respectively. Although evidence of positive selection was found for multiple amino acid residues, there was no preferential involvement of T cell epitope residues. Overall, the extent of diversity was much greater in T. parva isolates originating from buffalo than in isolates known to be transmissible among cattle. The results indicate that T. parva parasites maintained in cattle represent a subset of the overall T. parva population, which has become adapted for tick transmission between cattle. The absence of obvious enrichment for positively selected amino acid residues within defined epitopes indicates either that diversity is not predominantly driven by selection exerted by host T cells, or that such selection is not detectable by the methods employed due to unidentified epitopes elsewhere in the antigens. Further functional studies are required to address this latter point.

  15. Identification and Analysis of Novel Amino-Acid Sequence Repeats in Bacillus anthracis str. Ames Proteome Using Computational Tools

    PubMed Central

    Hemalatha, G. R.; Rao, D. Satyanarayana; Guruprasad, L.

    2007-01-01

    We have identified four repeats and ten domains that are novel in proteins encoded by the Bacillus anthracis str. Ames proteome using automated in silico methods. A “repeat” corresponds to a region comprising fewer than 55 amino acid residues that occurs more than once in the protein sequence and is sometimes present in tandem. A “domain” corresponds to a conserved region with more than 55 amino acid residues and may be present as single or multiple copies in the protein sequence. These correspond to (1) 57-amino-acid-residue PxV domain, (2) 122-amino-acid-residue FxF domain, (3) 111-amino-acid-residue YEFF domain, (4) 109-amino-acid-residue IMxxH domain, (5) 103-amino-acid-residue VxxT domain, (6) 84-amino-acid-residue ExW domain, (7) 104-amino-acid-residue NTGFIG domain, (8) 36-amino-acid-residue NxGK repeat, (9) 95-amino-acid-residue VYV domain, (10) 75-amino-acid-residue KEWE domain, (11) 59-amino-acid-residue AFL domain, (12) 53-amino-acid-residue RIDVK repeat, (13) (a) 41-amino-acid-residue AGQF repeat and (b) 42-amino-acid-residue GSAL repeat. A repeat or domain type is characterized by specific conserved sequence motifs. We discuss the presence of these repeats and domains in proteins from other genomes and their probable secondary structure. PMID:17538688

  16. Whole-Genome Sequencing of Pharmacogenetic Drug Response in Racially Diverse Children with Asthma.

    PubMed

    Mak, Angel C Y; White, Marquitta J; Eckalbar, Walter L; Szpiech, Zachary A; Oh, Sam S; Pino-Yanes, Maria; Hu, Donglei; Goddard, Pagé; Huntsman, Scott; Galanter, Joshua; Wu, Ann Chen; Himes, Blanca E; Germer, Soren; Vogel, Julia M; Bunting, Karen L; Eng, Celeste; Salazar, Sandra; Keys, Kevin L; Liberto, Jennifer; Nuckton, Thomas J; Nguyen, Thomas A; Torgerson, Dara G; Kwok, Pui-Yan; Levin, Albert M; Celedón, Juan C; Forno, Erick; Hakonarson, Hakon; Sleiman, Patrick M; Dahlin, Amber; Tantisira, Kelan G; Weiss, Scott T; Serebrisky, Denise; Brigino-Buenaventura, Emerita; Farber, Harold J; Meade, Kelley; Lenoir, Michael A; Avila, Pedro C; Sen, Saunak; Thyne, Shannon M; Rodriguez-Cintron, William; Winkler, Cheryl A; Moreno-Estrada, Andrés; Sandoval, Karla; Rodriguez-Santana, Jose R; Kumar, Rajesh; Williams, L Keoki; Ahituv, Nadav; Ziv, Elad; Seibold, Max A; Darnell, Robert B; Zaitlen, Noah; Hernandez, Ryan D; Burchard, Esteban G

    2018-06-15

    Albuterol, a bronchodilator medication, is the first-line therapy for asthma worldwide. There are significant racial/ethnic differences in albuterol drug response. To identify genetic variants important for bronchodilator drug response (BDR) in racially diverse children. We performed the first whole-genome sequencing pharmacogenetics study from 1,441 children with asthma from the tails of the BDR distribution to identify genetic association with BDR. We identified population-specific and shared genetic variants associated with BDR, including genome-wide significant (P < 3.53 × 10^-7) and suggestive (P < 7.06 × 10^-6) loci near genes previously associated with lung capacity (DNAH5), immunity (NFKB1 and PLCB1), and β-adrenergic signaling (ADAMTS3 and COX18). Functional analyses of the BDR-associated SNP in NFKB1 revealed potential regulatory function in bronchial smooth muscle cells. The SNP is also an expression quantitative trait locus for a neighboring gene, SLC39A8. The lack of other asthma study populations with BDR and whole-genome sequencing data on minority children makes it impossible to perform replication of our rare variant associations. Minority underrepresentation also poses significant challenges to identify age-matched and population-matched cohorts of sufficient sample size for replication of our common variant findings. The lack of minority data, despite a collaboration of eight universities and 13 individual laboratories, highlights the urgent need for a dedicated national effort to prioritize diversity in research. Our study expands the understanding of pharmacogenetic analyses in racially/ethnically diverse populations and advances the foundation for precision medicine in at-risk and understudied minority populations.

  17. Human Retroviruses and AIDS. A compilation and analysis of nucleic acid and amino acid sequences: I--II; III--V

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Myers, G.; Korber, B.; Wain-Hobson, S.

    1993-12-31

    This compendium and the accompanying floppy diskettes are the result of an effort to compile and rapidly publish all relevant molecular data concerning the human immunodeficiency viruses (HIV) and related retroviruses. The scope of the compendium and database is best summarized by the five parts that it comprises: (I) HIV and SIV Nucleotide Sequences; (II) Amino Acid Sequences; (III) Analyses; (IV) Related Sequences; and (V) Database Communications. Information within all the parts is updated at least twice in each year, which accounts for the modes of binding and pagination in the compendium.

  18. Sequence Variation of the tRNALeu Intron as a Marker for Genetic Diversity and Specificity of Symbiotic Cyanobacteria in Some Lichens

    PubMed Central

    Paulsrud, Per; Lindblad, Peter

    1998-01-01

    We examined the genetic diversity of Nostoc symbionts in some lichens by using the tRNALeu (UAA) intron as a genetic marker. The nucleotide sequence was analyzed in the context of the secondary structure of the transcribed intron. Cyanobacterial tRNALeu (UAA) introns were specifically amplified from freshly collected lichen samples without previous DNA extraction. The lichen species used in the present study were Nephroma arcticum, Peltigera aphthosa, P. membranacea, and P. canina. Introns with different sizes around 300 bp were consistently obtained. Multiple clones from single PCRs were screened by using their single-stranded conformational polymorphism pattern, and the nucleotide sequence was determined. No evidence for sample heterogeneity was found. This implies that the symbiont in situ is not a diverse community of cyanobionts but, rather, one Nostoc strain. Furthermore, each lichen thallus contained only one intron type, indicating that each thallus is colonized only once or that there is a high degree of specificity. The same cyanobacterial intron sequence was also found in samples of one lichen species from different localities. In a phylogenetic analysis, the cyanobacterial lichen sequences grouped together with the sequences from two free-living Nostoc strains. The size differences in the intron were due to insertions and deletions in highly variable regions. The sequence data were used in discussions concerning specificity and biology of the lichen symbiosis. It is concluded that the tRNALeu (UAA) intron can be of great value when examining cyanobacterial diversity. PMID:9435083

  19. Development of Genomic Microsatellite Markers in Carthamus tinctorius L. (Safflower) Using Next Generation Sequencing and Assessment of Their Cross-Species Transferability and Utility for Diversity Analysis

    PubMed Central

    Variath, Murali Tottekkad; Joshi, Gopal; Bali, Sapinder; Agarwal, Manu; Kumar, Amar; Jagannath, Arun; Goel, Shailendra

    2015-01-01

    Background: Safflower (Carthamus tinctorius L.), an Asteraceae member, yields high quality edible oil rich in unsaturated fatty acids and is resilient to dry conditions. The crop holds tremendous potential for improvement through concerted molecular breeding programs due to the availability of significant genetic and phenotypic diversity. Genomic resources that could facilitate such breeding programs remain largely underdeveloped in the crop. The present study was initiated to develop a large set of novel microsatellite markers for safflower using next generation sequencing. Principal Findings: Low throughput genome sequencing of safflower was performed using Illumina paired end technology providing ~3.5X coverage of the genome. Analysis of sequencing data allowed identification of 23,067 regions harboring perfect microsatellite loci. The safflower genome was found to be rich in dinucleotide repeats followed by tri-, tetra-, penta- and hexa-nucleotides. Primer pairs were designed for 5,716 novel microsatellite sequences with repeat length ≥ 20 bases and optimal flanking regions. A subset of 325 microsatellite loci was tested for amplification, of which 294 loci produced robust amplification. The validated primers were used for assessment of 23 safflower accessions belonging to diverse agro-climatic zones of the world leading to identification of 93 polymorphic primers (31.6%). The numbers of observed alleles at each locus ranged from two to four and mean polymorphism information content was found to be 0.3075. The polymorphic primers were tested for cross-species transferability on nine wild relatives of cultivated safflower. All primers except one showed amplification in at least two wild species while 25 primers amplified across all the nine species. The UPGMA dendrogram clustered C. tinctorius accessions and wild species separately into two major groups. The proposed progenitor species of safflower, C. oxyacantha and C. palaestinus were genetically closer to

  20. Diversity and distribution of unicellular opisthokonts along the European coast analysed using high-throughput sequencing.

    PubMed

    Del Campo, Javier; Mallo, Diego; Massana, Ramon; de Vargas, Colomban; Richards, Thomas A; Ruiz-Trillo, Iñaki

    2015-09-01

    The opisthokonts are one of the major super groups of eukaryotes. It comprises two major clades: (i) the Metazoa and their unicellular relatives and (ii) the Fungi and their unicellular relatives. There is, however, little knowledge of the role of opisthokont microbes in many natural environments, especially among non-metazoan and non-fungal opisthokonts. Here, we begin to address this gap by analysing high-throughput 18S rDNA and 18S rRNA sequencing data from different European coastal sites, sampled at different size fractions and depths. In particular, we analyse the diversity and abundance of choanoflagellates, filastereans, ichthyosporeans, nucleariids, corallochytreans and their related lineages. Our results show the great diversity of choanoflagellates in coastal waters as well as a relevant representation of the ichthyosporeans and the uncultured marine opisthokonts (MAOP). Furthermore, we describe a new lineage of marine fonticulids (MAFO) that appears to be abundant in sediments. Taken together, our work points to a greater potential ecological role for unicellular opisthokonts than previously appreciated in marine environments, both in water column and sediments, and also provides evidence of novel opisthokont phylogenetic lineages. This study highlights the importance of high-throughput sequencing approaches to unravel the diversity and distribution of both known and novel eukaryotic lineages. © 2014 Society for Applied Microbiology and John Wiley & Sons Ltd.

  1. Modeling backbone flexibility to achieve sequence diversity: The design of novel alpha-helical ligands for Bcl-xL

    PubMed Central

    Fu, Xiaoran; Apgar, James R.; Keating, Amy E.

    2007-01-01

    Computational protein design can be used to select sequences that are compatible with a fixed-backbone template. This strategy has been used in numerous instances to engineer novel proteins. However, the fixed-backbone assumption severely restricts the sequence space that is accessible via design. For challenging problems, such as the design of functional proteins, this may not be acceptable. In this paper, we present a method for introducing backbone flexibility into protein design calculations and apply it to the design of diverse helical BH3 ligands that bind to the anti-apoptotic protein Bcl-xL, a member of the Bcl-2 protein family. We demonstrate how normal mode analysis can be used to sample different BH3 backbones, and show that this leads to a larger and more diverse set of low-energy solutions than can be achieved using a native high-resolution Bcl-xL complex crystal structure as a template. We tested several of the designed solutions experimentally and found that this approach worked well when normal mode calculations were used to deform a native BH3 helix structure, but less well when they were used to deform an idealized helix. A subsequent round of design and testing identified a likely source of the problem as inadequate sampling of the helix pitch. In all, we tested seventeen designed BH3 peptide sequences, including several point mutants. Of these, eight bound well to Bcl-xL and four others showed weak but detectable binding. The successful designs showed a diversity of sequences that would have been difficult or impossible to achieve using only a fixed backbone. Thus, introducing backbone flexibility via normal mode analysis effectively broadened the set of sequences identified by computational design, and provided insight into positions important for binding Bcl-xL. PMID:17597151

  2. Sequence and phylogenetic analysis of chicken anaemia virus obtained from backyard and commercial chickens in Nigeria.

    PubMed

    Oluwayelu, D O; Todd, D; Olaleye, O D

    2008-12-01

    This work reports the first molecular analysis study of chicken anaemia virus (CAV) in backyard chickens in Africa using molecular cloning and sequence analysis to characterize CAV strains obtained from commercial chickens and Nigerian backyard chickens. Partial VP1 gene sequences were determined for three CAVs from commercial chickens and for six CAV variants present in samples from a backyard chicken. Multiple alignment analysis revealed that the 6% and 4% nucleotide diversity obtained respectively for the commercial and backyard chicken strains translated to only 2% amino acid diversity for each breed. Overall, the amino acid composition of Nigerian CAVs was found to be highly conserved. Since the partial VP1 gene sequences of two backyard chicken cloned CAV strains (NGR/CI-8 and NGR/CI-9) were almost identical and evolutionarily closely related to the commercial chicken strains NGR-1, and NGR-4 and NGR-5, respectively, we concluded that CAV infections had crossed the farm boundary.

  3. Application of RAD Sequencing for Evaluating the Genetic Diversity of Domesticated Panax notoginseng (Araliaceae)

    PubMed Central

    Pan, Yuezhi; Wang, Xueqin; Sun, Guiling; Li, Fusheng; Gong, Xun

    2016-01-01

    Panax notoginseng, a traditional Chinese medicinal plant, has been cultivated and domesticated for approximately 400 years, mainly in Yunnan and Guangxi, two provinces in southwest China. This species was named according to cultivated rather than wild individuals, and no wild populations had been found until now. The genetic resources available on farms are important for both breeding practices and resource conservation. In the present study, the recently developed technology RADseq, which is based on next-generation sequencing, was used to analyze the genetic variation and differentiation of P. notoginseng. The nucleotide diversity and heterozygosity results indicated that P. notoginseng had low genetic diversity at both the species and population levels. Almost no genetic differentiation has been detected, and all populations were genetically similar due to strong gene flow and insufficient splitting time. Although the genetic diversity of P. notoginseng was low at both species and population levels, several traditional plantations had relatively high genetic diversity, as revealed by the He and π values and by the private allele numbers. These valuable genetic resources should be protected as soon as possible to facilitate future breeding projects. The possible geographical origin of Sanqi domestication was discussed based on the results of the genetic diversity analysis. PMID:27846268
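
    The two diversity statistics named in this abstract, nucleotide diversity (π) and expected heterozygosity (He), can be sketched from their textbook definitions. This is an illustration only: the abstract does not say which software or estimator variants were used, so the function names, the toy sequences and the uncorrected formulas below are assumptions. Here π is the mean number of pairwise differences per site among aligned sequences, and He is one minus the sum of squared allele frequencies at a locus.

```python
from itertools import combinations


def nucleotide_diversity(seqs):
    """Mean pairwise differences per site (pi) for aligned sequences."""
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if length == 0 or any(len(s) != length for s in seqs):
        raise ValueError("sequences must be non-empty and equally long")
    pairs = list(combinations(seqs, 2))
    # Count mismatching positions over every pair of sequences.
    diffs = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs)
    return diffs / (len(pairs) * length)


def expected_heterozygosity(allele_counts):
    """He = 1 - sum(p_i^2) from allele counts at one locus (no bias correction)."""
    total = sum(allele_counts)
    if total == 0:
        raise ValueError("no alleles counted")
    return 1.0 - sum((c / total) ** 2 for c in allele_counts)


# Hypothetical aligned haplotypes and allele counts.
haplotypes = ["ACGT", "ACGA", "ACGT", "TCGT"]
print(nucleotide_diversity(haplotypes))  # 0.25
print(expected_heterozygosity([3, 1]))   # 0.375
```

    Low values of both statistics across populations would correspond to the low genetic diversity the abstract reports.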

  4. GCPred: a web tool for guanylyl cyclase functional centre prediction from amino acid sequence.

    PubMed

    Xu, Nuo; Fu, Dongfang; Li, Shiang; Wang, Yuxuan; Wong, Aloysius

    2018-06-15

    GCPred is a webserver for the prediction of guanylyl cyclase (GC) functional centres from amino acid sequence. GCs are enzymes that generate the signalling molecule cyclic guanosine 3', 5'-monophosphate from guanosine-5'-triphosphate. A novel class of GC centres (GCCs) has been identified in complex plant proteins. Using currently available experimental data, GCPred is created to automate and facilitate the identification of similar GCCs. The server features GCC values whose calculation considers the physicochemical properties of the amino acids constituting the GCC and the conserved amino acids within the centre. From a user-supplied amino acid sequence, the server returns a table of GCC values and graphs depicting deviations from mean values. The utility of this server is demonstrated using plant proteins and the human interleukin-1 receptor-associated kinase family of proteins as examples. The GCPred server is available at http://gcpred.com. Supplementary data are available at Bioinformatics online.

  5. Reading biological processes from nucleotide sequences

    NASA Astrophysics Data System (ADS)

    Murugan, Anand

    Cellular processes have traditionally been investigated by techniques of imaging and biochemical analysis of the molecules involved. The recent rapid progress in our ability to manipulate and read nucleic acid sequences gives us direct access to the genetic information that directs and constrains biological processes. While sequence data is being used widely to investigate genotype-phenotype relationships and population structure, here we use sequencing to understand biophysical mechanisms. We present work on two different systems. First, in chapter 2, we characterize the stochastic genetic editing mechanism that produces diverse T-cell receptors in the human immune system. We do this by inferring statistical distributions of the underlying biochemical events that generate T-cell receptor coding sequences from the statistics of the observed sequences. This inferred model quantitatively describes the potential repertoire of T-cell receptors that can be produced by an individual, providing insight into its potential diversity and the probability of generation of any specific T-cell receptor. Then in chapter 3, we present work on understanding the functioning of regulatory DNA sequences in both prokaryotes and eukaryotes. Here we use experiments that measure the transcriptional activity of large libraries of mutagenized promoters and enhancers and infer models of the sequence-function relationship from this data. For the bacterial promoter, we infer a physically motivated 'thermodynamic' model of the interaction of DNA-binding proteins and RNA polymerase determining the transcription rate of the downstream gene. For the eukaryotic enhancers, we infer heuristic models of the sequence-function relationship and use these models to find synthetic enhancer sequences that optimize inducibility of expression. 
Both projects demonstrate the utility of sequence information in conjunction with sophisticated statistical inference techniques for dissecting underlying biophysical

  6. Raw Sewage Harbors Diverse Viral Populations

    PubMed Central

    Cantalupo, Paul G.; Calgua, Byron; Zhao, Guoyan; Hundesa, Ayalkibet; Wier, Adam D.; Katz, Josh P.; Grabe, Michael; Hendrix, Roger W.; Girones, Rosina; Wang, David; Pipas, James M.

    2011-01-01

    At this time, about 3,000 different viruses are recognized, but metagenomic studies suggest that these viruses are a small fraction of the viruses that exist in nature. We have explored viral diversity by deep sequencing nucleic acids obtained from virion populations enriched from raw sewage. We identified 234 known viruses, including 17 that infect humans. Plant, insect, and algal viruses as well as bacteriophages were also present. These viruses represented 26 taxonomic families and included viruses with single-stranded DNA (ssDNA), double-stranded DNA (dsDNA), positive-sense ssRNA [ssRNA(+)], and dsRNA genomes. Novel viruses that could be placed in specific taxa represented 51 different families, making untreated wastewater the most diverse viral metagenome (genetic material recovered directly from environmental samples) examined thus far. However, the vast majority of sequence reads bore little or no sequence relation to known viruses and thus could not be placed into specific taxa. These results show that the vast majority of the viruses on Earth have not yet been characterized. Untreated wastewater provides a rich matrix for identifying novel viruses and for studying virus diversity. Importance: At this time, virology is focused on the study of a relatively small number of viral species. Specific viruses are studied either because they are easily propagated in the laboratory or because they are associated with disease. The lack of knowledge of the size and characteristics of the viral universe and the diversity of viral genomes is a roadblock to understanding important issues, such as the origin of emerging pathogens and the extent of gene exchange among viruses. Untreated wastewater is an ideal system for assessing viral diversity because virion populations from large numbers of individuals are deposited and because raw sewage itself provides a rich environment for the growth of diverse host species and thus their viruses. 
These studies suggest that

  7. Fungal diversity in grape must and wine fermentation assessed by massive sequencing, quantitative PCR and DGGE

    PubMed Central

    Wang, Chunxiao; García-Fernández, David; Mas, Albert; Esteve-Zarzoso, Braulio

    2015-01-01

    The diversity of fungi in grape must and during wine fermentation was investigated in this study by culture-dependent and culture-independent techniques. Carignan and Grenache grapes were harvested from three vineyards in the Priorat region (Spain) in 2012, and nine samples were selected from the grape must after crushing and during wine fermentation. From culture-dependent techniques, 362 isolates were randomly selected and identified by 5.8S-ITS-RFLP and 26S-D1/D2 sequencing. Meanwhile, genomic DNA was extracted directly from the nine samples and analyzed by qPCR, DGGE and massive sequencing. The results indicated that grape must after crushing harbored a high species richness of fungi with Aspergillus tubingensis, Aureobasidium pullulans, or Starmerella bacillaris as the dominant species. As fermentation proceeded, the species richness decreased, and yeasts such as Hanseniaspora uvarum, Starmerella bacillaris and Saccharomyces cerevisiae successively occupied the must samples. The “terroir” characteristics of the fungus population are more related to the location of the vineyard than to grape variety. Similarity analysis showed that sulfur dioxide treatment had little effect on yeast diversity. Because of the large population of fungi on grape berries, massive sequencing was more appropriate for understanding the fungal community in grape must after crushing than the other techniques used in this study. Suitable target sequences and databases were necessary for accurate evaluation of the community and the identification of species by the 454 pyrosequencing of amplicons. PMID:26557110

  8. Microbial Diversity and Its Relationship to Physicochemical Characteristics of the Water in Two Extreme Acidic Pit Lakes from the Iberian Pyrite Belt (SW Spain)

    PubMed Central

    López-Pamo, Enrique; Gomariz, María; Amils, Ricardo; Aguilera, Ángeles

    2013-01-01

    The Iberian Pyrite Belt (IPB) hosts one of the world’s largest accumulations of acidic mine wastes and pit lakes. The mineralogical and textural characteristics of the IPB ores have favored the oxidation and dissolution of metallic sulfides, mainly pyrite, and the subsequent formation of acidic mining drainages. This work reports the physical properties, hydrogeochemical characteristics, and microbial diversity of two pit lakes located in the IPB. Both pit lakes are acidic and showed high concentrations of sulfate and dissolved metals. Concentrations of sulfate and heavy metals were higher in the Nuestra Señora del Carmen lake (NSC) by one order of magnitude than in the Concepción (CN) lake. The hydrochemical characteristics of NSC were typical of acid mine waters and can be compared with other acidic environments. When compared to other IPB acidic pit lakes, the superficial water of CN is more diluted than that of any of the others due, probably, to the strong influence of runoff water. Both pit lakes showed chemical and thermal stratification with well defined chemoclines. One particular characteristic of NSC is that it has developed a chemocline very close to the surface (2 m depth). Microbial community composition of the water column was analyzed by 16S and 18S rRNA gene cloning and sequencing. The microorganisms detected in NSC were characteristic of acid mine drainage (AMD), including iron oxidizing bacteria (Leptospirillum, Acidithiobacillus ferrooxidans) and facultative iron reducing bacteria and archaea (Acidithiobacillus ferrooxidans, Acidiphilium, Actinobacteria, Acidimicrobiales, Ferroplasma) detected in the bottom layer. Diversity in CN was higher than in NSC. Microorganisms known from AMD systems (Acidiphilium, Acidobacteria and Ferrovum) and microorganisms never reported from AMD systems were identified. Taking into consideration the hydrochemical characteristics of these pit lakes and the spatial distribution of the identified microorganisms, a

  9. NEP: web server for epitope prediction based on antibody neutralization of viral strains with diverse sequences

    PubMed Central

    Chuang, Gwo-Yu; Liou, David; Kwong, Peter D.; Georgiev, Ivelin S.

    2014-01-01

    Delineation of the antigenic site, or epitope, recognized by an antibody can provide clues about functional vulnerabilities and resistance mechanisms, and can therefore guide antibody optimization and epitope-based vaccine design. Previously, we developed an algorithm for antibody-epitope prediction based on antibody neutralization of viral strains with diverse sequences and validated the algorithm on a set of broadly neutralizing HIV-1 antibodies. Here we describe the implementation of this algorithm, NEP (Neutralization-based Epitope Prediction), as a web-based server. The users must supply as input: (i) an alignment of antigen sequences of diverse viral strains; (ii) neutralization data for the antibody of interest against the same set of antigen sequences; and (iii) (optional) a structure of the unbound antigen, for enhanced prediction accuracy. The prediction results can be downloaded or viewed interactively on the antigen structure (if supplied) from the web browser using a JSmol applet. Since neutralization experiments are typically performed as one of the first steps in the characterization of an antibody to determine its breadth and potency, the NEP server can be used to predict antibody-epitope information at no additional experimental costs. NEP can be accessed on the internet at http://exon.niaid.nih.gov/nep. PMID:24782517

  10. 3D: diversity, dynamics, differential testing - a proposed pipeline for analysis of next-generation sequencing T cell repertoire data.

    PubMed

    Zhang, Li; Cham, Jason; Paciorek, Alan; Trager, James; Sheikh, Nadeem; Fong, Lawrence

    2017-02-27

    Cancer immunotherapy has demonstrated significant clinical activity in different cancers. T cells represent a crucial component of the adaptive immune system and are thought to mediate anti-tumoral immunity. Antigen-specific recognition by T cells is via the T cell receptor (TCR) which is unique for each T cell. Next generation sequencing (NGS) of the TCRs can be used as a platform to profile the T cell repertoire. Though there are a number of software tools available for processing repertoire data by mapping antigen receptor segments to sequencing reads and assembling the clonotypes, most of them are not designed to track and examine the dynamic nature of the TCR repertoire across multiple time points or between different biologic compartments (e.g., blood and tissue samples) in a clinical context. We integrated different diversity measures to assess the T cell repertoire diversity and examined the robustness of the diversity indices. Among those tested, Clonality was identified for its robustness as a key metric for study design and the first choice to measure TCR repertoire diversity. To evaluate the dynamic nature of T cell clonotypes across time, we utilized several binary similarity measures (such as Baroni-Urbani and Buser overlap index), relative clonality and Morisita's overlap index, as well as the intraclass correlation coefficient, and performed fold change analysis, which was further extended to investigate the transition of clonotypes among different biological compartments. Furthermore, the application of differential testing enabled the detection of clonotypes which were significantly changed across time. By applying the proposed "3D" analysis pipeline to the real example of prostate cancer subjects who received sipuleucel-T, an FDA-approved immunotherapy, we were able to detect changes in TCR sequence frequency and diversity thus demonstrating that sipuleucel-T treatment affected TCR repertoire in blood and in prostate tissue. We also found that
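
    The two kinds of measure named in this abstract, a within-sample diversity index (Clonality) and a between-sample overlap index (Morisita's), can be illustrated with a minimal Python sketch. The abstract does not define either formula, so the definitions below are assumptions: clonality is taken as 1 minus normalized Shannon entropy (a common convention in TCR repertoire work), and the overlap uses Morisita's original count-based form. The function names and the dictionary input format are hypothetical and are not taken from the authors' pipeline.

```python
import math


def clonality(counts):
    """Clonality as 1 - normalized Shannon entropy (assumed definition).

    counts: iterable of read counts, one per clonotype.
    Returns a value near 0 for an even repertoire and near 1 for a
    repertoire dominated by a few clones.
    """
    counts = [c for c in counts if c > 0]
    if not counts:
        raise ValueError("need at least one clonotype with a positive count")
    if len(counts) == 1:
        return 1.0  # convention: a single clonotype is fully clonal
    total = sum(counts)
    entropy = -sum((c / total) * math.log(c / total) for c in counts)
    return 1.0 - entropy / math.log(len(counts))


def morisita_overlap(x, y):
    """Morisita's overlap index between two samples.

    x, y: dicts mapping clonotype identifier -> read count.
    Each sample needs more than one read in total.
    """
    total_x, total_y = sum(x.values()), sum(y.values())
    if total_x < 2 or total_y < 2:
        raise ValueError("each sample needs at least two reads")
    # Simpson-type dominance terms for each sample
    d_x = sum(n * (n - 1) for n in x.values()) / (total_x * (total_x - 1))
    d_y = sum(n * (n - 1) for n in y.values()) / (total_y * (total_y - 1))
    if d_x + d_y == 0:
        raise ValueError("index undefined when every clonotype is a singleton")
    shared = sum(x[k] * y[k] for k in x.keys() & y.keys())
    return 2 * shared / ((d_x + d_y) * total_x * total_y)


if __name__ == "__main__":
    baseline = {"CASSLGQ": 40, "CASSPTG": 30, "CASRDNE": 30}   # made-up counts
    followup = {"CASSLGQ": 80, "CASSPTG": 10, "CASSQET": 10}   # made-up counts
    print(clonality(baseline.values()), clonality(followup.values()))
    print(morisita_overlap(baseline, followup))
```

    Note that this form of Morisita's index is not bounded at 1 and can slightly exceed it for identical small samples; the Morisita-Horn variant is a frequently used alternative.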

  11. Nucleotide sequence of the phosphoglycerate kinase gene from the extreme thermophile Thermus thermophilus. Comparison of the deduced amino acid sequence with that of the mesophilic yeast phosphoglycerate kinase.

    PubMed Central

    Bowen, D; Littlechild, J A; Fothergill, J E; Watson, H C; Hall, L

    1988-01-01

    Using oligonucleotide probes derived from amino acid sequencing information, the structural gene for phosphoglycerate kinase from the extreme thermophile, Thermus thermophilus, was cloned in Escherichia coli and its complete nucleotide sequence determined. The gene consists of an open reading frame corresponding to a protein of 390 amino acid residues (calculated Mr 41,791) with an extreme bias for G or C (93.1%) in the codon third base position. Comparison of the deduced amino acid sequence with that of the corresponding mesophilic yeast enzyme indicated a number of significant differences. These are discussed in terms of the unusual codon bias and their possible role in enhanced protein thermal stability. PMID:3052437
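
    The third-position G or C bias reported in this abstract is often summarized as a GC3 value. As a rough illustration of how such a figure is computed, here is a minimal Python sketch; the function name is hypothetical, the example sequence is invented, and the code assumes an in-frame coding sequence containing only A, C, G and T.

```python
def gc3_fraction(cds):
    """Fraction of codons whose third base is G or C.

    cds: coding sequence as a string, assumed to start in frame.
    Any trailing partial codon is ignored.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3      # drop an incomplete final codon
    third_bases = cds[2:usable:3]         # every third base: positions 3, 6, 9, ...
    if not third_bases:
        raise ValueError("sequence is shorter than one codon")
    return sum(base in "GC" for base in third_bases) / len(third_bases)


if __name__ == "__main__":
    toy_cds = "ATGGCCAAGCTGGAT"   # invented example, not the T. thermophilus gene
    print(f"GC3 = {gc3_fraction(toy_cds):.1%}")
```

    Whether the stop codon is included in the count is a choice the abstract does not state, so a value recomputed from the published sequence may differ slightly from 93.1%.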

  12. Diversity and Structure of Diazotrophic Communities in Mangrove Rhizosphere, Revealed by High-Throughput Sequencing.

    PubMed

    Zhang, Yanying; Yang, Qingsong; Ling, Juan; Van Nostrand, Joy D; Shi, Zhou; Zhou, Jizhong; Dong, Junde

    2017-01-01

    Diazotrophic communities make an essential contribution to the productivity through providing new nitrogen. However, knowledge of the roles that both mangrove tree species and geochemical parameters play in shaping mangrove rhizosphere diazotrophic communities is still elusive. Here, a comprehensive examination of the diversity and structure of microbial communities in the rhizospheres of three mangrove species, Rhizophora apiculata, Avicennia marina, and Ceriops tagal, was undertaken using high-throughput sequencing of the 16S rRNA and nifH genes. Our results revealed a great diversity of both the total microbial composition and the diazotrophic composition specifically in the mangrove rhizosphere. Deltaproteobacteria and Gammaproteobacteria were both ubiquitous and dominant, comprising an average of 45.87 and 86.66% of total microbial and diazotrophic communities, respectively. Sulfate-reducing bacteria belonging to the Desulfobacteraceae and Desulfovibrionaceae were the dominant diazotrophs. Community statistical analyses suggested that both mangrove tree species and additional environmental variables played important roles in shaping total microbial and potential diazotroph communities in mangrove rhizospheres. In contrast to the total microbial community investigated by analysis of 16S rRNA gene sequences, most of the dominant diazotrophic groups identified by nifH gene sequences were significantly different among mangrove species. The dominant diazotrophs of the family Desulfobacteraceae were positively correlated with total phosphorus, but negatively correlated with the nitrogen to phosphorus ratio. The Pseudomonadaceae were positively correlated with the concentration of available potassium, suggesting that diazotrophs potentially play an important role in biogeochemical cycles, such as those of nitrogen, phosphorus, sulfur, and potassium, in the mangrove ecosystem.

  13. Diversity and Structure of Diazotrophic Communities in Mangrove Rhizosphere, Revealed by High-Throughput Sequencing

    PubMed Central

    Zhang, Yanying; Yang, Qingsong; Ling, Juan; Van Nostrand, Joy D.; Shi, Zhou; Zhou, Jizhong; Dong, Junde

    2017-01-01

    Diazotrophic communities make an essential contribution to the productivity through providing new nitrogen. However, knowledge of the roles that both mangrove tree species and geochemical parameters play in shaping mangrove rhizosphere diazotrophic communities is still elusive. Here, a comprehensive examination of the diversity and structure of microbial communities in the rhizospheres of three mangrove species, Rhizophora apiculata, Avicennia marina, and Ceriops tagal, was undertaken using high-throughput sequencing of the 16S rRNA and nifH genes. Our results revealed a great diversity of both the total microbial composition and the diazotrophic composition specifically in the mangrove rhizosphere. Deltaproteobacteria and Gammaproteobacteria were both ubiquitous and dominant, comprising an average of 45.87 and 86.66% of total microbial and diazotrophic communities, respectively. Sulfate-reducing bacteria belonging to the Desulfobacteraceae and Desulfovibrionaceae were the dominant diazotrophs. Community statistical analyses suggested that both mangrove tree species and additional environmental variables played important roles in shaping total microbial and potential diazotroph communities in mangrove rhizospheres. In contrast to the total microbial community investigated by analysis of 16S rRNA gene sequences, most of the dominant diazotrophic groups identified by nifH gene sequences were significantly different among mangrove species. The dominant diazotrophs of the family Desulfobacteraceae were positively correlated with total phosphorus, but negatively correlated with the nitrogen to phosphorus ratio. The Pseudomonadaceae were positively correlated with the concentration of available potassium, suggesting that diazotrophs potentially play an important role in biogeochemical cycles, such as those of nitrogen, phosphorus, sulfur, and potassium, in the mangrove ecosystem. PMID:29093705

  14. Dimeric PROP1 binding to diverse palindromic TAAT sequences promotes its transcriptional activity.

    PubMed

    Nakayama, Michie; Kato, Takako; Susa, Takao; Sano, Akiko; Kitahara, Kousuke; Kato, Yukio

    2009-08-13

    Mutations in the Prop1 gene are responsible for murine Ames dwarfism and human combined pituitary hormone deficiency with hypogonadism. Recently, we reported that PROP1 is a possible transcription factor for gonadotropin subunit genes through plural cis-acting sites composed of AT-rich sequences containing a TAAT motif which differs from its consensus binding sequence known as PRDQ9 (TAATTGAATTA). This study aimed to verify the binding specificity and sequence of PROP1 by applying the method of SELEX (Systematic Evolution of Ligands by EXponential enrichment), EMSA (electrophoretic mobility shift assay) and transient transfection assay. SELEX, after 5, 7 and 9 generations of selection using a random sequence library, showed that nucleotides containing one or two TAAT motifs were accumulated and accounted for 98.5% at the 9th generation. Aligned sequences and EMSA demonstrated that PROP1 binds preferentially to 11 nucleotides composed of an inverted TAAT motif separated by 3 nucleotides with variation in the half site of palindromic TAAT motifs and with preferential requirement of T at the nucleotide number 5 immediately 3' to a TAAT motif. Transient transfection assay demonstrated first that dimeric binding of PROP1 to an inverted TAAT motif and its cognates resulted in transcriptional activation, whereas monomeric binding of PROP1 to a single TAAT motif and an inverted ATTA motif did not mediate activation. Thus, this study demonstrated that dimeric binding of PROP1 is able to recognize diverse palindromic TAAT sequences separated by 3 nucleotides and to exhibit its transcriptional activity.
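
    The dimeric binding site described in this abstract, an inverted TAAT motif separated by 3 nucleotides (11 nucleotides in total, as in the PRDQ9 consensus TAATTGAATTA), can be scanned for with a short regular expression. The Python sketch below is only an illustration under simplifying assumptions: it matches the strict pattern TAAT-NNN-ATTA on one strand and ignores the half-site variation and the position-5 preference for T that the abstract reports. The function name and test sequence are hypothetical.

```python
import re

# TAAT, any three bases, then ATTA (the inverted repeat of TAAT).
# The lookahead lets overlapping candidate sites be reported as well.
DIMERIC_SITE = re.compile(r"(?=(TAAT[ACGT]{3}ATTA))")


def find_dimeric_taat_sites(sequence):
    """Return (0-based start, matched 11-mer) for each strict candidate site."""
    sequence = sequence.upper()
    return [(m.start(), m.group(1)) for m in DIMERIC_SITE.finditer(sequence)]


if __name__ == "__main__":
    promoter_fragment = "GGTAATTGAATTACC"   # invented fragment containing PRDQ9
    for start, site in find_dimeric_taat_sites(promoter_fragment):
        print(start, site)
```

    Because the pattern is its own reverse complement, scanning one strand is enough for this strict form; a relaxed pattern that allows half-site variation would need both strands checked.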

  15. Uncultivated Microbial Eukaryotic Diversity: A Method to Link ssu rRNA Gene Sequences with Morphology

    PubMed Central

    Hirst, Marissa B.; Kita, Kelley N.; Dawson, Scott C.

    2011-01-01

    Protists have traditionally been identified by cultivation and classified taxonomically based on their cellular morphologies and behavior. In the past decade, however, many novel protist taxa have been identified using cultivation independent ssu rRNA sequence surveys. New rRNA “phylotypes” from uncultivated eukaryotes have no connection to the wealth of prior morphological descriptions of protists. To link phylogenetically informative sequences with taxonomically informative morphological descriptions, we demonstrate several methods for combining whole cell rRNA-targeted fluorescent in situ hybridization (FISH) with cytoskeletal or organellar immunostaining. Either eukaryote or ciliate-specific ssu rRNA probes were combined with an anti-α-tubulin antibody or phalloidin, a common actin stain, to define cytoskeletal features of uncultivated protists in several environmental samples. The eukaryote ssu rRNA probe was also combined with Mitotracker® or a hydrogenosomal-specific anti-Hsp70 antibody to localize mitochondria and hydrogenosomes, respectively, in uncultivated protists from different environments. Using rRNA probes in combination with immunostaining, we linked ssu rRNA phylotypes with microtubule structure to describe flagellate and ciliate morphology in three diverse environments, and linked Naegleria spp. to their amoeboid morphology using actin staining in hay infusion samples. We also linked uncultivated ciliates to morphologically similar Colpoda-like ciliates using tubulin immunostaining with a ciliate-specific rRNA probe. Combining rRNA-targeted FISH with cytoskeletal immunostaining or stains targeting specific organelles provides a fast, efficient, high throughput method for linking genetic sequences with morphological features in uncultivated protists. When linked to phylotype, morphological descriptions of protists can both complement and vet the increasing number of sequences from uncultivated protists, including those of novel lineages

  16. Deep COI sequencing of standardized benthic samples unveils overlooked diversity of Jordanian coral reefs in the northern Red Sea.

    PubMed

    Al-Rshaidat, Mamoon M D; Snider, Allison; Rosebraugh, Sydney; Devine, Amanda M; Devine, Thomas D; Plaisance, Laetitia; Knowlton, Nancy; Leray, Matthieu

    2016-09-01

    High-throughput sequencing (HTS) of DNA barcodes (metabarcoding), particularly when combined with standardized sampling protocols, is one of the most promising approaches for censusing overlooked cryptic invertebrate communities. We present biodiversity estimates based on sequencing of the cytochrome c oxidase subunit 1 (COI) gene for coral reefs of the Gulf of Aqaba, a semi-enclosed system in the northern Red Sea. Samples were obtained from standardized sampling devices (Autonomous Reef Monitoring Structures (ARMS)) deployed for 18 months. DNA barcoding of non-sessile specimens >2 mm revealed 83 OTUs in six phyla, of which only 25% matched a reference sequence in public databases. Metabarcoding of the 2 mm - 500 μm and sessile bulk fractions revealed 1197 OTUs in 15 animal phyla, of which only 4.9% matched reference barcodes. These results highlight the scarcity of COI data for cryptobenthic organisms of the Red Sea. Compared with data obtained using similar methods, our results suggest that Gulf of Aqaba reefs are less diverse than two Pacific coral reefs but much more diverse than an Atlantic oyster reef at a similar latitude. The standardized approaches used here show promise for establishing baseline data on biodiversity, monitoring the impacts of environmental change, and quantifying patterns of diversity at regional and global scales.

  17. Microbial Culturomics Broadens Human Vaginal Flora Diversity: Genome Sequence and Description of Prevotella lascolaii sp. nov. Isolated from a Patient with Bacterial Vaginosis.

    PubMed

    Diop, Khoudia; Diop, Awa; Levasseur, Anthony; Mediannikov, Oleg; Robert, Catherine; Armstrong, Nicholas; Couderc, Carine; Bretelle, Florence; Raoult, Didier; Fournier, Pierre-Edouard; Fenollar, Florence

    2018-03-01

    Microbial culturomics is a new subfield of postgenomic medicine and omics biotechnology application that has broadened our awareness on bacterial diversity of the human microbiome, including the human vaginal flora bacterial diversity. Using culturomics, a new obligate anaerobic Gram-stain-negative rod-shaped bacterium designated strain khD1T was isolated in the vagina of a patient with bacterial vaginosis and characterized using taxonogenomics. The most abundant cellular fatty acids were C15:0 anteiso (36%), C16:0 (19%), and C15:0 iso (10%). Based on an analysis of the full-length 16S rRNA gene sequences, phylogenetic analysis showed that the strain khD1T exhibited 90% sequence similarity with Prevotella loescheii, the phylogenetically closest validated Prevotella species. With 3,763,057 bp length, the genome of strain khD1T contained (mol%) 48.7 G + C and 3248 predicted genes, including 3194 protein-coding and 54 RNA genes. Given the phenotypical and biochemical characteristic results as well as genome sequencing, strain khD1T is considered to represent a novel species within the genus Prevotella, for which the name Prevotella lascolaii sp. nov. is proposed. The type strain is khD1T (= CSUR P0109 = DSM 101754). These results show that microbial culturomics greatly improves the characterization of the human microbiome repertoire by isolating potential putative new species. Further studies will certainly clarify the microbial mechanisms of pathogenesis of these new microbes and their role in health and disease. Microbial culturomics is an important new addition to the diagnostic medicine toolbox and warrants attention in future medical, global health, and integrative biology postgraduate teaching curricula.

  18. Phylogeny and genetic diversity of Bridgeoporus nobilissimus inferred using mitochondrial and nuclear rDNA sequences

    USGS Publications Warehouse

    Redberg, G.L.; Hibbett, D.S.; Ammirati, J.F.; Rodriguez, R.J.

    2003-01-01

    The genetic diversity and phylogeny of Bridgeoporus nobilissimus have been analyzed. DNA was extracted from spores collected from individual fruiting bodies representing six geographically distinct populations in Oregon and Washington. Spore samples collected contained low levels of bacteria, yeast and a filamentous fungal species. Using taxon-specific PCR primers, it was possible to discriminate among rDNA from bacteria, yeast, a filamentous associate and B. nobilissimus. Nuclear rDNA internal transcribed spacer (ITS) region sequences of B. nobilissimus were compared among individuals representing six populations and were found to have less than 2% variation. These sequences also were used to design dual and nested PCR primers for B. nobilissimus-specific amplification. Mitochondrial small-subunit rDNA sequences were used in a phylogenetic analysis that placed B. nobilissimus in the hymenochaetoid clade, where it was associated with Oxyporus and Schizopora.

  19. Enrichment allows identification of diverse, rare elements in metagenomic resistome-virulome sequencing.

    PubMed

    Noyes, Noelle R; Weinroth, Maggie E; Parker, Jennifer K; Dean, Chris J; Lakin, Steven M; Raymond, Robert A; Rovira, Pablo; Doster, Enrique; Abdo, Zaid; Martin, Jennifer N; Jones, Kenneth L; Ruiz, Jaime; Boucher, Christina A; Belk, Keith E; Morley, Paul S

    2017-10-17

    Shotgun metagenomic sequencing is increasingly utilized as a tool to evaluate ecological-level dynamics of antimicrobial resistance and virulence, in conjunction with microbiome analysis. Interest in use of this method for environmental surveillance of antimicrobial resistance and pathogenic microorganisms is also increasing. In published metagenomic datasets, the total of all resistance- and virulence-related sequences accounts for < 1% of all sequenced DNA, leading to limitations in detection of low-abundance resistome-virulome elements. This study describes the extent and composition of the low-abundance portion of the resistome-virulome, using a bait-capture and enrichment system that incorporates unique molecular indices to count DNA molecules and correct for enrichment bias. The use of the bait-capture and enrichment system significantly increased on-target sequencing of the resistome-virulome, enabling detection of an additional 1441 gene accessions and revealing a low-abundance portion of the resistome-virulome that was more diverse and compositionally different than that detected by more traditional metagenomic assays. The low-abundance portion of the resistome-virulome also contained resistance genes with public health importance, such as extended-spectrum betalactamases, that were not detected using traditional shotgun metagenomic sequencing. In addition, the use of the bait-capture and enrichment system enabled identification of rare resistance gene haplotypes that were used to discriminate between sample origins. These results demonstrate that the rare resistome-virulome contains valuable and unique information that can be utilized for both surveillance and population genetic investigations of resistance. Access to the rare resistome-virulome using the bait-capture and enrichment system validated in this study can greatly advance our understanding of microbiome-resistome dynamics.

  20. Activity and Phylogenetic Diversity of Bacterial Cells with High and Low Nucleic Acid Content and Electron Transport System Activity in an Upwelling Ecosystem

    PubMed Central

    Longnecker, K.; Sherr, B. F.; Sherr, E. B.

    2005-01-01

    We evaluated whether bacteria with higher cell-specific nucleic acid content (HNA) or an active electron transport system, i.e., positive for reduction of 5-cyano-2,3-ditolyl tetrazolium chloride (CTC), were responsible for the bulk of bacterioplankton metabolic activity. We also examined whether the phylogenetic diversity of HNA and CTC-positive cells differed from the diversity of Bacteria with low nucleic acid content (LNA). Bacterial assemblages were sampled both in eutrophic shelf waters and in mesotrophic offshore waters in the Oregon coastal upwelling region. Cytometrically sorted HNA, LNA, and CTC-positive cells were assayed for their cell-specific [3H]leucine incorporation rates. Phylogenetic diversity in sorted non-radioactively labeled samples was assayed using denaturing gradient gel electrophoresis (DGGE) of PCR-amplified 16S rRNA genes. Cell-specific rates of leucine incorporation of HNA and CTC-positive cells were on average only slightly greater than the cell-specific rates of LNA cells. HNA cells accounted for most bacterioplankton substrate incorporation due to high abundances, while the low abundances of CTC-positive cells resulted in only a small contribution by these cells to total bacterial activity. The proportion of the total bacterial leucine incorporation attributable to LNA cells was higher in offshore regions than in shelf waters. Sequence data obtained from DGGE bands showed broadly similar phylogenetic diversity across HNA, LNA, and CTC-positive cells, with between-sample and between-region variability in the distribution of phylotypes. Our results suggest that LNA bacteria are not substantially different from HNA bacteria in either cell-specific rates of substrate incorporation or phylogenetic composition and that they can be significant contributors to bacterial metabolism in the sea. PMID:16332746

  1. Activity and phylogenetic diversity of bacterial cells with high and low nucleic acid content and electron transport system activity in an upwelling ecosystem.

    PubMed

    Longnecker, K; Sherr, B F; Sherr, E B

    2005-12-01

    We evaluated whether bacteria with higher cell-specific nucleic acid content (HNA) or an active electron transport system, i.e., positive for reduction of 5-cyano-2,3-ditolyl tetrazolium chloride (CTC), were responsible for the bulk of bacterioplankton metabolic activity. We also examined whether the phylogenetic diversity of HNA and CTC-positive cells differed from the diversity of Bacteria with low nucleic acid content (LNA). Bacterial assemblages were sampled both in eutrophic shelf waters and in mesotrophic offshore waters in the Oregon coastal upwelling region. Cytometrically sorted HNA, LNA, and CTC-positive cells were assayed for their cell-specific [3H]leucine incorporation rates. Phylogenetic diversity in sorted non-radioactively labeled samples was assayed using denaturing gradient gel electrophoresis (DGGE) of PCR-amplified 16S rRNA genes. Cell-specific rates of leucine incorporation of HNA and CTC-positive cells were on average only slightly greater than the cell-specific rates of LNA cells. HNA cells accounted for most bacterioplankton substrate incorporation due to high abundances, while the low abundances of CTC-positive cells resulted in only a small contribution by these cells to total bacterial activity. The proportion of the total bacterial leucine incorporation attributable to LNA cells was higher in offshore regions than in shelf waters. Sequence data obtained from DGGE bands showed broadly similar phylogenetic diversity across HNA, LNA, and CTC-positive cells, with between-sample and between-region variability in the distribution of phylotypes. Our results suggest that LNA bacteria are not substantially different from HNA bacteria in either cell-specific rates of substrate incorporation or phylogenetic composition and that they can be significant contributors to bacterial metabolism in the sea.

  2. DNA-binding proteins from marine bacteria expand the known sequence diversity of TALE-like repeats

    PubMed Central

    de Lange, Orlando; Wolf, Christina; Thiel, Philipp; Krüger, Jens; Kleusch, Christian; Kohlbacher, Oliver; Lahaye, Thomas

    2015-01-01

    Transcription Activator-Like Effectors (TALEs) of Xanthomonas bacteria are programmable DNA binding proteins with unprecedented target specificity. Comparative studies into TALE repeat structure and function are hindered by the limited sequence variation among TALE repeats. More sequence-diverse TALE-like proteins are known from Ralstonia solanacearum (RipTALs) and Burkholderia rhizoxinica (Bats), but RipTAL and Bat repeats are conserved with those of TALEs around the DNA-binding residue. We study two novel marine-organism TALE-like proteins (MOrTL1 and MOrTL2), the first to date of non-terrestrial origin. We have assessed their DNA-binding properties and modelled repeat structures. We found that repeats from these proteins mediate sequence specific DNA binding conforming to the TALE code, despite low sequence similarity to TALE repeats, and with novel residues around the BSR. However, MOrTL1 repeats show greater sequence discriminating power than MOrTL2 repeats. Sequence alignments show that there are only three residues conserved between repeats of all TALE-like proteins including the two new additions. This conserved motif could prove useful as an identifier for future TALE-likes. Additionally, comparing MOrTL repeats with those of other TALE-likes suggests a common evolutionary origin for the TALEs, RipTALs and Bats. PMID:26481363

  3. Characterization of fatty acid-producing wastewater microbial communities using next generation sequencing technologies

    EPA Science Inventory

    While wastewater represents a viable source of bacterial biodiesel production, very little is known on the composition of these microbial communities. We studied the taxonomic diversity and succession of microbial communities in bioreactors accumulating fatty acids using 454-pyro...

  4. 37 CFR 1.822 - Symbols and format to be used for nucleotide and/or amino acid sequence data.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... in WIPO Standard ST.25 (1998), Appendix 2, Tables 1 and 3. This incorporation by reference was... ST.25 (1998), Appendix 2, Tables 1 and 3, shall be listed in a given sequence as “n” or “Xaa... acids. (1) The amino acids in a protein or peptide sequence shall be listed using the three-letter...

  5. 37 CFR 1.822 - Symbols and format to be used for nucleotide and/or amino acid sequence data.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... in WIPO Standard ST.25 (1998), Appendix 2, Tables 1 and 3. This incorporation by reference was... ST.25 (1998), Appendix 2, Tables 1 and 3, shall be listed in a given sequence as “n” or “Xaa... acids. (1) The amino acids in a protein or peptide sequence shall be listed using the three-letter...

  6. 37 CFR 1.822 - Symbols and format to be used for nucleotide and/or amino acid sequence data.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... in WIPO Standard ST.25 (1998), Appendix 2, Tables 1 and 3. This incorporation by reference was... ST.25 (1998), Appendix 2, Tables 1 and 3, shall be listed in a given sequence as “n” or “Xaa... acids. (1) The amino acids in a protein or peptide sequence shall be listed using the three-letter...

  7. What can we learn about lyssavirus genomes using 454 sequencing?

    PubMed

    Höper, Dirk; Finke, Stefan; Freuling, Conrad M; Hoffmann, Bernd; Beer, Martin

    2012-01-01

    The main task of individual project number four, "Whole genome sequencing, virus-host adaptation, and molecular epidemiological analyses of lyssaviruses," within the network "Lyssaviruses--a potential re-emerging public health threat," is to provide high-quality complete genome sequences from lyssaviruses. These sequences are analysed in depth with regard to the diversity of the viral populations as to both quasi-species and so-called defective interfering RNAs. Moreover, the sequence data will facilitate further epidemiological analyses, will provide insight into the evolution of lyssaviruses and will be the basis for the design of novel nucleic acid based diagnostics. The first results presented here indicate that not only can high-quality full-length lyssavirus genome sequences be generated, but efficient analysis of the viral population also becomes feasible.

  8. Draft genome sequence of the docosahexaenoic acid producing thraustochytrid Aurantiochytrium sp. T66.

    PubMed

    Liu, Bin; Ertesvåg, Helga; Aasen, Inga Marie; Vadstein, Olav; Brautaset, Trygve; Heggeset, Tonje Marita Bjerkan

    2016-06-01

    Thraustochytrids are unicellular, marine protists, and there is a growing industrial interest in these organisms, particularly because some species, including strains belonging to the genus Aurantiochytrium, accumulate high levels of docosahexaenoic acid (DHA). Here, we report the draft genome sequence of Aurantiochytrium sp. T66 (ATCC PRA-276), with a size of 43 Mbp, and 11,683 predicted protein-coding sequences. The data has been deposited at DDBJ/EMBL/Genbank under the accession LNGJ00000000. The genome sequence will contribute new insight into DHA biosynthesis and regulation, providing a basis for metabolic engineering of thraustochytrids.

  9. Identifying functionally informative evolutionary sequence profiles.

    PubMed

    Gil, Nelson; Fiser, Andras

    2018-04-15

    Multiple sequence alignments (MSAs) can provide essential input to many bioinformatics applications, including protein structure prediction and functional annotation. However, the optimal selection of sequences to obtain biologically informative MSAs for such purposes is poorly explored, and has traditionally been performed manually. We present Selection of Alignment by Maximal Mutual Information (SAMMI), an automated, sequence-based approach to objectively select an optimal MSA from a large set of alternatives sampled from a general sequence database search. The hypothesis of this approach is that the mutual information among MSA columns will be maximal for those MSAs that contain the most diverse set possible of the most structurally and functionally homogeneous protein sequences. SAMMI was tested to select MSAs for functional site residue prediction by analysis of conservation patterns on a set of 435 proteins obtained from protein-ligand (peptides, nucleic acids and small substrates) and protein-protein interaction databases. Availability and implementation: A freely accessible program, including source code, implementing SAMMI is available at https://github.com/nelsongil92/SAMMI.git. Contact: andras.fiser@einstein.yu.edu. Supplementary information: Supplementary data are available at Bioinformatics online.
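    The quantity this abstract builds on, mutual information between alignment columns, can be illustrated with a minimal sketch. This is not the SAMMI implementation; it is the plain Shannon mutual information of two columns, and the function names (column, mutual_information, mean_pairwise_mi) as well as the choice to average over all column pairs are assumptions made for illustration only.

```python
from collections import Counter
from math import log2


def column(msa, j):
    """Return column j of an alignment given as equal-length strings."""
    return [seq[j] for seq in msa]


def mutual_information(col_a, col_b):
    """Shannon mutual information (in bits) between two alignment columns."""
    n = len(col_a)
    count_a = Counter(col_a)
    count_b = Counter(col_b)
    count_ab = Counter(zip(col_a, col_b))
    mi = 0.0
    for (a, b), c in count_ab.items():
        p_ab = c / n
        mi += p_ab * log2(p_ab / ((count_a[a] / n) * (count_b[b] / n)))
    return mi


def mean_pairwise_mi(msa):
    """Average mutual information over all pairs of columns (illustrative score)."""
    length = len(msa[0])
    pairs = [(i, j) for i in range(length) for j in range(i + 1, length)]
    if not pairs:
        return 0.0
    total = sum(mutual_information(column(msa, i), column(msa, j)) for i, j in pairs)
    return total / len(pairs)
```

    Two perfectly co-varying columns with two equally frequent residues each carry 1 bit of mutual information, while two statistically independent columns carry none; a real selection procedure would also need to handle gaps and small-sample bias, which this sketch ignores.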

  10. ANCAC: amino acid, nucleotide, and codon analysis of COGs – a tool for sequence bias analysis in microbial orthologs

    PubMed Central

    2012-01-01

    Background: The COG database is the most popular collection of orthologous proteins from many different completely sequenced microbial genomes. By definition, a cluster of orthologous groups (COG) within this database exclusively contains proteins that most likely achieve the same cellular function. Recently, the COG database was extended by assigning to every protein both the corresponding amino acid sequence and its encoding nucleotide sequence, resulting in the NUCOCOG database. This extended version of the COG database is a valuable resource connecting sequence features with the functionality of the respective proteins. Results: Here we present ANCAC, a web tool and MySQL database for the analysis of amino acid, nucleotide, and codon frequencies in COGs on the basis of freely definable phylogenetic patterns. We demonstrate the usefulness of ANCAC by analyzing amino acid frequencies, codon usage, and GC-content in a species- or function-specific context. With respect to amino acids we, at least in part, confirm the cognate bias hypothesis by using ANCAC’s NUCOCOG dataset as the largest one available for that purpose thus far. Conclusions: Using the NUCOCOG datasets, ANCAC connects taxonomic, amino acid, and nucleotide sequence information with the functional classification via COGs and provides a GUI for flexible mining for sequence bias. Thereby, to our knowledge, it is the only tool for the analysis of sequence composition in the light of physiological roles and phylogenetic context that does not require substantial programming skills. PMID:22958836
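    The composition measures named in this record (GC-content and codon frequencies) are simple to compute for a single coding sequence. The sketch below is a generic illustration, not ANCAC's code or database schema; the function names gc_content and codon_frequencies are hypothetical, and the decision to ignore non-ACGT symbols and trailing partial codons is an assumption.

```python
from collections import Counter


def gc_content(seq):
    """Fraction of G and C among the canonical bases (A, C, G, T) of a sequence."""
    canonical = [base for base in seq.upper() if base in "ACGT"]
    if not canonical:
        return 0.0
    return sum(base in "GC" for base in canonical) / len(canonical)


def codon_frequencies(cds):
    """Relative frequency of each codon in an in-frame coding sequence.

    A trailing partial codon (one or two leftover bases) is ignored.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    counts = Counter(cds[i:i + 3] for i in range(0, usable, 3))
    total = sum(counts.values())
    return {codon: n / total for codon, n in counts.items()} if total else {}
```

    Aggregating such per-sequence values by COG or by taxon is what would turn them into the species- or function-specific comparisons the abstract describes.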

  11. mtDNA sequence diversity in Africa.

    PubMed Central

    Watson, E.; Bauer, K.; Aman, R.; Weiss, G.; von Haeseler, A.; Pääbo, S.

    1996-01-01

    mtDNA sequences were determined from 241 individuals from nine ethnic groups in Africa. When they were compared with published data from other groups, it was found that the !Kung, Mbuti, and Biaka show on the order of 10 times more sequence differences between the three groups, as well as between those and the other groups (the Fulbe, Hausa, Tuareg, Songhai, Kanuri, Yoruba, Mandenka, Somali, Tukana, and Kikuyu), than these other groups do between one another. Furthermore, the pairwise sequence distributions, patterns of coalescence events, and numbers of variable positions relative to the mean sequence difference indicate that the former three groups have been of constant size over time, whereas the latter have expanded in size. We suggest that this reflects subsistence patterns in that the populations that have expanded in size are food producers whereas those that have not are hunters and gatherers. PMID:8755932
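    The "pairwise sequence distributions" and "mean sequence difference" mentioned in this record can be sketched for a set of aligned sequences as follows. This is a generic illustration under the assumption of equal-length, gap-free sequences; the function names are hypothetical and the toy input used below is not data from the study.

```python
from collections import Counter
from itertools import combinations


def hamming(a, b):
    """Number of positions at which two aligned sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))


def mismatch_distribution(seqs):
    """Count how many sequence pairs differ at 0, 1, 2, ... positions."""
    return Counter(hamming(a, b) for a, b in combinations(seqs, 2))


def mean_pairwise_difference(seqs):
    """Average number of differences over all pairs of sequences."""
    dist = mismatch_distribution(seqs)
    pairs = sum(dist.values())
    if pairs == 0:
        raise ValueError("need at least two sequences")
    return sum(d * count for d, count in dist.items()) / pairs
```

    For the toy input ["AAAA", "AAAT", "AATT"], two pairs differ at one position and one pair differs at two, so the mean pairwise difference is 4/3. The shape of the mismatch distribution is the kind of evidence the abstract uses to distinguish constant-size from expanding populations.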

  12. Multilocus sequence analysis for assessment of phylogenetic diversity and biogeography in Thalassospira bacteria from diverse marine environments.

    PubMed

    Lai, Qiliang; Liu, Yang; Yuan, Jun; Du, Juan; Wang, Liping; Sun, Fengqin; Shao, Zongze

    2014-01-01

    Thalassospira bacteria are widespread and have been isolated from various marine environments. Less is known about their genetic diversity and biogeography, as well as their role in marine environments, and many of them cannot be discriminated using the 16S rRNA gene alone. To address these issues, the phylogenetic analysis of 58 strains from seawater and deep-sea sediments was carried out using multilocus sequence analysis (MLSA) based on the acsA, aroE, gyrB, mutL, rpoD and trpB genes, together with DNA-DNA hybridization (DDH) and average nucleotide identity (ANI) based on genome sequences. The MLSA demonstrated that the 58 strains were clearly separated into 15 lineages, corresponding to seven validly described species and eight potential novel species. The DDH and ANI values further confirmed the validity of the MLSA and of the eight potential novel species. The MLSA interspecies gap of the genus Thalassospira was determined to be 96.16-97.12% sequence identity on the basis of the combined analyses of the DDH and MLSA, while the ANIm interspecies gap was 95.76-97.20% based on the in silico DDH analysis. Meanwhile, phylogenetic analyses showed that the Thalassospira bacteria exhibited a distribution pattern that followed geographic region to a certain degree. Moreover, they clustered together according to habitat depth. In short, the phylogeny and biogeography of the Thalassospira bacteria were systematically investigated for the first time. These results will help to further explore their ecological role and adaptive evolution in marine environments.

  13. Multilocus Sequence Analysis for Assessment of Phylogenetic Diversity and Biogeography in Thalassospira Bacteria from Diverse Marine Environments

    PubMed Central

    Yuan, Jun; Du, Juan; Wang, Liping; Sun, Fengqin; Shao, Zongze

    2014-01-01

    Thalassospira bacteria are widespread and have been isolated from various marine environments. Less is known about their genetic diversity and biogeography, as well as their role in marine environments, and many of them cannot be discriminated using the 16S rRNA gene alone. To address these issues, the phylogenetic analysis of 58 strains from seawater and deep-sea sediments was carried out using multilocus sequence analysis (MLSA) based on the acsA, aroE, gyrB, mutL, rpoD and trpB genes, together with DNA-DNA hybridization (DDH) and average nucleotide identity (ANI) based on genome sequences. The MLSA demonstrated that the 58 strains were clearly separated into 15 lineages, corresponding to seven validly described species and eight potential novel species. The DDH and ANI values further confirmed the validity of the MLSA and of the eight potential novel species. The MLSA interspecies gap of the genus Thalassospira was determined to be 96.16–97.12% sequence identity on the basis of the combined analyses of the DDH and MLSA, while the ANIm interspecies gap was 95.76–97.20% based on the in silico DDH analysis. Meanwhile, phylogenetic analyses showed that the Thalassospira bacteria exhibited a distribution pattern that followed geographic region to a certain degree. Moreover, they clustered together according to habitat depth. In short, the phylogeny and biogeography of the Thalassospira bacteria were systematically investigated for the first time. These results will help to further explore their ecological role and adaptive evolution in marine environments. PMID:25198177

  14. Genetic Diversity of Ascaris in China Assessed Using Simple Sequence Repeat Markers.

    PubMed

    Zhou, Chunhua; Jian, Shaoqing; Peng, Weidong; Li, Min

    2018-04-01

    The giant roundworm Ascaris infects pigs and people worldwide and causes serious diseases. The taxonomic relationship between Ascaris suum and Ascaris lumbricoides is still unclear. The purpose of the present study was to investigate the genetic diversity and population genetic structure of 258 Ascaris specimens from humans and pigs from 6 sympatric regions in Ascaris-endemic regions of China using existing simple sequence repeat data. The microsatellite markers showed a high level of allelic richness and genetic diversity in the samples. Each of the populations demonstrated excess homozygosity (Ho < He, Fis > 0). According to a genetic differentiation index (Fst = 0.0593), there was a high level of gene flow in the Ascaris populations. A hierarchical analysis of molecular variance revealed remarkably high levels of variation within the populations. Moreover, a population structure analysis indicated that Ascaris populations fell into 3 main genetic clusters, interpreted as A. suum, A. lumbricoides, and a hybrid of the species. We speculated that humans can be infected with A. lumbricoides, A. suum, and the hybrid, but pigs were mainly infected with A. suum. This study provided new information on the genetic diversity and population structure of Ascaris from humans and pigs in China, which can be used for designing Ascaris control strategies. It can also be beneficial for understanding the introgression of host affiliation.

  15. Genetic Diversity and Phylogenetic Analysis of the Iranian Leishmania Parasites Based on HSP70 Gene PCR-RFLP and Sequence Analysis.

    PubMed

    Nemati, Sara; Fazaeli, Asghar; Hajjaran, Homa; Khamesipour, Ali; Anbaran, Mohsen Falahati; Bozorgomid, Arezoo; Zarei, Fatah

    2017-08-01

    Despite the broad distribution of leishmaniasis among Iranians and animals across the country, little is known about the genetic characteristics of the causative agents. Applying both HSP70 PCR-RFLP and sequence analyses, this study aimed to evaluate the genetic diversity and phylogenetic relationships among Leishmania spp. isolated from Iranian endemic foci and available reference strains. A total of 36 Leishmania isolates from almost all districts across the country were genetically analyzed for the HSP70 gene using both PCR-RFLP and sequence analysis. The original HSP70 gene sequences were aligned along with homologous Leishmania sequences retrieved from NCBI, and subjected to the phylogenetic analysis. Basic parameters of genetic diversity were also estimated. The HSP70 PCR-RFLP presented 3 different electrophoretic patterns, with no further intraspecific variation, corresponding to 3 Leishmania species available in the country, L. tropica, L. major, and L. infantum. Phylogenetic analyses presented 5 major clades, corresponding to 5 species complexes. Iranian lineages, including L. major, L. tropica, and L. infantum, were distributed among 3 complexes L. major, L. tropica, and L. donovani. However, within the L. major and L. donovani species complexes, the HSP70 phylogeny was not able to distinguish clearly between the L. major and L. turanica isolates, and between the L. infantum, L. donovani, and L. chagasi isolates, respectively. Our results indicated that both HSP70 PCR-RFLP and sequence analyses are medically applicable tools for identification of Leishmania species in Iranian patients. However, the reduced genetic diversity of the target gene makes it inevitable that its phylogeny only resolves the major groups, namely, the species complexes.

  16. Amino acid selective unlabeling for sequence specific resonance assignments in proteins

    PubMed Central

    Krishnarjuna, B.; Jaipuria, Garima; Thakur, Anushikha

    2010-01-01

    Sequence specific resonance assignment constitutes an important step towards high-resolution structure determination of proteins by NMR and is aided by selective identification and assignment of amino acid types. The traditional approach to selective labeling yields only the chemical shifts of the particular amino acid being selected and does not help in establishing a link between adjacent residues along the polypeptide chain, which is important for sequential assignments. An alternative approach is the method of amino acid selective ‘unlabeling’ or reverse labeling, which involves selective unlabeling of specific amino acid types against a uniformly 13C/15N labeled background. Based on this method, we present a novel approach for sequential assignments in proteins. The method involves a new NMR experiment named, {12COi–15Ni+1}-filtered HSQC, which aids in linking the 1HN/15N resonances of the selectively unlabeled residue, i, and its C-terminal neighbor, i + 1, in HN-detected double and triple resonance spectra. This leads to the assignment of a tri-peptide segment from the knowledge of the amino acid types of residues: i − 1, i and i + 1, thereby speeding up the sequential assignment process. The method has the advantage of being relatively inexpensive, applicable to 2H labeled protein and can be coupled with cell-free synthesis and/or automated assignment approaches. A detailed survey involving unlabeling of different amino acid types individually or in pairs reveals that the proposed approach is also robust to misincorporation of 14N at undesired sites. Taken together, this study represents the first application of selective unlabeling for sequence specific resonance assignments and opens up new avenues to using this methodology in protein structural studies. Electronic supplementary material The online version of this article (doi:10.1007/s10858-010-9459-z) contains supplementary material, which is available to authorized users. PMID:21153044

  17. Genomic diversity and versatility of Lactobacillus plantarum, a natural metabolic engineer.

    PubMed

    Siezen, Roland J; van Hylckama Vlieg, Johan E T

    2011-08-30

    In the past decade it has become clear that the lactic acid bacterium Lactobacillus plantarum occupies a diverse range of environmental niches and has an enormous diversity in phenotypic properties, metabolic capacity and industrial applications. In this review, we describe how genome sequencing, comparative genome hybridization and comparative genomics have provided insight into the underlying genomic diversity and versatility of L. plantarum. One of the main features appears to be genomic life-style islands consisting of numerous functional gene cassettes, in particular for carbohydrate utilization, which can be acquired, shuffled, substituted or deleted in response to niche requirements. In this sense, L. plantarum can be considered a "natural metabolic engineer".

  18. Generic Amplicon Deep Sequencing to Determine Ilarvirus Species Diversity in Australian Prunus

    PubMed Central

    Kinoti, Wycliff M.; Constable, Fiona E.; Nancarrow, Narelle; Plummer, Kim M.; Rodoni, Brendan

    2017-01-01

    The distribution of Ilarvirus species populations amongst 61 Australian Prunus trees was determined by next generation sequencing (NGS) of amplicons generated using a genus-based generic RT-PCR targeting a conserved region of the Ilarvirus RNA2 component that encodes the RNA dependent RNA polymerase (RdRp) gene. Presence of Ilarvirus sequences in each positive sample was further validated by Sanger sequencing of cloned amplicons of regions of each of RNA1, RNA2 and/or RNA3 that were generated by species specific PCRs and by metagenomic NGS. Prunus necrotic ringspot virus (PNRSV) was the most frequently detected Ilarvirus, occurring in 48 of the 61 Ilarvirus-positive trees and Prune dwarf virus (PDV) and Apple mosaic virus (ApMV) were detected in three trees and one tree, respectively. American plum line pattern virus (APLPV) was detected in three trees and represents the first report of APLPV detection in Australia. Two novel and distinct groups of Ilarvirus-like RNA2 amplicon sequences were also identified in several trees by the generic amplicon NGS approach. The high read depth from the amplicon NGS of the generic PCR products allowed the detection of distinct RNA2 RdRp sequence variant populations of PNRSV, PDV, ApMV, APLPV and the two novel Ilarvirus-like sequences. Mixed infections of ilarviruses were also detected in seven Prunus trees. Sanger sequencing of specific RNA1, RNA2, and/or RNA3 genome segments of each virus and total nucleic acid metagenomics NGS confirmed the presence of PNRSV, PDV, ApMV and APLPV detected by RNA2 generic amplicon NGS. However, the two novel groups of Ilarvirus-like RNA2 amplicon sequences detected by the generic amplicon NGS could not be associated to the presence of sequence from RNA1 or RNA3 genome segments or full Ilarvirus genomes, and their origin is unclear. 
    This work highlights the sensitivity of genus-specific amplicon NGS in detection of virus sequences and their distinct populations in multiple samples, and the need...

  19. Generic Amplicon Deep Sequencing to Determine Ilarvirus Species Diversity in Australian Prunus.

    PubMed

    Kinoti, Wycliff M; Constable, Fiona E; Nancarrow, Narelle; Plummer, Kim M; Rodoni, Brendan

    2017-01-01

    The distribution of Ilarvirus species populations amongst 61 Australian Prunus trees was determined by next generation sequencing (NGS) of amplicons generated using a genus-based generic RT-PCR targeting a conserved region of the Ilarvirus RNA2 component that encodes the RNA dependent RNA polymerase (RdRp) gene. Presence of Ilarvirus sequences in each positive sample was further validated by Sanger sequencing of cloned amplicons of regions of each of RNA1, RNA2 and/or RNA3 that were generated by species specific PCRs and by metagenomic NGS. Prunus necrotic ringspot virus (PNRSV) was the most frequently detected Ilarvirus, occurring in 48 of the 61 Ilarvirus-positive trees and Prune dwarf virus (PDV) and Apple mosaic virus (ApMV) were detected in three trees and one tree, respectively. American plum line pattern virus (APLPV) was detected in three trees and represents the first report of APLPV detection in Australia. Two novel and distinct groups of Ilarvirus-like RNA2 amplicon sequences were also identified in several trees by the generic amplicon NGS approach. The high read depth from the amplicon NGS of the generic PCR products allowed the detection of distinct RNA2 RdRp sequence variant populations of PNRSV, PDV, ApMV, APLPV and the two novel Ilarvirus-like sequences. Mixed infections of ilarviruses were also detected in seven Prunus trees. Sanger sequencing of specific RNA1, RNA2, and/or RNA3 genome segments of each virus and total nucleic acid metagenomics NGS confirmed the presence of PNRSV, PDV, ApMV and APLPV detected by RNA2 generic amplicon NGS. However, the two novel groups of Ilarvirus-like RNA2 amplicon sequences detected by the generic amplicon NGS could not be associated to the presence of sequence from RNA1 or RNA3 genome segments or full Ilarvirus genomes, and their origin is unclear. This work highlights the sensitivity of genus-specific amplicon NGS in detection of virus sequences and their distinct populations in multiple samples, and the...

  20. 37 CFR 1.824 - Form and format for nucleotide and/or amino acid sequence submissions in computer readable form.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... 37 Patents, Trademarks, and Copyrights 1 2010-07-01 2010-07-01 false Form and format for... And/or Amino Acid Sequences § 1.824 Form and format for nucleotide and/or amino acid sequence... Code for Information Interchange (ASCII) text. No other formats shall be allowed. (3) The computer...

  1. CML24, Regulated in Expression by Diverse Stimuli, Encodes a Potential Ca2+ Sensor That Functions in Responses to Abscisic Acid, Daylength, and Ion Stress

    PubMed Central

    Delk, Nikkí A.; Johnson, Keith A.; Chowdhury, Naweed I.; Braam, Janet

    2005-01-01

    Changes in intracellular calcium (Ca2+) levels serve to signal responses to diverse stimuli. Ca2+ signals are likely perceived through proteins that bind Ca2+, undergo conformation changes following Ca2+ binding, and interact with target proteins. The 50-member calmodulin-like (CML) Arabidopsis (Arabidopsis thaliana) family encodes proteins containing the predicted Ca2+-binding EF-hand motif. The functions of virtually all these proteins are unknown. CML24, also known as TCH2, shares over 40% amino acid sequence identity with calmodulin, has four EF hands, and undergoes Ca2+-dependent changes in hydrophobic interaction chromatography and migration rate through denaturing gel electrophoresis, indicating that CML24 binds Ca2+ and, as a consequence, undergoes conformational changes. CML24 expression occurs in all major organs, and transcript levels are increased from 2- to 15-fold in plants subjected to touch, darkness, heat, cold, hydrogen peroxide, abscisic acid (ABA), and indole-3-acetic acid. However, CML24 protein accumulation changes were not detectable. The putative CML24 regulatory region confers reporter expression at sites of predicted mechanical stress; in regions undergoing growth; in vascular tissues and various floral organs; and in stomata, trichomes, and hydathodes. CML24-underexpressing transgenics are resistant to ABA inhibition of germination and seedling growth, are defective in long-day induction of flowering, and have enhanced tolerance to CoCl2, molybdic acid, ZnSO4, and MgCl2. MgCl2 tolerance is not due to reduced uptake or to elevated Ca2+ accumulation. Together, these data present evidence that CML24, a gene expressed in diverse organs and responsive to diverse stimuli, encodes a potential Ca2+ sensor that may function to enable responses to ABA, daylength, and presence of various salts. PMID:16113225

  2. Sequencing, bioinformatic characterization and expression pattern of a putative amino acid transporter from the parasitic cestode Echinococcus granulosus.

    PubMed

    Camicia, Federico; Paredes, Rodolfo; Chalar, Cora; Galanti, Norbel; Kamenetzky, Laura; Gutierrez, Ariana; Rosenzvit, Mara C

    2008-03-31

    We have sequenced and partially characterized an Echinococcus granulosus cDNA, termed egat1, from a protoscolex signal sequence trap (SST) cDNA library. The isolated 1627 bp long cDNA contains an ORF of 489 amino acids and shows an amino acid identity of 30% with neutral and excitatory amino acid transporter members of the Dicarboxylate/Amino Acid Na+ and/or H+ Cation Symporter family (DAACS) (TC 2.A.23). Additional bioinformatics analysis of EgAT1 confirmed the results obtained by similarity searches and showed the presence of 9 to 10 transmembrane domains, consensus sequences for N-glycosylation between the third and fourth transmembrane domains, a highly similar hydropathy profile to ASCT1 (a known member of the DAACS family), a high score with SDF (Sodium Dicarboxylate Family) and motifs similar to EDTRANSPORT, a fingerprint of excitatory amino acid transporters. The localization of the putative amino acid transporter was analyzed by in situ hybridization and immunofluorescence in protoscoleces and associated germinal layer. The in situ hybridization labelling indicates the distribution of egat1 mRNA throughout the tegument. EgAT1 protein, which showed in Western blots a molecular mass of approximately 60 kD, is localized in the subtegumental region of the metacestode, particularly around suckers and rostellum of protoscoleces and layers from brood capsules. The sequence and expression analyses of EgAT1 pave the way for functional analysis of amino acid transporters of E. granulosus and their evaluation as new drug targets against cystic echinococcosis.

  3. Amino- and carboxyl-terminal amino acid sequences of proteins coded by gag gene of murine leukemia virus

    PubMed Central

    Oroszlan, Stephen; Henderson, Louis E.; Stephenson, John R.; Copeland, Terry D.; Long, Cedric W.; Ihle, James N.; Gilden, Raymond V.

    1978-01-01

    The amino- and carboxyl-terminal amino acid sequences of proteins (p10, p12, p15, and p30) coded by the gag gene of Rauscher and AKR murine leukemia viruses were determined. Among these proteins, p15 from both viruses appears to have a blocked amino end. Proline was found to be the common NH2 terminus of both p30s and both p12s, and alanine of both p10s. The amino-terminal sequences of p30s are identical, as are those of p10s, while the p12 sequences are clearly distinctive but also show substantial homology. The carboxyl-terminal amino acids of both viral p30s and p12s are leucine and phenylalanine, respectively. Rauscher leukemia virus p15 has tyrosine as the carboxyl terminus while AKR virus p15 has phenylalanine in this position. The compositional and sequence data provide definite chemical criteria for the identification of analogous gag gene products and for the comparison of viral proteins isolated in different laboratories. On the basis of amino acid sequences and the previously proposed H-p15-p12-p30-p10-COOH peptide sequence in the precursor polyprotein, a model for cleavage sites involved in the post-translational processing of the precursor coded for by the gag gene is proposed. PMID:206897

  4. Genome-wide-analyses of Listeria monocytogenes from food-processing plants reveal clonal diversity and date the emergence of persisting sequence types.

    PubMed

    Knudsen, Gitte M; Nielsen, Jesper Boye; Marvig, Rasmus L; Ng, Yin; Worning, Peder; Westh, Henrik; Gram, Lone

    2017-08-01

    Whole genome sequencing is increasingly used in epidemiology, e.g. for tracing outbreaks of food-borne diseases. This requires in-depth understanding of pathogen emergence, persistence and genomic diversity along the food production chain, including in food processing plants. We sequenced the genomes of 80 isolates of Listeria monocytogenes sampled from Danish food processing plants over a time period of 20 years, and analysed the sequences together with 10 publicly available reference genomes to advance our understanding of interplant and intraplant genomic diversity of L. monocytogenes. Except for three persisting sequence types (STs) defined by multilocus sequence typing (ST7, ST8 and ST121), long-term persistence of clonal groups was limited, and new clones were introduced continuously, potentially from raw materials. No particular gene could be linked to the persistence phenotype. Using time-based phylogenetic analyses of the persistent STs, we estimate the L. monocytogenes evolutionary rate to be 0.18-0.35 single nucleotide polymorphisms/year, suggesting that the persistent STs emerged approximately 100 years ago, which correlates with the onset of industrialization and globalization of the food market. © 2017 Society for Applied Microbiology and John Wiley & Sons Ltd.

  5. Genetic diversity of pneumococcal surface protein A in invasive pneumococcal isolates from Korean children, 1991-2016.

    PubMed

    Yun, Ki Wook; Choi, Eun Hwa; Lee, Hoan Jong

    2017-01-01

    Pneumococcal surface protein A (PspA) is an important virulence factor of pneumococci and has been investigated as a primary component of a capsular serotype-independent pneumococcal vaccine. Thus, we sought to determine the genetic diversity of PspA to explore its potential as a vaccine candidate. Among the 190 invasive pneumococcal isolates collected from Korean children between 1991 and 2016, two (1.1%) isolates were found to have no pspA by multiple polymerase chain reactions. The full length pspA genes from 185 pneumococcal isolates were sequenced. The length of pspA varied, ranging from 1,719 to 2,301 base pairs with 55.7-100% nucleotide identity. Based on the sequences of the clade-defining regions, 68.7% and 49.7% were in PspA family 2 and clade 3/family 2, respectively. PspA clade types were correlated with genotypes using multilocus sequence typing and divided into several subclades based on diversity analysis of the N-terminal α-helical regions, which showed nucleotide sequence identities of 45.7-100% and amino acid sequence identities of 23.1-100%. Putative antigenicity plots were also diverse among individual clades and subclades. The differences in antigenicity patterns were concentrated within the N-terminal 120 amino acids. In conclusion, the N-terminal α-helical domain, which is known to be the major immunogenic portion of PspA, is genetically variable and should be further evaluated for antigenic differences and cross-reactivity between various PspA types from pneumococcal isolates.

  6. Implication of the cause of differences in 3D structures of proteins with high sequence identity based on analyses of amino acid sequences and 3D structures.

    PubMed

    Matsuoka, Masanari; Sugita, Masatake; Kikuchi, Takeshi

    2014-09-18

    Proteins that share a high sequence homology while exhibiting drastically different 3D structures are investigated in this study. Recently, artificial proteins related to the sequences of the GA and IgG binding GB domains of human serum albumin have been designed. These artificial proteins, referred to as GA and GB, share 98% amino acid sequence identity but exhibit different 3D structures, namely, a 3α bundle versus a 4β + α structure. Discriminating between their 3D structures based on their amino acid sequences is a very difficult problem. In the present work, in addition to using bioinformatics techniques, an analysis based on inter-residue average distance statistics is used to address this problem. It was hard to distinguish which structure a given sequence would take only with the results of ordinary analyses like BLAST and conservation analyses. However, in addition to these analyses, with the analysis based on the inter-residue average distance statistics and our sequence tendency analysis, we could infer which part would play an important role in its structural formation. The results suggest possible determinants of the different 3D structures for sequences with high sequence identity. The possibility of discriminating between the 3D structures based on the given sequences is also discussed.

  7. Genetic diversity of Taenia asiatica from Thailand and other geographical locations as revealed by cytochrome c oxidase subunit 1 sequences.

    PubMed

    Anantaphruti, Malinee Thairungroj; Thaenkham, Urusa; Watthanakulpanich, Dorn; Phuphisut, Orawan; Maipanich, Wanna; Yoonuan, Tippayarat; Nuamtanong, Supaporn; Pubampen, Somjit; Sanguankiat, Surapol

    2013-02-01

    Twelve 924 bp cytochrome c oxidase subunit 1 (cox1) mitochondrial DNA sequences from Taenia asiatica isolates from Thailand were aligned and compared with multiple sequence isolates from Thailand and 6 other countries from the GenBank database. The genetic divergence of T. asiatica was also compared with Taenia saginata database sequences from 6 different countries in Asia, including Thailand, and 3 countries from other continents. The results showed that there were minor genetic variations within T. asiatica species, while high intraspecies variation was found in T. saginata. There were only 2 haplotypes and 1 polymorphic site found in T. asiatica, but 8 haplotypes and 9 polymorphic sites in T. saginata. Haplotype diversity was very low, 0.067, in T. asiatica and high, 0.700, in T. saginata. The very low genetic diversity suggested that T. asiatica may be at a risk due to the loss of potential adaptive alleles, resulting in reduced viability and decreased responses to environmental changes, which may endanger the species.
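
    The haplotype diversity values quoted above can be made concrete with a small worked example. The abstract does not say which estimator was used; the sketch below assumes Nei's unbiased haplotype diversity, h = n/(n-1) * (1 - sum of squared haplotype frequencies), and the function name and sample counts are hypothetical, chosen only for illustration and not taken from the study.

```python
from collections import Counter


def haplotype_diversity(haplotypes):
    """Nei's unbiased haplotype diversity for a list of haplotype labels.

    h = n / (n - 1) * (1 - sum(p_i ** 2)), where p_i is the sample
    frequency of haplotype i and n is the number of sequences.
    """
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two sequences")
    counts = Counter(haplotypes)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)


# Hypothetical sample (not the study's data): 30 sequences,
# 29 sharing one haplotype and a single sequence carrying a second one.
sample = ["H1"] * 29 + ["H2"]
print(round(haplotype_diversity(sample), 3))  # 0.067
```

    A sample dominated by a single haplotype gives a value close to 0, while a sample in which every sequence is distinct gives exactly 1.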

  8. Genetic Diversity of Taenia asiatica from Thailand and Other Geographical Locations as Revealed by Cytochrome c Oxidase Subunit 1 Sequences

    PubMed Central

    Thaenkham, Urusa; Watthanakulpanich, Dorn; Phuphisut, Orawan; Maipanich, Wanna; Yoonuan, Tippayarat; Nuamtanong, Supaporn; Pubampen, Somjit; Sanguankiat, Surapol

    2013-01-01

    Twelve 924 bp cytochrome c oxidase subunit 1 (cox1) mitochondrial DNA sequences from Taenia asiatica isolates from Thailand were aligned and compared with multiple sequence isolates from Thailand and 6 other countries from the GenBank database. The genetic divergence of T. asiatica was also compared with Taenia saginata database sequences from 6 different countries in Asia, including Thailand, and 3 countries from other continents. The results showed that there were minor genetic variations within T. asiatica species, while high intraspecies variation was found in T. saginata. There were only 2 haplotypes and 1 polymorphic site found in T. asiatica, but 8 haplotypes and 9 polymorphic sites in T. saginata. Haplotype diversity was very low, 0.067, in T. asiatica and high, 0.700, in T. saginata. The very low genetic diversity suggested that T. asiatica may be at a risk due to the loss of potential adaptive alleles, resulting in reduced viability and decreased responses to environmental changes, which may endanger the species. PMID:23467439

  9. Helicobacter pylori Heat Shock Protein A: Serologic Responses and Genetic Diversity

    PubMed Central

    Ng, Enders K. W.; Thompson, Stuart A.; Pérez-Pérez, Guillermo I.; Kansau, Imad; van der Ende, Arie; Labigne, Agnès; Sung, Joseph J. Y.; Chung, S. C. Sydney; Blaser, Martin J.

    1999-01-01

    Helicobacter pylori synthesizes an unusual GroES homolog, heat shock protein A (HspA). The present study was aimed at an assessment of the serological response to HspA in a group of Chinese patients with defined gastroduodenal pathologies and determination of whether diversity is present in the nucleotide sequences encoding HspA in isolates from these patients. Serum samples collected from 154 patients who had an upper gastrointestinal pathology and the presence of H. pylori defined by biopsy were tested for an immunoglobulin G (IgG) serologic response to H. pylori HspA by an enzyme-linked immunosorbent assay. HspA-encoding nucleotide sequences in H. pylori isolates from 14 patients (7 seropositive and 7 seronegative for HspA) were analyzed by PCR and direct sequencing of the PCR products. The sequencing results were compared to those of 48 isolates from other parts of the world. Of the 154 known H. pylori-positive patients, 54 (35.1%) were seropositive for HspA. The A domain (GroES homology) of HspA was highly conserved in the 14 isolates tested. Although the B domain (metal-binding site unique to H. pylori) resembled that in the known major variant, particular amino acid substitutions allowed definition of an HspA variant associated with isolates from East Asia. There were no associations between patient characteristics and HspA seropositivity or amino acid sequences. We confirmed in this study that the clinical outcomes of H. pylori infection are not related to HspA antigenicity or to sequence variation. However, B-domain sequence variation may be a marker for the study of the genetic diversity of H. pylori strains of different geographic origins. PMID:10225839

  10. Genetic diversity studies in pea (Pisum sativum L.) using simple sequence repeat markers.

    PubMed

    Kumari, P; Basal, N; Singh, A K; Rai, V P; Srivastava, C P; Singh, P K

    2013-03-13

    The genetic diversity among 28 pea (Pisum sativum L.) genotypes was analyzed using 32 simple sequence repeat markers. A total of 44 polymorphic bands, with an average of 2.1 bands per primer, were obtained. The polymorphism information content ranged from 0.657 to 0.309 with an average of 0.493. The variation in genetic diversity among these cultivars ranged from 0.11 to 0.73. Cluster analysis based on Jaccard's similarity coefficient using the unweighted pair-group method with arithmetic mean (UPGMA) revealed 2 distinct clusters, I and II, comprising 6 and 22 genotypes, respectively. Cluster II was further differentiated into 2 subclusters, IIA and IIB, with 12 and 10 genotypes, respectively. Principal component (PC) analysis revealed results similar to those of UPGMA. The first, second, and third PCs contributed 21.6, 16.1, and 14.0% of the variation, respectively; cumulative variation of the first 3 PCs was 51.7%.
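
    The Jaccard similarity coefficient named above can be illustrated on presence/absence band profiles. This is a minimal sketch under stated assumptions: the function name, the 0/1 encoding and the two example profiles are hypothetical and do not come from the study, and returning 0.0 when neither genotype shows any band is a convention chosen here.

```python
def jaccard_similarity(profile_a, profile_b):
    """Jaccard coefficient for two equal-length 0/1 marker band profiles.

    J = (bands present in both) / (bands present in at least one).
    Shared absences (0, 0) are ignored, which is the defining property
    of the Jaccard coefficient.
    """
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must score the same set of bands")
    shared = sum(1 for a, b in zip(profile_a, profile_b) if a == 1 and b == 1)
    either = sum(1 for a, b in zip(profile_a, profile_b) if a == 1 or b == 1)
    # Convention assumed here: two profiles with no bands at all score 0.0.
    return shared / either if either else 0.0


# Hypothetical band scores for two genotypes across five bands.
genotype_1 = [1, 1, 0, 1, 0]
genotype_2 = [1, 0, 0, 1, 1]
similarity = jaccard_similarity(genotype_1, genotype_2)
print(similarity)        # 0.5
print(1.0 - similarity)  # 0.5, the distance a clustering step would use
```

    For a UPGMA tree, the pairwise distances (1 minus similarity) over all genotypes would be passed to an average-linkage hierarchical clustering routine.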

  11. Genotyping-by-Sequencing Analysis for Determining Population Structure of Finger Millet Germplasm of Diverse Origins.

    PubMed

    Kumar, Anil; Sharma, Divya; Tiwari, Apoorv; Jaiswal, J P; Singh, N K; Sood, Salej

    2016-07-01

    Finger millet [ (L.) Gaertn.] is grown mainly by subsistence farmers in arid and semiarid regions of the world. To broaden its genetic base and to boost its production, it is of paramount importance to characterize and genotype the diverse gene pool of this important food and nutritional security crop. However, as a result of nonavailability of the genome sequence of finger millet, the progress could not be made in realizing the molecular basis of unique qualities of the crop. In the present investigation, attempts have been made to characterize the genetically diverse collection of 113 finger millet accessions through whole-genome genotyping-by-sequencing (GBS), which resulted in a genome-wide set of 23,000 single-nucleotide polymorphisms (SNPs) segregating across the entire collection and several thousand SNPs segregating within every accession. A model-based population structure analysis reveals the presence of three subpopulations among the finger millet accessions, which are in parallel with the results of phylogenetic analysis. The observed population structure is consistent with the hypothesis that finger millet was domesticated first in Africa, and from there it was introduced to India some 3000 yr ago. A total of 1128 gene ontology (GO) terms were assigned to SNP-carrying genes for three main categories: biological process, cellular component, and molecular function. Facilitated access to high-throughput genotyping and sequencing technologies are likely to improve the breeding process in developing countries, and as such, this data will be very useful to breeders who are working for the genetic improvement of finger millet. Copyright © 2016 Crop Science Society of America.

  12. Diversity of Babesia bovis merozoite surface antigen genes in the Philippines.

    PubMed

    Tattiyapong, Muncharee; Sivakumar, Thillaiampalam; Ybanez, Adrian Patalinghug; Ybanez, Rochelle Haidee Daclan; Perez, Zandro Obligado; Guswanto, Azirwan; Igarashi, Ikuo; Yokoyama, Naoaki

    2014-02-01

    Babesia bovis is the causative agent of fatal babesiosis in cattle. In the present study, we investigated the genetic diversity of B. bovis among Philippine cattle, based on the genes that encode merozoite surface antigens (MSAs). Forty-one B. bovis-positive blood DNA samples from cattle were used to amplify the msa-1, msa-2b, and msa-2c genes. In phylogenetic analyses, the msa-1, msa-2b, and msa-2c gene sequences generated from Philippine B. bovis-positive DNA samples were found in six, three, and four different clades, respectively. All of the msa-1 and most of the msa-2b sequences were found in clades that were formed only by Philippine msa sequences in the respective phylograms. While all the msa-1 sequences from the Philippines showed similarity to those formed by Australian msa-1 sequences, the msa-2b sequences showed similarity to either Australian or Mexican msa-2b sequences. In contrast, msa-2c sequences from the Philippines were distributed across all the clades of the phylogram, although one clade was formed exclusively by Philippine msa-2c sequences. Similarities among the deduced amino acid sequences of MSA-1, MSA-2b, and MSA-2c from the Philippines were 62.2-100, 73.1-100, and 67.3-100%, respectively. The present findings demonstrate that B. bovis populations are genetically diverse in the Philippines. This information will provide a good foundation for the future design and implementation of improved immunological preventive methodologies against bovine babesiosis in the Philippines. The study has also generated a set of data that will be useful for further understanding of the global genetic diversity of this important parasite. © 2013.

  13. On the Use of Diversity Measures in Longitudinal Sequencing Studies of Microbial Communities.

    PubMed

    Wagner, Brandie D; Grunwald, Gary K; Zerbe, Gary O; Mikulich-Gilbertson, Susan K; Robertson, Charles E; Zemanick, Edith T; Harris, J Kirk

    2018-01-01

    Identification of the majority of organisms present in human-associated microbial communities is feasible with the advent of high throughput sequencing technology. As substantial variability in microbiota communities is seen across subjects, the use of longitudinal study designs is important to better understand variation of the microbiome within individual subjects. Complex study designs with longitudinal sample collection require analytic approaches to account for this additional source of variability. A common approach to assessing community changes is to evaluate the change in alpha diversity (the variety and abundance of organisms in a community) over time. However, there are several commonly used alpha diversity measures and the use of different measures can result in different estimates of magnitude of change and different inferences. It has recently been proposed that diversity profile curves are useful for clarifying these differences, and may provide a more complete picture of the community structure. However, it is unclear how to utilize these curves when interest is in evaluating changes in community structure over time. We propose the use of a bi-exponential function in a longitudinal model that accounts for repeated measures on each subject to compare diversity profiles over time. Furthermore, it is possible that no change in alpha diversity (single community/sample) may be observed despite the presence of a highly divergent community composition. Thus, it is also important to use a beta diversity measure (similarity between multiple communities/samples) that captures changes in community composition. Ecological methods developed to evaluate temporal turnover have currently only been applied to investigate changes of a single community over time. We illustrate the extension of this approach to multiple communities of interest (i.e., subjects) by modeling the beta diversity measure over time. With this approach, a rate of change in community

  14. On the Use of Diversity Measures in Longitudinal Sequencing Studies of Microbial Communities

    PubMed Central

    Wagner, Brandie D.; Grunwald, Gary K.; Zerbe, Gary O.; Mikulich-Gilbertson, Susan K.; Robertson, Charles E.; Zemanick, Edith T.; Harris, J. Kirk

    2018-01-01

    Identification of the majority of organisms present in human-associated microbial communities is feasible with the advent of high throughput sequencing technology. As substantial variability in microbiota communities is seen across subjects, the use of longitudinal study designs is important to better understand variation of the microbiome within individual subjects. Complex study designs with longitudinal sample collection require analytic approaches to account for this additional source of variability. A common approach to assessing community changes is to evaluate the change in alpha diversity (the variety and abundance of organisms in a community) over time. However, there are several commonly used alpha diversity measures and the use of different measures can result in different estimates of magnitude of change and different inferences. It has recently been proposed that diversity profile curves are useful for clarifying these differences, and may provide a more complete picture of the community structure. However, it is unclear how to utilize these curves when interest is in evaluating changes in community structure over time. We propose the use of a bi-exponential function in a longitudinal model that accounts for repeated measures on each subject to compare diversity profiles over time. Furthermore, it is possible that no change in alpha diversity (single community/sample) may be observed despite the presence of a highly divergent community composition. Thus, it is also important to use a beta diversity measure (similarity between multiple communities/samples) that captures changes in community composition. Ecological methods developed to evaluate temporal turnover have currently only been applied to investigate changes of a single community over time. We illustrate the extension of this approach to multiple communities of interest (i.e., subjects) by modeling the beta diversity measure over time. With this approach, a rate of change in community
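
    The point that different alpha diversity measures can rank the same community differently, and that a diversity profile shows all of them at once, can be sketched with Hill numbers. This is an assumption about how such a profile might be computed: the abstract does not name the profile family, and the function names, the orders q and the example counts below are hypothetical. The bi-exponential longitudinal model proposed by the authors is not reproduced here.

```python
import math


def hill_number(counts, q):
    """Hill number (effective number of taxa) of order q.

    q = 0 gives richness, q = 1 the exponential of Shannon entropy,
    q = 2 the inverse Simpson index. Zero counts are ignored.
    """
    total = sum(counts)
    p = [c / total for c in counts if c > 0]
    if abs(q - 1.0) < 1e-9:
        # Limit of the general formula as q approaches 1.
        return math.exp(-sum(pi * math.log(pi) for pi in p))
    return sum(pi ** q for pi in p) ** (1.0 / (1.0 - q))


def diversity_profile(counts, orders=(0, 0.5, 1, 2, 4)):
    """Return (q, Hill number) pairs; plotted against q this is the profile."""
    return [(q, hill_number(counts, q)) for q in orders]


# Hypothetical read counts for two samples with the same richness.
even_sample = [25, 25, 25, 25]
uneven_sample = [97, 1, 1, 1]
for q, d in diversity_profile(uneven_sample):
    print(q, round(d, 3))
```

    Both samples have four taxa, so they are indistinguishable at q = 0, but the uneven sample's profile falls towards 1 as q increases while the even sample stays at 4 for every order.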

  15. Amino acid sequence of the smaller basic protein from rat brain myelin

    PubMed Central

    Dunkley, Peter R.; Carnegie, Patrick R.

    1974-01-01

    1. The complete amino acid sequence of the smaller basic protein from rat brain myelin was determined. This protein differs from myelin basic proteins of other species in having a deletion of a polypeptide of 40 amino acid residues from the centre of the molecule. 2. A detailed comparison is made of the constant and variable regions in a group of myelin basic proteins from six species. 3. An arginine residue in the rat protein was found to be partially methylated. The ratio of methylated to unmethylated arginine at this position differed from that found for the human basic protein. 4. Three tryptic peptides were isolated in more than one form. The differences between the two forms of each peptide are discussed in relation to the electrophoretic heterogeneity of myelin basic proteins, which is known to occur at alkaline pH values. 5. Detailed evidence for the amino acid sequence of the protein has been deposited as Supplementary Publication SUP 50029 at the British Library (Lending Division) (formerly the National Lending Library for Science and Technology), Boston Spa, Yorks. LS23 7BQ, U.K., from whom copies may be obtained on the terms given in Biochem. J. (1973) 131, 5. PMID:4141893

  16. Two Theileria parva CD8 T Cell Antigen Genes Are More Variable in Buffalo than Cattle Parasites, but Differ in Pattern of Sequence Diversity

    PubMed Central

    Pelle, Roger; Graham, Simon P.; Njahira, Moses N.; Osaso, Julius; Saya, Rosemary M.; Odongo, David O.; Toye, Philip G.; Spooner, Paul R.; Musoke, Anthony J.; Mwangi, Duncan M.; Taracha, Evans L. N.; Morrison, W. Ivan; Weir, William; Silva, Joana C.; Bishop, Richard P.

    2011-01-01

    Background: Theileria parva causes an acute fatal disease in cattle, but infections are asymptomatic in the African buffalo (Syncerus caffer). Cattle can be immunized against the parasite by infection and treatment, but immunity is partially strain specific. Available data indicate that CD8+ T lymphocyte responses mediate protection and, recently, several parasite antigens recognised by CD8+ T cells have been identified. This study set out to determine the nature and extent of polymorphism in two of these antigens, Tp1 and Tp2, which contain defined CD8+ T-cell epitopes, and to analyse the sequences for evidence of selection. Methodology/Principal Findings: Partial sequencing of the Tp1 gene and the full-length Tp2 gene from 82 T. parva isolates revealed extensive polymorphism in both antigens, including the epitope-containing regions. Single nucleotide polymorphisms were detected at 51 positions (∼12%) in Tp1 and in 320 positions (∼61%) in Tp2. Together with two short indels in Tp1, these resulted in 30 and 42 protein variants of Tp1 and Tp2, respectively. Although evidence of positive selection was found for multiple amino acid residues, there was no preferential involvement of T cell epitope residues. Overall, the extent of diversity was much greater in T. parva isolates originating from buffalo than in isolates known to be transmissible among cattle. Conclusions/Significance: The results indicate that T. parva parasites maintained in cattle represent a subset of the overall T. parva population, which has become adapted for tick transmission between cattle. The absence of obvious enrichment for positively selected amino acid residues within defined epitopes indicates either that diversity is not predominantly driven by selection exerted by host T cells, or that such selection is not detectable by the methods employed due to unidentified epitopes elsewhere in the antigens. Further functional studies are required to address this latter point. PMID:21559495

  17. Genetic diversity based on 28S rDNA sequences among populations of Culex quinquefasciatus collected at different locations in Tamil Nadu, India.

    PubMed

    Sakthivelkumar, S; Ramaraj, P; Veeramani, V; Janarthanan, S

    2015-09-01

    The aim of the present study was to detect any genetic variability among populations of Culex quinquefasciatus, which would be a valuable tool in the management of mosquito control programmes. In the present study, populations of Cx. quinquefasciatus collected at different locations in Tamil Nadu were analyzed for their genetic variation based on 28S rDNA D2 region nucleotide sequences. A high degree of genetic polymorphism was detected in the sequences of D2 region of 28S rDNA on the predicted secondary structures in spite of high nucleotide sequence similarity. The findings based on secondary structure using rDNA sequences suggested the existence of a complex genotypic diversity of Cx. quinquefasciatus population collected at different locations of Tamil Nadu, India. This complexity in genetic diversity in a single mosquito population collected at different locations is considered an important issue towards their influence and nature of vector potential of these mosquitoes.

  18. Homology analyses of the protein sequences of fatty acid synthases from chicken liver, rat mammary gland, and yeast

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Chang, Soo-Ik; Hammes, G.G.

    1989-11-01

    Homology analyses of the protein sequences of chicken liver and rat mammary gland fatty acid synthases were carried out. The amino acid sequences of the chicken and rat enzymes are 67% identical. If conservative substitutions are allowed, 78% of the amino acids are matched. A region of low homologies exists between the functional domains, in particular around amino acid residues 1059-1264 of the chicken enzyme. Homologies between the active sites of chicken and rat and of chicken and yeast enzymes have been analyzed by an alignment method. A high degree of homology exists between the active sites of the chicken and rat enzymes. However, the chicken and yeast enzymes show a lower degree of homology. The NADPH-binding dinucleotide folds of the β-ketoacyl reductase and the enoyl reductase sites were identified by comparison with a known consensus sequence for the NADP- and FAD-binding dinucleotide folds. The active sites of all of the enzymes are primarily in hydrophobic regions of the protein. This study suggests that the genes for the functional domains of fatty acid synthase were originally separated, and these genes were connected to each other by using different connecting nucleotide sequences in different species. An alternative explanation for the differences in rat and chicken is a common ancestry and mutations in the joining regions during evolution.
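
    The distinction drawn above between percent identity and the higher figure obtained when conservative substitutions are allowed can be sketched as follows. This is an illustration only: the function names, the toy aligned sequences and the residue groupings treated as conservative are all assumptions made here, not the scheme used in the study, and gapped columns are simply excluded from the denominator (other conventions exist).

```python
# Hypothetical grouping of residues treated as conservative substitutions.
CONSERVATIVE_GROUPS = ("ILVM", "FYW", "KRH", "DE", "ST", "NQ")


def _aligned_pairs(seq_a, seq_b, gap="-"):
    """Return residue pairs from two aligned sequences, skipping gapped columns."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    return [(a, b) for a, b in zip(seq_a, seq_b) if a != gap and b != gap]


def percent_identity(seq_a, seq_b):
    """Percentage of ungapped aligned columns with identical residues."""
    pairs = _aligned_pairs(seq_a, seq_b)
    if not pairs:
        return 0.0
    return 100.0 * sum(1 for a, b in pairs if a == b) / len(pairs)


def percent_similarity(seq_a, seq_b):
    """Like percent_identity, but also counts conservative substitutions."""
    pairs = _aligned_pairs(seq_a, seq_b)
    if not pairs:
        return 0.0

    def similar(a, b):
        return a == b or any(a in g and b in g for g in CONSERVATIVE_GROUPS)

    return 100.0 * sum(1 for a, b in pairs if similar(a, b)) / len(pairs)


# Toy alignment (not real fatty acid synthase sequence).
a = "MKVLDE-S"
b = "MRILDEAT"
print(round(percent_identity(a, b), 1))    # 57.1
print(round(percent_similarity(a, b), 1))  # 100.0
```

    Similarity is always at least as high as identity, since every identical column also counts as similar.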

  19. Investigation of Microbial Diversity in Geothermal Hot Springs in Unkeshwar, India, Based on 16S rRNA Amplicon Metagenome Sequencing

    PubMed Central

    Mehetre, Gajanan T.; Paranjpe, Aditi; Dastager, Syed G.

    2016-01-01

    Microbial diversity in geothermal waters of the Unkeshwar hot springs in Maharashtra, India, was studied using 16S rRNA amplicon metagenomic sequencing. Taxonomic analysis revealed the presence of Bacteroidetes, Proteobacteria, Cyanobacteria, Actinobacteria, Archaea, and OD1 phyla. Metabolic function prediction analysis indicated a battery of biological information systems indicating rich and novel microbial diversity, with potential biotechnological applications in this niche. PMID:26950332

  20. Analysis of bacterial and archaeal diversity in coastal microbial mats using massive parallel 16S rRNA gene tag sequencing.

    PubMed

    Bolhuis, Henk; Stal, Lucas J

    2011-11-01

    Coastal microbial mats are small-scale and largely closed ecosystems in which a plethora of different functional groups of microorganisms are responsible for the biogeochemical cycling of the elements. Coastal microbial mats play an important role in coastal protection and morphodynamics through stabilization of the sediments and by initiating the development of salt-marshes. Little is known about the bacterial and especially archaeal diversity and how it contributes to the ecological functioning of coastal microbial mats. Here, we analyzed three different types of coastal microbial mats that are located along a tidal gradient and can be characterized as marine (ST2), brackish (ST3) and freshwater (ST3) systems. The mats were sampled during three different seasons and subjected to massive parallel tag sequencing of the V6 region of the 16S rRNA genes of Bacteria and Archaea. Sequence analysis revealed that the mats are among the most diverse marine ecosystems studied so far and consist of several novel taxonomic levels ranging from classes to species. The diversity between the different mat types was far more pronounced than the changes between the different seasons at one location. The archaeal community of these mats has not been studied before and revealed a strong reaction to a short period of drought during summer, resulting in a massive increase in halobacterial sequences, whereas the bacterial community was barely affected. We concluded that the community composition and the microbial diversity were intrinsic to the mat type and depend on the location along the tidal gradient, indicating a relation with salinity.

  1. Amino acid sequence analysis of the annexin super-gene family of proteins.

    PubMed

    Barton, G J; Newman, R H; Freemont, P S; Crumpton, M J

    1991-06-15

    The annexins are a widespread family of calcium-dependent membrane-binding proteins. No common function has been identified for the family and, until recently, no crystallographic data existed for an annexin. In this paper we draw together 22 available annexin sequences consisting of 88 similar repeat units, and apply the techniques of multiple sequence alignment, pattern matching, secondary structure prediction and conservation analysis to the characterisation of the molecules. The analysis clearly shows that the repeats cluster into four distinct families and that greatest variation occurs within the repeat 3 units. Multiple alignment of the 88 repeats shows amino acids with conserved physicochemical properties at 22 positions, with only Gly at position 23 being absolutely conserved in all repeats. Secondary structure prediction techniques identify five conserved helices in each repeat unit and patterns of conserved hydrophobic amino acids are consistent with one face of a helix packing against the protein core in predicted helices a, c, d, e. Helix b is generally hydrophobic in all repeats, but contains a striking pattern of repeat-specific residue conservation at position 31, with Arg in repeats 4 and Glu in repeats 2, but unconserved amino acids in repeats 1 and 3. This suggests repeats 2 and 4 may interact via a buried saltbridge. The loop between predicted helices a and b of repeat 3 shows features distinct from the equivalent loop in repeats 1, 2 and 4, suggesting an important structural and/or functional role for this region. No compelling evidence emerges from this study for uteroglobin and the annexins sharing similar tertiary structures, or for uteroglobin representing a derivative of a primordial one-repeat structure that underwent duplication to give the present day annexins. The analyses performed in this paper are re-evaluated in the Appendix, in the light of the recently published X-ray structure for human annexin V. The structure confirms most of

  2. Genetic diversity, genetic structure and demographic history of Cycas simplicipinna (Cycadaceae) assessed by DNA sequences and SSR markers

    PubMed Central

    2014-01-01

    Background: Cycas simplicipinna (T. Smitinand) K. Hill. (Cycadaceae) is an endangered species in China. A total of 118 individuals from the seven populations that we could collect were genotyped in this study. Here, we assessed the genetic diversity, genetic structure and demographic history of this species. Results: Analyses of data of DNA sequences (two maternally inherited intergenic spacers of chloroplast, cpDNA and one biparentally inherited internal transcribed spacer region ITS4-ITS5, nrDNA) and sixteen microsatellite loci (SSR) were conducted in the species. Of the 118 samples, 86 individuals from the seven populations were used for DNA sequencing and 115 individuals from six populations were used for the microsatellite study. We found high genetic diversity at the species level, low genetic diversity within each of the seven populations and high genetic differentiation among the populations. There was a clear genetic structure within populations of C. simplicipinna. A demographic history inferred from DNA sequencing data indicates that C. simplicipinna experienced a recent population contraction without retreating to a common refugium during the last glacial period. The results derived from SSR data also showed that C. simplicipinna underwent past effective population contraction, likely during the Pleistocene. Conclusions: Some genetic features of C. simplicipinna such as having high genetic differentiation among the populations, a clear genetic structure and a recent population contraction could provide guidelines for protecting this endangered species from extinction. Furthermore, the genetic features with population dynamics of the species in our study would help provide insights and guidelines for protecting other endangered species effectively. PMID:25016306

  3. Benchmark Evaluation of True Single Molecular Sequencing to Determine Cystic Fibrosis Airway Microbiome Diversity.

    PubMed

    Hahn, Andrea; Bendall, Matthew L; Gibson, Keylie M; Chaney, Hollis; Sami, Iman; Perez, Geovanny F; Koumbourlis, Anastassios C; McCaffrey, Timothy A; Freishtat, Robert J; Crandall, Keith A

    2018-01-01

    Cystic fibrosis (CF) is an autosomal recessive disease associated with recurrent lung infections that can lead to morbidity and mortality. The impact of antibiotics for treatment of acute pulmonary exacerbations on the CF airway microbiome remains unclear with prior studies giving conflicting results and being limited by their use of 16S ribosomal RNA sequencing. Our primary objective was to validate the use of true single molecular sequencing (tSMS) and PathoScope in the analysis of the CF airway microbiome. Three control samples were created with differing amounts of Burkholderia cepacia, Pseudomonas aeruginosa, and Prevotella melaninogenica, three common bacteria found in cystic fibrosis lungs. Paired sputa were also obtained from three study participants with CF before and >6 days after initiation of antibiotics. Antibiotic resistant B. cepacia and P. aeruginosa were identified in concurrently obtained respiratory cultures. Direct sequencing was performed using tSMS, and filtered reads were aligned to reference genomes from NCBI using PathoScope and Kraken and unique clade-specific marker genes using MetaPhlAn. A total of 180-518 K of 6-12 million filtered reads were aligned for each sample. Detection of known pathogens in control samples was most successful using PathoScope. In the CF sputa, alpha diversity measures varied based on the alignment method used, but similar trends were found between pre- and post-antibiotic samples. PathoScope outperformed Kraken and MetaPhlAn in our validation study of artificial bacterial community controls and also has advantages over Kraken and MetaPhlAn of being able to determine bacterial strains and the presence of fungal organisms. PathoScope can be confidently used when evaluating metagenomic data to determine CF airway microbiome diversity.

  4. A not-so-big crisis: re-reading Silurian conodont diversity in a sequence-stratigraphic framework

    NASA Astrophysics Data System (ADS)

    Jarochowska, Emilia; Munnecke, Axel

    2016-04-01

    Conodonts are extensively used in Ordovician through Triassic biostratigraphy and fossil-based geochemistry. However, their distribution in rock successions is commonly taken at face value, without taking into account their diverse and poorly understood ecology. Multielement taxonomy, ontogenetic and environmental variability, difficulties in extraction, and relative rarity all contribute to the general lack of quantitative studies on conodont stratigraphic distribution and temporal turnover. With respect to Silurian conodonts, the concept of recurrent conodont extinction events - the so called Ireviken, Mulde and Lau events - has become a standard in the stratigraphic literature. The concept has been proposed based on qualitative observations of local extirpations of open-marine pelagic or nekto-benthic taxa and temporary dominance of shallow-water species in the Silurian succession of the Swedish island of Gotland. These changes coincided with positive carbon isotope excursions, abrupt facies shifts, "blooms" of benthic fauna, and changes in reef communities, which have all been combined into a general view of Silurian bio-geochemical events. This view posits a deterministic, reproducible pattern in Silurian conodont diversity, attributed to recurrent ecological or geochemical conditions. The growing body of sequence-stratigraphic interpretations across these events in Gotland and other sections worldwide indicate that in all cases the Silurian "events" are associated with rapid global regressions. This suggests that faunal changes such as the dominance of shallow-water, low-diversity conodont fauna and the increase of benthic invertebrate diversity and abundance represent predictable consequences of the variation in the completeness of the rock record and preservation potential of different environments. Our studies in Poland and Ukraine indicate that the magnitude of change in the taxonomic composition of conodont assemblages across the middle Silurian global

  5. A-to-I RNA Editing Contributes to Proteomic Diversity in Cancer. | Office of Cancer Genomics

    Cancer.gov

    Adenosine (A) to inosine (I) RNA editing introduces many nucleotide changes in cancer transcriptomes. However, due to the complexity of post-transcriptional regulation, the contribution of RNA editing to proteomic diversity in human cancers remains unclear. Here, we performed an integrated analysis of TCGA genomic data and CPTAC proteomic data. Despite limited site diversity, we demonstrate that A-to-I RNA editing contributes to proteomic diversity in breast cancer through changes in amino acid sequences. We validate the presence of editing events at both RNA and protein levels.

  6. Allelic diversity of the MHC class II DRB genes in brown bears (Ursus arctos) and a comparison of DRB sequences within the family Ursidae.

    PubMed

    Goda, N; Mano, T; Kosintsev, P; Vorobiev, A; Masuda, R

    2010-11-01

    The allelic diversity of the DRB locus in major histocompatibility complex (MHC) genes was analyzed in the brown bear (Ursus arctos) from the Hokkaido Island of Japan, Siberia, and Kodiak of Alaska. Nineteen alleles of the DRB exon 2 were identified from a total of 38 individuals of U. arctos and were highly polymorphic. Comparisons of non-synonymous and synonymous substitutions in the antigen-binding sites of deduced amino acid sequences indicated evidence for balancing selection on the bear DRB locus. The phylogenetic analysis of the DRB alleles among three genera (Ursus, Tremarctos, and Ailuropoda) in the family Ursidae revealed that DRB allelic lineages were not separated according to species. This strongly shows trans-species persistence of DRB alleles within the Ursidae. © 2010 John Wiley & Sons A/S.

  7. Solid phase sequencing of biopolymers

    DOEpatents

    Cantor, Charles; Koster, Hubert

    2010-09-28

    This invention relates to methods for detecting and sequencing target nucleic acid sequences, to mass modified nucleic acid probes and arrays of probes useful in these methods, and to kits and systems which contain these probes. Useful methods involve hybridizing the nucleic acids or nucleic acids which represent complementary or homologous sequences of the target to an array of nucleic acid probes. These probes comprise a single-stranded portion, an optional double-stranded portion and a variable sequence within the single-stranded portion. The molecular weights of the hybridized nucleic acids of the set can be determined by mass spectroscopy, and the sequence of the target determined from the molecular weights of the fragments. Nucleic acids whose sequences can be determined include DNA or RNA in biological samples such as patient biopsies and environmental samples. Probes may be fixed to a solid support such as a hybridization chip to facilitate automated molecular weight analysis and identification of the target sequence.

  8. N-Terminal Amino Acid Sequence Determination of Proteins by N-Terminal Dimethyl Labeling: Pitfalls and Advantages When Compared with Edman Degradation Sequence Analysis.

    PubMed

    Chang, Elizabeth; Pourmal, Sergei; Zhou, Chun; Kumar, Rupesh; Teplova, Marianna; Pavletich, Nikola P; Marians, Kenneth J; Erdjument-Bromage, Hediye

    2016-07-01

    In recent history, alternative approaches to Edman sequencing have been investigated, and to this end, the Association of Biomolecular Resource Facilities (ABRF) Protein Sequencing Research Group (PSRG) initiated studies in 2014 and 2015, looking into bottom-up and top-down N-terminal (Nt) dimethyl derivatization of standard quantities of intact proteins with the aim to determine Nt sequence information. We have expanded this initiative and used low picomole amounts of myoglobin to determine the efficiency of Nt-dimethylation. Application of this approach on protein domains, generated by limited proteolysis of overexpressed proteins, confirms that it is a universal labeling technique and is very sensitive when compared with Edman sequencing. Finally, we compared Edman sequencing and Nt-dimethylation of the same polypeptide fragments; results confirm that there is agreement in the identity of the Nt amino acid sequence between these 2 methods.

  9. Extension of the COG and arCOG databases by amino acid and nucleotide sequences

    PubMed Central

    Meereis, Florian; Kaufmann, Michael

    2008-01-01

    Background: The current versions of the COG and arCOG databases, both excellent frameworks for studies in comparative and functional genomics, do not contain the nucleotide sequences corresponding to their protein or protein domain entries. Results: Using sequence information obtained from GenBank flat files covering the completely sequenced genomes of the COG and arCOG databases, we constructed NUCOCOG (nucleotide sequences containing COG databases) as an extended version including all nucleotide sequences and in addition the amino acid sequences originally utilized to construct the current COG and arCOG databases. We make available three comprehensive single XML files containing the complete databases including all sequence information. In addition, we provide a web interface as a utility suitable to browse the NUCOCOG database for sequence retrieval. The database is accessible at . Conclusion: NUCOCOG offers the possibility to analyze any sequence-related property in the context of the COG and arCOG framework simply by using scripting languages such as Perl applied to a large but single XML document. PMID:19014535

  10. Multilocus sequence analysis (MLSA) of Bradyrhizobium strains: revealing high diversity of tropical diazotrophic symbiotic bacteria.

    PubMed

    Delamuta, Jakeline Renata Marçon; Ribeiro, Renan Augusto; Menna, Pâmela; Bangel, Eliane Villamil; Hungria, Mariangela

    2012-04-01

    Symbiotic association of several genera of bacteria collectively called rhizobia with plants belonging to the family Leguminosae (=Fabaceae) results in the process of biological nitrogen fixation, playing a key role in global N cycling and also making relevant contributions to agriculture. Bradyrhizobium is considered the ancestor of all nitrogen-fixing rhizobial species and probably originated in the tropics. The genus encompasses a variety of diverse bacteria, but the diversity captured in the analysis of the 16S rRNA is often low. In this study, we analyzed twelve Bradyrhizobium strains selected from previous studies performed by our group for showing high genetic diversity in relation to the described species. In addition to the 16S rRNA, five housekeeping genes (recA, atpD, glnII, gyrB and rpoB) were analyzed in the MLSA (multilocus sequence analysis) approach. Analysis of each gene and of the concatenated housekeeping genes captured a considerably higher level of genetic diversity, with indication of putative new species. The results highlight the high genetic variability associated with Bradyrhizobium microsymbionts of a variety of legumes. In addition, the MLSA approach has proved to be a rapid and reliable method for phylogenetic and taxonomic studies, speeding the identification of the still poorly known diversity of nitrogen-fixing rhizobia in the tropics.

  11. Conversion of amino-acid sequence in proteins to classical music: search for auditory patterns

    PubMed Central

    2007-01-01

    We have converted genome-encoded protein sequences into musical notes to reveal auditory patterns without compromising musicality. We derived a reduced range of 13 base notes by pairing similar amino acids and distinguishing them using variations of three-note chords and codon distribution to dictate rhythm. The conversion will help make genomic coding sequences more approachable for the general public, young children, and vision-impaired scientists. PMID:17477882

  12. The amino acid motif L/IIxxFE defines a novel actin-binding sequence in PDZ-RhoGEF

    PubMed Central

    Banerjee, Jayashree; Fischer, Christopher C.; Wedegaertner, Philip B.

    2009-01-01

    PDZ-RhoGEF is a member of the regulator of G protein signaling (RGS) domain-containing RhoGEFs (RGS-RhoGEFs) that link activated heterotrimeric G protein α subunits of the G12 family to activation of the small GTPase RhoA. Unique among the RGS-RhoGEFs, PDZ-RhoGEF contains a short sequence that localizes the protein to the actin cytoskeleton. In this report, we demonstrate that the actin-binding domain, located between amino acids 561–585, directly binds to F-actin in vitro. Extensive mutagenesis identifies isoleucine 568, isoleucine 569, phenylalanine 572, and glutamic acid 573 as necessary for binding to actin and for co-localization with the actin cytoskeleton in cells. These results define a novel actin-binding sequence in PDZ-RhoGEF with a critical amino acid motif of IIxxFE. Moreover, sequence analysis identifies a similar actin-binding motif in the N-terminus of the RhoGEF frabin, and, as with PDZ-RhoGEF, mutagenesis and actin interaction experiments demonstrate a motif of LIxxFE, consisting of the key amino acids leucine 23, isoleucine 24, phenylalanine 27, and glutamic acid 28. Taken together, results with PDZ-RhoGEF and frabin identify a novel actin binding sequence. Lastly, inducible dimerization of the actin-binding region of PDZ-RhoGEF revealed a dimerization-dependent actin bundling activity in vitro. PDZ-RhoGEF exists in cells as a dimer, raising the possibility that PDZ-RhoGEF could influence actin structure independent of its ability to activate RhoA. PMID:19618964
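
    The L/IIxxFE motif described above (leucine or isoleucine, then isoleucine, any two residues, phenylalanine, glutamic acid) can be written as a regular expression. The following is a minimal sketch only: the function name find_motif and the toy sequence are hypothetical illustrations, not the authors' method and not the real PDZ-RhoGEF or frabin sequences.

```python
import re

# L/IIxxFE: (L or I), I, any two residues, F, E.
# The lookahead lets overlapping occurrences be reported as well.
MOTIF = re.compile(r"(?=([LI]I..FE))")


def find_motif(seq):
    """Return (1-based start position, matched residues) for each occurrence."""
    return [(m.start() + 1, m.group(1)) for m in MOTIF.finditer(seq.upper())]


# Hypothetical toy sequence carrying one IIxxFE and one LIxxFE instance.
toy = "MKAIIQDFEGSLIPTFEK"
print(find_motif(toy))  # [(4, 'IIQDFE'), (12, 'LIPTFE')]
```

    A pattern match of this kind only flags candidate sites; the abstract establishes actin binding through mutagenesis and interaction experiments, which a sequence scan cannot replace.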

  13. In vivo generation of DNA sequence diversity for cellular barcoding

    PubMed Central

    Peikon, Ian D.; Gizatullina, Diana I.; Zador, Anthony M.

    2014-01-01

    Heterogeneity is a ubiquitous feature of biological systems. A complete understanding of such systems requires a method for uniquely identifying and tracking individual components and their interactions with each other. We have developed a novel method of uniquely tagging individual cells in vivo with a genetic ‘barcode’ that can be recovered by DNA sequencing. Our method is a two-component system comprised of a genetic barcode cassette whose fragments are shuffled by Rci, a site-specific DNA invertase. The system is highly scalable, with the potential to generate theoretical diversities in the billions. We demonstrate the feasibility of this technique in Escherichia coli. Currently, this method could be employed to track the dynamics of populations of microbes through various bottlenecks. Advances of this method should prove useful in tracking interactions of cells within a network, and/or heterogeneity within complex biological samples. PMID:25013177

  14. alpha-Amylase gene of Streptomyces limosus: nucleotide sequence, expression motifs, and amino acid sequence homology to mammalian and invertebrate alpha-amylases.

    PubMed Central

    Long, C M; Virolle, M J; Chang, S Y; Chang, S; Bibb, M J

    1987-01-01

    The nucleotide sequence of the coding and regulatory regions of the alpha-amylase gene (aml) of Streptomyces limosus was determined. High-resolution S1 mapping was used to locate the 5' end of the transcript and demonstrated that the gene is transcribed from a unique promoter. The predicted amino acid sequence has considerable identity to mammalian and invertebrate alpha-amylases, but not to those of plant, fungal, or eubacterial origin. Consistent with this is the susceptibility of the enzyme to an inhibitor of mammalian alpha-amylases. The amino-terminal sequence of the extracellular enzyme was determined, revealing the presence of a typical signal peptide preceding the mature form of the alpha-amylase. PMID:3500166

  15. Whole genome comparison of a large collection of mycobacteriophages reveals a continuum of phage genetic diversity

    PubMed Central

    Pope, Welkin H; Bowman, Charles A; Russell, Daniel A; Jacobs-Sera, Deborah; Asai, David J; Cresawn, Steven G; Jacobs, William R; Hendrix, Roger W; Lawrence, Jeffrey G; Hatfull, Graham F; Abbazia, Patrick; Ababio, Amma; Adam, Naazneen

    2015-01-01

    The bacteriophage population is large, dynamic, ancient, and genetically diverse. Limited genomic information shows that phage genomes are mosaic, and the genetic architecture of phage populations remains ill-defined. To understand the population structure of phages infecting a single host strain, we isolated, sequenced, and compared 627 phages of Mycobacterium smegmatis. Their genetic diversity is considerable, and there are 28 distinct genomic types (clusters) with related nucleotide sequences. However, amino acid sequence comparisons show pervasive genomic mosaicism, and quantification of inter-cluster and intra-cluster relatedness reveals a continuum of genetic diversity, albeit with uneven representation of different phages. Furthermore, rarefaction analysis shows that the mycobacteriophage population is not closed, and there is a constant influx of genes from other sources. Phage isolation and analysis was performed by a large consortium of academic institutions, illustrating the substantial benefits of a disseminated, structured program involving large numbers of freshman undergraduates in scientific discovery. DOI: http://dx.doi.org/10.7554/eLife.06416.001 PMID:25919952

  16. Whole genome comparison of a large collection of mycobacteriophages reveals a continuum of phage genetic diversity.

    PubMed

    Pope, Welkin H; Bowman, Charles A; Russell, Daniel A; Jacobs-Sera, Deborah; Asai, David J; Cresawn, Steven G; Jacobs, William R; Hendrix, Roger W; Lawrence, Jeffrey G; Hatfull, Graham F

    2015-04-28

    The bacteriophage population is large, dynamic, ancient, and genetically diverse. Limited genomic information shows that phage genomes are mosaic, and the genetic architecture of phage populations remains ill-defined. To understand the population structure of phages infecting a single host strain, we isolated, sequenced, and compared 627 phages of Mycobacterium smegmatis. Their genetic diversity is considerable, and there are 28 distinct genomic types (clusters) with related nucleotide sequences. However, amino acid sequence comparisons show pervasive genomic mosaicism, and quantification of inter-cluster and intra-cluster relatedness reveals a continuum of genetic diversity, albeit with uneven representation of different phages. Furthermore, rarefaction analysis shows that the mycobacteriophage population is not closed, and there is a constant influx of genes from other sources. Phage isolation and analysis was performed by a large consortium of academic institutions, illustrating the substantial benefits of a disseminated, structured program involving large numbers of freshman undergraduates in scientific discovery.

  17. In search of actionable targets for agrigenomics and microalgal biofuel production: sequence-structural diversity studies on algal and higher plants with a focus on GPAT protein.

    PubMed

    Misra, Namrata; Panda, Prasanna Kumar

    2013-04-01

    The triacylglycerol (TAG) pathway provides several targets for genetic engineering to optimize microalgal lipid productivity. GPAT (glycerol-3-phosphate acyltransferase) is a crucial enzyme that catalyzes the initial step of TAG biosynthesis. Despite many recent biochemical studies, a comprehensive sequence-structure analysis of GPAT across diverse lipid-yielding organisms is lacking. Hence, we performed a comparative genomic analysis of plastid-located GPAT proteins from 7 microalgae and 3 higher plant species. The close evolutionary relationship observed between red algae/diatoms and green algae/plant lineages in the phylogenetic tree was further corroborated by motif and gene structure analysis. The predicted molecular weight, amino acid composition, Instability Index, and hydropathicity profile gave an overall representation of the biochemical features of GPAT protein across the species under study. Furthermore, homology models of GPAT from Chlamydomonas reinhardtii, Arabidopsis thaliana, and Glycine max provided deep insights into the protein architecture and substrate binding sites. Despite the low sequence identity found between algal and plant GPATs, the developed models exhibited a strikingly conserved topology consisting of 14 α-helices and 9 β-sheets arranged in two domains. However, subtle variations in amino acids of the fatty acyl binding site were identified that might influence the substrate selectivity of GPAT. Together, the results will provide useful resources to understand the functional and evolutionary relationship of GPAT and potentially benefit the development of engineered enzymes for augmenting algal biofuel production.

  18. A flexible and economical barcoding approach for highly multiplexed amplicon sequencing of diverse target genes

    PubMed Central

    Herbold, Craig W.; Pelikan, Claus; Kuzyk, Orest; Hausmann, Bela; Angel, Roey; Berry, David; Loy, Alexander

    2015-01-01

    High-throughput sequencing of phylogenetic and functional gene amplicons provides tremendous insight into the structure and functional potential of complex microbial communities. Here, we introduce a highly adaptable and economical PCR approach to barcoding and pooling libraries of numerous target genes. In this approach, we replace gene- and sequencing platform-specific fusion primers with general, interchangeable barcoding primers, enabling nearly limitless customized barcode-primer combinations. Compared to barcoding with long fusion primers, our multiple-target gene approach is more economical because it requires fewer primers overall and is based on short primers with generally lower synthesis and purification costs. To highlight our approach, we pooled over 900 different small-subunit rRNA and functional gene amplicon libraries obtained from various environmental or host-associated microbial community samples into a single, paired-end Illumina MiSeq run. Although the amplicon regions ranged in size from approximately 290 to 720 bp, we found no significant systematic sequencing bias related to amplicon length or gene target. Our results indicate that this flexible multiplexing approach produces large, diverse, and high-quality sets of amplicon sequence data for modern studies in microbial ecology. PMID:26236305

  19. Preparation of Nucleic Acid Libraries for Personalized Sequencing Systems Using an Integrated Microfluidic Hub Technology (Seventh Annual Sequencing, Finishing, Analysis in the Future (SFAF) Meeting 2012)

    ScienceCinema

    Patel, Kamlesh D.

    2018-01-22

    Kamlesh (Ken) Patel from Sandia National Laboratories (Livermore, California) presents "Preparation of Nucleic Acid Libraries for Personalized Sequencing Systems Using an Integrated Microfluidic Hub Technology" at the 7th Annual Sequencing, Finishing, Analysis in the Future (SFAF) Meeting held in June 2012 in Santa Fe, NM.

  20. Preparation of Nucleic Acid Libraries for Personalized Sequencing Systems Using an Integrated Microfluidic Hub Technology (Seventh Annual Sequencing, Finishing, Analysis in the Future (SFAF) Meeting 2012)

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Patel, Kamlesh D.

    2012-06-01

    Kamlesh (Ken) Patel from Sandia National Laboratories (Livermore, California) presents "Preparation of Nucleic Acid Libraries for Personalized Sequencing Systems Using an Integrated Microfluidic Hub Technology" at the 7th Annual Sequencing, Finishing, Analysis in the Future (SFAF) Meeting held in June 2012 in Santa Fe, NM.

  1. Sequence diversity of wheat mosaic virus isolates.

    PubMed

    Stewart, Lucy R

    2016-02-02

    Wheat mosaic virus (WMoV), transmitted by eriophyid wheat curl mites (Aceria tosichella), is the causal agent of High Plains disease in wheat and maize. WMoV and other members of the genus Emaravirus evaded thorough molecular characterization for many years due to the experimental challenges of mite transmission and manipulating multisegmented negative-sense RNA genomes. Recently, the complete genome sequence of a Nebraska isolate of WMoV revealed eight segments, plus a variant sequence of the nucleocapsid protein-encoding segment. Here, near-complete and partial consensus sequences of five more WMoV isolates are reported and compared to the Nebraska (NE) isolate: an Ohio maize isolate (GG1), a Kansas barley isolate (KS7), and three Ohio wheat isolates (H1, K1, W1). Results show two distinct groups of WMoV isolates: Ohio wheat isolate RNA segments had 84% or lower nucleotide sequence identity to the NE isolate, whereas GG1 and KS7 had 98% or higher nucleotide sequence identity to the NE isolate. Knowledge of the sequence variability of WMoV isolates is a step toward understanding virus biology, and potentially explaining observed biological variation. Published by Elsevier B.V.
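
    Percent nucleotide identity figures such as those quoted above are commonly obtained by counting matching columns in a pairwise alignment. The abstract does not state which tool or gap convention was used, so the sketch below is an assumption throughout: the function name percent_identity, the choice to skip any column containing a gap, and the toy sequences are all hypothetical.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    Assumption: columns with a gap ('-') in either sequence are skipped.
    Other conventions (for example, counting gaps as mismatches) exist.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" or y == "-":
            continue  # skip gapped columns
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Hypothetical toy alignment, not real WMoV segment data.
reference = "ATGGCTAGCTAGGATCCGTA"
isolate = "ATGGCTAGCTAGGATCCGTT"
print(percent_identity(reference, isolate))  # 95.0
```

    The gap convention changes the result for alignments with indels, so identity values are only comparable when computed the same way.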

  2. Genetic diversity among Babesia rossi detected in naturally infected dogs in Abeokuta, Nigeria, based on 18S rRNA gene sequences.

    PubMed

    Takeet, Michael I; Oyewusi, Adeoye J; Abakpa, Simon A V; Daramola, Olukayode O; Peters, Sunday O

    2017-03-01

    Adequate knowledge of the genetic diversity among Babesia species infecting dogs is necessary for a better understanding of the epidemiology and control of canine babesiosis. Hence, this study determined the genetic diversity among the Babesia rossi detected in dogs presented for routine examination in veterinary hospitals in Abeokuta, Nigeria. Blood was randomly collected from 209 dogs. Field-stained thin smears were made and DNA was extracted from the blood. A partial region of the 18S small subunit ribosomal RNA (rRNA) gene was amplified, sequenced and analysed. Babesia species were detected in 16 (7.7%) of the dogs by microscopy. Electrophoresed PCR products from 39 (18.66%) dogs revealed a band size of 450 bp and 2 (0.95%) dogs had a band size of 430 bp. The sequences obtained from the 450 bp amplicon displayed homology of 99.74% (387/388) with partial sequences of the 18S rRNA gene of Babesia rossi in GenBank. Of the two sequences that had the 430 bp amplicon, one was identified as T. annulata and the second as T. ovis. A significantly (p<0.05) higher prevalence of B. rossi was detected by PCR compared to microscopy. The mean PCV of Babesia-infected dogs was significantly (p<0.05) lower than that of non-infected dogs. Phylogenetic analysis revealed minimal diversity among B. rossi with the exception of one sequence that was greatly divergent from the others. This study suggests that more than one genotype of B. rossi may be in circulation among the dog population in the study area and this may have potential implications for the clinical outcome of canine babesiosis.
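
    The percentages in this abstract follow directly from the stated counts, and a short calculation makes the arithmetic explicit. This is a sketch for checking the figures only; the helper name percent is hypothetical and nothing here reproduces the study's statistical tests.

```python
def percent(count: int, total: int) -> float:
    """Return count as a percentage of total."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * count / total


DOGS_SAMPLED = 209  # from the abstract

print(round(percent(16, DOGS_SAMPLED), 1))  # microscopy positives: 7.7
print(round(percent(39, DOGS_SAMPLED), 2))  # 450 bp PCR band: 18.66
print(round(percent(387, 388), 2))          # sequence identity: 99.74
```

    By the same calculation, 2 of 209 dogs comes to about 0.96%, marginally above the 0.95% printed in the abstract; the difference is consistent with truncation rather than rounding.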

  3. Sequence diversity among badnavirus isolates infecting black pepper and related species in India.

    PubMed

    Bhat, A I; Sasi, Shina; Revathy, K A; Deeshma, K P; Saji, K V

    2014-01-01

    The badnavirus piper yellow mottle virus (PYMoV) is known to infect black pepper (Piper nigrum), betelvine (P. betle) and Indian long pepper (P. longum) in India and other parts of the world. The occurrence of PYMoV or other badnaviruses in other species of Piper, and their variability, has not been reported so far. We have analysed sequence variability in the conserved putative reverse transcriptase (RT)/ribonuclease H (RNase H) coding region of the virus using specific badnavirus primers from 13 virus isolates of black pepper collected from different cultivars and regions and one isolate each from 23 other species of Piper. Of these, four species failed to produce the expected amplicon, while amplicons from four other species showed more similarity to plant sequences than to badnaviruses. Of the remaining, isolates from black pepper, P. argyrophyllum, P. attenuatum, P. barberi, P. betle, P. colubrinum, P. galeatum, P. longum, P. ornatum, P. sarmentosum and P. trichostachyon showed an identity of >85 % at the nucleotide and >90 % at the amino acid level with PYMoV, indicating that they are isolates of PYMoV. On the other hand, high sequence variability of 21-43 % at the nucleotide and 17-46 % at the amino acid level compared to PYMoV was found among isolates infecting P. bababudani, P. chaba, P. peepuloides, P. mullesua and P. thomsonii, suggesting the presence of new badnaviruses. Phylogenetic analyses showed close clustering of all PYMoV isolates that were well separated from other known badnaviruses. This is the first report of the occurrence of PYMoV in eight Piper spp. and the likely occurrence of four new species in five Piper spp.

  4. Investigation of Microbial Diversity in Geothermal Hot Springs in Unkeshwar, India, Based on 16S rRNA Amplicon Metagenome Sequencing.

    PubMed

    Mehetre, Gajanan T; Paranjpe, Aditi; Dastager, Syed G; Dharne, Mahesh S

    2016-02-25

    Microbial diversity in geothermal waters of the Unkeshwar hot springs in Maharashtra, India, was studied using 16S rRNA amplicon metagenomic sequencing. Taxonomic analysis revealed the presence of Bacteroidetes, Proteobacteria, Cyanobacteria, Actinobacteria, Archaea, and OD1 phyla. Metabolic function prediction analysis indicated a battery of biological information systems indicating rich and novel microbial diversity, with potential biotechnological applications in this niche. Copyright © 2016 Mehetre et al.

  5. Multiple DNA and protein sequence alignment on a workstation and a supercomputer.

    PubMed

    Tajima, K

    1988-11-01

    This paper describes a multiple alignment method using a workstation and supercomputer. The method is based on the alignment of a set of aligned sequences with the new sequence, and uses a recursive procedure of such alignment. The alignment is executed in a reasonable computation time on diverse levels from a workstation to a supercomputer, from the viewpoint of alignment results and computational speed by parallel processing. The application of the algorithm is illustrated by several examples of multiple alignment of 12 amino acid and DNA sequences of HIV (human immunodeficiency virus) env genes. Colour graphic programs on a workstation and parallel processing on a supercomputer are discussed.

  6. Genomic diversity and versatility of Lactobacillus plantarum, a natural metabolic engineer

    PubMed Central

    2011-01-01

    In the past decade it has become clear that the lactic acid bacterium Lactobacillus plantarum occupies a diverse range of environmental niches and has an enormous diversity in phenotypic properties, metabolic capacity and industrial applications. In this review, we describe how genome sequencing, comparative genome hybridization and comparative genomics have provided insight into the underlying genomic diversity and versatility of L. plantarum. One of the main features appears to be genomic life-style islands consisting of numerous functional gene cassettes, in particular for carbohydrate utilization, which can be acquired, shuffled, substituted or deleted in response to niche requirements. In this sense, L. plantarum can be considered a “natural metabolic engineer”. PMID:21995294

  7. Diversity and Genome Analysis of Australian and Global Oilseed Brassica napus L. Germplasm Using Transcriptomics and Whole Genome Re-sequencing.

    PubMed

    Malmberg, M Michelle; Shi, Fan; Spangenberg, German C; Daetwyler, Hans D; Cogan, Noel O I

    2018-01-01

    Intensive breeding of Brassica napus has resulted in relatively low diversity, such that B. napus would benefit from germplasm improvement schemes that sustain diversity. As such, samples representative of global germplasm pools need to be assessed for existing population structure, diversity and linkage disequilibrium (LD). Complexity reduction genotyping-by-sequencing (GBS) methods, including GBS-transcriptomics (GBS-t), enable cost-effective screening of a large number of samples, while whole genome re-sequencing (WGR) delivers the ability to generate large numbers of unbiased genomic single nucleotide polymorphisms (SNPs), and identify structural variants (SVs). Furthermore, the development of genomic tools based on whole genomes representative of global oilseed diversity and orientated by the reference genome has substantial industry relevance and will be highly beneficial for canola breeding. As recent studies have focused on European and Chinese varieties, a global diversity panel as well as a substantial number of Australian spring types were included in this study. Focusing on industry relevance, 633 varieties were initially genotyped using GBS-t to examine population structure using 61,037 SNPs. Subsequently, 149 samples representative of global diversity were selected for WGR and both data sets used for a side-by-side evaluation of diversity and LD. The WGR data was further used to develop genomic resources consisting of a list of 4,029,750 high-confidence SNPs annotated using SnpEff, and SVs in the form of 10,976 deletions and 2,556 insertions. These resources form the basis of a reliable and repeatable system allowing greater integration between canola genomics studies, with a strong focus on breeding germplasm and industry applicability.

  8. Partial amino acid sequence of the branched chain amino acid aminotransferase (TmB) of E. coli JA199 pDU11

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Feild, M.J.; Armstrong, F.B.

    1987-05-01

    E. coli JA199 pDU11 harbors a multicopy plasmid containing the ilv GEDAY gene cluster of S. typhimurium. TmB, gene product of ilv E, was purified, crystallized, and subjected to Edman degradation using a gas phase sequencer. The intact protein yielded an amino terminal 31 residue sequence. Both carboxymethylated apoenzyme and (³H)-NaBH-reduced holoenzyme were then subjected to digestion by trypsin. The digests were fractionated using reversed phase HPLC, and the peptides isolated were sequenced. The borohydride-treated holoenzyme was used to isolate the cofactor-binding peptide. The peptide is 27 residues long and a comparison with known sequences of other aminotransferases revealed limited homology. Peptides accounting for 211 of 288 predicted residues have been sequenced, including 9 residues of the carboxyl terminus. Comparison of peptides with the inferred amino acid sequence of the E. coli K-12 enzyme has helped determine the sequence of the amino terminal 59 residues; only two differences between the sequences are noted in this region.

  9. Environmental isolation explains Iberian genetic diversity in the highly homozygous model grass Brachypodium distachyon.

    PubMed

    Marques, Isabel; Shiposha, Valeriia; López-Alvarez, Diana; Manzaneda, Antonio J; Hernandez, Pilar; Olonova, Marina; Catalán, Pilar

    2017-06-15

    Brachypodium distachyon (Poaceae), an annual Mediterranean Aluminum (Al)-sensitive grass, is currently being used as a model species to provide new information on cereals and biofuel crops. The plant has a short life cycle and one of the smallest genomes in the grasses being well suited to experimental manipulation. Its genome has been fully sequenced and several genomic resources are being developed to elucidate key traits and gene functions. A reliable germplasm collection that reflects the natural diversity of this species is therefore needed for all these genomic resources. However, despite being a model plant, we still know very little about its genetic diversity. As a first step to overcome this gap, we used nuclear Simple Sequence Repeats (nSSR) to study the patterns of genetic diversity and population structure of B. distachyon in 14 populations sampled across the Iberian Peninsula (Spain), one of its best known areas. We found very low levels of genetic diversity, allelic number and heterozygosity in B. distachyon, congruent with a highly selfing system. Our results indicate the existence of at least three genetic clusters providing additional evidence for the existence of a significant genetic structure in the Iberian Peninsula and supporting this geographical area as an important genetic reservoir. Several hotspots of genetic diversity were detected and populations growing on basic soils were significantly more diverse than those growing in acidic soils. A partial Mantel test confirmed a statistically significant Isolation-By-Distance (IBD) among all studied populations, as well as a statistically significant Isolation-By-Environment (IBE) revealing the presence of environmental-driven isolation as one explanation for the genetic patterns found in the Iberian Peninsula. The finding of higher genetic diversity in eastern Iberian populations occurring in basic soils suggests that these populations can be better adapted than those occurring in western areas

  10. Increasing Sequence Diversity with Flexible Backbone Protein Design: The Complete Redesign of a Protein Hydrophobic Core

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Murphy, Grant S.; Mills, Jeffrey L.; Miley, Michael J.

    2015-10-15

    Protein design tests our understanding of protein stability and structure. Successful design methods should allow the exploration of sequence space not found in nature. However, when redesigning naturally occurring protein structures, most fixed backbone design algorithms return amino acid sequences that share strong sequence identity with wild-type sequences, especially in the protein core. This behavior places a restriction on functional space that can be explored and is not consistent with observations from nature, where sequences of low identity have similar structures. Here, we allow backbone flexibility during design to mutate every position in the core (38 residues) of a four-helix bundle protein. Only small perturbations to the backbone, 1-2 Å, were needed to entirely mutate the core. The redesigned protein, DRNN, is exceptionally stable (melting point >140 °C). An NMR and X-ray crystal structure show that the side chains and backbone were accurately modeled (all-atom RMSD = 1.3 Å).

  11. Biosynthetic multitasking facilitates thalassospiramide structural diversity in marine bacteria.

    PubMed

    Ross, Avena C; Xu, Ying; Lu, Liang; Kersten, Roland D; Shao, Zongze; Al-Suwailem, Abdulaziz M; Dorrestein, Pieter C; Qian, Pei-Yuan; Moore, Bradley S

    2013-01-23

    Thalassospiramides A and B are immunosuppressant cyclic lipopeptides first reported from the marine α-proteobacterium Thalassospira sp. CNJ-328. We describe here the discovery and characterization of an extended family of 14 new analogues from four Tistrella and Thalassospira isolates. These potent calpain 1 protease inhibitors belong to six structure classes in which the length and composition of the acylpeptide side chain varies extensively. Genomic sequence analysis of the thalassospiramide-producing microbes revealed related, genus-specific biosynthetic loci encoding hybrid nonribosomal peptide synthetase/polyketide synthases consistent with thalassospiramide assembly. The bioinformatics analysis of the gene clusters suggests that structural diversity, which ranges from the 803.4 Da thalassospiramide C to the 1291.7 Da thalassospiramide F, results from a complex sequence of reactions involving amino acid substrate channeling and enzymatic multimodule skipping and iteration. Preliminary biochemical analysis of the N-terminal nonribosomal peptide synthetase module from the Thalassospira TtcA megasynthase supports a biosynthetic model in which in cis amino acid activation competes with in trans activation to increase the range of amino acid substrates incorporated at the N terminus.

  12. Biosynthetic Multitasking Facilitates Thalassospiramide Structural Diversity in Marine Bacteria

    PubMed Central

    Ross, Avena C.; Xu, Ying; Lu, Liang; Kersten, Roland D.; Shao, Zongze; Al-Suwailem, Abdulaziz M.; Dorrestein, Pieter C.; Qian, Pei-Yuan; Moore, Bradley S.

    2013-01-01

    Thalassospiramides A and B are immunosuppressant cyclic lipopeptides first reported from the marine α-proteobacterium Thalassospira sp. CNJ-328. We describe here the discovery and characterization of an extended family of 14 new analogues from four Tistrella and Thalassospira isolates. These potent calpain 1 protease inhibitors belong to six structure classes in which the length and composition of the acylpeptide side chain varies extensively. Genomic sequence analysis of the thalassospiramide-producing microbes revealed related, genus-specific biosynthetic loci encoding hybrid nonribosomal peptide synthetase/polyketide synthases consistent with thalassospiramide assembly. The bioinformatics analysis of the gene clusters suggests that structural diversity, which ranges from the 803.4 Da thalassospiramide C to the 1291.7 Da thalassospiramide F, results from a complex sequence of reactions involving amino acid substrate channeling and enzymatic multi-module skipping and iteration. Preliminary biochemical analysis of the N-terminal NRPS module from the Thalassospira TtcA megasynthase supports a biosynthetic model in which in cis amino acid activation competes with in trans activation to increase the range of amino acid substrates incorporated at the N-terminus. PMID:23270364

  13. Rapid microsatellite marker development for African mahogany (Khaya senegalensis, Meliaceae) using next-generation sequencing and assessment of its intra-specific genetic diversity.

    PubMed

    Karan, M; Evans, D S; Reilly, D; Schulte, K; Wright, C; Innes, D; Holton, T A; Nikles, D G; Dickinson, G R

    2012-03-01

    Khaya senegalensis (African mahogany or dry-zone mahogany) is a high-value hardwood timber species with great potential for forest plantations in northern Australia. The species is distributed across the sub-Saharan belt from Senegal to Sudan and Uganda. Because of heavy exploitation and constraints on natural regeneration and sustainable planting, it is now classified as a vulnerable species. Here, we describe the development of microsatellite markers for K. senegalensis using next-generation sequencing to assess its intra-specific diversity across its natural range, which is a key for successful breeding programs and effective conservation management of the species. Next-generation sequencing yielded 93,943 sequences with an average read length of 234 bp. The assembled sequences contained 1030 simple sequence repeats, with primers designed for 522 microsatellite loci. Twenty-one microsatellite loci were tested with 11 showing reliable amplification and polymorphism in K. senegalensis. The 11 novel microsatellites, together with one previously published, were used to assess 73 accessions belonging to the Australian K. senegalensis domestication program, sampled from across the natural range of the species. STRUCTURE analysis shows two major clusters, one comprising mainly accessions from west Africa (Senegal to Benin) and the second based in the far eastern limits of the range in Sudan and Uganda. Higher levels of genetic diversity were found in material from western Africa. This suggests that new seed collections from this region may yield more diverse genotypes than those originating from Sudan and Uganda in eastern Africa. © 2011 Blackwell Publishing Ltd.
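    The abstract above reports simple sequence repeats (SSRs) found in assembled reads but does not say which tool or thresholds were used. As a minimal sketch of what SSR detection involves, the following scans a sequence for perfect tandem repeats with a regular expression. The function name find_ssrs and the thresholds (unit length 2 to 6, at least 5 repeats) are illustrative assumptions, not the authors' settings.

```python
import re

def find_ssrs(seq, min_repeats=5, min_unit=2, max_unit=6):
    """Return (start, unit, repeat_count) for perfect tandem repeats.

    Hypothetical helper for illustration only; thresholds are assumptions.
    """
    seq = seq.upper()
    # Lazy quantifier: try the shortest repeat unit first, then require
    # at least (min_repeats - 1) further copies of that unit.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(seq):
        unit = match.group(1)
        if len(set(unit)) == 1:
            # Skip homopolymer runs such as "AAAAAAAAAA".
            continue
        hits.append((match.start(), unit, len(match.group(0)) // len(unit)))
    return hits

if __name__ == "__main__":
    example = "TTG" + "AC" * 6 + "AGGT"
    print(find_ssrs(example))
```

    Only perfect repeats are reported here; dedicated SSR tools usually also handle imperfect and compound repeats and apply a different minimum repeat count to each unit length.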

  14. Multiplex PCR-Based Next-Generation Sequencing and Global Diversity of Seoul Virus in Humans and Rats.

    PubMed

    Kim, Won-Keun; No, Jin Sun; Lee, Seung-Ho; Song, Dong Hyun; Lee, Daesang; Kim, Jeong-Ah; Gu, Se Hun; Park, Sunhye; Jeong, Seong Tae; Kim, Heung-Chul; Klein, Terry A; Wiley, Michael R; Palacios, Gustavo; Song, Jin-Won

    2018-02-01

    Seoul virus (SEOV) poses a worldwide public health threat. This virus, which is harbored by Rattus norvegicus and R. rattus rats, is the causative agent of hemorrhagic fever with renal syndrome (HFRS) in humans, which has been reported in Asia, Europe, the Americas, and Africa. Defining SEOV genome sequences plays a critical role in development of preventive and therapeutic strategies against the unique worldwide hantavirus. We applied multiplex PCR-based next-generation sequencing to obtain SEOV genome sequences from clinical and reservoir host specimens. Epidemiologic surveillance of R. norvegicus rats in South Korea during 2000-2016 demonstrated that the serologic prevalence of enzootic SEOV infections was not significant on the basis of sex, weight (age), and season. Viral loads of SEOV in rats showed wide dissemination in tissues and dynamic circulation among populations. Phylogenetic analyses showed the global diversity of SEOV and possible genomic configuration of genetic exchanges.

  15. 37 CFR 1.822 - Symbols and format to be used for nucleotide and/or amino acid sequence data.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... base or modified or unusual amino acid may be presented in a given sequence as the corresponding unmodified base or amino acid if the modified base or modified or unusual amino acid is one of those listed... the Feature section. Otherwise, each occurrence of a base or amino acid not appearing in WIPO Standard...

  16. Fungal Diversity in Field Mold-Damaged Soybean Fruits and Pathogenicity Identification Based on High-Throughput rDNA Sequencing

    PubMed Central

    Liu, Jiang; Deng, Jun-cai; Yang, Cai-qiong; Huang, Ni; Chang, Xiao-li; Zhang, Jing; Yang, Feng; Liu, Wei-guo; Wang, Xiao-chun; Yong, Tai-wen; Du, Jun-bo; Shu, Kai; Yang, Wen-yu

    2017-01-01

    Continuous rain and an abnormally wet climate during harvest can easily lead to soybean plants being damaged by field mold (FM), which can reduce seed yield and quality. However, to date, the underlying pathogen and its resistance mechanism have remained unclear. The objective of the present study was to investigate the fungal diversity of various soybean varieties and to identify and confirm the FM pathogenic fungi. A total of 62,382 fungal ITS1 sequences clustered into 164 operational taxonomic units (OTUs) with 97% sequence similarity; 69 taxa were recovered from the samples by internal transcribed spacer (ITS) region sequencing. The fungal community compositions differed among the tested soybeans, with 42 OTUs being amplified from all varieties. The quadratic relationships between fungal diversity and organ-specific mildew indexes were analyzed, confirming that mildew on soybean pods can mitigate FM damage to the seeds. In addition, four potentially pathogenic fungi were isolated from FM-damaged soybean fruits; morphological and molecular identification confirmed these fungi as Aspergillus flavus, A. niger, Fusarium moniliforme, and Penicillium chrysogenum. Further re-inoculation experiments demonstrated that F. moniliforme is dominant among these FM pathogenic fungi. These results lay the foundation for future studies on mitigating or preventing FM damage to soybean. PMID:28515718

  17. A-to-I RNA Editing Contributes to Proteomic Diversity in Cancer.

    PubMed

    Peng, Xinxin; Xu, Xiaoyan; Wang, Yumeng; Hawke, David H; Yu, Shuangxing; Han, Leng; Zhou, Zhicheng; Mojumdar, Kamalika; Jeong, Kang Jin; Labrie, Marilyne; Tsang, Yiu Huen; Zhang, Minying; Lu, Yiling; Hwu, Patrick; Scott, Kenneth L; Liang, Han; Mills, Gordon B

    2018-05-14

    Adenosine (A) to inosine (I) RNA editing introduces many nucleotide changes in cancer transcriptomes. However, due to the complexity of post-transcriptional regulation, the contribution of RNA editing to proteomic diversity in human cancers remains unclear. Here, we performed an integrated analysis of TCGA genomic data and CPTAC proteomic data. Despite limited site diversity, we demonstrate that A-to-I RNA editing contributes to proteomic diversity in breast cancer through changes in amino acid sequences. We validate the presence of editing events at both RNA and protein levels. The edited COPA protein increases proliferation, migration, and invasion of cancer cells in vitro. Our study suggests an important contribution of A-to-I RNA editing to protein diversity in cancer and highlights its translational potential. Copyright © 2018 Elsevier Inc. All rights reserved.

  18. Describing the diversity of Ag specific receptors in vertebrates: Contribution of repertoire deep sequencing.

    PubMed

    Castro, Rosario; Navelsaker, Sofie; Krasnov, Aleksei; Du Pasquier, Louis; Boudinot, Pierre

    2017-10-01

    During the last decades, gene and cDNA cloning identified TCR and Ig genes across vertebrates; genome sequencing of TCR and Ig loci in many species revealed the different organizations selected during evolution under the pressure of generating diverse repertoires of Ag receptors. By detecting clonotypes over a wide range of frequency, deep sequencing of Ig and TCR transcripts provides a new way to compare the structure of expressed repertoires in species of various sizes, at different stages of development, with different physiologies, and displaying multiple adaptations to the environment. In this review, we provide a short overview of the technologies currently used to produce global description of immune repertoires, describe how they have already been used in comparative immunology, and we discuss the future potential of such approaches. The development of these methodologies in new species holds promise for new discoveries concerning particular adaptations. As an example, understanding the development of adaptive immunity across metamorphosis in frogs has been made possible by such approaches. Repertoire sequencing is now widely used, not only in basic research but also in the context of immunotherapy and vaccination. Analysis of fish responses to pathogens and vaccines has already benefited from these methods. Finally, we also discuss potential advances based on repertoire sequencing of multigene families of immune sensors and effectors in invertebrates. Copyright © 2017 Elsevier Ltd. All rights reserved.

  19. Constancy and diversity in the flavivirus fusion peptide.

    PubMed

    Seligman, Stephen J

    2008-02-14

    Flaviviruses include the mosquito-borne dengue, Japanese encephalitis, yellow fever and West Nile and the tick-borne encephalitis viruses. They are responsible for considerable world-wide morbidity and mortality. Viral entry is mediated by a conserved fusion peptide containing 16 amino acids located in domain II of the envelope protein E. Highly orchestrated conformational changes initiated by exposure to acidic pH accompany the fusion process and are important factors limiting amino acid changes in the fusion peptide that still permit fusion with host cell membranes in both arthropod and vertebrate hosts. The cell-fusing related agents, growing only in mosquitoes or insect cell lines, possess a different homologous peptide. Analysis of 46 named flaviviruses deposited in the Entrez Nucleotides database extended the constancy in the canonical fusion peptide sequences of mosquito-borne, tick-borne and viruses with no known vector to include more recently-sequenced viruses. The mosquito-borne signature amino acid, G104, was also found in flaviviruses with no known vector and with the cell-fusion related viruses. Despite the constancy in the canonical sequences in pathogenic flaviviruses, mutations were surprisingly frequent with a 27% prevalence of nonsynonymous mutations in yellow fever virus fusion peptide sequences, and 0 to 7.4% prevalence in the others. Six of seven yellow fever patients whose virus had fusion peptide mutations died. In the cell-fusing related agents, not enough sequences have been deposited to estimate reliably the prevalence of fusion peptide mutations. However, the canonical sequences homologous to the fusion peptide and the pattern of disulfide linkages in protein E differed significantly from the other flaviviruses. The constancy of the canonical fusion peptide sequences in the arthropod-borne flaviviruses contrasts with the high prevalence of mutations in most individual viruses. The discrepancy may be the result of a survival advantage

  20. Microwave-assisted acid and base hydrolysis of intact proteins containing disulfide bonds for protein sequence analysis by mass spectrometry.

    PubMed

    Reiz, Bela; Li, Liang

    2010-09-01

    Controlled hydrolysis of proteins to generate peptide ladders combined with mass spectrometric analysis of the resultant peptides can be used for protein sequencing. In this paper, two methods of improving the microwave-assisted protein hydrolysis process are described to enable rapid sequencing of proteins containing disulfide bonds and increase sequence coverage, respectively. It was demonstrated that proteins containing disulfide bonds could be sequenced by MS analysis by first performing hydrolysis for less than 2 min, followed by 1 h of reduction to release the peptides originally linked by disulfide bonds. It was shown that a strong base could be used as a catalyst for microwave-assisted protein hydrolysis, producing complementary sequence information to that generated by microwave-assisted acid hydrolysis. However, using either acid or base hydrolysis, amide bond breakages in small regions of the polypeptide chains of the model proteins (e.g., cytochrome c and lysozyme) were not detected. Dynamic light scattering measurement of the proteins solubilized in an acid or base indicated that protein-protein interaction or aggregation was not the cause of the failure to hydrolyze certain amide bonds. It was speculated that there were some unknown local structures that might play a role in preventing an acid or base from reacting with the peptide bonds therein. 2010 American Society for Mass Spectrometry. Published by Elsevier Inc. All rights reserved.

  1. Sequence Based Structural Characterization and Genetic Diversity Analysis of Full Length TLR4 CDS in Crossbred and Indigenous Cattle.

    PubMed

    Mishra, Chinmoy; Kumar, Subodh; Sonwane, Arvind Asaram; Yathish, H M; Chaudhary, Rajni

    2017-01-02

    The exploration of candidate genes for immune response in cattle may be vital for improving our understanding regarding the species specific response to pathogens. Toll-like receptor 4 (TLR4) is mostly involved in protection against the deleterious effects of Gram negative pathogens. Approximately 2.6 kb long cDNA sequence of TLR4 gene covering the entire coding region was characterized in two Indian milk cattle (Vrindavani and Tharparkar). The phylogenetic analysis confirmed that the bovine TLR4 apparently evolved from an ancestral form that predated the appearance of vertebrates, and it is grouped with buffalo, yak, and mithun TLR4s. Sequence analysis revealed a 2526-nucleotide long open reading frame (ORF) encoding 841 amino acids, similar to other cattle breeds. The calculated molecular weight of the translated ORF was 96144 and 96040.9 Da; the isoelectric point was 6.35 and 6.42 in Vrindavani and Tharparkar cattle, respectively. The Simple Modular Architecture Research Tool (SMART) analysis identified 14 leucine rich repeats (LRR) motifs in bovine TLR4 protein. The deduced TLR4 amino acid sequence of Tharparkar had 4 different substitutions as compared to Bos taurus, Sahiwal, and Vrindavani. The signal peptide cleavage site was predicted to lie between the 16th and 17th amino acids of the mature peptide. The transmembrane helix was identified between amino acids 635 and 657 in the mature peptide.
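    The ORF length and protein length quoted above are consistent if the 2526 nucleotides include the stop codon: 841 codons plus one stop codon is 842 x 3 = 2526. A minimal sketch of that arithmetic follows; the function names are hypothetical, and the inclusion of the stop codon is an assumption inferred from the numbers rather than something the abstract states.

```python
def orf_length_nt(n_amino_acids, include_stop=True):
    """Nucleotide length of an ORF encoding n_amino_acids residues."""
    # Each residue is one codon of 3 nt; the stop codon adds 3 more.
    return 3 * n_amino_acids + (3 if include_stop else 0)

def protein_length(orf_nt, include_stop=True):
    """Number of residues encoded by an ORF of orf_nt nucleotides."""
    if orf_nt % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    codons = orf_nt // 3
    return codons - 1 if include_stop else codons

if __name__ == "__main__":
    print(orf_length_nt(841))    # ORF length in nt, stop codon included
    print(protein_length(2526))  # residues encoded, stop codon excluded
```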

  2. Multilocus sequence typing, biochemical and antibiotic resistance characterizations reveal diversity of North American strains of the honey bee pathogen Paenibacillus larvae.

    PubMed

    Krongdang, Sasiprapa; Evans, Jay D; Pettis, Jeffery S; Chantawannakul, Panuwan

    2017-01-01

    Paenibacillus larvae is a Gram positive bacterium and the causative agent of the most widespread fatal brood disease of honey bees, American foulbrood (AFB). A total of thirty-three independent Paenibacillus larvae isolates from various geographical origins in North America and five reference strains were investigated for genetic diversity using multilocus sequence typing (MLST). This technique is regarded as a powerful tool for epidemiological studies of pathogenic bacteria and is widely used in genotyping assays. For MLST, seven housekeeping gene loci, ilvD (dihydroxy-acid dehydratase), tri (triosephosphate isomerase), purH (phosphoribosyl-aminoimidazolecarboxamide), recF (DNA replication and repair protein), pyrE (orotate phosphoribosyltransferase), sucC (succinyl coenzyme A synthetase β subunit) and glpF (glycerol uptake facilitator protein) were studied and applied for primer designs. Previously, ERIC type DNA fingerprinting was applied to these same isolates and the data showed that almost all represented the ERIC I type, whereas using BOX-PCR gave an indication of more diversity. All isolates were screened for resistance to four antibiotics used by U.S. beekeepers, showing extensive resistance to tetracycline and the first records of resistance to tylosin and lincomycin. Our data highlight the intraspecies relationships of P. larvae and the potential application of MLST methods in enhancing our understanding of epidemiological relationships among bacterial isolates of different origins.

  3. Size and sequence polymorphisms in the glutamate-rich protein gene of the human malaria parasite Plasmodium falciparum in Thailand.

    PubMed

    Pattaradilokrat, Sittiporn; Trakoolsoontorn, Chawinya; Simpalipan, Phumin; Warrit, Natapot; Kaewthamasorn, Morakot; Harnyuttanakorn, Pongchai

    2018-01-22

    The glutamate-rich protein (GLURP) of the malaria parasite Plasmodium falciparum is a key surface antigen that serves as a component of a clinical vaccine. Moreover, the GLURP gene is also employed routinely as a genetic marker for malarial genotyping in epidemiological studies. While extensive size polymorphisms in GLURP are well recorded, the extent of the sequence diversity of this gene is rarely investigated. The present study aimed to explore the genetic diversity of GLURP in natural populations of P. falciparum. The polymorphic C-terminal repetitive R2 region of GLURP sequences from 65 P. falciparum isolates in Thailand were generated and combined with the data from 103 worldwide isolates to generate a GLURP database. The collection comprised 168 alleles, encoding 105 unique GLURP subtypes, characterized by 18 types of amino acid repeat units (AAU). Of these, 28 GLURP subtypes, formed by 10 AAU types, were detected in P. falciparum in Thailand. Among them, 19 GLURP subtypes and 2 AAU types are described for the first time in the Thai parasite population. The AAU sequences were highly conserved, which is likely due to negative selection. Standard Fst analysis revealed the shared distributions of GLURP types among the P. falciparum populations, providing evidence of gene flow among the different demographic populations. Sequence diversity causing size variations in GLURP was detected in Thai P. falciparum populations and was caused by non-synonymous substitutions in repeat units and some insertions/deletions of aspartic acid or glutamic acid codons between repeat units. The P. falciparum population structure based on GLURP showed promising implications for the development of GLURP-based vaccines and for monitoring vaccine efficacy.

  4. Prospects for Fungal Bioremediation of Acidic Radioactive Waste Sites: Characterization and Genome Sequence of Rhodotorula taiwanensis MD1149.

    PubMed

    Tkavc, Rok; Matrosova, Vera Y; Grichenko, Olga E; Gostinčar, Cene; Volpe, Robert P; Klimenkova, Polina; Gaidamakova, Elena K; Zhou, Carol E; Stewart, Benjamin J; Lyman, Mathew G; Malfatti, Stephanie A; Rubinfeld, Bonnee; Courtot, Melanie; Singh, Jatinder; Dalgard, Clifton L; Hamilton, Theron; Frey, Kenneth G; Gunde-Cimerman, Nina; Dugan, Lawrence; Daly, Michael J

    2017-01-01

    Highly concentrated radionuclide waste produced during the Cold War era is stored at US Department of Energy (DOE) production sites. This radioactive waste was often highly acidic and mixed with heavy metals, and has been leaking into the environment since the 1950s. Because of the danger and expense of cleanup of such radioactive sites by physicochemical processes, in situ bioremediation methods are being developed for cleanup of contaminated ground and groundwater. To date, the most developed microbial treatment proposed for high-level radioactive sites employs the radiation-resistant bacterium Deinococcus radiodurans. However, the use of Deinococcus spp. and other bacteria is limited by their sensitivity to low pH. We report the characterization of 27 diverse environmental yeasts for their resistance to ionizing radiation (chronic and acute), heavy metals, pH minima, temperature maxima and optima, and their ability to form biofilms. Remarkably, many yeasts are extremely resistant to ionizing radiation and heavy metals. They also excrete carboxylic acids and are exceptionally tolerant to low pH. A special focus is placed on Rhodotorula taiwanensis MD1149, which was the most resistant to acid and gamma radiation. MD1149 is capable of growing under 66 Gy/h at pH 2.3 and in the presence of high concentrations of mercury and chromium compounds, and forming biofilms under high-level chronic radiation and low pH. We present the whole genome sequence and annotation of R. taiwanensis strain MD1149, with a comparison to other Rhodotorula species. This survey elevates yeasts to the frontier of biology's most radiation-resistant representatives, presenting a strong rationale for a role of fungi in bioremediation of acidic radioactive waste sites.

  5. Prospects for Fungal Bioremediation of Acidic Radioactive Waste Sites: Characterization and Genome Sequence of Rhodotorula taiwanensis MD1149

    PubMed Central

    Tkavc, Rok; Matrosova, Vera Y.; Grichenko, Olga E.; Gostinčar, Cene; Volpe, Robert P.; Klimenkova, Polina; Gaidamakova, Elena K.; Zhou, Carol E.; Stewart, Benjamin J.; Lyman, Mathew G.; Malfatti, Stephanie A.; Rubinfeld, Bonnee; Courtot, Melanie; Singh, Jatinder; Dalgard, Clifton L.; Hamilton, Theron; Frey, Kenneth G.; Gunde-Cimerman, Nina; Dugan, Lawrence; Daly, Michael J.

    2018-01-01

    Highly concentrated radionuclide waste produced during the Cold War era is stored at US Department of Energy (DOE) production sites. This radioactive waste was often highly acidic and mixed with heavy metals, and has been leaking into the environment since the 1950s. Because of the danger and expense of cleanup of such radioactive sites by physicochemical processes, in situ bioremediation methods are being developed for cleanup of contaminated ground and groundwater. To date, the most developed microbial treatment proposed for high-level radioactive sites employs the radiation-resistant bacterium Deinococcus radiodurans. However, the use of Deinococcus spp. and other bacteria is limited by their sensitivity to low pH. We report the characterization of 27 diverse environmental yeasts for their resistance to ionizing radiation (chronic and acute), heavy metals, pH minima, temperature maxima and optima, and their ability to form biofilms. Remarkably, many yeasts are extremely resistant to ionizing radiation and heavy metals. They also excrete carboxylic acids and are exceptionally tolerant to low pH. A special focus is placed on Rhodotorula taiwanensis MD1149, which was the most resistant to acid and gamma radiation. MD1149 is capable of growing under 66 Gy/h at pH 2.3 and in the presence of high concentrations of mercury and chromium compounds, and forming biofilms under high-level chronic radiation and low pH. We present the whole genome sequence and annotation of R. taiwanensis strain MD1149, with a comparison to other Rhodotorula species. This survey elevates yeasts to the frontier of biology's most radiation-resistant representatives, presenting a strong rationale for a role of fungi in bioremediation of acidic radioactive waste sites. PMID:29375494

  6. Microbial Ecology and Evolution in the Acid Mine Drainage Model System.

    PubMed

    Huang, Li-Nan; Kuang, Jia-Liang; Shu, Wen-Sheng

    2016-07-01

    Acid mine drainage (AMD) is a unique ecological niche for acid- and toxic-metals-adapted microorganisms. These low-complexity systems offer a special opportunity for the ecological and evolutionary analyses of natural microbial assemblages. The last decade has witnessed an unprecedented interest in the study of AMD communities using 16S rRNA high-throughput sequencing and community genomic and postgenomic methodologies, significantly advancing our understanding of microbial diversity, community function, and evolution in acidic environments. This review describes new data on AMD microbial ecology and evolution, especially dynamics of microbial diversity, community functions, and population genomes, and further identifies gaps in our current knowledge that future research, with integrated applications of meta-omics technologies, will fill. Copyright © 2016 Elsevier Ltd. All rights reserved.

  7. Evaluation of genetic diversity amongst Descurainia sophia L. genotypes by inter-simple sequence repeat (ISSR) marker.

    PubMed

    Saki, Sahar; Bagheri, Hedayat; Deljou, Ali; Zeinalabedini, Mehrshad

    2016-01-01

    Descurainia sophia is a valuable medicinal plant in the family Brassicaceae. To determine the range of diversity amongst D. sophia in Iran, 32 naturally distributed plants belonging to six natural populations of the Iranian plateau were investigated by inter-simple sequence repeat (ISSR) markers. The average percentage of polymorphism produced by 12 ISSR primers was 86%. The PIC values for primers ranged from 0.22 to 0.40 and Rp values ranged between 6.5 and 19.9. The relative genetic diversity of the populations was not high (Gst = 0.32). However, the value of gene flow revealed by the ISSR marker was high (Nm = 1.03). UPGMA clustering method based on Jaccard similarity coefficient grouped the genotypes into two major clusters. Graph results from Neighbor-Net Network generated after a 1000 bootstrap test using Jaccard coefficient, and STRUCTURE analysis confirmed the UPGMA clustering. The first three PCAs represented 57.31% of the total variation. High levels of genetic diversity were observed within populations, which is useful in breeding and conservation programs. ISSR was found to be a suitable marker to study genetic diversity of D. sophia.
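    The clustering described above rests on the Jaccard similarity coefficient computed between genotypes scored for band presence or absence. A minimal sketch of that coefficient follows, assuming each genotype is a list of 0/1 scores across the same ISSR bands. The band profiles are invented for illustration and are not data from the study.

```python
def jaccard(profile_a, profile_b):
    """Jaccard similarity between two binary band profiles.

    Shared present bands divided by bands present in either profile;
    bands absent from both are ignored.
    """
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must score the same set of bands")
    both = sum(1 for a, b in zip(profile_a, profile_b) if a and b)
    either = sum(1 for a, b in zip(profile_a, profile_b) if a or b)
    # Two profiles with no bands at all are treated as having 0 similarity.
    return both / either if either else 0.0

if __name__ == "__main__":
    # Hypothetical presence/absence scores for five bands.
    genotype_1 = [1, 1, 0, 1, 0]
    genotype_2 = [1, 0, 0, 1, 1]
    print(jaccard(genotype_1, genotype_2))
```

    A pairwise matrix of 1 minus this value gives the distances that a method such as UPGMA would then cluster.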

  8. Nucleic Acid Immunity.

    PubMed

    Hartmann, G

    2017-01-01

    Organisms throughout biology need to maintain the integrity of their genome. From bacteria to vertebrates, life has established sophisticated mechanisms to detect and eliminate foreign genetic material or to restrict its function and replication. Tremendous progress has been made in the understanding of these mechanisms which keep foreign or unwanted nucleic acids from viruses or phages in check. Mechanisms reach from restriction-modification systems and CRISPR/Cas in bacteria and archaea to RNA interference and immune sensing of nucleic acids, altogether integral parts of a system which is now appreciated as nucleic acid immunity. With inherited receptors and acquired sequence information, nucleic acid immunity comprises innate and adaptive components. Effector functions include diverse nuclease systems, intrinsic activities to directly restrict the function of foreign nucleic acids (e.g., PKR, ADAR1, IFIT1), and extrinsic pathways to alert the immune system and to elicit cytotoxic immune responses. These effects act in concert to restrict viral replication and to eliminate virus-infected cells. The principles of nucleic acid immunity are highly relevant for human disease. Besides its essential contribution to antiviral defense and restriction of endogenous retroelements, dysregulation of nucleic acid immunity can also lead to erroneous detection and response to self nucleic acids then causing sterile inflammation and autoimmunity. Even mechanisms of nucleic acid immunity which are not established in vertebrates are relevant for human disease when they are present in pathogens such as bacteria, parasites, or helminths or in pathogen-transmitting organisms such as insects. This review aims to provide an overview of the diverse mechanisms of nucleic acid immunity which mostly have been looked at separately in the past and to integrate them under the framework nucleic acid immunity as a basic principle of life, the understanding of which has great potential to

  9. Marked Genomic Diversity of Norovirus Genogroup I Strains in a Waterborne Outbreak

    PubMed Central

    Hannoun, Charles; Larsson, Charlotte U.; Bergström, Tomas

    2012-01-01

    Marked norovirus (NoV) diversity was detected in patient samples from a large community outbreak of gastroenteritis with waterborne epidemiology affecting approximately 2,400 people. NoV was detected in 33 of 50 patient samples examined by group-specific real-time reverse transcription-PCR. NoV genogroup I (GI) strains predominated in 31 patients, with mixed GI infections occurring in 5 of these patients. Sequence analysis of RNA-dependent polymerase-N/S capsid-coding regions (∼900 nucleotides in length) confirmed the dominance of the GI strains (n = 36). Strains of NoV GI.4 (n = 21) and GI.7 (n = 9) were identified, but six strains required full capsid amino acid analyses (530 to 550 amino acids) based on control sequencing of cloned amplicons before the virus genotype could be determined. Three strains were assigned to a new NoV GI genotype, proposed as GI.9, based on capsid amino acid analyses showing 26% dissimilarity from the established genotypes GI.1 to GI.8. Three other strains grouped in a sub-branch of GI.3 with 13 to 15% amino acid dissimilarity to GI.3 GenBank reference strains. Phylogenetic analysis (2.1 kb) of 10 representative strains confirmed these genotype clusters. Strains of NoV GII.4 (n = 1), NoV GII.6 (n = 2), sapovirus GII.2 (n = 1), rotavirus (n = 3), adenovirus (n = 1), and Campylobacter spp. (n = 2) were detected as single infections or as mixtures with NoV GI. Marked NoV GI diversity detected in patients was consistent with epidemiologic evidence of waterborne NoV infections, suggesting human fecal contamination of the water supply. Recognition of NoV diversity in a cluster of patients provided a useful warning marker of waterborne contamination in the Lilla Edet outbreak. PMID:22247153

  10. Penicillin-resistant, ampicillin-susceptible Enterococcus faecalis of hospital origin: pbp4 gene polymorphism and genetic diversity.

    PubMed

    Conceição, Natália; da Silva, Lucas Emanuel Pinheiro; Darini, Ana Lúcia da Costa; Pitondo-Silva, André; de Oliveira, Adriana Gonçalves

    2014-12-01

    Despite the spread of penicillin-resistant, ampicillin-susceptible Enterococcus faecalis (PRASEF) isolates in diverse countries, the mechanisms leading to this unusual resistance phenotype have not yet been investigated. The aim of this study was to evaluate whether polymorphism in the pbp4 gene is associated with penicillin resistance in PRASEF isolates and to determine their genetic diversity. E. faecalis isolates were recovered from different clinical specimens of hospitalized patients from February 2006 to June 2010. The β-lactam minimal inhibitory concentrations (MICs) were determined by E-test®. The PCR-amplified pbp4 gene was sequenced with an automated sequencer. The genetic diversities of the isolates were established by PFGE (pulsed-field gel electrophoresis) and MLST (multilocus sequence typing). Seventeen non-β-lactamase-producing PRASEF and 10 penicillin-susceptible, ampicillin-susceptible E. faecalis (PSASEF) strains were analyzed. A single-amino-acid substitution (Asp-573→Glu) in the penicillin-binding domain was significantly found in all PRASEF isolates by sequencing of the pbp4 gene but not in the penicillin-susceptible isolates. In contrast to the PSASEF isolates, a majority of the PRASEFs had similar PFGE profiles. Six representative PRASEF isolates were resolved by MLST into ST9 and ST524 and belong to the globally dispersed clonal complex 9 (CC9). In conclusion, it appears quite likely that the amino acid alteration (Asp-573→Glu) found in the PBP4 of the Brazilian PRASEF isolates may account for their reduced susceptibility to penicillin, although other resistance mechanisms remain to be investigated. Copyright © 2014 Elsevier B.V. All rights reserved.

  11. Sequence-Dependent Self-Assembly and Structural Diversity of Islet Amyloid Polypeptide-Derived β-Sheet Fibrils

    DOE PAGES

    Wang, Shih-Ting; Lin, Yiyang; Spencer, Ryan K.; ...

    2017-08-03

    Determining the structural origins of amyloid fibrillation is essential for understanding both the pathology of amyloidosis and the rational design of inhibitors to prevent or reverse amyloid formation. In this work, the decisive roles of peptide structures on amyloid self-assembly and morphological diversity were investigated by the design of eight amyloidogenic peptides derived from islet amyloid polypeptide. Among the segments, two distinct morphologies were highlighted in the form of twisted and planar (untwisted) ribbons with varied diameters, thicknesses, and lengths. In particular, transformation of amyloid fibrils from twisted ribbons into untwisted structures was triggered by substitution of the C-terminal serine with threonine, where the side chain methyl group was responsible for the distinct morphological change. This effect was confirmed following serine substitution with alanine and valine and was ascribed to the restriction of intersheet torsional strain through the increased hydrophobic interactions and hydrogen bonding. We also studied the variation of fibril morphology (i.e., association and helicity) and peptide aggregation propensity by increasing the hydrophobicity of the peptide side group, capping the N-terminus, and extending sequence length. Lastly, we anticipate that our insights into sequence-dependent fibrillation and morphological diversity will shed light on the structural interpretation of amyloidogenesis and development of structure-specific imaging agents and aggregation inhibitors.

  12. Statistical distribution of amino acid sequences: a proof of Darwinian evolution.

    PubMed

    Eitner, Krystian; Koch, Uwe; Gaweda, Tomasz; Marciniak, Jedrzej

    2010-12-01

    The article presents results of the listing of the quantity of amino acids, dipeptides and tripeptides for all proteins available in the UNIPROT-TREMBL database and the listing for selected species and enzymes. UNIPROT-TREMBL contains protein sequences associated with computationally generated annotations and large-scale functional characterization. Due to the distinct metabolic pathways of amino acid syntheses and their physicochemical properties, the quantities of subpeptides in proteins vary. We have proved that the distribution of amino acids, dipeptides and tripeptides is statistical, which confirms that the evolutionary biodiversity development model is subject to the theory of independent events. It seems interesting that certain short peptide combinations occur relatively rarely or even not at all. First, it confirms the Darwinian theory of evolution and second, it opens up opportunities for designing pharmaceuticals among rarely represented short peptide combinations. Furthermore, an innovative approach to the mass analysis of bioinformatic data is presented. Contact: eitner@amu.edu.pl. Supplementary data are available at Bioinformatics online.
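
    The counting step described in this abstract can be sketched as follows. This is a minimal illustration only, not the authors' pipeline: the function name `count_subpeptides` and the toy sequences are hypothetical, and a real run would read sequences from a UniProt-TrEMBL download rather than a hard-coded list.

```python
from collections import Counter

def count_subpeptides(sequences, k):
    """Count every overlapping length-k subpeptide across all sequences.

    k=1 gives amino acid counts, k=2 dipeptides, k=3 tripeptides.
    """
    counts = Counter()
    for seq in sequences:
        # Slide a window of width k along the sequence, one residue at a time.
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    return counts

# Hypothetical toy sequences, not taken from any database.
toy = ["MKVLA", "MKKLA"]
dipeptides = count_subpeptides(toy, 2)
print(dipeptides.most_common(3))
```

    Subpeptides that never occur simply have no entry in the counter, so rare or absent combinations can be found by comparing the keys against the full set of 20^k possible strings.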

  13. Ultrasmall Peptides Self-Assemble into Diverse Nanostructures: Morphological Evaluation and Potential Implications

    PubMed Central

    Lakshmanan, Anupama; Hauser, Charlotte A.E.

    2011-01-01

    In this study, we perform a morphological evaluation of the diverse nanostructures formed by varying concentration and amino acid sequence of a unique class of ultrasmall self-assembling peptides. We modified these peptides by replacing the aliphatic amino acid at the C-aliphatic terminus with different aromatic amino acids. We tracked the effect of introducing aromatic residues on self-assembly and morphology of resulting nanostructures. Whereas aliphatic peptides formed long, helical fibers that entangle into meshes and entrap >99.9% water, the modified peptides contrastingly formed short, straight fibers with a flat morphology. No helical fibers were observed for the modified peptides. For the aliphatic peptides at low concentrations, different supramolecular assemblies such as hollow nanospheres and membrane blebs were found. Since the ultrasmall peptides are made of simple, aliphatic amino acids, considered to have existed in the primordial soup, study of these supramolecular assemblies could be relevant to understanding chemical evolution leading to the origin of life on Earth. In particular, we propose a variety of potential applications in bioengineering and nanotechnology for the diverse self-assembled nanostructures. PMID:22016623

  14. Deep sequencing reveals exceptional diversity and modes of transmission for bacterial sponge symbionts

    PubMed Central

    Webster, Nicole S; Taylor, Michael W; Behnam, Faris; Lücker, Sebastian; Rattei, Thomas; Whalan, Stephen; Horn, Matthias; Wagner, Michael

    2010-01-01

    Marine sponges contain complex bacterial communities of considerable ecological and biotechnological importance, with many of these organisms postulated to be specific to sponge hosts. Testing this hypothesis in light of the recent discovery of the rare microbial biosphere, we investigated three Australian sponges by massively parallel 16S rRNA gene tag pyrosequencing. Here we show bacterial diversity that is unparalleled in an invertebrate host, with more than 250 000 sponge-derived sequence tags being assigned to 23 bacterial phyla and revealing up to 2996 operational taxonomic units (95% sequence similarity) per sponge species. Of the 33 previously described ‘sponge-specific’ clusters that were detected in this study, 48% were found exclusively in adults and larvae – implying vertical transmission of these groups. The remaining taxa, including ‘Poribacteria’, were also found at very low abundance among the 135 000 tags retrieved from surrounding seawater. Thus, members of the rare seawater biosphere may serve as seed organisms for widely occurring symbiont populations in sponges and their host association might have evolved much more recently than previously thought. PMID:21966903

  15. Purification, characterization, gene cloning and nucleotide sequencing of D-stereospecific amino acid amidase from soil bacterium: Delftia acidovorans.

    PubMed

    Hongpattarakere, Tipparat; Komeda, Hidenobu; Asano, Yasuhisa

    2005-12-01

    The D-amino acid amidase-producing bacterium was isolated from soil samples using an enrichment culture technique in medium broth containing D-phenylalanine amide as a sole source of nitrogen. The strain exhibiting the strongest activity was identified as Delftia acidovorans strain 16. This strain produced intracellular D-amino acid amidase constitutively. The enzyme was purified about 380-fold to homogeneity and its molecular mass was estimated to be about 50 kDa, on sodium dodecyl sulfate polyacrylamide gel electrophoresis. The enzyme was active preferentially toward D-amino acid amides rather than their L-counterparts. It exhibited strong amino acid amidase activity toward aromatic amino acid amides including D-phenylalanine amide, D-tryptophan amide and D-tyrosine amide, yet it was not specifically active toward low-molecular-weight D-amino acid amides such as D-alanine amide, L-alanine amide and L-serine amide. Moreover, it was not specifically active toward oligopeptides. The enzyme showed maximum activity at 40 °C and pH 8.5 and appeared to be very stable, with 92.5% remaining activity after the reaction was performed at 45 °C for 30 min. However, it was mostly inactivated in the presence of phenylmethanesulfonyl fluoride or Cd2+, Ag+, Zn2+, Hg2+ and As3+. The NH2 terminal and internal amino acid sequences of the enzyme were determined; and the gene was cloned and sequenced. The enzyme gene damA encodes a 466-amino-acid protein (molecular mass 49,860.46 Da); and the deduced amino acid sequence exhibits homology to the D-amino acid amidase from Variovorax paradoxus (67.9% identity), the amidotransferase A subunit from Burkholderia fungorum (50% identity) and other enantioselective amidases.

  16. A Novel Phytase with Sequence Similarity to Purple Acid Phosphatases Is Expressed in Cotyledons of Germinating Soybean Seedlings

    PubMed Central

    Hegeman, Carla E.; Grabau, Elizabeth A.

    2001-01-01

    Phytic acid (myo-inositol hexakisphosphate) is the major storage form of phosphorus in plant seeds. During germination, stored reserves are used as a source of nutrients by the plant seedling. Phytic acid is degraded by the activity of phytases to yield inositol and free phosphate. Due to the lack of phytases in the non-ruminant digestive tract, monogastric animals cannot utilize dietary phytic acid and it is excreted into manure. High phytic acid content in manure results in elevated phosphorus levels in soil and water and accompanying environmental concerns. The use of phytases to degrade seed phytic acid has potential for reducing the negative environmental impact of livestock production. A phytase was purified to electrophoretic homogeneity from cotyledons of germinated soybeans (Glycine max L. Merr.). Peptide sequence data generated from the purified enzyme facilitated the cloning of the phytase sequence (GmPhy) employing a polymerase chain reaction strategy. The introduction of GmPhy into soybean tissue culture resulted in increased phytase activity in transformed cells, which confirmed the identity of the phytase gene. It is surprising that the soybean phytase was unrelated to previously characterized microbial or maize (Zea mays) phytases, which were classified as histidine acid phosphatases. The soybean phytase sequence exhibited a high degree of similarity to purple acid phosphatases, a class of metallophosphoesterases. PMID:11500558

  17. The isolation, purification and amino-acid sequence of insulin from the teleost fish Cottus scorpius (daddy sculpin).

    PubMed

    Cutfield, J F; Cutfield, S M; Carne, A; Emdin, S O; Falkmer, S

    1986-07-01

    Insulin from the principal islets of the teleost fish, Cottus scorpius (daddy sculpin), has been isolated and sequenced. Purification involved acid/alcohol extraction, gel filtration, and reverse-phase high-performance liquid chromatography to yield nearly 1 mg pure insulin/g wet weight islet tissue. Biological potency was estimated as 40% compared to porcine insulin. The sculpin insulin crystallised in the absence of zinc ions although zinc is known to be present in the islets in significant amounts. Two other hormones, glucagon and pancreatic polypeptide, were copurified with the insulin, and an N-terminal sequence for pancreatic polypeptide was determined. The primary structure of sculpin insulin shows a number of sequence changes unique so far amongst teleost fish. These changes occur at A14 (Arg), A15 (Val), and B2 (Asp). The B chain contains 29 amino acids and there is no N-terminal extension as seen with several other fish. Presumably as a result of the amino acid substitutions, sculpin insulin does not readily form crystals containing zinc-insulin hexamers, despite the presence of the coordinating B10 His.

  18. Design of nucleic acid sequences for DNA computing based on a thermodynamic approach

    PubMed Central

    Tanaka, Fumiaki; Kameda, Atsushi; Yamamoto, Masahito; Ohuchi, Azuma

    2005-01-01

    We have developed an algorithm for designing multiple sequences of nucleic acids that have a uniform melting temperature between the sequence and its complement and that do not hybridize non-specifically with each other based on the minimum free energy (ΔGmin). Sequences that satisfy these constraints can be utilized in computations, various engineering applications such as microarrays, and nano-fabrications. Our algorithm is a random generate-and-test algorithm: it generates a candidate sequence randomly and tests whether the sequence satisfies the constraints. The novelty of our algorithm is that the filtering method uses a greedy search to calculate ΔGmin. This effectively excludes inappropriate sequences before ΔGmin is calculated, thereby reducing computation time drastically when compared with an algorithm without the filtering. Experimental results in silico showed the superiority of the greedy search over the traditional approach based on the Hamming distance. In addition, experimental results in vitro demonstrated that the experimental free energy (ΔGexp) of 126 sequences correlated better with ΔGmin (|R| = 0.90) than with the Hamming distance (|R| = 0.80). These results validate the rationality of a thermodynamic approach. We implemented our algorithm in a graphical user interface-based program written in Java. PMID:15701762
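
    The random generate-and-test loop named in this abstract can be sketched as below. Note the substitution: the test used here is the Hamming distance, which the abstract describes as the traditional baseline, and not the authors' ΔGmin calculation or greedy filter, neither of which is specified in enough detail to reproduce. The function names, parameters and thresholds are all hypothetical.

```python
import random

def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))

def generate_and_test(n_seqs, length, min_dist, seed=0, max_tries=100_000):
    """Collect random DNA sequences that are pairwise at least min_dist apart.

    Stand-in constraint: Hamming distance replaces the thermodynamic
    (minimum free energy) test of the published method.
    """
    rng = random.Random(seed)
    accepted = []
    for _ in range(max_tries):
        if len(accepted) == n_seqs:
            break
        # Generate: draw a candidate uniformly at random.
        cand = "".join(rng.choice("ACGT") for _ in range(length))
        # Test: keep it only if it is far enough from every accepted sequence.
        if all(hamming(cand, s) >= min_dist for s in accepted):
            accepted.append(cand)
    return accepted

pool = generate_and_test(n_seqs=5, length=12, min_dist=6)
print(pool)
```

    The loop may return fewer than `n_seqs` sequences if the constraint is too strict for the given length, which is why the number of attempts is capped.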

  19. The genetic diversity of merozoite surface antigen 1 (MSA-1) among Babesia bovis detected from cattle populations in Thailand, Brazil and Ghana.

    PubMed

    Nagano, Daisuke; Sivakumar, Thillaiampalam; De De Macedo, Alane Caine Costa; Inpankaew, Tawin; Alhassan, Andy; Igarashi, Ikuo; Yokoyama, Naoaki

    2013-11-01

    In the present study, we screened blood DNA samples obtained from cattle bred in Brazil (n=164) and Ghana (n=80) for Babesia bovis using a diagnostic PCR assay and found prevalences of 14.6% and 46.3%, respectively. Subsequently, the genetic diversity of B. bovis in Thailand, Brazil and Ghana was analyzed, based on the DNA sequence of merozoite surface antigen-1 (MSA-1). In Thailand, MSA-1 sequences were relatively conserved and found in a single clade of the phylogram, while Brazilian MSA-1 sequences showed high genetic diversity and were dispersed across three different clades. In contrast, the sequences from Ghanaian samples were detected in two different clades, one of which contained only a single Ghanaian sequence. The identities among the MSA-1 sequences from Thailand, Brazil and Ghana were 99.0-100%, 57.5-99.4% and 60.3-100%, respectively, while the similarities among the deduced MSA-1 amino acid sequences within the respective countries were 98.4-100%, 59.4-99.7% and 58.7-100%, respectively. These observations suggested that the genetic diversity of B. bovis based on MSA-1 sequences was higher in Brazil and Ghana than in Thailand. The current data highlight the importance of conducting extensive studies on the genetic diversity of B. bovis before designing immune control strategies in each surveyed country.

  20. Optimizing the specificity of nucleic acid hybridization.

    PubMed

    Zhang, David Yu; Chen, Sherry Xi; Yin, Peng

    2012-01-22

    The specific hybridization of complementary sequences is an essential property of nucleic acids, enabling diverse biological and biotechnological reactions and functions. However, the specificity of nucleic acid hybridization is compromised for long strands, except near the melting temperature. Here, we analytically derived the thermodynamic properties of a hybridization probe that would enable near-optimal single-base discrimination and perform robustly across diverse temperature, salt and concentration conditions. We rationally designed 'toehold exchange' probes that approximate these properties, and comprehensively tested them against five different DNA targets and 55 spurious analogues with energetically representative single-base changes (replacements, deletions and insertions). These probes produced discrimination factors between 3 and 100+ (median, 26). Without retuning, our probes function robustly from 10 °C to 37 °C, from 1 mM Mg(2+) to 47 mM Mg(2+), and with nucleic acid concentrations from 1 nM to 5 µM. Experiments with RNA also showed effective single-base change discrimination.

  1. Optimizing the specificity of nucleic acid hybridization

    PubMed Central

    Zhang, David Yu; Chen, Sherry Xi; Yin, Peng

    2014-01-01

    The specific hybridization of complementary sequences is an essential property of nucleic acids, enabling diverse biological and biotechnological reactions and functions. However, the specificity of nucleic acid hybridization is compromised for long strands, except near the melting temperature. Here, we analytically derived the thermodynamic properties of a hybridization probe that would enable near-optimal single-base discrimination and perform robustly across diverse temperature, salt and concentration conditions. We rationally designed ‘toehold exchange’ probes that approximate these properties, and comprehensively tested them against five different DNA targets and 55 spurious analogues with energetically representative single-base changes (replacements, deletions and insertions). These probes produced discrimination factors between 3 and 100+ (median, 26). Without retuning, our probes function robustly from 10 °C to 37 °C, from 1 mM Mg2+ to 47 mM Mg2+, and with nucleic acid concentrations from 1 nM to 5 μM. Experiments with RNA also showed effective single-base change discrimination. PMID:22354435

  2. Application of Ion Torrent Sequencing to the Assessment of the Effect of Alkali Ballast Water Treatment on Microbial Community Diversity

    PubMed Central

    Fujimoto, Masanori; Moyerbrailean, Gregory A.; Noman, Sifat; Gizicki, Jason P.; Ram, Michal L.; Green, Phyllis A.; Ram, Jeffrey L.

    2014-01-01

    The impact of NaOH as a ballast water treatment (BWT) on microbial community diversity was assessed using the 16S rRNA gene based Ion Torrent sequencing with its new 400 base chemistry. Ballast water samples from a Great Lakes ship were collected from the intake and discharge of both control and NaOH (pH 12) treated tanks and were analyzed in duplicates. One set of duplicates was treated with the membrane-impermeable DNA cross-linking reagent propidium mono-azide (PMA) prior to PCR amplification to differentiate between live and dead microorganisms. Ion Torrent sequencing generated nearly 580,000 reads for 31 bar-coded samples and revealed alterations of the microbial community structure in ballast water that had been treated with NaOH. Rarefaction analysis of the Ion Torrent sequencing data showed that BWT using NaOH significantly decreased microbial community diversity relative to control discharge (p<0.001). UniFrac distance based principal coordinate analysis (PCoA) plots and UPGMA tree analysis revealed that NaOH-treated ballast water microbial communities differed from both intake communities and control discharge communities. After NaOH treatment, bacteria from the genus Alishewanella became dominant in the NaOH-treated samples, accounting for <0.5% of the total reads in intake samples but more than 50% of the reads in the treated discharge samples. The only apparent difference in microbial community structure between PMA-processed and non-PMA samples occurred in intake water samples, which exhibited a significantly higher amount of PMA-sensitive cyanobacteria/chloroplast 16S rRNA than their corresponding non-PMA total DNA samples. The community assembly obtained using Ion Torrent sequencing was comparable to that obtained from a subset of samples that were also subjected to 454 pyrosequencing. This study showed the efficacy of alkali ballast water treatment in reducing ballast water microbial diversity and demonstrated the application of new Ion Torrent

  3. Application of ion torrent sequencing to the assessment of the effect of alkali ballast water treatment on microbial community diversity.

    PubMed

    Fujimoto, Masanori; Moyerbrailean, Gregory A; Noman, Sifat; Gizicki, Jason P; Ram, Michal L; Green, Phyllis A; Ram, Jeffrey L

    2014-01-01

    The impact of NaOH as a ballast water treatment (BWT) on microbial community diversity was assessed using the 16S rRNA gene based Ion Torrent sequencing with its new 400 base chemistry. Ballast water samples from a Great Lakes ship were collected from the intake and discharge of both control and NaOH (pH 12) treated tanks and were analyzed in duplicates. One set of duplicates was treated with the membrane-impermeable DNA cross-linking reagent propidium mono-azide (PMA) prior to PCR amplification to differentiate between live and dead microorganisms. Ion Torrent sequencing generated nearly 580,000 reads for 31 bar-coded samples and revealed alterations of the microbial community structure in ballast water that had been treated with NaOH. Rarefaction analysis of the Ion Torrent sequencing data showed that BWT using NaOH significantly decreased microbial community diversity relative to control discharge (p<0.001). UniFrac distance based principal coordinate analysis (PCoA) plots and UPGMA tree analysis revealed that NaOH-treated ballast water microbial communities differed from both intake communities and control discharge communities. After NaOH treatment, bacteria from the genus Alishewanella became dominant in the NaOH-treated samples, accounting for <0.5% of the total reads in intake samples but more than 50% of the reads in the treated discharge samples. The only apparent difference in microbial community structure between PMA-processed and non-PMA samples occurred in intake water samples, which exhibited a significantly higher amount of PMA-sensitive cyanobacteria/chloroplast 16S rRNA than their corresponding non-PMA total DNA samples. The community assembly obtained using Ion Torrent sequencing was comparable to that obtained from a subset of samples that were also subjected to 454 pyrosequencing. This study showed the efficacy of alkali ballast water treatment in reducing ballast water microbial diversity and demonstrated the application of new Ion Torrent

  4. cis-β-Bromostyrene derivatives from cinnamic acids via a tandem substitutive bromination-decarboxylation sequence.

    PubMed

    Tang, Khanh G; Kent, Greggory T; Erden, Ihsan; Wu, Weiming

    2017-10-04

    cis-β-Bromostyrene derivatives were synthesized stereospecifically from cinnamic acids through β-lactone intermediates. The synthetic sequence did not require the purification of the β-lactone intermediates although they were found to be stable and readily purified in most cases.

  5. A frequency-based linguistic approach to protein decoding and design: Simple concepts, diverse applications, and the SCS Package.

    PubMed

    Motomura, Kenta; Nakamura, Morikazu; Otaki, Joji M

    2013-01-01

    Protein structure and function information is coded in amino acid sequences. However, the relationship between primary sequences and three-dimensional structures and functions remains enigmatic. Our approach to this fundamental biochemistry problem is based on the frequencies of short constituent sequences (SCSs) or words. A protein amino acid sequence is considered analogous to an English sentence, where SCSs are equivalent to words. Availability scores, which are defined as real SCS frequencies in the non-redundant amino acid database relative to their probabilistically expected frequencies, demonstrate the biological usage bias of SCSs. As a result, this frequency-based linguistic approach is expected to have diverse applications, such as secondary structure specifications by structure-specific SCSs and immunological adjuvants with rare or non-existent SCSs. Linguistic similarities (e.g., wide ranges of scale-free distributions) and dissimilarities (e.g., behaviors of low-rank samples) between proteins and the natural English language have been revealed in the rank-frequency relationships of SCSs or words. We have developed a web server, the SCS Package, which contains five applications for analyzing protein sequences based on the linguistic concept. These tools have the potential to assist researchers in deciphering structurally and functionally important protein sites, species-specific sequences, and functional relationships between SCSs. The SCS Package also provides researchers with a tool to construct amino acid sequences de novo based on the idiomatic usage of SCSs.
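
    The availability score described in this abstract, observed frequency relative to probabilistically expected frequency, can be illustrated with the sketch below. The expected-frequency model is an assumption: each SCS is taken to occur with probability equal to the product of its residues' overall frequencies, and the SCS Package may define it differently. The function name and toy input are hypothetical.

```python
from collections import Counter
from math import prod

def availability_scores(sequences, k):
    """Observed / expected frequency for every length-k word that occurs.

    Expected frequency assumes independent residues (an assumption of
    this sketch, not a definition taken from the SCS Package).
    """
    # Single-residue frequencies over the whole data set.
    residue_counts = Counter("".join(sequences))
    total = sum(residue_counts.values())
    freq = {aa: c / total for aa, c in residue_counts.items()}

    # Observed counts of overlapping length-k words.
    observed = Counter()
    for seq in sequences:
        for i in range(len(seq) - k + 1):
            observed[seq[i:i + k]] += 1
    n_windows = sum(observed.values())

    return {
        word: (count / n_windows) / prod(freq[aa] for aa in word)
        for word, count in observed.items()
    }

# Hypothetical toy input; a score above 1 means the word is over-represented.
scores = availability_scores(["GAGA"], 2)
print(scores)
```

    Words that never occur have an observed frequency of zero and are omitted from the result; under this model they would score 0.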

  6. Diversity of Secondary Structure in Catalytic Peptides with β-Turn-Biased Sequences

    PubMed Central

    2016-01-01

    X-ray crystallography has been applied to the structural analysis of a series of tetrapeptides that were previously assessed for catalytic activity in an atroposelective bromination reaction. Common to the series is a central Pro-Xaa sequence, where Pro is either l- or d-proline, which was chosen to favor nucleation of canonical β-turn secondary structures. Crystallographic analysis of 35 different peptide sequences revealed a range of conformational states. The observed differences appear not only in cases where the Pro-Xaa loop-region is altered, but also when seemingly subtle alterations to the flanking residues are introduced. In many instances, distinct conformers of the same sequence were observed, either as symmetry-independent molecules within the same unit cell or as polymorphs. Computational studies using DFT provided additional insight into the analysis of solid-state structural features. Select X-ray crystal structures were compared to the corresponding solution structures derived from measured proton chemical shifts, 3J-values, and 1H–1H-NOESY contacts. These findings imply that the conformational space available to simple peptide-based catalysts is more diverse than precedent might suggest. The direct observation of multiple ground state conformations for peptides of this family, as well as the dynamic processes associated with conformational equilibria, underscore not only the challenge of designing peptide-based catalysts, but also the difficulty in predicting their accessible transition states. These findings implicate the advantages of low-barrier interconversions between conformations of peptide-based catalysts for multistep, enantioselective reactions. PMID:28029251

  7. Lipoxygenase in Caragana jubata responds to low temperature, abscisic acid, methyl jasmonate and salicylic acid.

    PubMed

    Bhardwaj, Pardeep Kumar; Kaur, Jagdeep; Sobti, Ranbir Chander; Ahuja, Paramvir Singh; Kumar, Sanjay

    2011-09-01

    Lipoxygenase (LOX) catalyses oxygenation of free polyunsaturated fatty acids into oxylipins, and is a critical enzyme of the jasmonate signaling pathway. LOX has been shown to be associated with biotic and abiotic stress responses in diverse plant species, though limited data is available with respect to low temperature and the associated cues. Using rapid amplification of cDNA ends, a full-length cDNA (CjLOX) encoding lipoxygenase was cloned from apical buds of Caragana jubata, a temperate plant species that grows under extreme cold. The cDNA obtained was 2952 bp long, consisting of an open reading frame of 2610 bp encoding an 869-amino-acid protein. Multiple alignment of the deduced amino acid sequence with those of other plants demonstrated putative LH2/PLAT domain, lipoxygenase iron binding catalytic domain and lipoxygenase_2 signature sequences. CjLOX exhibited up- and down-regulation of gene expression pattern in response to low temperature (LT), abscisic acid (ABA), methyl jasmonate (MJ) and salicylic acid (SA). Among all the treatments, a strong up-regulation was observed in response to MJ. Data suggests an important role of jasmonate signaling pathway in response to LT in C. jubata. Copyright © 2011 Elsevier B.V. All rights reserved.

  8. Targeted genomic enrichment and sequencing of CyHV-3 from carp tissues confirms low nucleotide diversity and mixed genotype infections.

    PubMed

    Hammoumi, Saliha; Vallaeys, Tatiana; Santika, Ayi; Leleux, Philippe; Borzym, Ewa; Klopp, Christophe; Avarre, Jean-Christophe

    2016-01-01

    Koi herpesvirus disease (KHVD) is an emerging disease that causes mass mortality in koi and common carp, Cyprinus carpio L. Its causative agent is Cyprinid herpesvirus 3 (CyHV-3), also known as koi herpesvirus (KHV). Although data on the pathogenesis of this deadly virus is relatively abundant in the literature, still little is known about its genomic diversity and about the molecular mechanisms that lead to such a high virulence. In this context, we developed a new strategy for sequencing full-length CyHV-3 genomes directly from infected fish tissues. Total genomic DNA extracted from carp gill tissue was specifically enriched with CyHV-3 sequences through hybridization to a set of nearly 2 million overlapping probes designed to cover the entire genome length, using KHV-J sequence (GenBank accession number AP008984) as reference. Applied to 7 CyHV-3 specimens from Poland and Indonesia, this targeted genomic enrichment enabled recovery of the full genomes with >99.9% reference coverage. The enrichment rate was directly correlated to the estimated number of viral copies contained in the DNA extracts used for library preparation, which varied between ∼5000 and ∼2×10^7. The average sequencing depth was >200 for all samples, thus allowing the search for variants with high confidence. Sequence analyses highlighted a significant proportion of intra-specimen sequence heterogeneity, suggesting the presence of mixed infections in all investigated fish. They also showed that inter-specimen genetic diversity at the genome scale was very low (>99.95% of sequence identity). By enabling full genome comparisons directly from infected fish tissues, this new method will be valuable to trace outbreaks rapidly and at a reasonable cost, and in turn to understand the transmission routes of CyHV-3.

  9. Targeted genomic enrichment and sequencing of CyHV-3 from carp tissues confirms low nucleotide diversity and mixed genotype infections

    PubMed Central

    Hammoumi, Saliha; Vallaeys, Tatiana; Santika, Ayi; Leleux, Philippe; Borzym, Ewa; Klopp, Christophe

    2016-01-01

    Koi herpesvirus disease (KHVD) is an emerging disease that causes mass mortality in koi and common carp, Cyprinus carpio L. Its causative agent is Cyprinid herpesvirus 3 (CyHV-3), also known as koi herpesvirus (KHV). Although data on the pathogenesis of this deadly virus is relatively abundant in the literature, still little is known about its genomic diversity and about the molecular mechanisms that lead to such a high virulence. In this context, we developed a new strategy for sequencing full-length CyHV-3 genomes directly from infected fish tissues. Total genomic DNA extracted from carp gill tissue was specifically enriched with CyHV-3 sequences through hybridization to a set of nearly 2 million overlapping probes designed to cover the entire genome length, using KHV-J sequence (GenBank accession number AP008984) as reference. Applied to 7 CyHV-3 specimens from Poland and Indonesia, this targeted genomic enrichment enabled recovery of the full genomes with >99.9% reference coverage. The enrichment rate was directly correlated to the estimated number of viral copies contained in the DNA extracts used for library preparation, which varied between ∼5000 and ∼2×10^7. The average sequencing depth was >200 for all samples, thus allowing the search for variants with high confidence. Sequence analyses highlighted a significant proportion of intra-specimen sequence heterogeneity, suggesting the presence of mixed infections in all investigated fish. They also showed that inter-specimen genetic diversity at the genome scale was very low (>99.95% of sequence identity). By enabling full genome comparisons directly from infected fish tissues, this new method will be valuable to trace outbreaks rapidly and at a reasonable cost, and in turn to understand the transmission routes of CyHV-3. PMID:27703859

  10. Repeat sequence chromosome specific nucleic acid probes and methods of preparing and using

    DOEpatents

    Weier, H.U.G.; Gray, J.W.

    1995-06-27

    A primer directed DNA amplification method to isolate efficiently chromosome-specific repeated DNA wherein degenerate oligonucleotide primers are used is disclosed. The probes produced are a heterogeneous mixture that can be used with blocking DNA as a chromosome-specific staining reagent, and/or the elements of the mixture can be screened for high specificity, size and/or high degree of repetition among other parameters. The degenerate primers are sets of primers that vary in sequence but are substantially complementary to highly repeated nucleic acid sequences, preferably clustered within the template DNA, for example, pericentromeric alpha satellite repeat sequences. The template DNA is preferably chromosome-specific. Exemplary primers and probes are disclosed. The probes of this invention can be used to determine the number of chromosomes of a specific type in metaphase spreads, in germ line and/or somatic cell interphase nuclei, micronuclei and/or in tissue sections. Also provided is a method to select arbitrarily repeat sequence probes that can be screened for chromosome-specificity. 18 figs.

  11. Repeat sequence chromosome specific nucleic acid probes and methods of preparing and using

    DOEpatents

    Weier, Heinz-Ulrich G.; Gray, Joe W.

    1995-01-01

    A primer directed DNA amplification method to isolate efficiently chromosome-specific repeated DNA wherein degenerate oligonucleotide primers are used is disclosed. The probes produced are a heterogeneous mixture that can be used with blocking DNA as a chromosome-specific staining reagent, and/or the elements of the mixture can be screened for high specificity, size and/or high degree of repetition among other parameters. The degenerate primers are sets of primers that vary in sequence but are substantially complementary to highly repeated nucleic acid sequences, preferably clustered within the template DNA, for example, pericentromeric alpha satellite repeat sequences. The template DNA is preferably chromosome-specific. Exemplary primers and probes are disclosed. The probes of this invention can be used to determine the number of chromosomes of a specific type in metaphase spreads, in germ line and/or somatic cell interphase nuclei, micronuclei and/or in tissue sections. Also provided is a method to select arbitrarily repeat sequence probes that can be screened for chromosome-specificity.

  12. Analysis of diversity of diazotrophic bacteria associated with the rhizosphere of a tropical Arbor, Melastoma malabathricum L.

    PubMed

    Sato, Atsuya; Watanabe, Toshihiro; Unno, Yusuke; Purnomo, Erry; Osaki, Mitsuru; Shinano, Takuro

    2009-01-01

    The diversity of diazotrophic bacteria in the rhizosphere of Melastoma malabathricum L. was investigated by cloning-sequencing of the nifH gene directly amplified from DNA extracted from soil. Samples were obtained from the rhizosphere and bulk soil of M. malabathricum growing in three different soil types (acid sulfate, peat and sandy clay soils) located very close to each other in south Kalimantan, Indonesia. Six clone libraries were constructed, generated from bulk and rhizosphere soil samples, and 300 nifH clones were produced, then assembled into 29 operational taxonomic units (OTUs) based on percent identity values. Our results suggested that nifH gene diversity is mainly dependent on soil properties, and did not differ remarkably between the rhizosphere and bulk soil of M. malabathricum except in acid sulfate soil. In acid sulfate soil, as the Shannon diversity index was lower in rhizosphere than in bulk soil, it is suggested that particular bacterial species might accumulate in the rhizosphere.
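    The Shannon diversity index mentioned in the record above can be illustrated with a minimal sketch. It assumes the common natural-log form H' = -sum(p_i ln p_i) over OTU proportions; the abstract does not state which variant the authors used, and the clone-to-OTU assignments below are made-up data, not values from the study.

```python
import math
from collections import Counter

def shannon_index(otu_labels):
    """Shannon diversity H' = -sum(p_i * ln p_i), where p_i is the
    fraction of clones assigned to OTU i (natural log assumed)."""
    counts = Counter(otu_labels)
    total = sum(counts.values())
    return -sum((n / total) * math.log(n / total) for n in counts.values())

# Hypothetical clone libraries (illustrative only): 20 clones each.
bulk = ["OTU1", "OTU2", "OTU3", "OTU4"] * 5          # clones spread evenly
rhizo = ["OTU1"] * 17 + ["OTU2", "OTU3", "OTU4"]     # one OTU dominates

print("bulk  H' =", round(shannon_index(bulk), 3))
print("rhizo H' =", round(shannon_index(rhizo), 3))
```

    In this toy case the library dominated by a single OTU yields the lower index, which is the kind of pattern the abstract interprets as accumulation of particular bacterial species.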

  13. DNA Cloning of Plasmodium falciparum Circumsporozoite Gene: Amino Acid Sequence of Repetitive Epitope

    NASA Astrophysics Data System (ADS)

    Enea, Vincenzo; Ellis, Joan; Zavala, Fidel; Arnot, David E.; Asavanich, Achara; Masuda, Aoi; Quakyi, Isabella; Nussenzweig, Ruth S.

    1984-08-01

    A clone of complementary DNA encoding the circumsporozoite (CS) protein of the human malaria parasite Plasmodium falciparum has been isolated by screening an Escherichia coli complementary DNA library with a monoclonal antibody to the CS protein. The DNA sequence of the complementary DNA insert encodes a four-amino acid sequence: proline-asparagine-alanine-asparagine, tandemly repeated 23 times. The CS β-lactamase fusion protein specifically binds monoclonal antibodies to the CS protein and inhibits the binding of these antibodies to native Plasmodium falciparum CS protein. These findings provide a basis for the development of a vaccine against Plasmodium falciparum malaria.

  14. Hydroquinone: O-glucosyltransferase from cultivated Rauvolfia cells: enrichment and partial amino acid sequences.

    PubMed

    Arend, J; Warzecha, H; Stöckigt, J

    2000-01-01

    Plant cell suspension cultures of Rauvolfia are able to produce a high amount of arbutin by glucosylation of exogenously added hydroquinone. A four step purification procedure using anion exchange, hydrophobic interaction, hydroxyapatite-chromatography and chromatofocusing delivered in a yield of 0.5%, an approximately 390 fold enrichment of the involved glucosyltransferase. SDS-PAGE showed a M(r) for the enzyme of 52 kDa. Proteolysis of the pure enzyme with endoproteinase LysC revealed six peptide fragments with 9-23 amino acids which were sequenced. Sequence alignment of the six peptides showed high homologies to glycosyltransferases from other higher plants.

  15. Sequence diversity and differential expression of major phenylpropanoid-flavonoid biosynthetic genes among three mango varieties.

    PubMed

    Hoang, Van L T; Innes, David J; Shaw, P Nicholas; Monteith, Gregory R; Gidley, Michael J; Dietzgen, Ralf G

    2015-07-30

    Mango fruits contain a broad spectrum of phenolic compounds which impart potential health benefits; their biosynthesis is catalysed by enzymes in the phenylpropanoid-flavonoid (PF) pathway. The aim of this study was to reveal the variability in genes involved in the PF pathway in three different mango varieties Mangifera indica L., a member of the family Anacardiaceae: Kensington Pride (KP), Irwin (IW) and Nam Doc Mai (NDM) and to determine associations with gene expression and mango flavonoid profiles. A close evolutionary relationship between mango genes and those from the woody species poplar of the Salicaceae family (Populus trichocarpa) and grape of the Vitaceae family (Vitis vinifera), was revealed through phylogenetic analysis of PF pathway genes. We discovered 145 SNPs in total within coding sequences with an average frequency of one SNP every 316 bp. Variety IW had the highest SNP frequency (one SNP every 258 bp) while KP and NDM had similar frequencies (one SNP every 369 bp and 360 bp, respectively). The position in the PF pathway appeared to influence the extent of genetic diversity of the encoded enzymes. The entry point enzymes phenylalanine lyase (PAL), cinnamate 4-mono-oxygenase (C4H) and chalcone synthase (CHS) had low levels of SNP diversity in their coding sequences, whereas anthocyanidin reductase (ANR) showed the highest SNP frequency followed by flavonoid 3'-hydroxylase (F3'H). Quantitative PCR revealed characteristic patterns of gene expression that differed between mango peel and flesh, and between varieties. The combination of mango expressed sequence tags and availability of well-established reference PF biosynthetic genes from other plant species allowed the identification of coding sequences of genes that may lead to the formation of important flavonoid compounds in mango fruits and facilitated characterisation of single nucleotide polymorphisms between varieties. 
We discovered an association between the extent of sequence variation and

  16. Theileria parva antigens recognized by CD8+ T cells show varying degrees of diversity in buffalo-derived infected cell lines.

    PubMed

    Sitt, Tatjana; Pelle, Roger; Chepkwony, Maurine; Morrison, W Ivan; Toye, Philip

    2018-05-06

    The extent of sequence diversity among the genes encoding 10 antigens (Tp1-10) known to be recognized by CD8+ T lymphocytes from cattle immune to Theileria parva was analysed. The sequences were derived from parasites in 23 buffalo-derived cell lines, three cattle-derived isolates and one cloned cell line obtained from a buffalo-derived stabilate. The results revealed substantial variation among the antigens through sequence diversity. The greatest nucleotide and amino acid diversity were observed in Tp1, Tp2 and Tp9. Tp5 and Tp7 showed the least amount of allelic diversity, and Tp5, Tp6 and Tp7 had the lowest levels of protein diversity. Tp6 was the most conserved protein; only a single non-synonymous substitution was found in all obtained sequences. The ratio of non-synonymous: synonymous substitutions varied from 0.84 (Tp1) to 0.04 (Tp6). Apart from Tp2 and Tp9, we observed no variation in the other defined CD8+ T cell epitopes (Tp4, 5, 7 and 8), indicating that epitope variation is not a universal feature of T. parva antigens. In addition to providing markers that can be used to examine the diversity in T. parva populations, the results highlight the potential for using conserved antigens to develop vaccines that provide broad protection against T. parva.

  17. Making sense of deep sequencing

    PubMed Central

    Goldman, D.; Domschke, K.

    2016-01-01

    This review, the first of an occasional series, tries to make sense of the concepts and uses of deep sequencing of polynucleic acids (DNA and RNA). Deep sequencing, synonymous with next-generation sequencing, high-throughput sequencing and massively parallel sequencing, includes whole genome sequencing but is more often and diversely applied to specific parts of the genome captured in different ways, for example the highly expressed portion of the genome known as the exome and portions of the genome that are epigenetically marked either by DNA methylation, the binding of proteins including histones, or that are in different configurations and thus more or less accessible to enzymes that cleave DNA. Deep sequencing of RNA (RNASeq) reverse-transcribed to complementary DNA is invaluable for measuring RNA expression and detecting changes in RNA structure. Important concepts in deep sequencing include the length and depth of sequence reads, mapping and assembly of reads, sequencing error, haplotypes, and the propensity of deep sequencing, as with other types of ‘big data’, to generate large numbers of errors, requiring monitoring for methodologic biases and strategies for replication and validation. Deep sequencing yields a unique genetic fingerprint that can be used to identify a person, and a trove of predictors of genetic medical diseases. Deep sequencing to identify epigenetic events including changes in DNA methylation and RNA expression can reveal the history and impact of environmental exposures. Because of the power of sequencing to identify and deliver biomedically significant information about a person and their blood relatives, it creates ethical dilemmas and practical challenges in research and clinical care, for example the decision and procedures to report incidental findings that will increasingly and frequently be discovered. PMID:24925306

  18. Partial amino-acid sequence of the precursor of an immunoglobulin light chain containing NH2-terminal pyroglutamic acid.

    PubMed Central

    Burstein, Y; Kantour, F; Schechter, I

    1976-01-01

    Analyses of amino-acid sequences of the total cell-free products programmed by the mRNA of MOPC-104E gamma light (L)-chain show that over 95% of the products have sequences of a distinct protein that correspond to the L-chain precursor. In this precursor an extra piece is coupled to the NH2-terminus of the mature L-chain. Analyses of products labeled with [3H]alanine, [3H]leucine, and [3H]proline demonstrate that the extra piece is composed of at least 18 residues. Analyses of [35S]methionine-labeled product indicate that the extra piece may contain an additional NH2-terminal methionine, which is detected in about 10% of the molecules. Partial recovery of the NH2-terminal methionine (alanine, leucine, and proline are recovered in yields close to theoretical, greater than 95%) suggests that it is the initiator methionine, which is known to be short lived in eukaryotes due to rapid hydrolysis. Thus, the extra piece seems to be 19 residues in length, and it contains one methionine at the NH2-terminus, three alanines at positions 2, 12, and 17, and five leucines at positions 6, 8, 10, 11, and 13. The close gathering of leucine residues, as well as their abundance (26%), suggest that the extra piece would be quite hydrophobic. Hydrophobicity seems to be a general property of the extra piece, since similar clusters of leucine were found in the precursors of 3 KL-chains (Burstein, Y. & Schechter, I. (1976) Biochem. J. 157, 145-151). The NH2-terminus of the mature MOPC-104E gamma L-chain is blocked by pyroglutamic acid. The fact that in the precursor a peptide segment precedes this NH2-terminus establishes that pyroglutamic acid is not the initiator residue for synthesis of the L-chain. Apparently, the pyroglutamic acid is formed by cyclization of glutamic acid or glutamine during cleavage of the extra piece to yield the mature L-chain. PMID:822420

  19. Partial bisulfite conversion for unique template sequencing.

    PubMed

    Kumar, Vijay; Rosenbaum, Julie; Wang, Zihua; Forcier, Talitha; Ronemus, Michael; Wigler, Michael; Levy, Dan

    2018-01-25

    We introduce a new protocol, mutational sequencing or muSeq, which uses sodium bisulfite to randomly deaminate unmethylated cytosines at a fixed and tunable rate. The muSeq protocol marks each initial template molecule with a unique mutation signature that is present in every copy of the template, and in every fragmented copy of a copy. In the sequenced read data, this signature is observed as a unique pattern of C-to-T or G-to-A nucleotide conversions. Clustering reads with the same conversion pattern enables accurate count and long-range assembly of initial template molecules from short-read sequence data. We explore count and low-error sequencing by profiling 135 000 restriction fragments in a PstI representation, demonstrating that muSeq improves copy number inference and significantly reduces sporadic sequencer error. We explore long-range assembly in the context of cDNA, generating contiguous transcript clusters greater than 3,000 bp in length. The muSeq assemblies reveal transcriptional diversity not observable from short-read data alone. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
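    The idea of counting templates by clustering reads that share a conversion pattern can be sketched as follows. This is a simplified illustration, not the muSeq pipeline: it assumes every read spans the same reference fragment, looks only at C-to-T conversions on one strand, and groups reads by exact pattern match, ignoring sequencer error and partial overlaps. The function names and the toy sequences are hypothetical.

```python
from collections import defaultdict

def conversion_pattern(reference, read):
    """Return the set of positions where a reference C is read as T.
    Assumes the read is already aligned base-for-base to the reference."""
    return frozenset(
        i for i, (ref_base, read_base) in enumerate(zip(reference, read))
        if ref_base == "C" and read_base == "T"
    )

def cluster_by_pattern(reference, reads):
    """Group reads whose conversion patterns are identical; each group is
    treated as copies of one initial template molecule."""
    clusters = defaultdict(list)
    for read in reads:
        clusters[conversion_pattern(reference, read)].append(read)
    return dict(clusters)

# Toy data (made up): three reads derived from two differently marked templates.
reference = "ACCGTCCA"
reads = ["ATCGTCTA", "ATCGTCTA", "ACTGTCCA"]

clusters = cluster_by_pattern(reference, reads)
print("reads:", len(reads), "inferred templates:", len(clusters))
```

    Counting clusters rather than reads is what lets a duplicated template be counted once; a practical implementation would need to tolerate mismatches between patterns.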

  20. Microbial diversity and metabolic networks in acid mine drainage habitats

    PubMed Central

    Méndez-García, Celia; Peláez, Ana I.; Mesa, Victoria; Sánchez, Jesús; Golyshina, Olga V.; Ferrer, Manuel

    2015-01-01

    Acid mine drainage (AMD) emplacements are low-complexity natural systems. Low-pH conditions appear to be the main factor underlying the limited diversity of the microbial populations thriving in these environments, although temperature, ionic composition, total organic carbon, and dissolved oxygen are also considered to significantly influence their microbial life. This natural reduction in diversity driven by extreme conditions was reflected in several studies on the microbial populations inhabiting the various micro-environments present in such ecosystems. Early studies based on the physiology of the autochthonous microbiota and the growing success of omics-based methodologies have enabled a better understanding of microbial ecology and function in low-pH mine outflows; however, complementary omics-derived data should be included to completely describe their microbial ecology. Furthermore, recent updates on the distribution of eukaryotes and archaea recovered through sterile filtering (herein referred to as filterable fraction) in these environments demand their inclusion in the microbial characterization of AMD systems. In this review, we present a complete overview of the bacterial, archaeal (including filterable fraction), and eukaryotic diversity in these ecosystems, and include a thorough depiction of the metabolism and element cycling in AMD habitats. We also review different metabolic network structures at the organismal level, which is necessary to disentangle the role of each member of the AMD communities described thus far. PMID:26074887

  1. Identification and characterization of rhizospheric microbial diversity by 16S ribosomal RNA gene sequencing.

    PubMed

    Naveed, Muhammad; Mubeen, Samavia; Khan, SamiUllah; Ahmed, Iftikhar; Khalid, Nauman; Suleria, Hafiz Ansar Rasul; Bano, Asghari; Mumtaz, Abdul Samad

    2014-01-01

    In the present study, samples of rhizosphere and root nodules were collected from different areas of Pakistan to isolate plant growth promoting rhizobacteria. Identification of bacterial isolates was made by 16S rRNA gene sequence analysis and taxonomical confirmation on EzTaxon Server. The identified bacterial strains belonged to 5 genera, i.e. Ensifer, Bacillus, Pseudomonas, Leclercia and Rhizobium. Phylogenetic analysis inferred from 16S rRNA gene sequences showed the evolutionary relationship of bacterial strains with the respective genera. Based on phylogenetic analysis, some candidate novel species were also identified. The bacterial strains were also characterized by morphological, physiological and biochemical tests and for the glucose dehydrogenase (gdh) gene, which is involved in phosphate solubilization using the cofactor pyrroloquinoline quinone (PQQ). Seven rhizospheric and 3 root-nodulating strains were positive for the gdh gene. Furthermore, this study confirms a novel association between microbes and their hosts like field grown crops, leguminous and non-leguminous plants. It was concluded that a diverse bacterial population exists in the rhizosphere and root nodules that might be useful in evaluating the mechanisms behind plant microbial interactions, and that strains QAU-63 and QAU-68, with sequence similarities of 97 and 95%, might be declared novel after further taxonomic characterization.

  2. Identification and characterization of rhizospheric microbial diversity by 16S ribosomal RNA gene sequencing

    PubMed Central

    Naveed, Muhammad; Mubeen, Samavia; Khan, SamiUllah; Ahmed, Iftikhar; Khalid, Nauman; Suleria, Hafiz Ansar Rasul; Bano, Asghari; Mumtaz, Abdul Samad

    2014-01-01

    In the present study, samples of rhizosphere and root nodules were collected from different areas of Pakistan to isolate plant growth promoting rhizobacteria. Identification of bacterial isolates was made by 16S rRNA gene sequence analysis and taxonomical confirmation on EzTaxon Server. The identified bacterial strains belonged to 5 genera, i.e. Ensifer, Bacillus, Pseudomonas, Leclercia and Rhizobium. Phylogenetic analysis inferred from 16S rRNA gene sequences showed the evolutionary relationship of bacterial strains with the respective genera. Based on phylogenetic analysis, some candidate novel species were also identified. The bacterial strains were also characterized by morphological, physiological and biochemical tests and for the glucose dehydrogenase (gdh) gene, which is involved in phosphate solubilization using the cofactor pyrroloquinoline quinone (PQQ). Seven rhizospheric and 3 root-nodulating strains were positive for the gdh gene. Furthermore, this study confirms a novel association between microbes and their hosts like field grown crops, leguminous and non-leguminous plants. It was concluded that a diverse bacterial population exists in the rhizosphere and root nodules that might be useful in evaluating the mechanisms behind plant microbial interactions, and that strains QAU-63 and QAU-68, with sequence similarities of 97 and 95%, might be declared novel after further taxonomic characterization. PMID:25477935

  3. Genotypic diversity of stress response in Lactobacillus plantarum, Lactobacillus paraplantarum and Lactobacillus pentosus.

    PubMed

    Ricciardi, Annamaria; Parente, Eugenio; Guidone, Angela; Ianniello, Rocco Gerardo; Zotta, Teresa; Abu Sayem, S M; Varcamonti, Mario

    2012-07-02

    Lactobacillus plantarum, Lactobacillus pentosus and Lactobacillus paraplantarum are three closely related species which are widespread in food and non-food environments, and are important as starter bacteria or probiotics. In order to evaluate the phenotypic diversity of stress tolerance in the L. plantarum group and the ability to mount an adaptive heat shock response, the survival of exponential and stationary phase and of heat adapted exponential phase cells of six L. plantarum subsp. plantarum, one L. plantarum subsp. argentoratensis, one L. pentosus and two L. paraplantarum strains selected in a previous work upon exposure to oxidative, heat, detergent, starvation and acid stresses was compared to that of the L. plantarum WCFS1 strain. Furthermore, to evaluate the genotypic diversity in stress response genes, ten genes (encoding for chaperones DnaK, GroES and GroEL, regulators CtsR, HrcA and CcpA, ATPases/proteases ClpL, ClpP, ClpX and protease FtsH) were amplified using primers derived from the WCFS1 genome sequence and submitted to restriction with one or two endonucleases. The results were compared by univariate and multivariate statistical methods. In addition, the amplicons for hrcA and ctsR were sequenced and compared by multiple sequence alignment and polymorphism analysis. Although there was evidence of a generalized stress response in the stationary phase, with increase of oxidative, heat, and, to a lesser extent, starvation stress tolerance, and for adaptive heat stress response, with increased tolerance to heat, acid and detergent, different growth phases and adaptation patterns were found. Principal component analysis showed that while heat, acid and detergent stresses respond similarly to growth phase and adaptation, tolerance to oxidative and starvation stresses implies completely unrelated mechanisms. A dendrogram obtained using the data from multilocus restriction typing (MLRT) of stress response genes clearly separated two groups of L

  4. High‑throughput sequencing analyses of oral microbial diversity in healthy people and patients with dental caries and periodontal disease.

    PubMed

    Chen, Tingtao; Shi, Yan; Wang, Xiaolei; Wang, Xin; Meng, Fanjing; Yang, Shaoguo; Yang, Jian; Xin, Hongbo

    2017-07-01

    Recurrence of oral diseases caused by antibiotics has brought about an urgent requirement to explore the oral microbial diversity in the human oral cavity. In the present study, the high‑throughput sequencing method was adopted to compare the microbial diversity of healthy people and oral patients and sequence analysis was performed by UPARSE software package. The Venn results indicated that a mean of 315 operational taxonomic units (OTUs) was obtained, and 73, 64, 53, 19 and 18 common OTUs belonging to Firmicutes, Bacteroidetes, Proteobacteria, Actinobacteria and Fusobacteria, respectively, were identified in healthy people. Moreover, the reduction of Firmicutes and the increase of Proteobacteria in the children group, and the increase of Firmicutes and the reduction of Proteobacteria in the youth and adult groups, indicated that the age bracket and oral disease had largely influenced the tooth development and microbial development in the oral cavity. In addition, the traditional 'pathogenic bacteria' of Firmicutes, Proteobacteria and Bacteroidetes (accounted for >95% of the total sequencing number in each group) indicated that the 'harmful' bacteria may exert beneficial effects on oral health. Therefore, the data will provide certain clues for curing some oral diseases by the strategy of adjusting the disturbed microbial compositions in oral disease to healthy level.

  5. An Alignment-Free Algorithm in Comparing the Similarity of Protein Sequences Based on Pseudo-Markov Transition Probabilities among Amino Acids

    PubMed Central

    Li, Yushuang; Yang, Jiasheng; Zhang, Yi

    2016-01-01

    In this paper, we have proposed a novel alignment-free method for comparing the similarity of protein sequences. We first encode a protein sequence into a 440 dimensional feature vector consisting of a 400 dimensional Pseudo-Markov transition probability vector among the 20 amino acids, a 20 dimensional content ratio vector, and a 20 dimensional position ratio vector of the amino acids in the sequence. By evaluating the Euclidean distances among the representing vectors, we compare the similarity of protein sequences. We then apply this method into the ND5 dataset consisting of the ND5 protein sequences of 9 species, and the F10 and G11 datasets representing two of the xylanases containing glycoside hydrolase families, i.e., families 10 and 11. As a result, our method achieves a correlation coefficient of 0.962 with the canonical protein sequence aligner ClustalW in the ND5 dataset, much higher than those of other 5 popular alignment-free methods. In addition, we successfully separate the xylanases sequences in the F10 family and the G11 family and illustrate that the F10 family is more heat stable than the G11 family, consistent with a few previous studies. Moreover, we prove mathematically an identity equation involving the Pseudo-Markov transition probability vector and the amino acids content ratio vector. PMID:27918587
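    A rough sketch of the 440-dimensional encoding described above may help fix ideas. The abstract does not define the three components precisely, so the definitions here are assumptions: the transition entry for (a, b) is the fraction of occurrences of a that are immediately followed by b, the content ratio is the count of a residue divided by sequence length, and the position ratio is the sum of a residue's 1-based positions divided by the sum of all positions. The authors' actual formulas may differ. Sequences are assumed to contain only the 20 standard one-letter codes, and the two example peptides are made up.

```python
import math
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues

def feature_vector(seq):
    """Encode a protein as 400 transition + 20 content + 20 position values
    (component definitions are assumptions, see the text above)."""
    n = len(seq)
    pair_counts = {pair: 0 for pair in product(AMINO_ACIDS, repeat=2)}
    first_counts = {a: 0 for a in AMINO_ACIDS}
    for a, b in zip(seq, seq[1:]):          # adjacent residue pairs
        pair_counts[(a, b)] += 1
        first_counts[a] += 1
    transition = [
        pair_counts[(a, b)] / first_counts[a] if first_counts[a] else 0.0
        for a, b in product(AMINO_ACIDS, repeat=2)
    ]
    content = [seq.count(a) / n for a in AMINO_ACIDS]
    position_total = n * (n + 1) / 2        # 1 + 2 + ... + n
    position = [
        sum(i for i, c in enumerate(seq, start=1) if c == a) / position_total
        for a in AMINO_ACIDS
    ]
    return transition + content + position

# Hypothetical peptides, compared by Euclidean distance as in the abstract.
v1 = feature_vector("MKVLAAGIVG")
v2 = feature_vector("MKVLSAGLVG")
print(len(v1), math.dist(v1, v2))
```

    Because every sequence maps to a vector of the same length, pairwise distances can be computed without aligning the sequences, which is the point of an alignment-free method.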

  6. Amino acid sequence of human cholinesterase. Annual report, 30 September 1984-30 September 1985

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Lockridge, O.

    1985-10-01

    The active-site serine residue is located 198 amino acids from the N-terminal. The active-site peptide was isolated from three different genetic types of human serum cholinesterase: from usual, atypical, and atypical-silent genotypes. It was found that the amino acid sequence of the active-site peptide was identical in all three genotypes. Comparison of the complete sequences of cholinesterase from human serum and acetylcholinesterase from the electric organ of Torpedo californica shows an identity of 53%. Cholinesterase is of interest to the Department of Defense because cholinesterase protects against organophosphate poisons of the type used in chemical warfare. The structural results presented here will serve as the basis for cloning the gene for cholinesterase. The potential uses of large amounts of cholinesterase would be for cleaning up spills of organophosphates and possibly for detoxifying exposed personnel.

  7. The sequence of sequencers: The history of sequencing DNA

    PubMed Central

    Heather, James M.; Chain, Benjamin

    2016-01-01

    Determining the order of nucleic acid residues in biological samples is an integral component of a wide variety of research applications. Over the last fifty years large numbers of researchers have applied themselves to the production of techniques and technologies to facilitate this feat, sequencing DNA and RNA molecules. This time-scale has witnessed tremendous changes, moving from sequencing short oligonucleotides to millions of bases, from struggling towards the deduction of the coding sequence of a single gene to rapid and widely available whole genome sequencing. This article traverses those years, iterating through the different generations of sequencing technology, highlighting some of the key discoveries, researchers, and sequences along the way. PMID:26554401

  8. DNA tetrominoes: the construction of DNA nanostructures using self-organised heterogeneous deoxyribonucleic acids shapes.

    PubMed

    Ong, Hui San; Rahim, Mohd Syafiq; Firdaus-Raih, Mohd; Ramlan, Effirul Ikhwan

    2015-01-01

    The unique programmability of nucleic acids offers an alternative for constructing excitable and functional nanostructures. This work introduces an autonomous protocol to construct DNA Tetris shapes (L-Shape, B-Shape, T-Shape and I-Shape) using modular DNA blocks. The protocol exploits the rich number of sequence combinations available from the nucleic acid alphabets, thus allowing for diversity to be applied in designing various DNA nanostructures. Instead of a deterministic set of sequences corresponding to a particular design, the protocol promotes a large pool of DNA shapes that can assemble to conform to any desired structures. By utilising evolutionary programming in the design stage, DNA blocks are subjected to processes such as sequence insertion, deletion and base shifting in order to enrich the diversity of the resulting shapes based on a set of cascading filters. The optimisation algorithm allows mutation to be exerted indefinitely on the candidate sequences until these sequences comply with all four fitness criteria. Generated candidates from the protocol are in agreement with the filter cascades and thermodynamic simulation. Further validation using gel electrophoresis indicated the formation of the designed shapes, thus supporting the plausibility of constructing DNA nanostructures in a more hierarchical, modular, and interchangeable manner.

  9. Comparative genomics of citric-acid producing Aspergillus niger ATCC 1015 versus enzyme-producing CBS 513.88

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Andersen, Mikael R.; Salazar, Margarita; Schaap, Peter

    2011-06-01

    The filamentous fungus Aspergillus niger exhibits great diversity in its phenotype. It is found globally, both as marine and terrestrial strains, produces both organic acids and hydrolytic enzymes in high amounts, and some isolates exhibit pathogenicity. Although the genome of an industrial enzyme-producing A. niger strain (CBS 513.88) has already been sequenced, the versatility and diversity of this species compels additional exploration. We therefore undertook whole genome sequencing of the acidogenic A. niger wild type strain (ATCC 1015), and produced a genome sequence of very high quality. Only 15 gaps are present in the sequence and half the telomeric regions have been elucidated. Moreover, sequence information from ATCC 1015 was utilized to improve the genome sequence of CBS 513.88. Chromosome-level comparisons uncovered several genome rearrangements, deletions, a clear case of strain-specific horizontal gene transfer, and identification of 0.8 megabase of novel sequence. Single nucleotide polymorphisms per kilobase (SNPs/kb) between the two strains were found to be exceptionally high (average: 7.8, maximum: 160 SNPs/kb). High variation within the species was confirmed with exo-metabolite profiling and phylogenetics. Detailed lists of alleles were generated, and genotypic differences were observed to accumulate in metabolic pathways essential to acid production and protein synthesis. A transcriptome analysis revealed up-regulation of the electron transport chain, specifically the alternative oxidative pathway in ATCC 1015, while CBS 513.88 showed significant up regulation of genes associated with biosynthesis of amino acids that are abundant in glucoamylase A, tRNA-synthases and protein transporters.

  10. Two-level QSAR network (2L-QSAR) for peptide inhibitor design based on amino acid properties and sequence positions.

    PubMed

    Du, Q S; Ma, Y; Xie, N Z; Huang, R B

    2014-01-01

    In the design of peptide inhibitors the huge possible variety of peptide sequences is of high concern. Taking advantage of the fast accumulation of peptide experimental data and databases, a statistical method is suggested for peptide inhibitor design. In the two-level peptide prediction network (2L-QSAR) one level is the physicochemical properties of amino acids and the other level is the peptide sequence position. The activity contributions of amino acids are functions of the physicochemical properties and the sequence positions. In the prediction equation two weight coefficient sets {ak} and {bl} are assigned to the physicochemical properties and to the sequence positions, respectively. After the two coefficient sets are optimized based on the experimental data of known peptide inhibitors using the iterative double least square (IDLS) procedure, the coefficients are used to evaluate the bioactivities of newly designed peptide inhibitors. The two-level prediction network can be applied to peptide inhibitor design that may aim for different target proteins, or different positions of a protein. A notable advantage of the two-level statistical algorithm is that there is no need for host protein structural information. It may also provide useful insight into the amino acid properties and the roles of sequence positions.
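
    The abstract above does not give the prediction equation itself, so the following is only a minimal sketch of one way two coefficient sets could be fitted by alternating least-squares steps. It assumes a bilinear form (activity = sum over positions l and properties k of b[l] * a[k] * x[l][k]); that form, the function names, and the synthetic data are illustrative assumptions, not the authors' IDLS procedure.

```python
import random

def solve_least_squares(rows, targets):
    """Least-squares solution of rows * w = targets via the normal equations."""
    n = len(rows[0])
    # Augmented matrix [R^T R | R^T y].
    m = [[sum(r[i] * r[j] for r in rows) for j in range(n)]
         + [sum(r[i] * t for r, t in zip(rows, targets))] for i in range(n)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def predict(x, a, b):
    """Assumed bilinear model; x[l][k] is property k of the residue at position l."""
    return sum(b[l] * a[k] * x[l][k]
               for l in range(len(b)) for k in range(len(a)))

def sse(peptides, activities, a, b):
    """Sum of squared prediction errors over all peptides."""
    return sum((predict(x, a, b) - y) ** 2 for x, y in zip(peptides, activities))

def fit_two_level(peptides, activities, n_iter=20):
    """Alternate between solving for property weights a and position weights b."""
    n_pos, n_prop = len(peptides[0]), len(peptides[0][0])
    a, b = [1.0] * n_prop, [1.0] * n_pos
    history = []
    for _ in range(n_iter):
        # Hold b fixed: the model is linear in a.
        rows = [[sum(b[l] * x[l][k] for l in range(n_pos)) for k in range(n_prop)]
                for x in peptides]
        a = solve_least_squares(rows, activities)
        # Hold a fixed: the model is linear in b.
        rows = [[sum(a[k] * x[l][k] for k in range(n_prop)) for l in range(n_pos)]
                for x in peptides]
        b = solve_least_squares(rows, activities)
        history.append(sse(peptides, activities, a, b))
    return a, b, history

# Synthetic demonstration data (hypothetical): 30 peptides, 3 positions, 2 properties.
random.seed(0)
true_a, true_b = [0.5, -1.0], [1.0, 2.0, 0.5]
peptides = [[[random.uniform(-1, 1) for _ in range(2)] for _ in range(3)]
            for _ in range(30)]
activities = [predict(x, true_a, true_b) for x in peptides]
a_fit, b_fit, history = fit_two_level(peptides, activities)
```

    Because each half-step is an exact least-squares solve, the squared error in this sketch cannot increase from one iteration to the next. In a bilinear model the two coefficient sets are only determined up to a common scale factor, so the fitted values should be compared through their predictions rather than element by element.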

  11. A novel process of viral vector barcoding and library preparation enables high-diversity library generation and recombination-free paired-end sequencing

    PubMed Central

    Davidsson, Marcus; Diaz-Fernandez, Paula; Schwich, Oliver D.; Torroba, Marcos; Wang, Gang; Björklund, Tomas

    2016-01-01

    Detailed characterization and mapping of oligonucleotide function in vivo is generally a very time-consuming effort that only allows for hypothesis-driven subsampling of the full sequence to be analysed. Recent advances in deep sequencing together with highly efficient parallel oligonucleotide synthesis and cloning techniques have, however, opened up entirely new ways to map genetic function in vivo. Here we present a novel, optimized protocol for the generation of universally applicable, barcode labelled, plasmid libraries. The libraries are designed to enable the production of viral vector preparations assessing coding or non-coding RNA function in vivo. When generating high diversity libraries, it is a challenge to achieve efficient cloning, unambiguous barcoding and detailed characterization using low-cost sequencing technologies. With the presented protocol, diversity of above 3 million uniquely barcoded adeno-associated viral (AAV) plasmids can be achieved in a single reaction through a process achievable in any molecular biology laboratory. This approach opens up a multitude of in vivo assessments from the evaluation of enhancer and promoter regions to the optimization of genome editing. The generated plasmid libraries are also useful for validation of sequencing clustering algorithms and we here validate the newly presented message passing clustering process named Starcode. PMID:27874090

  12. Viral morphogenesis is the dominant source of sequence censorship in M13 combinatorial peptide phage display.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Rodi, D. J.; Soares, A. S.; Makowski, L.

    Novel statistical methods have been developed and used to quantitate and annotate the sequence diversity within combinatorial peptide libraries on the basis of small numbers (1-200) of sequences selected at random from commercially available M13 p3-based phage display libraries. These libraries behave statistically as though they correspond to populations containing roughly 4.0±1.6% of the random dodecapeptides and 7.9±2.6% of the random constrained heptapeptides that are theoretically possible within the phage populations. Analysis of amino acid residue occurrence patterns shows no demonstrable influence on sequence censorship by Escherichia coli tRNA isoacceptor profiles or either overall codon or Class II codon usage patterns, suggesting no metabolic constraints on recombinant p3 synthesis. There is an overall depression in the occurrence of cysteine, arginine and glycine residues and an overabundance of proline, threonine and histidine residues. The majority of position-dependent amino acid sequence bias is clustered at three positions within the inserted peptides of the dodecapeptide library, +1, +3 and +12 downstream from the signal peptidase cleavage site. Conformational tendency measures of the peptides indicate a significant preference for inserts favoring a β-turn conformation. The observed protein sequence limitations can primarily be attributed to genetic codon degeneracy and signal peptidase cleavage preferences. These data suggest that for applications in which maximal sequence diversity is essential, such as epitope mapping or novel receptor identification, combinatorial peptide libraries should be constructed using codon-corrected trinucleotide cassettes within vector-host systems designed to minimize morphogenesis-related censorship.

  13. A Score of the Ability of a Three-Dimensional Protein Model to Retrieve Its Own Sequence as a Quantitative Measure of Its Quality and Appropriateness

    PubMed Central

    Martínez-Castilla, León P.; Rodríguez-Sotres, Rogelio

    2010-01-01

    Background: Despite the remarkable progress of bioinformatics, how the primary structure of a protein leads to a three-dimensional fold, and in turn determines its function remains an elusive question. Alignments of sequences with known function can be used to identify proteins with the same or similar function with high success. However, identification of function-related and structure-related amino acid positions is only possible after a detailed study of every protein. Folding pattern diversity seems to be much narrower than sequence diversity, and the amino acid sequences of natural proteins have evolved under a selective pressure comprising structural and functional requirements acting in parallel. Principal Findings: The approach described in this work begins by generating a large number of amino acid sequences using ROSETTA [Dantas G et al. (2003) J Mol Biol 332:449–460], a program with notable robustness in the assignment of amino acids to a known three-dimensional structure. The resulting sequence-sets showed no conservation of amino acids at active sites, or protein-protein interfaces. Hidden Markov models built from the resulting sequence sets were used to search sequence databases. Surprisingly, the models retrieved from the database sequences belonged to proteins with the same or a very similar function. Given an appropriate cutoff, the rate of false positives was zero. According to our results, this protocol, here referred to as Rd.HMM, detects fine structural details on the folding patterns, that seem to be tightly linked to the fitness of a structural framework for a specific biological function. Conclusion: Because the sequence of the native protein used to create the Rd.HMM model was always amongst the top hits, the procedure is a reliable tool to score, very accurately, the quality and appropriateness of computer-modeled 3D-structures, without the need for spectroscopy data. However, Rd.HMM is very sensitive to the conformational features of the

  14. Amino-acid sequence and predicted three-dimensional structure of pea seed (Pisum sativum) ferritin.

    PubMed Central

    Lobreaux, S; Yewdall, S J; Briat, J F; Harrison, P M

    1992-01-01

    The iron storage protein, ferritin, is widely distributed in the living kingdom. Here the complete cDNA and derived amino-acid sequence of pea seed ferritin are described, together with its predicted secondary structure, namely a four-helix-bundle fold similar to those of mammalian ferritins, with a fifth short helix at the C-terminus. An N-terminal extension of 71 residues contains a transit peptide (first 47 residues) responsible for plastid targetting as in other plant ferritins, and this is cleaved before assembly. The second part of the extension (24 residues) belongs to the mature subunit; it is cleaved during germination. The amino-acid sequence of pea seed ferritin is aligned with those of other ferritins (49% amino-acid identity with H-chains and 40% with L-chains of human liver ferritin in the aligned region). A three-dimensional model has been constructed by fitting the aligned sequence to the coordinates of human H-chains, with appropriate modifications. A folded conformation with an 11-residue helix is predicted for the N-terminal extension. As in mammalian ferritins, 24 subunits assemble into a hollow shell. In pea seed ferritin, its N-terminal extension is exposed on the outside surface of the shell. Within each pea subunit is a ferroxidase centre resembling those of human ferritin H-chains except for a replacement of Glu-62 by His. The channel at the 4-fold-symmetry axes defined by E-helices, is predicted to be hydrophilic in plant ferritins, whereas it is hydrophobic in mammalian ferritins. PMID:1472006

  15. Binning of shallowly sampled metagenomic sequence fragments reveals that low abundance bacteria play important roles in sulfur cycling and degradation of complex organic polymers in an acid mine drainage community

    NASA Astrophysics Data System (ADS)

    Dick, G. J.; Andersson, A.; Banfield, J. F.

    2007-12-01

    Our understanding of environmental microbiology has been greatly enhanced by community genome sequencing of DNA recovered directly from the environment. Community genomics provides insights into the diversity, community structure, metabolic function, and evolution of natural populations of uncultivated microbes, thereby revealing dynamics of how microorganisms interact with each other and their environment. Recent studies have demonstrated the potential for reconstructing near-complete genomes from natural environments while highlighting the challenges of analyzing community genomic sequence, especially from diverse environments. A major challenge of shotgun community genome sequencing is identification of DNA fragments from minor community members for which only low coverage of genomic sequence is present. We analyzed community genome sequence retrieved from biofilms in an acid mine drainage (AMD) system in the Richmond Mine at Iron Mountain, CA, with an emphasis on identification and assembly of DNA fragments from low-abundance community members. The Richmond mine hosts an extensive, relatively low diversity subterranean chemolithoautotrophic community that is sustained entirely by oxidative dissolution of pyrite. The activity of these microorganisms greatly accelerates the generation of AMD. Previous and ongoing work in our laboratory has focused on reconstructing genomes of dominant community members, including several bacteria and archaea. We binned contigs from several samples (including one new sample and two that had been previously analyzed) by tetranucleotide frequency with clustering by Self-Organizing Maps (SOM). The binning, evaluated by comparison with information from the manually curated assembly of the dominant organisms, was found to be very effective: fragments were correctly assigned with 95% accuracy. Improperly assigned fragments often contained sequences that are either evolutionarily constrained (e.g. 16S rRNA genes) or mobile elements that are
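
    The binning described in this record starts from a tetranucleotide frequency vector for each contig. As a rough illustration only, the sketch below computes such a 256-element vector in plain Python. The function name and the choice to skip windows containing ambiguous bases are assumptions; the Self-Organizing Map clustering step is not shown.

```python
from itertools import product

# All 256 possible tetranucleotides in a fixed order, so that vectors
# from different contigs are directly comparable.
TETRAMERS = ["".join(p) for p in product("ACGT", repeat=4)]

def tetranucleotide_frequencies(contig):
    """Return the relative frequency of each tetranucleotide in a contig.

    Overlapping windows are counted; windows containing characters other
    than A, C, G or T (for example N) are skipped.
    """
    seq = contig.upper()
    counts = dict.fromkeys(TETRAMERS, 0)
    total = 0
    for i in range(len(seq) - 3):
        kmer = seq[i:i + 4]
        if kmer in counts:
            counts[kmer] += 1
            total += 1
    if total == 0:
        return [0.0] * len(TETRAMERS)
    return [counts[k] / total for k in TETRAMERS]

freqs = tetranucleotide_frequencies("ACGTACGT")
```

    Contigs from the same genome tend to have similar vectors, which is what makes clustering on them useful. Practical binning workflows often also merge each tetranucleotide with its reverse complement, since the strand of a contig is arbitrary; that refinement is omitted here.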

  16. Deep sequencing in library selection projects: what insight does it bring?

    PubMed Central

    Glanville, J; D’Angelo, S; Khan, T.A.; Reddy, S. T.; Naranjo, L.; Ferrara, F.; Bradbury, A.R.M.

    2015-01-01

    High throughput sequencing is poised to change all aspects of the way antibodies and other binders are discovered and engineered. Millions of available sequence reads provide an unprecedented sampling depth able to guide the design and construction of effective, high quality naïve libraries containing tens of billions of unique molecules. Furthermore, during selections, high throughput sequencing enables quantitative tracing of enriched clones and position-specific guidance to amino acid variation under positive selection during antibody engineering. Successful application of the technologies relies on specific PCR reagent design, correct sequencing platform selection, and effective use of computational tools and statistical measures to remove error, identify antibodies, estimate diversity, and extract signatures of selection from the clone down to individual structural positions. Here we review these considerations and discuss some of the remaining challenges to the widespread adoption of the technology. PMID:26451649

  17. Structural and functional diversity in Listeria cell wall teichoic acids.

    PubMed

    Shen, Yang; Boulos, Samy; Sumrall, Eric; Gerber, Benjamin; Julian-Rodero, Alicia; Eugster, Marcel R; Fieseler, Lars; Nyström, Laura; Ebert, Marc-Olivier; Loessner, Martin J

    2017-10-27

    Wall teichoic acids (WTAs) are the most abundant glycopolymers found on the cell wall of many Gram-positive bacteria, whose diverse surface structures play key roles in multiple biological processes. Despite recent technological advances in glycan analysis, structural elucidation of WTAs remains challenging due to their complex nature. Here, we employed a combination of ultra-performance liquid chromatography-coupled electrospray ionization tandem-MS/MS and NMR to determine the structural complexity of WTAs from Listeria species. We unveiled more than 10 different types of WTA polymers that vary in their linkage and repeating units. Disparity in GlcNAc to ribitol connectivity, as well as variable O -acetylation and glycosylation of GlcNAc contribute to the structural diversity of WTAs. Notably, SPR analysis indicated that constitution of WTA determines the recognition by bacteriophage endolysins. Collectively, these findings provide detailed insight into Listeria cell wall-associated carbohydrates, and will guide further studies on the structure-function relationship of WTAs. © 2017 by The American Society for Biochemistry and Molecular Biology, Inc.

  18. Linking wine lactic acid bacteria diversity with wine aroma and flavour.

    PubMed

    Cappello, Maria Stella; Zapparoli, Giacomo; Logrieco, Antonio; Bartowsky, Eveline J

    2017-02-21

    In the last two decades knowledge of lactic acid bacteria (LAB) associated with wine has increased considerably. Investigations of the genetics and biochemistry of species involved in malolactic fermentation (MLF), such as Oenococcus oeni and species of Lactobacillus, have enabled a better understanding of their role in aroma modification and microbial stability of wine. In particular, the use of molecular techniques has provided evidence of the high diversity at species and strain level, thus improving the knowledge of wine LAB taxonomy and ecology. These tools have also proved useful for detecting strains with potentially desirable or undesirable traits for winemaking purposes. At the same time, advances have been made in understanding the enzymatic properties of wine LAB responsible for the development of wine aroma molecules. Interestingly, this work has highlighted the high intraspecific variability of enzymatic activities such as glucosidase, esterase, proteases and those related to citrate metabolism within the wine LAB species. The genetic and biochemical diversity that characterizes wine LAB populations can generate a wide spectrum of wine sensory outcomes. This review examines some of these interesting aspects as a way to elucidate the link between LAB diversity and wine aroma and flavour. In particular, the correlation between inter- and intra-species diversity and bacterial metabolic traits that affect the organoleptic properties of wines is highlighted with emphasis on the importance of enzymatic potential of bacteria for the selection of starter cultures to control MLF and to enhance wine aroma. Copyright © 2016 Elsevier B.V. All rights reserved.

  19. Spontaneous organic cocoa bean box fermentations in Brazil are characterized by a restricted species diversity of lactic acid bacteria and acetic acid bacteria.

    PubMed

    Papalexandratou, Zoi; Vrancken, Gino; De Bruyne, Katrien; Vandamme, Peter; De Vuyst, Luc

    2011-10-01

    Spontaneous organic cocoa bean box fermentations were carried out on two different farms in Brazil. Physical parameters, microbial growth, bacterial species diversity [mainly lactic acid bacteria (LAB) and acetic acid bacteria (AAB)], and metabolite kinetics were monitored, and chocolates were produced from the fermented dry cocoa beans. The main end-products of the catabolism of the pulp substrates (glucose, fructose, and citric acid) by yeasts, LAB, and AAB were ethanol, lactic acid, mannitol, and/or acetic acid. Lactobacillus fermentum and Acetobacter pasteurianus were the predominating bacterial species of the fermentations as revealed through (GTG)5-PCR fingerprinting of isolates and PCR-DGGE of 16S rRNA gene PCR amplicons of DNA directly extracted from fermentation samples. Fructobacillus pseudoficulneus, Lactobacillus plantarum, and Acetobacter senegalensis were among the prevailing species during the initial phase of the fermentations. Also, three novel LAB species were found. This study emphasized the possible participation of Enterobacteriaceae in the cocoa bean fermentation process. Tatumella ptyseos and Tatumella citrea were the prevailing enterobacterial species in the beginning of the fermentations as revealed by 16S rRNA gene-PCR-DGGE. Finally, it turned out that control over a restricted bacterial species diversity during fermentation through an ideal post-harvest handling of the cocoa beans will allow the production of high-quality cocoa and chocolates produced thereof, independent of the fermentation method or farm. Copyright © 2011 Elsevier Ltd. All rights reserved.

  20. Amino acid and nucleotide recurrence in aligned sequences: synonymous substitution patterns in association with global and local base compositions.

    PubMed

    Nishizawa, M; Nishizawa, K

    2000-10-01

    The tendency for repetitiveness of nucleotides in DNA sequences has been reported for a variety of organisms. We show that the tendency for repetitive use of amino acids is widespread and is observed even for segments conserved between human and Drosophila melanogaster at the level of >50% amino acid identity. This indicates that repetitiveness influences not only the weakly constrained segments but also those sequence segments conserved among phyla. Not only glutamine (Q) but also many of the 20 amino acids show a comparable level of repetitiveness. Repetitiveness in bases at codon position 3 is stronger for human than for D.melanogaster, whereas local repetitiveness in intron sequences is similar between the two organisms. While genes for immune system-specific proteins, but not ancient human genes (i.e. human homologs of Escherichia coli genes), have repetitiveness at codon bases 1 and 2, repetitiveness at codon base 3 for these groups is similar, suggesting that the human genome has at least two mechanisms generating local repetitiveness. Neither amino acid nor nucleotide repetitiveness is observed beyond the exon boundary, denying the possibility that such repetitiveness could mainly stem from natural selection on mRNA or protein sequences. Analyses of mammalian sequence alignments show that while the 'between gene' GC content heterogeneity, which is linked to 'isochores', is a principal factor associated with the bias in substitution patterns in human, 'within gene' heterogeneity in nucleotide composition is also associated with such bias on a more local scale. The relationship amongst the various types of repetitiveness is discussed.

  1. Amino acid and nucleotide recurrence in aligned sequences: synonymous substitution patterns in association with global and local base compositions

    PubMed Central

    Nishizawa, Manami; Nishizawa, Kazuhisa

    2000-01-01

    The tendency for repetitiveness of nucleotides in DNA sequences has been reported for a variety of organisms. We show that the tendency for repetitive use of amino acids is widespread and is observed even for segments conserved between human and Drosophila melanogaster at the level of >50% amino acid identity. This indicates that repetitiveness influences not only the weakly constrained segments but also those sequence segments conserved among phyla. Not only glutamine (Q) but also many of the 20 amino acids show a comparable level of repetitiveness. Repetitiveness in bases at codon position 3 is stronger for human than for D.melanogaster, whereas local repetitiveness in intron sequences is similar between the two organisms. While genes for immune system-specific proteins, but not ancient human genes (i.e. human homologs of Escherichia coli genes), have repetitiveness at codon bases 1 and 2, repetitiveness at codon base 3 for these groups is similar, suggesting that the human genome has at least two mechanisms generating local repetitiveness. Neither amino acid nor nucleotide repetitiveness is observed beyond the exon boundary, denying the possibility that such repetitiveness could mainly stem from natural selection on mRNA or protein sequences. Analyses of mammalian sequence alignments show that while the ‘between gene’ GC content heterogeneity, which is linked to ‘isochores’, is a principal factor associated with the bias in substitution patterns in human, ‘within gene’ heterogeneity in nucleotide composition is also associated with such bias on a more local scale. The relationship amongst the various types of repetitiveness is discussed. PMID:11000273

  2. Single molecule sequencing to track plasmid diversity of hospital-associated carbapenemase-producing Enterobacteriaceae

    PubMed Central

    Conlan, Sean; Thomas, Pamela J.; Deming, Clayton; Park, Morgan; Lau, Anna F.; Dekker, John P.; Snitkin, Evan S.; Clark, Tyson A.; Luong, Khai; Song, Yi; Tsai, Yu-Chih; Boitano, Matthew; Gupta, Jyoti; Brooks, Shelise Y.; Schmidt, Brian; Young, Alice C.; Thomas, James W.; Bouffard, Gerard G.; Blakesley, Robert W.; Mullikin, James C.; Korlach, Jonas; Henderson, David K.; Frank, Karen M.; Palmore, Tara N.; Segre, Julia A.

    2014-01-01

    Public health officials have raised concerns that plasmid transfer between Enterobacteriaceae species may spread resistance to carbapenems, an antibiotic class of last resort, thereby rendering common healthcare-associated infections nearly impossible to treat. We performed comprehensive surveillance and genomic sequencing to identify carbapenem-resistant Enterobacteriaceae in the NIH Clinical Center patient population and hospital environment in order to articulate the diversity of carbapenemase-encoding plasmids and assess the mobility of these plasmids between bacterial species. We isolated a repertoire of carbapenemase-encoding Enterobacteriaceae, including multiple strains of Klebsiella pneumoniae, Klebsiella oxytoca, Escherichia coli, Enterobacter cloacae, Citrobacter freundii, and Pantoea species. Long-read genome sequencing with full end-to-end assembly revealed that these organisms carry the carbapenem-resistance genes on a wide array of plasmids. Klebsiella pneumoniae and Enterobacter cloacae isolated simultaneously from a single patient harbored two different carbapenemase-encoding plasmids, overriding the epidemiological scenario of plasmid transfer between organisms within this patient. We did, however, find evidence supporting horizontal transfer of carbapenemase-encoding plasmids between Klebsiella pneumoniae, Enterobacter cloacae and Citrobacter freundii in the hospital environment. Our comprehensive sequence data, with full plasmid identification, challenges assumptions about horizontal gene transfer events within patients and identified wider possible connections between patients and the hospital environment. In addition, we identified a new carbapenemase-encoding plasmid of potentially high clinical impact carried by Klebsiella pneumoniae, Escherichia coli, Enterobacter cloacae and Pantoea species, from unrelated patients and the hospital environment. PMID:25232178

  3. RNAblueprint: flexible multiple target nucleic acid sequence design.

    PubMed

    Hammer, Stefan; Tschiatschek, Birgit; Flamm, Christoph; Hofacker, Ivo L; Findeiß, Sven

    2017-09-15

    Realizing the value of synthetic biology in biotechnology and medicine requires the design of molecules with specialized functions. Due to its close structure to function relationship, and the availability of good structure prediction methods and energy models, RNA is perfectly suited to be synthetically engineered with predefined properties. However, currently available RNA design tools cannot be easily adapted to accommodate new design specifications. Furthermore, complicated sampling and optimization methods are often developed to suit a specific RNA design goal, adding to their inflexibility. We developed a C++ library implementing a graph coloring approach to stochastically sample sequences compatible with structural and sequence constraints from the typically very large solution space. The approach allows one to specify and explore the solution space in a well-defined way. Our library also guarantees uniform sampling, which makes optimization runs performant by not only avoiding re-evaluation of already found solutions, but also by raising the probability of finding better solutions for long optimization runs. We show that our software can be combined with any other software package to allow diverse RNA design applications. Scripting interfaces allow the easy adaptation of existing code to accommodate new scenarios, making the whole design process very flexible. We implemented example design approaches written in Python to demonstrate these advantages. RNAblueprint, Python implementations and benchmark datasets are available on GitHub: https://github.com/ViennaRNA. Contact: s.hammer@univie.ac.at, ivo@tbi.univie.ac.at or sven@tbi.univie.ac.at. Supplementary data are available at Bioinformatics online. © The Author(s) 2017. Published by Oxford University Press.

  4. Intracellular diversity of the V4 and V9 regions of the 18S rRNA in marine protists (radiolarians) assessed by high-throughput sequencing.

    PubMed

    Decelle, Johan; Romac, Sarah; Sasaki, Eriko; Not, Fabrice; Mahé, Frédéric

    2014-01-01

    Metabarcoding is a powerful tool for exploring microbial diversity in the environment, but its accurate interpretation is impeded by diverse technical (e.g. PCR and sequencing errors) and biological biases (e.g. intra-individual polymorphism) that remain poorly understood. To help interpret environmental metabarcoding datasets, we investigated the intracellular diversity of the V4 and V9 regions of the 18S rRNA gene from Acantharia and Nassellaria (radiolarians) using 454 pyrosequencing. Individual cells of radiolarians were isolated, and PCRs were performed with generalist primers to amplify the V4 and V9 regions. Different denoising procedures were employed to filter the pyrosequenced raw amplicons (Acacia, AmpliconNoise, Linkage method). For each of the six isolated cells, an average of 541 V4 and 562 V9 amplicons assigned to radiolarians were obtained, from which one numerically dominant sequence and several minor variants were found. At the 97% identity threshold, a diversity metric commonly used in environmental surveys, up to 5 distinct OTUs were detected in a single cell. However, most amplicons grouped within a single OTU whereas other OTUs contained very few amplicons. Different analytical methods provided evidence that most minor variants forming different OTUs correspond to PCR and sequencing artifacts. Duplicate PCR and sequencing from the same DNA extract of a single cell had only 9 to 16% of unique amplicons in common, and alignment visualization of V4 and V9 amplicons showed that most minor variants contained substitutions in highly-conserved regions. We conclude that intracellular variability of the 18S rRNA in radiolarians is very limited despite its multi-copy nature and the existence of multiple nuclei in these protists. Our study recommends some technical guidelines to conservatively discard artificial amplicons from metabarcoding datasets, and thus properly assess the diversity and richness of protists in the environment.
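
    To make the idea of grouping amplicons into OTUs at 97% identity concrete, here is a small sketch of greedy, abundance-ordered clustering. It is not the pipeline used in the study: it assumes the amplicons are already aligned to equal length, measures identity as the fraction of matching columns, and uses hypothetical function names.

```python
from collections import Counter

def identity(a, b):
    """Fraction of identical positions between two equal-length aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    if not a:
        return 0.0
    return sum(x == y for x, y in zip(a, b)) / len(a)

def greedy_otus(amplicons, threshold=0.97):
    """Cluster amplicons into OTUs, visiting the most abundant sequences first.

    Each distinct sequence joins the first existing OTU whose centroid it
    matches at or above the threshold; otherwise it founds a new OTU.
    """
    otus = []
    for seq, n in Counter(amplicons).most_common():
        for otu in otus:
            if identity(seq, otu["centroid"]) >= threshold:
                otu["size"] += n
                break
        else:
            otus.append({"centroid": seq, "size": n})
    return otus

# Toy input: one dominant sequence, one variant at 98% identity,
# and one variant at 95% identity (all 100 columns long).
dominant = "A" * 100
reads = [dominant] * 5 + ["A" * 98 + "CC", "A" * 95 + "CCCCC"]
otus = greedy_otus(reads)
```

    On this toy input the 98% variant is absorbed into the dominant OTU while the 95% variant forms a second, single-read OTU, mirroring the pattern reported above of one large OTU accompanied by a few very small ones.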

  5. Genetic diversity and population structure analysis of spinach by single-nucleotide polymorphisms identified through genotyping-by-sequencing.

    PubMed

    Shi, Ainong; Qin, Jun; Mou, Beiquan; Correll, James; Weng, Yuejin; Brenner, David; Feng, Chunda; Motes, Dennis; Yang, Wei; Dong, Lingdi; Bhattarai, Gehendra; Ravelombola, Waltram

    2017-01-01

    Spinach (Spinacia oleracea L., 2n = 2x = 12) is an economically important vegetable crop worldwide and one of the healthiest vegetables due to its high concentrations of nutrients and minerals. The objective of this research was to conduct genetic diversity and population structure analysis of a collection of world-wide spinach genotypes using single nucleotide polymorphism (SNP) markers. Genotyping by sequencing (GBS) was used to discover SNPs in spinach genotypes. Three sets of spinach genotypes were used: 1) 268 USDA GRIN spinach germplasm accessions originally collected from 30 countries; 2) 45 commercial spinach F1 hybrids from three countries; and 3) 30 US Arkansas spinach cultivars/breeding lines. The results from this study indicated that there was genetic diversity among the 343 spinach genotypes tested. Furthermore, the improved commercial F1 hybrids and the Arkansas cultivars/lines had differently structured populations from the USDA germplasm. In addition, the genetic diversity and population structures were associated with geographic origin, and germplasm from the US Arkansas breeding program had a unique genetic background. These data could provide genetic diversity information and the molecular markers for selecting parents in spinach breeding programs.

  6. Genetic diversity and population structure analysis of spinach by single-nucleotide polymorphisms identified through genotyping-by-sequencing

    PubMed Central

    Qin, Jun; Mou, Beiquan; Correll, James; Weng, Yuejin; Brenner, David; Feng, Chunda; Motes, Dennis; Yang, Wei; Dong, Lingdi; Bhattarai, Gehendra; Ravelombola, Waltram

    2017-01-01

    Spinach (Spinacia oleracea L., 2n = 2x = 12) is an economically important vegetable crop worldwide and one of the healthiest vegetables due to its high concentrations of nutrients and minerals. The objective of this research was to conduct genetic diversity and population structure analysis of a collection of world-wide spinach genotypes using single nucleotide polymorphism (SNP) markers. Genotyping by sequencing (GBS) was used to discover SNPs in spinach genotypes. Three sets of spinach genotypes were used: 1) 268 USDA GRIN spinach germplasm accessions originally collected from 30 countries; 2) 45 commercial spinach F1 hybrids from three countries; and 3) 30 US Arkansas spinach cultivars/breeding lines. The results from this study indicated that there was genetic diversity among the 343 spinach genotypes tested. Furthermore, the improved commercial F1 hybrids and the Arkansas cultivars/lines had differently structured populations from the USDA germplasm. In addition, the genetic diversity and population structures were associated with geographic origin, and germplasm from the US Arkansas breeding program had a unique genetic background. These data could provide genetic diversity information and the molecular markers for selecting parents in spinach breeding programs. PMID:29190770

  7. Amino acid sequence of bovine muzzle epithelial desmocollin derived from cloned cDNA: a novel subtype of desmosomal cadherins.

    PubMed

    Koch, P J; Goldschmidt, M D; Walsh, M J; Zimbelmann, R; Schmelz, M; Franke, W W

    1991-05-01

    Desmosomes are cell-type-specific intercellular junctions found in epithelium, myocardium and certain other tissues. They consist of assemblies of molecules involved in the adhesion of specific cell types and in the anchorage of cell-type-specific cytoskeletal elements, the intermediate-size filaments, to the plasma membrane. To explore the individual desmosomal components and their functions we have isolated DNA clones encoding the desmosomal glycoprotein, desmocollin, using antibodies and a cDNA expression library from bovine muzzle epithelium. The cDNA-deduced amino-acid sequence of desmocollin (presently we cannot decide to which of the two desmocollins, DC I or DC II, this clone relates) defines a polypeptide with a calculated molecular weight of 85,000, with a single candidate sequence of 24 amino acids sufficiently long for a transmembrane arrangement, and an extracellular aminoterminal portion of 561 amino acid residues, compared to a cytoplasmic part of only 176 amino acids. Amino acid sequence comparisons have revealed that desmocollin is highly homologous to members of the cadherin family of cell adhesion molecules, including the previously sequenced desmoglein, another desmosome-specific cadherin. Using riboprobes derived from cDNAs for Northern-blot analyses, we have identified an mRNA of approximately 6 kb in stratified epithelia such as muzzle epithelium and tongue mucosa but not in two epithelial cell culture lines containing desmosomes and desmoplakins. The difference may indicate drastic differences in mRNA concentration or the existence of cell-type-specific desmocollin subforms. The molecular topology of desmocollin(s) is discussed in relation to possible functions of the individual molecular domains.

  8. The amino acid sequence around the active-site cysteine and histidine residues of stem bromelain

    PubMed Central

    Husain, S. S.; Lowe, G.

    1970-01-01

    Stem bromelain that had been irreversibly inhibited with 1,3-dibromo[2-14C]-acetone was reduced with sodium borohydride and carboxymethylated with iodoacetic acid. After digestion with trypsin and α-chymotrypsin three radioactive peptides were isolated chromatographically. The amino acid sequences around the cross-linked cysteine and histidine residues were determined and showed a high degree of homology with those around the active-site cysteine and histidine residues of papain and ficin. PMID:5420046

  9. Predicted secondary structure similarity in the absence of primary amino acid sequence homology: hepatitis B virus open reading frames.

    PubMed Central

    Schaeffer, E; Sninsky, J J

    1984-01-01

    Proteins that are related evolutionarily may have diverged at the level of primary amino acid sequence while maintaining similar secondary structures. Computer analysis has been used to compare the open reading frames of the hepatitis B virus to those of the woodchuck hepatitis virus at the level of amino acid sequence, and to predict the relative hydrophilic character and the secondary structure of putative polypeptides. Similarity is seen at the levels of relative hydrophilicity and secondary structure, in the absence of sequence homology. These data reinforce the proposal that these open reading frames encode viral proteins. Computer analysis of this type can be more generally used to establish structural similarities between proteins that do not share obvious sequence homology as well as to assess whether an open reading frame is fortuitous or codes for a protein. PMID:6585835
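
    As a rough illustration of the kind of profile comparison described above, the sketch below computes a sliding-window hydropathy profile for a protein sequence. The abstract does not name the scale or window the authors used; the Kyte-Doolittle scale, the window size of 7, and the function name `hydropathy_profile` are assumptions made for this example only. On this scale, lower (more negative) values indicate more hydrophilic stretches.

```python
# Kyte-Doolittle hydropathy values (assumed scale; the record does not say
# which scale the original analysis used). Positive = hydrophobic.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=7):
    """Return the mean hydropathy of every full window along the sequence.

    Raises KeyError for residues outside the 20 standard amino acids.
    Returns an empty list if the sequence is shorter than the window.
    """
    values = [KYTE_DOOLITTLE[aa] for aa in sequence.upper()]
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]
```

    Two profiles computed this way for homologous open reading frames could then be compared position by position (for example by correlation) even when the residues themselves differ.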

  10. The sequence of sequencers: The history of sequencing DNA.

    PubMed

    Heather, James M; Chain, Benjamin

    2016-01-01

    Determining the order of nucleic acid residues in biological samples is an integral component of a wide variety of research applications. Over the last fifty years large numbers of researchers have applied themselves to the production of techniques and technologies to facilitate this feat, sequencing DNA and RNA molecules. This time-scale has witnessed tremendous changes, moving from sequencing short oligonucleotides to millions of bases, from struggling towards the deduction of the coding sequence of a single gene to rapid and widely available whole genome sequencing. This article traverses those years, iterating through the different generations of sequencing technology, highlighting some of the key discoveries, researchers, and sequences along the way. Copyright © 2015 The Authors. Published by Elsevier Inc. All rights reserved.

  11. Deep sequencing reveals exceptional diversity and modes of transmission for bacterial sponge symbionts.

    PubMed

    Webster, Nicole S; Taylor, Michael W; Behnam, Faris; Lücker, Sebastian; Rattei, Thomas; Whalan, Stephen; Horn, Matthias; Wagner, Michael

    2010-08-01

    Marine sponges contain complex bacterial communities of considerable ecological and biotechnological importance, with many of these organisms postulated to be specific to sponge hosts. Testing this hypothesis in light of the recent discovery of the rare microbial biosphere, we investigated three Australian sponges by massively parallel 16S rRNA gene tag pyrosequencing. Here we show bacterial diversity that is unparalleled in an invertebrate host, with more than 250,000 sponge-derived sequence tags being assigned to 23 bacterial phyla and revealing up to 2996 operational taxonomic units (95% sequence similarity) per sponge species. Of the 33 previously described 'sponge-specific' clusters that were detected in this study, 48% were found exclusively in adults and larvae - implying vertical transmission of these groups. The remaining taxa, including 'Poribacteria', were also found at very low abundance among the 135,000 tags retrieved from surrounding seawater. Thus, members of the rare seawater biosphere may serve as seed organisms for widely occurring symbiont populations in sponges and their host association might have evolved much more recently than previously thought. © 2009 Society for Applied Microbiology and Blackwell Publishing Ltd.

  12. Microbial Diversity of Acidic Hot Spring (Kawah Hujan B) in Geothermal Field of Kamojang Area, West Java-Indonesia

    PubMed Central

    Aditiawati, Pingkan; Yohandini, Heni; Madayanti, Fida; Akhmaloka

    2009-01-01

    Microbial communities in an acidic hot spring, namely Kawah Hujan B, at Kamojang geothermal field, West Java-Indonesia were examined using culture-dependent and culture-independent strategies. Chemical analysis of the hot spring water showed characteristics of acidic-sulfate geothermal activity, with high sulfate concentrations and low pH values (pH 1.8 to 1.9). The microbial community present in the spring was characterized by 16S rRNA gene analysis combined with denaturing gradient gel electrophoresis (DGGE). The majority of the sequences recovered by the culture-independent method were closely related to the Crenarchaeota and Proteobacteria phyla. However, detailed comparison among the members of Crenarchaeota showed some sequence variation relative to the published data, especially in the hypervariable and variable regions. In addition, the sequences did not belong to any particular genus. Meanwhile, the 16S rDNA sequences from culture-dependent samples were mostly close to Firmicutes and gamma Proteobacteria. PMID:19440252

  13. Microbial diversity of acidic hot spring (Kawah Hujan B) in geothermal field of Kamojang area, West Java-Indonesia.

    PubMed

    Aditiawati, Pingkan; Yohandini, Heni; Madayanti, Fida; Akhmaloka

    2009-01-01

    Microbial communities in an acidic hot spring, namely Kawah Hujan B, at Kamojang geothermal field, West Java-Indonesia were examined using culture-dependent and culture-independent strategies. Chemical analysis of the hot spring water showed characteristics of acidic-sulfate geothermal activity, with high sulfate concentrations and low pH values (pH 1.8 to 1.9). The microbial community present in the spring was characterized by 16S rRNA gene analysis combined with denaturing gradient gel electrophoresis (DGGE). The majority of the sequences recovered by the culture-independent method were closely related to the Crenarchaeota and Proteobacteria phyla. However, detailed comparison among the members of Crenarchaeota showed some sequence variation relative to the published data, especially in the hypervariable and variable regions. In addition, the sequences did not belong to any particular genus. Meanwhile, the 16S rDNA sequences from culture-dependent samples were mostly close to Firmicutes and gamma Proteobacteria.

  14. Low Diversity Cryptococcus neoformans Variety grubii Multilocus Sequence Types from Thailand Are Consistent with an Ancestral African Origin

    PubMed Central

    Simwami, Sitali P.; Khayhan, Kantarawee; Henk, Daniel A.; Aanensen, David M.; Boekhout, Teun; Hagen, Ferry; Brouwer, Annemarie E.; Harrison, Thomas S.; Donnelly, Christl A.; Fisher, Matthew C.

    2011-01-01

    The global burden of HIV-associated cryptococcal meningitis is estimated at nearly one million cases per year, causing up to a third of all AIDS-related deaths. Molecular epidemiology constitutes the main methodology for understanding the factors underpinning the emergence of this understudied, yet increasingly important, group of pathogenic fungi. Cryptococcus species are notable in the degree that virulence differs amongst lineages, and highly-virulent emerging lineages are changing patterns of human disease both temporally and spatially. Cryptococcus neoformans variety grubii (Cng, serotype A) constitutes the most ubiquitous cause of cryptococcal meningitis worldwide, however patterns of molecular diversity are understudied across some regions experiencing significant burdens of disease. We compared 183 clinical and environmental isolates of Cng from one such region, Thailand, Southeast Asia, against a global MLST database of 77 Cng isolates. Population genetic analyses showed that Thailand isolates from 11 provinces were highly homogenous, consisting of the same genetic background (globally known as VNI) and exhibiting only ten nearly identical sequence types (STs), with three (STs 44, 45 and 46) dominating our sample. This population contains significantly less diversity when compared against the global population of Cng, specifically Africa. Genetic diversity in Cng was significantly subdivided at the continental level with nearly half (47%) of the global STs unique to a genetically diverse and recombining population in Botswana. These patterns of diversity, when combined with evidence from haplotypic networks and coalescent analyses of global populations, are highly suggestive of an expansion of the Cng VNI clade out of Africa, leading to a limited number of genotypes founding the Asian populations. Divergence time testing estimates the time to the most common ancestor between the African and Asian populations to be 6,920 years ago (95% HPD 122.96 - 27

  15. Low diversity Cryptococcus neoformans variety grubii multilocus sequence types from Thailand are consistent with an ancestral African origin.

    PubMed

    Simwami, Sitali P; Khayhan, Kantarawee; Henk, Daniel A; Aanensen, David M; Boekhout, Teun; Hagen, Ferry; Brouwer, Annemarie E; Harrison, Thomas S; Donnelly, Christl A; Fisher, Matthew C

    2011-04-01

    The global burden of HIV-associated cryptococcal meningitis is estimated at nearly one million cases per year, causing up to a third of all AIDS-related deaths. Molecular epidemiology constitutes the main methodology for understanding the factors underpinning the emergence of this understudied, yet increasingly important, group of pathogenic fungi. Cryptococcus species are notable in the degree that virulence differs amongst lineages, and highly-virulent emerging lineages are changing patterns of human disease both temporally and spatially. Cryptococcus neoformans variety grubii (Cng, serotype A) constitutes the most ubiquitous cause of cryptococcal meningitis worldwide, however patterns of molecular diversity are understudied across some regions experiencing significant burdens of disease. We compared 183 clinical and environmental isolates of Cng from one such region, Thailand, Southeast Asia, against a global MLST database of 77 Cng isolates. Population genetic analyses showed that Thailand isolates from 11 provinces were highly homogenous, consisting of the same genetic background (globally known as VNI) and exhibiting only ten nearly identical sequence types (STs), with three (STs 44, 45 and 46) dominating our sample. This population contains significantly less diversity when compared against the global population of Cng, specifically Africa. Genetic diversity in Cng was significantly subdivided at the continental level with nearly half (47%) of the global STs unique to a genetically diverse and recombining population in Botswana. These patterns of diversity, when combined with evidence from haplotypic networks and coalescent analyses of global populations, are highly suggestive of an expansion of the Cng VNI clade out of Africa, leading to a limited number of genotypes founding the Asian populations. Divergence time testing estimates the time to the most common ancestor between the African and Asian populations to be 6,920 years ago (95% HPD 122.96 - 27

  16. The diversity of H3 loops determines the antigen-binding tendencies of antibody CDR loops.

    PubMed

    Tsuchiya, Yuko; Mizuguchi, Kenji

    2016-04-01

    Of the complementarity-determining regions (CDRs) of antibodies, H3 loops, with varying amino acid sequences and loop lengths, adopt particularly diverse loop conformations. The diversity of H3 conformations produces an array of antigen recognition patterns involving all the CDRs, in which the residue positions actually in contact with the antigen vary considerably. Therefore, for a deeper understanding of antigen recognition, it is necessary to relate the sequence and structural properties of each residue position in each CDR loop to its ability to bind antigens. In this study, we proposed a new method for characterizing the structural features of the CDR loops and obtained the antigen-binding ability of each residue position in each CDR loop. This analysis led to a simple set of rules for identifying probable antigen-binding residues. We also found that the diversity of H3 loop lengths and conformations affects the antigen-binding tendencies of all the CDR loops. © 2016 The Protein Society.

  17. Sequence diversity of hepatitis C virus 6a within the extended interferon sensitivity-determining region correlates with interferon-alpha/ribavirin treatment outcomes.

    PubMed

    Zhou, Daniel X M; Chan, Paul K S; Zhang, Tiejun; Tully, Damien C; Tam, John S

    2010-10-01

    Studies on the association between sequence variability of the interferon sensitivity-determining region (ISDR) of hepatitis C virus and the outcome of treatment have reached conflicting results. In this study, 25 patients infected with HCV 6a who had received interferon-alpha/ribavirin combination treatment were analyzed for sequence variations. Fourteen of them had full genome sequences obtained from a previous study, whereas the other 11 samples were sequenced for the extended ISDR (eISDR). This eISDR fragment covers 192 bp (64 amino acids) upstream and 201 bp (67 amino acids) downstream from the ISDR previously defined for HCV 1b. The comparison between interferon-alpha resistance and response groups for the amino acid mutations located in the full genome (6 and 8 patients respectively) as well as the mutations located in the eISDR (10 and 15 patients respectively) showed that the mutations I2160V, I2256V, V2292I (P<0.05) within eISDR were significantly associated with resistance to treatment. However, the extent of amino acid variations within the previously defined ISDR was not associated with resistance to treatment as previously reported. Four amino acid variations located outside the eISDR, I248V (P=0.03-0.06) within E1, R445K (P=0.02-0.05) and S747T (P=0.03) within E2, and I861V (P=0.01) within NS2, may also be associated with treatment outcome, as identified by a prescreening of variations within 14 HCV 6a full genomes. (c) 2010 Elsevier B.V. All rights reserved.

  18. Elucidating the substrate specificities of acyl-lipid thioesterases from diverse plant taxa.

    PubMed

    Kalinger, Rebecca S; Pulsifer, Ian P; Rowland, Owen

    2018-06-01

    Acyl-ACP thioesterase enzymes, which cleave fatty acyl thioester bonds to release free fatty acids, contribute to much of the fatty acid diversity in plants. In Arabidopsis thaliana, a family of four single hot-dog fold domain, plastid-localized acyl-lipid thioesterases (AtALT1-4) generate medium-chain (C6-C14) fatty and β-keto fatty acids as secondary metabolites. These volatile products may serve to attract insect pollinators or deter predatory insects. Homologs of AtALT1-4 are present in all plant taxa, but are nearly all uncharacterized. Despite high sequence identity, AtALT1-4 generate different lipid products, suggesting that ALT homologs in other plants also have highly varied activities. We investigated the catalytic diversity of ALT-like thioesterases by screening the substrate specificities of 15 ALT homologs from monocots, eudicots, a lycophyte, a green microalga, and the ancient gymnosperm Ginkgo biloba, via expression in Escherichia coli. Overall, these enzymes had highly varied substrate preferences compared to one another and to AtALT1-4, and could be classified into four catalytic groups comprising members from diverse taxa. Group 1 ALTs primarily generated 14:1 β-keto fatty acids, Group 2 ALTs produced 6-10 carbon fatty/β-keto fatty acids, Group 3 ALTs predominantly produced 12-14 carbon fatty acids, and Group 4 ALTs mainly generated 16 carbon fatty acids. Enzymes in each group differed significantly in the quantities of lipids and types of minor products they generated in E. coli. Medium-chain fatty acids are used to manufacture insecticides, pharmaceuticals, and biofuels, and ALT-like proteins are ideal candidates for metabolic engineering to produce specific fatty acids in significant quantities. Copyright © 2018 Elsevier Masson SAS. All rights reserved.

  19. A frequency-based linguistic approach to protein decoding and design: Simple concepts, diverse applications, and the SCS Package

    PubMed Central

    Motomura, Kenta; Nakamura, Morikazu; Otaki, Joji M.

    2013-01-01

    Protein structure and function information is coded in amino acid sequences. However, the relationship between primary sequences and three-dimensional structures and functions remains enigmatic. Our approach to this fundamental biochemistry problem is based on the frequencies of short constituent sequences (SCSs) or words. A protein amino acid sequence is considered analogous to an English sentence, where SCSs are equivalent to words. Availability scores, which are defined as real SCS frequencies in the non-redundant amino acid database relative to their probabilistically expected frequencies, demonstrate the biological usage bias of SCSs. As a result, this frequency-based linguistic approach is expected to have diverse applications, such as secondary structure specifications by structure-specific SCSs and immunological adjuvants with rare or non-existent SCSs. Linguistic similarities (e.g., wide ranges of scale-free distributions) and dissimilarities (e.g., behaviors of low-rank samples) between proteins and the natural English language have been revealed in the rank-frequency relationships of SCSs or words. We have developed a web server, the SCS Package, which contains five applications for analyzing protein sequences based on the linguistic concept. These tools have the potential to assist researchers in deciphering structurally and functionally important protein sites, species-specific sequences, and functional relationships between SCSs. The SCS Package also provides researchers with a tool to construct amino acid sequences de novo based on the idiomatic usage of SCSs. PMID:24688703
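
    The availability score defined above (observed word frequency relative to its probabilistically expected frequency) can be sketched as follows. This is a minimal illustration, not the SCS Package's implementation: the function name `availability_scores`, the default word length of 3, and the choice to model the expected frequency as the product of single-residue frequencies in the same sequence set are all assumptions made for this example.

```python
from collections import Counter
from math import prod


def availability_scores(sequences, k=3):
    """Map each k-residue word to its observed/expected frequency ratio.

    Expected frequency is assumed to be the product of the background
    frequencies of the word's residues (an independence model).
    """
    residue_counts = Counter()
    word_counts = Counter()
    for seq in sequences:
        residue_counts.update(seq)
        # Count every overlapping window of length k.
        word_counts.update(seq[i:i + k] for i in range(len(seq) - k + 1))

    total_words = sum(word_counts.values())
    if total_words == 0:
        return {}
    total_residues = sum(residue_counts.values())
    residue_freq = {aa: n / total_residues for aa, n in residue_counts.items()}

    scores = {}
    for word, count in word_counts.items():
        observed = count / total_words
        expected = prod(residue_freq[aa] for aa in word)
        scores[word] = observed / expected
    return scores
```

    Under this model a score above 1 marks a word used more often than chance would predict, and a score below 1 marks an under-used word; words that never occur simply do not appear in the result.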

  20. Diversity of lactic acid bacteria associated with traditional fermented dairy products in Mongolia.

    PubMed

    Yu, J; Wang, W H; Menghe, B L G; Jiri, M T; Wang, H M; Liu, W J; Bao, Q H; Lu, Q; Zhang, J C; Wang, F; Xu, H Y; Sun, T S; Zhang, H P

    2011-07-01

    Spontaneous milk fermentation has a long history in Mongolia, and beneficial microorganisms have been handed down from one generation to the next for use in fermented dairy products. The objective of this study was to investigate the diversity of lactic acid bacteria (LAB) communities in fermented yak, mare, goat, and cow milk products by analyzing 189 samples collected from 13 different regions in Mongolia. The LAB counts in these samples varied from 3.41 to 9.03 log cfu/mL. Fermented yak and mare milks had almost identical mean numbers of LAB, which were significantly higher than those in fermented goat milk but slightly lower than those in fermented cow milk. In total, 668 isolates were obtained from these samples using de Man, Rogosa, and Sharpe agar and M17 agar. Each isolate was considered to be presumptive LAB based on gram-positive and catalase-negative properties, and was identified at the species level by 16S rRNA gene sequencing, multiplex PCR assay, and restriction fragment length polymorphism analysis. All isolates from Mongolian dairy products were accurately identified as Enterococcus faecalis (1 strain), Enterococcus durans (3 strains), Lactobacillus brevis (3 strains), Lactobacillus buchneri (2 strains), Lactobacillus casei (16 strains), Lactobacillus delbrueckii ssp. bulgaricus (142 strains), Lactobacillus diolivorans (17 strains), Lactobacillus fermentum (42 strains), Lactobacillus helveticus (183 strains), Lactobacillus kefiri (6 strains), Lactobacillus plantarum ssp. plantarum (7 strains), Lactococcus lactis ssp. lactis (7 strains), Leuconostoc lactis (22 strains), Leuconostoc mesenteroides (21 strains), Streptococcus thermophilus (195 strains), and Weissella cibaria (1 strain). The predominant LAB were Strep. thermophilus and Lb. helveticus, which were isolated from all sampling sites. The results demonstrate that traditional fermented dairy products from different regions of Mongolia have complex compositions of LAB species. 
Such diversity of

  1. Archaeal β diversity patterns under the seafloor along geochemical gradients

    NASA Astrophysics Data System (ADS)

    Koyano, Hitoshi; Tsubouchi, Taishi; Kishino, Hirohisa; Akutsu, Tatsuya

    2014-09-01

    Recently, deep drilling into the seafloor has revealed that there are vast sedimentary ecosystems of diverse microorganisms, particularly archaea, in subsurface areas. We investigated the β diversity patterns of archaeal communities in sediment layers under the seafloor and their determinants. This study was accomplished by analyzing large environmental samples of 16S ribosomal RNA gene sequences and various geochemical data collected from a sediment core of 365.3 m, obtained by drilling into the seafloor off the east coast of the Shimokita Peninsula. To extract the maximum amount of information from these environmental samples, we first developed a method for measuring β diversity using sequence data by applying probability theory on a set of strings developed by two of the authors in a previous publication. We introduced an index of β diversity between sequence populations from which the sequence data were sampled. We then constructed an estimator of the β diversity index based on the sequence data and demonstrated that it converges to the β diversity index between sequence populations with probability of 1 as the number of sampled sequences increases. Next, we applied this new method to quantify β diversities between archaeal sequence populations under the seafloor and constructed a quantitative model of the estimated β diversity patterns. Nearly 90% of the variation in the archaeal β diversity was explained by a model that included as variables the differences in the abundances of chlorine, iodine, and carbon between the sediment layers.

  2. Diversity of Pneumolysin and Pneumococcal Histidine Triad Protein D of Streptococcus pneumoniae Isolated from Invasive Diseases in Korean Children.

    PubMed

    Yun, Ki Wook; Lee, Hyunju; Choi, Eun Hwa; Lee, Hoan Jong

    2015-01-01

    Pneumolysin (Ply) and pneumococcal histidine triad protein D (PhtD) are candidate proteins for a next-generation pneumococcal vaccine. We aimed to analyze the genetic diversity and antigenic heterogeneity of Ply and PhtD for 173 pneumococci isolated from invasive diseases in Korean children. Alleles were designated based on variation in the amino acid sequence. Antigenicity was predicted from the amino acid hydrophobicity of the region. There were seven and 39 allele types for the ply and phtD genes, respectively. The nucleotide sequence identity was 97.2%-99.9% for the ply gene and 91.4%-98.0% for the phtD gene. Only minor variations in hydrophobicity were noted among the antigenicity plots of Ply and PhtD. Overall, the allele types of the ply and phtD genes were remarkably homogeneous, and the antigenic diversity of the corresponding proteins was very limited. Ply and PhtD could be useful antigens for universal pneumococcal vaccines.
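
    For readers unfamiliar with how identity percentages such as those above are obtained, the sketch below computes pairwise percent identity between two already-aligned sequences. The abstract does not state how gaps were treated; skipping any column that contains a gap is an assumption of this example, as is the function name `percent_identity`.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of gap-free alignment columns in which both residues match.

    Both inputs must come from the same alignment (equal length).
    Columns where either sequence has a gap are ignored (assumed convention).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    compared = 0
    matches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap or y == gap:
            continue
        compared += 1
        if x == y:
            matches += 1

    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * matches / compared
```

    Other conventions (counting gap columns in the denominator, or dividing by the shorter sequence length) give different values, so reported identities are only comparable when the convention is the same.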

  3. Sequence diversity of the leukotoxin (lktA) gene in caprine and ovine strains of Mannheimia haemolytica.

    PubMed

    Vougidou, C; Sandalakis, V; Psaroulaki, A; Petridou, E; Ekateriniadou, L

    2013-04-20

    Mannheimia haemolytica is the aetiological agent of pneumonic pasteurellosis in small ruminants. The primary virulence factor of the bacterium is a leukotoxin (LktA), which induces apoptosis in susceptible cells via mitochondrial targeting. It has been previously shown that certain lktA alleles are associated either with cattle or sheep. The objective of the present study was to investigate lktA sequence variation among ovine and caprine M. haemolytica strains isolated from pneumonic lungs, revealing any potential adaptation for the caprine host, for which no data are available. Furthermore, we investigated amino acid variation in the N-terminal part of the sequences and its effect on targeting mitochondria. Data analysis showed that the prevalent caprine genotype differed at a single non-synonymous site from a previously described uncommon bovine allele, whereas the ovine sequences represented new, distinct alleles. N-terminal sequence differences did not affect the mitochondrial targeting ability of the isolates; interestingly, in one case, mitochondrial matrix targeting was indicated rather than membrane association, suggesting an alternative LktA trafficking pattern.

  4. Next-generation sequencing based genotyping, cytometry and phenotyping for understanding diversity and evolution of Guinea yams.

    PubMed

    Girma, Gezahegn; Hyma, Katie E; Asiedu, Robert; Mitchell, Sharon E; Gedil, Melaku; Spillane, Charles

    2014-08-01

    Genotyping by sequencing (GBS) is used to understand the origin and domestication of guinea yams, including the contribution of wild relatives and polyploidy events to the cultivated guinea yams. Patterns of genetic diversity within and between two cultivated guinea yams (Dioscorea rotundata and D. cayenensis) and five wild relatives (D. praehensilis, D. mangenotiana, D. abyssinica, D. togoensis and D. burkilliana) were investigated using next-generation sequencing (genotyping by sequencing, GBS). Additionally, the two cultivated species were assessed for intra-specific morphological and ploidy variation. In guinea yams, ploidy level is correlated with species identity. Using flow cytometry, a single ploidy level was inferred across D. cayenensis (3x, N = 21), D. praehensilis (2x, N = 7), and D. mangenotiana (3x, N = 5) accessions, whereas both diploid and triploid (or aneuploid) accessions were present in D. rotundata (N = 11 and N = 32, respectively). Multi-dimensional scaling and maximum parsimony analyses of 2,215 SNPs revealed that wild guinea yam populations form discrete genetic groupings according to species. D. togoensis and D. burkilliana were most distant from the two cultivated yam species, whereas D. abyssinica, D. mangenotiana, and D. praehensilis were closest to cultivated yams. In contrast, cultivated species were genetically less clearly defined at the intra-specific level. While D. cayenensis formed a single genetic group, D. rotundata comprised three separate groups: (1) a set of diploid individuals genetically similar to D. praehensilis, (2) a set of diploid individuals genetically similar to D. cayenensis, and (3) a set of triploid individuals. The current study demonstrates the utility of GBS for assessing yam genomic diversity. Combined with morphological and biological data, GBS provides a powerful tool for testing hypotheses regarding the evolution, domestication and breeding of guinea yams.

  5. Phylogenetic Diversity of Koala Retrovirus within a Wild Koala Population.

    PubMed

    Chappell, K J; Brealey, J C; Amarilla, A A; Watterson, D; Hulse, L; Palmieri, C; Johnston, S D; Holmes, E C; Meers, J; Young, P R

    2017-02-01

    Koala populations are in serious decline across many areas of mainland Australia, with infectious disease a contributing factor. Koala retrovirus (KoRV) is a gammaretrovirus present in most wild koala populations and captive colonies. Five subtypes of KoRV (A to E) have been identified based on amino acid sequence divergence in a hypervariable region of the receptor binding domain of the envelope protein. However, analysis of viral genetic diversity has been conducted primarily on KoRV in captive koalas housed in zoos in Japan, the United States, and Germany. Wild koalas within Australia have not been comparably assessed. Here we report a detailed analysis of KoRV genetic diversity in samples collected from 18 wild koalas from southeast Queensland. By employing deep sequencing we identified 108 novel KoRV envelope sequences and determined their phylogenetic diversity. Genetic diversity in KoRV was abundant and fell into three major groups; two comprised the previously identified subtypes A and B, while the third contained the remaining hypervariable region subtypes (C, D, and E) as well as four hypervariable region subtypes that we newly define here (F, G, H, and I). In addition to the ubiquitous presence of KoRV-A, which may represent an exclusively endogenous variant, subtypes B, D, and F were found to be at high prevalence, while subtypes G, H, and I were present in a smaller number of animals. Koala retrovirus (KoRV) is thought to be a significant contributor to koala disease and population decline across mainland Australia. This study is the first to determine KoRV subtype prevalence among a wild koala population, and it significantly expands the total number of KoRV sequences available, providing a more precise picture of genetic diversity. This understanding of KoRV subtype prevalence and genetic diversity will be important for conservation efforts attempting to limit the spread of KoRV. Furthermore, KoRV is one of the only retroviruses shown to exist in

  6. Genome-wide diversity and selective pressure in the human rhinovirus

    PubMed Central

    Kistler, Amy L; Webster, Dale R; Rouskin, Silvi; Magrini, Vince; Credle, Joel J; Schnurr, David P; Boushey, Homer A; Mardis, Elaine R; Li, Hao; DeRisi, Joseph L

    2007-01-01

    Background: The human rhinoviruses (HRV) are one of the most common and diverse respiratory pathogens of humans. Over 100 distinct HRV serotypes are known, yet only 6 genomes are available. Due to the paucity of HRV genome sequence, little is known about the genetic diversity within HRV or the forces driving this diversity. Previous comparative genome sequence analyses indicate that recombination drives diversification in multiple genera of the picornavirus family, yet it remains unclear if this holds for HRV. Results: To resolve this and gain insight into the forces driving diversification in HRV, we generated a representative set of 34 fully sequenced HRVs. Analysis of these genomes shows consistent phylogenies across the genome, conserved non-coding elements, and only limited recombination. However, spikes of genetic diversity at both the nucleotide and amino acid level are detectable within every locus of the genome. Despite this, the HRV genome as a whole is under purifying selective pressure, with islands of diversifying pressure in the VP1, VP2, and VP3 structural genes and two non-structural genes, the 3C protease and 3D polymerase. Mapping diversifying residues in these factors onto available 3-dimensional structures revealed the diversifying capsid residues partition to the external surface of the viral particle in statistically significant proximity to antigenic sites. Diversifying pressure in the pleconaril binding site is confined to a single residue known to confer drug resistance (VP1 191). In contrast, diversifying pressure in the non-structural genes is less clear, mapping both nearby and beyond characterized functional domains of these factors. Conclusion: This work provides a foundation for understanding HRV genetic diversity and insight into the underlying biology driving evolution in HRV. It expands our knowledge of the genome sequence space that HRV reference serotypes occupy and how the pattern of genetic diversity across HRV genomes differs

  7. Next-generation sequencing reveals cryptic mtDNA diversity of Plasmodium relictum in the Hawaiian Islands

    USGS Publications Warehouse

    Jarvi, S.I.; Farias, M.E.; Lapointe, D.A.; Belcaid, M.; Atkinson, C.T.

    2013-01-01

    Next-generation 454 sequencing techniques were used to re-examine diversity of mitochondrial cytochrome b lineages of avian malaria (Plasmodium relictum) in Hawaii. We document a minimum of 23 variant lineages of the parasite based on single nucleotide transitional changes, in addition to the previously reported single lineage (GRW4). A new, publicly available portal (Integroomer) was developed for initial parsing of 454 datasets. Mean variant prevalence and frequency were higher in low-elevation Hawaii Amakihi (Hemignathus virens) with Avipoxvirus-like lesions (P = 0·001), suggesting that the variants may be biologically distinct. By contrast, variant prevalence and frequency did not differ significantly among mid-elevation Apapane (Himatione sanguinea) with or without lesions (P = 0·691). The low frequency and the lack of detection of variants independent of GRW4 suggest that multiple independent introductions of P. relictum to Hawaii are unlikely. Multiple variants may have been introduced in heteroplasmy with GRW4 or exist within the tandem repeat structure of the mitochondrial genome. The discovery of multiple mitochondrial lineages of P. relictum in Hawaii provides a measure of genetic diversity within a geographically isolated population of this parasite and suggests the origins and evolution of parasite diversity may be more complicated than previously recognized.

  8. Next-generation sequencing reveals cryptic mtDNA diversity of Plasmodium relictum in the Hawaiian Islands.

    PubMed

    Jarvi, S I; Farias, M E; Lapointe, D A; Belcaid, M; Atkinson, C T

    2013-12-01

    Next-generation 454 sequencing techniques were used to re-examine diversity of mitochondrial cytochrome b lineages of avian malaria (Plasmodium relictum) in Hawaii. We document a minimum of 23 variant lineages of the parasite based on single nucleotide transitional changes, in addition to the previously reported single lineage (GRW4). A new, publicly available portal (Integroomer) was developed for initial parsing of 454 datasets. Mean variant prevalence and frequency were higher in low-elevation Hawaii Amakihi (Hemignathus virens) with Avipoxvirus-like lesions (P = 0·001), suggesting that the variants may be biologically distinct. By contrast, variant prevalence and frequency did not differ significantly among mid-elevation Apapane (Himatione sanguinea) with or without lesions (P = 0·691). The low frequency and the lack of detection of variants independent of GRW4 suggest that multiple independent introductions of P. relictum to Hawaii are unlikely. Multiple variants may have been introduced in heteroplasmy with GRW4 or exist within the tandem repeat structure of the mitochondrial genome. The discovery of multiple mitochondrial lineages of P. relictum in Hawaii provides a measure of genetic diversity within a geographically isolated population of this parasite and suggests the origins and evolution of parasite diversity may be more complicated than previously recognized.

  9. Development of simple sequence repeat markers and diversity analysis in alfalfa (Medicago sativa L.).

    PubMed

    Wang, Zan; Yan, Hongwei; Fu, Xinnian; Li, Xuehui; Gao, Hongwen

    2013-04-01

    Efficient and robust molecular markers are essential for molecular breeding in plants. Compared to dominant and bi-allelic markers, multi-allelic simple sequence repeat (SSR) markers are particularly informative and are superior for genetic linkage and QTL mapping in autotetraploid species like alfalfa. The objective of this study was to enrich SSR markers directly from alfalfa expressed sequence tags (ESTs). A total of 12,371 alfalfa ESTs were retrieved from the National Center for Biotechnology Information. A total of 774 SSRs were identified in 716 ESTs. On average, one SSR was found per 7.7 kb of EST sequence. Tri-nucleotide repeats (48.8 %) were the most abundant motif type, followed by di- (26.1 %), tetra- (11.5 %), penta- (9.7 %), and hexanucleotide (3.9 %) repeats. One hundred EST-SSR primer pairs were successfully designed and 29 exhibited polymorphism among 28 alfalfa accessions. The allele number per marker ranged from two to 21 with an average of 6.8. The PIC values ranged from 0.195 to 0.896 with an average of 0.608, indicating a high level of polymorphism of the EST-SSR markers. Genetic diversity was assessed with the 29 EST-SSR markers, and Medicago sativa ssp. sativa was found to be clearly different from the other subspecies. The EST-SSR markers also showed high transferability to related species.
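
    The abstract above reports PIC (polymorphism information content) values per marker but does not say how they were computed. A minimal sketch, assuming the commonly used Botstein et al. (1980) definition, PIC = 1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2, where p_i is the frequency of allele i. The function name `pic` and the example frequencies are illustrative only and are not taken from the study; estimating allele frequencies in an autotetraploid is a separate problem that this sketch does not address.

```python
from itertools import combinations


def pic(allele_freqs):
    """Polymorphism information content for one marker.

    Assumes the Botstein et al. (1980) formula:
        PIC = 1 - sum(p_i^2) - sum_{i<j} 2 * p_i^2 * p_j^2
    `allele_freqs` is a hypothetical input: allele frequencies summing to 1.
    """
    if abs(sum(allele_freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    # Expected homozygosity term.
    homozygosity = sum(p * p for p in allele_freqs)
    # Correction for uninformative matings between identical heterozygotes.
    cross = sum(2 * (p * p) * (q * q) for p, q in combinations(allele_freqs, 2))
    return 1.0 - homozygosity - cross


# Illustrative frequencies only (not data from the abstract).
two_equal_alleles = pic([0.5, 0.5])
monomorphic = pic([1.0])
```

    Under this definition a marker with two equally frequent alleles cannot exceed 0.375, so the upper values reported above imply markers with many alleles, which is consistent with the stated range of two to 21 alleles per marker.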

  10. From amino acid sequence to bioactivity: The biomedical potential of antitumor peptides.

    PubMed

    Blanco-Míguez, Aitor; Gutiérrez-Jácome, Alberto; Pérez-Pérez, Martín; Pérez-Rodríguez, Gael; Catalán-García, Sandra; Fdez-Riverola, Florentino; Lourenço, Anália; Sánchez, Borja

    2016-06-01

    Chemoprevention is the use of natural and/or synthetic substances to block, reverse, or retard the process of carcinogenesis. In this field, the use of antitumor peptides is of interest as (i) these molecules are small in size, (ii) they show good cell diffusion and permeability, (iii) they affect one or more specific molecular pathways involved in carcinogenesis, and (iv) they are not usually genotoxic. We have checked the Web of Science Database (23/11/2015) in order to collect papers reporting on bioactive peptides (1691 registers), which were further filtered by searching for terms such as "antiproliferative," "antitumoral," or "apoptosis" among others. Works reporting the amino acid sequence of an antiproliferative peptide were kept (60 registers), and this was complemented with the peptides included in CancerPPD, an extensive resource for antiproliferative peptides and proteins. Peptides were grouped according to one of the following mechanisms of action: inhibition of cell migration, inhibition of tumor angiogenesis, antioxidative mechanisms, inhibition of gene transcription/cell proliferation, induction of apoptosis, disorganization of tubulin structure, cytotoxicity, or unknown mechanisms. The main mechanisms of action of those antiproliferative peptides with known amino acid sequences are presented and, finally, their potential clinical usefulness and the future challenges to their application are discussed. © 2016 The Protein Society.

  11. From amino acid sequence to bioactivity: The biomedical potential of antitumor peptides

    PubMed Central

    Blanco‐Míguez, Aitor; Gutiérrez‐Jácome, Alberto; Pérez‐Pérez, Martín; Pérez‐Rodríguez, Gael; Catalán‐García, Sandra; Fdez‐Riverola, Florentino; Lourenço, Anália

    2016-01-01

    Chemoprevention is the use of natural and/or synthetic substances to block, reverse, or retard the process of carcinogenesis. In this field, the use of antitumor peptides is of interest as (i) these molecules are small in size, (ii) they show good cell diffusion and permeability, (iii) they affect one or more specific molecular pathways involved in carcinogenesis, and (iv) they are not usually genotoxic. We have checked the Web of Science Database (23/11/2015) in order to collect papers reporting on bioactive peptides (1691 registers), which were further filtered by searching for terms such as “antiproliferative,” “antitumoral,” or “apoptosis” among others. Works reporting the amino acid sequence of an antiproliferative peptide were kept (60 registers), and this was complemented with the peptides included in CancerPPD, an extensive resource for antiproliferative peptides and proteins. Peptides were grouped according to one of the following mechanisms of action: inhibition of cell migration, inhibition of tumor angiogenesis, antioxidative mechanisms, inhibition of gene transcription/cell proliferation, induction of apoptosis, disorganization of tubulin structure, cytotoxicity, or unknown mechanisms. The main mechanisms of action of those antiproliferative peptides with known amino acid sequences are presented and, finally, their potential clinical usefulness and the future challenges to their application are discussed. PMID:27010507

  12. Phylogenetic and ecological analyses of soil and sporocarp DNA sequences reveal high diversity and strong habitat partitioning in the boreal ectomycorrhizal genus Russula (Russulales; Basidiomycota)

    Treesearch

    József Geml; Gary A. Laursen; Ian C. Herriott; Jack M. McFarland; Michael G. Booth; Niall Lennon; H. Chad Nusbaum; D. Lee Taylor

    2010-01-01

    Although critical for the functioning of ecosystems, fungi are poorly known in high-latitude regions. Here, we provide the first genetic diversity assessment of one of the most diverse and abundant ectomycorrhizal genera in Alaska: Russula. We analyzed internal transcribed spacer rDNA sequences from sporocarps and soil samples using phylogenetic...

  13. High diversity of airborne fungi in the hospital environment as revealed by meta-sequencing-based microbiome analysis

    PubMed Central

    Tong, Xunliang; Xu, Hongtao; Zou, Lihui; Cai, Meng; Xu, Xuefeng; Zhao, Zuotao; Xiao, Fei; Li, Yanming

    2017-01-01

    Invasive fungal infections acquired in the hospital have progressively emerged as an important cause of life-threatening infection. In particular, airborne fungi in hospitals are considered critical pathogens of hospital-associated infections. To identify the causative airborne microorganisms, high-volume air samplers were utilized for collection, and species identification was performed using a culture-based method and DNA sequencing analysis with the Illumina MiSeq and HiSeq 2000 sequencing systems. Few bacteria were grown after cultivation in blood agar. However, using microbiome sequencing, the relative abundance of fungi, Archaea species, bacteria and viruses was determined. The distribution characteristics of fungi were investigated using heat map analysis of four departments, including the Respiratory Intensive Care Unit, Intensive Care Unit, Emergency Room and Outpatient Department. The prevalence of Aspergillus among fungi was the highest at the species level, approximately 17% to 61%, and the prevalence of Aspergillus fumigatus among Aspergillus species was from 34% to 50% in the four departments. Draft genomes of microorganisms isolated from the hospital environment were obtained by sequence analysis, indicating that investigation into the diversity of airborne fungi may provide reliable results for hospital infection control and surveillance. PMID:28045065

  14. Complete nucleotide and derived amino acid sequence of cDNA encoding the mitochondrial uncoupling protein of rat brown adipose tissue: lack of a mitochondrial targeting presequence.

    PubMed Central

    Ridley, R G; Patel, H V; Gerber, G E; Morton, R C; Freeman, K B

    1986-01-01

    A cDNA clone spanning the entire amino acid sequence of the nuclear-encoded uncoupling protein of rat brown adipose tissue mitochondria has been isolated and sequenced. With the exception of the N-terminal methionine the deduced N-terminus of the newly synthesized uncoupling protein is identical to the N-terminal 30 amino acids of the native uncoupling protein as determined by protein sequencing. This proves that the protein contains no N-terminal mitochondrial targeting prepiece and that a targeting region must reside within the amino acid sequence of the mature protein. PMID:3012461

  15. Diversity of Ligninolytic Enzymes and Their Genes in Strains of the Genus Ganoderma: Applicable for Biodegradation of Xenobiotic Compounds?

    PubMed Central

    Torres-Farradá, Giselle; Manzano León, Ana M.; Rineau, François; Ledo Alonso, Lucía L.; Sánchez-López, María I.; Thijs, Sofie; Colpaert, Jan; Ramos-Leal, Miguel; Guerra, Gilda; Vangronsveld, Jaco

    2017-01-01

    White-rot fungi (WRF) and their ligninolytic enzymes (laccases and peroxidases) are considered promising biotechnological tools to remove lignin-related Persistent Organic Pollutants from industrial wastewaters and contaminated ecosystems. A high diversity of the genus Ganoderma has been reported in Cuba; in spite of this, the diversity of ligninolytic enzymes and their genes remained unexplored. In this study, 13 native WRF strains were isolated from decayed wood in urban ecosystems in Havana (Cuba). All strains were identified as Ganoderma sp. using a multiplex polymerase chain reaction (PCR) method based on ITS sequences. All Ganoderma sp. strains produced laccase enzymes at higher levels than non-specific peroxidases. Native-PAGE of extracellular enzymatic extracts revealed a high diversity of laccase isozyme patterns between the strains, suggesting the presence of different amino acid sequences in the laccase enzymes produced by these Ganoderma strains. We determined the diversity of genes encoding laccases and peroxidases using a PCR and cloning approach with basidiomycete-specific primers. Between two and five laccase genes were detected in each strain. In contrast, only one gene encoding manganese peroxidase or versatile peroxidase was detected in each strain. The translated laccase and peroxidase amino acid sequences have not been described before. Extracellular crude enzymatic extracts produced by the Ganoderma UH strains were able to degrade model chromophoric compounds such as anthraquinone and azo dyes. These findings hold promise for the development of a practical application for the treatment of textile industry wastewaters and also for bioremediation of polluted ecosystems by well-adapted native WRF strains. PMID:28588565

  16. Unraveling Haplotype Diversity of the Apical Membrane Antigen-1 Gene in Plasmodium falciparum Populations in Thailand

    PubMed Central

    Lumkul, Lalita; Sawaswong, Vorthon; Simpalipan, Phumin; Kaewthamasorn, Morakot; Harnyuttanakorn, Pongchai; Pattaradilokrat, Sittiporn

    2018-01-01

    Development of an effective vaccine is critically needed for the prevention of malaria. One of the key antigens for malaria vaccines is the apical membrane antigen 1 (AMA-1) of the human malaria parasite Plasmodium falciparum, the surface protein for erythrocyte invasion of the parasite. The gene encoding AMA-1 has been sequenced from populations of P. falciparum worldwide, but the haplotype diversity of the gene in P. falciparum populations in the Greater Mekong Subregion (GMS), including Thailand, remains to be characterized. In the present study, the AMA-1 gene was PCR amplified and sequenced from the genomic DNA of 65 P. falciparum isolates from 5 endemic areas in Thailand. The nearly full-length 1,848 nucleotide sequence of AMA-1 was subjected to molecular analyses, including nucleotide sequence diversity, haplotype diversity and deduced amino acid sequence diversity and neutrality tests. Phylogenetic analysis and pairwise population differentiation (Fst indices) were performed to infer the population structure. The analyses identified 60 single nucleotide polymorphic loci, predominately located in domain I of AMA-1. A total of 31 unique AMA-1 haplotypes were identified, which included 11 novel ones. The phylogenetic tree of the AMA-1 haplotypes revealed multiple clades of AMA-1, each of which contained parasites of multiple geographical origins, consistent with the Fst indices indicating genetic homogeneity or gene flow among geographically distinct populations of P. falciparum in Thailand’s borders with Myanmar, Laos and Cambodia. In summary, the study revealed novel haplotypes and population structure needed for the further advancement of AMA-1-based malaria vaccines in the GMS. PMID:29742870
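
    The abstract above names haplotype diversity and nucleotide sequence diversity among its analyses without giving formulas or software. A minimal sketch, assuming Nei's standard estimators: haplotype diversity Hd = n/(n-1) * (1 - sum(p_i^2)), and nucleotide diversity pi as the mean number of pairwise differences per site. The function names and the toy sequences are hypothetical and are not AMA-1 data; real analyses would also need to handle gaps and ambiguous bases, which this sketch ignores.

```python
from collections import Counter
from itertools import combinations


def haplotype_diversity(seqs):
    """Nei's haplotype diversity: n/(n-1) * (1 - sum of squared haplotype frequencies)."""
    n = len(seqs)
    if n < 2:
        raise ValueError("need at least two sequences")
    counts = Counter(seqs)  # identical sequences count as one haplotype
    return n / (n - 1) * (1.0 - sum((c / n) ** 2 for c in counts.values()))


def nucleotide_diversity(seqs):
    """Mean pairwise differences per site over all sequence pairs (aligned, equal length)."""
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if length == 0 or any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned, non-empty, and of equal length")
    pairs = list(combinations(seqs, 2))
    total_diffs = sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs)
    return total_diffs / (len(pairs) * length)


# Toy aligned sequences for illustration only.
toy = ["ACGT", "ACGT", "ACGA", "TCGA"]
hd = haplotype_diversity(toy)
pi = nucleotide_diversity(toy)
```

    In the toy set there are three haplotypes among four sequences, so Hd is high even though only two sites vary, which illustrates why the two measures are usually reported together.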

  17. Genome Sequence of Lactobacillus rhamnosus Strain CASL, an Efficient l-Lactic Acid Producer from Cheap Substrate Cassava

    PubMed Central

    Yu, Bo; Su, Fei; Wang, Limin; Zhao, Bo; Qin, Jiayang; Ma, Cuiqing; Xu, Ping; Ma, Yanhe

    2011-01-01

    Lactobacillus rhamnosus is a type of probiotic bacteria with industrial potential for l-lactic acid production. We announce the draft genome sequence of L. rhamnosus CASL (2,855,156 bp with a G+C content of 46.6%), which is an efficient producer of l-lactic acid from cheap, nonfood substrate cassava with a high production titer. PMID:22123765

  18. Fluorescence energy transfer as a probe for nucleic acid structures and sequences.

    PubMed Central

    Mergny, J L; Boutorine, A S; Garestier, T; Belloc, F; Rougée, M; Bulychev, N V; Koshkin, A A; Bourson, J; Lebedev, A V; Valeur, B

    1994-01-01

    The primary or secondary structure of single-stranded nucleic acids has been investigated with fluorescent oligonucleotides, i.e., oligonucleotides covalently linked to a fluorescent dye. Five different chromophores were used: 2-methoxy-6-chloro-9-amino-acridine, coumarin 500, fluorescein, rhodamine and ethidium. The chemical synthesis of derivatized oligonucleotides is described. Hybridization of two fluorescent oligonucleotides to adjacent nucleic acid sequences led to fluorescence excitation energy transfer between the donor and the acceptor dyes. This phenomenon was used to probe primary and secondary structures of DNA fragments and the orientation of oligodeoxynucleotides synthesized with the alpha-anomers of nucleoside units. Fluorescence energy transfer can be used to reveal the formation of hairpin structures and the translocation of genes between two chromosomes. PMID:8152922

  19. Sequence-Specific Recognition of DNA by Proteins: Binding Motifs Discovered Using a Novel Statistical/Computational Analysis

    PubMed Central

    Jakubec, David; Laskowski, Roman A.; Vondrasek, Jiri

    2016-01-01

    Decades of intensive experimental studies of the recognition of DNA sequences by proteins have provided us with a view of a diverse and complicated world in which few to no features are shared between individual DNA-binding protein families. The originally conceived direct readout of DNA residue sequences by amino acid side chains offers very limited capacity for sequence recognition, while the effects of the dynamic properties of the interacting partners remain difficult to quantify and almost impossible to generalise. In this work we investigated the energetic characteristics of all DNA residue—amino acid side chain combinations in the conformations found at the interaction interface in a very large set of protein—DNA complexes by the means of empirical potential-based calculations. General specificity-defining criteria were derived and utilised to look beyond the binding motifs considered in previous studies. Linking energetic favourability to the observed geometrical preferences, our approach reveals several additional amino acid motifs which can distinguish between individual DNA bases. Our results remained valid in environments with various dielectric properties. PMID:27384774

  20. Genetic diversity and antigenicity variation of Babesia bovis merozoite surface antigen-1 (MSA-1) in Thailand.

    PubMed

    Tattiyapong, Muncharee; Sivakumar, Thillaiampalam; Takemae, Hitoshi; Simking, Pacharathon; Jittapalapong, Sathaporn; Igarashi, Ikuo; Yokoyama, Naoaki

    2016-07-01

    Babesia bovis, an intraerythrocytic protozoan parasite, causes severe clinical disease in cattle worldwide. The genetic diversity of parasite antigens often results in different immune profiles in infected animals, hindering efforts to develop immune control methodologies against the B. bovis infection. In this study, we analyzed the genetic diversity of the merozoite surface antigen-1 (msa-1) gene using 162 B. bovis-positive blood DNA samples sourced from cattle populations reared in different geographical regions of Thailand. The identity scores shared among 93 msa-1 gene sequences isolated by PCR amplification were 43.5-100%, and the similarity values among the translated amino acid sequences were 42.8-100%. Of 23 total clades detected in our phylogenetic analysis, Thai msa-1 gene sequences occurred in 18 clades; seven among them were composed of sequences exclusively from Thailand. To investigate differential antigenicity of isolated MSA-1 proteins, we expressed and purified eight recombinant MSA-1 (rMSA-1) proteins, including an rMSA-1 from B. bovis Texas (T2Bo) strain and seven rMSA-1 proteins based on the Thai msa-1 sequences. When these antigens were analyzed in a western blot assay, anti-T2Bo cattle serum strongly reacted with the rMSA-1 from T2Bo, as well as with three other rMSA-1 proteins that shared 54.9-68.4% sequence similarity with T2Bo MSA-1. In contrast, no or weak reactivity was observed for the remaining rMSA-1 proteins, which shared low sequence similarity (35.0-39.7%) with T2Bo MSA-1. While demonstrating the high genetic diversity of the B. bovis msa-1 gene in Thailand, the present findings suggest that the genetic diversity results in antigenicity variations among the MSA-1 antigens of B. bovis in Thailand. Copyright © 2016 Elsevier B.V. All rights reserved.

  1. Deep sequencing in library selection projects: what insight does it bring?

    PubMed

    Glanville, J; D'Angelo, S; Khan, T A; Reddy, S T; Naranjo, L; Ferrara, F; Bradbury, A R M

    2015-08-01

    High throughput sequencing is poised to change all aspects of the way antibodies and other binders are discovered and engineered. Millions of available sequence reads provide an unprecedented sampling depth able to guide the design and construction of effective, high quality naïve libraries containing tens of billions of unique molecules. Furthermore, during selections, high throughput sequencing enables quantitative tracing of enriched clones and position-specific guidance to amino acid variation under positive selection during antibody engineering. Successful application of the technologies relies on specific PCR reagent design, correct sequencing platform selection, and effective use of computational tools and statistical measures to remove error, identify antibodies, estimate diversity, and extract signatures of selection from the clone down to individual structural positions. Here we review these considerations and discuss some of the remaining challenges to the widespread adoption of the technology. Copyright © 2015 Elsevier Ltd. All rights reserved.

  2. Single sea urchin phagocytes express messages of a single sequence from the diverse Sp185/333 gene family in response to bacterial challenge.

    PubMed

    Majeske, Audrey J; Oren, Matan; Sacchi, Sandro; Smith, L Courtney

    2014-12-01

    Immune systems in animals rely on fast and efficient responses to a wide variety of pathogens. The Sp185/333 gene family in the purple sea urchin, Strongylocentrotus purpuratus, consists of an estimated 50 (±10) members per genome that share a basic gene structure but show high sequence diversity, primarily due to the mosaic appearance of short blocks of sequence called elements. The genes show significantly elevated expression in three subpopulations of phagocytes responding to marine bacteria. The encoded Sp185/333 proteins are highly diverse and have central effector functions in the immune system. In this study we report the Sp185/333 gene expression in single sea urchin phagocytes. Challenging sea urchins with heat-killed marine bacteria resulted in a typical increase in coelomocyte concentration within 24 h, which included an increased proportion of phagocytes expressing Sp185/333 proteins. Phagocyte fractions enriched from coelomocytes were used in limiting dilutions to obtain samples of single cells that were evaluated for Sp185/333 gene expression by nested RT-PCR. Amplicon sequences showed identical or nearly identical Sp185/333 amplicon sequences in single phagocytes with matches to six known Sp185/333 element patterns, including both common and rare element patterns. This suggests that single phagocytes show restricted expression from the Sp185/333 gene family and implies a diverse, flexible, and efficient response to pathogens. This type of expression pattern from a family of immune response genes in single cells has not been identified previously in other invertebrates. Copyright © 2014 by The American Association of Immunologists, Inc.

  3. Comparative genomics of citric-acid-producing Aspergillus niger ATCC 1015 versus enzyme-producing CBS 513.88

    PubMed Central

    Andersen, Mikael R.; Salazar, Margarita P.; Schaap, Peter J.; van de Vondervoort, Peter J.I.; Culley, David; Thykaer, Jette; Frisvad, Jens C.; Nielsen, Kristian F.; Albang, Richard; Albermann, Kaj; Berka, Randy M.; Braus, Gerhard H.; Braus-Stromeyer, Susanna A.; Corrochano, Luis M.; Dai, Ziyu; van Dijck, Piet W.M.; Hofmann, Gerald; Lasure, Linda L.; Magnuson, Jon K.; Menke, Hildegard; Meijer, Martin; Meijer, Susan L.; Nielsen, Jakob B.; Nielsen, Michael L.; van Ooyen, Albert J.J.; Pel, Herman J.; Poulsen, Lars; Samson, Rob A.; Stam, Hein; Tsang, Adrian; van den Brink, Johannes M.; Atkins, Alex; Aerts, Andrea; Shapiro, Harris; Pangilinan, Jasmyn; Salamov, Asaf; Lou, Yigong; Lindquist, Erika; Lucas, Susan; Grimwood, Jane; Grigoriev, Igor V.; Kubicek, Christian P.; Martinez, Diego; van Peij, Noël N.M.E.; Roubos, Johannes A.; Nielsen, Jens; Baker, Scott E.

    2011-01-01

    The filamentous fungus Aspergillus niger exhibits great diversity in its phenotype. It is found globally, both as marine and terrestrial strains, produces both organic acids and hydrolytic enzymes in high amounts, and some isolates exhibit pathogenicity. Although the genome of an industrial enzyme-producing A. niger strain (CBS 513.88) has already been sequenced, the versatility and diversity of this species compel additional exploration. We therefore undertook whole-genome sequencing of the acidogenic A. niger wild-type strain (ATCC 1015) and produced a genome sequence of very high quality. Only 15 gaps are present in the sequence, and half the telomeric regions have been elucidated. Moreover, sequence information from ATCC 1015 was used to improve the genome sequence of CBS 513.88. Chromosome-level comparisons uncovered several genome rearrangements, deletions, a clear case of strain-specific horizontal gene transfer, and identification of 0.8 Mb of novel sequence. Single nucleotide polymorphisms per kilobase (SNPs/kb) between the two strains were found to be exceptionally high (average: 7.8, maximum: 160 SNPs/kb). High variation within the species was confirmed with exo-metabolite profiling and phylogenetics. Detailed lists of alleles were generated, and genotypic differences were observed to accumulate in metabolic pathways essential to acid production and protein synthesis. A transcriptome analysis supported up-regulation of genes associated with biosynthesis of amino acids that are abundant in glucoamylase A, tRNA-synthases, and protein transporters in the protein producing CBS 513.88 strain. Our results and data sets from this integrative systems biology analysis resulted in a snapshot of fungal evolution and will support further optimization of cell factories based on filamentous fungi. PMID:21543515

  4. Sequence Diversity, Intersubgroup Relationships, and Origins of the Mouse Leukemia Gammaretroviruses of Laboratory and Wild Mice.

    PubMed

    Bamunusinghe, Devinka; Naghashfar, Zohreh; Buckler-White, Alicia; Plishka, Ronald; Baliji, Surendranath; Liu, Qingping; Kassner, Joshua; Oler, Andrew J; Hartley, Janet; Kozak, Christine A

    2016-04-01

    Mouse leukemia viruses (MLVs) are found in the common inbred strains of laboratory mice and in the house mouse subspecies of Mus musculus. Receptor usage and envelope (env) sequence variation define three MLV host range subgroups in laboratory mice: ecotropic, polytropic, and xenotropic MLVs (E-, P-, and X-MLVs, respectively). These exogenous MLVs derive from endogenous retroviruses (ERVs) that were acquired by the wild mouse progenitors of laboratory mice about 1 million years ago. We analyzed the genomes of seven MLVs isolated from Eurasian and American wild mice and three previously sequenced MLVs to describe their relationships and identify their possible ERV progenitors. The phylogenetic tree based on the receptor-determining regions of env produced expected host range clusters, but these clusters are not maintained in trees generated from other virus regions. Colinear alignments of the viral genomes identified segmental homologies to ERVs of different host range subgroups. Six MLVs show close relationships to a small xenotropic ERV subgroup largely confined to the inbred mouse Y chromosome. env variations define three E-MLV subtypes, one of which carries duplications of various sizes, sequences, and locations in the proline-rich region of env. Outside the env region, all E-MLVs are related to different nonecotropic MLVs. These results document the diversity in gammaretroviruses isolated from globally distributed Mus subspecies, provide insight into their origins and relationships, and indicate that recombination has had an important role in the evolution of these mutagenic and pathogenic agents. Laboratory mice carry mouse leukemia viruses (MLVs) of three host range groups which were acquired from their wild mouse progenitors. We sequenced the complete genomes of seven infectious MLVs isolated from geographically separated Eurasian and American wild mice and compared them with endogenous germ line retroviruses (ERVs) acquired early in house mouse evolution.
We did this

  5. Succession sequence of lactic acid bacteria driven by environmental factors and substrates throughout the brewing process of Shanxi aged vinegar.

    PubMed

    Zheng, Yu; Mou, Jun; Niu, Jiwei; Yang, Shuai; Chen, Lin; Xia, Menglei; Wang, Min

    2018-03-01

    Lactic acid bacteria (LAB) are essential microbiota for the fermentation and flavor formation of Shanxi aged vinegar, a famous Chinese traditional cereal vinegar that is manufactured using open solid-state fermentation (SSF) technology. However, the dynamics of LAB in this SSF process and the underlying mechanism remain poorly understood. Here, the diversity of LAB and the potential driving factors of the entire process were analyzed by combining culture-independent and culture-dependent methods. Canonical correlation analysis indicated that ethanol, acetic acid, and temperature that result from the metabolism of microorganisms serve as potential driving factors for LAB succession. LAB strains were periodically isolated, and the characteristics of 57 isolates on environmental factor tolerance and substrate utilization were analyzed to understand the succession sequence. The environmental tolerance of LAB from different stages was in accordance with their fermentation conditions. Remarkable correlations were identified between LAB growth and environmental factors with 0.866 of ethanol (70 g/L), 0.756 of acetic acid (10 g/L), and 0.803 of temperature (47 °C). More gentle or harsh environments (less or more than 60 or 80 g/L of ethanol, 5 or 20 g/L of acetic acid, and 30 or 55 °C temperature) did not affect the LAB succession. The utilization capability evaluation of the 57 isolates for 95 compounds proved that strains from different fermentation stages exhibited different predilections on substrates to contribute to the fermentation at different stages. Results demonstrated that LAB succession in the SSF process was driven by the capabilities of environmental tolerance and substrate utilization.

  6. Complete genomic sequences of Propionibacterium freudenreichii phages from Swiss cheese reveal greater diversity than Cutibacterium (formerly Propionibacterium) acnes phages.

    PubMed

    Cheng, Lucy; Marinelli, Laura J; Grosset, Noël; Fitz-Gibbon, Sorel T; Bowman, Charles A; Dang, Brian Q; Russell, Daniel A; Jacobs-Sera, Deborah; Shi, Baochen; Pellegrini, Matteo; Miller, Jeff F; Gautier, Michel; Hatfull, Graham F; Modlin, Robert L

    2018-03-01

    A remarkable exception to the large genetic diversity often observed for bacteriophages infecting a specific bacterial host was found for the Cutibacterium acnes (formerly Propionibacterium acnes) phages, which are highly homogeneous. Phages infecting the related species, which is also a member of the Propionibacteriaceae family, Propionibacterium freudenreichii, a bacterium used in production of Swiss-type cheeses, have also been described and are common contaminants of the cheese manufacturing process. However, little is known about their genetic composition and diversity. We obtained seven independently isolated bacteriophages that infect P. freudenreichii from Swiss-type cheese samples, and determined their complete genome sequences. These data revealed that all seven phage isolates are of similar genomic length and GC% content, but their genomes are highly diverse, including genes encoding the capsid, tape measure, and tail proteins. In contrast to C. acnes phages, all P. freudenreichii phage genomes encode a putative integrase protein, suggesting they are capable of lysogenic growth. This is supported by the finding of related prophages in some P. freudenreichii strains. The seven phages could further be distinguished as belonging to two distinct genomic types, or 'clusters', based on nucleotide sequences, and host range analyses conducted on a collection of P. freudenreichii strains show a higher degree of host specificity than is observed for the C. acnes phages. Overall, our data demonstrate P. freudenreichii bacteriophages are distinct from C. acnes phages, as evidenced by their higher genetic diversity, potential for lysogenic growth, and more restricted host ranges. This suggests substantial differences in the evolution of these related species from the Propionibacteriaceae family and their phages, which is potentially related to their distinct environmental niches.

  7. High Diversity of CTX-M Extended-Spectrum β-Lactamases in Municipal Wastewater and Urban Wetlands

    PubMed Central

    Borgogna, Timothy R.; Borgogna, Joanna-Lynn; Mielke, Jenna A.; Brown, Celeste J.; Top, Eva M.; Botts, Ryan T.

    2016-01-01

    The CTX-M-type extended-spectrum β-lactamases (ESBLs) present a serious public health threat as they have become nearly ubiquitous among clinical gram-negative pathogens, particularly the enterobacteria. To aid in the understanding and eventual control of the spread of such resistance genes, we sought to determine the diversity of CTX-M ESBLs not among clinical isolates, but in the environment, where weaker and more diverse selective pressures may allow greater enzyme diversification. This was done by examining the CTX-M diversity in municipal wastewater and urban coastal wetlands in southern California, United States, by Sanger sequencing of polymerase chain reaction amplicons. Of the five known CTX-M phylogroups (1, 2, 8, 9, and 25), only genes from groups 1 and 2 were detected in both wastewater treatment plants (WWTPs), and group 1 genes were also detected in one of the two wetlands after a winter rain. The highest relative abundance of blaCTX-M group 1 genes was in the sludge of one WWTP (2.1 × 10⁻⁴ blaCTX-M copies/16S rRNA gene copy). Gene libraries revealed surprisingly high nucleotide sequence diversity, with 157 new variants not found in GenBank, representing 99 novel amino acid sequences. Our results indicate that the resistomes of WWTPs and urban wetlands contain diverse blaCTX-M ESBLs, which may constitute a mobile reservoir of clinically relevant resistance genes. PMID:26670020

  8. Relative quantification of 40 nucleic acid sequences by multiplex ligation-dependent probe amplification

    PubMed Central

    Schouten, Jan P.; McElgunn, Cathal J.; Waaijer, Raymond; Zwijnenburg, Danny; Diepvens, Filip; Pals, Gerard

    2002-01-01

    We describe a new method for relative quantification of 40 different DNA sequences in an easy to perform reaction requiring only 20 ng of human DNA. Applications shown of this multiplex ligation-dependent probe amplification (MLPA) technique include the detection of exon deletions and duplications in the human BRCA1, MSH2 and MLH1 genes, detection of trisomies such as Down’s syndrome, characterisation of chromosomal aberrations in cell lines and tumour samples and SNP/mutation detection. Relative quantification of mRNAs by MLPA will be described elsewhere. In MLPA, not sample nucleic acids but probes added to the samples are amplified and quantified. Amplification of probes by PCR depends on the presence of probe target sequences in the sample. Each probe consists of two oligonucleotides, one synthetic and one M13 derived, that hybridise to adjacent sites of the target sequence. Such hybridised probe oligonucleotides are ligated, permitting subsequent amplification. All ligated probes have identical end sequences, permitting simultaneous PCR amplification using only one primer pair. Each probe gives rise to an amplification product of unique size between 130 and 480 bp. Probe target sequences are small (50–70 nt). The prerequisite of a ligation reaction provides the opportunity to discriminate single nucleotide differences. PMID:12060695

  9. Relative quantification of 40 nucleic acid sequences by multiplex ligation-dependent probe amplification.

    PubMed

    Schouten, Jan P; McElgunn, Cathal J; Waaijer, Raymond; Zwijnenburg, Danny; Diepvens, Filip; Pals, Gerard

    2002-06-15

    We describe a new method for relative quantification of 40 different DNA sequences in an easy to perform reaction requiring only 20 ng of human DNA. Applications shown of this multiplex ligation-dependent probe amplification (MLPA) technique include the detection of exon deletions and duplications in the human BRCA1, MSH2 and MLH1 genes, detection of trisomies such as Down's syndrome, characterisation of chromosomal aberrations in cell lines and tumour samples and SNP/mutation detection. Relative quantification of mRNAs by MLPA will be described elsewhere. In MLPA, not sample nucleic acids but probes added to the samples are amplified and quantified. Amplification of probes by PCR depends on the presence of probe target sequences in the sample. Each probe consists of two oligonucleotides, one synthetic and one M13 derived, that hybridise to adjacent sites of the target sequence. Such hybridised probe oligonucleotides are ligated, permitting subsequent amplification. All ligated probes have identical end sequences, permitting simultaneous PCR amplification using only one primer pair. Each probe gives rise to an amplification product of unique size between 130 and 480 bp. Probe target sequences are small (50-70 nt). The prerequisite of a ligation reaction provides the opportunity to discriminate single nucleotide differences.

  10. Detecting differential DNA methylation from sequencing of bisulfite converted DNA of diverse species.

    PubMed

    Huh, Iksoo; Wu, Xin; Park, Taesung; Yi, Soojin V

    2017-07-21

    DNA methylation is one of the most extensively studied epigenetic modifications of genomic DNA. In recent years, sequencing of bisulfite-converted DNA, particularly via next-generation sequencing technologies, has become a widely popular method to study DNA methylation. This method can be readily applied to a variety of species, dramatically expanding the scope of DNA methylation studies beyond the traditionally studied human and mouse systems. In parallel to the increasing wealth of genomic methylation profiles, many statistical tools have been developed to detect differentially methylated loci (DMLs) or differentially methylated regions (DMRs) between biological conditions. We discuss and summarize several key properties of currently available tools to detect DMLs and DMRs from sequencing of bisulfite-converted DNA. However, the majority of the statistical tools developed for DML/DMR analyses have been validated using only mammalian data sets, and less priority has been placed on the analyses of invertebrate or plant DNA methylation data. We demonstrate that genomic methylation profiles of non-mammalian species are often highly distinct from those of mammalian species using examples of honey bees and humans. We then discuss how such differences in data properties may affect statistical analyses. Based on these differences, we provide three specific recommendations to improve the power and accuracy of DML and DMR analyses of invertebrate data when using currently available statistical tools. These considerations should facilitate systematic and robust analyses of DNA methylation from diverse species, thus advancing our understanding of DNA methylation. © The Author 2017. Published by Oxford University Press.

  11. A transcriptome resource for the koala (Phascolarctos cinereus): insights into koala retrovirus transcription and sequence diversity.

    PubMed

    Hobbs, Matthew; Pavasovic, Ana; King, Andrew G; Prentis, Peter J; Eldridge, Mark D B; Chen, Zhiliang; Colgan, Donald J; Polkinghorne, Adam; Wilkins, Marc R; Flanagan, Cheyne; Gillett, Amber; Hanger, Jon; Johnson, Rebecca N; Timms, Peter

    2014-09-11

    The koala, Phascolarctos cinereus, is a biologically unique and evolutionarily distinct Australian arboreal marsupial. The goal of this study was to sequence the transcriptome from several tissues of two geographically separate koalas, and to create the first comprehensive catalog of annotated transcripts for this species, enabling detailed analysis of the unique attributes of this threatened native marsupial, including infection by the koala retrovirus. RNA-Seq data was generated from a range of tissues from one male and one female koala and assembled de novo into transcripts using Velvet-Oases. Transcript abundance in each tissue was estimated. Transcripts were searched for likely protein-coding regions and a non-redundant set of 117,563 putative protein sequences was produced. In similarity searches there were 84,907 (72%) sequences that aligned to at least one sequence in the NCBI nr protein database. The best alignments were to sequences from other marsupials. After applying a reciprocal best hit requirement of koala sequences to those from tammar wallaby, Tasmanian devil and the gray short-tailed opossum, we estimate that our transcriptome dataset represents approximately 15,000 koala genes. The marsupial alignment information was used to look for potential gene duplications and we report evidence for copy number expansion of the alpha amylase gene, and of an aldehyde reductase gene. Koala retrovirus (KoRV) transcripts were detected in the transcriptomes. These were analysed in detail and the structure of the spliced envelope gene transcript was determined. There was appreciable sequence diversity within KoRV, with 233 sites in the KoRV genome showing small insertions/deletions or single nucleotide polymorphisms. Both koalas had sequences from the KoRV-A subtype, but the male koala transcriptome has, in addition, sequences more closely related to the KoRV-B subtype. This is the first report of a KoRV-B-like sequence in a wild population. This transcriptomic

  12. Evolutionary Distance of Amino Acid Sequence Orthologs across Macaque Subspecies: Identifying Candidate Genes for SIV Resistance in Chinese Rhesus Macaques

    PubMed Central

    Ross, Cody T.; Roodgar, Morteza; Smith, David Glenn

    2015-01-01

    We use the Reciprocal Smallest Distance (RSD) algorithm to identify amino acid sequence orthologs in the Chinese and Indian rhesus macaque draft sequences and estimate the evolutionary distance between such orthologs. We then use GOanna to map gene function annotations and human gene identifiers to the rhesus macaque amino acid sequences. We conclude methodologically by cross-tabulating a list of amino acid orthologs with large divergence scores with a list of genes known to be involved in SIV or HIV pathogenesis. We find that many of the amino acid sequences with large evolutionary divergence scores, as calculated by the RSD algorithm, have been shown to be related to HIV pathogenesis in previous laboratory studies. Four of the strongest candidate genes for SIVmac resistance in Chinese rhesus macaques identified in this study are CDK9, CXCL12, TRIM21, and TRIM32. Additionally, ANKRD30A, CTSZ, GORASP2, GTF2H1, IL13RA1, MUC16, NMDAR1, Notch1, NT5M, PDCD5, RAD50, and TM9SF2 were identified as possible candidates, among others. We failed to find many laboratory experiments contrasting the effects of Indian and Chinese orthologs at these sites on SIVmac pathogenesis, but future comparative studies might hold fertile ground for research into the biological mechanisms underlying innate resistance to SIVmac in Chinese rhesus macaques. PMID:25884674
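
    The reciprocal criterion mentioned in the abstract above can be illustrated with a small Python sketch. It is a simplified illustration only: the pairwise distances are assumed to be precomputed, the sequence identifiers and values are invented toy data, and the function name is hypothetical. The published RSD algorithm involves further search, alignment and distance-estimation steps that are not shown here.

```python
from typing import Dict, List, Tuple


def reciprocal_smallest_distance(
    dist: Dict[Tuple[str, str], float]
) -> List[Tuple[str, str, float]]:
    """Return pairs (a, b, d) where b is the closest sequence to a
    and a is the closest sequence to b.

    dist maps (id_in_genome_A, id_in_genome_B) to a precomputed
    evolutionary distance (hypothetical input format).
    """
    best_for_a: Dict[str, Tuple[str, float]] = {}
    best_for_b: Dict[str, Tuple[str, float]] = {}
    for (a, b), d in dist.items():
        # Track the smallest distance seen so far from each side.
        if a not in best_for_a or d < best_for_a[a][1]:
            best_for_a[a] = (b, d)
        if b not in best_for_b or d < best_for_b[b][1]:
            best_for_b[b] = (a, d)
    # Keep only pairs that are each other's closest partner.
    return sorted(
        (a, b, d)
        for a, (b, d) in best_for_a.items()
        if best_for_b[b][0] == a
    )


# Toy distances between three sequences in genome A and two in genome B.
toy = {
    ("A1", "B1"): 0.10,
    ("A1", "B2"): 0.40,
    ("A2", "B1"): 0.15,
    ("A2", "B2"): 0.30,
    ("A3", "B2"): 0.25,
}
pairs = reciprocal_smallest_distance(toy)
print(pairs)  # [('A1', 'B1', 0.1), ('A3', 'B2', 0.25)]
```

    In the toy data, A2 is closest to B1, but B1 is closer to A1, so A2 has no reciprocal partner and is dropped. Pairs that survive with a large distance would be the ones flagged as highly divergent orthologs.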

  13. High-throughput nucleotide sequence analysis of diverse bacterial communities in leachates of decomposing pig carcasses

    PubMed Central

    Yang, Seung Hak; Lim, Joung Soo; Khan, Modabber Ahmed; Kim, Bong Soo; Choi, Dong Yoon; Lee, Eun Young; Ahn, Hee Kwon

    2015-01-01

    The leachate generated by the decomposition of animal carcasses has been implicated as an environmental contaminant around burial sites. High-throughput nucleotide sequencing was conducted to investigate the bacterial communities in leachates from the decomposition of pig carcasses. We acquired 51,230 reads from six different samples (1-, 2-, 3-, 4-, 6- and 14-week-old carcasses) and found that sequences representing the phylum Firmicutes predominated. The diversity of bacterial 16S rRNA gene sequences in the leachate was the highest at 6 weeks, in contrast to those at 2 and 14 weeks. The relative abundance of Firmicutes was reduced, while the proportion of Bacteroidetes and Proteobacteria increased from 3–6 weeks. The representation of phyla was restored after 14 weeks. However, the community structures between the samples taken at 1–2 and 14 weeks differed at the bacterial classification level. The trend in pH was similar to the changes seen in bacterial communities, indicating that the pH of the leachate could be related to the shift in the microbial community. The results indicate that the composition of bacterial communities in leachates of decomposing pig carcasses shifted continuously during the study period and might be influenced by the burial site. PMID:26500442

  14. Genetic Diversity of Arabica Coffee (Coffea arabica L.) in Nicaragua as Estimated by Simple Sequence Repeat Markers

    PubMed Central

    Geleta, Mulatu; Herrera, Isabel; Monzón, Arnulfo; Bryngelsson, Tomas

    2012-01-01

    Coffea arabica L. (arabica coffee), the only tetraploid species in the genus Coffea, represents the majority of the world's coffee production and has a significant contribution to Nicaragua's economy. The present study was conducted to determine the genetic diversity of arabica coffee in Nicaragua for its conservation and breeding values. Twenty-six populations that represent eight varieties in Nicaragua were investigated using simple sequence repeat (SSR) markers. A total of 24 alleles were obtained from the 12 loci investigated across 260 individual plants. The total Nei's gene diversity (H_T) and the within-population gene diversity (H_S) were 0.35 and 0.29, respectively, which is comparable with that previously reported from other countries and regions. Among the varieties, the highest diversity was recorded in the variety Catimor. Analysis of molecular variance (AMOVA) revealed that about 87% of the total genetic variation was found within populations and the remaining 13% differentiate the populations (F_ST = 0.13; P < 0.001). The variation among the varieties was also significant. The genetic variation in Nicaraguan coffee is significant enough to be used in the breeding programs, and most of this variation can be conserved through ex situ conservation of a low number of populations from each variety. PMID:22701376
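
    As a rough illustration of the quantities in the abstract above, the Python sketch below computes Nei's gene diversity at a single locus from allele frequencies (H = 1 - sum of squared frequencies) and the differentiation coefficient G_ST = (H_T - H_S) / H_T from the total gene diversity (H_T) and the within-population gene diversity (H_S). The function names are hypothetical, and this is not the authors' analysis: the abstract's value of 0.13 comes from AMOVA, a different estimator, so the two numbers need not agree.

```python
from typing import Sequence


def gene_diversity(allele_freqs: Sequence[float]) -> float:
    """Nei's gene diversity at one locus: H = 1 - sum(p_i ** 2)."""
    if abs(sum(allele_freqs) - 1.0) > 1e-6:
        raise ValueError("allele frequencies must sum to 1")
    return 1.0 - sum(p * p for p in allele_freqs)


def g_st(h_t: float, h_s: float) -> float:
    """Proportion of total gene diversity found among populations."""
    if h_t <= 0:
        raise ValueError("total gene diversity must be positive")
    return (h_t - h_s) / h_t


# A locus with two equally frequent alleles has H = 0.5.
print(gene_diversity([0.5, 0.5]))  # 0.5

# Values reported in the abstract: H_T = 0.35, H_S = 0.29.
print(round(g_st(0.35, 0.29), 3))  # 0.171
```

    With the reported values, roughly 17% of the gene diversity lies among populations by this measure, which points the same way as the AMOVA result: most of the variation is found within populations.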

  15. Effect of plant diversity on the diversity of soil organic compounds.

    PubMed

    El Moujahid, Lamiae; Le Roux, Xavier; Michalet, Serge; Bellvert, Florian; Weigelt, Alexandra; Poly, Franck

    2017-01-01

    The effect of plant diversity on aboveground organisms and processes has been widely studied, but there is still a lack of knowledge regarding the link between plant diversity and soil characteristics. Here, we analyzed the effect of plant identity and diversity on the diversity of extractible soil organic compounds (ESOC) using 87 experimental grassland plots with different levels of plant diversity and based on a pool of over 50 plant species. Two pools of low molecular weight organic compounds, LMW1 and LMW2, were characterized by GC-MS and HPLC-DAD, respectively. These pools include specific organic acids, fatty acids and phenolics, with more organic acids in LMW1 and more phenolics in LMW2. Plant effect on the diversity of LMW1 and LMW2 compounds was strong and weak, respectively. LMW1 richness observed for bare soil was lower than that observed for all planted soils; and the richness of these soil compounds increased twofold when dominant plant species richness increased from 1 to 6. Comparing the richness of LMW1 compounds observed for a range of plant mixtures and for plant monocultures of species present in these mixtures, we showed that plant species richness increases the richness of these ESOC mainly through complementarity effects among plant species associated with contrasting spectra of soil compounds. This could explain previously reported effects of plant diversity on the diversity of soil heterotrophic microorganisms.

  16. Effect of plant diversity on the diversity of soil organic compounds

    PubMed Central

    El Moujahid, Lamiae; Michalet, Serge; Bellvert, Florian; Weigelt, Alexandra; Poly, Franck

    2017-01-01

    The effect of plant diversity on aboveground organisms and processes has been widely studied, but there is still a lack of knowledge regarding the link between plant diversity and soil characteristics. Here, we analyzed the effect of plant identity and diversity on the diversity of extractible soil organic compounds (ESOC) using 87 experimental grassland plots with different levels of plant diversity and based on a pool of over 50 plant species. Two pools of low molecular weight organic compounds, LMW1 and LMW2, were characterized by GC-MS and HPLC-DAD, respectively. These pools include specific organic acids, fatty acids and phenolics, with more organic acids in LMW1 and more phenolics in LMW2. Plant effect on the diversity of LMW1 and LMW2 compounds was strong and weak, respectively. LMW1 richness observed for bare soil was lower than that observed for all planted soils; and the richness of these soil compounds increased twofold when dominant plant species richness increased from 1 to 6. Comparing the richness of LMW1 compounds observed for a range of plant mixtures and for plant monocultures of species present in these mixtures, we showed that plant species richness increases the richness of these ESOC mainly through complementarity effects among plant species associated with contrasting spectra of soil compounds. This could explain previously reported effects of plant diversity on the diversity of soil heterotrophic microorganisms. PMID:28166250

  17. Relevance and Diversity of Nitrospira Populations in Biofilters of Brackish RAS

    PubMed Central

    Kruse, Myriam; Keuter, Sabine; Bakker, Evert; Spieck, Eva; Eggers, Till; Lipski, André

    2013-01-01

    Lithoautotrophic nitrite-oxidizing bacterial populations from moving-bed biofilters of brackish recirculation aquaculture systems (RAS; shrimp and barramundi) were tested for their metabolic activity and phylogenetic diversity. Samples from the biofilters were labeled with 13C-bicarbonate and supplemented with nitrite at concentrations of 0.3, 3 and 10 mM, and incubated at 17 and 28°C, respectively. The biofilm material was analyzed by fatty acid methyl ester - stable isotope probing (FAME-SIP). High portions of up to 45% of Nitrospira-related labeled lipid markers were found confirming that Nitrospira is the major autotrophic nitrite oxidizer in these brackish systems with high nitrogen loads. Other nitrite-oxidizing bacteria such as Nitrobacter or Nitrotoga were functionally not relevant in the investigated biofilters. Nitrospira-related 16S rRNA gene sequences were obtained from the samples with 10 mM nitrite and analyzed by a cloning approach. Sequence studies revealed four different phylogenetic clusters within the marine sublineage IV of Nitrospira, though most sequences clustered with the type strain of Nitrospira marina and with a strain isolated from a marine RAS. Three lipids dominated the whole fatty acid profiles of nitrite-oxidizing marine and brackish enrichments of Nitrospira sublineage IV organisms. The membranes included two marker lipids (16∶1 cis7 and 16∶1 cis11) combined with the non-specific acid 16∶0 as major compounds and confirmed these marker lipids as characteristic for sublineage IV species. The predominant labeling of these characteristic fatty acids and the phylogenetic sequence analyses of the marine Nitrospira sublineage IV identified organisms of this sublineage as main autotrophic nitrite-oxidizers in the investigated brackish biofilter systems. PMID:23705006

  18. Nucleic Acid Extraction from Synthetic Mars Analog Soils for in situ Life Detection

    NASA Astrophysics Data System (ADS)

    Mojarro, Angel; Ruvkun, Gary; Zuber, Maria T.; Carr, Christopher E.

    2017-08-01

    Biological informational polymers such as nucleic acids have the potential to provide unambiguous evidence of life beyond Earth. To this end, we are developing an automated in situ life-detection instrument that integrates nucleic acid extraction and nanopore sequencing: the Search for Extra-Terrestrial Genomes (SETG) instrument. Our goal is to isolate and determine the sequence of nucleic acids from extant or preserved life on Mars, if, for example, there is common ancestry to life on Mars and Earth. As is true of metagenomic analysis of terrestrial environmental samples, the SETG instrument must isolate nucleic acids from crude samples and then determine the DNA sequence of the unknown nucleic acids. Our initial DNA extraction experiments resulted in low to undetectable amounts of DNA due to soil chemistry-dependent soil-DNA interactions, namely adsorption to mineral surfaces, binding to divalent/trivalent cations, destruction by iron redox cycling, and acidic conditions. Subsequently, we developed soil-specific extraction protocols that increase DNA yields through a combination of desalting, utilization of competitive binders, and promotion of anaerobic conditions. Our results suggest that a combination of desalting and utilizing competitive binders may establish a "universal" nucleic acid extraction protocol suitable for analyzing samples from diverse soils on Mars.

  19. Genomic diversity of bacteriophages infecting the fish pathogen Flavobacterium psychrophilum.

    PubMed

    Castillo, Daniel; Middelboe, Mathias

    2016-12-01

    Bacteriophages infecting the fish pathogen Flavobacterium psychrophilum can potentially be used to prevent and control outbreaks of this bacterium in salmonid aquaculture. However, the application of bacteriophages in disease control requires detailed knowledge on their genetic composition. To explore the diversity of F. psychrophilum bacteriophages, we have analyzed the complete genome sequences of 17 phages isolated from two distant geographic areas (Denmark and Chile), including the previously characterized temperate bacteriophage 6H. Phage genome size ranged from 39 302 to 89 010 bp with a G+C content of 27%-32%. None of the bacteriophages isolated in Denmark contained genes associated with lysogeny, whereas the Chilean isolates were all putative temperate phages and similar to bacteriophage 6H. Comparative genome analysis showed that phages grouped in three different genetic clusters based on genetic composition and gene content, indicating a limited genetic diversity of F. psychrophilum-specific bacteriophages. However, amino acid sequence dissimilarity (25%) was found in putative structural proteins, which could be related to the host specificity determinants. This study represents the first analysis of genomic diversity and composition among bacteriophages infecting the fish pathogen F. psychrophilum and discusses the implications for the application of phages in disease control. © FEMS 2016. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
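
    The genome-level summary statistics quoted above (genome size in bp and G+C content) are straightforward to compute from a sequence string. The Python sketch below is a minimal illustration under stated assumptions: the function name is hypothetical, the example sequences are invented, and ambiguous bases such as N are simply excluded from the denominator, which may differ from the convention used in the study.

```python
def gc_content(seq: str) -> float:
    """Percentage of G and C among the unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        raise ValueError("sequence contains no A, C, G or T bases")
    return 100.0 * (seq.count("G") + seq.count("C")) / counted


# Toy sequences, not real phage genomes.
print(len("ATGCATAT"), gc_content("ATGCATAT"))  # 8 25.0
print(gc_content("ATNNGC"))  # 50.0 (the two N bases are ignored)
```

    Applied to complete phage genomes read from FASTA files, the same two numbers would give the kind of size and G+C ranges reported in the abstract.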

  20. Diversity of Δ12 Fatty Acid Desaturases in Santalaceae and Their Role in Production of Seed Oil Acetylenic Fatty Acids

    PubMed Central

    Okada, Shoko; Zhou, Xue-Rong; Damcevski, Katherine; Gibb, Nerida; Wood, Craig; Hamberg, Mats; Haritos, Victoria S.

    2013-01-01

    Plants in the Santalaceae family, including the native cherry Exocarpos cupressiformis and sweet quandong Santalum acuminatum, accumulate ximenynic acid (trans-11-octadecen-9-ynoic acid) in their seed oil and conjugated polyacetylenic fatty acids in root tissue. Twelve full-length genes coding for microsomal Δ12 fatty acid desaturases (FADs) from the two Santalaceae species were identified by degenerate PCR. Phylogenetic analysis of the predicted amino acid sequences placed five Santalaceae FADs with Δ12 FADs, which include Arabidopsis thaliana FAD2. When expressed in yeast, the major activity of these genes was Δ12 desaturation of oleic acid, but unusual activities were also observed: i.e. Δ15 desaturation of linoleic acid as well as trans-Δ12 and trans-Δ11 desaturations of stearolic acid (9-octadecynoic acid). The trans-12-octadecen-9-ynoic acid product was also detected in quandong seed oil. The two other FAD groups (FADX and FADY) were present in both species; in a phylogenetic tree of microsomal FAD enzymes, FADX and FADY formed a unique clade, suggesting that they are highly divergent. The FADX group enzymes had no detectable Δ12 FAD activity but instead catalyzed cis-Δ13 desaturation of stearolic acid when expressed in yeast. No products were detected for the FADY group when expressed recombinantly. Quantitative PCR analysis showed that the FADY genes were expressed in leaf rather than developing seed of the native cherry. FADs with promiscuous and unique activities have been identified in Santalaceae and explain the origin of some of the unusual lipids found in this plant family. PMID:24062307