Science.gov

Sample records for acid sequence diversity

  1. Amino acid sequence diversity of the major human papillomavirus capsid protein: Implications for current and next generation vaccines

    PubMed Central

    Ahmed, Amina I.; Bissett, Sara L.; Beddows, Simon

    2013-01-01

    Despite the fidelity of host cell polymerases, the human papillomavirus (HPV) displays a degree of genomic polymorphism resulting in distinct genotypes and intra-type variants. The current HPV vaccines target the most prevalent genotypes associated with cervical cancer (HPV16/18) and genital warts (HPV6/11). Although these vaccines confer some measure of cross-protection, a multivalent HPV vaccine is in the pipeline that aims to broaden vaccine protection against other cervical cancer-associated genotypes including HPV31, HPV33, HPV45, HPV52 and HPV58. Both current and next generation vaccines comprise virus-like particles, based upon the major capsid protein, L1, and vaccine-induced, type-specific protection is likely mediated by neutralizing antibodies targeting L1 surface-exposed domains. The aim of this study was to perform an in silico analysis of existing full length L1 sequences representing vaccine-relevant HPV genotypes in order to address the degree of naturally-occurring, intra-type polymorphisms. In total, 1281 sequences from the Americas, Africa, Asia and Europe were assembled. Intra-type entropy was low and/or limited to non-surface-exposed residues for HPV6, HPV11 and HPV52 suggesting a minimal effect on vaccine antibodies for these genotypes. For HPV16, intra-type entropy was high but the present analysis did not reveal any significant polymorphisms not previously identified. For HPV31, HPV33, HPV58, however, intra-type entropy was high, mostly mapped to surface-exposed domains and in some cases within known neutralizing antibody epitopes. For HPV18 and HPV45 there were too few sequences for a definitive analysis, but HPV45 displayed some degree of surface-exposed residue diversity. In most cases, the reference sequence for each genotype represented a minority variant and the consensus L1 sequences for HPV18, HPV31, HPV45 and HPV58 did not reflect the L1 sequence of the currently available HPV pseudoviruses. These data highlight a number of variant
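
The record above reports "intra-type entropy" without stating the formula. A minimal sketch follows, assuming per-position Shannon entropy over an alignment, which is a common choice for this kind of analysis but is not confirmed by the abstract. The function names and the toy alignment are hypothetical and are not taken from the study.

```python
import math
from collections import Counter


def column_entropy(column, gap="-"):
    """Shannon entropy (in bits) of one alignment column, ignoring gaps."""
    residues = [r for r in column if r != gap]
    if not residues:
        return 0.0
    total = len(residues)
    counts = Counter(residues)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def alignment_entropy(sequences):
    """Per-position entropy for a list of equal-length aligned sequences."""
    if len({len(s) for s in sequences}) > 1:
        raise ValueError("aligned sequences must all have the same length")
    return [column_entropy(col) for col in zip(*sequences)]


# Hypothetical toy alignment (not real L1 data): only position 2 varies.
toy_alignment = ["MSLW", "MSLW", "MALW", "MTLW"]
entropies = alignment_entropy(toy_alignment)
```

Invariant positions score zero and variable positions score higher, so a profile like this can be compared against surface-exposed regions of a protein.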

  2. Diversity of trypsins in the Mediterranean corn borer Sesamia nonagrioides (Lepidoptera: Noctuidae), revealed by nucleic acid sequences and enzyme purification.

    PubMed

    Díaz-Mendoza, M; Ortego, F; García de Lacoba, M; Magaña, C; de la Poza, M; Farinós, G P; Castañera, P; Hernández-Crespo, P

    2005-09-01

    The existence of a diverse trypsin gene family with a main role in the proteolytic digestion process has been proved in vertebrate and invertebrate organisms. In lepidopteran insects, a diversity of trypsin-like genes expressed in midgut has also been identified. Genomic DNA and cDNA trypsin-like sequences expressed in the Mediterranean corn borer (MCB), Sesamia nonagrioides, midgut are reported in this paper. A phylogenetic analysis revealed that at least three types of trypsin-like enzymes putatively involved in digestion are conserved in MCB and other lepidopteran species. As expected, a diversity of sequences has been found, including four type-I (two subtypes), four type-II (two subtypes) and one type-III. In parallel, four different trypsins have been purified from midgut lumen of late instar MCB larvae. N-terminal sequencing and mass spectrometric analyses of purified trypsins have been performed in order to identify cDNAs coding for major trypsins among the diversity of trypsin-like sequences obtained. Thus, it is revealed that the four purified trypsins in MCB belong to the three well-defined phylogenetic groups of trypsin-like sequences detected in Lepidoptera. Major active trypsins present in late instar MCB lumen guts are trypsin-I (type-I), trypsin-IIA and trypsin-IIB (type-II), and trypsin-III (type-III). Trypsin-I, trypsin-IIA and trypsin-III showed preference for Arg over Lys, but responded differently to proteinaceous or synthetic inhibitors. As full-length cDNA clones coding for the purified trypsins were available, three-dimensional protein models were built in order to study the implication of specific residues on their response to inhibitors. Thus, it is predicted that Arg73, conserved in type-I lepidopteran trypsins, may favour reversible inhibition by E-64. Indeed, the substitution of Val213Cys, unique for type-II lepidopteran trypsins, may be responsible for their specific inhibition by HgCl2. The implication of these results on the

  3. High genetic diversity among strains of the unindustrialized lactic acid bacterium Carnobacterium maltaromaticum in dairy products as revealed by multilocus sequence typing.

    PubMed

    Rahman, Abdur; Cailliez-Grimal, Catherine; Bontemps, Cyril; Payot, Sophie; Chaillou, Stéphane; Revol-Junelles, Anne-Marie; Borges, Frédéric

    2014-07-01

    Dairy products are colonized with three main classes of lactic acid bacteria (LAB): opportunistic bacteria, traditional starters, and industrial starters. Most population structure studies were previously performed with LAB species belonging to these three classes and provide useful knowledge about the population structure of LAB at the stage where they are already industrialized. However, these studies give little information about the population structure of LAB prior to their use as an industrial starter. Carnobacterium maltaromaticum is a LAB colonizing diverse environments, including dairy products. Since this bacterium was discovered relatively recently, it is not yet commercialized as an industrial starter, which makes C. maltaromaticum an interesting model for the study of unindustrialized LAB population structure in dairy products. A multilocus sequence typing scheme based on an analysis of fragments of the genes dapE, ddlA, glpQ, ilvE, pyc, pyrE, and leuS was applied to a collection of 47 strains, including 28 strains isolated from dairy products. The scheme detected 36 sequence types with a discriminatory index of 0.98. The whole population was clustered in four deeply branched lineages, in which the dairy strains were spread. Moreover, the dairy strains exhibited a high diversity within these lineages, leading to an overall dairy population with a diversity level as high as that of the nondairy population. These results are in agreement with the hypothesis according to which the industrialization of LAB leads to a diversity reduction in dairy products.
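
The record above quotes a discriminatory index of 0.98 for the typing scheme but does not give the formula. A minimal sketch follows, assuming the Hunter-Gaston form of Simpson's index, D = 1 - Σ n_j(n_j - 1) / (N(N - 1)), which is widely used for typing schemes. The strain-to-type assignments below are hypothetical, and the per-type counts from the study are not available here, so the reported value is not reproduced.

```python
from collections import Counter


def discriminatory_index(type_assignments):
    """Hunter-Gaston discriminatory index for a list of type labels.

    Returns the probability that two strains drawn without replacement
    from the collection belong to different types.
    """
    n_strains = len(type_assignments)
    if n_strains < 2:
        raise ValueError("at least two strains are required")
    counts = Counter(type_assignments)
    same_type_pairs = sum(n * (n - 1) for n in counts.values())
    return 1.0 - same_type_pairs / (n_strains * (n_strains - 1))


# Hypothetical sequence-type labels for six strains (not data from the study).
toy_types = ["ST1", "ST1", "ST2", "ST3", "ST3", "ST3"]
toy_index = discriminatory_index(toy_types)
```

The index approaches 1 when nearly every strain has its own sequence type, which is consistent with 36 types among 47 strains giving a value close to 1.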

  4. Genome Sequences of Eight Morphologically Diverse Alphaproteobacteria

    PubMed Central

    Brown, Pamela J. B.; Kysela, David T.; Buechlein, Aaron; Hemmerich, Chris; Brun, Yves V.

    2011-01-01

    The Alphaproteobacteria comprise morphologically diverse bacteria, including many species of stalked bacteria. Here we announce the genome sequences of eight alphaproteobacteria, including the first genome sequences of species belonging to the genera Asticcacaulis, Hirschia, Hyphomicrobium, and Rhodomicrobium. PMID:21705585

  5. Genome sequences of eight morphologically diverse Alphaproteobacteria.

    PubMed

    Brown, Pamela J B; Kysela, David T; Buechlein, Aaron; Hemmerich, Chris; Brun, Yves V

    2011-09-01

    The Alphaproteobacteria comprise morphologically diverse bacteria, including many species of stalked bacteria. Here we announce the genome sequences of eight alphaproteobacteria, including the first genome sequences of species belonging to the genera Asticcacaulis, Hirschia, Hyphomicrobium, and Rhodomicrobium.

  6. Composition for nucleic acid sequencing

    DOEpatents

    Korlach, Jonas; Webb, Watt W.; Levene, Michael; Turner, Stephen; Craighead, Harold G.; Foquet, Mathieu

    2008-08-26

    The present invention is directed to a method of sequencing a target nucleic acid molecule having a plurality of bases. In its principle, the temporal order of base additions during the polymerization reaction is measured on a molecule of nucleic acid, i.e. the activity of a nucleic acid polymerizing enzyme on the template nucleic acid molecule to be sequenced is followed in real time. The sequence is deduced by identifying which base is being incorporated into the growing complementary strand of the target nucleic acid by the catalytic activity of the nucleic acid polymerizing enzyme at each step in the sequence of base additions. A polymerase on the target nucleic acid molecule complex is provided in a position suitable to move along the target nucleic acid molecule and extend the oligonucleotide primer at an active site. A plurality of labelled types of nucleotide analogs are provided proximate to the active site, with each distinguishable type of nucleotide analog being complementary to a different nucleotide in the target nucleic acid sequence. The growing nucleic acid strand is extended by using the polymerase to add a nucleotide analog to the nucleic acid strand at the active site, where the nucleotide analog being added is complementary to the nucleotide of the target nucleic acid at the active site. The nucleotide analog added to the oligonucleotide primer as a result of the polymerizing step is identified. The steps of providing labelled nucleotide analogs, polymerizing the growing nucleic acid strand, and identifying the added nucleotide analog are repeated so that the nucleic acid strand is further extended and the sequence of the target nucleic acid is determined.

  7. Human Ig heavy chain CDR3 regions in adult bone marrow pre-B cells display an adult phenotype of diversity: evidence for structural selection of DH amino acid sequences.

    PubMed

    Raaphorst, F M; Raman, C S; Tami, J; Fischbach, M; Sanz, I

    1997-10-01

    Ig repertoires generated at various developmental stages differ markedly in diversity. It is well documented that Ig H chain genes in human fetal liver are limited with regard to N-regional diversity and use of diversity elements. It is unclear whether these characteristics persist in pre-B cell H chain genes of adult bone marrow. Using Ig H chain CDR3 fingerprinting and sequence analysis, we analyzed the diversity of Ig H chain third complementarity determining regions (HCDR3) in adult bone marrow pre-B and mature B lymphocytes. Pre-B cell HCDR3 sequences exhibited adult characteristics with respect to HCDR3 size, distribution of N regions and usage of diversity elements. This suggested that pre-B cells in adults are distinct from fetal B cell precursors with regard to Ig H chain diversification mechanisms. At the DNA sequence level, HCDR3 diversity in mature B cells was similar to that in pre-B cells. Pre-B HCDR3s, however, frequently contained a consecutive stretch of hydrophobic amino acids, which were rare in mature B cells. We propose that highly hydrophobic pre-B HCDR3s may be negatively selected on the basis of structural limitations imposed by the antigen binding site. At the same time, usage of hydrophilic HCDR3 sequences (thought to support HCDR3 loop formation) may be promoted by positive selection.

  8. High speed nucleic acid sequencing

    SciTech Connect

    Korlach, Jonas; Webb, Watt W.; Levene, Michael; Turner, Stephen; Craighead, Harold G.; Foquet, Mathieu

    2011-05-17

    The present invention is directed to a method of sequencing a target nucleic acid molecule having a plurality of bases. In its principle, the temporal order of base additions during the polymerization reaction is measured on a molecule of nucleic acid. Each type of labeled nucleotide comprises an acceptor fluorophore attached to a phosphate portion of the nucleotide such that the fluorophore is removed upon incorporation into a growing strand. Fluorescent signal is emitted via fluorescent resonance energy transfer between the donor fluorophore and the acceptor fluorophore as each nucleotide is incorporated into the growing strand. The sequence is deduced by identifying which base is being incorporated into the growing strand.

  9. Castor bean organelle genome sequencing and worldwide genetic diversity analysis.

    PubMed

    Rivarola, Maximo; Foster, Jeffrey T; Chan, Agnes P; Williams, Amber L; Rice, Danny W; Liu, Xinyue; Melake-Berhan, Admasu; Huot Creasy, Heather; Puiu, Daniela; Rosovitz, M J; Khouri, Hoda M; Beckstrom-Sternberg, Stephen M; Allan, Gerard J; Keim, Paul; Ravel, Jacques; Rabinowicz, Pablo D

    2011-01-01

    Castor bean is an important oil-producing plant in the Euphorbiaceae family. Its high-quality oil contains up to 90% of the unusual fatty acid ricinoleate, which has many industrial and medical applications. Castor bean seeds also contain ricin, a highly toxic Type 2 ribosome-inactivating protein, which has gained relevance in recent years due to biosafety concerns. In order to gain knowledge on global genetic diversity in castor bean and to ultimately help the development of breeding and forensic tools, we carried out an extensive chloroplast sequence diversity analysis. Taking advantage of the recently published genome sequence of castor bean, we assembled the chloroplast and mitochondrion genomes extracting selected reads from the available whole genome shotgun reads. Using the chloroplast reference genome we used the methylation filtration technique to readily obtain draft genome sequences of 7 geographically and genetically diverse castor bean accessions. These sequence data were used to identify single nucleotide polymorphism markers and phylogenetic analysis resulted in the identification of two major clades that were not apparent in previous population genetic studies using genetic markers derived from nuclear DNA. Two distinct sub-clades could be defined within each major clade and large-scale genotyping of castor bean populations worldwide confirmed previously observed low levels of genetic diversity and showed a broad geographic distribution of each sub-clade.

  10. Castor Bean Organelle Genome Sequencing and Worldwide Genetic Diversity Analysis

    PubMed Central

    Chan, Agnes P.; Williams, Amber L.; Rice, Danny W.; Liu, Xinyue; Melake-Berhan, Admasu; Huot Creasy, Heather; Puiu, Daniela; Rosovitz, M. J.; Khouri, Hoda M.; Beckstrom-Sternberg, Stephen M.; Allan, Gerard J.; Keim, Paul; Ravel, Jacques; Rabinowicz, Pablo D.

    2011-01-01

    Castor bean is an important oil-producing plant in the Euphorbiaceae family. Its high-quality oil contains up to 90% of the unusual fatty acid ricinoleate, which has many industrial and medical applications. Castor bean seeds also contain ricin, a highly toxic Type 2 ribosome-inactivating protein, which has gained relevance in recent years due to biosafety concerns. In order to gain knowledge on global genetic diversity in castor bean and to ultimately help the development of breeding and forensic tools, we carried out an extensive chloroplast sequence diversity analysis. Taking advantage of the recently published genome sequence of castor bean, we assembled the chloroplast and mitochondrion genomes extracting selected reads from the available whole genome shotgun reads. Using the chloroplast reference genome we used the methylation filtration technique to readily obtain draft genome sequences of 7 geographically and genetically diverse castor bean accessions. These sequence data were used to identify single nucleotide polymorphism markers and phylogenetic analysis resulted in the identification of two major clades that were not apparent in previous population genetic studies using genetic markers derived from nuclear DNA. Two distinct sub-clades could be defined within each major clade and large-scale genotyping of castor bean populations worldwide confirmed previously observed low levels of genetic diversity and showed a broad geographic distribution of each sub-clade. PMID:21750729

  11. Amino acid sequence diversity within the family of antibodies bearing the major antiarsonate cross-reactive idiotype of the A strain mouse

    PubMed Central

    1983-01-01

    VH region amino acid sequences are described for five A/J anti-p-azophenylarsonate (anti-Ars) hybridoma antibodies for which the VL region sequences have previously been determined, thus completing the V domain sequences of these molecules. These antibodies all belong to the family designated Ars-A which bears the major anti-arsonate cross-reactive idiotype (CRI) of the A strain mouse. However, they differ in the degree to which they express the CRI in standard competition radioimmunoassays. Although the sequences are closely related, all are different from each other. Replacements are distributed throughout the VH region and occur in positions of the chain encoded by all three gene segments, VH, DH, and JH. It is likely that somatic diversification processes play a dominant role in producing the sequence variability in each of these segments. The number of differences from the sequence encoded by the germline is smallest for antibodies that express the CRI most strongly, suggesting that somatic diversification is responsible for loss of the CRI in members of the Ars-A antibody family. There is an unusual degree of clustering of differences in both CDR2 and CDR3 and many of the substitutions are located in "hot spots" of variation. The large number of differences between the chains prohibits the unambiguous identification of positions at which alterations play a major role in reducing the expression of the CRI. However, the data suggest that the loss of the CRI is associated with a definable repertoire of somatic changes at a restricted number of highly variable sites. PMID:6415209

  12. Sequence diversity and evolution of antimicrobial peptides in invertebrates.

    PubMed

    Tassanakajon, Anchalee; Somboonwiwat, Kunlaya; Amparyup, Piti

    2015-02-01

    Antimicrobial peptides (AMPs) are evolutionarily ancient molecules that act as the key components in the invertebrate innate immunity against invading pathogens. Several AMPs have been identified and characterized in invertebrates, and found to display considerable diversity in their amino acid sequence, structure and biological activity. AMP genes appear to have rapidly evolved, which might have arisen from the co-evolutionary arms race between host and pathogens, and enabled organisms to survive in different microbial environments. Here, the sequence diversity of invertebrate AMPs (defensins, cecropins, crustins and anti-lipopolysaccharide factors) is presented to provide a better understanding of the evolution pattern of these peptides that play a major role in host defense mechanisms.

  13. Bacterial Diversity at an Acid Mine Drainage Site in Maine

    NASA Astrophysics Data System (ADS)

    Gaynor, J.; Sawyer, T.; Riley, F. E.; Moulton, K. D.; Rothschild, L. J.; Duboise, S. M.

    2010-04-01

    Bacterial diversity in acidic mine drainage at a historic Maine iron mining site was investigated by isolation of environmental DNA, PCR amplification of the V3 region of the 16S rRNA gene, denaturing gradient gel electrophoresis, and DNA sequencing.

  14. Menagerie of Viruses: Diverse Chemical Sequences or Simple Electrostatics?

    NASA Astrophysics Data System (ADS)

    Muthukumar, M.

    2008-03-01

    The genome packing in hundreds of viruses is investigated by analyzing the chemical sequences of the genomes and the corresponding capsid proteins, in combination with experimental facts on the structures of the packaged genomes. Based on statistical mechanics arguments and computer simulations, we have derived a universal model, based simply on non-specific electrostatic interactions. Our model is able to predict the essential aspects of genome packing in widely different viruses, such as the genome size and its density distribution. Our result is in contrast to the long-held view that specific interactions between the sequenced amino acid residues and the nucleotides of the genome control the genome packing. Implications of this finding for evolution and biotechnology will be discussed.

  15. Diversity of amino acids in a typical chernozem of Moldova

    NASA Astrophysics Data System (ADS)

    Frunze, N. I.

    2014-12-01

    The content and composition of the amino acids in typical chernozems were studied. The objects of the study included a reference soil under an old fallow and three variants under fodder crop rotations: not fertilized, with mineral fertilizers, and with organic fertilizers. The contents of 18 amino acids were determined in these soils. The amino acids were extracted by the method of acid hydrolysis and identified by the method of ion-exchange chromatography. The total content of most of the amino acids was maximal in the reference soil; it was much lower in the cultivated soils and decreased in the following sequence: organic background > mineral background > no fertilization. The diversity of amino acids was evaluated quantitatively using different parameters applied in ecology for estimating various aspects of the species composition of communities (Simpson, Margalef, Menhinick, and Shannon's indices). The diversity and contribution of different amino acids to the total pool of amino acids also varied significantly in the studied variants. The maximum diversity of amino acids and maximum evenness of their relative abundance indices were typical of the reference chernozem; these parameters were lower in the cultivated soils. It was concluded that the changes in the structure of the amino acids under the impact of agricultural loads are similar to those that are usually observed under stress conditions.
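
The record above names four ecological indices (Simpson, Margalef, Menhinick, Shannon) without giving formulas. A minimal sketch follows, assuming the textbook definitions: Shannon H = -Σ p_i ln p_i, Simpson as 1 - Σ p_i², Margalef (S - 1)/ln N, and Menhinick S/√N. Several variants of each index exist, and the abstract does not say which were used or in what units the amino acid contents entered N, so the function name and the toy abundances are hypothetical.

```python
import math


def diversity_indices(abundances):
    """Compute four diversity indices from a list of non-negative abundances.

    S is the number of categories with non-zero abundance and N is the
    total abundance; N must exceed 1 so that ln(N) is positive.
    """
    values = [a for a in abundances if a > 0]
    total = sum(values)
    if total <= 1:
        raise ValueError("total abundance must be greater than 1")
    richness = len(values)
    proportions = [a / total for a in values]
    return {
        "shannon": -sum(p * math.log(p) for p in proportions),
        "simpson": 1.0 - sum(p * p for p in proportions),
        "margalef": (richness - 1) / math.log(total),
        "menhinick": richness / math.sqrt(total),
    }


# Hypothetical, perfectly even abundances for four categories.
toy = diversity_indices([10, 10, 10, 10])
```

For a perfectly even pool the Shannon index equals ln S, its maximum for S categories, which matches the abstract's pairing of maximum diversity with maximum evenness.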

  16. Chip-based sequencing nucleic acids

    DOEpatents

    Beer, Neil Reginald

    2014-08-26

    A system for fast DNA sequencing by amplification of genetic material within microreactors, denaturing, demulsifying, and then sequencing the material, while retaining it in a PCR/sequencing zone by a magnetic field. One embodiment includes sequencing nucleic acids on a microchip that includes a microchannel flow channel in the microchip. The nucleic acids are isolated and hybridized to magnetic nanoparticles or to magnetic polystyrene-coated beads. Microreactor droplets are formed in the microchannel flow channel. The microreactor droplets containing the nucleic acids and the magnetic nanoparticles are retained in a magnetic trap in the microchannel flow channel and sequenced.

  17. Distinguishing Proteins From Arbitrary Amino Acid Sequences

    PubMed Central

    Yau, Stephen S.-T.; Mao, Wei-Guang; Benson, Max; He, Rong Lucy

    2015-01-01

    What kinds of amino acid sequences could possibly be protein sequences? From all existing databases that we can find, known proteins are only a small fraction of all possible combinations of amino acids. Beginning with Sanger's first detailed determination of a protein sequence in 1952, previous studies have focused on describing the structure of existing protein sequences in order to construct the protein universe. No one, however, has developed a criterion for determining whether an arbitrary amino acid sequence can be a protein. Here we show that when the collection of arbitrary amino acid sequences is viewed in an appropriate geometric context, the protein sequences cluster together. This leads to a new computational test, described here, that has proved to be remarkably accurate at determining whether an arbitrary amino acid sequence can be a protein. Even more, if the results of this test indicate that the sequence can be a protein, and it is indeed a protein sequence, then its identity as a protein sequence is uniquely defined. We anticipate our computational test will be useful for those who are attempting to complete the job of discovering all proteins, or constructing the protein universe. PMID:25609314

  18. The complete amino acid sequence of prochymosin.

    PubMed Central

    Foltmann, B; Pedersen, V B; Jacobsen, H; Kauffman, D; Wybrandt, G

    1977-01-01

    The total sequence of 365 amino acid residues in bovine prochymosin is presented. Alignment with the amino acid sequence of porcine pepsinogen shows that 204 amino acid residues are common to the two zymogens. Further comparison and alignment with the amino acid sequence of penicillopepsin shows that 66 residues are located at identical positions in all three proteases. The three enzymes belong to a large group of proteases with two aspartate residues in the active center. This group forms a family derived from one common ancestor. PMID:329280

  19. Method for sequencing nucleic acid molecules

    DOEpatents

    Korlach, Jonas; Webb, Watt W.; Levene, Michael; Turner, Stephen; Craighead, Harold G.; Foquet, Mathieu

    2006-05-30

    The present invention is directed to a method of sequencing a target nucleic acid molecule having a plurality of bases. In its principle, the temporal order of base additions during the polymerization reaction is measured on a molecule of nucleic acid, i.e. the activity of a nucleic acid polymerizing enzyme on the template nucleic acid molecule to be sequenced is followed in real time. The sequence is deduced by identifying which base is being incorporated into the growing complementary strand of the target nucleic acid by the catalytic activity of the nucleic acid polymerizing enzyme at each step in the sequence of base additions. A polymerase on the target nucleic acid molecule complex is provided in a position suitable to move along the target nucleic acid molecule and extend the oligonucleotide primer at an active site. A plurality of labelled types of nucleotide analogs are provided proximate to the active site, with each distinguishable type of nucleotide analog being complementary to a different nucleotide in the target nucleic acid sequence. The growing nucleic acid strand is extended by using the polymerase to add a nucleotide analog to the nucleic acid strand at the active site, where the nucleotide analog being added is complementary to the nucleotide of the target nucleic acid at the active site. The nucleotide analog added to the oligonucleotide primer as a result of the polymerizing step is identified. The steps of providing labelled nucleotide analogs, polymerizing the growing nucleic acid strand, and identifying the added nucleotide analog are repeated so that the nucleic acid strand is further extended and the sequence of the target nucleic acid is determined.

  20. Method for sequencing nucleic acid molecules

    DOEpatents

    Korlach, Jonas; Webb, Watt W.; Levene, Michael; Turner, Stephen; Craighead, Harold G.; Foquet, Mathieu

    2006-06-06

    The present invention is directed to a method of sequencing a target nucleic acid molecule having a plurality of bases. In its principle, the temporal order of base additions during the polymerization reaction is measured on a molecule of nucleic acid, i.e. the activity of a nucleic acid polymerizing enzyme on the template nucleic acid molecule to be sequenced is followed in real time. The sequence is deduced by identifying which base is being incorporated into the growing complementary strand of the target nucleic acid by the catalytic activity of the nucleic acid polymerizing enzyme at each step in the sequence of base additions. A polymerase on the target nucleic acid molecule complex is provided in a position suitable to move along the target nucleic acid molecule and extend the oligonucleotide primer at an active site. A plurality of labelled types of nucleotide analogs are provided proximate to the active site, with each distinguishable type of nucleotide analog being complementary to a different nucleotide in the target nucleic acid sequence. The growing nucleic acid strand is extended by using the polymerase to add a nucleotide analog to the nucleic acid strand at the active site, where the nucleotide analog being added is complementary to the nucleotide of the target nucleic acid at the active site. The nucleotide analog added to the oligonucleotide primer as a result of the polymerizing step is identified. The steps of providing labelled nucleotide analogs, polymerizing the growing nucleic acid strand, and identifying the added nucleotide analog are repeated so that the nucleic acid strand is further extended and the sequence of the target nucleic acid is determined.

  1. Sequence diversity of wheat mosaic virus isolates.

    PubMed

    Stewart, Lucy R

    2016-02-02

    Wheat mosaic virus (WMoV), transmitted by eriophyid wheat curl mites (Aceria tosichella) is the causal agent of High Plains disease in wheat and maize. WMoV and other members of the genus Emaravirus evaded thorough molecular characterization for many years due to the experimental challenges of mite transmission and manipulating multisegmented negative sense RNA genomes. Recently, the complete genome sequence of a Nebraska isolate of WMoV revealed eight segments, plus a variant sequence of the nucleocapsid protein-encoding segment. Here, near-complete and partial consensus sequences of five more WMoV isolates are reported and compared to the Nebraska isolate: an Ohio maize isolate (GG1), a Kansas barley isolate (KS7), and three Ohio wheat isolates (H1, K1, W1). Results show two distinct groups of WMoV isolates: Ohio wheat isolate RNA segments had 84% or lower nucleotide sequence identity to the NE isolate, whereas GG1 and KS7 had 98% or higher nucleotide sequence identity to the NE isolate. Knowledge of the sequence variability of WMoV isolates is a step toward understanding virus biology, and potentially explaining observed biological variation.
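
The record above compares isolates by percent nucleotide sequence identity. A minimal sketch of that calculation follows, assuming two pre-aligned sequences and assuming that columns containing a gap in either sequence are excluded from the denominator. Gap handling differs between tools, and the abstract does not state which convention was used. The sequences shown are hypothetical.

```python
def percent_identity(seq_a, seq_b, gap="-"):
    """Percent identity between two aligned sequences of equal length.

    Columns where either sequence has a gap are skipped (an assumption).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable (non-gap) columns")
    return 100.0 * matches / compared


# Hypothetical aligned RNA fragments differing at one of ten positions.
reference = "ACGUACGUAC"
isolate = "ACGUUCGUAC"
identity = percent_identity(reference, isolate)
```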

  2. Amino acid sequence repertoire of the bacterial proteome and the occurrence of untranslatable sequences

    PubMed Central

    Navon, Sharon Penias; Kornberg, Guy; Chen, Jin; Schwartzman, Tali; Tsai, Albert; Puglisi, Elisabetta Viani; Puglisi, Joseph D.; Adir, Noam

    2016-01-01

    Bioinformatic analysis of Escherichia coli proteomes revealed that all possible amino acid triplet sequences occur at their expected frequencies, with four exceptions. Two of the four underrepresented sequences (URSs) were shown to interfere with translation in vivo and in vitro. Enlarging the URS by a single amino acid resulted in increased translational inhibition. Single-molecule methods revealed stalling of translation at the entrance of the peptide exit tunnel of the ribosome, adjacent to ribosomal nucleotides A2062 and U2585. Interaction with these same ribosomal residues is involved in regulation of translation by longer, naturally occurring protein sequences. The E. coli exit tunnel has evidently evolved to minimize interaction with the exit tunnel and maximize the sequence diversity of the proteome, although allowing some interactions for regulatory purposes. Bioinformatic analysis of the human proteome revealed no underrepresented triplet sequences, possibly reflecting an absence of regulation by interaction with the exit tunnel. PMID:27307442

  3. Amino acid sequence repertoire of the bacterial proteome and the occurrence of untranslatable sequences.

    PubMed

    Navon, Sharon Penias; Kornberg, Guy; Chen, Jin; Schwartzman, Tali; Tsai, Albert; Puglisi, Elisabetta Viani; Puglisi, Joseph D; Adir, Noam

    2016-06-28

    Bioinformatic analysis of Escherichia coli proteomes revealed that all possible amino acid triplet sequences occur at their expected frequencies, with four exceptions. Two of the four underrepresented sequences (URSs) were shown to interfere with translation in vivo and in vitro. Enlarging the URS by a single amino acid resulted in increased translational inhibition. Single-molecule methods revealed stalling of translation at the entrance of the peptide exit tunnel of the ribosome, adjacent to ribosomal nucleotides A2062 and U2585. Interaction with these same ribosomal residues is involved in regulation of translation by longer, naturally occurring protein sequences. The E. coli exit tunnel has evidently evolved to minimize interaction with the exit tunnel and maximize the sequence diversity of the proteome, although allowing some interactions for regulatory purposes. Bioinformatic analysis of the human proteome revealed no underrepresented triplet sequences, possibly reflecting an absence of regulation by interaction with the exit tunnel.
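
The record above compares observed amino acid triplet counts with their expected frequencies. A minimal sketch follows, assuming the expectation is the product of the single-residue frequencies multiplied by the number of triplet windows. The abstract does not define its null model, so this independence assumption, the function name, and the toy sequences are all hypothetical.

```python
from collections import Counter
from itertools import product


def triplet_observed_expected(proteins):
    """Map each possible triplet to (observed count, expected count).

    Expected counts assume residues occur independently at their overall
    frequencies (an assumption, not the published method).
    """
    residue_counts = Counter()
    triplet_counts = Counter()
    for seq in proteins:
        residue_counts.update(seq)
        # Overlapping windows of length 3; short sequences contribute none.
        triplet_counts.update(seq[i:i + 3] for i in range(len(seq) - 2))
    total_residues = sum(residue_counts.values())
    total_triplets = sum(triplet_counts.values())
    freq = {aa: n / total_residues for aa, n in residue_counts.items()}
    result = {}
    for a, b, c in product(sorted(freq), repeat=3):
        expected = total_triplets * freq[a] * freq[b] * freq[c]
        result[a + b + c] = (triplet_counts[a + b + c], expected)
    return result


# Hypothetical toy "proteome" of two short peptides.
toy_table = triplet_observed_expected(["MKKLA", "MKKA"])
```

Triplets whose observed count falls far below the expected count would be candidates for underrepresented sequences. A real analysis would also need a statistical test, which this sketch omits.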

  4. Sequence diversity and molecular evolutionary rates between buffalo and cattle.

    PubMed

    Moaeen-ud-Din, M; Bilal, G

    2015-02-01

    Identification of genes important for production traits in buffalo is impaired by a paucity of genomic resources. One way to fill this gap is to exploit data available for cattle. Cross-species application of comparative genomics tools is a potential means of investigating the buffalo genome; however, this depends on nucleotide sequence similarity. In this study, gene diversity between buffalo and cattle was determined using 86 gene orthologues. There was approximately 3% difference in all genes in terms of nucleotide diversity and 0.267 ± 0.134 in amino acids, indicating the possibility of successfully using cross-species strategies for genomic studies. There were significantly higher non-synonymous substitutions in both cattle and buffalo; however, the difference in terms of dN-dS was similar (4.414 versus 4.745 in buffalo and cattle, respectively). A higher rate of non-synonymous substitutions at a similar level in buffalo and cattle indicated a similar positive selection pressure. Results of the relative rate test were assessed with the chi-squared test. There was no significant difference in unique mutations between cattle and buffalo lineages at synonymous sites. However, there was a significant difference in unique mutations at non-synonymous sites, indicating an ongoing mutagenic process that generates substitutional mutation at approximately the same rate at silent sites. Moreover, despite common ancestry, our results indicate a different divergence time among genes of cattle and buffalo. This is the first demonstration that variable rates of molecular evolution may be present within the family Bovidae.

  5. Sequences Of Amino Acids For Human Serum Albumin

    NASA Technical Reports Server (NTRS)

    Carter, Daniel C.

    1992-01-01

    Sequences of amino acids defined for use in making polypeptides one-third to one-sixth as large as parent human serum albumin molecule. Smaller, chemically stable peptides have diverse applications including service as artificial human serum and as active components of biosensors and chromatographic matrices. In applications involving production of artificial sera from new sequences, little or no concern about viral contaminants. Smaller genetically engineered polypeptides more easily expressed and produced in large quantities, making commercial isolation and production more feasible and profitable.

  6. Optimal Sequencing Strategies for Surveying Molecular Genetic Diversity

    PubMed Central

    Pluzhnikov, A.; Donnelly, P.

    1996-01-01

    Two commonly used measures of genetic diversity for intraspecies DNA sequence data are based, respectively, on the number of segregating sites, and on the average number of pairwise nucleotide differences. Expressions are derived for their variance in the presence of intragenic recombination for a panmictic population of fixed size that is at neutral equilibrium at the region sequenced. We show that, in contrast to the slow decrease in variance with increasing sample size, if the recombination rate is nonzero, the asymptotic rate of decrease of variance with increasing sequence length, for fixed sample size, is quite rapid. In particular, it is close to that which would be obtained by sequencing independent chromosome regions. The correlation between measures of diversity from linked regions is also examined. For a given total number of bases sequenced in a particular region, optimal sequencing strategies are derived. These typically involve sequencing relatively few (three to 10) long copies of the region. Under optimal strategies, the variances of the two measures are very similar for most parameter values considered. Results concerning optimal sequencing strategies will be sensitive to gross departures from the underlying assumptions, such as population bottlenecks, selective sweeps, and substantial population substructure. PMID:8913765
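
    The two diversity measures named in this record, the number of segregating sites and the average number of pairwise nucleotide differences, can be computed from an alignment as in the minimal sketch below. It assumes equal-length, gap-free aligned sequences; the function names and the toy alignment are hypothetical and are not taken from the paper.

```python
from itertools import combinations

def segregating_sites(seqs):
    """Number of alignment columns in which at least two sequences differ."""
    return sum(1 for column in zip(*seqs) if len(set(column)) > 1)

def mean_pairwise_differences(seqs):
    """Average number of nucleotide differences over all pairs of sequences."""
    pairs = list(combinations(seqs, 2))
    total = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs)
    return total / len(pairs)

# Hypothetical toy alignment: four copies of an 8-base region.
aligned = ["ACGTACGT", "ACGTACGA", "ACCTACGA", "ACGTACGT"]
S = segregating_sites(aligned)
pi = mean_pairwise_differences(aligned)
```

    The sketch only computes the point estimates; the variance expressions under recombination that the paper derives are not reproduced here.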

  7. Next Generation Sequencing Reveals the Hidden Diversity of Zooplankton Assemblages

    PubMed Central

    Harmer, Rachel A.; Somerfield, Paul J.; Atkinson, Angus

    2013-01-01

    Background: Zooplankton play an important role in our oceans, in biogeochemical cycling and providing a food source for commercially important fish larvae. However, difficulties in correctly identifying zooplankton hinder our understanding of their roles in marine ecosystem functioning, and can prevent detection of long term changes in their community structure. The advent of massively parallel next generation sequencing technology allows DNA sequence data to be recovered directly from whole community samples. Here we assess the ability of such sequencing to quantify richness and diversity of a mixed zooplankton assemblage from a productive time series site in the Western English Channel. Methodology/Principal Findings: Plankton net hauls (200 µm) were taken at the Western Channel Observatory station L4 in September 2010 and January 2011. These samples were analysed by microscopy and metagenetic analysis of the 18S nuclear small subunit ribosomal RNA gene using the 454 pyrosequencing platform. Following quality control a total of 419,041 sequences were obtained for all samples. The sequences clustered into 205 operational taxonomic units (OTUs) using a 97% similarity cut-off. Allocation of taxonomy by comparison with the National Centre for Biotechnology Information database identified 135 OTUs to species level, 11 to genus level and 1 to order; <2.5% of sequences were classified as unknowns. By comparison a skilled microscopic analyst was able to routinely enumerate only 58 taxonomic groups. Conclusions: Metagenetics reveals a previously hidden taxonomic richness, especially for Copepoda and hard-to-identify meroplankton such as Bivalvia, Gastropoda and Polychaeta. It also reveals rare species and parasites. We conclude that Next Generation Sequencing of 18S amplicons is a powerful tool for elucidating the true diversity and species richness of zooplankton communities. While this approach allows for broad diversity assessments of plankton it may become increasingly

  8. Phenolic acid esterases, coding sequences and methods

    DOEpatents

    Blum, David L.; Kataeva, Irina; Li, Xin-Liang; Ljungdahl, Lars G.

    2002-01-01

    Described herein are four phenolic acid esterases, three of which correspond to domains of previously unknown function within bacterial xylanases, from XynY and XynZ of Clostridium thermocellum and from a xylanase of Ruminococcus. The fourth specifically exemplified xylanase is a protein encoded within the genome of Orpinomyces PC-2. The amino acids of these polypeptides and nucleotide sequences encoding them are provided. Recombinant host cells, expression vectors and methods for the recombinant production of phenolic acid esterases are also provided.

  9. Method for identifying and quantifying nucleic acid sequence aberrations

    DOEpatents

    Lucas, J.N.; Straume, T.; Bogen, K.T.

    1998-07-21

    A method is disclosed for detecting nucleic acid sequence aberrations by detecting nucleic acid sequences having both a first and a second nucleic acid sequence type, the presence of the first and second sequence type on the same nucleic acid sequence indicating the presence of a nucleic acid sequence aberration. The method uses a first hybridization probe which includes a nucleic acid sequence that is complementary to a first sequence type and a first complexing agent capable of attaching to a second complexing agent and a second hybridization probe which includes a nucleic acid sequence that selectively hybridizes to the second nucleic acid sequence type over the first sequence type and includes a detectable marker for detecting the second hybridization probe. 11 figs.

  10. Method for identifying and quantifying nucleic acid sequence aberrations

    DOEpatents

    Lucas, Joe N.; Straume, Tore; Bogen, Kenneth T.

    1998-01-01

    A method for detecting nucleic acid sequence aberrations by detecting nucleic acid sequences having both a first and a second nucleic acid sequence type, the presence of the first and second sequence type on the same nucleic acid sequence indicating the presence of a nucleic acid sequence aberration. The method uses a first hybridization probe which includes a nucleic acid sequence that is complementary to a first sequence type and a first complexing agent capable of attaching to a second complexing agent and a second hybridization probe which includes a nucleic acid sequence that selectively hybridizes to the second nucleic acid sequence type over the first sequence type and includes a detectable marker for detecting the second hybridization probe.

  11. Dynamics of immunoglobulin sequence diversity in HIV-1 infected individuals

    PubMed Central

    Hoehn, Kenneth B.; Gall, Astrid; Bashford-Rogers, Rachael; Fidler, S. J.; Kaye, S.; Weber, J. N.; McClure, M. O.; Kellam, Paul; Pybus, Oliver G.

    2015-01-01

    Advances in immunoglobulin (Ig) sequencing technology are leading to new perspectives on immune system dynamics. Much research in this nascent field has focused on resolving immune responses to viral infection. However, the dynamics of B-cell diversity in early HIV infection, and in response to anti-retroviral therapy, are still poorly understood. Here, we investigate these dynamics through bulk Ig sequencing of samples collected over 2 years from a group of eight HIV-1 infected patients, five of whom received anti-retroviral therapy during the first half of the study period. We applied previously published methods for visualizing and quantifying B-cell sequence diversity, including the Gini index, and compared their efficacy to alternative measures. While we found significantly greater clonal structure in HIV-infected patients versus healthy controls, within HIV patients, we observed no significant relationships between statistics of B-cell clonal expansion and clinical variables such as viral load and CD4+ count. Although there are many potential explanations for this, we suggest that important factors include poor sampling resolution and complex B-cell dynamics that are difficult to summarize using simple summary statistics. Importantly, we find a significant association between observed Gini indices and sequencing read depth, and we conclude that more robust analytical methods and a closer integration of experimental and theoretical work is needed to further our understanding of B-cell repertoire diversity during viral infection. PMID:26194755
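
    The Gini index mentioned in this record summarizes how unevenly reads are distributed across clones. The sketch below uses the standard sorted-rank formula applied to a list of clone sizes; whether the authors computed it in exactly this way is an assumption, and the function name and example sizes are hypothetical.

```python
def gini_index(sizes):
    """Gini index of non-negative clone sizes.

    0 means all clones are the same size; values approaching 1 mean a few
    clones account for most of the reads.
    """
    values = sorted(sizes)
    n = len(values)
    total = sum(values)
    if n == 0 or total == 0:
        raise ValueError("need at least one clone with a positive size")
    # Sorted-rank formula: G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n
    weighted = sum(rank * x for rank, x in enumerate(values, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n

# Hypothetical repertoires: evenly sized clones versus one dominant clone.
even = gini_index([10, 10, 10, 10])
skewed = gini_index([1, 1, 1, 97])
```

    Because the abstract reports an association between Gini indices and sequencing read depth, values from samples sequenced to different depths should be compared with caution, for example after subsampling to a common depth.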

  12. Methods for analyzing nucleic acid sequences

    DOEpatents

    Korlach, Jonas; Webb, Watt W.; Levene, Michael; Turner, Stephen; Craighead, Harold G.; Foquet, Mathieu

    2011-05-17

    The present invention is directed to a method of sequencing a target nucleic acid. The method provides a complex comprising a polymerase enzyme, a target nucleic acid molecule, and a primer, wherein the complex is immobilized on a support. A fluorescent label is attached to a terminal phosphate group of the nucleotide or nucleotide analog. The growing nucleic acid strand is extended by using the polymerase to add a nucleotide analog to the nucleic acid strand. The nucleotide analog added to the oligonucleotide primer as a result of the polymerizing step is identified. The time duration of the signal from labeled nucleotides or nucleotide analogs that become incorporated is distinguished from freely diffusing labels by a longer retention in the observation volume for the nucleotides or nucleotide analogs that become incorporated than for the freely diffusing labels.

  13. Next generation sequencing technologies: tool to study avian virus diversity.

    PubMed

    Kapgate, S S; Barbuddhe, S B; Kumanan, K

    2015-03-01

    Increased globalisation, climatic changes and the wildlife-livestock interface have led to the emergence of novel viral pathogens or zoonoses that have become a serious concern for avian, animal and human health. High biodiversity and bird migration facilitate the spread of pathogens and provide reservoirs for emerging infectious diseases. Current classical diagnostic methods, which are designed to be virus-specific or limited to a group of viral agents, hinder the identification of novel viruses or viral variants. Recently developed approaches of next-generation sequencing (NGS) provide culture-independent methods that are useful for understanding viral diversity and discovering novel viruses, thereby enabling better diagnosis and disease control. This review discusses the different possible steps of an NGS study utilizing sequence-independent amplification, high-throughput sequencing and bioinformatics approaches to identify novel avian viruses and their diversity. NGS has led to the identification of a wide range of new viruses such as picobirnavirus, picornavirus, orthoreovirus and avian gamma coronavirus associated with fulminating disease in guinea fowl and is also used in describing viral diversity among avian species. The review also briefly discusses areas of viral-host interaction and disease associated causalities with newly identified avian viruses.

  14. Code-Time Diversity for Direct Sequence Spread Spectrum Systems

    PubMed Central

    Hassan, A. Y.

    2014-01-01

    Time diversity is achieved in direct sequence spread spectrum by receiving different faded delayed copies of the transmitted symbols from different uncorrelated channel paths when the transmission signal bandwidth is greater than the coherence bandwidth of the channel. In this paper, a new time diversity scheme is proposed for spread spectrum systems. It is called code-time diversity. In this new scheme, N spreading codes are used to transmit one data symbol over N successive symbol intervals. The diversity order in the proposed scheme equals the number of spreading codes used, N, multiplied by the number of uncorrelated paths of the channel, L. The paper presents the transmitted signal model. Two demodulator structures are proposed based on the received signal models from Rayleigh flat and frequency selective fading channels. The probability of error in the proposed diversity scheme is also calculated for the same two fading channels. Finally, simulation results are presented and compared with those of maximal ratio combiner (MRC) and multiple-input and multiple-output (MIMO) systems. PMID:24982925

  15. Characterisation of the genetic diversity of Brucella by multilocus sequencing

    PubMed Central

    Whatmore, Adrian M; Perrett, Lorraine L; MacMillan, Alastair P

    2007-01-01

    Background Brucella species include economically important zoonotic pathogens that can infect a wide range of animals. There are currently six classically recognised species of Brucella although, as yet unnamed, isolates from various marine mammal species have been reported. In order to investigate genetic relationships within the group and identify potential diagnostic markers we have sequenced multiple genetic loci from a large sample of Brucella isolates representing the known diversity of the genus. Results Nine discrete genomic loci corresponding to 4,396 bp of sequence were examined from 160 Brucella isolates. By assigning each distinct allele at a locus an arbitrary numerical designation the population was found to represent 27 distinct sequence types (STs). Diversity at each locus ranged from 1.03–2.45% while overall genetic diversity equated to 1.5%. Most loci examined represent housekeeping gene loci and, in all but one case, the ratio of non-synonymous to synonymous change was substantially <1. Analysis of linkage equilibrium between loci indicated a strongly clonal overall population structure. Concatenated sequence data were used to construct an unrooted neighbour-joining tree representing the relationships between STs. This shows that four previously characterized classical Brucella species, B. abortus, B. melitensis, B. ovis and B. neotomae correspond to well-separated clusters. With the exception of biovar 5, B. suis isolates cluster together, although they form a more diverse group than other classical species with a number of distinct STs corresponding to the remaining four biovars. B. canis isolates are located on the same branch very closely related to, but distinguishable from, B. suis biovar 3 and 4 isolates. Marine mammal isolates represent a distinct, though rather weakly supported, cluster within which individual STs display one of three clear host preferences. Conclusion The sequence database provides a powerful dataset for addressing

  16. Fatty Acid Diversity is Not Associated with Neutral Genetic Diversity in Native Populations of the Biodiesel Plant Jatropha curcas L.

    PubMed

    Martínez-Díaz, Yesenia; González-Rodríguez, Antonio; Rico-Ponce, Héctor Rómulo; Rocha-Ramírez, Víctor; Ovando-Medina, Isidro; Espinosa-García, Francisco J

    2017-01-01

    Jatropha curcas L. (Euphorbiaceae) is a shrub native to Mexico and Central America, which produces seeds with a high oil content that can be converted to biodiesel. The genetic diversity of this plant has been widely studied, but it is not known whether the diversity of the seed oil chemical composition correlates with neutral genetic diversity. The total seed oil content and the diversity of profiles of fatty acids and phorbol esters were quantified, and the genetic diversity obtained from simple sequence repeats was analyzed in native populations of J. curcas in Mexico. Using the fatty acid profiles, a discriminant analysis recognized three groups of individuals according to geographical origin. Bayesian assignment analysis revealed two genetic groups, while the genetic structure of the populations could not be explained by isolation-by-distance. Genetic and fatty acid profile data were not correlated based on a Mantel test. Also, phorbol ester content and genetic diversity were not associated. Multiple linear regression analysis showed that total oil content was associated with altitude and seasonality of temperature. The content of unsaturated fatty acids was associated with altitude. Therefore, the cultivation planning of J. curcas should take into account chemical variation related to environmental factors.

  17. Diverse nucleotide compositions and sequence fluctuation in Rubisco protein genes

    NASA Astrophysics Data System (ADS)

    Holden, Todd; Dehipawala, S.; Cheung, E.; Bienaime, R.; Ye, J.; Tremberger, G., Jr.; Schneider, P.; Lieberman, D.; Cheung, T.

    2011-10-01

    The Rubisco protein-enzyme is arguably the most abundant protein on Earth. The biology dogma of transcription and translation necessitates the study of the Rubisco genes and Rubisco-like genes in various species. Stronger correlation of fractal dimension of the atomic number fluctuation along a DNA sequence with Shannon entropy has been observed in the studied Rubisco-like gene sequences, suggesting a more diverse evolutionary pressure and constraints in the Rubisco sequences. The strategy of using metal for structural stabilization appears to be an ancient mechanism, with data from the porphobilinogen deaminase gene in Capsaspora owczarzaki and Monosiga brevicollis. Using the chi-square distance probability, our analysis supports the conjecture that the more ancient Rubisco-like sequence in Microcystis aeruginosa would have experienced very different evolutionary pressure and biochemical constraint as compared to Bordetella bronchiseptica, the two microbes occupying either end of the correlation graph. Our exploratory study would indicate that high fractal dimension Rubisco sequence would support high carbon dioxide rate via the Michaelis-Menten coefficient; with implication for the control of the whooping cough pathogen Bordetella bronchiseptica, a microbe containing a high fractal dimension Rubisco-like sequence (2.07). Using the internal comparison of chi-square distance probability for 16S rRNA (~ E-22) versus radiation repair Rec-A gene (~ E-05) in high GC content Deinococcus radiodurans, our analysis supports the conjecture that high GC content microbes containing Rubisco-like sequence are likely to include an extra-terrestrial origin, relative to Deinococcus radiodurans. Similar photosynthesis process that could utilize host star radiation would not compete with radiation resistant process from the biology dogma perspective in environments such as Mars and exoplanets.

  18. 77 FR 65537 - Requirements for Patent Applications Containing Nucleotide Sequence and/or Amino Acid Sequence...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-10-29

    ... Amino Acid Sequence Disclosures ACTION: Proposed collection; comment request. SUMMARY: The United States....'' SUPPLEMENTARY INFORMATION: I. Abstract Patent applications that contain nucleotide and/or amino acid...

  19. An intimate link between antimicrobial peptide sequence diversity and binding to essential components of bacterial membranes.

    PubMed

    Schmitt, Paulina; Rosa, Rafael D; Destoumieux-Garzón, Delphine

    2016-05-01

    Antimicrobial peptides and proteins (AMPs) are widespread in the living kingdom. They are key effectors of defense reactions and mediators of competitions between organisms. They are often cationic and amphiphilic, which favors their interactions with the anionic membranes of microorganisms. Several AMP families do not directly alter membrane integrity but rather target conserved components of the bacterial membranes in a process that provides them with potent and specific antimicrobial activities. Thus, lipopolysaccharides (LPS), lipoteichoic acids (LTA) and the peptidoglycan precursor Lipid II are targeted by a broad series of AMPs. Studying the functional diversity of immune effectors tells us about the essential residues involved in AMP mechanism of action. Marine invertebrates have been found to produce a remarkable diversity of AMPs. Molluscan defensins and crustacean anti-LPS factors (ALF) are diverse in terms of amino acid sequence and show contrasted phenotypes in terms of antimicrobial activity. Their activity is directed essentially against Gram-positive or Gram-negative bacteria due to their specific interactions with Lipid II or Lipid A, respectively. Through those interesting examples, we discuss here how sequence diversity generated throughout evolution informs us on residues required for essential molecular interaction at the bacterial membranes and subsequent antibacterial activity. Through the analysis of molecular variants having lost antibacterial activity or shaped novel functions, we also discuss the molecular bases of functional divergence in AMPs. This article is part of a Special Issue entitled: Antimicrobial peptides edited by Karl Lohner and Kai Hilpert.

  20. Sequence diversity of mating-type genes in Phaeosphaeria avenaria.

    PubMed

    Ueng, Peter P; Dai, Qun; Cui, Kai-rong; Czembor, Paweł C; Cunfer, Barry M; Tsang, H; Arseniuk, Edward; Bergstrom, Gary C

    2003-05-01

    Phaeosphaeria avenaria, one of the causal agents of stagonospora leaf blotch diseases in cereals, is composed of two subspecies, P. avenaria f. sp. triticea (Pat) and P. avenaria f. sp. avenaria (Paa). The Pat subspecies was grouped into Pat1-Pat3, based on restriction fragment length polymorphism (RFLP) and ribosomal DNA (rDNA) internal transcribed spacer (ITS) sequences in previous studies. Mating-type genes and their potential use in phylogeny and molecular classification were studied by DNA hybridization and PCR amplification. The majority of Pat1 isolates reported to be homothallic and producing sexual reproduction structures on cultural media had only the MAT1-1 gene. Minor sequence variations were found in the conserved region of MAT1-1 gene in Pat1 isolates. However, both mating-type genes, MAT1-1 and MAT1-2, were identified in P. avenaria isolates represented by ATCC12277 from oats (Paa) and the Pat2 isolates from foxtail barley (Hordeum jubatum L.). Cluster analyses based on mating-type gene conserved regions revealed that cereal Phaeosphaeria is not phylogenetically closely related to other ascomycetes, including Mycosphaerella graminicola (anamorph Septoria tritici). The sequence diversity of mating-type genes in Pat and Paa supports our previous phylogenetic relationship and molecular classification based on RFLP fingerprinting and rDNA ITS sequences.

  1. Sequence diversity, reproductive isolation and species concepts in Saccharomyces.

    PubMed

    Liti, Gianni; Barton, David B H; Louis, Edward J

    2006-10-01

    Using the biological species definition, yeasts of the genus Saccharomyces sensu stricto comprise six species and one natural hybrid. Previous work has shown that reproductive isolation between the species is due primarily to sequence divergence acted upon by the mismatch repair system and not due to major gene differences or chromosomal rearrangements. Sequence divergence through mismatch repair has also been shown to cause partial reproductive isolation among populations within a species. We have surveyed sequence variation in populations of Saccharomyces sensu stricto yeasts and measured meiotic sterility in hybrids. This allows us to determine the divergence necessary to produce the reproductive isolation seen among species. Rather than a sharp transition from fertility to sterility, which may have been expected, we find a smooth monotonic relationship between diversity and reproductive isolation, even as far as the well-accepted designations of S. paradoxus and S. cerevisiae as distinct species. Furthermore, we show that one species of Saccharomyces--S. cariocanus--differs from a population of S. paradoxus by four translocations, but not by sequence. There is molecular evidence of recent introgression from S. cerevisiae into the European population of S. paradoxus, supporting the idea that in nature the boundary between these species is fuzzy.

  2. Detection of nucleic acid sequences by invader-directed cleavage

    DOEpatents

    Brow, Mary Ann D.; Hall, Jeff Steven Grotelueschen; Lyamichev, Victor; Olive, David Michael; Prudent, James Robert

    1999-01-01

    The present invention relates to means for the detection and characterization of nucleic acid sequences, as well as variations in nucleic acid sequences. The present invention also relates to methods for forming a nucleic acid cleavage structure on a target sequence and cleaving the nucleic acid cleavage structure in a site-specific manner. The 5' nuclease activity of a variety of enzymes is used to cleave the target-dependent cleavage structure, thereby indicating the presence of specific nucleic acid sequences or specific variations thereof. The present invention further relates to methods and devices for the separation of nucleic acid molecules based on charge.

  3. Los Alamos sequence analysis package for nucleic acids and proteins.

    PubMed Central

    Kanehisa, M I

    1982-01-01

    An interactive system for computer analysis of nucleic acid and protein sequences has been developed for the Los Alamos DNA Sequence Database. It provides a convenient way to search or verify various sequence features, e.g., restriction enzyme sites, protein coding frames, and properties of coded proteins. Further, the comprehensive analysis package on a large-scale database can be used for comparative studies on sequence and structural homologies in order to find unnoted information stored in nucleic acid sequences. PMID:6174934

  4. Phylogenetic diversity of insecticolous fusaria inferred from multilocus DNA sequence data and their molecular identification via FUSARIUM-ID and Fusarium MLST

    Technology Transfer Automated Retrieval System (TEKTRAN)

    We constructed several multilocus Deoxyribonucleic acid (DNA) sequence datasets to assess the phylogenetic diversity of insecticolous fusaria, especially focusing on those housed in the Agricultural Research Service Collection of Entomopathogenic Fungi (ARSEF), and to facilitate molecular identifica...

  5. Characterization of the Genomic Diversity of Norovirus in Linked Patients Using a Metagenomic Deep Sequencing Approach

    PubMed Central

    Nasheri, Neda; Petronella, Nicholas; Ronholm, Jennifer; Bidawid, Sabah; Corneau, Nathalie

    2017-01-01

    Norovirus (NoV) is the leading cause of gastroenteritis worldwide. A robust cell culture system does not exist for NoV and therefore detailed characterization of outbreak and sporadic strains relies on molecular techniques. In this study, we employed a metagenomic approach that uses non-specific amplification followed by next-generation sequencing to whole genome sequence NoV genomes directly from clinical samples obtained from 8 linked patients. Enough sequencing depth was obtained for each sample to use a de novo assembly of near-complete genome sequences. The resultant consensus sequences were then used to identify inter-host nucleotide variations that occur after direct transmission, analyze amino acid variations in the major capsid protein, and provide evidence of recombination events. The analysis of intra-host quasispecies diversity was possible due to high coverage-depth. We also observed a linear relationship between NoV viral load in the clinical sample and the number of sequence reads that could be attributed to NoV. The method demonstrated here has the potential for future use in whole genome sequence analyses of other RNA viruses isolated from clinical, environmental, and food specimens. PMID:28197136

  6. Hybridization and sequencing of nucleic acids using base pair mismatches

    DOEpatents

    Fodor, Stephen P. A.; Lipshutz, Robert J.; Huang, Xiaohua

    2001-01-01

    Devices and techniques for hybridization of nucleic acids and for determining the sequence of nucleic acids. Arrays of nucleic acids are formed by techniques, preferably high resolution, light-directed techniques. Positions of hybridization of a target nucleic acid are determined by, e.g., epifluorescence microscopy. Devices and techniques are proposed to determine the sequence of a target nucleic acid more efficiently and more quickly through such synthesis and detection techniques.

  7. Integration of Residue Attributes for Sequence Diversity Characterization of Terpenoid Enzymes

    PubMed Central

    Ikeda, Shun; Ono, Naoaki; Altaf-Ul-Amin, Md.; Kanaya, Shigehiko

    2014-01-01

    Progress in the “omics” fields such as genomics, transcriptomics, proteomics, and metabolomics has engendered a need for innovative analytical techniques to derive meaningful information from the ever increasing molecular data. KNApSAcK motorcycle DB is a popular database for enzymes related to secondary metabolic pathways in plants. One of the challenges in analyses of protein sequence data in such repositories is the standard notation of sequences as strings of alphabetical characters. This has created lack of a natural underlying metric that eases amenability to computation. In view of this requirement, we applied novel integration of selected biochemical and physical attributes of amino acids derived from the amino acid index and quantified in numerical scale, to examine diversity of peptide sequences of terpenoid synthases accumulated in KNApSAcK motorcycle DB. We initially generated a reduced amino acid index table. This is a set of biochemical and physical properties obtained by random forest feature selection of important indices from the amino acid index. Principal component analysis was then applied for characterization of enzymes involved in synthesis of terpenoids. The variance explained was increased by incorporation of residue attributes for analyses. PMID:24900985

  8. Diverse Levels of Sequence Selectivity and Catalytic Efficiency of Protein-Tyrosine Phosphatases

    PubMed Central

    Selner, Nicholas G.; Luechapanichkul, Rinrada; Chen, Xianwen; Neel, Benjamin G.; Zhang, Zhong-Yin; Knapp, Stefan; Bell, Charles E.; Pei, Dehua

    2014-01-01

    The sequence selectivity of 14 classical protein-tyrosine phosphatases (PTPs) (PTPRA, PTPRB, PTPRC, PTPRD, PTPRO, PTP1B, SHP-1, SHP-2, HePTP, PTP-PEST, TCPTP, PTPH1, PTPD1, and PTPD2) was systematically profiled by screening their catalytic domains against combinatorial peptide libraries. All of the PTPs exhibit similar preference for pY peptides rich in acidic amino acids and disfavor positively charged sequences, but differ vastly in their degrees of preference/disfavor. Some PTPs (PTP-PEST, SHP-1, and SHP-2) are highly selective for acidic over basic (or neutral) peptides (by >10^5-fold), whereas others (PTPRA and PTPRD) show little to no sequence selectivity. PTPs also have diverse intrinsic catalytic efficiencies (kcat/KM values against optimal substrates), which differ by >10^5-fold due to different kcat and/or KM values. Moreover, PTPs show little positional preference for the acidic residues relative to the pY residue. Mutation of Arg47 of PTP1B, which is located near the pY-1 and pY-2 residues of a bound substrate, decreased the enzymatic activity by 3–18-fold toward all pY substrates containing acidic residues anywhere within the pY-6 to pY+5 region. Similarly, mutation of Arg24, which is situated near the C-terminus of a bound substrate, adversely affected the kinetic activity of all acidic substrates. A co-crystal structure of PTP1B bound with a nephrin pY1193 peptide suggests that Arg24 engages in electrostatic interactions with acidic residues at the pY+1, pY+2, and likely other positions. These results suggest that long-range electrostatic interactions between positively charged residues near the PTP active site and acidic residues on pY substrates allow a PTP to bind acidic substrates with similar affinities and the varying levels of preference for acidic sequences by different PTPs are likely caused by the different electrostatic potentials near their active sites. The implications of the varying sequence selectivity and intrinsic catalytic

  9. The role of pitch and temporal diversity in the perception and production of musical sequences.

    PubMed

    Prince, Jon B; Pfordresher, Peter Q

    2012-10-01

    In two experiments we explored how the dimensions of pitch and time contribute to the perception and production of musical sequences. We tested how dimensional diversity (the number of unique categories in each dimension) affects how pitch and time combine. In Experiment 1, 18 musically trained participants rated the complexity of sequences varying only in their diversity in pitch or time; a separate group of 18 pianists reproduced these sequences after listening to them without practice. Overall, sequences with more diversity were perceived as more complex, but pitch diversity influenced ratings more strongly than temporal diversity. Further, although participants perceived sequences with high levels of pitch diversity as more complex, errors were more common in the sequences with higher diversity in time. Sequences in Experiment 2 exhibited diversity in both pitch and time; diversity levels were a subset of those tested in Experiment 1. Again diversity affected complexity ratings and errors, but there were no statistical interactions between dimensions. Nonetheless, pitch diversity was the primary factor in determining perceived complexity, and again temporal errors occurred more often than pitch errors. Additionally, diversity in one dimension influenced error rates in the other dimension in that both error types were more frequent relative to Experiment 1. These results suggest that although pitch and time do not interact directly, they are nevertheless not processed in an informationally encapsulated manner. The findings also align with a dimensional salience hypothesis, in which pitch is prioritised in the processing of typical Western musical sequences.

  10. Diversity and Activity of Alternative Nitrogenases in Sequenced Genomes and Coastal Environments

    PubMed Central

    McRose, Darcy L.; Zhang, Xinning; Kraepiel, Anne M. L.; Morel, François M. M.

    2017-01-01

    The nitrogenase enzyme, which catalyzes the reduction of N2 gas to NH4+, occurs as three separate isozymes that use Mo, Fe-only, or V. The majority of global nitrogen fixation is attributed to the more efficient ‘canonical’ Mo-nitrogenase, whereas Fe-only and V-(‘alternative’) nitrogenases are often considered ‘backup’ enzymes, used when Mo is limiting. Yet, the environmental distribution and diversity of alternative nitrogenases remain largely unknown. We searched for alternative nitrogenase genes in sequenced genomes and used PacBio sequencing to explore the diversity of canonical (nifD) and alternative (anfD and vnfD) nitrogenase amplicons in two coastal environments: the Florida Everglades and Sippewissett Marsh (MA). Genome-based searches identified an additional 25 species and 10 genera not previously known to encode alternative nitrogenases. Alternative nitrogenase amplicons were found in both Sippewissett Marsh and the Florida Everglades and their activity was further confirmed using newly developed isotopic techniques. Conserved amino acid sequences corresponding to cofactor ligands were also analyzed in anfD and vnfD amplicons, offering insight into environmental variants of these motifs. This study increases the number of available anfD and vnfD sequences ∼20-fold and allows for the first comparisons of environmental Mo-, Fe-only, and V-nitrogenase diversity. Our results suggest that alternative nitrogenases are maintained across a range of organisms and environments and that they can make important contributions to nitrogenase diversity and nitrogen fixation. PMID:28293220

  11. Methods and compositions for efficient nucleic acid sequencing

    DOEpatents

    Drmanac, Radoje

    2006-07-04

    Disclosed are novel methods and compositions for rapid and highly efficient nucleic acid sequencing based upon hybridization with two sets of small oligonucleotide probes of known sequences. Extremely large nucleic acid molecules, including chromosomes and non-amplified RNA, may be sequenced without prior cloning or subcloning steps. The methods of the invention also solve various current problems associated with sequencing technology such as, for example, high noise to signal ratios and difficult discrimination, attaching many nucleic acid fragments to a surface, preparing many, longer or more complex probes and labelling more species.

  12. Methods and compositions for efficient nucleic acid sequencing

    DOEpatents

    Drmanac, Radoje

    2002-01-01

    Disclosed are novel methods and compositions for rapid and highly efficient nucleic acid sequencing based upon hybridization with two sets of small oligonucleotide probes of known sequences. Extremely large nucleic acid molecules, including chromosomes and non-amplified RNA, may be sequenced without prior cloning or subcloning steps. The methods of the invention also solve various current problems associated with sequencing technology such as, for example, high noise to signal ratios and difficult discrimination, attaching many nucleic acid fragments to a surface, preparing many, longer or more complex probes and labelling more species.

  13. Kit for detecting nucleic acid sequences using competitive hybridization probes

    DOEpatents

    Lucas, Joe N.; Straume, Tore; Bogen, Kenneth T.

    2001-01-01

    A kit is provided for detecting a target nucleic acid sequence in a sample, the kit comprising: a first hybridization probe which includes a nucleic acid sequence that is sufficiently complementary to selectively hybridize to a first portion of the target sequence, the first hybridization probe including a first complexing agent for forming a binding pair with a second complexing agent; and a second hybridization probe which includes a nucleic acid sequence that is sufficiently complementary to selectively hybridize to a second portion of the target sequence to which the first hybridization probe does not selectively hybridize, the second hybridization probe including a detectable marker; a third hybridization probe which includes a nucleic acid sequence that is sufficiently complementary to selectively hybridize to a first portion of the target sequence, the third hybridization probe including the same detectable marker as the second hybridization probe; and a fourth hybridization probe which includes a nucleic acid sequence that is sufficiently complementary to selectively hybridize to a second portion of the target sequence to which the third hybridization probe does not selectively hybridize, the fourth hybridization probe including the first complexing agent for forming a binding pair with the second complexing agent; wherein the first and second hybridization probes are capable of simultaneously hybridizing to the target sequence and the third and fourth hybridization probes are capable of simultaneously hybridizing to the target sequence, the detectable marker is not present on the first or fourth hybridization probes and the first, second, third, and fourth hybridization probes each include a competitive nucleic acid sequence which is sufficiently complementary to a third portion of the target sequence that the competitive sequences of the first, second, third, and fourth hybridization probes compete with each other to hybridize to the third portion of the

  14. Complete Genome Sequences of 12 Species of Stable Defined Moderately Diverse Mouse Microbiota 2

    PubMed Central

    Uchimura, Yasuhiro; Wyss, Madeleine; Brugiroux, Sandrine; Limenitakis, Julien P.; Stecher, Bärbel; McCoy, Kathy D.

    2016-01-01

    We report here the complete genome sequences of 12 bacterial species of stable defined moderately diverse mouse microbiota 2 (sDMDMm2) used to colonize germ-free mice with defined microbes. Whole-genome sequencing of these species was performed using the PacBio sequencing platform yielding circularized genome sequences of all 12 species. PMID:27634994

  15. Analysis and Annotation of Nucleic Acid Sequence

    SciTech Connect

    States, David J.

    2004-07-28

    The aims of this project were to develop improved methods for computational genome annotation and to apply these methods to improve the annotation of genomic sequence data with a specific focus on human genome sequencing. The project resulted in a substantial body of published work. Notable contributions of this project were the identification of basecalling and lane tracking as error processes in genome sequencing and contributions to improved methods for these steps in genome sequencing. This technology improved the accuracy and throughput of genome sequence analysis. Probabilistic methods for physical map construction were developed. Improved methods for sequence alignment, alternative splicing analysis, promoter identification and NF kappa B response gene prediction were also developed.

  16. Solid phase sequencing of double-stranded nucleic acids

    DOEpatents

    Fu, Dong-Jing; Cantor, Charles R.; Koster, Hubert; Smith, Cassandra L.

    2002-01-01

    This invention relates to methods for detecting and sequencing of target double-stranded nucleic acid sequences, to nucleic acid probes and arrays of probes useful in these methods, and to kits and systems which contain these probes. Useful methods involve hybridizing the nucleic acids or nucleic acids which represent complementary or homologous sequences of the target to an array of nucleic acid probes. These probes comprise a single-stranded portion, an optional double-stranded portion and a variable sequence within the single-stranded portion. The molecular weights of the hybridized nucleic acids of the set can be determined by mass spectroscopy, and the sequence of the target determined from the molecular weights of the fragments. Nucleic acids whose sequences can be determined include nucleic acids in biological samples such as patient biopsies and environmental samples. Probes may be fixed to a solid support such as a hybridization chip to facilitate automated determination of molecular weights and identification of the target sequence.

  17. Dipeptide Sequence Determination: Analyzing Phenylthiohydantoin Amino Acids by HPLC

    NASA Astrophysics Data System (ADS)

    Barton, Janice S.; Tang, Chung-Fei; Reed, Steven S.

    2000-02-01

    Amino acid composition and sequence determination, important techniques for characterizing peptides and proteins, are essential for predicting conformation and studying sequence alignment. This experiment presents improved, fundamental methods of sequence analysis for an upper-division biochemistry laboratory. Working in pairs, students use the Edman reagent to prepare phenylthiohydantoin derivatives of amino acids for determination of the sequence of an unknown dipeptide. With a single HPLC technique, students identify both the N-terminal amino acid and the composition of the dipeptide. This method yields good precision of retention times and allows use of a broad range of amino acids as components of the dipeptide. Students learn fundamental principles and techniques of sequence analysis and HPLC.

  18. Novel alpha-conotoxins identified by gene sequencing from cone snails native to Hainan, and their sequence diversity.

    PubMed

    Luo, Sulan; Zhangsun, Dongting; Zhang, Ben; Quan, Yaru; Wu, Yong

    2006-11-01

    Conotoxins (CTX) from the venom of marine cone snails (genus Conus) represent large families of proteins, which show a similar precursor organization with a surprisingly conserved signal sequence of the precursor peptides, but highly diverse pharmacological activities. By using the conserved sequences found within the genes that encode the alpha-conotoxin precursors, a technique based on RT-PCR was used to identify, respectively, two novel peptides (LiC22, LeD2) from the two worm-hunting Conus species Conus lividus and Conus litteratus, and one novel peptide (TeA21) from the snail-hunting Conus species Conus textile, all native to Hainan in China. The three peptides share the cysteine pattern common to alpha-conotoxins of the alpha4/7 subfamily (CCX(4)CX(7)C, two disulfide bonds), which are competitive antagonists of nicotinic acetylcholine receptors (nAChRs). The cDNA of LiC22N encodes a precursor of 40 residues, including a propeptide of 19 residues and a mature peptide of 21 residues. The cDNA of LeD2N encodes a precursor of 41 residues, including a propeptide of 21 residues and a mature peptide of 16 residues with three additional Gly residues. The cDNA of TeA21N encodes a precursor of 38 residues, including a propeptide of 20 residues and a mature peptide of 17 residues with an additional residue Gly. The additional residue Gly of LeD2N and TeA21N is a prerequisite for the amidation of the preceding C-terminal Cys. All three sequences are processed at the common signal site -X-Arg- immediately before the mature peptide sequences. The properties of the alpha4/7 conotoxins known so far were discussed in detail. Phylogenetic analyses of the new conotoxins in the present study and the published homologues of alpha4/7 conotoxins from the other Conus species were performed systematically. Patterns of sequence divergence for the three regions of signal, proregion, and mature peptides, both nucleotide and residue substitutions at DNA and peptide levels, as well as Cys codon
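
    The cysteine framework CCX(4)CX(7)C quoted in the abstract above can be checked with a regular expression. The sketch below is illustrative only: it assumes each X is any residue other than Cys (an assumption of this example, not stated in the abstract), the helper name `loop_class` is hypothetical, and the two peptides are made-up toy strings, not the sequences of LiC22, LeD2 or TeA21.

```python
import re

# CC, four non-Cys residues, C, seven non-Cys residues, C.
# Treating X as "anything except C" is an assumption of this sketch.
ALPHA_4_7 = re.compile(r"CC[^C]{4}C[^C]{7}C")

# General form used to read off the two inter-cysteine loop lengths.
LOOPS = re.compile(r"CC([^C]+)C([^C]+)C")


def loop_class(peptide):
    """Return (loop1, loop2) lengths for a CC..C..C framework, or None."""
    match = LOOPS.search(peptide.upper())
    if match is None:
        return None
    return (len(match.group(1)), len(match.group(2)))


if __name__ == "__main__":
    toy_4_7 = "GCCSAPACALAHSALCG"  # hypothetical peptide with 4 and 7 residue loops
    toy_4_3 = "GCCSAPACAWAC"       # hypothetical peptide with 4 and 3 residue loops
    for name, seq in (("toy_4_7", toy_4_7), ("toy_4_3", toy_4_3)):
        print(name, loop_class(seq), bool(ALPHA_4_7.search(seq)))
```

    Reading off both loop lengths, rather than only testing for a match, makes it easy to sort peptides into subfamilies that share the CC..C..C framework but differ in loop size.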

  19. Amino acid sequence of mouse submaxillary gland renin.

    PubMed Central

    Misono, K S; Chang, J J; Inagami, T

    1982-01-01

    The complete amino acid sequences of the heavy chain and light chain of mouse submaxillary gland renin have been determined. The heavy chain consists of 288 amino acid residues having a Mr of 31,036 calculated from the sequence. The light chain contains 48 amino acid residues with a Mr of 5,458. The sequence of the heavy chain was determined by automated Edman degradations of the cyanogen bromide peptides and tryptic peptides generated after citraconylation, as well as other peptides generated therefrom. The sequence of the light chain was derived from sequence analyses of the peptides generated by cyanogen bromide cleavage or by digestion with Staphylococcus aureus protease. The sequences in the active site regions in renin containing two catalytically essential aspartyl residues 32 and 215 were found identical with those in pepsin, chymosin, and penicillopepsin. Comparison of the amino acid sequence of renin with that of porcine pepsin indicated a 42% sequence identity of the heavy chain with the amino-terminal and middle regions and a 46% identity of the light chain with the carboxyl-terminal region of the porcine pepsin sequence. Residues identical in renin and pepsin are distributed throughout the length of the molecules, suggesting a similarity in their overall structures. PMID:6812055

  20. Definition of the tempo of sequence diversity across an alignment and automatic identification of sequence motifs: Application to protein homologous families and superfamilies

    PubMed Central

    May, Alex C.W.

    2002-01-01

    It is often possible to identify sequence motifs that characterize a protein family in terms of its fold and/or function from aligned protein sequences. Such motifs can be used to search for new family members. Partitioning of sequence alignments into regions of similar amino acid variability is usually done by hand. Here, I present a completely automatic method for this purpose: one that is guaranteed to produce globally optimal solutions at all levels of partition granularity. The method is used to compare the tempo of sequence diversity across reliable three-dimensional (3D) structure-based alignments of 209 protein families (HOMSTRAD) and that for 69 superfamilies (CAMPASS). (The mean alignment lengths for HOMSTRAD and CAMPASS are very similar.) Surprisingly, the optimal segmentation distributions for the closely related proteins and distantly related ones are found to be very similar. Also, optimal segmentation identifies an unusual protein superfamily. Finally, protein 3D structure clues from the tempo of sequence diversity across alignments are examined. The method is general, and could be applied to any area of comparative biological sequence and 3D structure analysis where the constraint of the inherent linear organization of the data imposes an ordering on the set of objects to be clustered. PMID:12441381
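
    The abstract above does not say which algorithm delivers the globally optimal partition. One standard way to obtain such a guarantee for ordered data is dynamic programming over contiguous segments, sketched below as an assumption rather than as the author's method. The function name `segment`, the squared-deviation cost, and the per-column variability scores are all hypothetical choices for illustration.

```python
def segment(values, k):
    """Split `values` into k contiguous segments with minimal total
    within-segment squared deviation. Returns (cost, segment_start_indices)."""
    n = len(values)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and len(values)")

    # Prefix sums let us evaluate the cost of any segment in O(1).
    s1 = [0.0] * (n + 1)
    s2 = [0.0] * (n + 1)
    for i, v in enumerate(values):
        s1[i + 1] = s1[i] + v
        s2[i + 1] = s2[i] + v * v

    def sse(i, j):
        """Sum of squared deviations from the mean for values[i:j]."""
        total = s1[j] - s1[i]
        return (s2[j] - s2[i]) - total * total / (j - i)

    inf = float("inf")
    # best[p][j]: minimal cost of splitting values[:j] into p segments.
    best = [[inf] * (n + 1) for _ in range(k + 1)]
    back = [[0] * (n + 1) for _ in range(k + 1)]
    best[0][0] = 0.0
    for p in range(1, k + 1):
        for j in range(p, n + 1):
            for i in range(p - 1, j):
                cost = best[p - 1][i] + sse(i, j)
                if cost < best[p][j]:
                    best[p][j] = cost
                    back[p][j] = i

    # Walk the back-pointers to recover where each segment starts.
    starts = []
    j = n
    for p in range(k, 0, -1):
        j = back[p][j]
        starts.append(j)
    starts.reverse()
    return best[k][n], starts


if __name__ == "__main__":
    variability = [1, 1, 1, 5, 5, 2, 2, 2]  # hypothetical per-column scores
    print(segment(variability, 3))
```

    Because the table `best` is filled for every segment count up to k, a single run also yields the optimal cost at each coarser level of granularity, which is in the spirit of the "all levels of partition granularity" claim.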

  1. Penicillium arizonense, a new, genome sequenced fungal species, reveals a high chemical diversity in secreted metabolites

    PubMed Central

    Grijseels, Sietske; Nielsen, Jens Christian; Randelovic, Milica; Nielsen, Jens; Nielsen, Kristian Fog; Workman, Mhairi; Frisvad, Jens Christian

    2016-01-01

    A new soil-borne species belonging to the Penicillium section Canescentia is described, Penicillium arizonense sp. nov. (type strain CBS 141311T = IBT 12289T). The genome was sequenced and assembled into 33.7 Mb containing 12,502 predicted genes. A phylogenetic assessment based on marker genes confirmed the grouping of P. arizonense within section Canescentia. Compared to related species, P. arizonense proved to encode a high number of proteins involved in carbohydrate metabolism, in particular hemicellulases. Mining the genome for genes involved in secondary metabolite biosynthesis resulted in the identification of 62 putative biosynthetic gene clusters. Extracts of P. arizonense were analysed for secondary metabolites and austalides, pyripyropenes, tryptoquivalines, fumagillin, pseurotin A, curvulinic acid and xanthoepocin were detected. A comparative analysis against known pathways enabled the proposal of biosynthetic gene clusters in P. arizonense responsible for the synthesis of all detected compounds except curvulinic acid. The capacity to produce biomass degrading enzymes and the identification of a high chemical diversity in secreted bioactive secondary metabolites, offers a broad range of potential industrial applications for the new species P. arizonense. The description and availability of the genome sequence of P. arizonense, further provides the basis for biotechnological exploitation of this species. PMID:27739446

  2. Sequence and diversity of DRB genes of Aotus nancymaae, a primate model for human malaria parasites.

    PubMed

    Nino-Vasquez, J J; Vogel, D; Rodriguez, R; Moreno, A; Patarroyo, M E; Pluschke, G; Daubenberger, C A

    2000-03-01

    The New World primate Aotus nancymaae is susceptible to infection with the human malaria parasites Plasmodium falciparum and Plasmodium vivax and has therefore been recommended by the World Health Organization as a model for evaluation of malaria vaccine candidates. We present here a first step in the molecular characterization of the major histocompatibility complex (MHC) class II DRB genes of Aotus nancymaae (owl monkey or night monkey) by nucleotide sequence analysis of the polymorphic exon 2 segments. In a group of 15 nonrelated animals captured in the wild, 34 MHC DRB alleles could be identified. Six allelic lineages were detected, two of them having human counterparts, while two other lineages have not been described in any other New World monkey species studied. As in the common marmoset, the diversity of DRB alleles appears to have arisen largely by point mutations in the beta-pleated sheets and by frequent exchange of fixed sequence motifs in the alpha-helical portion. Pairs of alleles differing only at amino acid position b86 by an exchange of valine to glycine are present in Aotus, as in humans. Essential amino acid residues contributing to MHC DR peptide binding pockets number 1 and 4 are conserved or semiconserved between HLA-DR and Aona-DRB molecules, indicating a capacity to bind similar peptide repertoires. These results fully support the use of Aotus monkeys as an animal model for evaluation of future subunit vaccine candidates.

  3. Sequence Diversity in MIC6 Gene among Toxoplasma gondii Isolates from Different Hosts and Geographical Locations.

    PubMed

    Li, Zhong-Yuan; Song, Hui-Qun; Chen, Jia; Zhu, Xing-Quan

    2015-06-01

    Toxoplasma gondii is an opportunistic protozoan parasite with a worldwide distribution that can infect almost all warm-blooded animals, including humans. Micronemes play an important role in the invasion process of T. gondii and are associated with attachment, motility, and host cell recognition. In this research, sequence diversity in the microneme protein 6 (MIC6) gene among 16 T. gondii isolates from different hosts and geographical regions and 1 reference strain was examined. The results showed that the sequence of all the examined T. gondii strains was 1,050 bp in length, and their A + T content was between 45.7% and 46.1%. Sequence analysis presented 33 nucleotide mutation positions (0-1.1%), resulting in 23 amino acid substitutions (0-2.3%) aligned with T. gondii RH strain. Moreover, T. gondii strains representing the 3 classical genotypes (Type I, II, and III) were separated into different clusters based on the locus of MIC6 using phylogenetic analyses by Bayesian inference (BI), maximum parsimony (MP), and maximum likelihood (ML), but T. gondii strains belonging to ToxoDB #9 were separated into different clusters. Our results suggested that the MIC6 gene is not a suitable marker for T. gondii population genetic studies.
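
    The two summary statistics reported in the abstract above, A + T content and percent sequence difference against a reference, are simple to compute once sequences are aligned. The following is a minimal sketch assuming gap-free, equal-length alignments; the function names and the toy sequences are hypothetical and are not MIC6 data.

```python
def at_content(sequence):
    """Percentage of positions that are A or T."""
    seq = sequence.upper()
    return 100.0 * (seq.count("A") + seq.count("T")) / len(seq)


def percent_difference(reference, other):
    """Percentage of aligned positions at which two sequences differ.

    Assumes the sequences are already aligned and of equal length.
    """
    if len(reference) != len(other):
        raise ValueError("sequences must be aligned to the same length")
    differences = sum(a != b for a, b in zip(reference.upper(), other.upper()))
    return 100.0 * differences / len(reference)


if __name__ == "__main__":
    reference = "ATGCATGCAT"  # hypothetical reference fragment
    isolate = "ATGCATGCAA"    # hypothetical isolate with one substitution
    print(at_content(reference))
    print(percent_difference(reference, isolate))
```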

  4. Amino Acid Sequence of Human Cholinesterase

    DTIC Science & Technology

    1985-10-01

    liquid chromatography (HPLC). Activity testing of the aged, DFP-labeled cholinesterase showed that 99.8% of the active sites had been labeled, since...acids were quantitated by ninhydrin at the AAA Labs, or by derivatization with phenylisothiocyanate at the University of Michigan. The latter method

  5. Cystatin. Amino acid sequence and possible secondary structure.

    PubMed Central

    Schwabe, C; Anastasi, A; Crow, H; McDonald, J K; Barrett, A J

    1984-01-01

    The amino acid sequence of cystatin, the protein from chicken egg-white that is a tight-binding inhibitor of many cysteine proteinases, is reported. Cystatin is composed of 116 amino acid residues, and the Mr is calculated to be 13 143. No striking similarity to any other known sequence has been detected. The results of computer analysis of the sequence and c.d. spectrometry indicate that the secondary structure includes relatively little alpha-helix (about 20%) and that the remainder is mainly beta-structure. PMID:6712597

  6. Draft Genome Sequences of Nine Cyanobacterial Strains from Diverse Habitats

    PubMed Central

    Zhu, Tao; Hou, Shengwei

    2017-01-01

    Here, we report the annotated draft genome sequences of nine different cyanobacteria, which were originally collected from different habitats, including hot springs, terrestrial, freshwater, and marine environments, and cover four of the five morphological subsections of cyanobacteria. PMID:28254973

  7. Draft Genome Sequences of Nine Cyanobacterial Strains from Diverse Habitats.

    PubMed

    Zhu, Tao; Hou, Shengwei; Lu, Xuefeng; Hess, Wolfgang R

    2017-03-02

    Here, we report the annotated draft genome sequences of nine different cyanobacteria, which were originally collected from different habitats, including hot springs, terrestrial, freshwater, and marine environments, and cover four of the five morphological subsections of cyanobacteria.

  8. Mouse Vk gene classification by nucleic acid sequence similarity.

    PubMed

    Strohal, R; Helmberg, A; Kroemer, G; Kofler, R

    1989-01-01

    Analyses of immunoglobulin (Ig) variable (V) region gene usage in the immune response, estimates of V gene germline complexity, and other nucleic acid hybridization-based studies depend on the extent to which such genes are related (i.e., sequence similarity) and their organization in gene families. While mouse Igh heavy chain V region (VH) gene families are relatively well-established, a corresponding systematic classification of Igk light chain V region (Vk) genes has not been reported. The present analysis, in the course of which we reviewed the known extent of the Vk germline gene repertoire and Vk gene usage in a variety of responses to foreign and self antigens, provides a classification of mouse Vk genes in gene families composed of members with greater than 80% overall nucleic acid sequence similarity. This classification differed in several aspects from that of VH genes: only some Vk gene families were as clearly separated (by greater than 25% sequence dissimilarity) as typical VH gene families; most Vk gene families were closely related and, in several instances, members from different families were very similar (greater than 80%) over large sequence portions; frequently, classification by nucleic acid sequence similarity diverged from existing classifications based on amino-terminal protein sequence similarity. Our data have implications for Vk gene analyses by nucleic acid hybridization and describe potentially important differences in sequence organization between VH and Vk genes.
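
    Grouping genes into families by a similarity threshold, as in the abstract above, can be illustrated with single-linkage clustering. The abstract does not state the grouping procedure, so single linkage is an assumption of this sketch, as are the use of simple identity over pre-aligned equal-length sequences, the function names, and the toy sequences.

```python
def identity(a, b):
    """Fraction of identical positions between two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def group_families(sequences, threshold=0.8):
    """Single-linkage grouping: two genes share a family if a chain of
    pairwise identities greater than `threshold` connects them."""
    names = sorted(sequences)
    parent = {name: name for name in names}

    def find(name):
        # Union-find lookup with path halving.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for i, first in enumerate(names):
        for second in names[i + 1:]:
            if identity(sequences[first], sequences[second]) > threshold:
                parent[find(first)] = find(second)

    families = {}
    for name in names:
        families.setdefault(find(name), []).append(name)
    return sorted(families.values())


if __name__ == "__main__":
    toy_genes = {  # hypothetical aligned fragments, not real Vk genes
        "g1": "AAAAAAAAAA",
        "g2": "AAAAAAAAAT",
        "g3": "AAAAAAAATT",
        "g4": "CCCCCCCCCC",
    }
    print(group_families(toy_genes))
```

    In the toy data g1 and g3 are exactly 80% identical, so they are not linked directly, yet both join the same family through g2. This chaining effect is one way members of nominally different families can end up closely related, a pattern the abstract notes for several Vk families.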

  9. Self-sequencing of amino acids and origins of polyfunctional protocells

    NASA Technical Reports Server (NTRS)

    Fox, S. W.

    1984-01-01

    The role of proteins in the origin of living things is discussed. It has been experimentally established that amino acids can sequence themselves under simulated geological conditions with highly nonrandom products which accordingly contain diverse information. Multiple copies of each type of macromolecule are formed, resulting in greater power for any protoenzymic molecule than would accrue from a single copy of each type. Thermal proteins are readily incorporated into laboratory protocells. The experimental evidence for original polyfunctional protocells is discussed.

  10. Effects of Abiotic Factors on the Phylogenetic Diversity of Bacterial Communities in Acidic Thermal Springs

    PubMed Central

    Mathur, Jayanti; Bizzoco, Richard W.; Ellis, Dean G.; Lipson, David A.; Poole, Alexander W.; Levine, Richard; Kelley, Scott T.

    2007-01-01

    Acidic thermal springs offer ideal environments for studying processes underlying extremophile microbial diversity. We used a carefully designed comparative analysis of acidic thermal springs in Yellowstone National Park to determine how abiotic factors (chemistry and temperature) shape acidophile microbial communities. Small-subunit rRNA gene sequences were PCR amplified, cloned, and sequenced, by using evolutionarily conserved bacterium-specific primers, directly from environmental DNA extracted from Amphitheater Springs and Roaring Mountain sediment samples. Energy-dispersive X-ray spectroscopy, X-ray diffraction, and colorimetric assays were used to analyze sediment chemistry, while an optical emission spectrometer was used to evaluate water chemistry and electronic probes were used to measure the pH, temperature, and Eh of the spring waters. Phylogenetic-statistical analyses found exceptionally strong correlations between bacterial community composition and sediment mineral chemistry, followed by weaker but significant correlations with temperature gradients. For example, sulfur-rich sediment samples contained a high diversity of uncultured organisms related to Hydrogenobaculum spp., while iron-rich sediments were dominated by uncultured organisms related to a diverse array of gram-positive iron oxidizers. A detailed analysis of redox chemistry indicated that the available energy sources and electron acceptors were sufficient to support the metabolic potential of Hydrogenobaculum spp. and iron oxidizers, respectively. Principal-component analysis found that two factors explained 95% of the genetic diversity, with most of the variance attributable to mineral chemistry and a smaller fraction attributable to temperature. PMID:17220248

  11. Meteoritic Amino Acids: Diversity in Compositions Reflects Parent Body Histories

    PubMed Central

    2016-01-01

    The analysis of amino acids in meteorites dates back over 50 years; however, it is only in recent years that research has expanded beyond investigations of a narrow set of meteorite groups (exemplified by the Murchison meteorite) into meteorites of other types and classes. These new studies have shown a wide diversity in the abundance and distribution of amino acids across carbonaceous chondrite groups, highlighting the role of parent body processes and composition in the creation, preservation, or alteration of amino acids. Although most chiral amino acids are racemic in meteorites, the enantiomeric distribution of some amino acids, particularly of the nonprotein amino acid isovaline, has also been shown to vary both within certain meteorites and across carbonaceous meteorite groups. Large l-enantiomeric excesses of some extraterrestrial protein amino acids (up to ∼60%) have also been observed in rare cases and point to nonbiological enantiomeric enrichment processes prior to the emergence of life. In this Outlook, we review these recent meteoritic analyses, focusing on variations in abundance, structural distributions, and enantiomeric distributions of amino acids and discussing possible explanations for these observations and the potential for future work. PMID:27413780

  12. Sequence diversity of NanA manifests in distinct enzyme kinetics and inhibitor susceptibility

    NASA Astrophysics Data System (ADS)

    Xu, Zhongli; von Grafenstein, Susanne; Walther, Elisabeth; Fuchs, Julian E.; Liedl, Klaus R.; Sauerbrei, Andreas; Schmidtke, Michaela

    2016-04-01

    Streptococcus pneumoniae is the leading pathogen causing bacterial pneumonia and meningitis. Its surface-associated virulence factor neuraminidase A (NanA) promotes the bacterial colonization by removing the terminal sialyl residues from glycoconjugates on eukaryotic cell surface. The predominant role of NanA in the pathogenesis of pneumococci renders it an attractive target for therapeutic intervention. Despite the highly conserved activity of NanA, our alignment of the 11 NanAs revealed the evolutionary diversity of this enzyme. The amino acid substitutions we identified, particularly those in the lectin domain and in the insertion domain next to the catalytic centre triggered our special interest. We synthesised the representative NanAs and the mutagenized derivatives from E. coli for enzyme kinetics study and neuraminidase inhibitor susceptibility test. Via molecular docking we got a deeper insight into the differences between the two major variants of NanA and their influence on the ligand-target interactions. In addition, our molecular dynamics simulations revealed a prominent intrinsic flexibility of the linker between the active site and the insertion domain, which influences the inhibitor binding. Our findings for the first time associated the primary sequence diversity of NanA with the biochemical properties of the enzyme and with the inhibitory efficiency of neuraminidase inhibitors.

  13. Sequence diversity of NanA manifests in distinct enzyme kinetics and inhibitor susceptibility

    PubMed Central

    Xu, Zhongli; von Grafenstein, Susanne; Walther, Elisabeth; Fuchs, Julian E.; Liedl, Klaus R.; Sauerbrei, Andreas; Schmidtke, Michaela

    2016-01-01

    Streptococcus pneumoniae is the leading pathogen causing bacterial pneumonia and meningitis. Its surface-associated virulence factor neuraminidase A (NanA) promotes the bacterial colonization by removing the terminal sialyl residues from glycoconjugates on eukaryotic cell surface. The predominant role of NanA in the pathogenesis of pneumococci renders it an attractive target for therapeutic intervention. Despite the highly conserved activity of NanA, our alignment of the 11 NanAs revealed the evolutionary diversity of this enzyme. The amino acid substitutions we identified, particularly those in the lectin domain and in the insertion domain next to the catalytic centre triggered our special interest. We synthesised the representative NanAs and the mutagenized derivatives from E. coli for enzyme kinetics study and neuraminidase inhibitor susceptibility test. Via molecular docking we got a deeper insight into the differences between the two major variants of NanA and their influence on the ligand-target interactions. In addition, our molecular dynamics simulations revealed a prominent intrinsic flexibility of the linker between the active site and the insertion domain, which influences the inhibitor binding. Our findings for the first time associated the primary sequence diversity of NanA with the biochemical properties of the enzyme and with the inhibitory efficiency of neuraminidase inhibitors. PMID:27125351

  14. Multilocus sequence analysis of Streptomyces griseus isolates delineating intraspecific diversity in terms of both taxonomy and biosynthetic potential.

    PubMed

    Rong, Xiaoying; Liu, Ning; Ruan, Jisheng; Huang, Ying

    2010-08-01

    Systematics can provide a fundamental framework for understanding the relationships and diversification of organisms. Multilocus sequence analysis (MLSA) has shown great promise for an elaborate taxonomic grouping of streptomycete diversity. To evaluate the practical significance of MLSA as a valuable systematic tool for streptomycetes, we examined six endophytic Streptomyces griseus isolates and two S. griseus reference strains possessing obvious antagonistic activities and identical 16S rRNA gene sequences, using both housekeeping genes and secondary metabolic genes. All the eight strains contained PKS-I and NRPS genes, but not PKS-II genes, and showed similar diversity in both the MLSA phylogeny based on five housekeeping genes (atpD, gyrB, recA, rpoB and trpB) and fingerprinting of KS-AT genes. We also inferred a phylogeny based on concatenated amino acid sequences of representative KS-AT genes from the strains, which displayed a topology correlated well with those of housekeeping-gene MLSA and KS-AT fingerprinting. The good congruence observed between phylogenies based on the different datasets verified that the MLSA scheme provided robust resolution at intraspecific level and could predict the overall diversity of secondary metabolic potential within a Streptomyces species, despite somewhat of a discrepancy with antimicrobial data. It is therefore feasible to apply MLSA to dissecting natural diversity of streptomycetes for a better understanding of their evolution and ecology, as well as for facilitating their bioprospecting.

  15. Diversity of putative archaeal RNA viruses in metagenomic datasets of a yellowstone acidic hot spring.

    PubMed

    Wang, Hongming; Yu, Yongxin; Liu, Taigang; Pan, Yingjie; Yan, Shuling; Wang, Yongjie

    2015-01-01

    Two genomic fragments (5,662 and 1,269 nt in size, GenBank accession no. JQ756122 and JQ756123, respectively) of novel, positive-strand RNA viruses that infect archaea were first discovered in an acidic hot spring in Yellowstone National Park (Bolduc et al., 2012). To investigate the diversity of these newly identified putative archaeal RNA viruses, global metagenomic datasets were searched for sequences that were significantly similar to those of the viruses. A total of 3,757 associated reads were retrieved solely from the Yellowstone datasets and were used to assemble the genomes of the putative archaeal RNA viruses. Nine contigs with lengths ranging from 417 to 5,866 nt were obtained, 4 of which were longer than 2,200 nt; one contig was 204 nt longer than JQ756122, representing the longest genomic sequence of the putative archaeal RNA viruses. These contigs revealed more than 50% sequence similarity to JQ756122 or JQ756123 and may be partial or nearly complete genomes of novel genogroups or genotypes of the putative archaeal RNA viruses. Sequence and phylogenetic analyses indicated that the archaeal RNA viruses are genetically diverse, with at least 3 related viral lineages in the Yellowstone acidic hot spring environment.

  16. Amino acid sequences of proteins from Leptospira serovar pomona.

    PubMed

    Alves, S F; Lefebvre, R B; Probert, W

    2000-01-01

    This report describes partial amino acid sequences from three putative outer envelope proteins of Leptospira serovar pomona. To obtain internal fragments for protein sequencing, enzymatic and chemical digestion was performed. The enzyme clostripain was used to digest the 32 and 45 kDa proteins. In situ digestion of the 40 kDa protein was accomplished using cyanogen bromide. The 32 kDa protein generated two fragments, one of 21 kDa and another of 10 kDa that yielded five residues. A 24 kDa fragment that yielded nineteen amino acid residues was obtained from the 45 kDa protein. A 20 kDa fragment yielding a twenty-amino-acid sequence was obtained from the 40 kDa protein.

  17. Genome sequence and genetic diversity of European ash trees.

    PubMed

    Sollars, Elizabeth S A; Harper, Andrea L; Kelly, Laura J; Sambles, Christine M; Ramirez-Gonzalez, Ricardo H; Swarbreck, David; Kaithakottil, Gemy; Cooper, Endymion D; Uauy, Cristobal; Havlickova, Lenka; Worswick, Gemma; Studholme, David J; Zohren, Jasmin; Salmon, Deborah L; Clavijo, Bernardo J; Li, Yi; He, Zhesi; Fellgett, Alison; McKinney, Lea Vig; Nielsen, Lene Rostgaard; Douglas, Gerry C; Kjær, Erik Dahl; Downie, J Allan; Boshier, David; Lee, Steve; Clark, Jo; Grant, Murray; Bancroft, Ian; Caccamo, Mario; Buggs, Richard J A

    2017-01-12

    Ash trees (genus Fraxinus, family Oleaceae) are widespread throughout the Northern Hemisphere, but are being devastated in Europe by the fungus Hymenoscyphus fraxineus, causing ash dieback, and in North America by the herbivorous beetle Agrilus planipennis. Here we sequence the genome of a low-heterozygosity Fraxinus excelsior tree from Gloucestershire, UK, annotating 38,852 protein-coding genes of which 25% appear ash specific when compared with the genomes of ten other plant species. Analyses of paralogous genes suggest a whole-genome duplication shared with olive (Olea europaea, Oleaceae). We also re-sequence 37 F. excelsior trees from Europe, finding evidence for apparent long-term decline in effective population size. Using our reference sequence, we re-analyse association transcriptomic data, yielding improved markers for reduced susceptibility to ash dieback. Surveys of these markers in British populations suggest that reduced susceptibility to ash dieback may be more widespread in Great Britain than in Denmark. We also present evidence that susceptibility of trees to H. fraxineus is associated with their iridoid glycoside levels. This rapid, integrated, multidisciplinary research response to an emerging health threat in a non-model organism opens the way for mitigation of the epidemic.

  18. Simple sequence repeat diversity in diploid and tetraploid Coffea species.

    PubMed

    Moncada, Pilar; McCouch, Susan

    2004-06-01

    Thirty-four fluorescently labeled microsatellite markers were used to assess genetic diversity in a set of 30 Coffea accessions from the CENICAFE germplasm bank in Colombia. The plant material included one sample per accession of seven East African accessions representing five diploid species and 23 wild and cultivated tetraploid accessions of Coffea arabica from Africa, Indonesia, and South America. More allelic diversity was detected among the five diploid species than among the 23 tetraploid genotypes. The diploid species averaged 3.6 alleles/locus and had an average polymorphism information content (PIC) value of 0.6, whereas the wild tetraploids averaged 2.5 alleles/locus and had an average PIC value of 0.3 and the cultivated tetraploids (C. arabica cultivars) averaged 1.9 alleles/locus and had an average PIC value of 0.22. Fifty-five percent of the alleles found in the wild tetraploids were not shared with cultivated C. arabica genotypes, supporting the idea that the wild tetraploid ancestors from Ethiopia could be used productively as a source of novel genetic variation to expand the gene pool of elite C. arabica germplasm.
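
    The PIC values quoted above summarize how informative a marker is given its allele frequencies. The abstract does not say which formula was used; the sketch below assumes the commonly used definition PIC = 1 - sum(p_i^2) - sum over i<j of 2 p_i^2 p_j^2. The function name `pic` and the example frequencies are hypothetical and are not taken from the study.

```python
def pic(freqs):
    """Polymorphism information content for one locus.

    freqs: allele frequencies at the locus (assumed to sum to 1).
    Assumes the standard formula; the abstract does not state one.
    """
    n = len(freqs)
    homozygosity = sum(p * p for p in freqs)
    # Correction term summed over all unordered pairs of distinct alleles
    cross = sum(
        2 * freqs[i] ** 2 * freqs[j] ** 2
        for i in range(n)
        for j in range(i + 1, n)
    )
    return 1 - homozygosity - cross


# Hypothetical allele frequencies, for illustration only
print(pic([0.5, 0.5]))       # two equally frequent alleles
print(pic([0.4, 0.3, 0.3]))  # three alleles
```

    Under this definition a locus with a single allele has PIC 0, and PIC rises as alleles become more numerous and more evenly distributed, which matches the ordering reported for diploid, wild tetraploid, and cultivated tetraploid accessions.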

  19. Extensive amino acid sequence homologies between animal lectins

    SciTech Connect

    Paroutaud, P.; Levi, G.; Teichberg, V.I.; Strosberg, A.D.

    1987-09-01

    The authors have established the amino acid sequence of the β-D-galactoside binding lectin from the electric eel and the sequences of several peptides from a similar lectin isolated from human placenta. These sequences were compared with the published sequences of peptides derived from the β-D-galactoside binding lectin from human lung and with sequences deduced from cDNAs assigned to the β-D-galactoside binding lectins from chicken embryo skin and human hepatomas. Significant homologies were observed. One of the highly conserved regions that contains a tryptophan residue and two glutamic acid residues is probably part of the β-D-galactoside binding site, which, on the basis of spectroscopic studies of the electric eel lectin, is expected to contain such residues. The similarity of the hydropathy profiles and the predicted secondary structure of the lectins from chicken skin and electric eel, in spite of differences in their amino acid sequences, strongly suggests that these proteins have maintained structural homologies during evolution and together with the other β-D-galactoside binding lectins were derived from a common ancestor gene.

  20. Amino acid sequence of porcine spleen cathepsin D.

    PubMed Central

    Shewale, J G; Tang, J

    1984-01-01

    The amino acid sequence of porcine spleen cathepsin D heavy chain has been determined and, hence, the complete structure of this enzyme is now known. The sequence of heavy chain was constructed by aligning the structures of peptides generated by cyanogen bromide, trypsin, and endo-proteinase Lys C cleavages. The structure of the light chain has been published previously. The cathepsin D molecule contains 339 amino acid residues in two polypeptide chains: a 97-residue light chain and a 242-residue heavy chain, with a combined Mr of 36,779 (without carbohydrate). There are two carbohydrate units linked to asparagine residues 70 and 192. The disulfide bond arrangement in cathepsin D is probably similar to that of pepsin, because the positions of six half-cystine residues are conserved. The active site aspartyl residues, corresponding to aspartic acid-32 and -215 of pepsin, are located at residues 33 and 224 in the cathepsin D molecule. The amino acid sequence around these aspartyl residues is strongly conserved. Cathepsin D shows a strong homology with other acid proteases. When the sequences of cathepsin D, renin, and pepsin are aligned, 32.7% of the residues are identical. The homology is observed throughout the length of the molecules, indicating that three-dimensional structures of all three molecules are similar. PMID:6587385
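
    The identity figure above is a count of alignment columns in which every aligned sequence carries the same residue. A minimal sketch of that calculation follows. The function name `percent_identity`, the toy sequences, and the choice of the full alignment length as denominator are all assumptions; the abstract does not describe how the 32.7% was normalized.

```python
def percent_identity(aligned):
    """Percentage of alignment columns where all sequences agree.

    aligned: equal-length, pre-aligned sequences with '-' for gaps.
    Denominator is the alignment length (an assumption made here).
    """
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("aligned sequences must have equal length")
    identical = sum(
        1
        for column in zip(*aligned)
        if len(set(column)) == 1 and column[0] != "-"
    )
    return 100.0 * identical / length


# Toy three-way alignment, not real cathepsin D, renin, or pepsin data
print(percent_identity(["ACDE", "ACDF", "ACGE"]))
```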

  1. Exploring Genetic Diversity in Plants Using High-Throughput Sequencing Techniques

    PubMed Central

    Onda, Yoshihiko; Mochida, Keiichi

    2016-01-01

    Food security has emerged as an urgent concern because of the rising world population. To meet the food demands of the near future, it is required to improve the productivity of various crops, not just of staple food crops. The genetic diversity among plant populations in a given species allows the plants to adapt to various environmental conditions. Such diversity could therefore yield valuable traits that could overcome the food-security challenges. To explore genetic diversity comprehensively and to rapidly identify useful genes and/or alleles, advanced high-throughput sequencing techniques, also called next-generation sequencing (NGS) technologies, have been developed. These provide practical solutions to the challenges in crop genomics. Here, we review various sources of genetic diversity in plants, newly developed genetic diversity-mining tools synergized with NGS techniques, and related genetic approaches such as quantitative trait locus analysis and genome-wide association study. PMID:27499684

  2. Exploring Genetic Diversity in Plants Using High-Throughput Sequencing Techniques.

    PubMed

    Onda, Yoshihiko; Mochida, Keiichi

    2016-08-01

    Food security has emerged as an urgent concern because of the rising world population. To meet the food demands of the near future, it is required to improve the productivity of various crops, not just of staple food crops. The genetic diversity among plant populations in a given species allows the plants to adapt to various environmental conditions. Such diversity could therefore yield valuable traits that could overcome the food-security challenges. To explore genetic diversity comprehensively and to rapidly identify useful genes and/or alleles, advanced high-throughput sequencing techniques, also called next-generation sequencing (NGS) technologies, have been developed. These provide practical solutions to the challenges in crop genomics. Here, we review various sources of genetic diversity in plants, newly developed genetic diversity-mining tools synergized with NGS techniques, and related genetic approaches such as quantitative trait locus analysis and genome-wide association study.

  3. Repetitive sequences: the hidden diversity of heterochromatin in prochilodontid fish

    PubMed Central

    Terencio, Maria L.; Schneider, Carlos H.; Gross, Maria C.; do Carmo, Edson Junior; Nogaroto, Viviane; de Almeida, Mara Cristina; Artoni, Roberto Ferreira; Vicari, Marcelo R.; Feldberg, Eliana

    2015-01-01

    The structure and organization of repetitive elements in fish genomes are still relatively poorly understood, although most of these elements are believed to be located in heterochromatic regions. Repetitive elements are considered essential in evolutionary processes as hotspots for mutations and chromosomal rearrangements, among other functions – thus providing new genomic alternatives and regulatory sites for gene expression. The present study sought to characterize repetitive DNA sequences in the genomes of Semaprochilodus insignis (Jardine & Schomburgk, 1841) and Semaprochilodus taeniurus (Valenciennes, 1817) and identify regions of conserved syntenic blocks in this genome fraction of three species of Prochilodontidae (Semaprochilodus insignis, Semaprochilodus taeniurus, and Prochilodus lineatus (Valenciennes, 1836)) by cross-FISH using Cot-1 DNA (renaturation kinetics) probes. We found that the repetitive fractions of the genomes of Semaprochilodus insignis and Semaprochilodus taeniurus have significant amounts of conserved syntenic blocks in hybridization sites, but with low degrees of similarity between them and the genome of Prochilodus lineatus, especially in relation to B chromosomes. The cloning and sequencing of the repetitive genomic elements of Semaprochilodus insignis and Semaprochilodus taeniurus using Cot-1 DNA identified 48 fragments that displayed high similarity with repetitive sequences deposited in public DNA databases and classified as microsatellites, transposons, and retrotransposons. The repetitive fractions of the Semaprochilodus insignis and Semaprochilodus taeniurus genomes exhibited high degrees of conserved syntenic blocks in terms of both the structures and locations of hybridization sites, but a low degree of similarity with the syntenic blocks of the Prochilodus lineatus genome. Future comparative analyses of other Prochilodontidae species will be needed to advance our understanding of the organization and evolution of

  4. RNA editing generates cellular subsets with diverse sequence within populations

    PubMed Central

    Harjanto, Dewi; Papamarkou, Theodore; Oates, Chris J.; Rayon-Estrada, Violeta; Papavasiliou, F. Nina; Papavasiliou, Anastasia

    2016-01-01

    RNA editing is a mutational mechanism that specifically alters the nucleotide content in transcribed RNA. However, editing rates vary widely, and could result from equivalent editing amongst individual cells, or represent an average of variable editing within a population. Here we present a hierarchical Bayesian model that quantifies the variance of editing rates at specific sites using RNA-seq data from both single cells, and a cognate bulk sample to distinguish between these two possibilities. The model predicts high variance for specific edited sites in murine macrophages and dendritic cells, findings that we validated experimentally by using targeted amplification of specific editable transcripts from single cells. The model also predicts changes in variance in editing rates for specific sites in dendritic cells during the course of LPS stimulation. Our data demonstrate substantial variance in editing signatures amongst single cells, supporting the notion that RNA editing generates diversity within cellular populations. PMID:27418407

  5. [Sequence diversity of the 3' end genome for Zucchini yellow mosaic virus isolates].

    PubMed

    Chen, Jieyun; Chen, Jishuang; Hong, Jian

    2003-06-01

    The present study analyzed the 3' end sequence of the genome of nine mainland isolates of Zucchini yellow mosaic virus (ZYMV), including the coat protein (CP) gene and the 3' end untranslated region (UTR). The sequence data obtained were compared with previously reported sequences of 16 ZYMV isolates from other regions of the world. Similarity of the CP gene nucleic acid sequences was related to some degree to host origin and geographical distribution, but the relationship was not very clear. The CP amino acid sequences deduced from the nucleic acid sequences of the 25 ZYMV isolates showed higher similarity, and a clearer relationship to host origin than to geographical distribution. According to its variation, the amino acid sequence of ZYMV CP was divided into two parts: the "highly variable region", containing about 41 amino acids at its N terminus, and the "conserved region", comprising the CP core region and the C-terminal amino acids. Our results suggest that ZYMV variation tends toward rapid adaptation to ecological conditions, especially host interaction, through mutation of its genomic RNA.

  6. Characterization of an Insertion Sequence Element Associated with Genetically Diverse Plant Pathogenic Streptomyces spp.

    PubMed Central

    Healy, Frank G.; Bukhalid, Raghida A.; Loria, Rosemary

    1999-01-01

    Streptomycetes are common soil inhabitants, yet few described species are plant pathogens. While the pathogenicity mechanisms remain unclear, previous work identified a gene, nec1, which encodes a putative pathogenicity or virulence factor. nec1 and a neighboring transposase pseudogene, ORFtnp, are conserved among unrelated plant pathogens and absent from nonpathogens. The atypical GC content of nec1 suggests that it was acquired through horizontal transfer events. Our investigation of the genetic organization of regions adjacent to the 3′ end of nec1 in Streptomyces scabies 84.34 identified a new insertion sequence (IS) element, IS1629, with homology to other IS elements from prokaryotic animal pathogens. IS1629 is 1,462 bp with 26-bp terminal inverted repeats and encodes a putative 431-amino-acid (aa) transposase. Transposition of IS1629 generates a 10-bp target site duplication. A 77-nucleotide (nt) sequence encompassing the start codon and upstream region of the transposase was identified which could function in the posttranscriptional regulation of transposase synthesis. A functional copy of IS1629 from S. turgidiscabies 94.09 (Hi-C-13) was selected in the transposon trap pCZA126, through its insertion into the λ cI857 repressor. IS1629 is present in multiple copies in some S. scabies strains and is present in all S. acidiscabies and S. turgidiscabies strains examined. A second copy of IS1629 was identified between ORFtnp and nec1 in S. acidiscabies strains. The diversity of IS1629 hybridization profiles was greatest within S. scabies. IS1629 was absent from the 27 nonpathogenic Streptomyces strains tested. The genetic organization and nucleotide sequence of the nec1-IS1629 region was conserved and identical among representatives of S. acidiscabies and S. turgidiscabies. These findings support our current model for the unidirectional transfer of the ORFtnp-nec1-IS1629 locus from IS1629-containing S. scabies (type II) to S. acidiscabies and S. turgidiscabies

  7. Low diversity in the mitogenome of sperm whales revealed by next-generation sequencing.

    PubMed

    Alexander, Alana; Steel, Debbie; Slikas, Beth; Hoekzema, Kendra; Carraher, Colm; Parks, Matthew; Cronn, Richard; Baker, C Scott

    2013-01-01

    Large population sizes and global distributions generally associate with high mitochondrial DNA control region (CR) diversity. The sperm whale (Physeter macrocephalus) is an exception, showing low CR diversity relative to other cetaceans; however, diversity levels throughout the remainder of the sperm whale mitogenome are unknown. We sequenced 20 mitogenomes from 17 sperm whales representative of worldwide diversity using Next Generation Sequencing (NGS) technologies (Illumina GAIIx, Roche 454 GS Junior). Resequencing of three individuals with both NGS platforms and partial Sanger sequencing showed low discrepancy rates (454-Illumina: 0.0071%; Sanger-Illumina: 0.0034%; and Sanger-454: 0.0023%) confirming suitability of both NGS platforms for investigating low mitogenomic diversity. Using the 17 sperm whale mitogenomes in a phylogenetic reconstruction with 41 other species, including 11 new dolphin mitogenomes, we tested two hypotheses for the low CR diversity. First, the hypothesis that CR-specific constraints have reduced diversity solely in the CR was rejected as diversity was low throughout the mitogenome, not just in the CR (overall diversity π = 0.096%; protein-coding 3rd codon = 0.22%; CR = 0.35%), and CR phylogenetic signal was congruent with protein-coding regions. Second, the hypothesis that slow substitution rates reduced diversity throughout the sperm whale mitogenome was rejected as sperm whales had significantly higher rates of CR evolution and no evidence of slow coding region evolution relative to other cetaceans. The estimated time to most recent common ancestor for sperm whale mitogenomes was 72,800 to 137,400 years ago (95% highest probability density interval), consistent with previous hypotheses of a bottleneck or selective sweep as likely causes of low mitogenome diversity.

  8. Low Diversity in the Mitogenome of Sperm Whales Revealed by Next-Generation Sequencing

    PubMed Central

    Alexander, Alana; Steel, Debbie; Slikas, Beth; Hoekzema, Kendra; Carraher, Colm; Parks, Matthew; Cronn, Richard; Baker, C. Scott

    2013-01-01

    Large population sizes and global distributions generally associate with high mitochondrial DNA control region (CR) diversity. The sperm whale (Physeter macrocephalus) is an exception, showing low CR diversity relative to other cetaceans; however, diversity levels throughout the remainder of the sperm whale mitogenome are unknown. We sequenced 20 mitogenomes from 17 sperm whales representative of worldwide diversity using Next Generation Sequencing (NGS) technologies (Illumina GAIIx, Roche 454 GS Junior). Resequencing of three individuals with both NGS platforms and partial Sanger sequencing showed low discrepancy rates (454-Illumina: 0.0071%; Sanger-Illumina: 0.0034%; and Sanger-454: 0.0023%) confirming suitability of both NGS platforms for investigating low mitogenomic diversity. Using the 17 sperm whale mitogenomes in a phylogenetic reconstruction with 41 other species, including 11 new dolphin mitogenomes, we tested two hypotheses for the low CR diversity. First, the hypothesis that CR-specific constraints have reduced diversity solely in the CR was rejected as diversity was low throughout the mitogenome, not just in the CR (overall diversity π = 0.096%; protein-coding 3rd codon = 0.22%; CR = 0.35%), and CR phylogenetic signal was congruent with protein-coding regions. Second, the hypothesis that slow substitution rates reduced diversity throughout the sperm whale mitogenome was rejected as sperm whales had significantly higher rates of CR evolution and no evidence of slow coding region evolution relative to other cetaceans. The estimated time to most recent common ancestor for sperm whale mitogenomes was 72,800 to 137,400 years ago (95% highest probability density interval), consistent with previous hypotheses of a bottleneck or selective sweep as likely causes of low mitogenome diversity. PMID:23254394
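
    The diversity values reported above (π) are nucleotide diversities, that is, the average proportion of sites that differ between two sequences drawn from the sample. The sketch below shows the unweighted pairwise form of that statistic. The function name `nucleotide_diversity` and the toy sequences are hypothetical, and the study may have used haplotype-frequency weighting or a substitution-model correction that this simple version omits.

```python
from itertools import combinations


def nucleotide_diversity(seqs):
    """Mean pairwise proportion of differing sites (pi).

    seqs: equal-length aligned sequences, one per sampled individual.
    Gaps and ambiguity codes are not handled in this sketch.
    """
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to equal length")
    pairs = list(combinations(seqs, 2))
    # Total number of mismatched sites over all unordered pairs
    total_diffs = sum(
        sum(a != b for a, b in zip(s, t)) for s, t in pairs
    )
    return total_diffs / (len(pairs) * length)


# Toy alignment, not sperm whale data
print(nucleotide_diversity(["AAAA", "AAAT", "AAAA"]))
```

    Multiplying the result by 100 gives a percentage comparable in form to the values quoted in the abstract.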

  9. High mitochondrial sequence diversity in linguistic isolates of the Alps.

    PubMed Central

    Stenico, M.; Nigro, L.; Bertorelle, G.; Calafell, F.; Capitanio, M.; Corrain, C.; Barbujani, G.

    1996-01-01

    Segment I of the control region of mtDNA (360 bases) was sequenced in seven samples, each of 10 individuals inhabiting villages in the eastern Italian Alps (South Tyrol and Trentino). Three linguistic groups, German, Italian, and Ladin, were represented by two samples each; the seventh sample comes from an isolated group of German origin, the Mocheni, who are linguistically distinct and geographically separated from the bulk of the German speakers. Seventy-four polymorphic sites were identified, defining 63 different haplotypes. Mocheni and Ladin speakers tend to form two clusters in the evolutionary trees inferred from sequences. Analysis of molecular variance shows significant differentiation within samples, among them, and among linguistic groups. Genetic differences between the Ladins and the other groups are not much smaller than between Europeans and some Africans; variation is large within groups, as well, with the exception of only the Mocheni. In the evolutionary trees where the four alpine groups are compared with other European populations, Mocheni and especially Ladins appear as clear outliers. Romansch-speaking Swiss, who are linguistically related to Ladins, are not genetically similar to them, for this segment of DNA. Because the time elapsed since colonization of the Alps (≤ 12,000 years) is short in mutational terms, the only model accounting for the observed relationships between mtDNA variation and linguistic identity seems one in which a population ancestral to Ladin speakers was already differentiated long before the Alps were settled and the current linguistic affiliations were established. For the Mocheni, the results are consistent with a simpler episode of allele loss, from an original genetic pool common to the ancestors of the current German speakers. PMID:8940282

  10. Improving the coverage of the cyanobacterial phylum using diversity-driven genome sequencing.

    PubMed

    Shih, Patrick M; Wu, Dongying; Latifi, Amel; Axen, Seth D; Fewer, David P; Talla, Emmanuel; Calteau, Alexandra; Cai, Fei; Tandeau de Marsac, Nicole; Rippka, Rosmarie; Herdman, Michael; Sivonen, Kaarina; Coursin, Therese; Laurent, Thierry; Goodwin, Lynne; Nolan, Matt; Davenport, Karen W; Han, Cliff S; Rubin, Edward M; Eisen, Jonathan A; Woyke, Tanja; Gugger, Muriel; Kerfeld, Cheryl A

    2013-01-15

    The cyanobacterial phylum encompasses oxygenic photosynthetic prokaryotes of a great breadth of morphologies and ecologies; they play key roles in global carbon and nitrogen cycles. The chloroplasts of all photosynthetic eukaryotes can trace their ancestry to cyanobacteria. Cyanobacteria also attract considerable interest as platforms for "green" biotechnology and biofuels. To explore the molecular basis of their different phenotypes and biochemical capabilities, we sequenced the genomes of 54 phylogenetically and phenotypically diverse cyanobacterial strains. Comparison of cyanobacterial genomes reveals the molecular basis for many aspects of cyanobacterial ecophysiological diversity, as well as the convergence of complex morphologies without the acquisition of novel proteins. This phylum-wide study highlights the benefits of diversity-driven genome sequencing, identifying more than 21,000 cyanobacterial proteins with no detectable similarity to known proteins, and foregrounds the diversity of light-harvesting proteins and gene clusters for secondary metabolite biosynthesis. Additionally, our results provide insight into the distribution of genes of cyanobacterial origin in eukaryotic nuclear genomes. Moreover, this study doubles both the amount and the phylogenetic diversity of cyanobacterial genome sequence data. Given the exponentially growing number of sequenced genomes, this diversity-driven study demonstrates the perspective gained by comparing disparate yet related genomes in a phylum-wide context and the insights that are gained from it.

  11. Hill number as a bacterial diversity measure framework with high-throughput sequence data

    PubMed Central

    Kang, Sanghoon; Rodrigues, Jorge L. M.; Ng, Justin P.; Gentry, Terry J.

    2016-01-01

    Bacterial diversity is an important parameter for measuring bacterial contributions to the global ecosystem. However, even the task of describing bacterial diversity is challenging due to biological and technological difficulties. One of the challenges in bacterial diversity estimation is the appropriate measure of rare taxa, but the uncertainty of the size of the rare biosphere is yet to be experimentally determined. One approach is using the generalized diversity, Hill number (Na), to control the variability associated with rare taxa by differentially weighting them. Here, we investigated Hill number as a framework for microbial diversity measure using a taxa-accumulation curve (TAC) with soil bacterial community data from two distinct studies by 454 pyrosequencing. The reliable biodiversity estimation was obtained when an increase in Hill number arose as the coverage became stable in TACs for a ≥ 1. In silico analysis also indicated that a certain level of sampling depth was desirable for reliable biodiversity estimation. Thus, in order to attain bacterial diversity from second generation sequencing, Hill number can be a good diversity framework with given sequencing depth, that is, until technology is further advanced and able to overcome the under- and random-sampling issues of the current sequencing approaches. PMID:27901123
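
    The Hill number of order a is Na = (sum of p_i^a)^(1/(1-a)), where p_i is the relative abundance of taxon i; as a approaches 1 it tends to the exponential of the Shannon entropy. Larger values of a give less weight to rare taxa, which is the differential weighting the abstract refers to. The following is a minimal sketch of that calculation only. The function name `hill_number` and the example counts are hypothetical, and the taxa-accumulation curve analysis used in the study is not reproduced here.

```python
import math


def hill_number(counts, a):
    """Hill number (effective number of taxa) of order a.

    counts: per-taxon read counts; zero counts are ignored.
    """
    total = sum(counts)
    props = [c / total for c in counts if c > 0]
    if a == 1:
        # Limit as a -> 1: exponential of Shannon entropy
        return math.exp(-sum(p * math.log(p) for p in props))
    return sum(p ** a for p in props) ** (1.0 / (1.0 - a))


# Hypothetical community with one dominant and two rare taxa
counts = [90, 5, 5]
for order in (0, 1, 2):
    print(order, hill_number(counts, order))
```

    For a perfectly even community every order returns the number of taxa; for an uneven one the value falls as a increases, because rare taxa contribute progressively less.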

  12. Measuring the diversity of the human microbiota with targeted next-generation sequencing.

    PubMed

    Finotello, Francesca; Mastrorilli, Eleonora; Di Camillo, Barbara

    2016-12-26

    The human microbiota is a complex ecological community of commensal, symbiotic and pathogenic microorganisms harboured by the human body. Next-generation sequencing (NGS) technologies, in particular targeted amplicon sequencing of the 16S ribosomal RNA gene (16S-seq), are enabling the identification and quantification of human-resident microorganisms at unprecedented resolution, providing novel insights into the role of the microbiota in health and disease. Once microbial abundances are quantified through NGS data analysis, diversity indices provide valuable mathematical tools to describe the ecological complexity of a single sample or to detect species differences between samples. However, diversity is not a determined physical quantity for which a consensus definition and unit of measure have been established, and several diversity indices are currently available. Furthermore, they were originally developed for macroecology and their robustness to the possible bias introduced by sequencing has not been characterized so far. To assist the reader with the selection and interpretation of diversity measures, we review a panel of broadly used indices, describing their mathematical formulations, purposes and properties, and characterize their behaviour and criticalities in dependence of the data features using simulated data as ground truth. In addition, we make available an R package, DiversitySeq, which implements in a unified framework the full panel of diversity indices and a simulator of 16S-seq data, and thus represents a valuable resource for the analysis of diversity from NGS count data and for the benchmarking of computational methods for 16S-seq.

  13. Genotyping by sequencing reveals the genetic diversity of the USDA pisum diversity collection

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The USDA expanded Pisum Single Plant (PSP) core collection is a unique resource that represents the breadth of the genetic diversity of the genus in an inbred format that facilitates genetic study. The collection includes inbred accessions from the refined pea core collection, parent lines of USDA r...

  14. Genome diversity in Brachypodium distachyon: deep sequencing of highly diverse inbred lines

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Natural variation provides a powerful opportunity to study the genetic basis of biological traits. Brachypodium distachyon is a broadly distributed diploid model grass with a small genome and a large collection of diverse inbred lines. As a step towards understanding the genetic basis of the natura...

  15. Active site amino acid sequence of human factor D.

    PubMed

    Davis, A E

    1980-08-01

    Factor D was isolated from human plasma by chromatography on CM-Sephadex C50, Sephadex G-75, and hydroxylapatite. Digestion of reduced, S-carboxymethylated factor D with cyanogen bromide resulted in three peptides which were isolated by chromatography on Sephadex G-75 (superfine) equilibrated in 20% formic acid. NH2-Terminal sequences were determined by automated Edman degradation with a Beckman 890C sequencer using a 0.1 M Quadrol program. The smallest peptide (CNBr III) consisted of the NH2-terminal 14 amino acids. The other two peptides had molecular weights of 17,000 (CNBr I) and 7000 (CNBr II). Overlap of the NH2-terminal sequence of factor D with the NH2-terminal sequence of CNBr I established the order of the peptides. The NH2-terminal 53 residues of factor D are somewhat more homologous with the group-specific protease of rat intestine than with other serine proteases. The NH2-terminal sequence of CNBr II revealed the active site serine of factor D. The typical serine protease active site sequence (Gly-Asp-Ser-Gly-Gly-Pro) was found at residues 12-17. The region surrounding the active site serine does not appear to be more highly homologous with any one of the other serine proteases. The structural data obtained point out the similarities between factor D and the other proteases. However, complete definition of the degree of relationship between factor D and other proteases will require determination of the remainder of the primary structure.

  16. Sequence diversity in 36 candidate genes for cardiovascular disorders.

    PubMed Central

    Cambien, F; Poirier, O; Nicaud, V; Herrmann, S M; Mallet, C; Ricard, S; Behague, I; Hallet, V; Blanc, H; Loukaci, V; Thillet, J; Evans, A; Ruidavets, J B; Arveiler, D; Luc, G; Tiret, L

    1999-01-01

    Two strategies involving whole-genome association studies have been proposed for the identification of genes involved in complex diseases. The first one seeks to characterize all common variants of human genes and to test their association with disease. The second one seeks to develop dense maps of single-nucleotide polymorphisms (SNPs) and to detect susceptibility genes through linkage disequilibrium. We performed a molecular screening of the coding and/or flanking regions of 36 candidate genes for cardiovascular diseases. All polymorphisms identified by this screening were further genotyped in 750 subjects of European descent. In the whole set of genes, the lengths explored spanned 53.8 kb in the 5' regions, 68.4 kb in exonic regions, and 13 kb in the 3' regions. The strength of linkage disequilibrium within candidate regions suggests that genomewide maps of SNPs might be efficient ways to identify new disease-susceptibility genes, provided that the maps are sufficiently dense. However, the relatively large number of polymorphisms within coding and regulatory regions of candidate genes raises the possibility that several of them might be functional and that the pattern of genotype-phenotype association might be more complex than initially envisaged, as actually has been observed in some well-characterized genes. These results argue in favor of both genomewide association studies and detailed studies of the overall sequence variation of candidate genes, as complementary approaches. PMID:10364531

  17. The amino acid sequence of iguana (Iguana iguana) pancreatic ribonuclease.

    PubMed

    Zhao, W; Beintema, J J; Hofsteenge, J

    1994-01-15

    The pyrimidine-specific ribonuclease superfamily constitutes a group of homologous proteins so far found only in higher vertebrates. Four separate families are found in mammals, which have resulted from gene duplications in mammalian ancestors. To learn more about the evolutionary history of this superfamily, the primary structure and other characteristics of the pancreatic enzyme from iguana (Iguana iguana), a herbivorous lizard species belonging to the reptiles, have been determined. The polypeptide chain consists of 119 amino acid residues. The positions of insertions and deletions in the sequence are identical to those in the enzyme from snapping turtle. However, the two enzymes differ at 54% of the amino acid positions. Iguana ribonuclease contains no carbohydrate, although the enzyme possesses three recognition sites for carbohydrate attachment, and has a high number of acidic residues in a localized part of the sequence.

  18. Sequence diversity and novelty of natural assemblages of picoeukaryotes from the Indian Ocean

    PubMed Central

    Massana, Ramon; Pernice, Massimo; Bunge, John A; Campo, Javier del

    2011-01-01

    Despite the ecological importance of marine pico-size eukaryotes, the study of their in situ diversity using molecular tools started just a few years ago. These studies have revealed that marine picoeukaryotes are very diverse and include many novel taxa. However, the amount and structure of their phylogenetic diversity and the extent of their sequence novelty still remain poorly known, as a systematic analysis has seldom been attempted. In this study, we use a coherent and carefully curated data set of 500 published 18S ribosomal DNA sequences to quantify the diversity and novelty patterns of picoeukaryotes in the Indian Ocean. Our phylogenetic tree showed many distant lineages. We grouped sequences in OTUs (operational taxonomic units) at discrete values delineated by pair-wise Jukes–Cantor (JC) distances and tree patristic distances. At a distance of 0.01, the number of OTUs observed (237/242; using JC or patristic distances, respectively) was half the number of sequences analyzed, indicating the existence of microdiverse clusters of highly related sequences. At this distance level, we estimated 600–800 OTUs using several statistical methods. The number of OTUs observed was still substantial at higher distances (39/82 at 0.20 distance) suggesting a large diversity at high-taxonomic ranks. Most sequences were related to marine clones from other sites and many were distant to cultured organisms, highlighting the huge culturing gap within protists. The novelty analysis indicated the putative presence of pseudogenes and of truly novel high-rank phylogenetic lineages. The identified diversity and novelty patterns among marine picoeukaryotes are of great importance for understanding and interpreting their ecology and evolution. PMID:20631807
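
    As an illustration of the pair-wise Jukes–Cantor distance mentioned in this record, the following minimal sketch applies the standard JC correction, d = -(3/4) ln(1 - (4/3) p), where p is the proportion of differing sites between two aligned sequences. The function name and the handling of gaps and ambiguous bases are assumptions made for this example; it is not the authors' pipeline.

```python
import math


def jukes_cantor_distance(seq_a: str, seq_b: str) -> float:
    """Jukes-Cantor distance between two aligned nucleotide sequences.

    Hypothetical helper for illustration: positions where either sequence
    has a gap or a non-ACGT symbol are skipped (an assumption of this sketch).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    # Keep only columns where both sequences carry an unambiguous base.
    pairs = [
        (a, b)
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a in "ACGT" and b in "ACGT"
    ]
    if not pairs:
        raise ValueError("no comparable positions")
    # p: observed proportion of mismatching sites.
    p = sum(a != b for a, b in pairs) / len(pairs)
    if p >= 0.75:
        # The correction is undefined at saturation (p >= 3/4).
        return math.inf
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)
```

    Two sequences falling within a chosen threshold (for example 0.01) would be candidates for the same OTU; how pairs are then grouped into clusters is a separate choice that this sketch does not cover.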

  19. Microbial diversity and metabolic networks in acid mine drainage habitats

    PubMed Central

    Méndez-García, Celia; Peláez, Ana I.; Mesa, Victoria; Sánchez, Jesús; Golyshina, Olga V.; Ferrer, Manuel

    2015-01-01

    Acid mine drainage (AMD) emplacements are low-complexity natural systems. Low-pH conditions appear to be the main factor underlying the limited diversity of the microbial populations thriving in these environments, although temperature, ionic composition, total organic carbon, and dissolved oxygen are also considered to significantly influence their microbial life. This natural reduction in diversity driven by extreme conditions was reflected in several studies on the microbial populations inhabiting the various micro-environments present in such ecosystems. Early studies based on the physiology of the autochthonous microbiota and the growing success of omics-based methodologies have enabled a better understanding of microbial ecology and function in low-pH mine outflows; however, complementary omics-derived data should be included to completely describe their microbial ecology. Furthermore, recent updates on the distribution of eukaryotes and archaea recovered through sterile filtering (herein referred to as filterable fraction) in these environments demand their inclusion in the microbial characterization of AMD systems. In this review, we present a complete overview of the bacterial, archaeal (including filterable fraction), and eukaryotic diversity in these ecosystems, and include a thorough depiction of the metabolism and element cycling in AMD habitats. We also review different metabolic network structures at the organismal level, which is necessary to disentangle the role of each member of the AMD communities described thus far. PMID:26074887

  20. Amino acid sequence of bovine heart coupling factor 6.

    PubMed Central

    Fang, J K; Jacobs, J W; Kanner, B I; Racker, E; Bradshaw, R A

    1984-01-01

    The amino acid sequence of bovine heart mitochondrial coupling factor 6 (F6) has been determined by automated Edman degradation of the whole protein and derived peptides. Preparations based on heat precipitation and ethanol extraction showed allotypic variation at three positions while material further purified by HPLC yielded only one sequence that also differed by a Phe-Thr replacement at residue 62. The mature protein contains 76 amino acids with a calculated molecular weight of 9006 and a pI of approximately 5, in good agreement with experimentally measured values. The charged amino acids are mainly clustered at the termini and in one section in the middle; these three polar segments are separated by two segments relatively rich in nonpolar residues. Chou-Fasman analysis suggests three stretches of alpha-helix coinciding with (or within) the high-charge-density sequences with a single beta-turn at the first polar-nonpolar junction. Comparison of the F6 sequence with those of other proteins did not reveal any homologous structures. PMID:6149548

  1. Amino acid sequence and comparative antigenicity of chicken metallothionein.

    PubMed Central

    McCormick, C C; Fullmer, C S; Garvey, J S

    1988-01-01

    The complete amino acid sequence of metallothionein (MT) from chicken liver is reported. The primary structure was determined by automated sequence analysis of peptides produced by limited acid hydrolysis and by trypsin digestion. The comparative antigenicity of chicken MT was determined by radioimmunoassay using rabbit anti-rat MT polyclonal antibody. Chicken MT consists of 63 amino acids as compared to 61 found in MTs from mammals. One insertion (and two substitutions) occurs in the amino-terminal region, a region considered invariant among mammalian MTs. Eighteen of the 20 cysteines in chicken MT were aligned with cysteines from other mammalian sequences. Two cysteines near the carboxyl terminus are shifted by one residue due to the insertion of proline in that region. Overall, the chicken protein showed approximately 68% sequence identity in a comparison with various mammalian MTs. The affinity of the polyclonal antibody for chicken MT was decreased by 2 orders of magnitude in comparison to that of a mammalian MT (rat MT isoforms). This reduced affinity is attributed to major substitutions in chicken MT in the regions of the principal determinants of mammalian MTs. Theoretical analysis of the primary structure predicted the secondary structure to consist of reverse turns and random coils with no stable beta or helix conformations. There is no evidence that chicken MT differs functionally from mammalian MTs. PMID:2448773

  2. Exploring the environmental diversity of kinetoplastid flagellates in the high-throughput DNA sequencing era

    PubMed Central

    d’Avila-Levy, Claudia Masini; Boucinha, Carolina; Kostygov, Alexei; Santos, Helena Lúcia Carneiro; Morelli, Karina Alessandra; Grybchuk-Ieremenko, Anastasiia; Duval, Linda; Votýpka, Jan; Yurchenko, Vyacheslav; Grellier, Philippe; Lukeš, Julius

    2015-01-01

    The class Kinetoplastea encompasses both free-living and parasitic species from a wide range of hosts. Several representatives of this group are responsible for severe human diseases and for economic losses in agriculture and livestock. While this group encompasses over 30 genera, most of the available information has been derived from the vertebrate pathogenic genera Leishmania and Trypanosoma. Recent studies of the previously neglected groups of Kinetoplastea indicated that the actual diversity is much higher than previously thought. This article discusses the known segment of kinetoplastid diversity and how gene-directed Sanger sequencing and next-generation sequencing methods can help to deepen our knowledge of these interesting protists. PMID:26602872

  3. Challenges and opportunities in estimating viral genetic diversity from next-generation sequencing data

    PubMed Central

    Beerenwinkel, Niko; Günthard, Huldrych F.; Roth, Volker; Metzner, Karin J.

    2012-01-01

    Many viruses, including the clinically relevant RNA viruses HIV (human immunodeficiency virus) and HCV (hepatitis C virus), exist in large populations and display high genetic heterogeneity within and between infected hosts. Assessing intra-patient viral genetic diversity is essential for understanding the evolutionary dynamics of viruses, for designing effective vaccines, and for the success of antiviral therapy. Next-generation sequencing (NGS) technologies allow the rapid and cost-effective acquisition of thousands to millions of short DNA sequences from a single sample. However, this approach entails several challenges in experimental design and computational data analysis. Here, we review the entire process of inferring viral diversity from sample collection to computing measures of genetic diversity. We discuss sample preparation, including reverse transcription and amplification, and the effect of experimental conditions on diversity estimates due to in vitro base substitutions, insertions, deletions, and recombination. The use of different NGS platforms and their sequencing error profiles are compared in the context of various applications of diversity estimation, ranging from the detection of single nucleotide variants (SNVs) to the reconstruction of whole-genome haplotypes. We describe the statistical and computational challenges arising from these technical artifacts, and we review existing approaches, including available software, for their solution. Finally, we discuss open problems, and highlight successful biomedical applications and potential future clinical use of NGS to estimate viral diversity. PMID:22973268

  4. Diversity of 1,213 hepatitis C virus NS3 protease sequences from a clinical virology laboratory database in Marseille university hospitals, southeastern France.

    PubMed

    Hajji, Hind; Aherfi, Sarah; Motte, Anne; Ravaux, Isabelle; Mokhtari, Saadia; Ruiz, Jean-Marie; Poizot-Martin, Isabelle; Tourres, Christian; Tivoli, Natacha; Gérolami, René; Tamalet, Catherine; Colson, Philippe

    2015-11-01

    Infection with hepatitis C virus (HCV) represents a major public health concern worldwide. Recent therapeutic advances have been considerable, with HCV genotype continuing to guide therapeutic management. Since 2008, HCV genotyping in our clinical microbiology laboratory at university hospitals of Marseille, Southeastern France, has been based on NS3 protease gene population sequencing, to allow concurrent HCV genotype and protease inhibitor (PI) genotypic resistance determinations. We aimed, first, to analyze the genetic diversity of HCV NS3 protease obtained from blood samples collected between 2003 and 2013 from patients monitored at university hospitals of Marseille and detect possible atypical sequences; and, second, to identify NS3 protease amino acid patterns associated with decreased susceptibility to HCV PIs. A total of 1,213 HCV NS3 protease sequences were available in our laboratory sequence database. We implemented a strategy based on bioinformatic tools to determine whether HCV sequences are representative of our local HCV genetic diversity, or divergent. In our 2003-2012 HCV NS3 protease sequence database, we delineated 32 clusters representative of the majority HCV genetic diversity, and 61 divergent sequences. Five of these divergent sequences showed less than 85% nucleotide identity with their top GenBank hit. In addition, among the 294 sequences obtained in 2013, three were divergent relative to these 32 previously delineated clusters. Finally, we detected both natural and on-treatment genotypic resistance to HCV NS3 PIs, including a substantial prevalence of Q80K substitutions associated with decreased susceptibility to simeprevir, a second generation PI.

  5. Expanding the diversity of oenococcal bacteriophages: insights into a novel group based on the integrase sequence.

    PubMed

    Jaomanjaka, Fety; Ballestra, Patricia; Dols-lafargue, Marguerite; Le Marrec, Claire

    2013-09-02

    Temperate bacteriophages contribute to the genetic diversity of the lactic acid bacterium Oenococcus oeni. We used a classification scheme for oenococcal prophages based on integrase gene polymorphism to analyze a collection of Oenococcus strains mostly isolated in the area of Bordeaux, which represented the major lineages identified through MLST schemes in the species. Genome sequences of oenococcal prophages were clustered into four integrase groups (A to D) which were related to the chromosomal integration site. The prevalence of each group was determined and we could show that members of the intB- and intC-prophage groups were rare in our panel of strains. Our study focused on the so far uncharacterized members of the intD-group. Various intD viruses could be easily isolated from wine samples, while intD lysogens could be induced to produce phages active against two permissive O. oeni isolates. These data support the role of this prophage group in the biology of O. oeni. Global alignment of three relevant intD-prophages revealed significant conservation and highlighted a number of unique ORFs that may contribute to phage and lysogen fitness.

  6. Genome diversity in Brachypodium distachyon: deep sequencing of highly diverse inbred lines.

    PubMed

    Gordon, Sean P; Priest, Henry; Des Marais, David L; Schackwitz, Wendy; Figueroa, Melania; Martin, Joel; Bragg, Jennifer N; Tyler, Ludmila; Lee, Cheng-Ruei; Bryant, Doug; Wang, Wenqin; Messing, Joachim; Manzaneda, Antonio J; Barry, Kerrie; Garvin, David F; Budak, Hikmet; Tuna, Metin; Mitchell-Olds, Thomas; Pfender, William F; Juenger, Thomas E; Mockler, Todd C; Vogel, John P

    2014-08-01

    Brachypodium distachyon is a small annual grass that has been adopted as a model for the grasses. Its small genome, high-quality reference genome, large germplasm collection, and selfing nature make it an excellent subject for studies of natural variation. We sequenced six divergent lines to identify a comprehensive set of polymorphisms and analyze their distribution and concordance with gene expression. Multiple methods and controls were utilized to identify polymorphisms and validate their quality. mRNA-Seq experiments under control and simulated drought-stress conditions identified 300 genes with a genotype-dependent treatment response. We showed that large-scale sequence variants had extremely high concordance with altered expression of hundreds of genes, including many with genotype-dependent treatment responses. We generated a deep mRNA-Seq dataset for the most divergent line and created a de novo transcriptome assembly. This led to the discovery of >2400 previously unannotated transcripts and hundreds of genes not present in the reference genome. We built a public database for visualization and investigation of sequence variants among these widely used inbred lines.

  7. Estimating and comparing microbial diversity in the presence of sequencing errors

    PubMed Central

    Chiu, Chun-Huo

    2016-01-01

    Estimating and comparing microbial diversity are statistically challenging due to limited sampling and possible sequencing errors for low-frequency counts, producing spurious singletons. The inflated singleton count seriously affects statistical analysis and inferences about microbial diversity. Previous statistical approaches to tackle the sequencing errors generally require different parametric assumptions about the sampling model or about the functional form of frequency counts. Different parametric assumptions may lead to drastically different diversity estimates. We focus on nonparametric methods which are universally valid for all parametric assumptions and can be used to compare diversity across communities. We develop here a nonparametric estimator of the true singleton count to replace the spurious singleton count in all methods/approaches. Our estimator of the true singleton count is in terms of the frequency counts of doubletons, tripletons and quadrupletons, provided these three frequency counts are reliable. To quantify microbial alpha diversity for an individual community, we adopt the measure of Hill numbers (effective number of taxa) under a nonparametric framework. Hill numbers, parameterized by an order q that determines the measures’ emphasis on rare or common species, include taxa richness (q = 0), Shannon diversity (q = 1, the exponential of Shannon entropy), and Simpson diversity (q = 2, the inverse of Simpson index). A diversity profile which depicts the Hill number as a function of order q conveys all information contained in a taxa abundance distribution. Based on the estimated singleton count and the original non-singleton frequency counts, two statistical approaches (non-asymptotic and asymptotic) are developed to compare microbial diversity for multiple communities. (1) A non-asymptotic approach refers to the comparison of estimated diversities of standardized samples with a common finite sample size or sample completeness. This

  8. Estimating and comparing microbial diversity in the presence of sequencing errors.

    PubMed

    Chiu, Chun-Huo; Chao, Anne

    2016-01-01

    Estimating and comparing microbial diversity are statistically challenging due to limited sampling and possible sequencing errors for low-frequency counts, producing spurious singletons. The inflated singleton count seriously affects statistical analysis and inferences about microbial diversity. Previous statistical approaches to tackle the sequencing errors generally require different parametric assumptions about the sampling model or about the functional form of frequency counts. Different parametric assumptions may lead to drastically different diversity estimates. We focus on nonparametric methods which are universally valid for all parametric assumptions and can be used to compare diversity across communities. We develop here a nonparametric estimator of the true singleton count to replace the spurious singleton count in all methods/approaches. Our estimator of the true singleton count is in terms of the frequency counts of doubletons, tripletons and quadrupletons, provided these three frequency counts are reliable. To quantify microbial alpha diversity for an individual community, we adopt the measure of Hill numbers (effective number of taxa) under a nonparametric framework. Hill numbers, parameterized by an order q that determines the measures' emphasis on rare or common species, include taxa richness (q = 0), Shannon diversity (q = 1, the exponential of Shannon entropy), and Simpson diversity (q = 2, the inverse of Simpson index). A diversity profile which depicts the Hill number as a function of order q conveys all information contained in a taxa abundance distribution. Based on the estimated singleton count and the original non-singleton frequency counts, two statistical approaches (non-asymptotic and asymptotic) are developed to compare microbial diversity for multiple communities. (1) A non-asymptotic approach refers to the comparison of estimated diversities of standardized samples with a common finite sample size or sample completeness. 
This approach
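
    The Hill numbers described in this record can be sketched as follows. This is a plain plug-in (empirical) calculation from observed counts, assuming relative abundances are simply counts divided by the total; it does not implement the authors' nonparametric singleton correction or their estimators. The function name is hypothetical.

```python
import math
from typing import Sequence


def hill_number(counts: Sequence[float], q: float) -> float:
    """Empirical Hill number (effective number of taxa) of order q.

    q = 0 gives taxa richness, q = 1 the exponential of Shannon entropy,
    and q = 2 the inverse of the Simpson index.
    """
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must contain at least one positive value")
    # Relative abundances of the taxa that were actually observed.
    p = [c / total for c in counts if c > 0]
    if math.isclose(q, 1.0):
        # The general formula is undefined at q = 1; use its limit.
        return math.exp(-sum(pi * math.log(pi) for pi in p))
    return sum(pi ** q for pi in p) ** (1.0 / (1.0 - q))


# A diversity profile evaluates the Hill number over a range of orders q.
profile = [hill_number([50, 30, 15, 5], q) for q in (0, 1, 2)]
```

    For a perfectly even community all orders return the number of taxa; for uneven communities the value decreases as q grows, because higher orders put more weight on common taxa.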

  9. Nanopores and nucleic acids: prospects for ultrarapid sequencing

    NASA Technical Reports Server (NTRS)

    Deamer, D. W.; Akeson, M.

    2000-01-01

    DNA and RNA molecules can be detected as they are driven through a nanopore by an applied electric field at rates ranging from several hundred microseconds to a few milliseconds per molecule. The nanopore can rapidly discriminate between pyrimidine and purine segments along a single-stranded nucleic acid molecule. Nanopore detection and characterization of single molecules represents a new method for directly reading information encoded in linear polymers. If single-nucleotide resolution can be achieved, it is possible that nucleic acid sequences can be determined at rates exceeding a thousand bases per second.

  10. Multiple site-selective insertions of non-canonical amino acids into sequence-repetitive polypeptides

    PubMed Central

    Wu, I-Lin; Patterson, Melissa A.; Carpenter Desai, Holly E.; Mehl, Ryan A.; Giorgi, Gianluca

    2013-01-01

    A simple and efficient method is described for introduction of non-canonical amino acids at multiple, structurally defined sites within recombinant polypeptide sequences. E. coli MRA30, a bacterial host strain with attenuated activity for release factor 1 (RF1), is assessed for its ability to support the incorporation of a diverse range of non-canonical amino acids in response to multiple encoded amber (TAG) codons within genetic templates derived from superfolder GFP and an elastin-mimetic protein polymer. Suppression efficiency and isolated protein yield were observed to depend on the identity of the orthogonal aminoacyl-tRNA synthetase/tRNACUA pair and the non-canonical amino acid substrate. This approach afforded elastin-mimetic protein polymers containing non-canonical amino acid derivatives at up to twenty-two positions within the repeat sequence with high levels of substitution. The identity and position of the variant residues were confirmed by mass spectrometric analysis of the full-length polypeptides and proteolytic cleavage fragments resulting from thermolysin digestion. The accumulated data suggest that this multi-site suppression approach permits the preparation of protein-based materials in which novel chemical functionality can be introduced at precisely defined positions within the polypeptide sequence. PMID:23625817

  11. Next generation sequencing to define prokaryotic and fungal diversity in the bovine rumen

    Technology Transfer Automated Retrieval System (TEKTRAN)

    A combination of Sanger and 454 sequences of small subunit rRNA loci was used to interrogate the microbial diversity in the bovine rumen of 14 pasture-fed animals. The observed bacterial species richness, based on the V1-V3 region of the 16S rRNA gene, was between 1902 and 2596 species-level operati...

  12. HIV-1 neutralizing antibody response and viral genetic diversity characterized with next generation sequencing

    PubMed Central

    Carter, Christoph C.; Wagner, Gabriel A.; Hightower, George K.; Caballero, Gemma; Phung, Pham; Richman, Douglas D.; Kosakovsky Pond, Sergei L.; Smith, Davey M.

    2014-01-01

    To better understand the dynamics of HIV-specific neutralizing antibody (NAb), we examined associations between viral genetic diversity and the NAb response against a multi-subtype panel of heterologous viruses in a well-characterized, therapy-naïve primary infection cohort. Using next generation sequencing (NGS), we computed sequence-based measures of diversity within HIV-1 env, gag and pol, and compared them to NAb breadth and potency as calculated by a neutralization score. Contemporaneous env diversity and the neutralization score were positively correlated (p=0.0033), as were the neutralization score and estimated duration of infection (EDI) (p=0.0038), and env diversity and EDI (p=0.0005). Neither early env diversity nor baseline viral load correlated with future NAb breadth and potency (p>0.05). Taken together, it is unlikely that neutralizing capability in our cohort was conditioned on viral diversity, but rather that env evolution was driven by the level of NAb selective pressure. PMID:25463602

  13. Propionibacterium acnes: Disease-Causing Agent or Common Contaminant? Detection in Diverse Patient Samples by Next-Generation Sequencing

    PubMed Central

    Friis-Nielsen, Jens; Vinner, Lasse; Hansen, Thomas Arn; Richter, Stine Raith; Fridholm, Helena; Herrera, Jose Alejandro Romero; Lund, Ole; Brunak, Søren; Izarzugaza, Jose M. G.; Mourier, Tobias; Nielsen, Lars Peter

    2016-01-01

    Propionibacterium acnes is the most abundant bacterium on human skin, particularly in sebaceous areas. P. acnes is suggested to be an opportunistic pathogen involved in the development of diverse medical conditions but is also a proven contaminant of human clinical samples and surgical wounds. Its significance as a pathogen is consequently a matter of debate. In the present study, we investigated the presence of P. acnes DNA in 250 next-generation sequencing data sets generated from 180 samples of 20 different sample types, mostly of cancerous origin. The samples were subjected to either microbial enrichment, involving nuclease treatment to reduce the amount of host nucleic acids, or shotgun sequencing. We detected high proportions of P. acnes DNA in enriched samples, particularly skin tissue-derived and other tissue samples, with the levels being higher in enriched samples than in shotgun-sequenced samples. P. acnes reads were detected in most samples analyzed, though the proportions in most shotgun-sequenced samples were low. Our results show that P. acnes can be detected in practically all sample types when molecular methods, such as next-generation sequencing, are employed. The possibility of contamination from the patient or other sources, including laboratory reagents or environment, should therefore always be considered carefully when P. acnes is detected in clinical samples. We advocate that detection of P. acnes always be accompanied by experiments validating the association between this bacterium and any clinical condition. PMID:26818667

  14. Local diversity of heathland Cercozoa explored by in-depth sequencing

    PubMed Central

    Harder, Christoffer Bugge; Rønn, Regin; Brejnrod, Asker; Bass, David; Al-Soud, Waleed Abu; Ekelund, Flemming

    2016-01-01

    Cercozoa are abundant free-living soil protozoa and quantitatively important in soil food webs; yet, targeted high-throughput sequencing (HTS) has not yet been applied to this group. Here we describe the development of a targeted assay to explore Cercozoa using HTS, and we apply this assay to measure Cercozoan community response to drought in a Danish climate manipulation experiment (two sites exposed to artificial drought, two unexposed). Based on a comparison of the hypervariable regions of the 18S ribosomal DNA of 193 named Cercozoa, we concluded that the V4 region is the most suitable for group-specific diversity analysis. We then designed a set of highly specific primers (encompassing ~270 bp) for 454 sequencing. The primers captured all major cercozoan groups; and >95% of the obtained sequences were from Cercozoa. From 443 350 high-quality short reads (>300 bp), we recovered 1585 operational taxonomic units defined by >95% V4 sequence similarity. Taxonomic annotation by phylogeny enabled us to assign >95% of our reads to order level and ~85% to genus level despite the presence of a large, hitherto unknown diversity. Over 40% of the annotated sequences were assigned to Glissomonad genera, whereas the most common individually named genus was the euglyphid Trinema. Cercozoan diversity was largely resilient to drought, although we observed a community composition shift towards fewer testate amoebae. PMID:26953604

  15. The complementary deoxyribonucleic acid sequence of guinea pig endometrial prorelaxin.

    PubMed

    Lee, Y A; Bryant-Greenwood, G D; Mandel, M; Greenwood, F C

    1992-03-01

    The nucleotide sequence of the relaxin gene transcript in the endometrium of the late pregnant guinea pig has been determined. The strategy used was a combination of polymerase chain reaction (PCR) with primers designed from the mRNA sequence of porcine preprorelaxin, rapid amplification of cDNA ends-PCR, and blunt end cloning in M13 mp18. With heterologous primers, a 226-basepair (bp) segment of the guinea pig relaxin gene sequence was obtained and was used to design a guinea pig-specific primer for use with the rapid amplification of cDNA ends-PCR method. The latter allowed completion of the sequence of 336 bp, with a 96-bp overlap. The sequence obtained shows greater homology at both the nucleotide and amino acid levels with porcine and human relaxins H1 and H2 than with rat relaxin, supporting the thesis that the guinea pig is not a rodent. The transcription of the guinea pig endometrial relaxin gene during pregnancy was confirmed by Northern analysis of guinea pig endometrial tissues with a species-specific cDNA probe. The endometrial relaxin gene is transcribed during pregnancy, but not in lactation, consistent with the observed immunostaining for relaxin.

  16. Quantum-Sequencing: Biophysics of quantum tunneling through nucleic acids

    NASA Astrophysics Data System (ADS)

    Casamada Ribot, Josep; Chatterjee, Anushree; Nagpal, Prashant

    2014-03-01

    Tunneling microscopy and spectroscopy have been used extensively in the physical surface sciences to study quantum tunneling, measure the electronic local density of states of nanomaterials, and characterize adsorbed species. Quantum-Sequencing (Q-Seq) is a new method based on tunneling microscopy for electronic sequencing of single molecules of nucleic acids. A major goal of third-generation sequencing technologies is to develop a fast, reliable, enzyme-free single-molecule sequencing method. Here, we present the unique "electronic fingerprints" for all nucleotides on DNA and RNA obtained using Q-Seq, along with their intrinsic biophysical parameters. We have analyzed tunneling spectra for the nucleotides at different pH conditions and analyzed the HOMO, LUMO and energy gap for all of them. In addition, we show a number of biophysical parameters to further characterize all nucleobases (electron and hole transition voltage and energy barriers). These results highlight the robustness of Q-Seq as a technique for next-generation sequencing.

  17. Archaeon and archaeal virus diversity classification via sequence entropy and fractal dimension

    NASA Astrophysics Data System (ADS)

    Tremberger, George, Jr.; Gallardo, Victor; Espinoza, Carola; Holden, Todd; Gadura, N.; Cheung, E.; Schneider, P.; Lieberman, D.; Cheung, T.

    2010-09-01

    Archaea are important potential candidates in astrobiology as their metabolism includes solar, inorganic and organic energy sources. Archaeal viruses would also be expected to be present in a sustainable archaeal exobiological community. Genetic sequence Shannon entropy and fractal dimension can be used to establish a two-dimensional measure for classification and phylogenetic study of these organisms. A sequence fractal dimension can be calculated from a numerical series consisting of the atomic numbers of each nucleotide. Archaeal 16S and 23S ribosomal RNA sequences were studied. Outliers in the 16S rRNA fractal dimension and entropy plot were found to be halophilic archaea. Positive correlation (R-square ~ 0.75, N = 18) was observed between fractal dimension and entropy across the studied species. The 16S ribosomal RNA sequence entropy correlates with the 23S ribosomal RNA sequence entropy across species with R-square 0.93, N = 18. Entropy values correspond positively with branch lengths of a published phylogeny. The studied archaeal virus sequences have high fractal dimensions of 2.02 or more. A comparison of selected extremophile sequences with archaeal sequences from the Humboldt Marine Ecosystem database (Wood-Hull Oceanography Institute, MIT) suggests the presence of continuous sequence expression as inferred from distributions of entropy and fractal dimension, consistent with the diversity expected in an exobiological archaeal community.
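
    As an illustration of the entropy measure mentioned above, the following is a minimal sketch of Shannon entropy computed over the k-mer frequencies of a nucleotide sequence. The abstract does not state the word length or windowing the authors used, so the function name `shannon_entropy` and the parameter `k` are assumptions made for this example only; the fractal dimension calculation is not reproduced here.

```python
import math
from collections import Counter


def shannon_entropy(sequence: str, k: int = 1) -> float:
    """Shannon entropy (in bits) of the k-mer distribution of a sequence.

    Hypothetical helper for illustration; k=1 gives single-nucleotide entropy.
    """
    seq = sequence.upper()
    # Overlapping k-mers: positions 0 .. len(seq) - k
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    if not kmers:
        raise ValueError("sequence is shorter than k")
    counts = Counter(kmers)
    total = len(kmers)
    # H = -sum(p * log2(p)) over observed k-mers
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


if __name__ == "__main__":
    # A sequence using all four nucleotides equally reaches the 2-bit maximum.
    print(shannon_entropy("ACGT"))
    # A homopolymer carries no information.
    print(shannon_entropy("AAAA"))
```

    With k = 1 the value is bounded by 2 bits for a four-letter alphabet; larger k captures short-range sequence structure at the cost of needing longer sequences.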

  18. Soil Parameters Drive the Structure, Diversity and Metabolic Potentials of the Bacterial Communities Across Temperate Beech Forest Soil Sequences.

    PubMed

    Jeanbille, M; Buée, M; Bach, C; Cébron, A; Frey-Klett, P; Turpault, M P; Uroz, S

    2016-02-01

    Soil and climatic conditions as well as land cover and land management have been shown to strongly impact the structure and diversity of the soil bacterial communities. Here, we addressed the potential effect of edaphic parameters on the soil bacterial communities under the same land cover, excluding potential confounding factors such as climate. To do this, we characterized two natural soil sequences occurring in the Montiers experimental site. Spatially distant soil samples were collected below Fagus sylvatica tree stands to assess the effect of soil sequences on the edaphic parameters, as well as the structure and diversity of the bacterial communities. Soil analyses revealed that the two soil sequences were characterized by higher pH and calcium and magnesium contents in the lower plots. Metabolic assays based on Biolog Ecoplates highlighted higher intensity and richness in usable carbon substrates in the lower plots than in the middle and upper plots, although no significant differences occurred in the abundance of bacterial and fungal communities along the soil sequences as assessed using quantitative PCR. Pyrosequencing analysis of 16S ribosomal RNA (rRNA) gene amplicons revealed that Proteobacteria, Acidobacteria and Bacteroidetes were the most abundantly represented phyla. Acidobacteria, Proteobacteria and Chlamydiae were significantly enriched in the most acidic and nutrient-poor soils compared to the Bacteroidetes, which were significantly enriched in the soils presenting the higher pH and nutrient contents. Interestingly, aluminium, nitrogen, calcium, nutrient availability and pH appeared to be the best predictors of the bacterial community structures along the soil sequences.

  19. Maize (Zea mays L.) genome diversity as revealed by RNA-sequencing.

    PubMed

    Hansey, Candice N; Vaillancourt, Brieanne; Sekhon, Rajandeep S; de Leon, Natalia; Kaeppler, Shawn M; Buell, C Robin

    2012-01-01

    Maize is rich in genetic and phenotypic diversity. Understanding the sequence, structural, and expression variation that contributes to phenotypic diversity would facilitate more efficient varietal improvement. RNA based sequencing (RNA-seq) is a powerful approach for transcriptional analysis, assessing sequence variation, and identifying novel transcript sequences, particularly in large, complex, repetitive genomes such as maize. In this study, we sequenced RNA from whole seedlings of 21 maize inbred lines representing diverse North American and exotic germplasm. Single nucleotide polymorphism (SNP) detection identified 351,710 polymorphic loci distributed throughout the genome covering 22,830 annotated genes. Tight clustering of two distinct heterotic groups and exotic lines was evident using these SNPs as genetic markers. Transcript abundance analysis revealed minimal variation in the total number of genes expressed across these 21 lines (57.1% to 66.0%). However, the transcribed gene set among the 21 lines varied, with 48.7% expressed in all of the lines, 27.9% expressed in one to 20 lines, and 23.4% expressed in none of the lines. De novo assembly of RNA-seq reads that did not map to the reference B73 genome sequence revealed 1,321 high confidence novel transcripts, of which, 564 loci were present in all 21 lines, including B73, and 757 loci were restricted to a subset of the lines. RT-PCR validation demonstrated 87.5% concordance with the computational prediction of these expressed novel transcripts. Intriguingly, 145 of the novel de novo assembled loci were present in lines from only one of the two heterotic groups consistent with the hypothesis that, in addition to sequence polymorphisms and transcript abundance, transcript presence/absence variation is present and, thereby, may be a mechanism contributing to the genetic basis of heterosis.
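
    The split of the gene set into genes expressed in all lines, in a subset of lines, or in none can be sketched as a simple set-based tally. This is only an illustration of the bookkeeping: the function name `classify_genes`, the input layout (a mapping from inbred line to the set of genes called as expressed) and the toy gene identifiers are assumptions, and the expression threshold used by the authors is not given in the abstract.

```python
from typing import Dict, Set


def classify_genes(expressed: Dict[str, Set[str]],
                   all_genes: Set[str]) -> Dict[str, Set[str]]:
    """Group genes by how many lines express them: all, some, or none.

    `expressed` maps a line name to the genes called as expressed in it
    (hypothetical input format for this sketch).
    """
    if not expressed:
        raise ValueError("at least one line is required")
    n_lines = len(expressed)
    groups: Dict[str, Set[str]] = {"all": set(), "some": set(), "none": set()}
    for gene in all_genes:
        # Count the lines in which this gene is expressed
        n = sum(1 for genes in expressed.values() if gene in genes)
        if n == n_lines:
            groups["all"].add(gene)
        elif n == 0:
            groups["none"].add(gene)
        else:
            groups["some"].add(gene)
    return groups


if __name__ == "__main__":
    # Toy data: three lines, four genes (identifiers are made up)
    calls = {
        "B73": {"g1", "g2"},
        "Mo17": {"g1", "g3"},
        "Oh43": {"g1"},
    }
    result = classify_genes(calls, {"g1", "g2", "g3", "g4"})
    for name, genes in result.items():
        print(name, sorted(genes))
```

    Genes in the "some" group are the candidates for the transcript presence/absence variation discussed in the abstract.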

  20. Deep sequencing uncovers protistan plankton diversity in the Portuguese Ria Formosa solar saltern ponds.

    PubMed

    Filker, Sabine; Gimmler, Anna; Dunthorn, Micah; Mahé, Frédéric; Stoeck, Thorsten

    2015-03-01

    We used high-throughput sequencing to unravel the genetic diversity of protistan (including fungal) plankton in hypersaline ponds of the Ria Formosa solar saltern works in Portugal. From three ponds of different salinity (4, 12 and 38 %), we obtained ca. 105,000 amplicons (V4 region of the SSU rDNA). The genetic diversity we found was higher than what has been described from solar saltern ponds thus far by microscopy or molecular studies. The obtained operational taxonomic units (OTUs) could be assigned to 14 high-rank taxonomic groups and blasted to 120 eukaryotic families. The novelty of this genetic diversity was extremely high, with 27 % of all OTUs having a sequence divergence of more than 10 % to deposited sequences of described taxa. The highest degree of novelty was found at intermediate salinity of 12 % within the ciliates, which traditionally are considered as the best known and described taxon group within the kingdom Protista. Further substantial novelty was detected within the stramenopiles and the chlorophytes. Analyses of community structures suggest a transition boundary for protistan plankton between 4 and 12 % salinity, suggesting different haloadaptation strategies in individual evolutionary lineages as a result of environmental filtering. Our study makes evident the gaps in our knowledge not only of protistan and fungal plankton diversity in hypersaline environments, but also in their ecology and their strategies to cope with these environmental conditions. It substantiates that specific future research needs to fill these gaps.

  1. Sequence diversity of a domesticated transposase gene, MUG1, in Oryza species.

    PubMed

    Kwon, Soon-Jae; Park, Kyong-Cheul; Son, Jae-Han; Bureau, Thomas; Park, Cheul-Ho; Kim, Nam-Soo

    2009-04-30

    MUG1 is a MULE transposon-related domesticated gene in plants. We assessed the sequence diversity, neutrality, expression, and phylogenetics of the MUG1 gene among Oryza spp. We found MUG1 expression in all tissues analyzed, with different levels in O. sativa. There were 408 variation sites in the 3886 bp of the MUG1 locus. The nucleotide diversity of MUG1 was higher than that of functionally known genes in rice. The nucleotide diversity (pi) in the domains was lower than the average nucleotide diversity in the whole coding region. The pi values in nonsynonymous sites were lower than those of synonymous sites. Tajima's D and Fu and Li's D* values were mostly negative, suggesting purifying selection in MUG1 sequences of Oryza spp. Genome-specific variation and phylogenetic analyses show a general grouping of MUG1 sequences congruent with Oryza spp. biogeography; however, our MUG1 phylogenetic results, in combination with separate B and D genome studies, might suggest an early divergence of the Oryza spp. by continental drift of Gondwanaland. O. longistaminata MUG1 divergence from other AA diploids suggests that it might not be a direct ancestor of the African rice species.
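
    Nucleotide diversity (pi) is conventionally defined as the average number of pairwise differences per site among a set of aligned sequences. The sketch below computes it under simplifying assumptions that are not taken from the abstract: the sequences are already aligned to equal length, every sequence is weighted equally, and gap or ambiguity characters are treated like any other symbol. The function name `nucleotide_diversity` is hypothetical.

```python
from itertools import combinations
from typing import Sequence


def nucleotide_diversity(aligned: Sequence[str]) -> float:
    """Average pairwise differences per site for equal-length aligned sequences.

    Simplified illustration: no special handling of gaps or ambiguous bases.
    """
    if len(aligned) < 2:
        raise ValueError("need at least two sequences")
    length = len(aligned[0])
    if length == 0 or any(len(s) != length for s in aligned):
        raise ValueError("sequences must be non-empty and of equal length")
    total_diff = 0
    n_pairs = 0
    for a, b in combinations(aligned, 2):
        # Count mismatching alignment columns for this pair
        total_diff += sum(1 for x, y in zip(a, b) if x != y)
        n_pairs += 1
    return total_diff / (n_pairs * length)


if __name__ == "__main__":
    toy_alignment = ["ACGT", "ACGA", "ACGT"]
    print(nucleotide_diversity(toy_alignment))
```

    Computing the same quantity separately over synonymous and nonsynonymous sites, as the abstract describes, would additionally require codon-aware site classification, which this sketch does not attempt.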

  2. Evaluating Methods for Isolating Total RNA and Predicting the Success of Sequencing Phylogenetically Diverse Plant Transcriptomes

    PubMed Central

    Bruskiewich, Richard; Burris, Jason N.; Carrigan, Charlotte T.; Chase, Mark W.; Clarke, Neil D.; Covshoff, Sarah; dePamphilis, Claude W.; Edger, Patrick P.; Goh, Falicia; Graham, Sean; Greiner, Stephan; Hibberd, Julian M.; Jordon-Thaden, Ingrid; Kutchan, Toni M.; Leebens-Mack, James; Melkonian, Michael; Miles, Nicholas; Myburg, Henrietta; Patterson, Jordan; Pires, J. Chris; Ralph, Paula; Rolf, Megan; Sage, Rowan F.; Soltis, Douglas; Soltis, Pamela; Stevenson, Dennis; Stewart, C. Neal; Surek, Barbara; Thomsen, Christina J. M.; Villarreal, Juan Carlos; Wu, Xiaolei; Zhang, Yong; Deyholos, Michael K.; Wong, Gane Ka-Shu

    2012-01-01

    Next-generation sequencing plays a central role in the characterization and quantification of transcriptomes. Although numerous metrics are purported to quantify the quality of RNA, there have been no large-scale empirical evaluations of the major determinants of sequencing success. We used a combination of existing and newly developed methods to isolate total RNA from 1115 samples from 695 plant species in 324 families, which represents >900 million years of phylogenetic diversity from green algae through flowering plants, including many plants of economic importance. We then sequenced 629 of these samples on Illumina GAIIx and HiSeq platforms and performed a large comparative analysis to identify predictors of RNA quality and the diversity of putative genes (scaffolds) expressed within samples. Tissue types (e.g., leaf vs. flower) varied in RNA quality, sequencing depth and the number of scaffolds. Tissue age also influenced RNA quality but not the number of scaffolds ≥1000 bp. Overall, 36% of the variation in the number of scaffolds was explained by metrics of RNA integrity (RIN score), RNA purity (OD 260/230), sequencing platform (GAIIx vs HiSeq) and the amount of total RNA used for sequencing. However, our results show that the most commonly used measures of RNA quality (e.g., RIN) are weak predictors of the number of scaffolds because Illumina sequencing is robust to variation in RNA quality. These results provide novel insight into the methods that are most important in isolating high quality RNA for sequencing and assembling plant transcriptomes. The methods and recommendations provided here could increase the efficiency and decrease the cost of RNA sequencing for individual labs and genome centers. PMID:23185583

  3. Analysis of amino acid sequence variations and immunoglobulin E-binding epitopes of German cockroach tropomyosin.

    PubMed

    Jeong, Kyoung Yong; Lee, Jongweon; Lee, In-Yong; Ree, Han-Il; Hong, Chein-Soo; Yong, Tai-Soon

    2004-09-01

    The allergenicities of tropomyosins from different organisms have been reported to vary. The cDNA encoding German cockroach tropomyosin (Bla g 7) was isolated, expressed, and characterized previously. In the present study, the amino acid sequence variations in German cockroach tropomyosin were analyzed in order to investigate their influence on allergenicity. We also undertook the identification of immunodominant peptides containing immunoglobulin E (IgE) epitopes which may facilitate the development of diagnostic and immunotherapeutic strategies based on the recombinant proteins. Two-dimensional gel electrophoresis and immunoblot analysis with mouse anti-recombinant German cockroach tropomyosin serum were performed to investigate the isoforms at the protein level. Reverse transcriptase PCR (RT-PCR) was applied to examine the sequence diversity. Eleven different variants of the deduced amino acid sequences were identified by RT-PCR. German cockroach tropomyosin has only minor sequence variations that did not seem to affect its allergenicity significantly. These results support the molecular basis underlying the cross-reactivities of arthropod tropomyosins. Recombinant fragments were also generated by PCR, and IgE-binding epitopes were assessed by enzyme-linked immunosorbent assay. Sera from seven patients revealed heterogeneous IgE-binding responses. This study demonstrates multiple IgE-binding epitope regions in a single molecule, suggesting that full-length tropomyosin should be used for the development of diagnostic and therapeutic reagents.

  4. Molecular cloning and amino acid sequence of human 5-lipoxygenase

    SciTech Connect

    Matsumoto, T.; Funk, C.D.; Radmark, O.; Hoeoeg, J.O.; Joernvall, H.; Samuelsson, B.

    1988-01-01

    5-Lipoxygenase (EC 1.13.11.34), a Ca2+- and ATP-requiring enzyme, catalyzes the first two steps in the biosynthesis of the peptidoleukotrienes and the chemotactic factor leukotriene B4. A cDNA clone corresponding to 5-lipoxygenase was isolated from a human lung lambda gt11 expression library by immunoscreening with a polyclonal antibody. Additional clones from a human placenta lambda gt11 cDNA library were obtained by plaque hybridization with the 32P-labeled lung cDNA clone. Sequence data obtained from several overlapping clones indicate that the composite DNAs contain the complete coding region for the enzyme. From the deduced primary structure, 5-lipoxygenase is a 673 amino acid protein with a calculated molecular weight of 77,839. Direct analysis of the native protein and its proteolytic fragments confirmed the deduced composition, the amino-terminal amino acid sequence, and the structure of many internal segments. 5-Lipoxygenase has no apparent sequence homology with leukotriene A4 hydrolase or Ca2+-binding proteins. RNA blot analysis indicated substantial amounts of an mRNA species of approximately 2700 nucleotides in leukocytes, lung, and placenta.

  5. Nucleic acid sequence detection using multiplexed oligonucleotide PCR

    DOEpatents

    Nolan, John P.; White, P. Scott

    2006-12-26

    Methods for rapidly detecting single or multiple sequence alleles in a sample nucleic acid are described. Provided are all of the oligonucleotide pairs capable of annealing specifically to a target allele and discriminating among possible sequences thereof, and ligating to each other to form an oligonucleotide complex when a particular sequence feature is present (or, alternatively, absent) in the sample nucleic acid. The design of each oligonucleotide pair permits the subsequent high-level PCR amplification of a specific amplicon when the oligonucleotide complex is formed, but not when the oligonucleotide complex is not formed. The presence or absence of the specific amplicon is used to detect the allele. Detection of the specific amplicon may be achieved using a variety of methods well known in the art, including without limitation, oligonucleotide capture onto DNA chips or microarrays, oligonucleotide capture onto beads or microspheres, electrophoresis, and mass spectrometry. Various labels and address-capture tags may be employed in the amplicon detection step of multiplexed assays, as further described herein.

  6. The amino acid sequence of chymopapain from Carica papaya.

    PubMed Central

    Watson, D C; Yaguchi, M; Lynn, K R

    1990-01-01

    Chymopapain is a polypeptide of 218 amino acid residues. It has considerable structural similarity with papain and papaya proteinase omega, including conservation of the catalytic site and of the disulphide bonding. Chymopapain is like papaya proteinase omega in carrying four extra residues between papain positions 168 and 169, but differs from both papaya proteinases in the composition of its S2 subsite, as well as in having a second thiol group, Cys-117. Some evidence for the amino acid sequence of chymopapain has been deposited as Supplementary Publication SUP 50153 (12 pages) at the British Library Document Supply Centre, Boston Spa., Wetherby, West Yorkshire LS23 7BQ, U.K., from whom copies may be obtained on the terms indicated in Biochem. J. (1990) 265, 5. The information comprises Supplement Tables 1-4, which contain, in order, amino acid compositions of peptides from tryptic, peptic, CNBr and mild acid cleavages, Supplement Fig. 1, showing re-fractionation of selected peaks from Fig. 2 of the main paper, Supplement Fig. 2, showing cation-exchange chromatography of the earliest-eluted peak of Fig. 3 of the main paper, Supplement Fig. 3, showing reverse-phase h.p.l.c. of the later-eluted peak from Fig. 3 of the main paper, and Supplement Fig. 4, showing the separation of peptides after mild acid hydrolysis of CNBr-cleavage fragment CB3. PMID:2106878

  7. The amino acid sequence of rabbit cardiac troponin I.

    PubMed Central

    Grand, R J; Wilkinson, J M

    1976-01-01

    The complete amino acid sequence of troponin I from rabbit cardiac muscle was determined by the isolation of four unique CNBr fragments, together with overlapping tryptic peptides containing radioactive methionine residues. Overlap data for residues 35-36, 93-94 and 140-145 are incomplete, the sequence at these positions being based on homology with the sequence of the fast-skeletal-muscle protein. Cardiac troponin I is a single polypeptide chain of 206 residues with mol.wt. 23550 and an extinction coefficient, E(1%, 1 cm) at 280 nm, of 4.37. The protein has a net positive charge of 14 and is thus somewhat more basic than troponin I from fast-skeletal muscle. Comparison of the sequences of troponin I from cardiac and fast skeletal muscle shows that the cardiac protein has 26 extra residues at the N-terminus which account for the larger size of the protein. In the remainder of the sequence there is a considerable degree of homology, this being greater in the C-terminal two-thirds of the molecule. The region in the cardiac protein corresponding to the peptide with inhibitory activity from the fast-skeletal-muscle protein is very similar and it seems unlikely that this is the cause of the difference in inhibitory activity between the two proteins. The region responsible for binding troponin C, however, possesses a lower degree of homology. Detailed evidence on which the sequence is based has been deposited as Supplementary Publication SUP 50072 (20 pages), at the British Library Lending Division, Boston Spa, Wetherby, West Yorkshire LS23 7QB, U.K., from whom copies may be obtained on the terms given in Biochem. J. (1976) 153, 5. PMID:1008822

  8. Rapid quantification of mutant fitness in diverse bacteria by sequencing randomly bar-coded transposons

    SciTech Connect

    Wetmore, Kelly M.; Price, Morgan N.; Waters, Robert J.; Lamson, Jacob S.; He, Jennifer; Hoover, Cindi A.; Blow, Matthew J.; Bristow, James; Butland, Gareth; Arkin, Adam P.; Deutschbauer, Adam

    2015-05-12

    Transposon mutagenesis with next-generation sequencing (TnSeq) is a powerful approach to annotate gene function in bacteria, but existing protocols for TnSeq require laborious preparation of every sample before sequencing. Thus, the existing protocols are not amenable to the throughput necessary to identify phenotypes and functions for the majority of genes in diverse bacteria. Here, we present a method, random bar code transposon-site sequencing (RB-TnSeq), which increases the throughput of mutant fitness profiling by incorporating random DNA bar codes into Tn5 and mariner transposons and by using bar code sequencing (BarSeq) to assay mutant fitness. RB-TnSeq can be used with any transposon, and TnSeq is performed once per organism instead of once per sample. Each BarSeq assay requires only a simple PCR, and 48 to 96 samples can be sequenced on one lane of an Illumina HiSeq system. We demonstrate the reproducibility and biological significance of RB-TnSeq with Escherichia coli, Phaeobacter inhibens, Pseudomonas stutzeri, Shewanella amazonensis, and Shewanella oneidensis. To demonstrate the increased throughput of RB-TnSeq, we performed 387 successful genome-wide mutant fitness assays representing 130 different bacterium-carbon source combinations and identified 5,196 genes with significant phenotypes across the five bacteria. In P. inhibens, we used our mutant fitness data to identify genes important for the utilization of diverse carbon substrates, including a putative D-mannose isomerase that is required for mannitol catabolism. RB-TnSeq will enable the cost-effective functional annotation of diverse bacteria using mutant fitness profiling. A large challenge in microbiology is the functional assessment of the millions of uncharacterized genes identified by genome sequencing. Transposon mutagenesis coupled to next-generation sequencing (TnSeq) is a powerful approach to assign phenotypes and functions to genes
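
    To make the idea of a bar-code-based fitness readout concrete, the following sketch computes a per-strain log2 ratio of bar code counts after growth versus before growth. This is an assumed simplification for illustration, not the published RB-TnSeq scoring procedure: the abstract gives no formula, and the function name `strain_log2_ratios`, the pseudocount, and the toy bar code labels are all hypothetical. Normalization for sequencing depth and aggregation of strains into gene-level scores are deliberately left out.

```python
import math
from typing import Dict


def strain_log2_ratios(before: Dict[str, int],
                       after: Dict[str, int],
                       pseudocount: float = 1.0) -> Dict[str, float]:
    """log2((after + pseudocount) / (before + pseudocount)) per bar code.

    Assumed simplification: no depth normalization, no gene-level averaging.
    """
    if pseudocount <= 0:
        raise ValueError("pseudocount must be positive")
    ratios = {}
    for barcode, n_before in before.items():
        # Bar codes missing after growth are treated as zero counts
        n_after = after.get(barcode, 0)
        ratios[barcode] = math.log2((n_after + pseudocount) /
                                    (n_before + pseudocount))
    return ratios


if __name__ == "__main__":
    time_zero = {"bc1": 99, "bc2": 99, "bc3": 3}
    condition = {"bc1": 199, "bc2": 24}
    for barcode, value in strain_log2_ratios(time_zero, condition).items():
        print(barcode, value)
```

    A negative value indicates that a mutant strain became less abundant under the tested condition, and a positive value that it became more abundant; the pseudocount only prevents division by zero and taking the logarithm of zero.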

  9. Rapid quantification of mutant fitness in diverse bacteria by sequencing randomly bar-coded transposons

    DOE PAGES

    Wetmore, Kelly M.; Price, Morgan N.; Waters, Robert J.; ...

    2015-05-12

    Transposon mutagenesis with next-generation sequencing (TnSeq) is a powerful approach to annotate gene function in bacteria, but existing protocols for TnSeq require laborious preparation of every sample before sequencing. Thus, the existing protocols are not amenable to the throughput necessary to identify phenotypes and functions for the majority of genes in diverse bacteria. Here, we present a method, random bar code transposon-site sequencing (RB-TnSeq), which increases the throughput of mutant fitness profiling by incorporating random DNA bar codes into Tn5 and mariner transposons and by using bar code sequencing (BarSeq) to assay mutant fitness. RB-TnSeq can be used with any transposon, and TnSeq is performed once per organism instead of once per sample. Each BarSeq assay requires only a simple PCR, and 48 to 96 samples can be sequenced on one lane of an Illumina HiSeq system. We demonstrate the reproducibility and biological significance of RB-TnSeq with Escherichia coli, Phaeobacter inhibens, Pseudomonas stutzeri, Shewanella amazonensis, and Shewanella oneidensis. To demonstrate the increased throughput of RB-TnSeq, we performed 387 successful genome-wide mutant fitness assays representing 130 different bacterium-carbon source combinations and identified 5,196 genes with significant phenotypes across the five bacteria. In P. inhibens, we used our mutant fitness data to identify genes important for the utilization of diverse carbon substrates, including a putative D-mannose isomerase that is required for mannitol catabolism. RB-TnSeq will enable the cost-effective functional annotation of diverse bacteria using mutant fitness profiling. A large challenge in microbiology is the functional assessment of the millions of uncharacterized genes identified by genome sequencing. Transposon mutagenesis coupled to next-generation sequencing (TnSeq) is a powerful approach to assign phenotypes and functions to genes.
However, the current strategies for TnSeq are

  10. Amino acid sequence of a mouse immunoglobulin mu chain.

    PubMed Central

    Kehry, M; Sibley, C; Fuhrman, J; Schilling, J; Hood, L E

    1979-01-01

    The complete amino acid sequence of the mouse mu chain from the BALB/c myeloma tumor MOPC 104E is reported. The C mu region contains four consecutive homology regions of approximately 110 residues and a COOH-terminal region of 19 residues. A comparison of this mu chain from mouse with a complete mu sequence from human (Ou) and a partial mu chain sequence from dog (Moo) reveals a striking gradient of increasing homology from the NH2-terminal to the COOH-terminal portion of these mu chains, with the former being the least and the latter the most highly conserved. Four of the five sites of carbohydrate attachment appear to be at identical residue positions when the constant regions of the mouse and human mu chains are compared. The mu chain of MOPC 104E has a carbohydrate moiety attached in the second hypervariable region. This is particularly interesting in view of the fact that MOPC 104E binds alpha-(1→3)-dextran, a simple carbohydrate. The structural and functional constraints imposed by these comparative sequence analyses are discussed. PMID:111247

  11. New Tools For Understanding Microbial Diversity Using High-throughput Sequence Data

    NASA Astrophysics Data System (ADS)

    Knight, R.; Hamady, M.; Liu, Z.; Lozupone, C.

    2007-12-01

    High-throughput sequencing techniques such as 454 are straining the limits of tools traditionally used to build trees, choose OTUs, and perform other essential sequencing tasks. We have developed a workflow for phylogenetic analysis of large-scale sequence data sets that combines existing tools, such as the Arb phylogeny package and the NAST multiple sequence alignment tool, with new methods for choosing and clustering OTUs and for performing phylogenetic community analysis with UniFrac. This talk discusses the cyberinfrastructure we are developing to support the human microbiome project, and the application of these workflows to analyze very large data sets that contrast the gut microbiota with a range of physical environments. These tools will ultimately help to define core and peripheral microbiomes in a range of environments, and will allow us to understand the physical and biotic factors that contribute most to differences in microbial diversity.

  12. High Sequence Variability, Diverse Subcellular Localizations, and Ecological Implications of Alkaline Phosphatase in Dinoflagellates and Other Eukaryotic Phytoplankton

    PubMed Central

    Lin, Xin; Zhang, Huan; Cui, Yudong; Lin, Senjie

    2012-01-01

    Alkaline phosphatase (AP) is a key enzyme for phytoplankton to utilize dissolved organic phosphorus (DOP) when dissolved inorganic phosphorus is limited. While three major types of AP and their correspondingly diverse subcellular localization have been recognized in bacteria, little is known about AP in eukaryotic phytoplankton such as dinoflagellates. Here, we isolated a full-length AP cDNA from a latest-diverging dinoflagellate genus Alexandrium, and conducted comparative analyses with homologs from a relatively basal (Amphidinium carterae) and late-diverging (Karenia brevis) lineage of dinoflagellates as well as other eukaryotic algae. New data and previous studies indicate that AP is common in dinoflagellates and most other major eukaryotic groups of phytoplankton. AP sequences are more variable than many other genes studied in dinoflagellates, and are divergent among different eukaryotic phytoplankton lineages. Sequence comparison to the other characterized APs suggests that dinoflagellates and some other eukaryotic phytoplankton possess the putative AP as phoA type, but some other eukaryotic phytoplankton seem to have other types. Phylogenetic analyses based on AP amino acid sequences indicated that the “red-type” eukaryotic lineages formed a monophyletic group, suggesting a common origin of their APs. As different amino acid sequences have been found to predictably determine different spatial distribution in the cells, which may facilitate access to different pools of DOP, existing computational models were adopted to predict the subcellular localizations of putative AP in the three dinoflagellates and other eukaryotic phytoplankton. Results showed different subcellular localizations of APs in different dinoflagellates and other lineages. The linkage between AP sequence divergence, subcellular localization, and ecological niche differentiation requires rigorous experimental verification, and this study now provides a framework for such a future effort.

  13. Ultrasensitive nucleic acid sequence detection by single-molecule electrophoresis

    SciTech Connect

    Castro, A; Shera, E.B.

    1996-09-01

    This is the final report of a one-year laboratory-directed research and development project at Los Alamos National Laboratory. There has been considerable interest in the development of very sensitive clinical diagnostic techniques over the last few years. Many pathogenic agents are often present in extremely small concentrations in clinical samples, especially at the initial stages of infection, making their detection very difficult. This project sought to develop a new technique for the detection and accurate quantification of specific bacterial and viral nucleic acid sequences in clinical samples. The scheme involved the use of novel hybridization probes for the detection of nucleic acids combined with our recently developed technique of single-molecule electrophoresis. This project is directly relevant to the DOE's Defense Programs strategic directions in the area of biological warfare counter-proliferation.

  14. Novel chytrid lineages dominate fungal sequences in diverse marine and freshwater habitats

    NASA Astrophysics Data System (ADS)

    Comeau, André M.; Vincent, Warwick F.; Bernier, Louis; Lovejoy, Connie

    2016-07-01

    In aquatic environments, fungal communities remain little studied despite their taxonomic and functional diversity. To extend the ecological coverage of this group, we conducted an in-depth analysis of fungal sequences within our collection of 3.6 million V4 18S rRNA pyrosequences originating from 319 individual marine (including sea-ice) and freshwater samples from libraries generated within diverse projects studying Arctic and temperate biomes in the past decade. Among the ~1.7 million post-filtered reads of highest taxonomic and phylogenetic quality, 23,263 fungal sequences were identified. The overall mean proportion was 1.35%, but with large variability; for example, from 0.01 to 59% of total sequences for Arctic seawater samples. Almost all sample types were dominated by Chytridiomycota-like sequences, followed by moderate-to-minor contributions of Ascomycota, Cryptomycota and Basidiomycota. Species and/or strain richness was high, with many novel sequences and high niche separation. The affinity of the most common reads to phytoplankton parasites suggests that aquatic fungi deserve renewed attention for their role in algal succession and carbon cycling.

  15. Novel chytrid lineages dominate fungal sequences in diverse marine and freshwater habitats

    PubMed Central

    Comeau, André M.; Vincent, Warwick F.; Bernier, Louis; Lovejoy, Connie

    2016-01-01

    In aquatic environments, fungal communities remain little studied despite their taxonomic and functional diversity. To extend the ecological coverage of this group, we conducted an in-depth analysis of fungal sequences within our collection of 3.6 million V4 18S rRNA pyrosequences originating from 319 individual marine (including sea-ice) and freshwater samples from libraries generated within diverse projects studying Arctic and temperate biomes in the past decade. Among the ~1.7 million post-filtered reads of highest taxonomic and phylogenetic quality, 23,263 fungal sequences were identified. The overall mean proportion was 1.35%, but with large variability; for example, from 0.01 to 59% of total sequences for Arctic seawater samples. Almost all sample types were dominated by Chytridiomycota-like sequences, followed by moderate-to-minor contributions of Ascomycota, Cryptomycota and Basidiomycota. Species and/or strain richness was high, with many novel sequences and high niche separation. The affinity of the most common reads to phytoplankton parasites suggests that aquatic fungi deserve renewed attention for their role in algal succession and carbon cycling. PMID:27444055

  16. Global Genomic Diversity of Human Papillomavirus 11 Based on 433 Isolates and 78 Complete Genome Sequences

    PubMed Central

    Jelen, Mateja M.; Chen, Zigui; Kocjan, Boštjan J.; Hošnjak, Lea; Burt, Felicity J.; Chan, Paul K. S.; Chouhy, Diego; Combrinck, Catharina E.; Estrade, Christine; Fiander, Alison; Garland, Suzanne M.; Giri, Adriana A.; González, Joaquín Víctor; Gröning, Arndt; Hibbitts, Sam; Luk, Tommy N. M.; Marinic, Karina; Matsukura, Toshihiko; Neumann, Anna; Oštrbenk, Anja; Picconi, Maria Alejandra; Sagadin, Martin; Sahli, Roland; Seedat, Riaz Y.; Seme, Katja; Severini, Alberto; Sinchi, Jessica L.; Smahelova, Jana; Tabrizi, Sepehr N.; Tachezy, Ruth; Tohme Faybush, Sarah; Uloza, Virgilijus; Uloziene, Ingrida; Wong, Yong Wee; Židovec Lepej, Snježana; Burk, Robert D.

    2016-01-01

    ABSTRACT Human papillomavirus 11 (HPV11) is an etiological agent of anogenital warts and laryngeal papillomas and is included in the 4-valent and 9-valent prophylactic HPV vaccines. We established the largest collection of globally circulating HPV11 isolates to date and examined the genomic diversity of 433 isolates and 78 complete genomes (CGs) from six continents. The genomic variation within the 2,800-bp E5a-E5b-L1-upstream regulatory region was initially studied in 181/207 (87.4%) HPV11 isolates collected for this study. Of these, the CGs of 30 HPV11 variants containing unique single nucleotide polymorphisms (SNPs), indels (insertions or deletions), or amino acid changes were fully sequenced. A maximum likelihood tree based on the global alignment of 78 HPV11 CGs (30 CGs from our study and 48 CGs from GenBank) revealed two HPV11 lineages (lineages A and B) and four sublineages (sublineages A1, A2, A3, and A4). HPV11 (sub)lineage-specific SNPs within the CG were identified, as well as the 208-bp representative region for CG-based phylogenetic clustering within the partial E2 open reading frame and noncoding region 2. Globally, sublineage A2 was the most prevalent, followed by sublineages A1, A3, and A4 and lineage B. IMPORTANCE This collaborative international study defined the global heterogeneity of HPV11 and established the largest collection of globally circulating HPV11 genomic variants to date. Thirty novel complete HPV11 genomes were determined and submitted to the available sequence repositories. Global phylogenetic analysis revealed two HPV11 variant lineages and four sublineages. The HPV11 (sub)lineage-specific SNPs and the representative region identified within the partial genomic region E2/noncoding region 2 (NCR2) will enable the simpler identification and comparison of HPV11 variants worldwide. This study provides an important knowledge base for HPV11 for future studies in HPV epidemiology, evolution, pathogenicity, prevention, and molecular assay

  17. Phylogenetic diversity in the genus Bacillus as seen by 16S rRNA sequencing studies.

    PubMed

    Rössler, D; Ludwig, W; Schleifer, K H; Lin, C; McGill, T J; Wisotzkey, J D; Jurtshuk, P; Fox, G E

    1991-01-01

    Comparative sequence analysis of 16S ribosomal (r)RNAs or DNAs of Bacillus alvei, B. laterosporus, B. macerans, B. macquariensis, B. polymyxa and B. stearothermophilus revealed the phylogenetic diversity of the genus Bacillus. Based on the presently available data set of 16S rRNA sequences from bacilli and relatives, at least four major "Bacillus clusters" can be defined: a "Bacillus subtilis cluster" including B. stearothermophilus, a "B. brevis cluster" including B. laterosporus, a "B. alvei cluster" including B. macerans, B. macquariensis and B. polymyxa and a "B. cycloheptanicus branch".

  18. Phylogenetic diversity in the genus Bacillus as seen by 16S rRNA sequencing studies

    NASA Technical Reports Server (NTRS)

    Rossler, D.; Ludwig, W.; Schleifer, K. H.; Lin, C.; McGill, T. J.; Wisotzkey, J. D.; Jurtshuk, P. Jr; Fox, G. E.

    1991-01-01

    Comparative sequence analysis of 16S ribosomal (r)RNAs or DNAs of Bacillus alvei, B. laterosporus, B. macerans, B. macquariensis, B. polymyxa and B. stearothermophilus revealed the phylogenetic diversity of the genus Bacillus. Based on the presently available data set of 16S rRNA sequences from bacilli and relatives, at least four major "Bacillus clusters" can be defined: a "Bacillus subtilis cluster" including B. stearothermophilus, a "B. brevis cluster" including B. laterosporus, a "B. alvei cluster" including B. macerans, B. macquariensis and B. polymyxa and a "B. cycloheptanicus branch".

  19. Genome sequence diversity and clues to the evolution of variola (smallpox) virus.

    PubMed

    Esposito, Joseph J; Sammons, Scott A; Frace, A Michael; Osborne, John D; Olsen-Rasmussen, Melissa; Zhang, Ming; Govil, Dhwani; Damon, Inger K; Kline, Richard; Laker, Miriam; Li, Yu; Smith, Geoffrey L; Meyer, Hermann; Leduc, James W; Wohlhueter, Robert M

    2006-08-11

    Comparative genomics of 45 epidemiologically varied variola virus isolates from the past 30 years of the smallpox era indicate low sequence diversity, suggesting that there is probably little difference in the isolates' functional gene content. Phylogenetic clustering inferred three clades coincident with their geographical origin and case-fatality rate; the latter implicated putative proteins that mediate viral virulence differences. Analysis of the viral linear DNA genome suggests that its evolution involved direct descent and DNA end-region recombination events. Knowing the sequences will help understand the viral proteome and improve diagnostic test precision, therapeutics, and systems for their assessment.

  20. Increasing Sequence Diversity with Flexible Backbone Protein Design: The Complete Redesign of a Protein Hydrophobic Core

    SciTech Connect

    Murphy, Grant S.; Mills, Jeffrey L.; Miley, Michael J.; Machius, Mischa; Szyperski, Thomas; Kuhlman, Brian

    2015-10-15

    Protein design tests our understanding of protein stability and structure. Successful design methods should allow the exploration of sequence space not found in nature. However, when redesigning naturally occurring protein structures, most fixed backbone design algorithms return amino acid sequences that share strong sequence identity with wild-type sequences, especially in the protein core. This behavior places a restriction on functional space that can be explored and is not consistent with observations from nature, where sequences of low identity have similar structures. Here, we allow backbone flexibility during design to mutate every position in the core (38 residues) of a four-helix bundle protein. Only small perturbations to the backbone, 1-2 Å, were needed to entirely mutate the core. The redesigned protein, DRNN, is exceptionally stable (melting point >140 °C). An NMR and X-ray crystal structure show that the side chains and backbone were accurately modeled (all-atom RMSD = 1.3 Å).

  1. Characterization of the microbial acid mine drainage microbial community using culturing and direct sequencing techniques.

    PubMed

    Auld, Ryan R; Myre, Maxine; Mykytczuk, Nadia C S; Leduc, Leo G; Merritt, Thomas J S

    2013-05-01

    We characterized the bacterial community from an AMD tailings pond using both classical culturing and modern direct sequencing techniques and compared the two methods. Acid mine drainage (AMD) is produced by the environmental and microbial oxidation of minerals dissolved from mining waste. Surprisingly, we know little about the microbial communities associated with AMD, despite the fundamental ecological roles of these organisms and large-scale economic impact of these waste sites. AMD microbial communities have classically been characterized by laboratory culturing-based techniques and more recently by direct sequencing of marker gene sequences, primarily the 16S rRNA gene. In our comparison of the techniques, we find that their results are complementary, overall indicating very similar community structure with similar dominant species, but with each method identifying some species that were missed by the other. We were able to culture the majority of species that our direct sequencing results indicated were present, primarily species within the Acidithiobacillus and Acidiphilium genera, although estimates of relative species abundance were only obtained from direct sequencing. Interestingly, our culture-based methods recovered four species that had been overlooked from our sequencing results because of the rarity of the marker gene sequences, likely members of the rare biosphere. Further, direct sequencing indicated that a single genus, completely missed in our culture-based study, Legionella, was a dominant member of the microbial community. Our results suggest that while either method does a reasonable job of identifying the dominant members of the AMD microbial community, together the methods combine to give a more complete picture of the true diversity of this environment.

  2. Microbiology of diverse acidic and non-acidic microhabitats within a sulfidic ore mine.

    PubMed

    Falteisek, Lukáš; Cepička, Ivan

    2012-11-01

    A wide variety of microhabitats within the extremely acidic abandoned underground copper mine Zlaté Hory (Czech Republic) was investigated. SSU rDNA libraries were analyzed from 15 samples representing gossan, sulfide-leaching environments in the oxidation zone, and acidic water springs in the mine galleries. Microbial analyses were extended by analyses of chemical composition of water and solid phases and identification of arising secondary minerals. The microbial communities of the three main classes of microenvironments differed in almost every aspect. Among others, ecological partitioning of Acidithiobacillus ferrooxidans and the recently described A. ferrivorans was observed. Distinct types of communities inhabiting the water springs were detected. The more extreme springs (pH <3, conductivity >2 mS/cm) were inhabited by "Ferrovum" spp. and A. ferrivorans, whereas Gallionella sp. dominated the less extreme ones. A new role for gossan in the extremely acidic ecosystem is proposed. This zone was inhabited by a large diversity of neutrophilic heterotrophs that appeared to be continuously washed out to the acidic environments localized downstream. Five species originating in gossan were found in several acidic habitats. Here they can survive and probably serve as scavengers of dead biomass, particularly from chemoautotrophic growths. No such process has been described from acidic mine environments so far.

  3. Using mitochondrial nucleotide sequences to investigate diversity and genealogical relationships within common carp (Cyprinus carpio L.).

    PubMed

    Thai, B T; Burridge, C P; Pham, T A; Austin, C M

    2005-02-01

    Direct sequencing of mitochondrial DNA (mtDNA) D-loop (745 bp) and MTATPase6/MTATPase8 (857 bp) regions was used to investigate genetic variation within common carp and develop a global genealogy of common carp strains. The D-loop region was more variable than the MTATPase6/MTATPase8 region, but given the wide distribution of carp the overall levels of sequence divergence were low. Levels of haplotype diversity varied widely among countries with Chinese, Indonesian and Vietnamese carp showing the greatest diversity whereas Japanese Koi and European carp had undetectable nucleotide variation. A genealogical analysis supports a close relationship between Vietnamese, Koi and Chinese Color carp strains and to a lesser extent, European carp. Chinese and Indonesian carp strains were the most divergent, and their relationships do not support the evolution of independent Asian and European lineages and current taxonomic treatments.

  4. Culturable and molecular phylogenetic diversity of microorganisms in an open-dumped, extremely acidic Pb/Zn mine tailings.

    PubMed

    Tan, Gui-Liang; Shu, Wen-Sheng; Hallberg, Kevin B; Li, Fang; Lan, Chong-Yu; Zhou, Wen-Hua; Huang, Li-Nan

    2008-09-01

    A combination of cultivation-based and molecular-based approaches was used to reveal the culturable and molecular diversity of the microbes inhabiting an open-dumped Pb/Zn mine tailings that was undergoing intensive acid generation (pH 1.9). Culturable bacteria found in the extremely acidic mine tailings were Acidithiobacillus ferrooxidans, Leptospirillum ferriphilum, Sulfobacillus thermotolerans and Acidiphilium cryptum, where the number of acidophilic heterotrophs was ten times higher than that of the iron- and sulfur-oxidizing bacteria. Cloning and phylogenetic analysis revealed that, in contrast to the adjacent AMD, the mine tailings possessed a low microbial diversity with archaeal sequence types dominating the 16S rRNA gene library. Of the 141 clones examined, 132 were represented by two sequence types phylogenetically affiliated with the iron-oxidizing archaea Ferroplasma acidiphilum and three belonged to two tentative groups within the Thermoplasma lineage so far represented by only a few environmental sequences. Six clones in the library were represented by the only bacterial sequence type and were closely related to the well-described iron-oxidizer L. ferriphilum. The significant differences in the prokaryotic community structures of the extremely acidic mine tailings and the AMD associated with it highlights the importance of studying the microbial communities that are more directly involved in the iron and sulfur cycles of mine tailings.

  5. Diversity of lactic acid bacteria in two Flemish artisan raw milk Gouda-type cheeses.

    PubMed

    Van Hoorde, Koenraad; Verstraete, Tine; Vandamme, Peter; Huys, Geert

    2008-10-01

    PCR-denaturing gradient gel electrophoresis (PCR-DGGE) was used to study the diversity of lactic acid bacteria (LAB) in two Flemish artisan raw milk Gouda-type cheeses. In parallel, conventional culturing was performed. Isolates were identified using (GTG)5-PCR and sequence analysis of 16S rRNA and pheS genes. Discriminant analysis revealed some differences in overall LAB diversity between the two batches and between the two cheeses. Within each batch, the diversity of 8- and 12-week-old cheeses was relatively similar. Conventional isolation mainly revealed the presence of Lactobacillus paracasei, Lactobacillus plantarum, Lactobacillus brevis, Lactobacillus rhamnosus and Pediococcus pentosaceus. PCR-DGGE revealed the presence of three species of which no isolates were recovered, i.e. Enterococcus faecalis, Lactobacillus parabuchneri and Lactobacillus gallinarum. Conversely, not all isolated bacteria were detected by PCR-DGGE. We recommend the integrated use of culture-dependent and -independent approaches to maximally encompass the taxonomic spectrum of LAB occurring in Gouda-type and other cheeses.

  6. Microbial Diversity and Population Structure of Extremely Acidic Sulfur-Oxidizing Biofilms From Sulfidic Caves

    NASA Astrophysics Data System (ADS)

    Jones, D.; Stoffer, T.; Lyon, E. H.; Macalady, J. L.

    2005-12-01

    Extremely acidic (pH 0-1) microbial biofilms called snottites form on the walls of sulfidic caves where gypsum replacement crusts isolate sulfur-oxidizing microorganisms from the buffering action of limestone host rock. We investigated the phylogeny and population structure of snottites from sulfidic caves in central Italy using full cycle rRNA methods. A small subunit rRNA bacterial clone library from a Frasassi cave complex snottite sample contained a single sequence group (>60 clones) similar to Acidithiobacillus thiooxidans. Bacterial and universal rRNA clone libraries from other Frasassi snottites were only slightly more diverse, containing a maximum of 4 bacterial species and probably 2 archaeal species. Fluorescence in situ hybridization (FISH) of snottites from Frasassi and from the much warmer Rio Garrafo cave complex revealed that all of the communities are simple (low-diversity) and dominated by Acidithiobacillus and/or Ferroplasma species, with smaller populations of an Acidimicrobium species, filamentous fungi, and protists. Our results suggest that sulfidic cave snottites will be excellent model microbial ecosystems suited for ecological and metagenomic studies aimed at elucidating geochemical and ecological controls on microbial diversity, and at mapping the spatial history of microbial evolutionary events such as adaptations, recombinations and gene transfers.

  7. Using sequence similarity networks for visualization of relationships across diverse protein superfamilies.

    PubMed

    Atkinson, Holly J; Morris, John H; Ferrin, Thomas E; Babbitt, Patricia C

    2009-01-01

    The dramatic increase in heterogeneous types of biological data--in particular, the abundance of new protein sequences--requires fast and user-friendly methods for organizing this information in a way that enables functional inference. The most widely used strategy to link sequence or structure to function, homology-based function prediction, relies on the fundamental assumption that sequence or structural similarity implies functional similarity. New tools that extend this approach are still urgently needed to associate sequence data with biological information in ways that accommodate the real complexity of the problem, while being accessible to experimental as well as computational biologists. To address this, we have examined the application of sequence similarity networks for visualizing functional trends across protein superfamilies from the context of sequence similarity. Using three large groups of homologous proteins of varying types of structural and functional diversity--GPCRs and kinases from humans, and the crotonase superfamily of enzymes--we show that overlaying networks with orthogonal information is a powerful approach for observing functional themes and revealing outliers. In comparison to other primary methods, networks provide both a good representation of group-wise sequence similarity relationships and a strong visual and quantitative correlation with phylogenetic trees, while enabling analysis and visualization of much larger sets of sequences than trees or multiple sequence alignments can easily accommodate. We also define important limitations and caveats in the application of these networks. As a broadly accessible and effective tool for the exploration of protein superfamilies, sequence similarity networks show great potential for generating testable hypotheses about protein structure-function relationships.

  8. AST: an automated sequence-sampling method for improving the taxonomic diversity of gene phylogenetic trees.

    PubMed

    Zhou, Chan; Mao, Fenglou; Yin, Yanbin; Huang, Jinling; Gogarten, Johann Peter; Xu, Ying

    2014-01-01

    A challenge in phylogenetic inference of gene trees is how to properly sample a large pool of homologous sequences to derive a good representative subset of sequences. Such a need arises in various applications, e.g. when (1) accuracy-oriented phylogenetic reconstruction methods may not be able to deal with a large pool of sequences due to their high demand in computing resources; (2) applications analyzing a collection of gene trees may prefer to use trees with fewer operational taxonomic units (OTUs), for instance for the detection of horizontal gene transfer events by identifying phylogenetic conflicts; and (3) the pool of available sequences is biased towards extensively studied species. In the past, the creation of subsamples often relied on manual selection. Here we present an Automated sequence-Sampling method for improving the Taxonomic diversity of gene phylogenetic trees, AST, to obtain representative sequences that maximize the taxonomic diversity of the sampled sequences. To demonstrate the effectiveness of AST, we have tested it to solve four problems, namely, inference of the evolutionary histories of the small ribosomal subunit protein S5 of E. coli, 16S ribosomal RNAs and glycosyl-transferase gene family 8, and a study of ancient horizontal gene transfers from bacteria to plants. Our results show that the resolution of our computational results is almost as good as that of manual inference by domain experts, hence making the tool generally useful to phylogenetic studies by non-phylogeny specialists. The program is available at http://csbl.bmb.uga.edu/~zhouchan/AST.php.

  9. AST: An Automated Sequence-Sampling Method for Improving the Taxonomic Diversity of Gene Phylogenetic Trees

    PubMed Central

    Zhou, Chan; Mao, Fenglou; Yin, Yanbin; Huang, Jinling; Gogarten, Johann Peter; Xu, Ying

    2014-01-01

    A challenge in phylogenetic inference of gene trees is how to properly sample a large pool of homologous sequences to derive a good representative subset of sequences. Such a need arises in various applications, e.g. when (1) accuracy-oriented phylogenetic reconstruction methods may not be able to deal with a large pool of sequences due to their high demand in computing resources; (2) applications analyzing a collection of gene trees may prefer to use trees with fewer operational taxonomic units (OTUs), for instance for the detection of horizontal gene transfer events by identifying phylogenetic conflicts; and (3) the pool of available sequences is biased towards extensively studied species. In the past, the creation of subsamples often relied on manual selection. Here we present an Automated sequence-Sampling method for improving the Taxonomic diversity of gene phylogenetic trees, AST, to obtain representative sequences that maximize the taxonomic diversity of the sampled sequences. To demonstrate the effectiveness of AST, we have tested it to solve four problems, namely, inference of the evolutionary histories of the small ribosomal subunit protein S5 of E. coli, 16S ribosomal RNAs and glycosyl-transferase gene family 8, and a study of ancient horizontal gene transfers from bacteria to plants. Our results show that the resolution of our computational results is almost as good as that of manual inference by domain experts, hence making the tool generally useful to phylogenetic studies by non-phylogeny specialists. The program is available at http://csbl.bmb.uga.edu/~zhouchan/AST.php. PMID:24892935

  10. Multilocus sequence analysis reveals high genetic diversity in clinical isolates of Burkholderia cepacia complex from India.

    PubMed

    Gautam, Vikas; Patil, Prashant P; Kumar, Sunil; Midha, Samriti; Kaur, Mandeep; Kaur, Satinder; Singh, Meenu; Mali, Swapna; Shastri, Jayanthi; Arora, Anita; Ray, Pallab; Patil, Prabhu B

    2016-10-21

    Burkholderia cepacia complex (Bcc) is a complex group of bacteria causing opportunistic infections in immunocompromised and cystic fibrosis (CF) patients. Herein, we report multilocus sequence typing and analysis of the 57 clinical isolates of Bcc collected over the period of seven years (2005-2012) from several hospitals across India. A total of 21 sequence types (ST) including two STs from cystic fibrosis patient's isolates and twelve novel STs were identified in the population reflecting the extent of genetic diversity. Multilocus sequence analysis revealed two lineages in population, a major lineage belonging to B. cenocepacia and a minor lineage belonging to B. cepacia. Split-decomposition analysis suggests absence of interspecies recombination and intraspecies recombination contributed in generating genotypic diversity amongst isolates. Further linkage disequilibrium analysis indicates that recombination takes place at a low frequency, which is not sufficient to break down the clonal relationship. This knowledge of the genetic structure of Bcc population from a rapidly developing country will be invaluable in the epidemiology, surveillance and understanding global diversity of this group of a pathogen.

  11. Multilocus sequence analysis reveals high genetic diversity in clinical isolates of Burkholderia cepacia complex from India

    PubMed Central

    Gautam, Vikas; Patil, Prashant P.; Kumar, Sunil; Midha, Samriti; Kaur, Mandeep; Kaur, Satinder; Singh, Meenu; Mali, Swapna; Shastri, Jayanthi; Arora, Anita; Ray, Pallab; Patil, Prabhu B.

    2016-01-01

    Burkholderia cepacia complex (Bcc) is a complex group of bacteria causing opportunistic infections in immunocompromised and cystic fibrosis (CF) patients. Herein, we report multilocus sequence typing and analysis of the 57 clinical isolates of Bcc collected over the period of seven years (2005–2012) from several hospitals across India. A total of 21 sequence types (ST) including two STs from cystic fibrosis patient’s isolates and twelve novel STs were identified in the population reflecting the extent of genetic diversity. Multilocus sequence analysis revealed two lineages in population, a major lineage belonging to B. cenocepacia and a minor lineage belonging to B. cepacia. Split-decomposition analysis suggests absence of interspecies recombination and intraspecies recombination contributed in generating genotypic diversity amongst isolates. Further linkage disequilibrium analysis indicates that recombination takes place at a low frequency, which is not sufficient to break down the clonal relationship. This knowledge of the genetic structure of Bcc population from a rapidly developing country will be invaluable in the epidemiology, surveillance and understanding global diversity of this group of a pathogen. PMID:27767197

  12. Diversity of the Cronobacter Genus as Revealed by Multilocus Sequence Typing

    PubMed Central

    Joseph, S.; Sonbol, H.; Hariri, S.; Desai, P.; McClelland, M.

    2012-01-01

    Cronobacter (previously known as Enterobacter sakazakii) is a diverse bacterial genus consisting of seven species: C. sakazakii, C. malonaticus, C. turicensis, C. universalis, C. muytjensii, C. dublinensis, and C. condimenti. In this study, we have used a multilocus sequence typing (MLST) approach employing the alleles of 7 genes (atpD, fusA, glnS, gltB, gyrB, infB, and ppsA; total length, 3,036 bp) to investigate the phylogenetic relationship of 325 Cronobacter species isolates. Strains were chosen on the basis of their species, geographic and temporal distribution, source, and clinical outcome. The earliest strain was isolated from milk powder in 1950, and the earliest clinical strain was isolated in 1953. The existence of seven species was supported by MLST. Intraspecific variation ranged from low diversity in C. sakazakii to extensive diversity within some species, such as C. muytjensii and C. dublinensis, including evidence of gene conversion between species. The predominant species from clinical sources was found to be C. sakazakii. C. sakazakii sequence type 4 (ST4) was the predominant sequence type of cerebral spinal fluid isolates from cases of meningitis. PMID:22785185

  13. Genetic diversity analysis of Gossypium arboreum germplasm accessions using genotyping-by-sequencing.

    PubMed

    Li, Ruijuan; Erpelding, John E

    2016-10-01

    The diploid cotton species Gossypium arboreum possesses many favorable agronomic traits such as drought tolerance and disease resistance, which can be utilized in the development of improved upland cotton cultivars. The USDA National Plant Germplasm System maintains more than 1600 G. arboreum accessions. Little information is available on the genetic diversity of the collection thereby limiting the utilization of this cotton species. The genetic diversity and population structure of the G. arboreum germplasm collection were assessed by genotyping-by-sequencing of 375 accessions. Using genome-wide single nucleotide polymorphism sequence data, two major clusters were inferred with 302 accessions in Cluster 1, 64 accessions in Cluster 2, and nine accessions unassigned due to their nearly equal membership to each cluster. These two clusters were further evaluated independently resulting in the identification of two sub-clusters for the 302 Cluster 1 accessions and three sub-clusters for the 64 Cluster 2 accessions. Low to moderate genetic diversity between clusters and sub-clusters were observed indicating a narrow genetic base. Cluster 2 accessions were more genetically diverse and the majority of the accessions in this cluster were landraces. In contrast, Cluster 1 is composed of varieties or breeding lines more recently added to the collection. The majority of the accessions had kinship values ranging from 0.6 to 0.8. Eight pairs of accessions were identified as potential redundancies due to their high kinship relatedness. The genetic diversity and genotype data from this study are essential to enhance germplasm utilization to identify genetically diverse accessions for the detection of quantitative trait loci associated with important traits that would benefit upland cotton improvement.

  14. Nucleic acid (cDNA) and amino acid sequences of alpha-type gliadins from wheat (Triticum aestivum).

    PubMed Central

    Kasarda, D D; Okita, T W; Bernardin, J E; Baecker, P A; Nimmo, C C; Lew, E J; Dietler, M D; Greene, F C

    1984-01-01

    The complete amino acid sequence for an alpha-type gliadin protein of wheat (Triticum aestivum Linnaeus) endosperm has been derived from a cloned cDNA sequence. An additional cDNA clone that corresponds to about 75% of a similar alpha-type gliadin has been sequenced and shows some important differences. About 97% of the composite sequence of A-gliadin (an alpha-type gliadin fraction) has also been obtained by direct amino acid sequencing. This sequence shows a high degree of similarity with amino acid sequences derived from both cDNA clones and is virtually identical to one of them. On the basis of sequence information, after loss of the signal sequence, the mature alpha-type gliadins may be divided into five different domains, two of which may have evolved from an ancestral gliadin gene, whereas the remaining three contain repeating sequences that may have developed independently. PMID:6589619

  15. Comparison of a High-Resolution Melting Assay to Next-Generation Sequencing for Analysis of HIV Diversity

    PubMed Central

    Cousins, Matthew M.; Ou, San-San; Wawer, Maria J.; Munshaw, Supriya; Swan, David; Magaret, Craig A.; Mullis, Caroline E.; Serwadda, David; Porcella, Stephen F.; Gray, Ronald H.; Quinn, Thomas C.; Donnell, Deborah; Eshleman, Susan H.

    2012-01-01

    Next-generation sequencing (NGS) has recently been used for analysis of HIV diversity, but this method is labor-intensive, costly, and requires complex protocols for data analysis. We compared diversity measures obtained using NGS data to those obtained using a diversity assay based on high-resolution melting (HRM) of DNA duplexes. The HRM diversity assay provides a single numeric score that reflects the level of diversity in the region analyzed. HIV gag and env from individuals in Rakai, Uganda, were analyzed in a previous study using NGS (n = 220 samples from 110 individuals). Three sequence-based diversity measures were calculated from the NGS sequence data (percent diversity, percent complexity, and Shannon entropy). The amplicon pools used for NGS were analyzed with the HRM diversity assay. HRM scores were significantly associated with sequence-based measures of HIV diversity for both gag and env (P < 0.001 for all measures). The level of diversity measured by the HRM diversity assay and NGS increased over time in both regions analyzed (P < 0.001 for all measures except for percent complexity in gag), and similar amounts of diversification were observed with both methods (P < 0.001 for all measures except for percent complexity in gag). Diversity measures obtained using the HRM diversity assay were significantly associated with those from NGS, and similar increases in diversity over time were detected by both methods. The HRM diversity assay is faster and less expensive than NGS, facilitating rapid analysis of large studies of HIV diversity and evolution. PMID:22785188
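    One of the sequence-based measures named in the abstract above, Shannon entropy, can be illustrated with a minimal sketch. The function name, the use of the natural logarithm, and the toy read lists are assumptions made for illustration only; the abstract does not state how the study computed the measure.

```python
import math
from collections import Counter


def shannon_entropy(reads):
    """Shannon entropy (natural log) of the variant frequencies in one sample.

    `reads` is a list of sequence strings; identical strings count as one
    variant. This is a generic textbook definition, not the study's pipeline.
    """
    counts = Counter(reads)
    total = sum(counts.values())
    return -sum((c / total) * math.log(c / total) for c in counts.values())


# Hypothetical toy samples: one homogeneous, one with two equally common variants.
homogeneous = ["ACGT", "ACGT", "ACGT", "ACGT"]
mixed = ["ACGT", "ACGT", "ACGA", "ACGA"]

print(shannon_entropy(homogeneous))
print(shannon_entropy(mixed))
```

    A sample with a single variant scores zero, and the score rises as reads spread more evenly over more variants, which is the sense in which the measure tracks diversity.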

  16. The pig gut microbial diversity: Understanding the pig gut microbial ecology through the next generation high throughput sequencing.

    PubMed

    Kim, Hyeun Bum; Isaacson, Richard E

    2015-06-12

    The importance of the gut microbiota of animals is widely acknowledged because of its pivotal roles in the health and well-being of animals. The genetic diversity of the gut microbiota contributes to the overall development and metabolic needs of the animal, and provides the host with many beneficial functions including production of volatile fatty acids, recycling of bile salts, production of vitamin K, cellulose digestion, and development of the immune system. Thus the intestinal microbiota of animals has been the subject of study for many decades. Although most of the older studies used culture-dependent methods, the recent advent of high-throughput sequencing of 16S rRNA genes has facilitated in-depth studies exploring microbial populations and their dynamics in the animal gut. These culture-independent, DNA-based studies generate large amounts of data and as a result contribute to a more detailed understanding of the microbiota dynamics in the gut and the ecology of the microbial populations. Of equal importance is the ability to identify and quantify microbes that are difficult to grow or that have not been grown in the laboratory. Interpreting the data obtained from this type of study requires using basic principles of microbial diversity to understand the importance of the composition of microbial populations. In this review, we summarize the literature on culture-independent studies of the pig gut microbiota with an emphasis on its succession and alterations caused by diverse factors.

  17. Strategies for achieving high sequencing accuracy for low diversity samples and avoiding sample bleeding using illumina platform.

    PubMed

    Mitra, Abhishek; Skrzypczak, Magdalena; Ginalski, Krzysztof; Rowicka, Maga

    2015-01-01

    Sequencing microRNA, reduced representation sequencing, Hi-C technology and any method requiring the use of in-house barcodes result in sequencing libraries with low initial sequence diversity. Sequencing such data on the Illumina platform typically produces low quality data due to the limitations of the Illumina cluster calling algorithm. Moreover, even in the case of diverse samples, these limitations cause substantial inaccuracies in multiplexed sample assignment (sample bleeding). Such inaccuracies are unacceptable in clinical applications, and in some other fields (e.g. detection of rare variants). Here, we discuss how both problems with quality of low-diversity samples and sample bleeding are caused by incorrect detection of clusters on the flowcell during initial sequencing cycles. We propose simple software modifications (Long Template Protocol) that overcome this problem. We present experimental results showing that our Long Template Protocol remarkably increases data quality for low diversity samples, as compared with the standard analysis protocol; it also substantially reduces sample bleeding for all samples. For comprehensiveness, we also discuss and compare experimental results from alternative approaches to sequencing low diversity samples. First, we discuss how the low diversity problem, if caused by barcodes, can be avoided altogether at the barcode design stage. Second and third, we present modified guidelines, which are more stringent than the manufacturer's, for mixing low diversity samples with diverse samples and lowering cluster density, which in our experience consistently produces high quality data from low diversity samples. Fourth and fifth, we present rescue strategies that can be applied when sequencing results in low quality data and when there is no more biological material available. In such cases, we propose that the flowcell be re-hybridized and sequenced again using our Long Template Protocol. Alternatively, we discuss how

  18. Strategies for Achieving High Sequencing Accuracy for Low Diversity Samples and Avoiding Sample Bleeding Using Illumina Platform

    PubMed Central

    Mitra, Abhishek; Skrzypczak, Magdalena; Ginalski, Krzysztof; Rowicka, Maga

    2015-01-01

    Sequencing microRNA, reduced representation sequencing, Hi-C technology and any method requiring the use of in-house barcodes result in sequencing libraries with low initial sequence diversity. Sequencing such data on the Illumina platform typically produces low quality data due to the limitations of the Illumina cluster calling algorithm. Moreover, even in the case of diverse samples, these limitations cause substantial inaccuracies in multiplexed sample assignment (sample bleeding). Such inaccuracies are unacceptable in clinical applications, and in some other fields (e.g. detection of rare variants). Here, we discuss how both problems with quality of low-diversity samples and sample bleeding are caused by incorrect detection of clusters on the flowcell during initial sequencing cycles. We propose simple software modifications (Long Template Protocol) that overcome this problem. We present experimental results showing that our Long Template Protocol remarkably increases data quality for low diversity samples, as compared with the standard analysis protocol; it also substantially reduces sample bleeding for all samples. For comprehensiveness, we also discuss and compare experimental results from alternative approaches to sequencing low diversity samples. First, we discuss how the low diversity problem, if caused by barcodes, can be avoided altogether at the barcode design stage. Second and third, we present modified guidelines, which are more stringent than the manufacturer’s, for mixing low diversity samples with diverse samples and lowering cluster density, which in our experience consistently produces high quality data from low diversity samples. Fourth and fifth, we present rescue strategies that can be applied when sequencing results in low quality data and when there is no more biological material available. In such cases, we propose that the flowcell be re-hybridized and sequenced again using our Long Template Protocol. Alternatively, we discuss how

  19. Structural gene and complete amino acid sequence of Vibrio alginolyticus collagenase.

    PubMed Central

    Takeuchi, H; Shibano, Y; Morihara, K; Fukushima, J; Inami, S; Keil, B; Gilles, A M; Kawamoto, S; Okuda, K

    1992-01-01

    The DNA encoding the collagenase of Vibrio alginolyticus was cloned, and its complete nucleotide sequence was determined. When the cloned gene was ligated to pUC18, the Escherichia coli expression vector, bacteria carrying the gene exhibited both collagenase antigen and collagenase activity. The open reading frame from the ATG initiation codon was 2442 bp in length for the collagenase structural gene. The amino acid sequence, deduced from the nucleotide sequence, revealed that the mature collagenase consists of 739 amino acids with an Mr of 81875. The amino acid sequences of 20 polypeptide fragments were completely identical with the deduced amino acid sequences of the collagenase gene. The amino acid composition predicted from the DNA sequence was similar to the chemically determined composition of purified collagenase reported previously. The analyses of both the DNA and amino acid sequences of the collagenase gene were rigorously performed, but we could not detect any significant sequence similarity to other collagenases. PMID:1311172

  20. Fungal diversity in grape must and wine fermentation assessed by massive sequencing, quantitative PCR and DGGE

    PubMed Central

    Wang, Chunxiao; García-Fernández, David; Mas, Albert; Esteve-Zarzoso, Braulio

    2015-01-01

    The diversity of fungi in grape must and during wine fermentation was investigated in this study by culture-dependent and culture-independent techniques. Carignan and Grenache grapes were harvested from three vineyards in the Priorat region (Spain) in 2012, and nine samples were selected from the grape must after crushing and during wine fermentation. From culture-dependent techniques, 362 isolates were randomly selected and identified by 5.8S-ITS-RFLP and 26S-D1/D2 sequencing. Meanwhile, genomic DNA was extracted directly from the nine samples and analyzed by qPCR, DGGE and massive sequencing. The results indicated that grape must after crushing harbored a high species richness of fungi with Aspergillus tubingensis, Aureobasidium pullulans, or Starmerella bacillaris as the dominant species. As fermentation proceeded, the species richness decreased, and yeasts such as Hanseniaspora uvarum, Starmerella bacillaris and Saccharomyces cerevisiae successively occupied the must samples. The “terroir” characteristics of the fungus population are more related to the location of the vineyard than to grape variety. According to similarity analysis, sulfur dioxide treatment had little effect on yeast diversity. Because of the large population of fungi on grape berries, massive sequencing was more appropriate for understanding the fungal community in grape must after crushing than the other techniques used in this study. Suitable target sequences and databases were necessary for accurate evaluation of the community and the identification of species by 454 pyrosequencing of amplicons. PMID:26557110

  1. Cytogenetic Diversity of Simple Sequences Repeats in Morphotypes of Brassica rapa ssp. chinensis

    PubMed Central

    Zheng, Jin-shuang; Sun, Cheng-zhen; Zhang, Shu-ning; Hou, Xi-lin; Bonnema, Guusje

    2016-01-01

    A significant fraction of the nuclear DNA of all eukaryotes is composed of simple sequence repeats (SSRs). Although these sequences are widely used for studying genetic variation, linkage mapping and evolution, little attention has been paid to the chromosomal distribution and cytogenetic diversity of these sequences. In this paper, we characterize the distribution of mono-, di-, and tri-nucleotide SSRs in Brassica rapa ssp. chinensis. Fluorescence in situ hybridization was used to characterize the cytogenetic diversity of SSRs among morphotypes of B. rapa ssp. chinensis. The proportion of different SSR motifs varied among morphotypes of B. rapa ssp. chinensis, with tri-nucleotide SSRs being more prevalent in the genome of B. rapa ssp. chinensis. We determined the chromosomal locations of mono-, di-, and tri-nucleotide repeat loci. The results showed that the chromosomal distribution of SSRs in the different morphotypes is non-random and motif-dependent, and allowed us to characterize the relative variability in terms of SSR numbers and similar chromosomal distributions in centromeric/peri-centromeric heterochromatin. The differences between SSR repeats with respect to abundance and distribution indicate that SSRs are a driving force in the genomic evolution of B. rapa species. Our results provide a comprehensive view of the SSR sequence distribution and evolution for comparison among morphotypes of B. rapa ssp. chinensis. PMID:27507974

  2. Phylogeny and genetic diversity of Bridgeoporus nobilissimus inferred using mitochondrial and nuclear rDNA sequences

    USGS Publications Warehouse

    Redberg, G.L.; Hibbett, D.S.; Ammirati, J.F.; Rodriguez, R.J.

    2003-01-01

    The genetic diversity and phylogeny of Bridgeoporus nobilissimus have been analyzed. DNA was extracted from spores collected from individual fruiting bodies representing six geographically distinct populations in Oregon and Washington. Spore samples collected contained low levels of bacteria, yeast and a filamentous fungal species. Using taxon-specific PCR primers, it was possible to discriminate among rDNA from bacteria, yeast, a filamentous associate and B. nobilissimus. Nuclear rDNA internal transcribed spacer (ITS) region sequences of B. nobilissimus were compared among individuals representing six populations and were found to have less than 2% variation. These sequences also were used to design dual and nested PCR primers for B. nobilissimus-specific amplification. Mitochondrial small-subunit rDNA sequences were used in a phylogenetic analysis that placed B. nobilissimus in the hymenochaetoid clade, where it was associated with Oxyporus and Schizopora.

  3. Age-related decrease in TCR repertoire diversity measured with deep and normalized sequence profiling.

    PubMed

    Britanova, Olga V; Putintseva, Ekaterina V; Shugay, Mikhail; Merzlyak, Ekaterina M; Turchaninova, Maria A; Staroverov, Dmitriy B; Bolotin, Dmitriy A; Lukyanov, Sergey; Bogdanova, Ekaterina A; Mamedov, Ilgar Z; Lebedev, Yuriy B; Chudakov, Dmitriy M

    2014-03-15

    The decrease of TCR diversity with aging has never been studied by direct methods. In this study, we combined high-throughput Illumina sequencing with unique cDNA molecular identifier technology to achieve deep and precisely normalized profiling of TCR β repertoires in 39 healthy donors aged 6-90 y. We demonstrate that TCR β diversity per 10^6 T cells decreases roughly linearly with age, with significant reduction already apparent by age 40. The percentage of naive T cells showed a strong correlation with measured TCR diversity and decreased linearly up to age 70. Remarkably, the oldest group (average age 82 y) was characterized by a higher percentage of naive CD4+ T cells, lower abundance of expanded clones, and increased TCR diversity compared with the previous age group (average age 62 y), suggesting the influence of age selection and association of these three related parameters with longevity. Interestingly, cross-analysis of individual TCR β repertoires revealed a set of >10,000 of the most representative public TCR β clonotypes, whose abundance among the top 100,000 clones correlated with TCR diversity and decreased with aging.

  4. Multilocus sequence analysis (MLSA) of Bradyrhizobium strains: revealing high diversity of tropical diazotrophic symbiotic bacteria.

    PubMed

    Delamuta, Jakeline Renata Marçon; Ribeiro, Renan Augusto; Menna, Pâmela; Bangel, Eliane Villamil; Hungria, Mariangela

    2012-04-01

    Symbiotic association between several genera of bacteria collectively called rhizobia and plants belonging to the family Leguminosae (=Fabaceae) results in the process of biological nitrogen fixation, playing a key role in global N cycling and also making relevant contributions to agriculture. Bradyrhizobium is considered the ancestor of all nitrogen-fixing rhizobial species and probably originated in the tropics. The genus encompasses a variety of diverse bacteria, but the diversity captured in the analysis of the 16S rRNA is often low. In this study, we analyzed twelve Bradyrhizobium strains selected from previous studies performed by our group for showing high genetic diversity in relation to the described species. In addition to the 16S rRNA, five housekeeping genes (recA, atpD, glnII, gyrB and rpoB) were analyzed in the MLSA (multilocus sequence analysis) approach. Analysis of each gene and of the concatenated housekeeping genes captured a considerably higher level of genetic diversity, with indication of putative new species. The results highlight the high genetic variability associated with Bradyrhizobium microsymbionts of a variety of legumes. In addition, the MLSA approach has proved to be a rapid and reliable method for phylogenetic and taxonomic studies, speeding the identification of the still poorly known diversity of nitrogen-fixing rhizobia in the tropics.

  5. Application of RAD Sequencing for Evaluating the Genetic Diversity of Domesticated Panax notoginseng (Araliaceae)

    PubMed Central

    Pan, Yuezhi; Wang, Xueqin; Sun, Guiling; Li, Fusheng; Gong, Xun

    2016-01-01

    Panax notoginseng, a traditional Chinese medicinal plant, has been cultivated and domesticated for approximately 400 years, mainly in Yunnan and Guangxi, two provinces in southwest China. This species was named according to cultivated rather than wild individuals, and no wild populations had been found until now. The genetic resources available on farms are important for both breeding practices and resource conservation. In the present study, the recently developed technology RADseq, which is based on next-generation sequencing, was used to analyze the genetic variation and differentiation of P. notoginseng. The nucleotide diversity and heterozygosity results indicated that P. notoginseng had low genetic diversity at both the species and population levels. Almost no genetic differentiation has been detected, and all populations were genetically similar due to strong gene flow and insufficient splitting time. Although the genetic diversity of P. notoginseng was low at both species and population levels, several traditional plantations had relatively high genetic diversity, as revealed by the He and π values and by the private allele numbers. These valuable genetic resources should be protected as soon as possible to facilitate future breeding projects. The possible geographical origin of Sanqi domestication was discussed based on the results of the genetic diversity analysis. PMID:27846268

  6. Multilocus sequence analysis (MLSA) of Bradyrhizobium strains: revealing high diversity of tropical diazotrophic symbiotic bacteria

    PubMed Central

    Delamuta, Jakeline Renata Marçon; Ribeiro, Renan Augusto; Menna, Pâmela; Bangel, Eliane Villamil; Hungria, Mariangela

    2012-01-01

    Symbiotic association between several genera of bacteria collectively called rhizobia and plants belonging to the family Leguminosae (=Fabaceae) results in the process of biological nitrogen fixation, playing a key role in global N cycling and also making relevant contributions to agriculture. Bradyrhizobium is considered the ancestor of all nitrogen-fixing rhizobial species and probably originated in the tropics. The genus encompasses a variety of diverse bacteria, but the diversity captured in the analysis of the 16S rRNA is often low. In this study, we analyzed twelve Bradyrhizobium strains selected from previous studies performed by our group for showing high genetic diversity in relation to the described species. In addition to the 16S rRNA, five housekeeping genes (recA, atpD, glnII, gyrB and rpoB) were analyzed in the MLSA (multilocus sequence analysis) approach. Analysis of each gene and of the concatenated housekeeping genes captured a considerably higher level of genetic diversity, with indication of putative new species. The results highlight the high genetic variability associated with Bradyrhizobium microsymbionts of a variety of legumes. In addition, the MLSA approach has proved to be a rapid and reliable method for phylogenetic and taxonomic studies, speeding the identification of the still poorly known diversity of nitrogen-fixing rhizobia in the tropics. PMID:24031882

  7. Sequence-defined bioactive macrocycles via an acid-catalysed cascade reaction

    NASA Astrophysics Data System (ADS)

    Porel, Mintu; Thornlow, Dana N.; Phan, Ngoc N.; Alabi, Christopher A.

    2016-06-01

    Synthetic macrocycles derived from sequence-defined oligomers are a unique structural class whose ring size, sequence and structure can be tuned via precise organization of the primary sequence. Similar to peptides and other peptidomimetics, these well-defined synthetic macromolecules become pharmacologically relevant when bioactive side chains are incorporated into their primary sequence. In this article, we report the synthesis of oligothioetheramide (oligoTEA) macrocycles via a one-pot acid-catalysed cascade reaction. The versatility of the cyclization chemistry and modularity of the assembly process was demonstrated via the synthesis of >20 diverse oligoTEA macrocycles. Structural characterization via NMR spectroscopy revealed the presence of conformational isomers, which enabled the determination of local chain dynamics within the macromolecular structure. Finally, we demonstrate the biological activity of oligoTEA macrocycles designed to mimic facially amphiphilic antimicrobial peptides. The preliminary results indicate that macrocyclic oligoTEAs with just two-to-three cationic charge centres can elicit potent antibacterial activity against Gram-positive and Gram-negative bacteria.

  8. A genome-to-genome analysis of associations between human genetic variation, HIV-1 sequence diversity, and viral control.

    PubMed

    Bartha, István; Carlson, Jonathan M; Brumme, Chanson J; McLaren, Paul J; Brumme, Zabrina L; John, Mina; Haas, David W; Martinez-Picado, Javier; Dalmau, Judith; López-Galíndez, Cecilio; Casado, Concepción; Rauch, Andri; Günthard, Huldrych F; Bernasconi, Enos; Vernazza, Pietro; Klimkait, Thomas; Yerly, Sabine; O'Brien, Stephen J; Listgarten, Jennifer; Pfeifer, Nico; Lippert, Christoph; Fusi, Nicolo; Kutalik, Zoltán; Allen, Todd M; Müller, Viktor; Harrigan, P Richard; Heckerman, David; Telenti, Amalio; Fellay, Jacques

    2013-10-29

    HIV-1 sequence diversity is affected by selection pressures arising from host genomic factors. Using paired human and viral data from 1071 individuals, we ran >3000 genome-wide scans, testing for associations between host DNA polymorphisms, HIV-1 sequence variation and plasma viral load (VL), while considering human and viral population structure. We observed significant human SNP associations to a total of 48 HIV-1 amino acid variants (p < 2.4 × 10^-12). All associated SNPs mapped to the HLA class I region. Clinical relevance of host and pathogen variation was assessed using VL results. We identified two critical advantages to the use of viral variation for identifying host factors: (1) association signals are much stronger for HIV-1 sequence variants than VL, reflecting the 'intermediate phenotype' nature of viral variation; (2) association testing can be run without any clinical data. The proposed genome-to-genome approach highlights sites of genomic conflict and is a strategy generally applicable to studies of host-pathogen interaction. DOI:http://dx.doi.org/10.7554/eLife.01123.001.

  9. HIV-1 intrapatient sequence diversity in the immunogenic V3 region

    SciTech Connect

    Korber, B.; Myers, G.; Wolinsky, S.; Kunstman, K.; Levy, R.; Furtado, M.; Otto, P.; Haynes, B.

    1991-11-12

    The third hypervariable domain (V3) of the human immunodeficiency virus type-1 (HIV-1) envelope protein (env) can serve as an epitope for potent type-specific neutralizing antibodies (NAbs); thus, short peptides predicated on the most commonly found variants of the antigenic tip of the V3 loop have been considered as potential candidates for an HIV peptide vaccine. To evaluate the extent of intrapatient variation in the immunogenic crest of the V3 loop, sequence sets were analyzed from individuals for whom multiple V3 sequences were available. Several strategies for selecting the best sets of hexapeptides to represent the variable tip of the V3 loop were considered and their effectiveness was evaluated by comparing them with the sequence sets from individuals. Most individuals carried at least one, and frequently many, variants that did not match any of the sequences from among the ten most common hexapeptides. Intrapatient viral sequence variation was increased by including sequences derived from brain biopsy specimens as well as from blood. Additionally, sequences obtained from brain specimens of different individuals had common elements which were not conserved in the corresponding blood samples, suggesting that certain amino acids in the V3 loop may be requisite for viral propagation in the CNS.

  10. Hunting Down Frame Shifts: Ecological Analysis of Diverse Functional Gene Sequences

    PubMed Central

    Strejcek, Michal; Wang, Qiong; Ridl, Jakub; Uhlik, Ondrej

    2015-01-01

    Functional gene ecological analyses using amplicon sequencing can be challenging as translated sequences are often burdened with shifted reading frames. The aim of this work was to evaluate several bioinformatics tools designed to correct errors which arise during sequencing in an effort to reduce the number of frameshifts (FS). Genes encoding for alpha subunits of biphenyl (bphA) and benzoate (benA) dioxygenases were used as model sequences. FrameBot, a FS correction tool, was able to reduce the number of detected FS to zero. However, up to 44% of sequences were discarded by FrameBot as non-specific targets. Therefore, we proposed a de novo mode of FrameBot for FS correction, which works on a similar basis as common chimera identifying platforms and is not dependent on reference sequences. By nature of FrameBot de novo design, it is crucial to provide it with data as error free as possible. We tested the ability of several publicly available correction tools to decrease the number of errors in the data sets. The combination of maximum expected error filtering and single linkage pre-clustering proved to be the most efficient read processing approach. Applying FrameBot de novo on the processed data enabled analysis of BphA sequences with minimal losses of potentially functional sequences not homologous to those previously known. This experiment also demonstrated the extensive diversity of dioxygenases in soil. A script which performs FrameBot de novo is presented in the supplementary material to the study or available at https://github.com/strejcem/FBdenovo. The tool was also implemented into FunGene Pipeline available at http://fungene.cme.msu.edu/FunGenePipeline/. PMID:26635739
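    The maximum expected error filtering mentioned in the abstract above can be sketched as follows, assuming the standard definition in which a Phred score Q corresponds to an error probability of 10^(-Q/10) and the expected number of errors in a read is the sum of those probabilities. The function names, the Phred+33 encoding, and the threshold of 1.0 are illustrative assumptions; the abstract does not state the settings the authors used.

```python
def expected_errors(quality_string, offset=33):
    """Sum of per-base error probabilities for one read.

    Assumes Phred+33 ASCII-encoded qualities (an assumption, not stated in
    the abstract): Q = ord(char) - 33 and P(error) = 10 ** (-Q / 10).
    """
    return sum(10 ** (-(ord(ch) - offset) / 10) for ch in quality_string)


def passes_filter(quality_string, max_ee=1.0):
    """Keep a read only if its expected error count is at most max_ee.

    max_ee=1.0 is a hypothetical threshold chosen for illustration.
    """
    return expected_errors(quality_string) <= max_ee


# Hypothetical quality strings: 'I' encodes Q40, '!' encodes Q0.
print(passes_filter("IIII"))  # four high-quality bases
print(passes_filter("!!"))    # two bases with no confidence at all
```

    Filtering on the summed error probability, rather than on mean quality, penalizes reads in proportion to the number of errors they are likely to contain, which matters when even a single indel can shift the reading frame.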

  11. Twenty-One Genome Sequences from Pseudomonas Species and 19 Genome Sequences from Diverse Bacteria Isolated from the Rhizosphere and Endosphere of Populus deltoides

    SciTech Connect

    Brown, Steven D; Utturkar, Sagar M; Klingeman, Dawn Marie; Johnson, Courtney M; Martin, Stanton; Land, Miriam L; Lu, Tse-Yuan; Schadt, Christopher Warren; Doktycz, Mitchel John; Pelletier, Dale A

    2012-01-01

    To aid in the investigation of the Populus deltoides microbiome, we generated draft genome sequences for twenty-one Pseudomonas and nineteen other diverse bacteria isolated from Populus deltoides roots. Genome sequences for isolates similar to Acidovorax, Bradyrhizobium, Brevibacillus, Burkholderia, Caulobacter, Chryseobacterium, Flavobacterium, Herbaspirillum, Novosphingobium, Pantoea, Phyllobacterium, Polaromonas, Rhizobium, Sphingobium and Variovorax were generated.

  12. [Origin and genetic diversity of Mongolian and Chinese sheep using mitochondrial DNA D-loop sequences].

    PubMed

    Luo, Yu-Zhu; Cheng, Shu-Ru; Batsuuri, Lkhagva; Badamdorj, D; Olivier, Hanotte; Han, Jian-Lin

    2005-12-01

    To determine the origin and genetic diversity of Chinese and Mongolian domestic sheep, a partial fragment of the mitochondrial DNA D-loop was sequenced for a total of 314 individuals from nine Chinese sheep populations and 11 Mongolian sheep populations. The results show no difference in nucleotide composition between Chinese and Mongolian sheep mtDNA D-loop sequences. However, more variable sites were identified in Mongolian sheep (26.85% of the sites) than in Chinese sheep (24.22%). In China, mtDNA haplotype diversity was the highest in Qinghai Tibetan sheep, followed by Gansu Tibetan sheep, Gansu Alpine Merino, Qinghai Merino, Gannan Tibetan sheep, Small-tailed Han sheep, Tan sheep, Hu sheep and Minxian Black Fur sheep. In Mongolian sheep, mtDNA haplotype diversity was the highest in Bayad and Baidrag populations and the lowest in the Gobi-Altai population. In general, Mongolian sheep have a richer genetic diversity than the Chinese ones, with a larger number of haplotypes (86.06% (142/165) versus 78.83% (108/137)), higher haplotype diversity (Hd; 0.976 versus 0.936), higher nucleotide diversity (Pi (π); 0.036 versus 0.034) and a higher average number of nucleotide differences (k; 23.50 versus 22.48). Phylogenetic analysis of the 217 haplotypes identified in both Mongolian and Chinese sheep supported the same origin of their domestication with three distinct maternal lineages defined as major haplotypes A, B and C, of which haplotype A is the most common in all Chinese sheep populations and in the majority of Mongolian sheep populations (9/11) with an average frequency of 58.73%, followed by haplotype B, present in eight Chinese populations and in all Mongolian sheep populations with an average frequency of 24.68%, and haplotype C, present in eight Chinese and in 10 Mongolian sheep populations with an average frequency of 16.59%. Further network analysis of the phylogenetic relationship of the 87 haplotypes identified from 91 sequences retrieved from Gen

  13. Diversity and distribution of unicellular opisthokonts along the European coast analysed using high-throughput sequencing.

    PubMed

    Del Campo, Javier; Mallo, Diego; Massana, Ramon; de Vargas, Colomban; Richards, Thomas A; Ruiz-Trillo, Iñaki

    2015-09-01

    The opisthokonts are one of the major supergroups of eukaryotes. They comprise two major clades: (i) the Metazoa and their unicellular relatives and (ii) the Fungi and their unicellular relatives. There is, however, little knowledge of the role of opisthokont microbes in many natural environments, especially among non-metazoan and non-fungal opisthokonts. Here, we begin to address this gap by analysing high-throughput 18S rDNA and 18S rRNA sequencing data from different European coastal sites, sampled at different size fractions and depths. In particular, we analyse the diversity and abundance of choanoflagellates, filastereans, ichthyosporeans, nucleariids, corallochytreans and their related lineages. Our results show the great diversity of choanoflagellates in coastal waters as well as a relevant representation of the ichthyosporeans and the uncultured marine opisthokonts (MAOP). Furthermore, we describe a new lineage of marine fonticulids (MAFO) that appears to be abundant in sediments. Taken together, our work points to a greater potential ecological role for unicellular opisthokonts than previously appreciated in marine environments, both in the water column and in sediments, and also provides evidence of novel opisthokont phylogenetic lineages. This study highlights the importance of high-throughput sequencing approaches to unravel the diversity and distribution of both known and novel eukaryotic lineages.

  14. Diverse and Widespread Contamination Evident in the Unmapped Depths of High Throughput Sequencing Data

    PubMed Central

    Lusk, Richard W.

    2014-01-01

    Trace quantities of contaminating DNA are widespread in the laboratory environment, but their presence has received little attention in the context of high throughput sequencing. This issue is highlighted by recent works that have rested controversial claims upon sequencing data that appear to support the presence of unexpected exogenous species. I used reads that preferentially aligned to alternate genomes to infer the distribution of potential contaminant species in a set of independent sequencing experiments. I confirmed that dilute samples are more exposed to contaminating DNA, and, focusing on four single-cell sequencing experiments, found that these contaminants appear to originate from a wide diversity of clades. Although negative control libraries prepared from ‘blank’ samples recovered the highest-frequency contaminants, low-frequency contaminants, which appeared to make heterogeneous contributions to samples prepared in parallel within a single experiment, were not well controlled for. I used these results to show that, despite heavy replication and plausible controls, contamination can explain all of the observations used to support a recent claim that complete genes pass from food to human blood. Contamination must be considered a potential source of signals of exogenous species in sequencing data, even if these signals are replicated in independent experiments, vary across conditions, or indicate a species which seems a priori unlikely to contaminate. Negative control libraries processed in parallel are essential to control for contaminant DNAs, but their limited ability to recover low-frequency contaminants must be recognized. PMID:25354084

  15. Combining genomic sequencing methods to explore viral diversity and reveal potential virus-host interactions

    PubMed Central

    Chow, Cheryl-Emiliane T.; Winget, Danielle M.; White, Richard A.; Hallam, Steven J.; Suttle, Curtis A.

    2015-01-01

    Viral diversity and virus-host interactions in oxygen-starved regions of the ocean, also known as oxygen minimum zones (OMZs), remain relatively unexplored. Microbial community metabolism in OMZs alters nutrient and energy flow through marine food webs, resulting in biological nitrogen loss and greenhouse gas production. Thus, viruses infecting OMZ microbes have the potential to modulate community metabolism with resulting feedback on ecosystem function. Here, we describe viral communities inhabiting oxic surface (10 m) and oxygen-starved basin (200 m) waters of Saanich Inlet, a seasonally anoxic fjord on the coast of Vancouver Island, British Columbia using viral metagenomics and complete viral fosmid sequencing on samples collected between April 2007 and April 2010. Of 6459 open reading frames (ORFs) predicted across all 34 viral fosmids, 77.6% (n = 5010) had no homology to reference viral genomes. These fosmids recruited a higher proportion of viral metagenomic sequences from Saanich Inlet than from nearby northeastern subarctic Pacific Ocean (Line P) waters, indicating differences in the viral communities between coastal and open ocean locations. While functional annotations of fosmid ORFs were limited, recruitment to NCBI's non-redundant “nr” database and publicly available single-cell genomes identified putative viruses infecting marine thaumarchaeal and SUP05 proteobacteria to provide potential host linkages with relevance to coupled biogeochemical cycling processes in OMZ waters. Taken together, these results highlight the power of coupled analyses of multiple sequence data types, such as viral metagenomic and fosmid sequence data with prokaryotic single cell genomes, to chart viral diversity, elucidate genomic and ecological contexts for previously unclassifiable viral sequences, and identify novel host interactions in natural and engineered ecosystems. PMID:25914678

  16. Genetic diversity and relationship of chicory (Cichorium intybus L.) using sequence-related amplified polymorphism markers.

    PubMed

    Liang, X Y; Zhang, X Q; Bai, S Q; Huang, L K; Luo, X M; Ji, Y; Jiang, L F

    2014-09-26

    Chicory is a crop with economically important roles and is cultivated worldwide. The genetic diversity and relationship of 80 accessions of chicories and endives were evaluated by sequence-related amplified polymorphism (SRAP) markers to provide a theoretical basis for future breeding programs in China. The polymorphic rate was 96.83%, and the average polymorphic information content was 0.323, suggesting the rich genetic diversity of chicory. The genetic diversity degree of chicory was higher (GS = 0.677) than that of endive (GS = 0.701). The accessions with the highest genetic diversity (effective number of alleles, NE = 1.609; Nei's genetic diversity, H = 0.372; Shannon information index, I = 0.556) were from Italy. The richest genetic diversity was revealed in a chicory line (NE = 1.478, H = 0.289, I = 0.443) among the 3 types (line, wild, and cultivar). The chicory genetic structure of 8 geographical groups showed that the genetic differentiation coefficient (GST) was 14.20% and the number of immigrants per generation (Nm) was 3.020. A GST of 6.80% and an Nm of 6.853 were obtained from different types. This observation suggests that these chicory lines, especially those from the Mediterranean region, have potential for providing rich genetic resources for further breeding programs, that the chicory genetic structure among different countries obviously differs with a certain amount of gene flow, and that SRAP markers could be applied to analyze genetic relationships and classifications of Cichorium intybus and C. endivia.
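
    The abstract above reports gene flow (Nm) alongside the genetic differentiation coefficient (GST) but does not say how Nm was derived. As a hedged illustration, the sketch below assumes one commonly used estimator, Nm = 0.5 (1 - GST) / GST; the function name is hypothetical, and the small difference from the reported 3.020 would be consistent with rounding of GST.

```python
def gene_flow_from_gst(gst: float) -> float:
    """Estimate Nm from GST with the assumed estimator Nm = 0.5 * (1 - GST) / GST."""
    if not 0.0 < gst <= 1.0:
        raise ValueError("GST must lie in the interval (0, 1]")
    return 0.5 * (1.0 - gst) / gst


# GST values quoted in the abstract (as proportions, not percentages).
for label, gst in [("geographical groups", 0.1420), ("types", 0.0680)]:
    print(f"{label}: GST = {gst:.4f}, estimated Nm = {gene_flow_from_gst(gst):.3f}")
```

    Under this assumption, a lower GST maps to a higher Nm, which matches the direction of the two value pairs reported in the abstract.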

  17. High diversity of picornaviruses in rats from different continents revealed by deep sequencing

    PubMed Central

    Hansen, Thomas Arn; Mollerup, Sarah; Nguyen, Nam-phuong; White, Nicole E; Coghlan, Megan; Alquezar-Planas, David E; Joshi, Tejal; Jensen, Randi Holm; Fridholm, Helena; Kjartansdóttir, Kristín Rós; Mourier, Tobias; Warnow, Tandy; Belsham, Graham J; Bunce, Michael; Willerslev, Eske; Nielsen, Lars Peter; Vinner, Lasse; Hansen, Anders Johannes

    2016-01-01

    Outbreaks of zoonotic diseases in humans and livestock are not uncommon, and an important component in containment of such emerging viral diseases is rapid and reliable diagnostics. Such methods are often PCR-based and hence require the availability of sequence data from the pathogen. Rattus norvegicus (R. norvegicus) is a known reservoir for important zoonotic pathogens. Transmission may be direct via contact with the animal, for example, through exposure to its faecal matter, or indirectly mediated by arthropod vectors. Here we investigated the viral content in rat faecal matter (n=29) collected from two continents by analyzing 2.2 billion next-generation sequencing reads derived from both DNA and RNA. Among other virus families, we found sequences from members of the Picornaviridae to be abundant in the microbiome of all the samples. Here we describe the diversity of the picornavirus-like contigs including near-full-length genomes closely related to the Boone cardiovirus and Theiler's encephalomyelitis virus. From this study, we conclude that picornaviruses within R. norvegicus are more diverse than previously recognized. The virome of R. norvegicus should be investigated further to assess the full potential for zoonotic virus transmission. PMID:27530749

  18. High diversity of picornaviruses in rats from different continents revealed by deep sequencing.

    PubMed

    Hansen, Thomas Arn; Mollerup, Sarah; Nguyen, Nam-Phuong; White, Nicole E; Coghlan, Megan; Alquezar-Planas, David E; Joshi, Tejal; Jensen, Randi Holm; Fridholm, Helena; Kjartansdóttir, Kristín Rós; Mourier, Tobias; Warnow, Tandy; Belsham, Graham J; Bunce, Michael; Willerslev, Eske; Nielsen, Lars Peter; Vinner, Lasse; Hansen, Anders Johannes

    2016-08-17

    Outbreaks of zoonotic diseases in humans and livestock are not uncommon, and an important component in containment of such emerging viral diseases is rapid and reliable diagnostics. Such methods are often PCR-based and hence require the availability of sequence data from the pathogen. Rattus norvegicus (R. norvegicus) is a known reservoir for important zoonotic pathogens. Transmission may be direct via contact with the animal, for example, through exposure to its faecal matter, or indirectly mediated by arthropod vectors. Here we investigated the viral content in rat faecal matter (n=29) collected from two continents by analyzing 2.2 billion next-generation sequencing reads derived from both DNA and RNA. Among other virus families, we found sequences from members of the Picornaviridae to be abundant in the microbiome of all the samples. Here we describe the diversity of the picornavirus-like contigs including near-full-length genomes closely related to the Boone cardiovirus and Theiler's encephalomyelitis virus. From this study, we conclude that picornaviruses within R. norvegicus are more diverse than previously recognized. The virome of R. norvegicus should be investigated further to assess the full potential for zoonotic virus transmission.

  19. Application of mitochondrial genes sequences for measuring the genetic diversity of Arabian oryx.

    PubMed

    Khan, Haseeb A; Arif, Ibrahim A; Shobrak, Mohammad; Homaidan, Ali A Al; Farhan, Ahmad H Al; Sadoon, Mohammad Al

    2011-01-01

    The Arabian oryx (Oryx leucoryx) faced extinction in the wild more than three decades ago and was saved by the prudent efforts of captive breeding programs. A clear understanding of the molecular diversity of the contemporary Arabian oryx population is important for the long-term success of captive breeding and reintroduction of this potentially endangered species. We have sequenced segments of mitochondrial DNA, including the 12S rRNA, 16S rRNA, cytochrome b (Cyt-b) and control region (CR) genes, of 24 captive-bred and reintroduced animals. Although the sequences of 12S rRNA, 16S rRNA and Cyt-b were found to be identical for all the samples, typical sequence variations in the CR gene were observed in the form of 7 haplotypes. One of these haplotypes has been reported earlier while the remaining 6 haplotypes are novel and represent different lineages from the founders. The haplotype and nucleotide diversities were found to be 0.789 and 0.009 respectively. The genetic distances among the 7 mtDNA haplotypes varied from 0.001 to 0.017. These findings are of potential relevance to the management of captive breeding programs for the conservation of Arabian oryx.
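
    For readers unfamiliar with the haplotype diversity statistic quoted above, the following is a minimal sketch of the standard unbiased estimator, Hd = n/(n - 1) * (1 - sum of squared haplotype frequencies). The abstract does not give per-haplotype counts, so the counts below are purely hypothetical (7 haplotypes among 24 animals) and are not expected to reproduce the reported value of 0.789.

```python
def haplotype_diversity(counts: list[int]) -> float:
    """Unbiased haplotype diversity: n/(n-1) * (1 - sum(p_i^2))."""
    n = sum(counts)
    if n < 2:
        raise ValueError("at least two sampled sequences are required")
    sum_sq_freq = sum((c / n) ** 2 for c in counts)
    return n / (n - 1) * (1.0 - sum_sq_freq)


# Hypothetical counts for 7 haplotypes in 24 animals (illustration only).
hypothetical_counts = [8, 5, 4, 3, 2, 1, 1]
print(f"Hd = {haplotype_diversity(hypothetical_counts):.3f}")
```

    The statistic is the probability that two sequences drawn at random from the sample carry different haplotypes, so it approaches 1 when many haplotypes occur at similar frequencies and equals 0 when only one haplotype is present.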

  20. Nucleic acid (cDNA) and amino acid sequences of the maize endosperm protein glutelin-2.

    PubMed Central

    Prat, S; Cortadas, J; Puigdomènech, P; Palau, J

    1985-01-01

    The cDNA coding for a glutelin-2 protein from maize endosperm has been cloned and the complete amino acid sequence of the protein derived for the first time. An immature maize endosperm cDNA bank was screened for the expression of a beta-lactamase:glutelin-2 (G2) fusion polypeptide by using antibodies against the purified 28 kd G2 protein. A clone corresponding to the 28 kd G2 protein was sequenced and the primary structure of this protein was derived. Five regions can be defined in the protein sequence: an 11 residue N-terminal part, a repeated region formed by eight units of the sequence Pro-Pro-Pro-Val-His-Leu, an alternating Pro-X stretch 21 residues long, a Cys rich domain and a C-terminal part rich in Gln. The protein sequence is preceded by 19 residues which have the characteristics of the signal peptide found in secreted proteins. Unlike zeins, the main maize storage proteins, 28 kd glutelin-2 has several homologous sequences in common with other cereal storage proteins. PMID:3839076

  1. An rpoD gene sequence based evaluation of cultured Pseudomonas diversity on different growth media.

    PubMed

    Ghyselinck, Jonas; Coorevits, An; Van Landschoot, Anita; Samyn, Emly; Heylen, Kim; De Vos, Paul

    2013-10-01

    The last decade has shown an increased interest in the utilization of bacteria for applications ranging from bioremediation to wastewater purification and promotion of plant growth. In order to extend the current number of micro-organism mediated applications, a continued quest for new agents is required. This study focused on the genus Pseudomonas, which is known to harbour strains with a very diverse set of interesting properties. The aim was to identify growth media that allow retrieval of a high Pseudomonas diversity, thereby increasing the chance of obtaining isolates with beneficial properties. Three cultivation media, trypticase soy agar (TSA), potato dextrose agar (PDA) and Pseudomonas isolation agar (PIA), were evaluated for their abilities to grow Pseudomonas strains. TSA and PDA were found to generate the largest Pseudomonas diversity. However, communities obtained with both media overlapped. Communities obtained with PIA, on the other hand, were unique. This indicated that the largest diversity is obtained by sampling from either PDA or TSA and from PIA in parallel. To evaluate biodiversity of the isolated Pseudomonas members on the media, an appropriate biomarker had to be identified. Hence, an introductory investigation of the taxonomic resolution of the 16S rRNA, rpoD, gyrB and rpoB genes was performed. The rpoD gene sequences not only had a high phylogenetic content and the highest taxonomic resolution amongst the genes investigated, they also yielded a gene phylogeny that related well with that of the 16S rRNA gene.

  2. Simple sequence repeat analysis of genetic diversity in primary core collection of peach (Prunus persica).

    PubMed

    Li, Tian-Hong; Li, Yin-Xia; Li, Zi-Chao; Zhang, Hong-Liang; Qi, Yong-Wen; Wang, Tao

    2008-01-01

    In this study, the genetic diversity of 51 cultivars in the primary core collection of peach (Prunus persica (L.) Batsch) was evaluated by using simple sequence repeats (SSRs). The phylogenetic relationships and the evolutionary history among different cultivars were determined on the basis of SSR data. Twenty-two polymorphic SSR primer pairs were selected, and a total of 111 alleles were identified in the 51 cultivars, with an average of 5 alleles per locus. According to traditional Chinese classification of peach cultivars, the 51 cultivars in the peach primary core collection belong to six variety groups. The SSR analysis revealed that the levels of the genetic diversity within each variety group were ranked as Sweet peach > Crisp peach > Flat peach > Nectarine > Honey Peach > Yellow fleshed peach. The genetic diversity among the Chinese cultivars was higher than that among the introduced cultivars. Cluster analysis by the unweighted pair group method with arithmetic averaging (UPGMA) placed the 51 cultivars into five linkage clusters. Cultivar members from the same variety group were distributed in different UPGMA clusters and some members from different variety groups were placed under the same cluster. Different variety groups could not be differentiated in accordance with SSR markers. The SSR analysis revealed rich genetic diversity in the peach primary core collection, representative of genetic resources of peach.

  3. Determining the cellular diversity of hepatitis C virus quasispecies by single-cell viral sequencing.

    PubMed

    McWilliam Leitch, E Carol; McLauchlan, John

    2013-12-01

    Single-cell genomics is emerging as an important tool in cellular biology. We describe for the first time a system to investigate RNA virus quasispecies diversity at the cellular level utilizing hepatitis C virus (HCV) replicons. A high-fidelity nested reverse transcription (RT)-PCR assay was developed, and validation using control transcripts of known copy number indicated a detection limit of 3 copies of viral RNA/reaction. This system was used to determine the cellular diversity of subgenomic JFH-1 HCV replicons constitutively expressed in Huh7 cells. Each cell contained a unique quasispecies that was much less diverse than the quasispecies of the bulk cell population from which the single cells were derived, suggesting the occurrence of independent evolution at the cellular level. An assessment of the replicative fitness of the predominant single-cell quasispecies variants indicated a modest reduction in fitness compared to the wild type. Real-time RT-PCR methods capable of determining single-cell viral loads were developed and indicated an average of 113 copies of replicon RNA per cell, correlating with calculated RNA copy numbers in the bulk cell population. This study introduces a single-cell RNA viral-sequencing method with numerous potential applications to explore host-virus interactions during infection. HCV quasispecies diversity varied greatly between cells in vitro, suggesting different within-cell evolutionary pathways. Such divergent trajectories in vivo could have implications for the evolution and establishment of antiviral-resistant variants and host immune escape mutants.

  4. Convergent Synthesis of Diverse Tetrahydropyridines via Rh(I)-Catalyzed C–H Functionalization Sequences

    PubMed Central

    2015-01-01

    A Rh-catalyzed C–H bond activation/alkenylation/electrocyclization cascade reaction provides diverse 1,2-dihydropyridines from simple and readily available precursors. The reaction can be carried out at low (<1%) Rh-catalyst loadings, and the use of the robust, air-stable Rh precatalyst, [RhCl(cod)]2, enables the cascade reaction to be easily performed on the benchtop. The 1,2-dihydropyridine products serve as extremely versatile synthetic intermediates for further elaboration, often without isolation. The addition of electrophiles under kinetic or thermodynamic conditions provides a wide range of iminiums. Subsequent addition of a nucleophile then generates a diverse array of differently substituted piperidine products. Additionally, [3 + 2] and [4 + 2] cycloadditions of the 1,2-dihydropyridine intermediate provide access to bridged bicyclic structures such as tropanes and isoquinuclidines. These concise reaction sequences enable the formation of highly substituted piperidines in synthetically useful yields with excellent diastereoselectivity. PMID:25288871

  5. Mitochondrial DNA sequence diversity in two groups of Italian Veneto speakers from Veneto.

    PubMed

    Mogentale-Profizi, N; Chollet, L; Stévanovitch, A; Dubut, V; Poggi, C; Pradié, M P; Spadoni, J L; Gilles, A; Béraud-Colomb, E

    2001-03-01

    Although frequencies of mitochondrial DNA (mtDNA) haplogroups in the different European populations are rather homogeneous, there are a few European populations or linguistic isolates that show different mtDNA haplogroup distributions; examples are the Saami and Ladin speakers from the eastern Italian Alps. MtDNA sequence diversity was analysed in subjects from two villages in Veneto. The first, Posina, is situated in the Venetian Alps near Vicenza. The second, Barco di Pravisdomini, is a village on the plains near Venice. In spite of their common Veneto dialect, the two populations have not preserved genetic homogeneity; in particular, they differ in the frequencies of haplogroups T and J. MtDNA diversity in these two groups seems to depend more on their geographic situation.

  6. 37 CFR 1.821 - Nucleotide and/or amino acid sequence disclosures in patent applications.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... 37 Patents, Trademarks, and Copyrights 1 2010-07-01 2010-07-01 false Nucleotide and/or amino acid... Biotechnology Invention Disclosures Application Disclosures Containing Nucleotide And/or Amino Acid Sequences § 1.821 Nucleotide and/or amino acid sequence disclosures in patent applications. (a) Nucleotide...

  7. 37 CFR 1.821 - Nucleotide and/or amino acid sequence disclosures in patent applications.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... 37 Patents, Trademarks, and Copyrights 1 2012-07-01 2012-07-01 false Nucleotide and/or amino acid... Biotechnology Invention Disclosures Application Disclosures Containing Nucleotide And/or Amino Acid Sequences § 1.821 Nucleotide and/or amino acid sequence disclosures in patent applications. (a) Nucleotide...

  8. 37 CFR 1.821 - Nucleotide and/or amino acid sequence disclosures in patent applications.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... 37 Patents, Trademarks, and Copyrights 1 2014-07-01 2014-07-01 false Nucleotide and/or amino acid... Biotechnology Invention Disclosures Application Disclosures Containing Nucleotide And/or Amino Acid Sequences § 1.821 Nucleotide and/or amino acid sequence disclosures in patent applications. (a) Nucleotide...

  9. 37 CFR 1.821 - Nucleotide and/or amino acid sequence disclosures in patent applications.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... 37 Patents, Trademarks, and Copyrights 1 2011-07-01 2011-07-01 false Nucleotide and/or amino acid... Biotechnology Invention Disclosures Application Disclosures Containing Nucleotide And/or Amino Acid Sequences § 1.821 Nucleotide and/or amino acid sequence disclosures in patent applications. (a) Nucleotide...

  10. 37 CFR 1.821 - Nucleotide and/or amino acid sequence disclosures in patent applications.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... 37 Patents, Trademarks, and Copyrights 1 2013-07-01 2013-07-01 false Nucleotide and/or amino acid... Biotechnology Invention Disclosures Application Disclosures Containing Nucleotide And/or Amino Acid Sequences § 1.821 Nucleotide and/or amino acid sequence disclosures in patent applications. (a) Nucleotide...

  11. Sequence types diversity of Legionella pneumophila isolates from environmental water sources in Guangzhou and Jiangmen, China.

    PubMed

    Guo, Jingyu; Liang, Ting; Hu, Chaohui; Lv, Ruichen; Yang, Xianwei; Cui, Yujun; Song, Youtao; Yang, Ruifu; Zhu, Qingyi; Song, Yajun

    2015-01-01

    In this study, 159 Legionella pneumophila strains isolated from various natural and artificial water sources in Guangzhou and Jiangmen, China, were subjected to genotyping by the sequence-based typing (SBT) scheme. These isolates were assigned into 53 sequence types (STs) (50 STs with seven loci data and three unidentified STs with incomplete loci profiles) with ST1 as the dominant one (14.5%), and the index of diversity (IOD) was 0.950. Eight new alleles and 34 new STs were reported here. Notably, most of the newly identified STs with seven loci data (24/34) contained no new allele, implying frequent recombination events in L. pneumophila. Five intragenic recombination events were identified in the concatenated sequences of seven loci. The diversity of STs in natural environmental isolates (41 STs, IOD=0.956) is higher than that of artificial environmental ones (17 STs, IOD=0.824). The ST patterns varied in isolates from these two sources: the most common STs from artificial water sources, ST1 and ST752 (39.2% and 13.7%), were only occasionally isolated from natural water sources (2.9% and 3.8%, respectively), while the predominant STs from natural water sources, ST1048, ST739 and ST1267 (15.2%, 6.7% and 6.7%), were less frequently seen in artificial environments (2.0%, 0% and 0%, respectively). We also found that Legionnaires' disease associated STs might be more frequently isolated in artificial environments than in natural ones. Our data revealed remarkable genetic diversity of L. pneumophila isolates from environmental water systems of Guangzhou and Jiangmen, as well as different ST distribution patterns between natural and artificial water sources.

  12. Within-Host Nucleotide Diversity of Virus Populations: Insights from Next-Generation Sequencing

    PubMed Central

    Nelson, Chase W.; Hughes, Austin L.

    2014-01-01

    Next-generation sequencing (NGS) technology offers new opportunities for understanding the evolution and dynamics of viral populations within individual hosts over the course of infection. We review simple methods for estimating synonymous and nonsynonymous nucleotide diversity in viral genes from NGS data without the need for inferring linkage. We discuss the potential usefulness of these data for addressing questions of both practical and theoretical interest, including fundamental questions regarding the effective population sizes of within-host viral populations and the modes of natural selection acting on them. PMID:25481279
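
    The abstract above refers to estimating nucleotide diversity from NGS data without inferring linkage. As a rough illustration of the general idea, and not necessarily the authors' procedure, the sketch below treats each site independently and computes the proportion of read pairs that differ at that site from per-site nucleotide counts. It does not separate synonymous from nonsynonymous sites, and the function names and pileup values are hypothetical.

```python
from itertools import combinations


def site_diversity(counts: dict[str, int]) -> float:
    """Proportion of pairs of reads that carry different nucleotides at one site."""
    n = sum(counts.values())
    if n < 2:
        return 0.0
    # Number of read pairs with different nucleotides, over all nucleotide pairs.
    differing_pairs = sum(a * b for a, b in combinations(counts.values(), 2))
    total_pairs = n * (n - 1) / 2
    return differing_pairs / total_pairs


def mean_diversity(pileup: list[dict[str, int]]) -> float:
    """Average per-site diversity across all sites in the pileup."""
    if not pileup:
        raise ValueError("pileup must contain at least one site")
    return sum(site_diversity(site) for site in pileup) / len(pileup)


# Hypothetical read counts at three sites of a viral gene (illustration only).
pileup = [
    {"A": 100},            # monomorphic site
    {"C": 90, "T": 10},    # minor variant at 10%
    {"G": 50, "A": 50},    # two variants at equal frequency
]
print(f"mean per-site diversity = {mean_diversity(pileup):.4f}")
```

    Because each site is scored from its own read counts, no haplotype phasing across sites is needed, which is the property the abstract highlights.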

  13. Plasmid Diversity and Adaptation Analyzed by Massive Sequencing of Escherichia coli Plasmids.

    PubMed

    de Toro, María; Garcillán-Barcia, M Pilar; De La Cruz, Fernando

    2014-12-01

    Whole-genome sequencing is revolutionizing the analysis of bacterial genomes. It leads to a massive increase in the amount of available data to be analyzed. Bacterial genomes are usually composed of one main chromosome and a number of accessory chromosomes, called plasmids. A recently developed methodology called PLACNET (for plasmid constellation networks) allows the reconstruction of the plasmids of a given genome. Thus, it opens an avenue for plasmidome analysis on a global scale. This work reviews our knowledge of the genetic determinants for plasmid propagation (conjugation and related functions), their diversity, and their prevalence in the variety of plasmids found by whole-genome sequencing. It focuses on the results obtained from a collection of 255 Escherichia coli plasmids reconstructed by PLACNET. The plasmids found in E. coli represent a nonaleatory subset of the plasmids found in proteobacteria. Potential reasons for the prevalence of some specific plasmid groups will be discussed and, more importantly, additional questions will be posed.

  14. Exploiting genes and functional diversity of chlorogenic acid and luteolin biosyntheses in Lonicera japonica and their substitutes.

    PubMed

    Yuan, Yuan; Wang, Zhouyong; Jiang, Chao; Wang, Xumin; Huang, Luqi

    2014-01-25

    Chlorogenic acids (CGAs) and luteolin are active compounds in Lonicera japonica, a plant of high medicinal value in traditional Chinese medicine. This study provides a comprehensive overview of gene families involved in chlorogenic acid and luteolin biosynthesis in L. japonica, as well as its substitutes Lonicera hypoglauca and Lonicera macranthoides. The gene sequence features and gene expression patterns in various tissues and buds of the species were characterized. Bioinformatics analysis identified 14 chlorogenic acid and luteolin biosynthesis-related genes in the L. japonica transcriptome assembly. Phylogenetic analyses suggested that the functions of individual genes may have differentiated, giving rise to active compound diversity. Their orthologous genes were also recognized in L. hypoglauca and L. macranthoides genomic datasets, except for LHCHS1 and LMC4H2. The expression patterns of these genes differ among the tissues of L. japonica, L. hypoglauca and L. macranthoides. Results also showed that CGAs were controlled in the first step of biosynthesis, whereas both steps controlled luteolin in the bud of L. japonica. The expression of LJFNS2 exhibited positive correlation with luteolin levels in L. japonica. This study provides significant information for understanding the functional diversity of gene families involved in chlorogenic acid and luteolin biosynthesis, the active compound diversity of L. japonica and its substitutes, and the different usages of the three species.

  15. Human liver apolipoprotein B-100 cDNA: complete nucleic acid and derived amino acid sequence.

    PubMed Central

    Law, S W; Grant, S M; Higuchi, K; Hospattankar, A; Lackner, K; Lee, N; Brewer, H B

    1986-01-01

    Human apolipoprotein B-100 (apoB-100), the ligand on low density lipoproteins that interacts with the low density lipoprotein receptor and initiates receptor-mediated endocytosis and low density lipoprotein catabolism, has been cloned, and the complete nucleic acid and derived amino acid sequences have been determined. ApoB-100 cDNAs were isolated from normal human liver cDNA libraries utilizing immunoscreening as well as filter hybridization with radiolabeled apoB-100 oligodeoxynucleotides. The apoB-100 mRNA is 14.1 kilobases long encoding a mature apoB-100 protein of 4536 amino acids with a calculated amino acid molecular weight of 512,723. ApoB-100 contains 20 potential glycosylation sites, and 12 of a total of 25 cysteine residues are located in the amino-terminal region of the apolipoprotein providing a potential globular structure of the amino terminus of the protein. ApoB-100 contains relatively few regions of amphipathic helices, but compared to other human apolipoproteins it is enriched in beta-structure. The delineation of the entire human apoB-100 sequence will now permit a detailed analysis of the conformation of the protein, the low density lipoprotein receptor binding domain(s), and the structural relationship between apoB-100 and apoB-48 and will provide the basis for the study of genetic defects in apoB-100 in patients with dyslipoproteinemias. PMID:3464946

  16. Computer selection of oligonucleotide probes from amino acid sequences for use in gene library screening.

    PubMed

    Yang, J H; Ye, J H; Wallace, D C

    1984-01-11

    We present a computer program, FINPROBE, which utilizes known amino acid sequence data to deduce minimum redundancy oligonucleotide probes for use in screening cDNA or genomic libraries or in primer extension. The user enters the amino acid sequence of interest, the desired probe length, the number of probes sought, and the constraints on oligonucleotide synthesis. The computer generates a table of possible probes listed in increasing order of redundancy and provides the location of each probe in the protein and mRNA coding sequence. Activation of a next function provides the amino acid and mRNA sequences of each probe of interest as well as the complementary sequence and the minimum dissociation temperature of the probe. A final routine prints out the amino acid sequence of the protein in parallel with the mRNA sequence listing all possible codons for each amino acid.
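
    The abstract above describes ranking candidate probes by redundancy. FINPROBE itself is not reproduced here; as a minimal sketch of the underlying idea, the code below assumes the standard genetic code and scores each peptide window by the number of distinct coding sequences that could encode it (the product of the per-residue codon counts), then lists the windows in increasing order of redundancy. The function name and the example peptide are hypothetical, and synthesis constraints and melting temperatures are not modelled.

```python
from math import prod

# Number of codons per amino acid in the standard genetic code.
CODON_COUNTS = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def ranked_probe_windows(protein: str, window: int) -> list[tuple[int, int, str]]:
    """Return (redundancy, start, peptide) tuples sorted by increasing redundancy."""
    protein = protein.upper()
    if window < 1 or window > len(protein):
        raise ValueError("window must be between 1 and the protein length")
    candidates = []
    for start in range(len(protein) - window + 1):
        peptide = protein[start:start + window]
        # Each residue multiplies the number of possible coding sequences.
        redundancy = prod(CODON_COUNTS[aa] for aa in peptide)
        candidates.append((redundancy, start, peptide))
    return sorted(candidates)


# Hypothetical peptide; a 5-residue window corresponds to a 15-nucleotide probe.
for redundancy, start, peptide in ranked_probe_windows("MKWLSRMWFDE", 5)[:3]:
    print(f"start={start} peptide={peptide} redundancy={redundancy}")
```

    Windows rich in Met and Trp (one codon each) rank first, while those containing Leu, Ser or Arg (six codons each) are pushed down the list, which is why such programs favour the former when choosing probe positions.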

  17. Uncultivated microbial eukaryotic diversity: a method to link ssu rRNA gene sequences with morphology.

    PubMed

    Hirst, Marissa B; Kita, Kelley N; Dawson, Scott C

    2011-01-01

    Protists have traditionally been identified by cultivation and classified taxonomically based on their cellular morphologies and behavior. In the past decade, however, many novel protist taxa have been identified using cultivation independent ssu rRNA sequence surveys. New rRNA "phylotypes" from uncultivated eukaryotes have no connection to the wealth of prior morphological descriptions of protists. To link phylogenetically informative sequences with taxonomically informative morphological descriptions, we demonstrate several methods for combining whole cell rRNA-targeted fluorescent in situ hybridization (FISH) with cytoskeletal or organellar immunostaining. Either eukaryote or ciliate-specific ssu rRNA probes were combined with an anti-α-tubulin antibody or phalloidin, a common actin stain, to define cytoskeletal features of uncultivated protists in several environmental samples. The eukaryote ssu rRNA probe was also combined with Mitotracker® or a hydrogenosomal-specific anti-Hsp70 antibody to localize mitochondria and hydrogenosomes, respectively, in uncultivated protists from different environments. Using rRNA probes in combination with immunostaining, we linked ssu rRNA phylotypes with microtubule structure to describe flagellate and ciliate morphology in three diverse environments, and linked Naegleria spp. to their amoeboid morphology using actin staining in hay infusion samples. We also linked uncultivated ciliates to morphologically similar Colpoda-like ciliates using tubulin immunostaining with a ciliate-specific rRNA probe. Combining rRNA-targeted FISH with cytoskeletal immunostaining or stains targeting specific organelles provides a fast, efficient, high throughput method for linking genetic sequences with morphological features in uncultivated protists. When linked to phylotype, morphological descriptions of protists can both complement and vet the increasing number of sequences from uncultivated protists, including those of novel lineages

  18. Allelic Diversity and Population Structure in Oenococcus oeni as Determined from Sequence Analysis of Housekeeping Genes

    PubMed Central

    de las Rivas, Blanca; Marcobal, Ángela; Muñoz, Rosario

    2004-01-01

    Oenococcus oeni is the organism of choice for promoting malolactic fermentation in wine. The population biology of O. oeni is poorly understood. For a better understanding of the mode of genetic variation within this species, we used multilocus sequence typing (MLST) with the gyrB, pgm, ddl, recP, and mleA genes to investigate the genetic diversity and genetic relationships among 18 O. oeni strains isolated in various years from wines of the United States, France, Germany, Spain, and Italy. These strains have also been characterized by ribotyping and restriction fragment length polymorphism (RFLP) analysis of the PCR-amplified 16S-23S rRNA gene intergenic spacer region (ISR). Ribotyping grouped the strains into two groups; however, the RFLP analysis of the ISRs showed no differences in the strains analyzed. In contrast, MLST in oenococci had a good discriminatory ability, and we have found a higher genetic diversity than indicated by ribotyping analysis. All sequence types were represented by a single strain, and all the strains could be distinguished from each other because they had unique combinations of alleles. Strains assumed to be identical showed the same sequence type. Phylogenetic analyses indicated a panmictic population structure in O. oeni. Sequences were analyzed for evidence of recombination by split decomposition analysis and analysis of clustered polymorphisms. All results indicated that recombination plays a major role in creating the genetic heterogeneity of O. oeni. A low standardized index of association value indicated that the O. oeni genes analyzed are close to linkage equilibrium. This study constitutes the first step in the development of an MLST method for O. oeni and the first example of the application of MLST to a nonpathogenic food production bacterium. PMID:15574919

  19. Molecular sequence data of hepatitis B virus and genetic diversity after vaccination.

    PubMed

    van Ballegooijen, W Marijn; van Houdt, Robin; Bruisten, Sylvia M; Boot, Hein J; Coutinho, Roel A; Wallinga, Jacco

    2009-12-15

    The effect of vaccination programs on transmission of infectious disease is usually assessed by monitoring programs that rely on notifications of symptomatic illness. For monitoring of infectious diseases with a high proportion of asymptomatic cases or a low reporting rate, molecular sequence data combined with modern coalescent-based techniques offer a complementary tool to assess transmission. Here, the authors investigate the added value of using viral sequence data to monitor a vaccination program that was started in 1998 and was targeted against hepatitis B virus in men who have sex with men in Amsterdam, the Netherlands. The incidence in this target group, as estimated from the notifications of acute infections with hepatitis B virus, was low; therefore, there was insufficient power to show a significant change in incidence. In contrast, the genetic diversity, as estimated from the viral sequence collected from the target group, revealed a marked decrease after vaccination was introduced. Taken together, the findings suggest that introduction of vaccination coincided with a change in the target group toward behavior with a higher risk of infection. The authors argue that molecular sequence data provide a powerful additional monitoring instrument, next to conventional case registration, for assessing the impact of vaccination.

  20. Global Diversity Lines–A Five-Continent Reference Panel of Sequenced Drosophila melanogaster Strains

    PubMed Central

    Grenier, Jennifer K.; Arguello, J. Roman; Moreira, Margarida Cardoso; Gottipati, Srikanth; Mohammed, Jaaved; Hackett, Sean R.; Boughton, Rachel; Greenberg, Anthony J.; Clark, Andrew G.

    2015-01-01

    Reference collections of multiple Drosophila lines with accumulating collections of “omics” data have proven especially valuable for the study of population genetics and complex trait genetics. Here we present a description of a resource collection of 84 strains of Drosophila melanogaster whose genome sequences were obtained after 12 generations of full-sib inbreeding. The initial rationale for this resource was to foster development of a systems biology platform for modeling metabolic regulation by the use of natural polymorphisms as perturbations. As reference lines, they are amenable to repeated phenotypic measurements, and already a large collection of metabolic traits have been assayed. Another key feature of these strains is their widespread geographic origin, coming from Beijing, Ithaca, Netherlands, Tasmania, and Zimbabwe. After obtaining 12.5× coverage of paired-end Illumina sequence reads, SNP and indel calls were made with the GATK platform. Thorough quality control was enabled by deep sequencing one line to >100×, and single-nucleotide polymorphisms and indels were validated using ddRAD-sequencing as an orthogonal platform. In addition, a series of preliminary population genetic tests were performed with these single-nucleotide polymorphism data for assessment of data quality. We found 83 segregating inversions among the lines, and as expected these were especially abundant in the African sample. We anticipate that this will make a useful addition to the set of reference D. melanogaster strains, thanks to its geographic structuring and unusually high level of genetic diversity. PMID:25673134

  1. Genetic diversity analysis of Bt cotton genotypes in Pakistan using simple sequence repeat markers.

    PubMed

    Ullah, I; Iram, A; Iqbal, M Z; Nawaz, M; Hasni, S M; Jamil, S

    2012-03-14

    The popularity of genetically modified insect resistant (Bt) cotton has promoted large scale monocultures, which is thought to worsen the problem of crop genetic homogeneity. Information on genetic diversity among Bt cotton varieties is lacking. We evaluated genetic divergence among 19 Bt cotton genotypes using simple sequence repeat (SSR) markers. Thirty-seven of 104 surveyed primers were found informative. Fifty-two primers selected on the basis of reported intra-hirsutum polymorphism in a cotton marker database showed a high degree of polymorphism, 56% compared to 13% for randomly selected primers. A total of 177 loci were amplified, with an average of 1.57 loci per primer, generating 38 markers. The amplicons ranged in size from 98 to 256 bp. The genetic similarities among the 19 genotypes ranged from 0.902 to 0.982, with an average of 0.947, revealing a lack of diversity. Similarities among genotypes from public sector organizations were higher than those among genotypes developed by private companies. Hybrids were found to be more distant compared to commercial cultivars and advanced breeding lines. Cluster analysis grouped the 19 Bt cotton genotypes into three major clusters and two independent entries. Cultivars IR-3701, Ali Akbar-802 and advanced breeding line VH-259 grouped in subcluster B2, with very narrow genetic distances despite dissimilar parentage. We found a very high level of similarity among Pakistani-bred Bt cotton varieties, which means that genetically diverse recurrent parents should be included to enhance genetic diversity. The intra-hirsutum polymorphic SSRs were found to be highly informative for molecular genetic diversity studies in these cotton varieties.

  2. Sequence diversity within the capsular genes of Streptococcus pneumoniae serogroup 6 and 19.

    PubMed

    Elberse, Karin; Witteveen, Sandra; van der Heide, Han; van de Pol, Ingrid; Schot, Corrie; van der Ende, Arie; Berbers, Guy; Schouls, Leo

    2011-01-01

    The main virulence factor of Streptococcus pneumoniae is the capsule. The polysaccharides comprising this capsule are encoded by approximately 15 genes and differences in these genes result in different serotypes. The aim of this study was to investigate the sequence diversity of the capsular genes of serotypes 6A, 6B, 6C, 19A and 19F and to explore a possible effect of vaccination on variation and distribution of these serotypes in the Netherlands. The complete capsular gene locus was sequenced for 25 serogroup 6 and for 20 serogroup 19 isolates. If one or more genes varied in 10 or more base pairs from the reference sequence, it was designated as a capsular subtype. Allele-specific PCRs and specific gene sequencing of highly variable capsular genes were performed on 184 serogroup 6 and 195 serogroup 19 isolates to identify capsular subtypes. This revealed the presence of 6, 3 and a single capsular subtype within serotypes 6A, 6B and 6C, respectively. The serotype 19A and 19F isolates comprised 3 and 4 capsular subtypes, respectively. For serogroup 6, the genetic background, as determined by multi locus sequence typing (MLST) and multiple-locus variable number of tandem repeat analysis (MLVA), seemed to be closely related to the capsular subtypes, but this was less pronounced for serogroup 19 isolates. The data also suggest shifts in the occurrence of capsular subtypes within serotype 6A and 19A after introduction of the 7-valent pneumococcal vaccine. The shifts within these non-vaccine serotypes might indicate that these capsular subtypes are filling the niche of the vaccine serotypes. In conclusion, there is considerable DNA sequence variation of the capsular genes within pneumococcal serogroup 6 and 19. Such changes may result in altered polysaccharides or in strains that produce more capsular polysaccharides. Consequently, these altered capsules may be less sensitive for vaccine induced immunity.
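    The subtype rule stated in this abstract (one or more genes differing from the reference by 10 or more base pairs) can be illustrated with a minimal sketch. The function names, the gene names `geneA`/`geneB`, and the toy sequences are hypothetical. The sketch also assumes pre-aligned, equal-length sequences in which every mismatching column (including a gap) counts as one difference; the abstract does not say how the authors counted indels.

```python
def count_differences(gene_seq: str, ref_seq: str) -> int:
    """Count mismatching columns between two aligned, equal-length sequences."""
    if len(gene_seq) != len(ref_seq):
        raise ValueError("sequences must be aligned to the same length")
    return sum(a != b for a, b in zip(gene_seq.upper(), ref_seq.upper()))


def is_capsular_subtype(isolate_genes: dict, reference_genes: dict,
                        threshold: int = 10) -> bool:
    """Return True if any gene differs from its reference by >= threshold positions.

    The threshold of 10 comes from the abstract; everything else is an
    illustrative assumption.
    """
    return any(
        count_differences(seq, reference_genes[name]) >= threshold
        for name, seq in isolate_genes.items()
    )


# Toy data: hypothetical gene names and sequences, not real capsular genes.
reference = {"geneA": "A" * 40, "geneB": "C" * 40}
isolate_10 = {"geneA": "G" * 10 + "A" * 30, "geneB": "C" * 40}  # 10 differences
isolate_9 = {"geneA": "G" * 9 + "A" * 31, "geneB": "C" * 40}    # 9 differences

print(is_capsular_subtype(isolate_10, reference))  # True
print(is_capsular_subtype(isolate_9, reference))   # False
```

    Because the rule is "one or more genes", the check uses `any` rather than summing differences across the whole locus.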

  3. Targeted high-throughput growth hormone 1 gene sequencing reveals high within-breed genetic diversity in South African goats.

    PubMed

    Ncube, K T; Mdladla, K; Dzomba, E F; Muchadeyi, F C

    2016-06-01

    This study assessed the genetic diversity in the growth hormone 1 gene (GH1) within and between South African goat breeds. Polymerase chain reaction-targeted gene amplification together with Illumina MiSeq next-generation sequencing (NGS) was used to generate the full length (2.54 kb) of the growth hormone 1 gene and screen for SNPs in the South African Boer (SAB) (n = 17), Tankwa (n = 15) and South African village (n = 35) goat populations. A range of 27-58 SNPs per population were observed. Mutations resulting in amino acid changes were observed at exons 2 and 5. Higher within-breed diversity of 97.37% was observed within the population category consisting of SA village ecotypes and the Tankwa goats. Highest pairwise FST values ranging from 0.148 to 0.356 were observed between the SAB and both the South African village and Tankwa feral goat populations. Phylogenetic analysis indicated nine genetic clusters, which reflected close relationships between the South African populations and the other international breeds with the exception of the Italian Sarda breeds. Results imply greater potential for within-population selection programs, particularly with SA village goats.

  4. Exploring fungal diversity in deep-sea sediments from Okinawa Trough using high-throughput Illumina sequencing

    NASA Astrophysics Data System (ADS)

    Zhang, Xiao-Yong; Wang, Guang-Hua; Xu, Xin-Ya; Nong, Xu-Hua; Wang, Jie; Amin, Muhammad; Qi, Shu-Hua

    2016-10-01

    The present study investigated the fungal diversity in four different deep-sea sediments from Okinawa Trough using high-throughput Illumina sequencing of the nuclear ribosomal internal transcribed spacer-1 (ITS1). A total of 40,297 fungal ITS1 sequences clustered into 420 operational taxonomic units (OTUs) with 97% sequence similarity and 170 taxa were recovered from these sediments. Most ITS1 sequences (78%) belonged to the phylum Ascomycota, followed by Basidiomycota (17.3%), Zygomycota (1.5%) and Chytridiomycota (0.8%), and a small proportion (2.4%) belonged to unassigned fungal phyla. Compared with previous studies on fungal diversity of sediments from deep-sea environments by culture-dependent approaches and clone library analysis, the present results suggest that Illumina sequencing dramatically accelerates the discovery of fungal communities in deep-sea sediments. Furthermore, our results revealed that Sordariomycetes was the most diverse and abundant fungal class in this study, challenging the traditional view that the diversity of Sordariomycetes phylotypes is low in deep-sea environments. In addition, more than 12 taxa, accounting for 21.5% of sequences, have rarely been reported as deep-sea fungi, suggesting that the deep-sea sediments from Okinawa Trough harbor fungal communities that differ from those of other deep-sea environments. To our knowledge, this study is the first exploration of the fungal diversity in deep-sea sediments from Okinawa Trough using high-throughput Illumina sequencing.

  5. 37 CFR 1.822 - Symbols and format to be used for nucleotide and/or amino acid sequence data.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... for nucleotide and/or amino acid sequence data. 1.822 Section 1.822 Patents, Trademarks, and... Amino Acid Sequences § 1.822 Symbols and format to be used for nucleotide and/or amino acid sequence data. (a) The symbols and format to be used for nucleotide and/or amino acid sequence data...

  6. 37 CFR 1.822 - Symbols and format to be used for nucleotide and/or amino acid sequence data.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... for nucleotide and/or amino acid sequence data. 1.822 Section 1.822 Patents, Trademarks, and... Amino Acid Sequences § 1.822 Symbols and format to be used for nucleotide and/or amino acid sequence data. (a) The symbols and format to be used for nucleotide and/or amino acid sequence data...

  7. 37 CFR 1.822 - Symbols and format to be used for nucleotide and/or amino acid sequence data.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... for nucleotide and/or amino acid sequence data. 1.822 Section 1.822 Patents, Trademarks, and... Amino Acid Sequences § 1.822 Symbols and format to be used for nucleotide and/or amino acid sequence data. (a) The symbols and format to be used for nucleotide and/or amino acid sequence data...

  8. 37 CFR 1.822 - Symbols and format to be used for nucleotide and/or amino acid sequence data.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... for nucleotide and/or amino acid sequence data. 1.822 Section 1.822 Patents, Trademarks, and... Amino Acid Sequences § 1.822 Symbols and format to be used for nucleotide and/or amino acid sequence data. (a) The symbols and format to be used for nucleotide and/or amino acid sequence data...

  9. 37 CFR 1.822 - Symbols and format to be used for nucleotide and/or amino acid sequence data.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... for nucleotide and/or amino acid sequence data. 1.822 Section 1.822 Patents, Trademarks, and... Amino Acid Sequences § 1.822 Symbols and format to be used for nucleotide and/or amino acid sequence data. (a) The symbols and format to be used for nucleotide and/or amino acid sequence data...

  10. Inter simple sequence repeat (ISSR) analysis of genetic diversity in tef [Eragrostis tef (Zucc.) Trotter].

    PubMed

    Assefa, Kebebew; Merker, Arnulf; Tefera, Hailu

    2003-01-01

    The DNA polymorphism among 92 selected tef genotypes belonging to eight origin groups was assessed using eight inter simple sequence repeat (ISSR) primers. The objectives were to examine the possibility of using ISSR markers for unravelling genetic diversity in tef, and to assess the extent and pattern of genetic diversity in the test germplasm with respect to origin groups. The eight primers were able to separate or distinguish all of the 92 tef genotypes based on a total of 110 polymorphic bands among the test lines. The Jaccard similarity coefficient among the test genotypes ranged from 0.26 to 0.86, and at about the 60% similarity level the clustering of this matrix using the unweighted pair-group method based on arithmetic average (UPGMA) resulted in the formation of six major clusters of 2 to 37 lines, with a further eight lines remaining ungrouped. The standardized Nei genetic distance among the eight groups of origin ranged between 0.03 and 0.32. The UPGMA clustering using the standardized genetic distance matrix resulted in the identification of three clusters of the eight groups of origin with bootstrap values ranging from 56 to 97. The overall mean Shannon Weaver diversity index of the test lines was 0.73, indicating better resolution of genetic diversity in tef with ISSR markers than with phenotypic (morphological) traits used in previous studies. This can be attributed mainly to the larger number of loci generated for evaluation with ISSR analysis as compared to the small number of phenotypic traits amenable to assessment, which are further greatly affected by environment and genotype x environment interaction. Analysis of variance of mean Shannon Weaver diversity indices revealed substantial (P ≤ 0.05) variation in the level of diversity among the eight groups of origin. In conclusion, our results indicate that ISSR can be useful as DNA-based molecular markers for studying genetic diversity and phylogenetic relationships, DNA fingerprinting for the...
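    The two summary statistics named in this abstract, the Jaccard similarity coefficient and the Shannon Weaver diversity index, can be sketched as follows. This is a minimal illustration under stated assumptions: band patterns are scored as 0/1 vectors over the same loci, and the Shannon index is the generic form H = -Σ p·ln(p) over a list of frequencies. The abstract does not state the exact scoring or normalization the authors used, and the function names and toy data are hypothetical.

```python
import math


def jaccard_similarity(bands_a, bands_b):
    """Jaccard coefficient: shared bands / bands present in at least one genotype."""
    if len(bands_a) != len(bands_b):
        raise ValueError("band vectors must cover the same loci")
    shared = sum(1 for x, y in zip(bands_a, bands_b) if x and y)
    either = sum(1 for x, y in zip(bands_a, bands_b) if x or y)
    # Undefined when neither genotype shows any band; 0.0 is an arbitrary choice.
    return shared / either if either else 0.0


def shannon_index(frequencies):
    """Generic Shannon index H = -sum(p * ln p), skipping zero frequencies."""
    return -sum(p * math.log(p) for p in frequencies if p > 0)


# Toy band patterns for two hypothetical genotypes (1 = band present).
genotype_1 = [1, 1, 0, 1, 0, 1]
genotype_2 = [1, 0, 0, 1, 1, 1]

print(jaccard_similarity(genotype_1, genotype_2))  # 3 shared / 5 present = 0.6
print(shannon_index([0.5, 0.5]))                   # ln 2, about 0.693
```

    Shared absences are ignored by the Jaccard coefficient, which is one reason it is commonly chosen for dominant markers scored as band presence or absence.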

  11. High-resolution sequencing reveals unexplored archaeal diversity in freshwater wetland soils.

    PubMed

    Narrowe, Adrienne B; Angle, Jordan C; Daly, Rebecca A; Stefanik, Kay C; Wrighton, Kelly C; Miller, Christopher S

    2017-02-20

    Despite being key contributors to biogeochemical processes, archaea are frequently outnumbered by bacteria, and consequently are underrepresented in combined molecular surveys. Here, we demonstrate an approach to concurrently survey the archaea alongside the bacteria with high-resolution 16S rRNA gene sequencing, linking these community data to geochemical parameters. We applied this integrated analysis to hydric soils sampled across a model methane-emitting freshwater wetland. Geochemical profiles, archaeal communities, and bacterial communities were independently correlated with soil depth and water cover. Centimeters of soil depth and corresponding geochemical shifts consistently affected microbial community structure more than hundreds of meters of lateral distance. Methanogens with diverse metabolisms were detected across the wetland, but displayed surprising OTU-level partitioning by depth. Candidatus Methanoperedens spp. archaea thought to perform anaerobic oxidation of methane linked to iron reduction were abundant. Domain-specific sequencing also revealed unexpectedly diverse non-methane-cycling archaeal members. OTUs within the underexplored Woesearchaeota and Bathyarchaeota were prevalent across the wetland, with subgroups and individual OTUs exhibiting distinct occupancy and abundance distributions aligned with environmental gradients. This study adds to our understanding of ecological range for key archaeal taxa in a model freshwater wetland, and links these taxa and individual OTUs to hypotheses about processes governing biogeochemical cycling.

  12. Phylogenetic analysis of sequences from diverse bacteria with homology to the Escherichia coli rho gene.

    PubMed Central

    Opperman, T; Richardson, J P

    1994-01-01

    Genes from Pseudomonas fluorescens, Chromatium vinosum, Micrococcus luteus, Deinococcus radiodurans, and Thermotoga maritima with homology to the Escherichia coli rho gene were cloned and sequenced, and their sequences were compared with other available sequences. The species for all of the compared sequences are members of five bacterial phyla, including Thermotogales, the most deeply diverged phylum. This suggests that a rho-like gene is ubiquitous in the Bacteria and was present in their common ancestor. The comparative analysis revealed that the Rho homologs are highly conserved, exhibiting a minimum identity of 50% of their amino acid residues in pairwise comparisons. The ATP-binding domain had a particularly high degree of conservation, consisting of some blocks with sequences of residues that are very similar to segments of the alpha and beta subunits of F1-ATPase and of other blocks with sequences that are unique to Rho. The RNA-binding domain is more diverged than the ATP-binding domain. However, one of its most highly conserved segments includes an RNP1-like sequence, which is known to be involved in RNA binding. Overall, the degree of similarity is lowest in the first 50 residues (the first half of the RNA-binding domain), in the putative connector region between the RNA-binding and the ATP-binding domains, and in the last 50 residues of the polypeptide. Since functionally defective mutants for E. coli Rho exist in all three of these segments, they represent important parts of Rho that have undergone adaptive evolution. PMID:8051015
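    The "minimum identity ... in pairwise comparisons" reported in this abstract can be illustrated with a small sketch. It assumes pre-aligned sequences of equal length and defines identity as matching residues divided by the columns in which neither sequence has a gap. Other denominators are in common use, and the abstract does not say which one the authors applied. The sequence labels and toy peptides are hypothetical, not real Rho sequences.

```python
from itertools import combinations


def percent_identity(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Percent identical residues over columns where neither sequence has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = [(x, y) for x, y in zip(seq_a, seq_b) if x != gap and y != gap]
    if not compared:
        return 0.0
    matches = sum(x == y for x, y in compared)
    return 100.0 * matches / len(compared)


def minimum_pairwise_identity(aligned: dict) -> float:
    """Lowest percent identity across all unordered pairs of aligned sequences."""
    return min(
        percent_identity(aligned[a], aligned[b])
        for a, b in combinations(aligned, 2)
    )


# Toy alignment with hypothetical labels.
alignment = {
    "sp1": "MKTAYIAK",
    "sp2": "MKTAYLAK",
    "sp3": "MRTA-LSK",
}

print(percent_identity(alignment["sp1"], alignment["sp2"]))  # 7 of 8 columns = 87.5
print(round(minimum_pairwise_identity(alignment), 2))        # sp1 vs sp3: 4 of 7 = 57.14
```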

  13. Diversity of lactic acid bacteria population in ripened Parmigiano Reggiano cheese.

    PubMed

    Gala, Elisabetta; Landi, Sara; Solieri, Lisa; Nocetti, Marco; Pulvirenti, Andrea; Giudici, Paolo

    2008-07-31

    The diversity of dominant lactic acid bacteria population in 12 months ripened Parmigiano Reggiano cheeses was investigated by a polyphasic approach including culture-dependent and independent methods. Traditional plating, isolation of LAB and identification by 16S rDNA analysis showed that strains belonging to Lactobacillus casei group were the most frequently isolated. Lactobacillus helveticus, Lactobacillus delbrueckii subsp. lactis, Lactobacillus parabuchneri, and Lactobacillus buchneri species were detected with lower frequency. PCR-denaturing gradient gel electrophoresis (DGGE) applied to DNA extracted directly from cheese samples and sequencing of rDNA amplicons confirmed the complex microbiological pattern of LAB in ripened Parmigiano Reggiano cheeses, with the significant exception of the Lactobacillus fermentum species, which dominated in several samples, but was not detected by cultivation. The present combination of different approaches can effectively describe the lactic acid bacteria population of Parmigiano Reggiano cheese in advanced stages of ripening, giving useful information for elucidating the role of LAB in determining the final cheese quality.

  14. Transcriptome sequencing of diverse peanut (arachis) wild species and the cultivated species reveals a wealth of untapped genetic variability

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Next generation sequencing technologies and improved bioinformatics methods have provided opportunities to study sequence variability in complex polyploid transcriptomes. In this study, we used a diverse panel of twenty-two Arachis accessions representing seven Arachis hypogaea market classes, A-, B...

  15. Abundance and Genetic Diversity of nifH Gene Sequences in Anthropogenically Affected Brazilian Mangrove Sediments

    PubMed Central

    Dias, Armando Cavalcante Franco; Pereira e Silva, Michele de Cassia; Cotta, Simone Raposo; Dini-Andreote, Francisco; Soares, Fábio Lino; Salles, Joana Falcão; Azevedo, João Lúcio; van Elsas, Jan Dirk

    2012-01-01

    Although mangroves represent ecosystems of global importance, the genetic diversity and abundance of functional genes that are key to their functioning scarcely have been explored. Here, we present a survey based on the nifH gene across transects of sediments of two mangrove systems located along the coast line of São Paulo state (Brazil) which differed by degree of disturbance, i.e., an oil-spill-affected and an unaffected mangrove. The diazotrophic communities were assessed by denaturing gradient gel electrophoresis (DGGE), quantitative PCR (qPCR), and clone libraries. The nifH gene abundance was similar across the two mangrove sediment systems, as evidenced by qPCR. However, the nifH-based PCR-DGGE profiles revealed clear differences between the mangroves. Moreover, shifts in the nifH gene diversities were noted along the land-sea transect within the previously oiled mangrove. The nifH gene diversity depicted the presence of nitrogen-fixing bacteria affiliated with a wide range of taxa, encompassing members of the Alphaproteobacteria, Betaproteobacteria, Gammaproteobacteria, Firmicutes, and also a group of anaerobic sulfate-reducing bacteria. We also detected a unique mangrove-specific cluster of sequences denoted Mgv-nifH. Our results indicate that nitrogen-fixing bacterial guilds can be partially endemic to mangroves, and these communities are modulated by oil contamination, which has important implications for conservation strategies. PMID:22941088

  16. Human retroviruses and AIDS 1996. A compilation and analysis of nucleic acid and amino acid sequences

    SciTech Connect

    Myers, G.; Foley, B.; Korber, B.; Mellors, J.W.; Jeang, K.T.; Wain-Hobson, S.

    1997-04-01

    This compendium and the accompanying floppy diskettes are the result of an effort to compile and rapidly publish all relevant molecular data concerning the human immunodeficiency viruses (HIV) and related retroviruses. The scope of the compendium and database is best summarized by the five parts that it comprises: (1) Nucleic Acid Alignments and Sequences; (2) Amino Acid Alignments; (3) Analysis; (4) Related Sequences; and (5) Database Communications. Information within all the parts is updated throughout the year on the Web site, http://hiv-web.lanl.gov. While this publication could take the form of a review or sequence monograph, it is not so conceived. Instead, the literature from which the database is derived has simply been summarized and some elementary computational analyses have been performed upon the data. Interpretation and commentary have been avoided insofar as possible so that the reader can form his or her own judgments concerning the complex information. In addition to the general descriptions of the parts of the compendium, the user should read the individual introductions for each part.

  17. Identification of tropomyosins as major allergens in antarctic krill and mantis shrimp and their amino acid sequence characteristics.

    PubMed

    Motoyama, Kanna; Suma, Yota; Ishizaki, Shoichiro; Nagashima, Yuji; Lu, Ying; Ushio, Hideki; Shiomi, Kazuo

    2008-01-01

    Tropomyosin represents a major allergen of decapod crustaceans such as shrimps and crabs, and its highly conserved amino acid sequence (>90% identity) is a molecular basis of the immunoglobulin E (IgE) cross-reactivity among decapods. At present, however, little information is available about allergens in edible crustaceans other than decapods. In this study, the major allergen in two species of edible crustaceans, Antarctic krill Euphausia superba and mantis shrimp Oratosquilla oratoria that are taxonomically distinct from decapods, was demonstrated to be tropomyosin by IgE-immunoblotting using patient sera. The cross-reactivity of the tropomyosins from both species with decapod tropomyosins was also confirmed by inhibition IgE immunoblotting. Sequences of the tropomyosins from both species were determined by complementary deoxyribonucleic acid cloning. The mantis shrimp tropomyosin has high sequence identity (>90% identity) with decapod tropomyosins, especially with fast-type tropomyosins. On the other hand, the Antarctic krill tropomyosin is characterized by diverse alterations in region 13-42, the amino acid sequence of which is highly conserved for decapod tropomyosins, and hence, it shares somewhat lower sequence identity (82.4-89.8% identity) with decapod tropomyosins than the mantis shrimp tropomyosin. Quantification by enzyme-linked immunosorbent assay revealed that Antarctic krill contains tropomyosin at almost the same level as decapods, suggesting that its allergenicity is equivalent to decapods. However, mantis shrimp was assumed to be substantially not allergenic because of the extremely low content of tropomyosin.

  18. Targeted recovery of novel phylogenetic diversity from next-generation sequence data.

    PubMed

    Lynch, Michael D J; Bartram, Andrea K; Neufeld, Josh D

    2012-11-01

    Next-generation sequencing technologies have led to recognition of a so-called 'rare biosphere'. These microbial operational taxonomic units (OTUs) are defined by low relative abundance and may be specifically adapted to maintaining low population sizes. We hypothesized that mining of low-abundance next-generation 16S ribosomal RNA (rRNA) gene data would lead to the discovery of novel phylogenetic diversity, reflecting microorganisms not yet discovered by previous sampling efforts. Here, we test this hypothesis by combining molecular and bioinformatic approaches for targeted retrieval of phylogenetic novelty within rare biosphere OTUs. We combined BLASTN network analysis, phylogenetics and targeted primer design to amplify 16S rRNA gene sequences from unique potential bacterial lineages, comprising part of the rare biosphere from a multi-million sequence data set from an Arctic tundra soil sample. Demonstrating the feasibility of the protocol developed here, three of seven recovered phylogenetic lineages represented extremely divergent taxonomic entities. These divergent target sequences correspond to (a) a previously unknown lineage within the BRC1 candidate phylum, (b) a sister group to the early diverging and currently recognized monospecific Cyanobacteria Gloeobacter, a genus containing multiple plesiomorphic traits and (c) a highly divergent lineage phylogenetically resolved within mitochondria. A comparison to twelve next-generation data sets from additional soils suggested persistent low-abundance distributions of these novel 16S rRNA genes. The results demonstrate this sequence analysis and retrieval pipeline as applicable for exploring underrepresented phylogenetic novelty and recovering taxa that may represent significant steps in bacterial evolution.

  19. Rare recombination events generate sequence diversity among balancer chromosomes in Drosophila melanogaster.

    PubMed

    Miller, Danny E; Cook, Kevin R; Yeganeh Kazemi, Nazanin; Smith, Clarissa B; Cockrell, Alexandria J; Hawley, R Scott; Bergman, Casey M

    2016-03-08

    Multiply inverted balancer chromosomes that suppress exchange with their homologs are an essential part of the Drosophila melanogaster genetic toolkit. Despite their widespread use, the organization of balancer chromosomes has not been characterized at the molecular level, and the degree of sequence variation among copies of balancer chromosomes is unknown. To map inversion breakpoints and study potential diversity in descendants of a structurally identical balancer chromosome, we sequenced a panel of laboratory stocks containing the most widely used X chromosome balancer, First Multiple 7 (FM7). We mapped the locations of FM7 breakpoints to precise euchromatic coordinates and identified the flanking sequence of breakpoints in heterochromatic regions. Analysis of SNP variation revealed megabase-scale blocks of sequence divergence among currently used FM7 stocks. We present evidence that this divergence arose through rare double-crossover events that replaced a female-sterile allele of the singed gene (sn(X2)) on FM7c with a sequence from balanced chromosomes. We propose that although double-crossover events are rare in individual crosses, many FM7c chromosomes in the Bloomington Drosophila Stock Center have lost sn(X2) by this mechanism on a historical timescale. Finally, we characterize the original allele of the Bar gene (B(1)) that is carried on FM7, and validate the hypothesis that the origin and subsequent reversion of the B(1) duplication are mediated by unequal exchange. Our results reject a simple nonrecombining, clonal mode for the laboratory evolution of balancer chromosomes and have implications for how balancer chromosomes should be used in the design and interpretation of genetic experiments in Drosophila.

  20. Rare recombination events generate sequence diversity among balancer chromosomes in Drosophila melanogaster

    PubMed Central

    Miller, Danny E.; Cook, Kevin R.; Yeganeh Kazemi, Nazanin; Smith, Clarissa B.; Cockrell, Alexandria J.; Hawley, R. Scott; Bergman, Casey M.

    2016-01-01

    Multiply inverted balancer chromosomes that suppress exchange with their homologs are an essential part of the Drosophila melanogaster genetic toolkit. Despite their widespread use, the organization of balancer chromosomes has not been characterized at the molecular level, and the degree of sequence variation among copies of balancer chromosomes is unknown. To map inversion breakpoints and study potential diversity in descendants of a structurally identical balancer chromosome, we sequenced a panel of laboratory stocks containing the most widely used X chromosome balancer, First Multiple 7 (FM7). We mapped the locations of FM7 breakpoints to precise euchromatic coordinates and identified the flanking sequence of breakpoints in heterochromatic regions. Analysis of SNP variation revealed megabase-scale blocks of sequence divergence among currently used FM7 stocks. We present evidence that this divergence arose through rare double-crossover events that replaced a female-sterile allele of the singed gene (snX2) on FM7c with a sequence from balanced chromosomes. We propose that although double-crossover events are rare in individual crosses, many FM7c chromosomes in the Bloomington Drosophila Stock Center have lost snX2 by this mechanism on a historical timescale. Finally, we characterize the original allele of the Bar gene (B1) that is carried on FM7, and validate the hypothesis that the origin and subsequent reversion of the B1 duplication are mediated by unequal exchange. Our results reject a simple nonrecombining, clonal mode for the laboratory evolution of balancer chromosomes and have implications for how balancer chromosomes should be used in the design and interpretation of genetic experiments in Drosophila. PMID:26903656

  1. Transcriptome Sequencing in Response to Salicylic Acid in Salvia miltiorrhiza

    PubMed Central

    Zhang, Xiaoru; Dong, Juane; Liu, Hailong; Wang, Jiao; Qi, Yuexin; Liang, Zongsuo

    2016-01-01

    Salvia miltiorrhiza is a traditional Chinese herbal medicine, whose quality and yield are often affected by diseases and environmental stresses during its growing season. Salicylic acid (SA) plays a significant role in plants responding to biotic and abiotic stresses, but the involved regulatory factors and their signaling mechanisms are largely unknown. In order to identify the genes involved in SA signaling, the RNA sequencing (RNA-seq) strategy was employed to evaluate the transcriptional profiles in S. miltiorrhiza cell cultures. A total of 50,778 unigenes were assembled, in which 5,316 unigenes were differentially expressed among 0-, 2-, and 8-h SA induction. The up-regulated genes were mainly involved in stimulus response and multi-organism process. A core set of candidate novel genes coding SA signaling component proteins was identified. Many transcription factors (e.g., WRKY, bHLH and GRAS) and genes involved in hormone signal transduction were differentially expressed in response to SA induction. Detailed analysis revealed that genes associated with defense signaling, such as antioxidant system genes, cytochrome P450s and ATP-binding cassette transporters, were significantly overexpressed, which can be used as genetic tools to investigate disease resistance. Our transcriptome analysis will help understand SA signaling and its mechanism of defense systems in S. miltiorrhiza. PMID:26808150

  2. First insights into the microbial diversity in the omasum and reticulum of bovine using Illumina sequencing.

    PubMed

    Peng, Shuai; Yin, Jigang; Liu, Xiaolei; Jia, Boyin; Chang, Zhiguang; Lu, Huijun; Jiang, Ning; Chen, Qijun

    2015-08-01

    The digestive systems of mammals harbor a complex gut microbiome, comprising bacteria and other microorganisms that confer metabolic and immunological benefits to the host. Ruminants that digest plant-based foods have a four-compartment stomach consisting of the rumen, reticulum, omasum, and abomasum. The microorganisms in the stomach are essential for providing the host with critical nutrients. However, the majority of these microorganisms are unknown species. The microbiome of the stomach is diverse, and the majority of these organisms cannot be cultured. Next-generation sequencing (NGS) combined with bioinformatic analysis tools has allowed the dissection of the composition of the microbiome in samples collected from a specific environment. In this study, for the first time, the bacterial composition in two compartments of the bovine stomach, the reticulum and the omasum, was analyzed using a metagenomic approach and compared to the bacterial composition of the rumen. These data will assist in understanding the biology of ruminants and benefit the agricultural industry. The diversity and composition of the bacterial community in samples collected from the rumen, reticulum, and omasum of bovines in the Changchun Region of Northeast China were analyzed by sequencing the V3 region of the 16S rRNA gene using a barcoded Illumina paired-end sequencing technique, and the primary composition of the microbiome in the rumen, reticulum, and omasum of the bovines was determined. These microbiomes contained 17 phyla and 107 genera in all three samples. Five phyla, Bacteroidetes, Firmicutes, Proteobacteria, Spirochaetes, and Lentisphaerae, were the most abundant taxonomic groups. Additionally, the different stomach compartments harbored different compositions of the microorganisms.

  3. Phylogenetic Analysis of Geographically Diverse Radopholus similis via rDNA Sequence Reveals a Monomorphic Motif.

    PubMed

    Kaplan, D T; Thomas, W K; Frisse, L M; Sarah, J L; Stanton, J M; Speijer, P R; Marin, D H; Opperman, C H

    2000-06-01

    The nucleic acid sequences of rDNA ITS1 and the rDNA D2/D3 expansion segment were compared for 57 burrowing nematode isolates collected from Australia, Cameroon, Central America, Cuba, Dominican Republic, Florida, Guadeloupe, Hawaii, Nigeria, Honduras, Indonesia, Ivory Coast, Puerto Rico, South Africa, and Uganda. Of the 57 isolates, 55 were morphologically similar to Radopholus similis and seven were citrus-parasitic. The nucleic acid sequences for PCR-amplified ITS1 and for the D2/D3 expansion segment of the 28S rDNA gene were each identical for all putative R. similis. Sequence divergence for both the ITS1 and the D2/D3 was concordant with morphological differences that distinguish R. similis from other burrowing nematode species. This result substantiates previous observations that the R. similis genome is highly conserved across geographic regions. Autapomorphies that would delimit phylogenetic lineages of non-citrus-parasitic R. similis from those that parasitize citrus were not observed. The data presented herein support the concept that R. similis is comprised of two pathotypes: one that parasitizes citrus and one that does not.

  4. Human retroviruses and aids, 1992. A compilation and analysis of nucleic acid and amino acid sequences

    SciTech Connect

    Myers, G.; Korber, B.; Berzofsky, J.A.; Pavlakis, G.N.; Smith, R.F.

    1992-10-01

    This compendium and the accompanying floppy diskettes are the result of an effort to compile and rapidly publish all relevant molecular data concerning the human immunodeficiency viruses (HIV) and related retroviruses. The scope of the compendium and database is best summarized by the five parts that it comprises: (I) HIV and SIV Nucleotide Sequences; (II) Amino Acid Sequences; (III) Analyses; (IV) Related Sequences; and (V) Database Communications. Information within all the parts is updated at least twice in each year, which accounts for the modes of binding and pagination in the compendium. While this publication could take the form of a review or sequence monograph, it is not so conceived. Instead, the literature from which the database is derived has simply been summarized and some elementary computational analyses have been performed upon the data. Interpretation and commentary have been avoided insofar as possible so that the reader can form his or her own judgments concerning the complex information. In addition to the general descriptions below of the parts of the compendium, the user should read the individual introductions for each part.

  5. Determining the Cellular Diversity of Hepatitis C Virus Quasispecies by Single-Cell Viral Sequencing

    PubMed Central

    McLauchlan, John

    2013-01-01

    Single-cell genomics is emerging as an important tool in cellular biology. We describe for the first time a system to investigate RNA virus quasispecies diversity at the cellular level utilizing hepatitis C virus (HCV) replicons. A high-fidelity nested reverse transcription (RT)-PCR assay was developed, and validation using control transcripts of known copy number indicated a detection limit of 3 copies of viral RNA/reaction. This system was used to determine the cellular diversity of subgenomic JFH-1 HCV replicons constitutively expressed in Huh7 cells. Each cell contained a unique quasispecies that was much less diverse than the quasispecies of the bulk cell population from which the single cells were derived, suggesting the occurrence of independent evolution at the cellular level. An assessment of the replicative fitness of the predominant single-cell quasispecies variants indicated a modest reduction in fitness compared to the wild type. Real-time RT-PCR methods capable of determining single-cell viral loads were developed and indicated an average of 113 copies of replicon RNA per cell, correlating with calculated RNA copy numbers in the bulk cell population. This study introduces a single-cell RNA viral-sequencing method with numerous potential applications to explore host-virus interactions during infection. HCV quasispecies diversity varied greatly between cells in vitro, suggesting different within-cell evolutionary pathways. Such divergent trajectories in vivo could have implications for the evolution and establishment of antiviral-resistant variants and host immune escape mutants. PMID:24049174

  6. Sequence Diversity, Intersubgroup Relationships, and Origins of the Mouse Leukemia Gammaretroviruses of Laboratory and Wild Mice

    PubMed Central

    Bamunusinghe, Devinka; Naghashfar, Zohreh; Buckler-White, Alicia; Plishka, Ronald; Baliji, Surendranath; Liu, Qingping; Kassner, Joshua; Oler, Andrew J.; Hartley, Janet

    2016-01-01

    ABSTRACT Mouse leukemia viruses (MLVs) are found in the common inbred strains of laboratory mice and in the house mouse subspecies of Mus musculus. Receptor usage and envelope (env) sequence variation define three MLV host range subgroups in laboratory mice: ecotropic, polytropic, and xenotropic MLVs (E-, P-, and X-MLVs, respectively). These exogenous MLVs derive from endogenous retroviruses (ERVs) that were acquired by the wild mouse progenitors of laboratory mice about 1 million years ago. We analyzed the genomes of seven MLVs isolated from Eurasian and American wild mice and three previously sequenced MLVs to describe their relationships and identify their possible ERV progenitors. The phylogenetic tree based on the receptor-determining regions of env produced expected host range clusters, but these clusters are not maintained in trees generated from other virus regions. Colinear alignments of the viral genomes identified segmental homologies to ERVs of different host range subgroups. Six MLVs show close relationships to a small xenotropic ERV subgroup largely confined to the inbred mouse Y chromosome. env variations define three E-MLV subtypes, one of which carries duplications of various sizes, sequences, and locations in the proline-rich region of env. Outside the env region, all E-MLVs are related to different nonecotropic MLVs. These results document the diversity in gammaretroviruses isolated from globally distributed Mus subspecies, provide insight into their origins and relationships, and indicate that recombination has had an important role in the evolution of these mutagenic and pathogenic agents. IMPORTANCE Laboratory mice carry mouse leukemia viruses (MLVs) of three host range groups which were acquired from their wild mouse progenitors. We sequenced the complete genomes of seven infectious MLVs isolated from geographically separated Eurasian and American wild mice and compared them with endogenous germ line retroviruses (ERVs) acquired early in

  7. Diversity and distribution of culturable lactic acid bacterial species in Indonesian Sayur Asin

    PubMed Central

    Mangunwardoyo, Wibowo; Abinawanto; Salamah, Andi; Sukara, Endang; Sulistiani; Dinoto, Achmad

    2016-01-01

    Background and Objectives: Lactic acid bacteria (LAB) play important roles in processing of Sayur Asin (spontaneously fermented mustard). Unfortunately, information about LAB in Indonesian Sayur Asin prepared by traditional manufacturers, which is important as baseline data for maintenance of food quality and safety, is unclear. The aim of this study was to describe the diversity and distribution of culturable lactic acid bacteria in Sayur Asin of Indonesia. Materials and Methods: Four Sayur Asin samples (fermentation liquor and fermented mustard) were collected at harvesting times (3–7 days after fermentation) from two traditional manufacturers in Tulung Agung (TA) and Kediri (KDR), East Java provinces, Indonesia. LAB strains were isolated by using the MRS agar method supplemented with 1% CaCO3 and characterized morphologically. Identification of the strains was performed based on 16S rDNA analysis and the phylogenetic tree was drawn to understand the phylogenetic relationship of the collected strains. Results: Different profiles were detected in total count of the plates, salinity and pH of fermenting liquor of Sayur Asin in TA and KDR provinces. A total of 172 LAB isolates were successfully isolated and identified based on their 16S rDNA sequences. Phylogenetic analysis of 27 representative LAB strains from Sayur Asin showed that these strains belonged to 5 distinct species, namely Lactobacillus farciminis (N=32), L. fermentum (N=4), L. namurensis (N=15), L. plantarum (N=118) and L. parafarraginis (N=1). Strains D5-S-2013 and B4-S-2013 showed a close phylogenetic relationship with L. composti and L. paralimentarius, respectively, whereas their sequence similarity was slightly lower than 99%, suggesting that they may be classified into novel species and need further investigation due to significant differences in their nucleotide sequences. Lactobacillus plantarum was found to be dominant in all Sayur Asin samples. Conclusion: Lactobacilli were

  8. Impact of Human Immunodeficiency Virus Type-1 Sequence Diversity on Antiretroviral Therapy Outcomes

    PubMed Central

    Langs-Barlow, Allison; Paintsil, Elijah

    2014-01-01

    Worldwide circulating HIV-1 genomes show extensive variation represented by different subtypes, polymorphisms and drug-resistant strains. Reports on the impact of sequence variation on antiretroviral therapy (ART) outcomes are mixed. In this review, we summarize relevant published data from both resource-rich and resource-limited countries in the last 10 years on the impact of HIV-1 sequence diversity on treatment outcomes. The prevalence of transmission of drug resistant mutations (DRMs) varies considerably, ranging from 0% to 27% worldwide. Factors such as geographic location, access and availability to ART, duration since inception of treatment programs, quality of care, risk-taking behaviors, mode of transmission, and viral subtype all dictate the prevalence in a particular geographical region. Although HIV-1 subtype may not be a good predictor of treatment outcome, review of emerging evidence supports the fact that HIV-1 genome sequence, resulting from natural polymorphisms or drug-associated mutations, matters when it comes to treatment outcomes. Therefore, continued surveillance of drug resistant variants in both treatment-naïve and treatment-experienced populations is needed to reduce the transmission of DRMs and to optimize the efficacy of the current ART armamentarium. PMID:25333465

  9. Single molecule sequencing to track plasmid diversity of hospital-associated carbapenemase-producing Enterobacteriaceae

    PubMed Central

    Conlan, Sean; Thomas, Pamela J.; Deming, Clayton; Park, Morgan; Lau, Anna F.; Dekker, John P.; Snitkin, Evan S.; Clark, Tyson A.; Luong, Khai; Song, Yi; Tsai, Yu-Chih; Boitano, Matthew; Gupta, Jyoti; Brooks, Shelise Y.; Schmidt, Brian; Young, Alice C.; Thomas, James W.; Bouffard, Gerard G.; Blakesley, Robert W.; Mullikin, James C.; Korlach, Jonas; Henderson, David K.; Frank, Karen M.; Palmore, Tara N.; Segre, Julia A.

    2014-01-01

    Public health officials have raised concerns that plasmid transfer between Enterobacteriaceae species may spread resistance to carbapenems, an antibiotic class of last resort, thereby rendering common healthcare-associated infections nearly impossible to treat. We performed comprehensive surveillance and genomic sequencing to identify carbapenem-resistant Enterobacteriaceae in the NIH Clinical Center patient population and hospital environment in order to articulate the diversity of carbapenemase-encoding plasmids and assess the mobility of these plasmids between bacterial species. We isolated a repertoire of carbapenemase-encoding Enterobacteriaceae, including multiple strains of Klebsiella pneumoniae, Klebsiella oxytoca, Escherichia coli, Enterobacter cloacae, Citrobacter freundii, and Pantoea species. Long-read genome sequencing with full end-to-end assembly revealed that these organisms carry the carbapenem-resistance genes on a wide array of plasmids. Klebsiella pneumoniae and Enterobacter cloacae isolated simultaneously from a single patient harbored two different carbapenemase-encoding plasmids, overriding the epidemiological scenario of plasmid transfer between organisms within this patient. We did, however, find evidence supporting horizontal transfer of carbapenemase-encoding plasmids between Klebsiella pneumoniae, Enterobacter cloacae and Citrobacter freundii in the hospital environment. Our comprehensive sequence data, with full plasmid identification, challenge assumptions about horizontal gene transfer events within patients and identify wider possible connections between patients and the hospital environment. In addition, we identified a new carbapenemase-encoding plasmid of potentially high clinical impact carried by Klebsiella pneumoniae, Escherichia coli, Enterobacter cloacae and Pantoea species, from unrelated patients and the hospital environment. PMID:25232178

  10. High-throughput nucleotide sequence analysis of diverse bacterial communities in leachates of decomposing pig carcasses.

    PubMed

    Yang, Seung Hak; Lim, Joung Soo; Khan, Modabber Ahmed; Kim, Bong Soo; Choi, Dong Yoon; Lee, Eun Young; Ahn, Hee Kwon

    2015-01-01

    The leachate generated by the decomposition of animal carcass has been implicated as an environmental contaminant surrounding the burial site. High-throughput nucleotide sequencing was conducted to investigate the bacterial communities in leachates from the decomposition of pig carcasses. We acquired 51,230 reads from six different samples (1, 2, 3, 4, 6 and 14 week-old carcasses) and found that sequences representing the phylum Firmicutes predominated. The diversity of bacterial 16S rRNA gene sequences in the leachate was the highest at 6 weeks, in contrast to those at 2 and 14 weeks. The relative abundance of Firmicutes was reduced, while the proportion of Bacteroidetes and Proteobacteria increased from 3-6 weeks. The representation of phyla was restored after 14 weeks. However, the community structures between the samples taken at 1-2 and 14 weeks differed at the bacterial classification level. The trend in pH was similar to the changes seen in bacterial communities, indicating that the pH of the leachate could be related to the shift in the microbial community. The results indicate that the composition of bacterial communities in leachates of decomposing pig carcasses shifted continuously during the study period and might be influenced by the burial site.

  11. Identification and characterization of rhizospheric microbial diversity by 16S ribosomal RNA gene sequencing

    PubMed Central

    Naveed, Muhammad; Mubeen, Samavia; Khan, SamiUllah; Ahmed, Iftikhar; Khalid, Nauman; Suleria, Hafiz Ansar Rasul; Bano, Asghari; Mumtaz, Abdul Samad

    2014-01-01

    In the present study, samples of rhizosphere and root nodules were collected from different areas of Pakistan to isolate plant growth promoting rhizobacteria. Identification of bacterial isolates was made by 16S rRNA gene sequence analysis and taxonomical confirmation on EzTaxon Server. The identified bacterial strains belonged to 5 genera, i.e. Ensifer, Bacillus, Pseudomonas, Leclercia and Rhizobium. Phylogenetic analysis inferred from 16S rRNA gene sequences showed the evolutionary relationship of bacterial strains with the respective genera. Based on phylogenetic analysis, some candidate novel species were also identified. The bacterial strains were also characterized by morphological, physiological and biochemical tests and for the glucose dehydrogenase (gdh) gene, which is involved in phosphate solubilization using the cofactor pyrroloquinoline quinone (PQQ). Seven rhizospheric and 3 root-nodulating strains were positive for the gdh gene. Furthermore, this study confirms a novel association between microbes and their hosts like field grown crops, leguminous and non-leguminous plants. It was concluded that a diverse bacterial population exists in the rhizosphere and root nodules that might be useful in evaluating the mechanisms behind plant-microbe interactions, and that strains QAU-63 and QAU-68, with sequence similarities of 97% and 95%, respectively, might be declared novel after further taxonomic characterization. PMID:25477935

  12. Development of simple sequence repeat markers and diversity analysis in alfalfa (Medicago sativa L.).

    PubMed

    Wang, Zan; Yan, Hongwei; Fu, Xinnian; Li, Xuehui; Gao, Hongwen

    2013-04-01

    Efficient and robust molecular markers are essential for molecular breeding in plants. Compared to dominant and bi-allelic markers, multiple alleles of simple sequence repeat (SSR) markers are particularly informative and superior in genetic linkage map and QTL mapping in autotetraploid species like alfalfa. The objective of this study was to enrich SSR markers directly from alfalfa expressed sequence tags (ESTs). A total of 12,371 alfalfa ESTs were retrieved from the National Center for Biotechnology Information. Total 774 SSR-containing ESTs were identified from 716 ESTs. On average, one SSR was found per 7.7 kb of EST sequences. Tri-nucleotide repeats (48.8%) were the most abundant motif type, followed by di- (26.1%), tetra- (11.5%), penta- (9.7%), and hexanucleotide (3.9%) repeats. One hundred EST-SSR primer pairs were successfully designed and 29 exhibited polymorphism among 28 alfalfa accessions. The allele number per marker ranged from two to 21 with an average of 6.8. The PIC values ranged from 0.195 to 0.896 with an average of 0.608, indicating a high level of polymorphism of the EST-SSR markers. Based on the 29 EST-SSR markers, an assessment of genetic diversity was conducted, which found that Medicago sativa ssp. sativa was clearly different from the other subspecies. High transferability of these EST-SSR markers to related species was also found.
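
    The PIC (polymorphism information content) values quoted above can be illustrated with a minimal sketch. The abstract does not say which formula or software was used; the code assumes the common definition PIC = 1 - sum(p_i^2) - sum over i<j of 2 p_i^2 p_j^2, and the function name and allele frequencies are hypothetical. Estimating allele frequencies in an autotetraploid such as alfalfa is a separate problem that this sketch does not address.

```python
from itertools import combinations


def pic(allele_freqs):
    """Polymorphism information content for a single marker.

    allele_freqs: iterable of allele frequencies that sum to 1
    (hypothetical input; not data from the study).
    """
    freqs = list(allele_freqs)
    if abs(sum(freqs) - 1.0) > 1e-6:
        raise ValueError("allele frequencies must sum to 1")
    # Probability that two randomly drawn alleles are identical.
    sum_sq = sum(p * p for p in freqs)
    # Correction term summed over all unordered pairs of distinct alleles.
    cross = sum(2 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return 1.0 - sum_sq - cross


if __name__ == "__main__":
    # Two equally frequent alleles.
    print(pic([0.5, 0.5]))
    # Four equally frequent alleles are more informative.
    print(pic([0.25, 0.25, 0.25, 0.25]))
```

    Under this definition a marker with a single fixed allele scores 0, and the score rises as alleles become more numerous and more evenly distributed, which is consistent with reading an average of 0.608 as a high level of polymorphism.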

  13. High-throughput nucleotide sequence analysis of diverse bacterial communities in leachates of decomposing pig carcasses

    PubMed Central

    Yang, Seung Hak; Lim, Joung Soo; Khan, Modabber Ahmed; Kim, Bong Soo; Choi, Dong Yoon; Lee, Eun Young; Ahn, Hee Kwon

    2015-01-01

    The leachate generated by the decomposition of animal carcass has been implicated as an environmental contaminant surrounding the burial site. High-throughput nucleotide sequencing was conducted to investigate the bacterial communities in leachates from the decomposition of pig carcasses. We acquired 51,230 reads from six different samples (1, 2, 3, 4, 6 and 14 week-old carcasses) and found that sequences representing the phylum Firmicutes predominated. The diversity of bacterial 16S rRNA gene sequences in the leachate was the highest at 6 weeks, in contrast to those at 2 and 14 weeks. The relative abundance of Firmicutes was reduced, while the proportion of Bacteroidetes and Proteobacteria increased from 3–6 weeks. The representation of phyla was restored after 14 weeks. However, the community structures between the samples taken at 1–2 and 14 weeks differed at the bacterial classification level. The trend in pH was similar to the changes seen in bacterial communities, indicating that the pH of the leachate could be related to the shift in the microbial community. The results indicate that the composition of bacterial communities in leachates of decomposing pig carcasses shifted continuously during the study period and might be influenced by the burial site. PMID:26500442

  14. In vivo sequence diversity of the protease of human immunodeficiency virus type 1: presence of protease inhibitor-resistant variants in untreated subjects.

    PubMed Central

    Lech, W J; Wang, G; Yang, Y L; Chee, Y; Dorman, K; McCrae, D; Lazzeroni, L C; Erickson, J W; Sinsheimer, J S; Kaplan, A H

    1996-01-01

    We have evaluated the sequence diversity of the protease of human immunodeficiency virus type 1 in vivo. Our analysis of 246 protease coding domain sequences obtained from 12 subjects indicates that amino acid substitutions predicted to give rise to protease inhibitor resistance may be present in patients who have not received protease inhibitors. In addition, we demonstrated that amino acid residues directly involved in enzyme-substrate interactions may be varied in infected individuals. Several of these substitutions occurred in combination either more or less frequently than would be expected if their appearance was independent, suggesting that one substitution may compensate for the effects of another. Taken together, our analysis indicates that the human immunodeficiency virus type 1 protease has flexibility sufficient to vary critical subsites in vivo, thereby retaining enzyme function and viral pathogenicity. PMID:8627733
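
    The comparison of observed and expected co-occurrence mentioned above can be sketched as follows. The abstract does not describe the statistical method used, so this is only an illustration of the underlying arithmetic: under independence, the expected joint frequency of two substitutions is the product of their individual frequencies. The function name, the substitution labels "A" and "B", and the toy data are all hypothetical, and a real analysis would add a significance test.

```python
def cooccurrence(sequences, sub_a, sub_b):
    """Return (observed, expected) joint frequency of two substitutions.

    sequences: list of sets, each holding the substitution labels
    present in one sequence (hypothetical representation).
    expected is computed assuming the two substitutions are independent.
    """
    n = len(sequences)
    if n == 0:
        raise ValueError("at least one sequence is required")
    freq_a = sum(sub_a in s for s in sequences) / n
    freq_b = sum(sub_b in s for s in sequences) / n
    observed = sum(sub_a in s and sub_b in s for s in sequences) / n
    expected = freq_a * freq_b
    return observed, expected


if __name__ == "__main__":
    # Toy data: four sequences with placeholder substitution labels.
    toy = [{"A", "B"}, {"A", "B"}, {"A"}, set()]
    obs, exp = cooccurrence(toy, "A", "B")
    print(obs, exp)
```

    An observed value well above the expected value would point to substitutions that tend to travel together, and a value well below it to substitutions that tend to exclude each other.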

  15. CDR3 clonotype and amino acid motif diversity of BV19 expressing circulating human CD8 T cells

    PubMed Central

    Yassai, Maryam B.; Demos, Wendy; Janczak, Teresa; Naumova, Elena N.; Gorski, Jack

    2015-01-01

    Generating a detailed description of human T cell repertoire diversity is an important goal in the study of human immunology. The circulation is the source of most T cells used for studies in humans. Here we use high throughput sequencing of TCR BV19 transcripts from CD8 T cells derived from unmanipulated PBMC from an older HLA-A2 individual to provide a quantitative and qualitative description of the clonotypic CDR3 nucleotide and amino acid composition of the TCR β-chain from this subset of circulating CD8 T cells. Aggregated samples from six time points spanning ~1.5 years were analyzed to smooth possible temporal fluctuation. BV19 encompasses the well studied RS-encoding clonotypes involved in recognition of the M1(58–66) epitope from influenza A in HLA-A2 individuals. The clonotype distribution was diverse, complex and self-similar. The amino acid composition was generally skewed in favor of glycines and there were specific amino acids observed at higher frequency at the NDN start position. The motif repertoire distribution was also diverse, complex and self-similar with respect to CDR3 length, NDN start and length. PMID:26593155

  16. Sequence Diversity of VP4 and VP7 Genes of Human Rotavirus Strains in Saudi Arabia.

    PubMed

    Abdel-Moneim, Ahmed S; Al-Malky, Mater I R; Alsulaimani, Adnan A A; Abuelsaad, Abdelaziz S A; Mohamed, Imad; Ismail, Ayman K

    2015-12-01

    Group A rotavirus is responsible for inducing severe diarrhea in young children worldwide. Rotavirus vaccines are used to control the disease in many countries. In the current study, the sequences of human rotavirus G and P types in Saudi Arabia are reported and compared to different relevant published sequences. In addition, the VP4 and VP7 genes of the G1P[8] strains are compared to different antigenic epitopes of the rotavirus vaccines. Stool samples were collected from children under 2 years suffering from severe diarrhea. Screening of the rotavirus-positive samples was performed with a rapid antigen detection kit. RNA was amplified from rotavirus-positive samples by reverse transcriptase polymerase chain reaction assay for both VP4 and VP7 genes. Direct sequencing of the VP4 and VP7 genes was conducted and the obtained sequences were compared to each other and to the rotavirus vaccines. Both G1P[8] and G1P[4] genotypes were detected. Phylogenetic analysis revealed that the detected strains belong to G1 lineage 1 and 2, P[8] lineage 3, and to P[4] lineage 5. Multiple amino acid substitutions were detected between the Saudi RVA strains and the commonly used vaccines. The current findings emphasize the importance of the continuous surveillance of the circulating rotavirus strains, which is crucial for monitoring virus evolution and helping in predicting the protection level afforded by rotavirus vaccines.

  17. Sequence-Based Predictions of Lipooligosaccharide Diversity in the Neisseriaceae and Their Implication in Pathogenicity

    PubMed Central

    Stein, Daniel C.; Miller, Clinton J.; Bhoopalan, Senthil V.; Sommer, Daniel D.

    2011-01-01

    Endotoxin [Lipopolysaccharide (LPS)/Lipooligosaccharide (LOS)] is an important virulence determinant in Gram-negative bacteria. While the genetic basis of endotoxin production and its role in disease in the pathogenic Neisseria has been extensively studied, little research has focused on the genetic basis of LOS biosynthesis in commensal Neisseria. We determined the genomic sequences of a variety of commensal Neisseria strains, and compared these sequences, along with other genomic sequences available from various sequencing centers from commensal and pathogenic strains, to identify genes involved in LOS biosynthesis. This allowed us to make structural predictions as to differences in LOS seen between commensal and pathogenic strains. We determined that all neisserial strains possess a conserved set of genes needed to make a common 3-Deoxy-D-manno-octulosonic acid-heptose core structure. However, significant genomic differences in glycosyl transferase genes support the published literature indicating compositional differences in the terminal oligosaccharides. This was most pronounced in commensal strains that were distally related to the gonococcus and meningococcus. These strains possessed a homolog of heptosyltransferase III, suggesting that they differ from the pathogenic strains by the presence of a third heptose. Furthermore, most commensal strains possess homologs of genes needed to synthesize lipopolysaccharide (LPS). N. cinerea, a commensal species that is highly related to the gonococcus, has lost the ability to make sialyltransferase. Overall genomic comparisons of various neisserial strains indicate that significant recombination/genetic acquisition/loss has occurred within the genus, and this muddles proper speciation. PMID:21533118

  18. Coupled enhancer and coding sequence evolution of a homeobox gene shaped leaf diversity

    PubMed Central

    Vuolo, Francesco; Mentink, Remco A.; Hajheidari, Mohsen; Bailey, C. Donovan; Filatov, Dmitry A.; Tsiantis, Miltos

    2016-01-01

    Here we investigate mechanisms underlying the diversification of biological forms using crucifer leaf shape as an example. We show that evolution of an enhancer element in the homeobox gene REDUCED COMPLEXITY (RCO) altered leaf shape by changing gene expression from the distal leaf blade to its base. A single amino acid substitution evolved together with this regulatory change, which reduced RCO protein stability, preventing pleiotropic effects caused by its altered gene expression. We detected hallmarks of positive selection in these evolved regulatory and coding sequence variants and showed that modulating RCO activity can improve plant physiological performance. Therefore, interplay between enhancer and coding sequence evolution created a potentially adaptive path for morphological evolution. PMID:27852629

  19. Completion of the amino acid sequence of the alpha 1 chain from type I calf skin collagen. Amino acid sequence of alpha 1(I)B8.

    PubMed Central

    Glanville, R W; Breitkreutz, D; Meitinger, M; Fietzek, P P

    1983-01-01

    The complete amino acid sequence of the 279-residue CNBr peptide CB8 from the alpha 1 chain of type I calf skin collagen is presented. It was determined by sequencing overlapping fragments of CB8 produced by Staphylococcus aureus V8 proteinase, trypsin, Endoproteinase Arg-C and hydroxylamine. Tryptic cleavages were also made specific for lysine by blocking arginine residues with cyclohexane-1,2-dione. This completes the amino acid sequence analysis of the 1054-residue-long alpha 1(I) chain of calf skin collagen. PMID:6354180

  20. Predictable conformational diversity in foldamers of sugar amino acids.

    PubMed

    Menyhard, Dora K; Hudaky, Ilona; Jákli, Imre; Juhász, György; Perczel, András

    2017-03-27

    A systematic conformational search was carried out for monomers and homohexamers of furanoid β-amino acids: cis-(S,R) and trans-(S,S) stereoisomers of aminocyclopentane carboxylic acid (ACPC), two different aminofuranuronic-acids (AFU(α) and AFU(β)), their isopropylidene derivatives (AFU(ip)) as well as the key intermediate β-aminotetrahydrofurancarboxylic acid (ATFC). Stereochemistry of the building blocks was chosen to match that of natural sugar amino acid (xylose and ribose) precursors. Results show that hexamers of cis furanoid β-amino acids show great variability: while hydrophobic cyclopentane (cis(ACPC)6), and hydrophilic (cisXylAFU(α/β))6 foldamers favor two different zigzagged conformations as hexamers, the backbone fold turns into a helix in case of (cisATFC)6 (10-helix) and (cisAFU(ip))6 (14-helix). Trans stereochemistry resulted in hexamers exclusively of right-handed helix conformation, (H12(P))6, regardless of their polarity. We found that the preferred oligomeric structure of cis/(S,R)AFU(α/β) is conformationally compatible with β-pleated sheets, while that of the trans/(S,S) units matches the α-helices of α-proteins.

  1. Using Whole Genome Analysis to Examine Recombination across Diverse Sequence Types of Staphylococcus aureus

    PubMed Central

    Driebe, Elizabeth M.; Sahl, Jason W.; Roe, Chandler; Bowers, Jolene R.; Schupp, James M.; Gillece, John D.; Kelley, Erin; Price, Lance B.; Pearson, Talima R.; Hepp, Crystal M.; Brzoska, Pius M.; Cummings, Craig A.; Furtado, Manohar R.; Andersen, Paal S.; Stegger, Marc; Engelthaler, David M.; Keim, Paul S.

    2015-01-01

    Staphylococcus aureus is an important clinical pathogen worldwide and understanding this organism's phylogeny and, in particular, the role of recombination, is important both to understand the overall spread of virulent lineages and to characterize outbreaks. To further elucidate the phylogeny of S. aureus, 35 diverse strains were sequenced using whole genome sequencing. In addition, 29 publicly available whole genome sequences were included to create a single nucleotide polymorphism (SNP)-based phylogenetic tree encompassing 11 distinct lineages. All strains of a particular sequence type fell into the same clade with clear groupings of the major clonal complexes of CC8, CC5, CC30, CC45 and CC1. Using a novel analysis method, we plotted the homoplasy density and SNP density across the whole genome and found evidence of recombination throughout the entire chromosome, but when we examined individual clonal lineages we found very little recombination. However, when we analyzed three branches of multiple lineages, we saw intermediate and differing levels of recombination between them. These data demonstrate that in S. aureus, recombination occurs across major lineages that subsequently expand in a clonal manner. Estimated mutation rates for the CC8 and CC5 lineages were different from each other. While the CC8 lineage rate was similar to previous studies, the CC5 lineage was 100-fold greater. Fifty known virulence genes were screened in all genomes in silico to determine their distribution across major clades. Thirty-three genes were present variably across clades, most of which were not constrained by ancestry, indicating horizontal gene transfer or gene loss. PMID:26161978

  2. An Integrated Sequence-Structure Database incorporating matching mRNA sequence, amino acid sequence and protein three-dimensional structure data.

    PubMed Central

    Adzhubei, I A; Adzhubei, A A; Neidle, S

    1998-01-01

    We have constructed a non-homologous database, termed the Integrated Sequence-Structure Database (ISSD) which comprises the coding sequences of genes, amino acid sequences of the corresponding proteins, their secondary structure and straight phi,psi angles assignments, and polypeptide backbone coordinates. Each protein entry in the database holds the alignment of nucleotide sequence, amino acid sequence and the PDB three-dimensional structure data. The nucleotide and amino acid sequences for each entry are selected on the basis of exact matches of the source organism and cell environment. The current version 1.0 of ISSD is available on the WWW at http://www.protein.bio.msu.su/issd/ and includes 107 non-homologous mammalian proteins, of which 80 are human proteins. The database has been used by us for the analysis of synonymous codon usage patterns in mRNA sequences showing their correlation with the three-dimensional structure features in the encoded proteins. Possible ISSD applications include optimisation of protein expression, improvement of the protein structure prediction accuracy, and analysis of evolutionary aspects of the nucleotide sequence-protein structure relationship. PMID:9399866

  3. Neuronal subtypes and diversity revealed by single-nucleus RNA sequencing of the human brain

    PubMed Central

    Lake, Blue B.; Ai, Rizi; Kaeser, Gwendolyn E.; Salathia, Neeraj S.; Yung, Yun C.; Liu, Rui; Wildberg, Andre; Gao, Derek; Fung, Ho-Lim; Chen, Song; Vijayaraghavan, Raakhee; Wong, Julian; Chen, Allison; Sheng, Xiaoyan; Kaper, Fiona; Shen, Richard; Ronaghi, Mostafa; Fan, Jian-Bing; Wang, Wei; Chun, Jerold; Zhang, Kun

    2016-01-01

    The human brain has enormously complex cellular diversity and connectivities fundamental to our neural functions, yet difficulties in interrogating individual neurons have impeded understanding of the underlying transcriptional landscape. We developed a scalable approach to sequence and quantify RNA molecules in isolated neuronal nuclei from post-mortem brain, generating 3,227 sets of single neuron data from six distinct regions of the cerebral cortex. Using an iterative clustering and classification approach, we identified 16 neuronal subtypes that were further annotated on the basis of known markers and cortical cytoarchitecture. These data demonstrate a robust and scalable method for identifying and categorizing single nuclear transcriptomes, revealing shared genes sufficient to distinguish novel and orthologous neuronal subtypes as well as regional identity within the human brain. PMID:27339989

  4. A dominant conformational role for amino acid diversity in minimalist protein–protein interfaces

    SciTech Connect

    Gilbreth, Ryan N.; Esaki, Kaori; Koide, Akiko; Sidhu, Sachdev S.; Koide, Shohei

    2008-08-01

    Recent studies have shown that highly simplified interaction surfaces consisting of combinations of just two amino acids, Tyr and Ser, exhibit high affinity and specificity. The high functional levels of such minimalist interfaces might thus indicate small contributions of greater amino acid diversity seen in natural interfaces. Toward addressing this issue, we have produced a pair of binding proteins built on the fibronectin type III scaffold, termed “monobodies.” One monobody contains the Tyr/Ser binary-code interface (termed YS) and the other contains an expanded amino acid diversity interface (YSX), but both bind to an identical target, maltose-binding protein. The YSX monobody bound with higher affinity, a slower off rate and a more favorable enthalpic contribution than the YS monobody. High-resolution X-ray crystal structures revealed that both proteins bound to an essentially identical epitope, providing a unique opportunity to directly investigate the role of amino acid diversity in a protein interaction interface. Surprisingly, Tyr still dominates the YSX paratope and the additional amino acid types are primarily used to conformationally optimize contacts made by tyrosines. Scanning mutagenesis showed that while all contacting Tyr side chains are essential in the YS monobody, the YSX interface was more tolerant to mutations. These results suggest that the conformational, not chemical, diversity of additional types of amino acids provided higher functionality and evolutionary robustness, supporting the dominant role of Tyr and the importance of conformational diversity in forming protein interaction interfaces.

  5. Seasonal diversity and dynamics of haptophytes in the Skagerrak, Norway, explored by high-throughput sequencing

    PubMed Central

    Egge, Elianne Sirnæs; Johannessen, Torill Vik; Andersen, Tom; Eikrem, Wenche; Bittner, Lucie; Larsen, Aud; Sandaa, Ruth-Anne; Edvardsen, Bente

    2015-01-01

    Microalgae in the division Haptophyta play key roles in the marine ecosystem and in global biogeochemical processes. Despite their ecological importance, knowledge on seasonal dynamics, community composition and abundance at the species level is limited due to their small cell size and few morphological features visible under the light microscope. Here, we present unique data on haptophyte seasonal diversity and dynamics from two annual cycles, with the taxonomic resolution and sampling depth obtained with high-throughput sequencing. From outer Oslofjorden, S Norway, nano- and picoplanktonic samples were collected monthly for 2 years, and the haptophytes targeted by amplification of RNA/cDNA with Haptophyta-specific 18S rDNA V4 primers. We obtained 156 operational taxonomic units (OTUs), from c. 400,000 454 pyrosequencing reads, after rigorous bioinformatic filtering and clustering at 99.5%. Most OTUs represented uncultured and/or not yet 18S rDNA-sequenced species. Haptophyte OTU richness and community composition exhibited high temporal variation and significant yearly periodicity. Richness was highest in September–October (autumn) and lowest in April–May (spring). Some taxa were detected all year, such as Chrysochromulina simplex, Emiliania huxleyi and Phaeocystis cordata, whereas most calcifying coccolithophores only appeared from summer to early winter. We also revealed the seasonal dynamics of OTUs representing putative novel classes (clades HAP-3–5) or orders (clades D, E, F). Season, light and temperature accounted for 29% of the variation in OTU composition. Residual variation may be related to biotic factors, such as competition and viral infection. This study provides new, in-depth knowledge on seasonal diversity and dynamics of haptophytes in North Atlantic coastal waters. PMID:25893259

  6. Expanding the diversity of unnatural cell surface sialic acids

    SciTech Connect

    Luchansky, Sarah J.; Goon, Scarlett; Bertozzi, Carolyn R.

    2003-10-30

    Novel chemical reactivity can be introduced onto cell surfaces through metabolic oligosaccharide engineering. This technique exploits the substrate promiscuity of cellular biosynthetic enzymes to deliver unnatural monosaccharides bearing bioorthogonal functional groups into cellular glycans. For example, derivatives of N-acetylmannosamine (ManNAc) are converted by the cellular biosynthetic machinery into the corresponding sialic acids and subsequently delivered to the cell surface in the form of sialoglycoconjugates. Analogs of N-acetylglucosamine (GlcNAc) and N-acetylgalactosamine (GalNAc) are also metabolized and incorporated into cell surface glycans, likely through the sialic acid and GalNAc salvage pathways, respectively. Furthermore, GlcNAc analogs can be incorporated into nucleocytoplasmic proteins in place of β-O-GlcNAc residues. These pathways have been exploited to integrate unique electrophiles such as ketones and azides into the target glycoconjugate class. These functional groups can be further elaborated in a chemoselective fashion by condensation with hydrazides and by Staudinger ligation, respectively, thereby introducing detectable probes onto the cell. In conclusion, sialic acid derivatives are efficient vehicles for delivery of bulky functional groups to cell surfaces and masking of their hydroxyl groups improves their cellular uptake and utilization. Furthermore, the successful introduction of photoactivatable aryl azides into cell surface glycans opens up new avenues for studying sialic acid-binding proteins and elucidating the role of sialic acid in essential processes such as signaling and cell adhesion.

  7. Complete amino acid sequence and structure characterization of the taste-modifying protein, miraculin.

    PubMed

    Theerasilp, S; Hitotsuya, H; Nakajo, S; Nakaya, K; Nakamura, Y; Kurihara, Y

    1989-04-25

    The taste-modifying protein, miraculin, has the unusual property of modifying sour taste into sweet taste. The complete amino acid sequence of miraculin purified from miracle fruits by a newly developed method (Theerasilp, S., and Kurihara, Y. (1988) J. Biol. Chem. 263, 11536-11539) was determined by an automatic Edman degradation method. Miraculin was a single polypeptide with 191 amino acid residues. The calculated molecular weight based on the amino acid sequence and the carbohydrate content (13.9%) was 24,600. Asn-42 and Asn-186 were linked N-glycosidically to carbohydrate chains. High homology was found between the amino acid sequences of miraculin and soybean trypsin inhibitor.
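
    The mass figures quoted above can be related by simple arithmetic. The sketch below assumes (the abstract does not say so explicitly) that 13.9% is the carbohydrate mass fraction of the whole glycoprotein and that 24,600 is the total mass; the variable names are illustrative only.

```python
# Worked arithmetic for the miraculin figures quoted in the abstract.
# Assumption: 13.9% is carbohydrate as a mass fraction of the total
# glycoprotein mass (24,600), not of the polypeptide alone.
TOTAL_MASS = 24_600          # calculated molecular weight from the abstract
CARBOHYDRATE_FRACTION = 0.139
RESIDUES = 191               # amino acid residues in the single polypeptide

# Mass attributable to the polypeptide chain under the assumption above.
polypeptide_mass = TOTAL_MASS * (1.0 - CARBOHYDRATE_FRACTION)

# Mean mass per residue implied by that split.
mean_residue_mass = polypeptide_mass / RESIDUES

print(f"polypeptide mass: {polypeptide_mass:.1f}")
print(f"mean residue mass: {mean_residue_mass:.1f}")
```

    Under this reading the polypeptide accounts for roughly 21,200 of the total, which corresponds to a mean residue mass of about 111, a typical value for proteins.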

  8. Analysis of Plasmodium falciparum diversity in natural infections by deep sequencing.

    PubMed

    Manske, Magnus; Miotto, Olivo; Campino, Susana; Auburn, Sarah; Almagro-Garcia, Jacob; Maslen, Gareth; O'Brien, Jack; Djimde, Abdoulaye; Doumbo, Ogobara; Zongo, Issaka; Ouedraogo, Jean-Bosco; Michon, Pascal; Mueller, Ivo; Siba, Peter; Nzila, Alexis; Borrmann, Steffen; Kiara, Steven M; Marsh, Kevin; Jiang, Hongying; Su, Xin-Zhuan; Amaratunga, Chanaki; Fairhurst, Rick; Socheat, Duong; Nosten, Francois; Imwong, Mallika; White, Nicholas J; Sanders, Mandy; Anastasi, Elisa; Alcock, Dan; Drury, Eleanor; Oyola, Samuel; Quail, Michael A; Turner, Daniel J; Ruano-Rubio, Valentin; Jyothi, Dushyanth; Amenga-Etego, Lucas; Hubbart, Christina; Jeffreys, Anna; Rowlands, Kate; Sutherland, Colin; Roper, Cally; Mangano, Valentina; Modiano, David; Tan, John C; Ferdig, Michael T; Amambua-Ngwa, Alfred; Conway, David J; Takala-Harrison, Shannon; Plowe, Christopher V; Rayner, Julian C; Rockett, Kirk A; Clark, Taane G; Newbold, Chris I; Berriman, Matthew; MacInnis, Bronwyn; Kwiatkowski, Dominic P

    2012-07-19

    Malaria elimination strategies require surveillance of the parasite population for genetic changes that demand a public health response, such as new forms of drug resistance. Here we describe methods for the large-scale analysis of genetic variation in Plasmodium falciparum by deep sequencing of parasite DNA obtained from the blood of patients with malaria, either directly or after short-term culture. Analysis of 86,158 exonic single nucleotide polymorphisms that passed genotyping quality control in 227 samples from Africa, Asia and Oceania provides genome-wide estimates of allele frequency distribution, population structure and linkage disequilibrium. By comparing the genetic diversity of individual infections with that of the local parasite population, we derive a metric of within-host diversity that is related to the level of inbreeding in the population. An open-access web application has been established for the exploration of regional differences in allele frequency and of highly differentiated loci in the P. falciparum genome.

  9. Genetic diversity analysis of okra (Abelmoschus esculentus L.) by inter-simple sequence repeat (ISSR) markers.

    PubMed

    Yuan, C Y; Zhang, C; Wang, P; Hu, S; Chang, H P; Xiao, W J; Lu, X T; Jiang, S B; Ye, J Z; Guo, X H

    2014-04-25

    Okra (Abelmoschus esculentus L.) is not only a nutrient-rich vegetable but also an important medicinal herb. Inter-simple sequence repeat (ISSR) markers were employed to investigate the genetic diversity and differentiation of 24 okra genotypes. In this study, the PCR products were separated by electrophoresis on 8% nondenaturing polyacrylamide gel and visualized by silver staining. The 22 ISSR primers produced 289 amplified DNA fragments, and 145 (50%) fragments were polymorphic. The 289 markers were used to construct the dendrogram based on the unweighted pair-group method with arithmetic average (UPGMA) cluster analysis. The dendrogram indicated that 24 okras were clustered into 4 geographically distinct groups. The average polymorphism information content (PIC) was 0.531929, which showed that the majority of primers were informative. The high values of allele frequency, genetic diversity, and heterozygosity showed that primer-sample combinations produced measurable fragments. The mean distances ranged from 0.045455 to 0.454545. The dendrogram indicated that the ISSR markers succeeded in distinguishing most of the 24 varieties in relation to their genetic backgrounds and geographical origins.

  10. Molecular Diversity Assessment Using Sequence Related Amplified Polymorphism (SRAP) Markers in Vicia faba L.

    PubMed Central

    Alghamdi, Salem S.; Al-Faifi, Sulieman A.; Migdadi, Hussein M.; Khan, Muhammad Altaf; El-Harty, Ehab H.; Ammar, Megahed H.

    2012-01-01

    Sequence-related amplified polymorphism (SRAP) markers were used to assess the genetic diversity and relationship among 58 faba bean (Vicia faba L.) genotypes. Fourteen SRAP primer combinations amplified a total of 1036 differently sized well-resolved peaks (fragments), of which all were polymorphic with a 0.96 PIC value and discriminated all of the 58 faba bean genotypes. An average pairwise similarity of 21% was revealed among the genotypes ranging from 2% to 65%. At a similarity of 28%, UPGMA clustered the genotypes into three main groups comprising 78% of the genotypes. The local landraces and most of the Egyptian genotypes in addition to the Sudan genotypes were grouped in the first main cluster. The advanced breeding lines were scattered in the second and third main clusters with breeding lines from the ICARDA and genotypes introduced from Egypt. At a similarity of 47%, all the genotypes formed separated clusters with the exceptions of Hassawi 1 and Hassawi 2. Group analysis of the genotypes according to their geographic origin and type showed that the landraces were grouped according to their origin, while others were grouped according to their seed type. To our knowledge, this is the first application of SRAP markers for the assessment of genetic diversity in faba bean. Such information will be useful to determine optimal breeding strategies to allow continued progress in faba bean breeding. PMID:23211669
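
    Pairwise similarity between genotypes scored for band presence/absence can be sketched as follows. The abstract does not name the similarity coefficient, so the Jaccard coefficient is used here purely as an assumption; the genotype labels and band profiles are invented toy data, not values from the study.

```python
from itertools import combinations

def jaccard(a, b):
    """Jaccard similarity of two 0/1 band profiles of equal length."""
    shared = sum(1 for x, y in zip(a, b) if x and y)   # bands present in both
    either = sum(1 for x, y in zip(a, b) if x or y)    # bands present in at least one
    return shared / either if either else 0.0

def mean_pairwise_similarity(profiles):
    """Average Jaccard similarity over all unordered genotype pairs."""
    pairs = list(combinations(profiles.values(), 2))
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)

# Hypothetical band profiles (1 = fragment present, 0 = absent).
toy_profiles = {
    "G1": [1, 1, 0, 0, 1],
    "G2": [1, 0, 0, 1, 1],
    "G3": [0, 1, 1, 0, 0],
}

mean_similarity = mean_pairwise_similarity(toy_profiles)
print(f"mean pairwise similarity: {mean_similarity:.2f}")
```

    A matrix of such pairwise values (converted to distances) is the usual input to UPGMA clustering; other coefficients such as Dice or simple matching would give different numbers.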

  11. Investigating genetic diversity in sapucaia using inter simple sequence repeat markers.

    PubMed

    Borges, R C; Santos, F M G; Maia, M C C; Lima, P S C; Valente, S E S

    2016-08-19

    Sapucaia is a tree species originating from the Brazilian Amazon and is widely distributed in Brazil, especially in the mid-north region (Piauí and Maranhão states). Its seeds are rich in calories and proteins, and possess great potential for commercialization. Little is known about the genetic variability in the germplasm of most Lecythis species. Here, 11 inter-simple sequence repeat primers were used to estimate the genetic variability among 17 accessions, and to determine the levels of genetic variation and the patterns of population structure in sapucaia. The accessions were obtained from the active germplasm bank (AGB) of Embrapa Meio-Norte, Teresina, PI, Brazil, and corresponded to four occurrence areas. Ninety-six loci were analyzed among the studied individuals. High variation was found at the species level, where the percentage of polymorphic bands was 94.79%, Nei's genetic diversity (h) was 0.3110, and Shannon's index (I) was 0.4732. In the analyzed populations, the percentage polymorphism ranged from 20.83 to 94.79%, Nei's genetic diversity ranged from 0.0863 to 0.2969, and Shannon's index ranged from 0.1260 to 0.4457. Significant genetic differentiation was detected among the populations (ΦST = 10.66%); however, most of the genetic variation was found within the populations (89.34%), between which there was an intermediate level of gene flow (Nm = 1.10). Accessions BGS 2 and BGS 4 were the most divergent, whereas accessions BGS 14 and BGS 15 were the most similar. Therefore, sapucaia analyzed from the AGB present an elevated level of genetic diversity and may have potential use in genetic breeding programs.
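
    The two diversity statistics reported above can be illustrated per locus. This is a minimal sketch assuming the common two-allele form for dominant markers, h = 1 - (p² + q²) and I = -(p ln p + q ln q) with q = 1 - p; how allele frequencies were estimated from band frequencies in the study is not stated, and the frequencies below are invented.

```python
import math

def nei_gene_diversity(p):
    """Nei's gene diversity for a two-allele locus with allele frequency p."""
    q = 1.0 - p
    return 1.0 - (p * p + q * q)

def shannon_index(p):
    """Shannon's information index (natural log) for a two-allele locus."""
    total = 0.0
    for f in (p, 1.0 - p):
        if f > 0.0:              # treat 0 * ln(0) as 0
            total -= f * math.log(f)
    return total

def mean_over_loci(freqs, statistic):
    """Average a per-locus statistic across all scored loci."""
    return sum(statistic(p) for p in freqs) / len(freqs)

# Hypothetical allele frequencies at four loci (not from the study).
toy_freqs = [0.5, 0.9, 0.2, 1.0]

print(f"h = {mean_over_loci(toy_freqs, nei_gene_diversity):.4f}")
print(f"I = {mean_over_loci(toy_freqs, shannon_index):.4f}")
```

    Both statistics are zero at a monomorphic locus and peak at p = 0.5 (h = 0.5, I = ln 2), so averages well below those maxima, as reported in the abstract, are expected when many loci have skewed frequencies.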

  12. Detection and isolation of nucleic acid sequences using a bifunctional hybridization probe

    DOEpatents

    Lucas, Joe N.; Straume, Tore; Bogen, Kenneth T.

    2000-01-01

    A method for detecting and isolating a target sequence in a sample of nucleic acids is provided using a bifunctional hybridization probe capable of hybridizing to the target sequence that includes a detectable marker and a first complexing agent capable of forming a binding pair with a second complexing agent. A kit is also provided for detecting a target sequence in a sample of nucleic acids using a bifunctional hybridization probe according to this method.

  13. Structural diversity and biological significance of lipoteichoic acid in Gram-positive bacteria: focusing on beneficial probiotic lactic acid bacteria.

    PubMed

    Shiraishi, Tsukasa; Yokota, Shinichi; Fukiya, Satoru; Yokota, Atsushi

    2016-01-01

    Bacterial cell surface molecules are at the forefront of host-bacterium interactions. Teichoic acids are observed only in Gram-positive bacteria, and they are one of the main cell surface components. Teichoic acids play important physiological roles and contribute to the bacterial interaction with their host. In particular, lipoteichoic acid (LTA) anchored to the cell membrane has attracted attention as a host immunomodulator. Chemical and biological characteristics of LTA from various bacteria have been described. However, most of the information concerns pathogenic bacteria, and information on beneficial bacteria, including probiotic lactic acid bacteria, is insufficient. LTA is structurally diverse. Strain-level structural diversity of LTA is suggested to underpin its immunomodulatory activities. Thus, the structural information on LTA in probiotics, in particular strain-associated diversity, is important for understanding its beneficial roles associated with the modulation of immune response. Continued accumulation of structural information is necessary to elucidate the detailed physiological roles and significance of LTA. In this review article, we summarize the current state of knowledge on LTA structure, in particular the structure of LTA from lactic acid bacteria. We also describe the significance of structural diversity and biological roles of LTA.

  14. Structural diversity and biological significance of lipoteichoic acid in Gram-positive bacteria: focusing on beneficial probiotic lactic acid bacteria

    PubMed Central

    SHIRAISHI, Tsukasa; YOKOTA, Shinichi; FUKIYA, Satoru; YOKOTA, Atsushi

    2016-01-01

    Bacterial cell surface molecules are at the forefront of host-bacterium interactions. Teichoic acids are observed only in Gram-positive bacteria, and they are one of the main cell surface components. Teichoic acids play important physiological roles and contribute to the bacterial interaction with their host. In particular, lipoteichoic acid (LTA) anchored to the cell membrane has attracted attention as a host immunomodulator. Chemical and biological characteristics of LTA from various bacteria have been described. However, most of the information concerns pathogenic bacteria, and information on beneficial bacteria, including probiotic lactic acid bacteria, is insufficient. LTA is structurally diverse. Strain-level structural diversity of LTA is suggested to underpin its immunomodulatory activities. Thus, the structural information on LTA in probiotics, in particular strain-associated diversity, is important for understanding its beneficial roles associated with the modulation of immune response. Continued accumulation of structural information is necessary to elucidate the detailed physiological roles and significance of LTA. In this review article, we summarize the current state of knowledge on LTA structure, in particular the structure of LTA from lactic acid bacteria. We also describe the significance of structural diversity and biological roles of LTA. PMID:27867802

  15. K-Pax2: Bayesian identification of cluster-defining amino acid positions in large sequence datasets

    PubMed Central

    Grad, Yonatan; Cobey, Sarah; Puranen, Juha Santeri; Corander, Jukka

    2015-01-01

    The recent growth in publicly available sequence data has introduced new opportunities for studying microbial evolution and spread. Because the pace of sequence accumulation tends to exceed the pace of experimental studies of protein function and the roles of individual amino acids, statistical tools to identify meaningful patterns in protein diversity are essential. Large sequence alignments from fast-evolving micro-organisms are particularly challenging to dissect using standard tools from phylogenetics and multivariate statistics because biologically relevant functional signals are easily masked by neutral variation and noise. To meet this need, a novel computational method is introduced that is easily executed in parallel using a cluster environment and can handle thousands of sequences with minimal subjective input from the user. The usefulness of this kind of machine learning is demonstrated by applying it to nearly 5000 haemagglutinin sequences of influenza A/H3N2. Antigenic and 3D structural mapping of the results shows that the method can recover the major jumps in antigenic phenotype that occurred between 1968 and 2013 and identify specific amino acids associated with these changes. The method is expected to provide a useful tool to uncover patterns of protein evolution. PMID:28348810

  16. Chromosomal Organization and Sequence Diversity of Genes Encoding Lachrymatory Factor Synthase in Allium cepa L.

    PubMed

    Masamura, Noriya; McCallum, John; Khrustaleva, Ludmila; Kenel, Fernand; Pither-Joyce, Meegham; Shono, Jinji; Suzuki, Go; Mukai, Yasuhiko; Yamauchi, Naoki; Shigyo, Masayoshi

    2012-06-01

    Lachrymatory factor synthase (LFS) catalyzes the formation of lachrymatory factor, one of the most distinctive traits of bulb onion (Allium cepa L.). Therefore, we used LFS as a model for a functional gene in a huge genome, and we examined the chromosomal organization of LFS in A. cepa by multiple approaches. The first-level analysis completed the chromosomal assignment of LFS gene to chromosome 5 of A. cepa via the use of a complete set of A. fistulosum-shallot (A. cepa L. Aggregatum group) monosomic addition lines. Subsequent use of an F(2) mapping population from the interspecific cross A. cepa × A. roylei confirmed the assignment of an LFS locus to this chromosome. Sequence comparison of two BAC clones bearing LFS genes, LFS amplicons from diverse germplasm, and expressed sequences from a doubled haploid line revealed variation consistent with duplicated LFS genes. Furthermore, the BAC-FISH study using the two BAC clones as a probe showed that LFS genes are localized in the proximal region of the long arm of the chromosome. These results suggested that LFS in A. cepa is transcribed from at least two loci and that they are localized on chromosome 5.

  17. Expanding LAGLIDADG endonuclease scaffold diversity by rapidly surveying evolutionary sequence space

    PubMed Central

    Jacoby, Kyle; Metzger, Michael; Shen, Betty W.; Certo, Michael T.; Jarjour, Jordan; Stoddard, Barry L.; Scharenberg, Andrew M.

    2012-01-01

    LAGLIDADG homing endonucleases (LHEs) are a family of highly specific DNA endonucleases capable of recognizing target sequences ∼20 bp in length, thus drawing intense interest for their potential academic, biotechnological and clinical applications. Methods for rational design of LHEs to cleave desired target sites are presently limited by a small number of high-quality native LHEs to serve as scaffolds for protein engineering—many are unsatisfactory for gene targeting applications. One strategy to address such limitations is to identify close homologs of existing LHEs possessing superior biophysical or catalytic properties. To test this concept, we searched public sequence databases to identify putative LHE open reading frames homologous to the LHE I-AniI and used a DNA binding and cleavage assay using yeast surface display to rapidly survey a subset of the predicted proteins. These proteins exhibited a range of capacities for surface expression and also displayed locally altered binding and cleavage specificities with a range of in vivo cleavage activities. Of these enzymes, I-HjeMI demonstrated the greatest activity in vivo and was readily crystallizable, allowing a comparative structural analysis. Taken together, our results suggest that even highly homologous LHEs offer a readily accessible resource of related scaffolds that display diverse biochemical properties for biotechnological applications. PMID:22334611

  18. Characterization of fatty acid-producing wastewater microbial communities using next generation sequencing technologies

    EPA Science Inventory

    While wastewater represents a viable source of bacterial biodiesel production, very little is known on the composition of these microbial communities. We studied the taxonomic diversity and succession of microbial communities in bioreactors accumulating fatty acids using 454-pyro...

  19. Evaluation of cytochrome b mtDNA sequences in genetic diversity studies of Channa marulius (Channidae: Perciformes).

    PubMed

    Habib, Maria; Lakra, W S; Mohindra, Vindhya; Khare, Praveen; Barman, A S; Singh, Akanksha; Lal, Kuldeep K; Punia, Peyush; Khan, Asif A

    2011-02-01

    Channa marulius (Hamilton, 1822) is a commercially important freshwater fish and a potential candidate species for aquaculture. The present study evaluated partial Cytochrome b gene sequence of mtDNA for determining the genetic variation in wild populations of C. marulius. Genomic DNA extracted from C. marulius samples (n = 23) belonging to 3 distant rivers; Mahanadi, Teesta and Yamuna was analyzed. Sequencing of 307 bp Cytochrome b mtDNA fragment revealed the presence of 5 haplotypes with haplotype diversity value of 0.763 and nucleotide diversity value of 0.0128. Single population specific haplotype was observed in Mahanadi and Yamuna samples and 3 haplotypes in Teesta samples. The analysis of data demonstrated the suitability of partial Cytochrome b sequence in determining the genetic diversity in C. marulius population.
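
    The two summary statistics named above can be sketched as follows, assuming the standard definitions: haplotype diversity Hd = n/(n-1) · (1 - Σ pᵢ²) and nucleotide diversity as the mean number of pairwise differences per site. The haplotype counts and sequences below are hypothetical; the counts were chosen only to be consistent with n = 23 and five haplotypes and are not the study's data.

```python
from itertools import combinations

def haplotype_diversity(counts):
    """Hd = n/(n-1) * (1 - sum(p_i^2)) from per-haplotype sample counts."""
    n = sum(counts)
    if n < 2:
        raise ValueError("need at least two sequences")
    sum_sq = sum((c / n) ** 2 for c in counts)
    return n / (n - 1) * (1.0 - sum_sq)

def nucleotide_diversity(seqs):
    """Mean pairwise differences per site over aligned, equal-length sequences."""
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to the same length")
    pairs = list(combinations(seqs, 2))
    diffs = sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs)
    return diffs / (len(pairs) * length)

# Hypothetical counts for five haplotypes among 23 individuals.
toy_counts = [8, 7, 5, 2, 1]
# Hypothetical aligned toy sequences (far shorter than the 307 bp fragment).
toy_seqs = ["AAAA", "AAAT", "AATT"]

hd = haplotype_diversity(toy_counts)
pi = nucleotide_diversity(toy_seqs)
print(f"Hd = {hd:.3f}, pi = {pi:.4f}")
```

    Many different count distributions give similar Hd values, so agreement with a reported figure says nothing about the actual distribution of haplotypes among rivers.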

  20. Diverse gene sequences are overexpressed in Werner syndrome fibroblasts undergoing premature replicative senescence.

    PubMed

    Murano, S; Thweatt, R; Shmookler Reis, R J; Jones, R A; Moerman, E J; Goldstein, S

    1991-08-01

    Genes that play a role in the senescent arrest of cellular replication are likely to be overexpressed in human diploid fibroblasts (HDF) derived from subjects with Werner syndrome (WS) because these cells have a severely curtailed replicative life span. To identify some of these genes, a cDNA library was constructed from WS HDF after they had been serum depleted and repleted (5 days in medium containing 1% serum followed by 24 h in medium containing 20% serum). Differential screening of 7,500 colonies revealed 102 clones that hybridized preferentially with [32P]cDNA derived from RNA of WS cells compared with [32P]cDNA derived from normal HDF. Cross-hybridization and partial DNA sequence determination identified 18 independent gene sequences, 9 of them known and 9 unknown. The known genes included alpha 1(I) procollagen, alpha 2(I) procollagen, fibronectin, ferritin heavy chain, insulinlike growth factor-binding protein-3 (IGFBP-3), osteonectin, human tissue plasminogen activator inhibitor type I, thrombospondin, and alpha B-crystallin. The nine unknown clones included two novel gene sequences and seven additional sequences that contained both novel segments and the Alu class of repetitive short interspersed nuclear elements; five of these seven Alu+ clones also contained the long interspersed nuclear element I (KpnI) family of repetitive elements. Northern (RNA) analysis, using the 18 sequences as probes, showed higher levels of these mRNAs in WS HDF than in normal HDF. Five selected mRNAs studied in greater detail [alpha 1(I) procollagen, fibronectin, insulinlike growth factor-binding protein-3, WS3-10, and WS9-14] showed higher mRNA levels in both WS and late-passage normal HDF than in early-passage normal HDF at various intervals following serum depletion/repletion and after subculture and growth from sparse to high-density confluent arrest. These results indicate that senescence of both WS and normal HDF is accompanied by overexpression of similar sets of

  1. Fallacy of the Unique Genome: Sequence Diversity within Single Helicobacter pylori Strains.

    PubMed

    Draper, Jenny L; Hansen, Lori M; Bernick, David L; Abedrabbo, Samar; Underwood, Jason G; Kong, Nguyet; Huang, Bihua C; Weis, Allison M; Weimer, Bart C; van Vliet, Arnoud H M; Pourmand, Nader; Solnick, Jay V; Karplus, Kevin; Ottemann, Karen M

    2017-02-21

    Many bacterial genomes are highly variable but nonetheless are typically published as a single assembled genome. Experiments tracking bacterial genome evolution have not looked at the variation present at a given point in time. Here, we analyzed the mouse-passaged Helicobacter pylori strain SS1 and its parent PMSS1 to assess intra- and intergenomic variability. Using high sequence coverage depth and experimental validation, we detected extensive genome plasticity within these H. pylori isolates, including movement of the transposable element IS607, large and small inversions, multiple single nucleotide polymorphisms, and variation in cagA copy number. The cagA gene was found as 1 to 4 tandem copies located off the cag island in both SS1 and PMSS1; this copy number variation correlated with protein expression. To gain insight into the changes that occurred during mouse adaptation, we also compared SS1 and PMSS1 and observed 46 differences that were distinct from the within-genome variation. The most substantial was an insertion in cagY, which encodes a protein required for a type IV secretion system function. We detected modifications in genes coding for two proteins known to affect mouse colonization, the HpaA neuraminyllactose-binding protein and the FutB α-1,3 lipopolysaccharide (LPS) fucosyltransferase, as well as genes predicted to modulate diverse properties. In sum, our work suggests that data from consensus genome assemblies from single colonies may be misleading by failing to represent the variability present. Furthermore, we show that high-depth genomic sequencing data of a population can be analyzed to gain insight into the normal variation within bacterial strains. IMPORTANCE: Although it is well known that many bacterial genomes are highly variable, it is nonetheless traditional to refer to, analyze, and publish "the genome" of a bacterial strain. Variability is usually reduced ("only sequence from a single colony"), ignored ("just publish the consensus

  2. Diversity of Secondary Structure in Catalytic Peptides with β-Turn-Biased Sequences

    PubMed Central

    2016-01-01

    X-ray crystallography has been applied to the structural analysis of a series of tetrapeptides that were previously assessed for catalytic activity in an atroposelective bromination reaction. Common to the series is a central Pro-Xaa sequence, where Pro is either l- or d-proline, which was chosen to favor nucleation of canonical β-turn secondary structures. Crystallographic analysis of 35 different peptide sequences revealed a range of conformational states. The observed differences appear not only in cases where the Pro-Xaa loop-region is altered, but also when seemingly subtle alterations to the flanking residues are introduced. In many instances, distinct conformers of the same sequence were observed, either as symmetry-independent molecules within the same unit cell or as polymorphs. Computational studies using DFT provided additional insight into the analysis of solid-state structural features. Select X-ray crystal structures were compared to the corresponding solution structures derived from measured proton chemical shifts, 3J-values, and 1H–1H-NOESY contacts. These findings imply that the conformational space available to simple peptide-based catalysts is more diverse than precedent might suggest. The direct observation of multiple ground state conformations for peptides of this family, as well as the dynamic processes associated with conformational equilibria, underscore not only the challenge of designing peptide-based catalysts, but also the difficulty in predicting their accessible transition states. These findings implicate the advantages of low-barrier interconversions between conformations of peptide-based catalysts for multistep, enantioselective reactions. PMID:28029251

  3. Assessing Genetic Diversity among Brettanomyces Yeasts by DNA Fingerprinting and Whole-Genome Sequencing

    PubMed Central

    Crauwels, Sam; Zhu, Bo; Steensels, Jan; Busschaert, Pieter; De Samblanx, Gorik; Marchal, Kathleen; Willems, Kris A.

    2014-01-01

    Brettanomyces yeasts, with the species Brettanomyces (Dekkera) bruxellensis being the most important one, are generally reported to be spoilage yeasts in the beer and wine industry due to the production of phenolic off flavors. However, B. bruxellensis is also known to be a beneficial contributor in certain fermentation processes, such as the production of certain specialty beers. Nevertheless, despite its economic importance, Brettanomyces yeasts remain poorly understood at the genetic and genomic levels. In this study, the genetic relationship between more than 50 Brettanomyces strains from all presently known species and from several sources was studied using a combination of DNA fingerprinting techniques. This revealed an intriguing correlation between the B. bruxellensis fingerprints and the respective isolation source. To further explore this relationship, we sequenced a (beneficial) beer isolate of B. bruxellensis (VIB X9085; ST05.12/22) and compared its genome sequence with the genome sequences of two wine spoilage strains (AWRI 1499 and CBS 2499). ST05.12/22 was found to be substantially different from both wine strains, especially at the level of single nucleotide polymorphisms (SNPs). In addition, there were major differences in the genome structures between the strains investigated, including the presence of large duplications and deletions. Gene content analysis revealed the presence of 20 genes which were present in both wine strains but absent in the beer strain, including many genes involved in carbon and nitrogen metabolism, and vice versa, no genes that were missing in both AWRI 1499 and CBS 2499 were found in ST05.12/22. Together, this study provides tools to discriminate Brettanomyces strains and provides a first glimpse at the genetic diversity and genome plasticity of B. bruxellensis. PMID:24814796

  4. Assessing genetic diversity among Brettanomyces yeasts by DNA fingerprinting and whole-genome sequencing.

    PubMed

    Crauwels, Sam; Zhu, Bo; Steensels, Jan; Busschaert, Pieter; De Samblanx, Gorik; Marchal, Kathleen; Willems, Kris A; Verstrepen, Kevin J; Lievens, Bart

    2014-07-01

    Brettanomyces yeasts, with the species Brettanomyces (Dekkera) bruxellensis being the most important one, are generally reported to be spoilage yeasts in the beer and wine industry due to the production of phenolic off flavors. However, B. bruxellensis is also known to be a beneficial contributor in certain fermentation processes, such as the production of certain specialty beers. Nevertheless, despite its economic importance, Brettanomyces yeasts remain poorly understood at the genetic and genomic levels. In this study, the genetic relationship between more than 50 Brettanomyces strains from all presently known species and from several sources was studied using a combination of DNA fingerprinting techniques. This revealed an intriguing correlation between the B. bruxellensis fingerprints and the respective isolation source. To further explore this relationship, we sequenced a (beneficial) beer isolate of B. bruxellensis (VIB X9085; ST05.12/22) and compared its genome sequence with the genome sequences of two wine spoilage strains (AWRI 1499 and CBS 2499). ST05.12/22 was found to be substantially different from both wine strains, especially at the level of single nucleotide polymorphisms (SNPs). In addition, there were major differences in the genome structures between the strains investigated, including the presence of large duplications and deletions. Gene content analysis revealed the presence of 20 genes which were present in both wine strains but absent in the beer strain, including many genes involved in carbon and nitrogen metabolism, and vice versa, no genes that were missing in both AWRI 1499 and CBS 2499 were found in ST05.12/22. Together, this study provides tools to discriminate Brettanomyces strains and provides a first glimpse at the genetic diversity and genome plasticity of B. bruxellensis.

  5. Application of culture-independent molecular biology based methods to evaluate acetic acid bacteria diversity during vinegar processing.

    PubMed

    Ilabaca, Carolina; Navarrete, Paola; Mardones, Pamela; Romero, Jaime; Mas, Albert

    2008-08-15

    Acetic acid bacteria (AAB) are considered fastidious microorganisms because they are difficult to isolate and cultivate. Different molecular approaches were taken to detect AAB diversity, independently of their capacity to grow in culture media. Those methods were tested in samples that originated during traditional vinegar production. Bacterial diversity was assessed by analysis of 16S rRNA gene, obtained by PCR amplifications of DNA extracted directly from the acetification container. Bacterial composition was analyzed by RFLP-PCR of 16S rRNA gene, Temporal Temperature Gradient Gel Electrophoresis (TTGE) separation of amplicons containing region V3-V5 of 16S rRNA gene and cloning of those amplicons. TTGE bands and clones were grouped based on their electrophoretic pattern similarity and sequenced to be compared with reference strains. The main microorganism identified in vinegar was Acetobacter pasteurianus, which at the end of the acetification process was considered to be the only microorganism present. The diversity was the highest at 2% acetic acid, where indefinite species of Gluconacetobacter xylinus/europaeus/intermedius were also present.

  6. Diversity of lactic acid bacteria in suan-tsai and fu-tsai, traditional fermented mustard products of Taiwan.

    PubMed

    Chao, Shiou-Huei; Wu, Ruei-Jie; Watanabe, Koichi; Tsai, Ying-Chieh

    2009-11-15

    Fu-tsai and suan-tsai are spontaneously fermented mustard products traditionally prepared by the Hakka tribe of Taiwan. We chose 5 different processing stages of these products for analysis of the microbial community of lactic acid bacteria (LAB) by 16S rRNA gene sequencing. From 500 LAB isolates we identified 119 representative strains belonging to 5 genera and 18 species, including Enterococcus (1 species), Lactobacillus (11 species), Leuconostoc (3 species), Pediococcus (1 species), and Weissella (2 species). The LAB composition of mustard fermented for 3 days, known as the Mu sample, was the most diverse, with 11 different LAB species being isolated. We used sequence analysis of the 16S rRNA gene to identify the LAB strains and analysis of the dnaA, pheS, and rpoA genes to identify 13 LAB strains for which identification by 16S rRNA gene sequences was not possible. These 13 strains were found to belong to 5 validated known species: Lactobacillus farciminis, Leuconostoc mesenteroides, Leuconostoc pseudomesenteroides, Weissella cibaria, and Weissella paramesenteroides, and 5 possibly novel Lactobacillus species. These results revealed that there is a high level of diversity in LAB at the different stages of fermentation in the production of suan-tsai and fu-tsai.

  7. Sequence diversity and enzyme activity of ferric-chelate reductase LeFRO1 in tomato.

    PubMed

    Kong, Danyu; Chen, Chunlin; Wu, Huilan; Li, Ye; Li, Junming; Ling, Hong-Qing

    2013-11-20

    Ferric-chelate reductase, which functions in the reduction of ferric to ferrous iron on the root surface, is a critical protein for iron homeostasis in strategy I plants. LeFRO1 is a major ferric-chelate reductase involved in iron uptake in tomato. To identify the natural variations of LeFRO1 and to assess their effect on ferric-chelate reductase activity, we cloned the coding sequences of LeFRO1 from 16 tomato varieties collected from different regions, and detected three types of LeFRO1 (LeFRO1(MM), LeFRO1(Ailsa) and LeFRO1(Monita)) with five amino acid variations at positions 21, 24, 112, 195 and 582. Enzyme activity assays revealed that the three types of LeFRO1 possessed different ferric-chelate reductase activity (LeFRO1(Ailsa) > LeFRO1(MM) > LeFRO1(Monita)). The amino acid residue Ala at position 112 of LeFRO1 is critical for maintaining the high activity of ferric-chelate reductase, because modification of this amino acid resulted in a significant reduction of enzyme activity. Further, we showed that the combination of the amino acid residue Ile at site 24 with Lys at site 582 played a positive role in the enzyme activity of LeFRO1. In conclusion, the findings help to explain the natural adaptation mechanisms of plants to iron-limiting stress, and may provide new knowledge for selecting and manipulating LeFRO1 to improve iron deficiency tolerance in tomato.

  8. The okra (Abelmoschus esculentus) transcriptome as a source for gene sequence information and molecular markers for diversity analysis.

    PubMed

    Schafleitner, Roland; Kumar, Sanjeet; Lin, Chen-Yu; Hegde, Satish Gajanana; Ebert, Andreas

    2013-03-15

    A combined leaf and pod transcriptome of okra (Abelmoschus esculentus (L.) Moench) has been produced by RNA sequencing and short read assembly. More than 150,000 unigenes were obtained, comprising some 46 million base pairs of sequence information. More than 55% of the unigenes were annotated through sequence comparison with databases. The okra transcriptome sequences were mined for simple sequence repeat (SSR) markers. From 935 non-redundant SSR motifs identified in the unigene set, 199 were chosen for testing in a germplasm set, resulting in 161 polymorphic SSR markers. From this set, 19 markers were selected for a diversity analysis on 65 okra accessions comprising three different species, revealing 58 different genotypes and clustering the accessions according to species and geographic origin. The okra gene sequence information and the marker resource are made available to the research community for functional genomics and breeding research.

  9. Sequence Diversity and Large-Scale Typing of SNPs in the Human Apolipoprotein E Gene

    PubMed Central

    Nickerson, Deborah A.; Taylor, Scott L.; Fullerton, Stephanie M.; Weiss, Kenneth M.; Clark, Andrew G.; Stengård, Jari H.; Salomaa, Veikko; Boerwinkle, Eric; Sing, Charles F.

    2000-01-01

    A common strategy for genotyping large samples begins with the characterization of human single nucleotide polymorphisms (SNPs) by sequencing candidate regions in a small sample for SNP discovery. This is usually followed by typing in a large sample those sites observed to vary in a smaller sample. We present results from a systematic investigation of variation at the human apolipoprotein E locus (APOE), as well as the evaluation of the two-tiered sampling strategy based on these data. We sequenced 5.5 kb spanning the entire APOE genomic region in a core sample of 72 individuals, including 24 each of African-Americans from Jackson, Mississippi; European-Americans from Rochester, Minnesota; and Europeans from North Karelia, Finland. This sequence survey detected 21 SNPs and 1 multiallelic indel, 14 of which had not been previously reported. Alleles varied in relative frequency among the populations, and 10 sites were polymorphic in only a single population sample. Oligonucleotide ligation assays (OLA) were developed for 20 of these sites (omitting the indel and a closely-linked SNP). These were then scored in 2179 individuals sampled from the same three populations (n = 843, 884, and 452, respectively). Relative allele frequencies were generally consistent with estimates from the core sample, although variation was found in some populations in the larger sample at SNPs that were monomorphic in the corresponding smaller core sample. Site variation in the larger samples showed no systematic deviation from Hardy-Weinberg expectation. The large OLA sample clearly showed that variation in many, but not all, of OLA-typed SNPs is significantly correlated with the classical protein-coding variants, implying that there may be important substructure within the classical ɛ2, ɛ3, and ɛ4 alleles. Comparison of the levels and patterns of polymorphism in the core samples with those estimated for the OLA-typed samples shows how nucleotide diversity is underestimated when

  10. Trichomonas vaginalis acidic phospholipase A2: isolation and partial amino acid sequence.

    PubMed

    Escobedo-Guajardo, Brenda L; González-Salazar, Francisco; Palacios-Corona, Rebeca; Torres de la Cruz, Víctor M; Morales-Vallarta, Mario; Mata-Cárdenas, Benito D; Garza-González, Jesús N; Rivera-Silva, Gerardo; Vargas-Villarreal, Javier

    2013-12-01

    Sexually transmitted diseases are a major cause of acute disease worldwide, and trichomoniasis is the most common curable one, generating more than 170 million cases annually worldwide. Trichomonas vaginalis is the causal agent of trichomoniasis and has the ability to destroy in vitro cell monolayers of the vaginal mucosa, where the phospholipases A2 (PLA2) have been reported as potential virulence factors. These enzymes have been partially characterized from the subcellular fraction S30 of pathogenic T. vaginalis strains. The main objective of this study was to purify a phospholipase A2 from T. vaginalis, make a partial characterization, obtain a partial amino acid sequence, and determine its enzymatic participation as a hemolytic factor causing lysis of erythrocytes. Trichomonas S30, RF30 and UFF30 sub-fractions from the GT-15 strain have the capacity to hydrolyze [2-(14)C-PA]-PC at pH 6.0. Proteins from the UFF30 sub-fraction were separated by affinity chromatography into two eluted fractions with detectable PLA2 activity. The EDTA-eluted fraction was analyzed by on-line HPLC-tandem mass spectrometry and two protein peaks were observed at 8.2 and 13 kDa. Peptide sequences were identified from the proteins present in the EDTA-eluted UFF30 fraction; bioinformatic analysis using Protein Link Global Server loaded with the T. vaginalis protein database suggests that the eluted peptides correspond to a putative ubiquitin protein in the 8.2 kDa fraction and a phospholipase preserved in the 13 kDa fraction. The EDTA-eluted fraction hydrolyzed [2-(14)C-PA]-PC and lysed erythrocytes from Sprague-Dawley rats in a time- and dose-dependent manner. The acidic hemolytic activity decreased by 84% with the addition of 100 μM of Rosenthal's inhibitor.

  11. Simple sequence repeat marker diversity in cassava landraces: genetic diversity and differentiation in an asexually propagated crop.

    PubMed

    Fregene, M A; Suarez, M; Mkumbira, J; Kulembeka, H; Ndedya, E; Kulaya, A; Mitchel, S; Gullberg, U; Rosling, H; Dixon, A G O; Dean, R; Kresovich, S

    2003-10-01

    Cassava (Manihot esculenta) is an allogamous, vegetatively propagated, Neotropical crop that is also widely grown in tropical Africa and Southeast Asia. To elucidate genetic diversity and differentiation in the crop's primary and secondary centers of diversity, and the forces shaping them, SSR marker variation was assessed at 67 loci in 283 accessions of cassava landraces from Africa (Tanzania and Nigeria) and the Neotropics (Brazil, Colombia, Peru, Venezuela, Guatemala, Mexico and Argentina). Average gene diversity (i.e., genetic diversity) was high in all countries, with an average heterozygosity of 0.5358 ± 0.1184. Although the highest values were found in Brazilian and Colombian accessions, genetic diversity in Neotropical and African materials is comparable. Despite the low level of differentiation [F(st)(theta) = 0.091 ± 0.005] found among country samples, sufficient genetic distance (1 - proportion of shared alleles) existed between individual genotypes to separate African from Neotropical accessions and to reveal a more pronounced substructure in the African landraces. Forces shaping differences in allele frequency at SSR loci and possibly counterbalancing successive founder effects probably involve spontaneous recombination, as assessed by parent-offspring relationships, and farmer selection for adaptation.
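
    The two quantities named in this abstract, gene diversity (expected heterozygosity) and a distance of 1 minus the proportion of shared alleles, can be sketched as below. This is a minimal sketch for diploid genotypes; the function names, the genotype encoding (one tuple of two allele sizes per locus) and the toy values are assumptions for illustration and are not taken from the study, which does not describe its implementation here.

```python
from collections import Counter


def expected_heterozygosity(alleles):
    """Gene diversity at one locus: 1 - sum(p_i^2) over allele frequencies."""
    n = len(alleles)
    counts = Counter(alleles)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def shared_allele_distance(geno_a, geno_b):
    """1 - proportion of shared alleles between two diploid multilocus genotypes.

    Each genotype is a list with one (allele, allele) tuple per locus,
    in the same locus order for both individuals.
    """
    shared = 0
    for locus_a, locus_b in zip(geno_a, geno_b):
        # Multiset intersection counts alleles in common (0, 1 or 2 per locus).
        shared += sum((Counter(locus_a) & Counter(locus_b)).values())
    return 1.0 - shared / (2 * len(geno_a))


# Hypothetical SSR allele sizes in base pairs (illustration only).
acc1 = [(150, 152), (200, 200)]
acc2 = [(150, 150), (200, 204)]
print(shared_allele_distance(acc1, acc2))
print(expected_heterozygosity([150, 152, 150, 150]))
```

    A matrix of such pairwise distances is the usual input to the clustering or ordination step that separates groups of accessions.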

  12. Fungal diversity in oxygen-depleted regions of the Arabian Sea revealed by targeted environmental sequencing combined with cultivation.

    PubMed

    Jebaraj, Cathrine S; Raghukumar, Chandralata; Behnke, Anke; Stoeck, Thorsten

    2010-03-01

    In order to study fungal diversity in oxygen minimum zones of the Arabian Sea, we analyzed 1440 cloned small subunit rRNA gene (18S rRNA gene) sequences obtained from environmental samples using three different PCR primer sets. Restriction fragment length polymorphism (RFLP) analyses yielded 549 distinct RFLP patterns, 268 of which could be assigned to fungi (Dikarya and zygomycetes) after sequence analyses. The remaining 281 RFLP patterns represented a variety of nonfungal taxa, even when using putatively fungal-specific primers. A substantial number of fungal sequences were closely related to environmental sequences from a range of other anoxic marine habitats, but distantly related to known sequences of described fungi. Community similarity analyses suggested distinctively different structures of fungal communities from normoxic sites, seasonally anoxic sites and permanently anoxic sites, suggesting different adaptation strategies of fungal communities to prevailing oxygen conditions. Additionally, we obtained 26 fungal cultures from the study sites, most of which were closely related (>97% sequence similarity) to well-described Dikarya. This indicates that standard cultivation mainly produces more of what is already known. However, two of these cultures were highly divergent to known sequences and seem to represent novel fungal groups on high taxonomic levels. Interestingly, none of the cultured isolates is identical to any of the environmental sequences obtained. Our study demonstrates the importance of a multiple-primer approach combined with cultivation to obtain deeper insights into the true fungal diversity in environmental samples and to enable adequate intersample comparisons of fungal communities.

  13. Identification of random nucleic acid sequence aberrations using dual capture probes which hybridize to different chromosome regions

    DOEpatents

    Lucas, J.N.; Straume, T.; Bogen, K.T.

    1998-03-24

    A method is provided for detecting nucleic acid sequence aberrations using two immobilization steps. According to the method, a nucleic acid sequence aberration is detected by detecting nucleic acid sequences having both a first nucleic acid sequence type (e.g., from a first chromosome) and a second nucleic acid sequence type (e.g., from a second chromosome), the presence of the first and the second nucleic acid sequence type on the same nucleic acid sequence indicating the presence of a nucleic acid sequence aberration. In the method, immobilization of a first hybridization probe is used to isolate a first set of nucleic acids in the sample which contain the first nucleic acid sequence type. Immobilization of a second hybridization probe is then used to isolate a second set of nucleic acids from within the first set of nucleic acids which contain the second nucleic acid sequence type. The second set of nucleic acids are then detected, their presence indicating the presence of a nucleic acid sequence aberration. 14 figs.

  14. Identification of random nucleic acid sequence aberrations using dual capture probes which hybridize to different chromosome regions

    DOEpatents

    Lucas, Joe N.; Straume, Tore; Bogen, Kenneth T.

    1998-01-01

    A method is provided for detecting nucleic acid sequence aberrations using two immobilization steps. According to the method, a nucleic acid sequence aberration is detected by detecting nucleic acid sequences having both a first nucleic acid sequence type (e.g., from a first chromosome) and a second nucleic acid sequence type (e.g., from a second chromosome), the presence of the first and the second nucleic acid sequence type on the same nucleic acid sequence indicating the presence of a nucleic acid sequence aberration. In the method, immobilization of a first hybridization probe is used to isolate a first set of nucleic acids in the sample which contain the first nucleic acid sequence type. Immobilization of a second hybridization probe is then used to isolate a second set of nucleic acids from within the first set of nucleic acids which contain the second nucleic acid sequence type. The second set of nucleic acids are then detected, their presence indicating the presence of a nucleic acid sequence aberration.

  15. DNA Sequence Analyses Reveal Abundant Diversity, Endemism and Evidence for Asian Origin of the Porcini Mushrooms

    PubMed Central

    Feng, Bang; Xu, Jianping; Wu, Gang; Zeng, Nian-Kai; Li, Yan-Chun; Tolgor, Bau; Kost, Gerhard W.; Yang, Zhu L.

    2012-01-01

    The wild gourmet mushroom Boletus edulis and its close allies are of significant ecological and economic importance. They are found throughout the Northern Hemisphere, but despite their ubiquity there are still many unresolved issues with regard to the taxonomy, systematics and biogeography of this group of mushrooms. Most phylogenetic studies of Boletus so far have characterized samples from North America and Europe and little information is available on samples from other areas, including the ecologically and geographically diverse regions of China. Here we analyzed DNA sequence variation in three gene markers from samples of these mushrooms from across China and compared our findings with those from other representative regions. Our results revealed fifteen novel phylogenetic species (about one-third of the known species) and a newly identified lineage represented by Boletus sp. HKAS71346 from tropical Asia. The phylogenetic analyses support eastern Asia as the center of diversity for the porcini sensu stricto clade. Within this clade, B. edulis is the only known holarctic species. The majority of the other phylogenetic species are geographically restricted in their distributions. Furthermore, molecular dating and geological evidence suggest that this group of mushrooms originated during the Eocene in eastern Asia, followed by dispersal to and subsequent speciation in other parts of Asia, Europe, and the Americas from the middle Miocene through the early Pliocene. In contrast to the ancient dispersal of porcini in the strict sense in the Northern Hemisphere, the occurrence of B. reticulatus and B. edulis sensu lato in the Southern Hemisphere was probably due to recent human-mediated introductions. PMID:22629418

  16. Reduced representation genome sequencing suggests low diversity on the sex chromosomes of tonkean macaque monkeys.

    PubMed

    Evans, Ben J; Zeng, Kai; Esselstyn, Jacob A; Charlesworth, Brian; Melnick, Don J

    2014-09-01

    In species with separate sexes, social systems can differ in the relative variances of male versus female reproductive success. Papionin monkeys (macaques, mangabeys, mandrills, drills, baboons, and geladas) exhibit hallmarks of a high variance in male reproductive success, including a female-biased adult sex ratio and prominent sexual dimorphism. To explore the potential genomic consequences of such sex differences, we used a reduced representation genome sequencing approach to quantifying polymorphism at sites on autosomes and sex chromosomes of the tonkean macaque (Macaca tonkeana), a species endemic to the Indonesian island of Sulawesi. The ratio of nucleotide diversity of the X chromosome to that of the autosomes was less than the value (0.75) expected with a 1:1 sex ratio and no sex differences in the variance in reproductive success. However, the significance of this difference was dependent on which outgroup was used to standardize diversity levels. Using a new model that includes the effects of varying population size, sex differences in mutation rate between the autosomes and X chromosome, and GC-biased gene conversion (gBGC) or selection on GC content, we found that the maximum-likelihood estimate of the ratio of effective population size of the X chromosome to that of the autosomes was 0.68, which did not differ significantly from 0.75. We also found evidence for 1) a higher level of purifying selection on genic than nongenic regions, 2) gBGC or natural selection favoring increased GC content, 3) a dynamic demography characterized by population growth and contraction, 4) a higher mutation rate in males than females, and 5) a very low polymorphism level on the Y chromosome. These findings shed light on the population genomic consequences of sex differences in the variance in reproductive success, which appear to be modest in the tonkean macaque; they also suggest the occurrence of hitchhiking on the Y chromosome.
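
    The expected value of 0.75 mentioned in this abstract follows from the standard effective population size formulas for autosomes and the X chromosome. The derivation below is a sketch under the textbook assumptions of N_m breeding males, N_f breeding females and Poisson-distributed offspring numbers; it is not the new model described by the authors, and the symbols are introduced here for illustration.

```latex
\[
N_{eA} = \frac{4 N_m N_f}{N_m + N_f},
\qquad
N_{eX} = \frac{9 N_m N_f}{4 N_m + 2 N_f}
\]
\[
\frac{N_{eX}}{N_{eA}}
  = \frac{9\,(N_m + N_f)}{8\,(2 N_m + N_f)}
\]
% With a 1:1 sex ratio (N_m = N_f = N):
\[
\frac{N_{eX}}{N_{eA}} = \frac{9 \cdot 2N}{8 \cdot 3N} = \frac{18}{24} = 0.75
\]
```

    Under the same simple model the ratio rises toward 9/8 as N_m / N_f approaches zero, so the breeding sex ratio alone shifts the expectation away from 0.75.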

  17. Analysis of the genetic diversity of beach plums by simple sequence repeat markers.

    PubMed

    Wang, X M; Wu, W L; Zhang, C H; Zhang, Y P; Li, W L; Huang, T

    2015-08-19

    The purpose of this study was to measure the genetic diversity of wild beach plum and cultivated species, and to determine the species relationships using SSR markers. An analysis of the genetic diversity of ten beach plum germplasms was carried out using 11 simple sequence repeat (SSR) primers selected from 35 primers to generate distinct PCR products. From this plant material, 44 allele variations were detected, with 3-5 alleles identified from each primer. The analysis showed that the genetic similarity coefficient varied from 0.721 ± 0.155 to 0.848 ± 0.136 within each of the ten beach plum germplasms and ranged from 0.551 ± 0.084 to 0.695 ± 0.073 between any two germplasms. According to the genetic dissimilarity coefficient matrix, a cluster analysis of SSRs using the unweighted pair group method with arithmetic mean (UPGMA) in the NTSYSpc 2.10 software revealed that the ten germplasms could be divided into two groups at the dissimilarity coefficient of 0.606. Class I included 77.8, 12.5, 30, and 33.3% of MM, MI, NY, and CM, respectively. Class II contained the remaining 9 beach plum germplasms. The markers generated by the 11 SSR primers proved very effective in distinguishing the beach plum germplasm resources. It was clear that the geographical distribution did not correspond with the genetic relationships among the different beach plum strains. This result will be of value to beach plum breeding programs.

  18. Genetic diversity among air yam (Dioscorea bulbifera) varieties based on single sequence repeat markers.

    PubMed

    Silva, D M; Siqueira, M V B M; Carrasco, N F; Mantello, C C; Nascimento, W F; Veasey, E A

    2016-05-23

    Dioscorea is the largest genus in the Dioscoreaceae family, and includes a number of economically important species including the air yam, D. bulbifera L. This study aimed to develop new single sequence repeat primers and characterize the genetic diversity of local varieties that originated in several municipalities of Brazil. We developed an enriched genomic library for D. bulbifera resulting in seven primers, six of which were polymorphic, and added four polymorphic loci developed for other Dioscorea species. This resulted in 10 polymorphic primers to evaluate 42 air yam accessions. Thirty-three alleles (bands) were found, with an average of 3.3 alleles per locus. The discrimination power ranged from 0.113 to 0.834, with an average of 0.595. Both principal coordinate and cluster analyses (using the Jaccard Index) failed to clearly separate the accessions according to their origins. However, the 13 accessions from Conceição dos Ouros, Minas Gerais State were clustered above zero on the principal coordinate 2 axis, and were also clustered into one subgroup in the cluster analysis. Accessions from Ubatuba, São Paulo State were clustered below zero on the same principal coordinate 2 axis, except for one accession, although they were scattered in several subgroups in the cluster analysis. Therefore, we found little spatial structure in the accessions, although those from Conceição dos Ouros and Ubatuba exhibited some spatial structure, and that there is a considerable level of genetic diversity in D. bulbifera maintained by traditional farmers in Brazil.
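
    The Jaccard Index mentioned above compares two accessions by the bands (alleles) they share. A minimal sketch follows, assuming each accession is scored as a presence/absence vector over the same set of bands; the function name and the two example profiles are hypothetical and are not data from the study.

```python
def jaccard_similarity(profile_a, profile_b):
    """Jaccard index for two equal-length presence/absence (1/0) band profiles."""
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must score the same set of bands")
    shared = sum(1 for a, b in zip(profile_a, profile_b) if a and b)
    present_in_either = sum(1 for a, b in zip(profile_a, profile_b) if a or b)
    # Convention assumed here: two profiles with no bands at all score 0.0.
    return shared / present_in_either if present_in_either else 0.0


# Hypothetical band scores for two accessions (1 = band present, 0 = absent).
accession_a = [1, 1, 0, 1, 0, 0]
accession_b = [1, 0, 0, 1, 1, 0]

similarity = jaccard_similarity(accession_a, accession_b)
print(f"Jaccard similarity: {similarity:.2f}")
```

    Shared absences are ignored by this coefficient, which is one reason it is commonly chosen for dominant band data; one minus the similarity gives the distance typically fed into cluster or principal coordinate analyses.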

  19. The amino acid sequence of protein CM-3 from Dendroaspis polylepis polylepis (black mamba) venom.

    PubMed

    Joubert, F J

    1985-01-01

    Protein CM-3 from Dendroaspis polylepis polylepis venom was purified by gel filtration and ion exchange chromatography. It comprises 65 amino acids including eight half-cystines. The complete amino acid sequence of protein CM-3 has been elucidated. The sequence (residues 1-50) resembles that of the N-terminal sequence of the subunits of a synergistic type protein and residues 51-65 that of the C-terminal sequence of an angusticeps type protein. Mixtures of protein CM-3 and angusticeps type proteins showed no apparent synergistic effect, in that their toxicity in combination was no greater than the sum of their individual toxicities.

  20. The amino acid sequences of the Fd fragments of two human γ heavy chains

    PubMed Central

    Press, E. M.; Hogg, N. M.

    1970-01-01

    The amino acid sequences of the Fd fragments of two human pathological immunoglobulins of the immunoglobulin G1 class are reported. Comparison of the two sequences shows that the heavy-chain variable regions are similar in length to those of the light chains. The existence of heavy chain variable region subgroups is also deduced, from a comparison of these two sequences with those of another γ 1 chain, Eu, a μ chain, Ou, and the partial sequence of a fourth γ 1 chain, Ste. Carbohydrate has been found to be linked to an aspartic acid residue in the variable region of one of the γ 1 chains, Cor. PMID:5449120

  1. Effect of sludge age on the bacterial diversity of bench scale sequencing batch reactors.

    PubMed

    Akarsubasi, Alper Tunga; Eyice, Ozge; Miskin, Ian; Head, Ian M; Curtis, Thomas P

    2009-04-15

    Sludge age or mean cell residence time (MCRT) plays a crucial role in design and operation of wastewater treatment plants. The change in performance, for example micropollutant removal, associated with changes in MCRT is often attributed to changes in microbial diversity. We operated four identical laboratory-scale sequencing batch reactors (two test and two control) in parallel for 212 days. Sludge age was decreased gradually (from 10.4 to 2.6 days) in experimental reactors whereas it was kept constant (10.4 days) in control reactors. The reactor performance and biomass changed in a manner consistent with our understanding of the effect of sludge age on a reactor's performance: the effluent quality and biomass declined with decreasing MCRT. The composition of the bacterial and ammonia-oxidizing bacterial communities in four reactors was analyzed using denaturing gradient gel electrophoresis (DGGE), and similarities in band patterns were measured using the Dice coefficient. The overall similarity between the communities in reactors run at different sludge ages was indistinguishable from the similarity in communities in reactors run at identical sludge ages. This was true for both the general bacterial communities and putative AOB communities. The number of detectable bands in DGGE profiles was also unaffected by sludge age (p approximately 0.5 in both cases). Initially, the detectable diversity of activated sludge communities in all four reactors clustered with time, regardless of their designation or sludge age; however, these clusters were only weakly supported by bootstrap analysis. However, after 135 days, a sludge age specific clustering was observed in the bacterial community but not the putative ammonia-oxidizing bacterial community. The mean self-similarity of each reactor decreased, variance increased, and the number of detectable bands in DGGE profiles decreased over time in all reactors. The changes observed with time are consistent with ecological drift.
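
    The Dice coefficient used above to compare DGGE band patterns can be sketched as follows. This is an illustration under stated assumptions: each lane is represented as a set of band positions, and the function name and example lanes are hypothetical rather than taken from the study.

```python
def dice_coefficient(bands_a, bands_b):
    """Dice similarity between two lanes, each given as a set of band positions."""
    if not bands_a and not bands_b:
        # Convention assumed here: two empty lanes are treated as having no similarity.
        return 0.0
    shared = len(bands_a & bands_b)
    return 2 * shared / (len(bands_a) + len(bands_b))


# Hypothetical band positions detected in two reactor samples.
lane_test_reactor = {1, 2, 3, 5, 8}
lane_control_reactor = {2, 3, 5, 13}

dice = dice_coefficient(lane_test_reactor, lane_control_reactor)
print(f"Dice similarity: {dice:.3f}")
```

    Compared with the Jaccard index, Dice gives double weight to shared bands, so the same pair of lanes scores somewhat higher.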

  2. Sequence diversity of human caliciviruses recovered from children with diarrhea in Mendoza, Argentina, 1995-1998.

    PubMed

    Martínez, Norma; Espul, Carlos; Cuello, Hector; Zhong, Weiming; Jiang, Xi; Matson, David O; Berke, Tamas

    2002-06-01

    Human caliciviruses were detected by EIA and/or RT-PCR in stool specimens from children with diarrhea treated at out- or in-patient facilities between 1995 and 1998 in Mendoza, Argentina. Mexico virus-like strains detected by primers NV36/51 were transiently prevalent in 1995/1996. Significantly more human caliciviruses were detected when primers were designed from contemporaneously circulating strains. Nucleotide sequences of a highly conserved region in the RNA polymerase gene of 10 selected human caliciviruses were determined. Eight strains were Norwalk-like viruses and two strains were Sapporo-like viruses. Seven of the eight Norwalk-like viruses also were positive by the recombinant Mexico virus antigen EIA. The seven Mexico virus EIA-positive strains revealed two patterns in the RNA polymerase sequences: two strains were closest to Mexico virus and the other five strains were closest to Lordsdale virus. One of the five "Lordsdale" viruses was found to be a naturally occurring recombinant between the Mexico virus and Lordsdale human calicivirus genetic clusters [Jiang et al., (1999b) Archives of Virology 144:2377-2387]. The Mexico virus EIA-negative strain had 73-77% nucleotide identity with the closest related Norwalk-like viruses, indicating it might belong to a new genetic cluster of the Norwalk-like virus genus. The two Sapporo-like viruses were distinct genetically; one belonged to the Houston/90 or Parkville cluster and the other to a new cluster. Some strains appeared to have short periods of prevalence and locally adapted primer pairs significantly increased detection rates. The finding of high diversity of circulating strains, including recombinant strains and strains with previously unrecognized genetic identities, highlights a need for studies of human caliciviruses in these children and other populations.

  3. Fallacy of the Unique Genome: Sequence Diversity within Single Helicobacter pylori Strains

    PubMed Central

    Hansen, Lori M.; Bernick, David L.; Abedrabbo, Samar; Underwood, Jason G.; Kong, Nguyet; Huang, Bihua C.; Weis, Allison M.; Pourmand, Nader

    2017-01-01

    Many bacterial genomes are highly variable but nonetheless are typically published as a single assembled genome. Experiments tracking bacterial genome evolution have not looked at the variation present at a given point in time. Here, we analyzed the mouse-passaged Helicobacter pylori strain SS1 and its parent PMSS1 to assess intra- and intergenomic variability. Using high sequence coverage depth and experimental validation, we detected extensive genome plasticity within these H. pylori isolates, including movement of the transposable element IS607, large and small inversions, multiple single nucleotide polymorphisms, and variation in cagA copy number. The cagA gene was found as 1 to 4 tandem copies located off the cag island in both SS1 and PMSS1; this copy number variation correlated with protein expression. To gain insight into the changes that occurred during mouse adaptation, we also compared SS1 and PMSS1 and observed 46 differences that were distinct from the within-genome variation. The most substantial was an insertion in cagY, which encodes a protein required for a type IV secretion system function. We detected modifications in genes coding for two proteins known to affect mouse colonization, the HpaA neuraminyllactose-binding protein and the FutB α-1,3 lipopolysaccharide (LPS) fucosyltransferase, as well as genes predicted to modulate diverse properties. In sum, our work suggests that data from consensus genome assemblies from single colonies may be misleading by failing to represent the variability present. Furthermore, we show that high-depth genomic sequencing data of a population can be analyzed to gain insight into the normal variation within bacterial strains. PMID:28223462

  4. PCR Primers to Study the Diversity of Expressed Fungal Genes Encoding Lignocellulolytic Enzymes in Soils Using High-Throughput Sequencing

    PubMed Central

    Barbi, Florian; Bragalini, Claudia; Vallon, Laurent; Prudent, Elsa; Dubost, Audrey; Fraissinet-Tachet, Laurence; Marmeisse, Roland; Luis, Patricia

    2014-01-01

    Plant biomass degradation in soil is one of the key steps of carbon cycling in terrestrial ecosystems. Fungal saprotrophic communities play an essential role in this process by producing hydrolytic enzymes active on the main components of plant organic matter. Open questions in this field regard the diversity of the species involved, the major biochemical pathways implicated and how these are affected by external factors such as litter quality or climate changes. This can be tackled by environmental genomic approaches involving the systematic sequencing of key enzyme-coding gene families using soil-extracted RNA as material. Such an approach necessitates the design and evaluation of gene family-specific PCR primers producing sequence fragments compatible with high-throughput sequencing approaches. In the present study, we developed and evaluated PCR primers for the specific amplification of fungal CAZy Glycoside Hydrolase gene families GH5 (subfamily 5) and GH11 encoding endo-β-1,4-glucanases and endo-β-1,4-xylanases respectively as well as Basidiomycota class II peroxidases, corresponding to the CAZy Auxiliary Activity family 2 (AA2), active on lignin. These primers were experimentally validated using DNA extracted from a wide range of Ascomycota and Basidiomycota species including 27 with sequenced genomes. Along with the published primers for Glycoside Hydrolase GH7 encoding enzymes active on cellulose, the newly designed primers were shown to be compatible with the Illumina MiSeq sequencing technology. Sequences obtained from RNA extracted from beech or spruce forest soils showed a high diversity and were uniformly distributed in gene trees featuring the global diversity of these gene families. This high-throughput sequencing approach using several degenerate primers constitutes a robust method, which allows the simultaneous characterization of the diversity of different fungal transcripts involved in plant organic matter degradation and may lead to the

  5. The Chinese hamster Alu-equivalent sequence: a conserved highly repetitious, interspersed deoxyribonucleic acid sequence in mammals has a structure suggestive of a transposable element.

    PubMed Central

    Haynes, S R; Toomey, T P; Leinwand, L; Jelinek, W R

    1981-01-01

    A consensus sequence has been determined for a major interspersed deoxyribonucleic acid repeat in the genome of Chinese hamster ovary cells (CHO cells). This sequence is extensively homologous to (i) the human Alu sequence (P. L. Deininger et al., J. Mol. Biol., in press), (ii) the mouse B1 interspersed repetitious sequence (Krayev et al., Nucleic Acids Res. 8:1201-1215, 1980), (iii) an interspersed repetitious sequence from African green monkey deoxyribonucleic acid (Dhruva et al., Proc. Natl. Acad. Sci. U.S.A. 77:4514-4518, 1980) and (iv) the CHO and mouse 4.5S ribonucleic acid (this report; F. Harada and N. Kato, Nucleic Acids Res. 8:1273-1285, 1980). Because the CHO consensus sequence shows significant homology to the human Alu sequence it is termed the CHO Alu-equivalent sequence. A conserved structure surrounding CHO Alu-equivalent family members can be recognized. It is similar to that surrounding the human Alu and the mouse B1 sequences, and is represented as follows: direct repeat-CHO-Alu-A-rich sequence-direct repeat. A composite interspersed repetitious sequence has been identified. Its structure is represented as follows: direct repeat-residue 47 to 107 of CHO-Alu-non-Alu repetitious sequence-A-rich sequence-direct repeat. Because the Alu flanking sequences resemble those that flank known transposable elements, we think it likely that the Alu sequence dispersed throughout the mammalian genome by transposition. PMID:9279371

  6. Evaluation of genetic diversity and pedigree within crapemyrtle (Lagerstroemia spp.) cultivars using simple sequence repeat (SSR) markers

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Genetic diversity was estimated for 93 crapemyrtle (Lagerstroemia spp.) cultivars (51 L. indica cultivars, 5 L. fauriei cultivars, and 37 interspecific hybrids) using 78 simple sequence repeat (SSR) markers. SSR loci were highly variable among the cultivars, detecting an average of 6.6 alleles per l...

  7. Palynological composition of a Lower Cretaceous South American tropical sequence: Climatic implications and diversity comparisons with other latitudes.

    USGS Publications Warehouse

    Mejia-Velasquez, Paula J.; Dilcher, David L.; Jaramillo, Carlos A.; Fortini, Lucas B.; Manchester, Steven R.

    2012-01-01

    Premise of the study: Reconstruction of floristic patterns during the early diversification of angiosperms is impeded by the scarce fossil record, especially in tropical latitudes. Here we collected quantitative palynological data from a stratigraphic sequence in tropical South America to provide floristic and climatic insights into such tropical environments during the Early Cretaceous. Methods: We reconstructed the floristic composition of an Aptian-Albian tropical sequence from central Colombia using quantitative palynology (rarefied species richness and abundance) and used it to infer its predominant climatic conditions. Additionally, we compared our results with available quantitative data from three other sequences encompassing 70 floristic assemblages to determine latitudinal diversity patterns. Key results: Abundance of humidity indicators was higher than that of aridity indicators (61% vs. 10%). Additionally, we found an angiosperm latitudinal diversity gradient (LDG) for the Aptian, but not for the Albian, and an inverted LDG of the overall diversity for the Albian. Angiosperm species turnover during the Albian, however, was higher in humid tropics. Conclusions: There were humid climates in northwestern South America during the Aptian-Albian interval contrary to the widespread aridity expected for the tropical belt. The Albian inverted overall LDG is produced by a faster increase in per-sample angiosperm and pteridophyte diversity in temperate latitudes. However, humid tropical sequences had higher rates of floristic turnover suggesting a higher degree of morphological variation than in temperate regions.

  8. Multilocus Sequence Typing and rtxA Toxin Gene Sequencing Analysis of Kingella kingae Isolates Demonstrates Genetic Diversity and International Clones

    PubMed Central

    Basmaci, Romain; Yagupsky, Pablo; Ilharreborde, Brice; Guyot, Kathleen; Porat, Nurith; Chomton, Marilyn; Thiberge, Jean-Michel; Mazda, Keyvan; Bingen, Edouard

    2012-01-01

    Background Kingella kingae, a normal component of the upper respiratory flora, is being increasingly recognized as an important invasive pathogen in young children. Genetic diversity of this species has not been studied. Methods We analyzed 103 strains from different countries and clinical origins by a new multilocus sequence-typing (MLST) schema. Putative virulence gene rtxA, encoding an RTX toxin, was also sequenced, and experimental virulence of representative strains was assessed in a juvenile-rat model. Results Thirty-six sequence-types (ST) and nine ST-complexes (STc) were detected. The main STc 6, 14 and 23 comprised 23, 17 and 20 strains respectively, and were internationally distributed. rtxA sequencing results were mostly congruent with MLST, and showed horizontal transfer events. Of interest, all members of the distantly related ST-6 (n = 22) and ST-5 (n = 4) harboured a 33 bp duplication or triplication in their rtxA sequence, suggesting that this genetic trait arose through selective advantage. The animal model revealed significant differences in virulence among strains of the species. Conclusion MLST analysis reveals international spread of ST-complexes and will help to decipher acquisition and evolution of virulence traits and diversity of pathogenicity among K. kingae strains, for which an experimental animal model is now available. PMID:22693588

  9. Diversity, distribution, and evolution of tomato viruses in China uncovered by small RNA sequencing.

    PubMed

    Xu, Chenxi; Sun, Xuepeng; Taylor, Angela; Jiao, Chen; Xu, Yimin; Cai, Xiaofeng; Wang, Xiaoli; Ge, Chenhui; Pan, Guanghui; Wang, Quanxi; Fei, Zhangjun; Wang, Quanhua

    2017-03-22

    Tomato is a major vegetable crop that has tremendous popularity. However, viral disease is still a major factor limiting tomato production. Here we report the tomato virome identified through sequencing small RNAs of 170 field-grown samples collected in China. A total of 22 viruses were identified including both well-documented and newly detected viruses. The tomato viral community is dominated by a few species, and they exhibit polymorphisms and recombination in the genomes with coldspots and hotspots. Most samples were co-infected by multiple viruses and the majority of identified viruses are positive-sense single-stranded RNA viruses. Evolutionary analysis of one of the most dominant tomato viruses, Tomato yellow leaf curl virus (TYLCV), predicts its origin and the time back to its most recent common ancestor. The broadly sampled data has enabled us to identify several unreported viruses in tomato including a completely new virus, which has a genome of ∼13.4 kb and groups with aphid-transmitted viruses in genus Cytorhabdovirus. Although both DNA and RNA viruses can trigger the biogenesis of virus-derived small interfering RNAs (vsiRNAs), we show that features such as length distribution, paired distance and base selection bias of vsiRNA sequences reflect different plant Dicer-like proteins and Argonautes involved in vsiRNA biogenesis. Collectively, this study offers insights into host-virus interaction in tomato and provides valuable information to facilitate the management of viral diseases. IMPORTANCE Tomato is an important source of micronutrients in the human diet and is extensively consumed in the world. Viruses are among the major constraints to tomato production. Categorizing virus species that are capable of infecting tomato and understanding their diversity and evolution are challenging due to difficulties in detecting such fast evolving biological entities. Here we report the landscape of tomato virome in China, the leading country of tomato production. We

  10. Diversity and Variation of Bacterial Community Revealed by MiSeq Sequencing in Chinese Dark Teas

    PubMed Central

    Fu, Jianyu; Lv, Haipeng; Chen, Feng

    2016-01-01

    Chinese dark teas (CDTs) are now among the popular tea beverages worldwide due to their unique health benefits. Because the production of CDTs involves fermentation that is characterized by the effect of microbes, microorganisms are believed to play critical roles in the determination of the chemical characteristics of CDTs. Some dominant fungi have been identified from CDTs. In contrast, little, if anything, is known about the composition of bacterial community in CDTs. This study was set to investigate the diversity and variation of bacterial community in four major types of CDTs from China. First, the composition of the bacterial community of CDTs was determined using MiSeq sequencing. From the four typical CDTs, a total of 238 genera that belong to 128 families of bacteria were detected, including most of the families of beneficial bacteria known to be associated with fermented food. While different types of CDTs had generally distinct bacterial structures, the two types of brick teas produced from adjacent regions displayed strong similarity in bacterial composition, suggesting that the producing environment and processing condition perhaps together influence bacterial succession in CDTs. The global characterization of bacterial communities in CDTs is an essential first step for us to understand their function in fermentation and their potential impact on human health. Such knowledge will be important guidance for improving the production of CDTs with higher quality and elevated health benefits. PMID:27690376

  11. Range-azimuth decouple beamforming for frequency diverse array with Costas-sequence modulated frequency offsets

    NASA Astrophysics Data System (ADS)

    Wang, Zhe; Wang, Wen-Qin; Shao, Huaizong

    2016-12-01

    Different from the phased-array using the same carrier frequency for each transmit element, the frequency diverse array (FDA) uses a small frequency offset across the array elements to produce range-angle-dependent transmit beampattern. FDA radar provides new application capabilities and potentials due to its range-dependent transmit array beampattern, but the FDA using linearly increasing frequency offsets will produce a range and angle coupled transmit beampattern. In order to decouple the range-azimuth beampattern for FDA radar, this paper proposes a uniform linear array (ULA) FDA using Costas-sequence modulated frequency offsets to produce random-like energy distribution in the transmit beampattern and thumbtack transmit-receive beampattern. In doing so, the range and angle of targets can be unambiguously estimated through matched filtering and subspace decomposition algorithms in the receiver signal processor. Moreover, random-like energy distributed beampattern can also be utilized for low probability of intercept (LPI) radar applications. Numerical results show that the proposed scheme outperforms the standard FDA in focusing the transmit energy, especially in the range dimension.
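
    A Costas sequence, as used above for the frequency offsets, is a permutation in which every pair of elements at a given spacing has a distinct difference, which is what gives the thumbtack-like ambiguity behaviour. The sketch below checks that property and builds one sequence with the Welch construction (powers of a primitive root modulo a prime). It is not the paper's beamformer: the function names and the choice p = 7, g = 3 are illustrative assumptions.

```python
def is_costas(seq):
    """Return True if seq is a permutation of 1..n with distinct differences at every shift."""
    n = len(seq)
    if sorted(seq) != list(range(1, n + 1)):
        return False
    for shift in range(1, n):
        diffs = [seq[i + shift] - seq[i] for i in range(n - shift)]
        if len(set(diffs)) != len(diffs):
            return False
    return True


def welch_costas(p, g):
    """Welch construction: g**i mod p for i = 1..p-1, with p prime and g a primitive root mod p."""
    return [pow(g, i, p) for i in range(1, p)]


costas_seq = welch_costas(p=7, g=3)    # hypothetical small example, order 6
linear_seq = list(range(1, 7))         # linearly increasing offsets, as in a standard FDA

print(costas_seq, is_costas(costas_seq))
print(linear_seq, is_costas(linear_seq))
```

    Multiplying each element by a base frequency step would give one possible set of per-element offsets; the linearly increasing sequence fails the check, consistent with the coupled range-angle pattern described for the standard FDA.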

  12. Whole-exome sequencing of pancreatic cancer defines genetic diversity and therapeutic targets

    PubMed Central

    Witkiewicz, Agnieszka K.; McMillan, Elizabeth A.; Balaji, Uthra; Baek, GuemHee; Lin, Wan-Chi; Mansour, John; Mollaee, Mehri; Wagner, Kay-Uwe; Koduru, Prasad; Yopp, Adam; Choti, Michael A.; Yeo, Charles J.; McCue, Peter; White, Michael A.; Knudsen, Erik S.

    2015-01-01

    Pancreatic ductal adenocarcinoma (PDA) has a dismal prognosis and insights into both disease etiology and targeted intervention are needed. A total of 109 micro-dissected PDA cases were subjected to whole-exome sequencing. Microdissection enriches tumour cellularity and enhances mutation calling. Here we show that environmental stress and alterations in DNA repair genes associate with distinct mutation spectra. Copy number alterations target multiple tumour suppressive/oncogenic loci; however, amplification of MYC is uniquely associated with poor outcome and adenosquamous subtype. We identify multiple novel mutated genes in PDA, with select genes harbouring prognostic significance. RBM10 mutations associate with longer survival in spite of histological features of aggressive disease. KRAS mutations are observed in >90% of cases, but codon Q61 alleles are selectively associated with improved survival. Oncogenic BRAF mutations are mutually exclusive with KRAS and define sensitivity to vemurafenib in PDA models. High-frequency alterations in Wnt signalling, chromatin remodelling, Hedgehog signalling, DNA repair and cell cycle processes are observed. Together, these data delineate new genetic diversity of PDA and provide insights into prognostic determinants and therapeutic targets. PMID:25855536

  13. Analysis of the genetic diversity of Lonicera japonica Thumb. using inter-simple sequence repeat markers.

    PubMed

    He, H Y; Zhang, D; Qing, H; Yang, Y

    2017-01-23

    Inter-simple sequence repeats (ISSRs) were used to analyze the genetic diversity of 21 accessions obtained from four provinces in China, Shandong, Henan, Hebei, and Sichuan. A total of 272 scored bands were generated using the eight primers previously screened across 21 accessions, of which 267 were polymorphic (98.16%). Genetic similarity coefficients varied from 0.4816 to 0.9118, with an average of 0.6337. The UPGMA dendrogram grouped 21 accessions into two main clusters. Cluster A comprised four Lonicera macranthoides Hand. Mazz. accessions, of which J10 was found to be from Sichuan, and J17, J18, and J19 were found to be from Shandong. Cluster B comprised 17 Lonicera japonica Thumb. accessions, divided into the wild accession J16 and the other 16 cultivars. The results of the principal component analysis were comparable to the cluster analysis. Therefore, the ISSR markers could be effectively used to distinguish interspecific and intraspecific variations, which may facilitate identification of Lonicera japonica cultivars for planting, medicinal use, and germplasm conservation.

  14. The amino acid sequence of goat beta-lactoglobulin.

    PubMed

    Préaux, G; Braunitzer, G; Schrank, B; Stangl, A

    1979-11-01

    The isolation of beta-lactoglobulin from milk of the goat is described. The purified protein was checked for purity and has been characterized by its gross composition and end groups. The native or the modified protein was then degraded by tryptic and cyanogen bromide cleavage. The cleavage products were isolated and sequenced in the sequenator using a Quadrol and propyne program. These data provide the complete sequence of beta-lactoglobulin of the goat. The results are discussed and compared particularly with bovine beta-lactoglobulin components AB. Some biological aspects are described.

  15. Layered materials with coexisting acidic and basic sites for catalytic one-pot reaction sequences.

    PubMed

    Motokura, Ken; Tada, Mizuki; Iwasawa, Yasuhiro

    2009-06-17

    Acidic montmorillonite-immobilized primary amines (H-mont-NH(2)) were found to be excellent acid-base bifunctional catalysts for one-pot reaction sequences, which are the first materials with coexisting acid and base sites active for acid-base tandem reactions. For example, tandem deacetalization-Knoevenagel condensation proceeded successfully with the H-mont-NH(2), affording the corresponding condensation product in a quantitative yield. The acidity of the H-mont-NH(2) was strongly influenced by the preparation solvent, and the base-catalyzed reactions were enhanced by interlayer acid sites.

  16. Next-generation DNA sequencing reveals that low fungal diversity in house dust is associated with childhood asthma development

    PubMed Central

    Dannemiller, Karen C.; Mendell, Mark J.; Macher, Janet M.; Kumagai, Kazukiyo; Bradman, Asa; Holland, Nina; Harley, Kim; Eskenazi, Brenda; Peccia, Jordan

    2013-01-01

    Dampness and visible mold in homes are associated with asthma development, but causal mechanisms remain unclear. The goal of this research was to explore associations among measured dampness, fungal exposure, and childhood asthma development without the bias of culture-based microbial analysis. In the low-income, Latino CHAMACOS birth cohort, house dust was collected at age 12 months, and asthma status was determined at age 7 years. The current analysis included 13 asthma cases and 28 controls. Next-generation DNA sequencing methods quantified fungal taxa and diversity. Lower fungal diversity (number of fungal operational taxonomic units) was significantly associated with increased risk of asthma development: unadjusted odds ratio (OR) 4.80 (95% confidence interval (CI) 1.04–22.1). Control for potential confounders strengthened this relationship. Decreased diversity within the genus Cryptococcus was significantly associated with increased asthma risk (OR 21.0, 95% CI 2.16–204). No fungal taxon (species, genus, class) was significantly positively associated with asthma development, and one was significantly negatively associated. Elevated moisture was associated with increased fungal diversity, and moisture/mold indicators were associated with four fungal taxa. Next-generation DNA sequencing provided comprehensive estimates of fungal identity and diversity, demonstrating significant associations between low fungal diversity and childhood asthma development in this community. PMID:24883433

  17. Synthesis of gamma,delta-unsaturated glycolic acids via sequenced Brook and Ireland-Claisen rearrangements.

    PubMed

    Schmitt, Daniel C; Johnson, Jeffrey S

    2010-03-05

    Organozinc, -magnesium, and -lithium nucleophiles initiate a Brook/Ireland-Claisen rearrangement sequence of allylic silyl glyoxylates resulting in the formation of gamma,delta-unsaturated alpha-silyloxy acids.

  18. Computer Simulation of the Determination of Amino Acid Sequences in Polypeptides

    ERIC Educational Resources Information Center

    Daubert, Stephen D.; Sontum, Stephen F.

    1977-01-01

    Describes a computer program that generates a random string of amino acids and guides the student in determining the correct sequence of a given protein by using experimental analytic data for that protein. (MLH)
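
    The original program is not reproduced in this record, so the following is only a rough modern sketch of the same idea: generate a random peptide and hand the student one kind of analytic data, here the fragments from a simulated tryptic digest. All names, the seed, and the peptide length are hypothetical; the cleavage rule (after K or R, but not before P) is the usual textbook simplification.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # one-letter codes for the 20 standard residues


def random_peptide(length, seed=None):
    """Generate a random peptide; a fixed seed makes the exercise repeatable."""
    rng = random.Random(seed)
    return "".join(rng.choices(AMINO_ACIDS, k=length))


def tryptic_fragments(peptide):
    """Split after K or R unless the next residue is P (simplified trypsin rule)."""
    fragments, start = [], 0
    for i, residue in enumerate(peptide[:-1]):
        if residue in "KR" and peptide[i + 1] != "P":
            fragments.append(peptide[start:i + 1])
            start = i + 1
    fragments.append(peptide[start:])
    return fragments


unknown = random_peptide(length=20, seed=1977)   # the sequence the student must deduce
clues = tryptic_fragments(unknown)               # "experimental" data shown to the student
print("Fragments to order:", sorted(clues))
```

    A fuller exercise would add a second, overlapping digest so that the fragment order can actually be deduced.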

  19. Intermediary Metabolism in Protists: a Sequence-based View of Facultative Anaerobic Metabolism in Evolutionarily Diverse Eukaryotes

    PubMed Central

    Ginger, Michael L.; Fritz-Laylin, Lillian K.; Fulton, Chandler; Cande, W. Zacheus; Dawson, Scott C.

    2011-01-01

    Protists account for the bulk of eukaryotic diversity. Through studies of gene and especially genome sequences the molecular basis for this diversity can be determined. Evident from genome sequencing are examples of versatile metabolism that go far beyond the canonical pathways described for eukaryotes in textbooks. In the last 2–3 years, genome sequencing and transcript profiling has unveiled several examples of heterotrophic and phototrophic protists that are unexpectedly well-equipped for ATP production using a facultative anaerobic metabolism, including some protists that can (Chlamydomonas reinhardtii) or are predicted (Naegleria gruberi, Acanthamoeba castellanii, Amoebidium parasiticum) to produce H2 in their metabolism. It is possible that some enzymes of anaerobic metabolism were acquired and distributed among eukaryotes by lateral transfer, but it is also likely that the common ancestor of eukaryotes already had far more metabolic versatility than was widely thought a few years ago. The discussion of core energy metabolism in unicellular eukaryotes is the subject of this review. Since genomic sequencing has so far only touched the surface of protist diversity, it is anticipated that sequences of additional protists may reveal an even wider range of metabolic capabilities, while simultaneously enriching our understanding of the early evolution of eukaryotes. PMID:21036663

  20. Intermediary metabolism in protists: a sequence-based view of facultative anaerobic metabolism in evolutionarily diverse eukaryotes.

    PubMed

    Ginger, Michael L; Fritz-Laylin, Lillian K; Fulton, Chandler; Cande, W Zacheus; Dawson, Scott C

    2010-12-01

    Protists account for the bulk of eukaryotic diversity. Through studies of gene and especially genome sequences the molecular basis for this diversity can be determined. Evident from genome sequencing are examples of versatile metabolism that go far beyond the canonical pathways described for eukaryotes in textbooks. In the last 2-3 years, genome sequencing and transcript profiling have unveiled several examples of heterotrophic and phototrophic protists that are unexpectedly well-equipped for ATP production using a facultative anaerobic metabolism, including some protists that can (Chlamydomonas reinhardtii) or are predicted (Naegleria gruberi, Acanthamoeba castellanii, Amoebidium parasiticum) to produce H2 in their metabolism. It is possible that some enzymes of anaerobic metabolism were acquired and distributed among eukaryotes by lateral transfer, but it is also likely that the common ancestor of eukaryotes already had far more metabolic versatility than was widely thought a few years ago. The discussion of core energy metabolism in unicellular eukaryotes is the subject of this review. Since genomic sequencing has so far only touched the surface of protist diversity, it is anticipated that sequences of additional protists may reveal an even wider range of metabolic capabilities, while simultaneously enriching our understanding of the early evolution of eukaryotes.

  1. Multilocus Sequence Analysis for the Assessment of Phylogenetic Diversity and Biogeography in Hyphomonas Bacteria from Diverse Marine Environments

    PubMed Central

    Li, Guizhen; Liu, Yang; Sun, Fengqin; Shao, Zongze

    2014-01-01

    Hyphomonas, a genus of budding, prosthecate bacteria, are primarily found in the marine environment. Seven type strains, and 35 strains from our collections of Hyphomonas, isolated from the Pacific Ocean, Atlantic Ocean, Arctic Ocean, South China Sea and the Baltic Sea, were investigated in this study using multilocus sequence analysis (MLSA). The phylogenetic structure of these bacteria was evaluated using the 16S rRNA gene, and five housekeeping genes (leuA, clpA, pyrH, gatA and rpoD) as well as their concatenated sequences. Our results showed that each housekeeping gene and the concatenated gene sequence all yield a higher taxonomic resolution than the 16S rRNA gene. The 42 strains assorted into 12 groups. Each group represents an independent species, which was confirmed by virtual DNA-DNA hybridization (DDH) estimated from draft genome sequences. Hyphomonas MLSA interspecies and intraspecies boundaries ranged from 93.3% to 96.3%, similarity calculated using a combined DDH and MLSA approach. Furthermore, six novel species (groups I, II, III, IV, V and XII) of the genus Hyphomonas exist, based on sequence similarities of the MLSA and DDH values. Additionally, we propose that the leuA gene (93.0% sequence similarity across our dataset) alone could be used as a fast and practical means for identifying species within Hyphomonas. Finally, Hyphomonas' geographic distribution shows that strains from the same area tend to cluster together as discrete species. This study provides a framework for the discrimination and phylogenetic analysis of the genus Hyphomonas for the first time, and will contribute to a more thorough understanding of the biological and ecological roles of this genus. PMID:25019154
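
    The concatenation step of an MLSA workflow can be sketched as follows. This is a toy illustration, not the authors' pipeline: it assumes each gene has already been aligned so that all strains have equal-length sequences per gene, and the data layout (a dictionary of gene name to strain name to aligned sequence) and the helper names are hypothetical. Only the five gene names come from the abstract.

```python
# Housekeeping genes named in the abstract; the concatenation order used
# here is an assumption for illustration.
GENE_ORDER = ["leuA", "clpA", "pyrH", "gatA", "rpoD"]


def concatenate(gene_alignments, strain):
    """Join one strain's aligned gene sequences in a fixed gene order."""
    return "".join(gene_alignments[gene][strain] for gene in GENE_ORDER)


def similarity(seq_a, seq_b):
    """Percent of identical positions between two equal-length sequences."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(1 for a, b in zip(seq_a, seq_b) if a == b)
    return 100.0 * matches / len(seq_a)


if __name__ == "__main__":
    # Toy alignments: two strains, identical except for one leuA position.
    toy = {gene: {"s1": "ACGT", "s2": "ACGT"} for gene in GENE_ORDER}
    toy["leuA"]["s2"] = "ACGA"
    print(similarity(concatenate(toy, "s1"), concatenate(toy, "s2")))
```

    A pairwise value computed this way could then be compared against a species boundary such as the 93.3% to 96.3% range reported in the abstract, although real analyses would also handle gaps and ambiguous bases.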

  2. Sensitive Next-Generation Sequencing Method Reveals Deep Genetic Diversity of HIV-1 in the Democratic Republic of the Congo

    PubMed Central

    Rodgers, Mary A.; Wilkinson, Eduan; Vallari, Ana; McArthur, Carole; Sthreshley, Larry; Brennan, Catherine A.; Cloherty, Gavin; de Oliveira, Tulio

    2017-01-01

    ABSTRACT As the epidemiological epicenter of the human immunodeficiency virus (HIV) pandemic, the Democratic Republic of the Congo (DRC) is a reservoir of circulating HIV strains exhibiting high levels of diversity and recombination. In this study, we characterized HIV specimens collected in two rural areas of the DRC between 2001 and 2003 to identify rare strains of HIV. The env gp41 region was sequenced and characterized for 172 HIV-positive specimens. The env sequences were predominantly subtype A (43.02%), but 7 other subtypes (33.14%), 20 circulating recombinant forms (CRFs; 11.63%), and 20 unclassified (11.63%) sequences were also found. Of the rare and unclassified subtypes, 18 specimens were selected for next-generation sequencing (NGS) by a modified HIV-switching mechanism at the 5′ end of the RNA template (SMART) method to obtain full-genome sequences. NGS produced 14 new complete genomes, which included pure subtype C (n = 2), D (n = 1), F1 (n = 1), H (n = 3), and J (n = 1) genomes. The two subtype C genomes and one of the subtype H genomes branched basal to their respective subtype branches but had no evidence of recombination. The remaining 6 genomes were complex recombinants of 2 or more subtypes, including subtypes A1, F, G, H, J, and K and unclassified fragments, including one subtype CRF25 isolate, which branched basal to all CRF25 references. Notably, all recombinant subtype H fragments branched basal to the H clade. Spatial-geographical analysis indicated that the diverse sequences identified here did not expand globally. The full-genome and subgenomic sequences identified in our study population significantly increase the documented diversity of the strains involved in the continually evolving HIV-1 pandemic. IMPORTANCE Very little is known about the ancestral HIV-1 strains that founded the global pandemic, and very few complete genome sequences are available from patients in the Congo Basin, where HIV-1 expanded early in the global pandemic

  3. Sensitive Next-Generation Sequencing Method Reveals Deep Genetic Diversity of HIV-1 in the Democratic Republic of the Congo.

    PubMed

    Rodgers, Mary A; Wilkinson, Eduan; Vallari, Ana; McArthur, Carole; Sthreshley, Larry; Brennan, Catherine A; Cloherty, Gavin; de Oliveira, Tulio

    2017-03-15

    As the epidemiological epicenter of the human immunodeficiency virus (HIV) pandemic, the Democratic Republic of the Congo (DRC) is a reservoir of circulating HIV strains exhibiting high levels of diversity and recombination. In this study, we characterized HIV specimens collected in two rural areas of the DRC between 2001 and 2003 to identify rare strains of HIV. The env gp41 region was sequenced and characterized for 172 HIV-positive specimens. The env sequences were predominantly subtype A (43.02%), but 7 other subtypes (33.14%), 20 circulating recombinant forms (CRFs; 11.63%), and 20 unclassified (11.63%) sequences were also found. Of the rare and unclassified subtypes, 18 specimens were selected for next-generation sequencing (NGS) by a modified HIV-switching mechanism at the 5' end of the RNA template (SMART) method to obtain full-genome sequences. NGS produced 14 new complete genomes, which included pure subtype C (n = 2), D (n = 1), F1 (n = 1), H (n = 3), and J (n = 1) genomes. The two subtype C genomes and one of the subtype H genomes branched basal to their respective subtype branches but had no evidence of recombination. The remaining 6 genomes were complex recombinants of 2 or more subtypes, including subtypes A1, F, G, H, J, and K and unclassified fragments, including one subtype CRF25 isolate, which branched basal to all CRF25 references. Notably, all recombinant subtype H fragments branched basal to the H clade. Spatial-geographical analysis indicated that the diverse sequences identified here did not expand globally. The full-genome and subgenomic sequences identified in our study population significantly increase the documented diversity of the strains involved in the continually evolving HIV-1 pandemic. IMPORTANCE Very little is known about the ancestral HIV-1 strains that founded the global pandemic, and very few complete genome sequences are available from patients in the Congo Basin, where HIV-1 expanded early in the global pandemic. By

  4. Genome sequence of the acid-tolerant strain Rhizobium sp. LPU83.

    PubMed

    Wibberg, Daniel; Tejerizo, Gonzalo Torres; Del Papa, María Florencia; Martini, Carla; Pühler, Alfred; Lagares, Antonio; Schlüter, Andreas; Pistorio, Mariano

    2014-04-20

    Rhizobia are important members of the soil microbiome since they enter into nitrogen-fixing symbiosis with different legume host plants. Rhizobium sp. LPU83 is an acid-tolerant Rhizobium strain featuring a broad-host-range. However, it is ineffective in nitrogen fixation. Here, the improved draft genome sequence of this strain is reported. Genome sequence information provides the basis for analysis of its acid tolerance, symbiotic properties and taxonomic classification.

  5. The amino acid sequence of monal pheasant lysozyme and its activity.

    PubMed

    Araki, T; Matsumoto, T; Torikata, T

    1998-10-01

    The amino acid sequence of monal pheasant lysozyme and its activity were analyzed. Carboxymethylated lysozyme was digested with trypsin and the resulting peptides were sequenced. The established amino acid sequence had one amino acid substitution at position 102 (Arg to Gly) compared with Indian peafowl lysozyme and four amino acid substitutions at positions 3 (Phe to Tyr), 15 (His to Leu), 41 (Gln to His), and 121 (Gln to His) compared with chicken lysozyme. Analysis of the time-courses of reaction using N-acetylglucosamine pentamer as a substrate showed a difference in binding free energy change (-0.4 kcal/mol) at subsite A between monal pheasant and Indian peafowl lysozyme. This was assumed to be caused by the amino acid substitution at subsite A with loss of a positive charge at position 102 (Arg102 to Gly).

  6. Single-chain structure of human ceruloplasmin: the complete amino acid sequence of the whole molecule.

    PubMed Central

    Takahashi, N; Ortel, T L; Putnam, F W

    1984-01-01

    We have determined the amino acid sequence of the amino-terminal 67,000-dalton (67-kDa) fragment of human ceruloplasmin and have established overlapping sequences between the 67-kDa and 50-kDa fragments and between the 50-kDa and 19-kDa fragments. The 67-kDa fragment contains 480 amino acid residues and three glucosamine oligosaccharides. These results together with our previous sequence data for the 50-kDa and 19-kDa fragments complete the amino acid sequence of human ceruloplasmin. The polypeptide chain has a total of 1,046 amino acid residues (Mr 120,085) and has attachment sites for four glucosamine oligosaccharides; together these account for the total molecular mass of human ceruloplasmin (132 kDa). The sequence analysis of the peptides overlapping the fragments showed that one additional amino acid, arginine, is present between the 67-kDa and 50-kDa fragments, and another, lysine, is between the 50-kDa and 19-kDa fragments. Only two apparent sites of amino acid interchange have been identified in the polypeptide chain. Both involve a single-point interchange of glycine and lysine that would result in a difference in charge. The results of the complete sequence analysis verified that human ceruloplasmin is composed of a single polypeptide chain and that the subunit-like fragments are produced by proteolytic cleavage during purification (and possibly also in vivo). PMID:6582496

  7. Poly (beta-L-malic acid) production by diverse phylogenetic clades of Aureobasidium pullulans

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Poly (beta-L-malic acid) (PMA) is a natural biopolyester that has pharmaceutical applications and other potential uses. Here we examine PMA production by genetically diverse phylogenetic clades of the fungus A. pullulans. Thirty-six strains of A. pullulans were isolated for this study from various...

  8. Diverse Functions of Retinoic Acid in Brain Vascular Development

    PubMed Central

    Bonney, Stephanie; Harrison-Uy, Susan; Mishra, Swati; MacPherson, Amber M.; Choe, Youngshik; Li, Dan; Jaminet, Shou-Ching; Fruttiger, Marcus; Pleasure, Samuel J.

    2016-01-01

    As neural structures grow in size and increase metabolic demand, the CNS vasculature undergoes extensive growth, remodeling, and maturation. Signals from neural tissue act on endothelial cells to stimulate blood vessel ingression, vessel patterning, and acquisition of mature brain vascular traits, most notably the blood–brain barrier. Using mouse genetic and in vitro approaches, we identified retinoic acid (RA) as an important regulator of brain vascular development via non-cell-autonomous and cell-autonomous regulation of endothelial WNT signaling. Our analysis of globally RA-deficient embryos (Rdh10 mutants) points to an important, non-cell-autonomous function for RA in the development of the vasculature in the neocortex. We demonstrate that Rdh10 mutants have severe defects in cerebrovascular development and that this phenotype correlates with near absence of endothelial WNT signaling, specifically in the cerebrovasculature, and substantially elevated expression of WNT inhibitors in the neocortex. We show that RA can suppress the expression of WNT inhibitors in neocortical progenitors. Analysis of vasculature in non-neocortical brain regions suggested that RA may have a separate, cell-autonomous function in brain endothelial cells to inhibit WNT signaling. Using both gain and loss of RA signaling approaches, we show that RA signaling in brain endothelial cells can inhibit WNT-β-catenin transcriptional activity and that this is required to moderate the expression of WNT target Sox17. From this, a model emerges in which RA acts upstream of the WNT pathway via non-cell-autonomous and cell-autonomous mechanisms to ensure the formation of an adequate and stable brain vascular plexus. SIGNIFICANCE STATEMENT Work presented here provides novel insight into important yet little understood aspects of brain vascular development, implicating for the first time a factor upstream of endothelial WNT signaling. We show that RA is permissive for cerebrovascular growth via

  9. Multiple Genome Sequences of Important Beer-Spoiling Lactic Acid Bacteria

    PubMed Central

    Geissler, Andreas J.; Vogel, Rudi F.

    2016-01-01

    Seven strains of important beer-spoiling lactic acid bacteria were sequenced using single-molecule real-time sequencing. Complete genomes were obtained for strains of Lactobacillus paracollinoides, Lactobacillus lindneri, and Pediococcus claussenii. The analysis of these genomes emphasizes the role of plasmids as the genomic foundation of beer-spoiling ability. PMID:27795248

  10. PASTA: Ultra-Large Multiple Sequence Alignment for Nucleotide and Amino-Acid Sequences.

    PubMed

    Mirarab, Siavash; Nguyen, Nam; Guo, Sheng; Wang, Li-San; Kim, Junhyong; Warnow, Tandy

    2015-05-01

    We introduce PASTA, a new multiple sequence alignment algorithm. PASTA uses a new technique to produce an alignment given a guide tree that enables it to be both highly scalable and very accurate. We present a study on biological and simulated data with up to 200,000 sequences, showing that PASTA produces highly accurate alignments, improving on the accuracy and scalability of the leading alignment methods (including SATé). We also show that trees estimated on PASTA alignments are highly accurate--slightly better than SATé trees, but with substantial improvements relative to other methods. Finally, PASTA is faster than SATé, highly parallelizable, and requires relatively little memory.

  11. Sequence Diversity and Antigenic Variation at the rag Locus of Porphyromonas gingivalis

    PubMed Central

    Hall, Lucinda M. C.; Fawell, Stuart C.; Shi, Xiaoju; Faray-Kele, Marie-Claire; Aduse-Opoku, Joseph; Whiley, Robert A.; Curtis, Michael A.

    2005-01-01

    The rag locus of Porphyromonas gingivalis W50 encodes RagA, a predicted tonB-dependent receptor protein, and RagB, a lipoprotein that constitutes an immunodominant outer membrane antigen. The low G+C content of the locus, an association with mobility elements, and an apparent restricted distribution in the species suggested that the locus had arisen by horizontal gene transfer. In the present study, we have demonstrated that there are four divergent alleles of the rag locus. The original rag allele found in W50 was renamed rag-1, while three novel alleles, rag-2 to rag-4, were found in isolates lacking rag-1. The three novel alleles encoded variants of RagA with 63 to 71% amino acid identity to RagA1 and each other and variants of RagB with 43 to 56% amino acid identity. The RagA/B proteins have homology to numerous Bacteroides proteins, including SusC/D, implicated in polysaccharide uptake. Monoclonal and polyclonal antibodies raised against RagB1 of P. gingivalis W50 did not cross-react with proteins from isolates carrying different alleles. In a laboratory collection of 168 isolates, 26% carried rag-1, 36% carried rag-2, 25% carried rag-3, and 14% carried rag-4 (including the type strain, ATCC 33277). Restriction profiles of the locus in different isolates demonstrated polymorphism within each allele, some of which is accounted for by the presence or absence of insertion sequence elements. By reference to a previously published study on virulence in a mouse model (M. L. Laine and A. J. van Winkelhoff, Oral Microbiol. Immunol. 13:322-325, 1998), isolates that caused serious disease in mice were significantly more likely to carry rag-1 than other rag alleles. PMID:15972517

  12. Diverse Sources of C. difficile Infection Identified on Whole-Genome Sequencing

    PubMed Central

    Eyre, David W.; Cule, Madeleine L.; Wilson, Daniel J.; Griffiths, David; Vaughan, Alison; O’Connor, Lily; Ip, Camilla L.C.; Golubchik, Tanya; Batty, Elizabeth M.; Finney, John M.; Wyllie, David H.; Didelot, Xavier; Piazza, Paolo; Bowden, Rory; Dingle, Kate E.; Harding, Rosalind M.

    2013-01-01

    BACKGROUND It has been thought that Clostridium difficile infection is transmitted predominantly within health care settings. However, endemic spread has hampered identification of precise sources of infection and the assessment of the efficacy of interventions. METHODS From September 2007 through March 2011, we performed whole-genome sequencing on isolates obtained from all symptomatic patients with C. difficile infection identified in health care settings or in the community in Oxfordshire, United Kingdom. We compared single-nucleotide variants (SNVs) between the isolates, using C. difficile evolution rates estimated on the basis of the first and last samples obtained from each of 145 patients, with 0 to 2 SNVs expected between transmitted isolates obtained less than 124 days apart, on the basis of a 95% prediction interval. We then identified plausible epidemiologic links among genetically related cases from data on hospital admissions and community location. RESULTS Of 1250 C. difficile cases that were evaluated, 1223 (98%) were successfully sequenced. In a comparison of 957 samples obtained from April 2008 through March 2011 with those obtained from September 2007 onward, a total of 333 isolates (35%) had no more than 2 SNVs from at least 1 earlier case, and 428 isolates (45%) had more than 10 SNVs from all previous cases. Reductions in incidence over time were similar in the two groups, a finding that suggests an effect of interventions targeting the transition from exposure to disease. Of the 333 patients with no more than 2 SNVs (consistent with transmission), 126 patients (38%) had close hospital contact with another patient, and 120 patients (36%) had no hospital or community contact with another patient. Distinct subtypes of infection continued to be identified throughout the study, which suggests a considerable reservoir of C. difficile. CONCLUSIONS Over a 3-year period, 45% of C. difficile cases in Oxfordshire were genetically distinct from all
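
    The SNV thresholds quoted in the abstract (no more than 2 SNVs from at least one earlier case, versus more than 10 SNVs from all previous cases) can be turned into a small classification sketch. This is a simplified illustration under stated assumptions, not the study's method: genomes are assumed to be pre-aligned strings of equal length, the 124-day sampling window is ignored, and the function names and category labels are hypothetical.

```python
def snv_distance(genome_a, genome_b):
    """Count single-nucleotide differences between two aligned genomes.

    Positions where either genome has a non-ACGT character (gap, N) are
    skipped rather than counted as differences.
    """
    if len(genome_a) != len(genome_b):
        raise ValueError("genomes must be aligned to the same length")
    bases = set("ACGT")
    return sum(
        1
        for a, b in zip(genome_a, genome_b)
        if a != b and a in bases and b in bases
    )


def classify_case(case, earlier_cases, linked_max=2, distinct_min=10):
    """Label a case by its SNV distance to the closest earlier case."""
    if not earlier_cases:
        return "no earlier cases"
    closest = min(snv_distance(case, earlier) for earlier in earlier_cases)
    if closest <= linked_max:
        return "consistent with transmission"
    if closest > distinct_min:
        return "genetically distinct"
    return "intermediate"


if __name__ == "__main__":
    earlier = ["ACGTACGTACGTACGT"]
    print(classify_case("ACGTACGTACGTACGA", earlier))
    print(classify_case("TGCATGCATGCAACGT", earlier))
```

    Taking the minimum distance over earlier cases mirrors the abstract's wording: a case is linked if it is close to at least one earlier case, and distinct only if it is far from all of them.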

  13. SETG: Nucleic Acid Extraction and Sequencing for In Situ Life Detection on Mars

    NASA Astrophysics Data System (ADS)

    Mojarro, A.; Hachey, J.; Tani, J.; Smith, A.; Bhattaru, S. A.; Pontefract, A.; Doebler, R.; Brown, M.; Ruvkun, G.; Zuber, M. T.; Carr, C. E.

    2016-10-01

    We are developing an integrated nucleic acid extraction and sequencing instrument: the Search for Extra-Terrestrial Genomes (SETG) for in situ life detection on Mars. Our goals are to identify related or unrelated nucleic acid-based life on Mars.

  14. Draft Genome Sequence of Cyanobacterium sp. Strain IPPAS B-1200 with a Unique Fatty Acid Composition

    PubMed Central

    Starikov, Alexander Y.; Usserbaeva, Aizhan A.; Sinetova, Maria A.; Sarsekeyeva, Fariza K.; Zayadan, Bolatkhan K.; Ustinova, Vera V.; Kupriyanova, Elena V.; Los, Dmitry A.

    2016-01-01

    Here, we report the draft genome of Cyanobacterium sp. IPPAS strain B-1200, isolated from Lake Balkhash, Kazakhstan, and characterized by the unique fatty acid composition of its membrane lipids, which are enriched with myristic and myristoleic acids. The approximate genome size is 3.4 Mb, and the predicted number of coding sequences is 3,119. PMID:27856596

  15. Sequencing and computational analysis of complete genome sequences of Citrus yellow mosaic badna virus from acid lime and pummelo.

    PubMed

    Borah, Basanta K; Johnson, A M Anthony; Sai Gopal, D V R; Dasgupta, Indranil

    2009-08-01

    Citrus yellow mosaic badna virus (CMBV), a member of the Family Caulimoviridae, Genus Badnavirus, is the causative agent of Citrus mosaic disease in India. Although the virus has been detected in several citrus species, only two full-length genomes, one each from Sweet orange and Rangpur lime, are available in publicly accessible databases. In order to obtain a better understanding of the genetic variability of the virus in other citrus mosaic-affected citrus species, we performed the cloning and sequence analysis of complete genomes of CMBV from two additional citrus species, Acid lime and Pummelo. We show that CMBV genomes from the two hosts share high homology with previously reported CMBV sequences and hence conclude that the new isolates represent variants of the virus present in these species. Based on in silico sequence analysis, we predict the possible function of the protein encoded by one of the five ORFs.

  16. Parvalbumins from coelacanth muscle. III. Amino acid sequence of the major component.

    PubMed

    Jauregui-Adell, J; Pechere, J F

    1978-09-26

    The primary structure of the major parvalbumin (pI = 4.52) from coelacanth muscle (Latimeria chalumnae) has been determined. Sequence analysis of the tryptic peptides, in some cases obtained with beta-trypsin, accounts for the total amino acid content of the protein. Chymotryptic peptides provide appropriate sequence overlaps, to complete the localization of the tryptic peptides. Examination of the amino acid sequence of this protein shows the typical structure of a beta-parvalbumin. Its position in the dendrogram of related calcium-binding proteins corresponds to that usually accepted for crossopterygians.

  17. Whole-Genome Sequencing Reveals Diverse Models of Structural Variations in Esophageal Squamous Cell Carcinoma.

    PubMed

    Cheng, Caixia; Zhou, Yong; Li, Hongyi; Xiong, Teng; Li, Shuaicheng; Bi, Yanghui; Kong, Pengzhou; Wang, Fang; Cui, Heyang; Li, Yaoping; Fang, Xiaodong; Yan, Ting; Li, Yike; Wang, Juan; Yang, Bin; Zhang, Ling; Jia, Zhiwu; Song, Bin; Hu, Xiaoling; Yang, Jie; Qiu, Haile; Zhang, Gehong; Liu, Jing; Xu, Enwei; Shi, Ruyi; Zhang, Yanyan; Liu, Haiyan; He, Chanting; Zhao, Zhenxiang; Qian, Yu; Rong, Ruizhou; Han, Zhiwei; Zhang, Yanlin; Luo, Wen; Wang, Jiaqian; Peng, Shaoliang; Yang, Xukui; Li, Xiangchun; Li, Lin; Fang, Hu; Liu, Xingmin; Ma, Li; Chen, Yunqing; Guo, Shiping; Chen, Xing; Xi, Yanfeng; Li, Guodong; Liang, Jianfang; Yang, Xiaofeng; Guo, Jiansheng; Jia, JunMei; Li, Qingshan; Cheng, Xiaolong; Zhan, Qimin; Cui, Yongping

    2016-02-04

    Comprehensive identification of somatic structural variations (SVs) and understanding their mutational mechanisms in cancer might contribute to understanding biological differences and help to identify new therapeutic targets. Unfortunately, the characterization of complex SVs across the whole genome and of the mutational mechanisms underlying esophageal squamous cell carcinoma (ESCC) remains largely incomplete. To define a comprehensive catalog of somatic SVs, affected target genes, and their underlying mechanisms in ESCC, we re-analyzed whole-genome sequencing (WGS) data from 31 ESCCs using the Meerkat algorithm to predict somatic SVs and Patchwork to determine copy-number changes. We found deletions and translocations with NHEJ and alt-EJ signatures as the dominant SV types, and 16% of deletions were complex deletions. SVs frequently led to disruption of cancer-associated genes (e.g., CDKN2A and NOTCH1) with different mutational mechanisms. Moreover, chromothripsis, kataegis, and breakage-fusion-bridge (BFB) were identified as contributing to locally mis-arranged chromosomes that occurred in 55% of ESCCs. These genomic catastrophes led to amplification of oncogenes through chromothripsis-derived double-minute chromosome formation (e.g., FGFR1 and LETM2) or BFB-affected chromosomes (e.g., CCND1, EGFR, ERBB2, MMPs, and MYC), with approximately 30% of ESCCs harboring BFB-derived CCND1 amplification. Furthermore, analyses of copy-number alterations reveal a high frequency of whole-genome duplication (WGD) and recurrent focal amplification of CDCA7 that might act as a potential oncogene in ESCC. Our findings reveal molecular defects such as chromothripsis and BFB in malignant transformation of ESCCs and demonstrate diverse models of SV-derived target genes in ESCCs. These genome-wide SV profiles and their underlying mechanisms provide preventive, diagnostic, and therapeutic implications for ESCCs.

  18. Whole-Genome Sequencing Reveals Diverse Models of Structural Variations in Esophageal Squamous Cell Carcinoma

    PubMed Central

    Cheng, Caixia; Zhou, Yong; Li, Hongyi; Xiong, Teng; Li, Shuaicheng; Bi, Yanghui; Kong, Pengzhou; Wang, Fang; Cui, Heyang; Li, Yaoping; Fang, Xiaodong; Yan, Ting; Li, Yike; Wang, Juan; Yang, Bin; Zhang, Ling; Jia, Zhiwu; Song, Bin; Hu, Xiaoling; Yang, Jie; Qiu, Haile; Zhang, Gehong; Liu, Jing; Xu, Enwei; Shi, Ruyi; Zhang, Yanyan; Liu, Haiyan; He, Chanting; Zhao, Zhenxiang; Qian, Yu; Rong, Ruizhou; Han, Zhiwei; Zhang, Yanlin; Luo, Wen; Wang, Jiaqian; Peng, Shaoliang; Yang, Xukui; Li, Xiangchun; Li, Lin; Fang, Hu; Liu, Xingmin; Ma, Li; Chen, Yunqing; Guo, Shiping; Chen, Xing; Xi, Yanfeng; Li, Guodong; Liang, Jianfang; Yang, Xiaofeng; Guo, Jiansheng; Jia, JunMei; Li, Qingshan; Cheng, Xiaolong; Zhan, Qimin; Cui, Yongping

    2016-01-01

    Comprehensive identification of somatic structural variations (SVs) and understanding their mutational mechanisms in cancer might contribute to understanding biological differences and help to identify new therapeutic targets. Unfortunately, the characterization of complex SVs across the whole genome and of the mutational mechanisms underlying esophageal squamous cell carcinoma (ESCC) remains largely incomplete. To define a comprehensive catalog of somatic SVs, affected target genes, and their underlying mechanisms in ESCC, we re-analyzed whole-genome sequencing (WGS) data from 31 ESCCs using the Meerkat algorithm to predict somatic SVs and Patchwork to determine copy-number changes. We found deletions and translocations with NHEJ and alt-EJ signatures as the dominant SV types, and 16% of deletions were complex deletions. SVs frequently led to disruption of cancer-associated genes (e.g., CDKN2A and NOTCH1) with different mutational mechanisms. Moreover, chromothripsis, kataegis, and breakage-fusion-bridge (BFB) were identified as contributing to locally mis-arranged chromosomes that occurred in 55% of ESCCs. These genomic catastrophes led to amplification of oncogenes through chromothripsis-derived double-minute chromosome formation (e.g., FGFR1 and LETM2) or BFB-affected chromosomes (e.g., CCND1, EGFR, ERBB2, MMPs, and MYC), with approximately 30% of ESCCs harboring BFB-derived CCND1 amplification. Furthermore, analyses of copy-number alterations reveal a high frequency of whole-genome duplication (WGD) and recurrent focal amplification of CDCA7 that might act as a potential oncogene in ESCC. Our findings reveal molecular defects such as chromothripsis and BFB in malignant transformation of ESCCs and demonstrate diverse models of SV-derived target genes in ESCCs. These genome-wide SV profiles and their underlying mechanisms provide preventive, diagnostic, and therapeutic implications for ESCCs. PMID:26833333

  19. High-Throughput Sequencing of Microbial Community Diversity and Dynamics during Douchi Fermentation

    PubMed Central

    Tu, Zong-cai; Wang, Xiao-lan

    2016-01-01

    Douchi is a type of Chinese traditional fermented food that is an important source of protein and is used in flavouring ingredients. The end product is affected by the microbial community present during fermentation, but exactly how microbes influence the fermentation process remains poorly understood. We used an Illumina MiSeq approach to investigate bacterial and fungal community diversity during both douchi-koji making and fermentation. A total of 181,443 high quality bacterial 16S rRNA sequences and 221,059 high quality fungal internal transcribed spacer reads were used for taxonomic classification, revealing eight bacterial and three fungal phyla. Firmicutes, Actinobacteria and Proteobacteria were the dominant bacterial phyla, while Ascomycota and Zygomycota were the dominant fungal phyla. At the genus level, Staphylococcus and Weissella were the dominant bacteria, while Aspergillus and Lichtheimia were the dominant fungi. Principal coordinate analysis showed structural separation between the composition of bacteria in koji making and fermentation. However, multivariate analysis of variance based on unweighted UniFrac distances did not identify distinct differences (p > 0.05), and redundancy analysis identified two key genera that are largely responsible for the differences in bacterial composition between the two steps. Staphylococcus was enriched in koji making, while Corynebacterium was enriched in fermentation. This is the first investigation to integrate douchi fermentation and koji making and fermentation processes through this technological approach. The results provide insight into the microbiome of the douchi fermentation process, and reveal a structural separation that may be stratified by the environment during the production of this traditional fermented food. PMID:27992473

  20. Fingerprinting the Asterid Species Using Subtracted Diversity Array Reveals Novel Species-Specific Sequences

    PubMed Central

    Mantri, Nitin; Olarte, Alexandra; Li, Chun Guang; Xue, Charlie; Pang, Edwin C. K.

    2012-01-01

    Background Asterids is one of the major plant clades, comprising many commercially important medicinal species. One of the major concerns in the medicinal plant industry is adulteration/contamination resulting from misidentification of herbal plants. This study reports the construction and validation of a microarray capable of fingerprinting medicinally important species from the Asterids clade. Methodology/Principal Findings Pooled genomic DNA of 104 non-asterid angiosperm and non-angiosperm species was subtracted from pooled genomic DNA of 67 asterid species. Subsequently, 283 subtracted DNA fragments were used to construct an Asterid-specific array. Validation of the Asterid-specific array revealed a high (99.5%) subtraction efficiency. Twenty-five Asterid species (mostly medicinal) representing 20 families and 9 orders within the clade were hybridized onto the array to reveal its level of species discrimination. All these species could be successfully differentiated using their hybridization patterns. A number of species-specific probes were identified for commercially important species like tea, coffee, dandelion, yarrow, motherwort, Japanese honeysuckle, valerian, wild celery, and yerba mate. Thirty-seven polymorphic probes were characterized by sequencing. A large number of probes were novel species-specific probes, whilst some were from the chloroplast region, including genes like atpB, rpoB, and ndh that have been used extensively for fingerprinting and phylogenetic analysis of plants. Conclusions/Significance The Subtracted Diversity Array technique is highly efficient in fingerprinting species with little or no genomic information. The Asterid-specific array could fingerprint all 25 species assessed including three species that were not used in constructing the array. This study validates the use of chloroplast genes for bar-coding (fingerprinting) plant species. In addition, this method allowed detection of several new loci that can be explored to solve

  1. Phylogenetic diversity of the Bacillus pumilus group and the marine ecotype revealed by multilocus sequence analysis.

    PubMed

    Liu, Yang; Lai, Qiliang; Dong, Chunming; Sun, Fengqin; Wang, Liping; Li, Guangyu; Shao, Zongze

    2013-01-01

    Bacteria closely related to Bacillus pumilus cannot be distinguished from such other species as B. safensis, B. stratosphericus, B. altitudinis and B. aerophilus simply by 16S rRNA gene sequence. In this report, 76 marine strains were subjected to phylogenetic analysis based on 7 housekeeping genes to understand the phylogeny and biogeography in comparison with other origins. A phylogenetic tree based on the 7 housekeeping genes concatenated in the order of gyrB-rpoB-pycA-pyrE-mutL-aroE-trpB was constructed and compared with trees based on the single genes. All these trees exhibited a similar topology with small variations. Our 79 strains were divided into 6 groups from A to F; Group A was the largest and contained 49 strains close to B. altitudinis. Two additional large groups were represented by B. safensis and B. pumilus, respectively. Among the housekeeping genes, gyrB and pyrE showed comparatively better resolution power and may serve as molecular markers to distinguish these closely related strains. Furthermore, a recombinant phylogenetic tree based on the gyrB gene and containing 73 terrestrial and our isolates was constructed to detect the relationship between marine and other sources. The tree clearly showed that the bacteria of marine origin were clustered together in all the large groups. In contrast, the cluster belonging to B. safensis was mainly composed of bacteria of terrestrial origin. Interestingly, nearly all the marine isolates were at the top of the tree, indicating the possibility of the recent divergence of this bacterial group in marine environments. We conclude that B. altitudinis bacteria are the most widely spread of the B. pumilus group in marine environments. In summary, this report provides the first evidence regarding the systematic evolution of this bacterial group, and knowledge of their phylogenetic diversity will help in the understanding of their ecological role and distribution in marine environments.
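
    The concatenation step described above can be sketched as follows. This is only an illustration, not the authors' pipeline: the function name, the dictionary layout, and the toy sequences are hypothetical, and each locus is assumed to be already aligned across strains before joining.

```python
# Gene order as stated in the abstract.
GENE_ORDER = ["gyrB", "rpoB", "pycA", "pyrE", "mutL", "aroE", "trpB"]


def concatenate_loci(strain_loci, order=GENE_ORDER):
    """Join one strain's per-gene sequences into a single string, in a fixed order.

    strain_loci is a hypothetical mapping of gene name -> aligned sequence.
    """
    missing = [gene for gene in order if gene not in strain_loci]
    if missing:
        raise KeyError(f"missing loci: {missing}")
    return "".join(strain_loci[gene] for gene in order)


# Toy data (not real sequences): every gene is a 3-base placeholder.
toy_strain = {gene: "ACG" for gene in GENE_ORDER}
toy_strain["gyrB"] = "TTT"
concatenated = concatenate_loci(toy_strain)
```

    Keeping the order fixed matters because every strain's concatenated sequence must place the same gene in the same columns before a tree is built.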

  2. Analysis of cloned cDNA and genomic sequences for phytochrome: complete amino acid sequences for two gene products expressed in etiolated Avena.

    PubMed Central

    Hershey, H P; Barker, R F; Idler, K B; Lissemore, J L; Quail, P H

    1985-01-01

    Cloned cDNA and genomic sequences have been analyzed to deduce the amino acid sequence of phytochrome from etiolated Avena. Restriction endonuclease site polymorphism between clones indicates that at least four phytochrome genes are expressed in this tissue. Sequence analysis of two complete and one partial coding region shows approximately 98% homology at both the nucleotide and amino acid levels, with the majority of amino acid changes being conservative. High sequence homology is also found in the 5'-untranslated region but significant divergence occurs in the 3'-untranslated region. The phytochrome polypeptides are 1128 amino acid residues long corresponding to a molecular mass of 125 kdaltons. The known protein sequence at the chromophore attachment site occurs only once in the polypeptide, establishing that phytochrome has a single chromophore per monomer covalently linked to Cys-321. Computer analyses of the amino acid sequences have provided predictions regarding a number of structural features of the phytochrome molecule. PMID:3001642

  3. Metabolic diversity in biohydrogenation of polyunsaturated fatty acids by lactic acid bacteria involving conjugated fatty acid production.

    PubMed

    Kishino, Shigenobu; Ogawa, Jun; Yokozeki, Kenzo; Shimizu, Sakayu

    2009-08-01

    Lactobacillus plantarum AKU 1009a effectively transforms linoleic acid to conjugated linoleic acids of cis-9,trans-11-octadecadienoic acid (18:2) and trans-9,trans-11-18:2. The transformation of various polyunsaturated fatty acids by washed cells of L. plantarum AKU 1009a was investigated. Besides linoleic acid, alpha-linolenic acid [cis-9,cis-12,cis-15-octadecatrienoic acid (18:3)], gamma-linolenic acid (cis-6,cis-9,cis-12-18:3), columbinic acid (trans-5,cis-9,cis-12-18:3), and stearidonic acid [cis-6,cis-9,cis-12,cis-15-octadecatetraenoic acid (18:4)] were found to be transformed. The fatty acids transformed by the strain had the common structure of a C18 fatty acid with the cis-9,cis-12 diene system. Three major fatty acids were produced from alpha-linolenic acid, which were identified as cis-9,trans-11,cis-15-18:3, trans-9,trans-11,cis-15-18:3, and trans-10,cis-15-18:2. Four major fatty acids were produced from gamma-linolenic acid, which were identified as cis-6,cis-9,trans-11-18:3, cis-6,trans-9,trans-11-18:3, cis-6,trans-10-18:2, and trans-10-octadecenoic acid. The strain transformed the cis-9,cis-12 diene system of C18 fatty acids into conjugated diene systems of cis-9,trans-11 and trans-9,trans-11. These conjugated dienes were further saturated into the trans-10 monoene system by the strain. The results provide valuable information for understanding the pathway of biohydrogenation by anaerobic bacteria and for establishing microbial processes for the practical production of conjugated fatty acids, especially those produced from alpha-linolenic acid and gamma-linolenic acid.

  4. Purification, characterization and partial amino acid sequence of glycogen synthase from Saccharomyces cerevisiae.

    PubMed Central

    Carabaza, A; Arino, J; Fox, J W; Villar-Palasi, C; Guinovart, J J

    1990-01-01

    Glycogen synthase from Saccharomyces cerevisiae was purified to homogeneity. The enzyme showed a subunit molecular mass of 80 kDa. The holoenzyme appears to be a tetramer. Antibodies developed against purified yeast glycogen synthase inactivated the enzyme in yeast extracts and allowed the detection of the protein in Western blots. Amino acid analysis showed that the enzyme is very rich in glutamate and/or glutamine residues. The N-terminal sequence (11 amino acid residues) was determined. In addition, selected tryptic-digest peptides were purified by reverse-phase h.p.l.c. and submitted to gas-phase sequencing. Up to eight sequences (79 amino acid residues) could be aligned with the human muscle enzyme sequence. Levels of identity range between 37 and 100%, indicating that, although human and yeast glycogen synthases probably share some conserved regions, significant differences in their primary structure should be expected. PMID:2114092

  5. Amino acid sequence of anionic peroxidase from the windmill palm tree Trachycarpus fortunei.

    PubMed

    Baker, Margaret R; Zhao, Hongwei; Sakharov, Ivan Yu; Li, Qing X

    2014-12-10

    Palm peroxidases are extremely stable and have uncommon substrate specificity. This study was designed to fill in the knowledge gap about the structures of a peroxidase from the windmill palm tree Trachycarpus fortunei. The complete amino acid sequence and partial glycosylation were determined by MALDI-top-down sequencing of native windmill palm tree peroxidase (WPTP), MALDI-TOF/TOF MS/MS of WPTP tryptic peptides, and cDNA sequencing. The propeptide of WPTP contained N- and C-terminal signal sequences which contained 21 and 17 amino acid residues, respectively. Mature WPTP was 306 amino acids in length, and its carbohydrate content ranged from 21% to 29%. Comparison to closely related royal palm tree peroxidase revealed structural features that may explain differences in their substrate specificity. The results can be used to guide engineering of WPTP and its novel applications.

  6. TranslatorX: multiple alignment of nucleotide sequences guided by amino acid translations.

    PubMed

    Abascal, Federico; Zardoya, Rafael; Telford, Maximilian J

    2010-07-01

    We present TranslatorX, a web server designed to align protein-coding nucleotide sequences based on their corresponding amino acid translations. Many comparisons between biological sequences (nucleic acids and proteins) involve the construction of multiple alignments. Alignments represent a statement regarding the homology between individual nucleotides or amino acids within homologous genes. As protein-coding DNA sequences evolve as triplets of nucleotides (codons) and it is known that sequence similarity degrades more rapidly at the DNA than at the amino acid level, alignments are generally more accurate when based on amino acids than on their corresponding nucleotides. TranslatorX novelties include: (i) use of all documented genetic codes and the possibility of assigning different genetic codes for each sequence; (ii) a battery of different multiple alignment programs; (iii) translation of ambiguous codons when possible; (iv) an innovative criterion to clean nucleotide alignments with GBlocks based on protein information; and (v) a rich output, including Jalview-powered graphical visualization of the alignments, codon-based alignments coloured according to the corresponding amino acids, measures of compositional bias and first, second and third codon position specific alignments. The TranslatorX server is freely available at http://translatorx.co.uk.
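
    The general idea of aligning nucleotides through their amino acid translations can be sketched as below. This is a minimal illustration and not TranslatorX's code: it assumes the protein alignment has already been produced by some alignment program, and the function name and toy sequences are hypothetical.

```python
def back_translate_alignment(aligned_protein, nucleotides, gap="-"):
    """Thread an unaligned coding sequence onto its aligned protein translation.

    Each residue consumes one codon; each protein gap becomes a three-base gap.
    """
    n_residues = sum(1 for aa in aligned_protein if aa != gap)
    if len(nucleotides) != 3 * n_residues:
        raise ValueError("nucleotide length must be 3 x the number of residues")
    codons = (nucleotides[i:i + 3] for i in range(0, len(nucleotides), 3))
    return "".join(gap * 3 if aa == gap else next(codons) for aa in aligned_protein)


# Toy example: the first sequence lacks the third residue of the second.
codon_aln_1 = back_translate_alignment("MK-L", "ATGAAACTG")
codon_aln_2 = back_translate_alignment("MKAL", "ATGAAGGCTCTG")
```

    Because gaps are always inserted in multiples of three, the reading frame is preserved in the resulting nucleotide alignment.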

  7. Amino acid sequence of homologous rat atrial peptides: natriuretic activity of native and synthetic forms.

    PubMed Central

    Seidah, N G; Lazure, C; Chrétien, M; Thibault, G; Garcia, R; Cantin, M; Genest, J; Nutt, R F; Brady, S F; Lyle, T A

    1984-01-01

    A substance called atrial natriuretic factor (ANF), localized in secretory granules of atrial cardiocytes, was isolated as four homologous natriuretic peptides from homogenates of rat atria. The complete sequence of the longest form showed that it is composed of 33 amino acids. The three other shorter forms (2-33, 3-33, and 8-33) represent amino-terminally truncated versions of the 33 amino acid parent molecule as shown by analysis of sequence, amino acid composition, or both. The proposed primary structure agrees entirely with the amino acid composition and reveals no significant sequence homology with any known protein or segment of protein. The short form ANF-(8-33) was synthesized by a multi-fragment condensation approach and the synthetic product was shown to exhibit specific activity comparable to that of the natural ANF-(3-33). PMID:6232612

  8. Nucleotide and deduced amino acid sequences of a new subtilisin from an alkaliphilic Bacillus isolate.

    PubMed

    Saeki, Katsuhisa; Magallones, Marietta V; Takimura, Yasushi; Hatada, Yuji; Kobayashi, Tohru; Kawai, Shuji; Ito, Susumu

    2003-10-01

    The gene for a new subtilisin from the alkaliphilic Bacillus sp. KSM-LD1 was cloned and sequenced. The open reading frame of the gene encoded a 97 amino-acid prepro-peptide plus a 307 amino-acid mature enzyme that contained a possible catalytic triad of residues, Asp32, His66, and Ser224. The deduced amino acid sequence of the mature enzyme (LD1) showed approximately 65% identity to those of subtilisins SprC and SprD from alkaliphilic Bacillus sp. LG12. The amino acid sequence identities of LD1 to those of previously reported true subtilisins and high-alkaline proteases were below 60%. LD1 was characteristically stable during incubation with surfactants and chemical oxidants. Interestingly, an oxidizable Met residue is located next to the catalytic Ser224 of the enzyme as in the cases of the oxidation-susceptible subtilisins reported to date.

  9. Diverse C-terminal sequences involved in Flavobacterium johnsoniae protein secretion.

    PubMed

    Kulkarni, Surashree S; Zhu, Yongtao; Brendel, Colton J; McBride, Mark J

    2017-04-10

    Flavobacterium johnsoniae and many related bacteria secrete proteins across the outer membrane using the type IX secretion system (T9SS). Proteins secreted by T9SSs have amino-terminal signal peptides for export across the cytoplasmic membrane by the Sec system and carboxy-terminal domains (CTDs) targeting them for secretion across the outer membrane by the T9SS. Most but not all T9SS CTDs belong to family TIGR04183 (type-A CTDs). We functionally characterized diverse CTDs for secretion by the F. johnsoniae T9SS. Attachment of the CTDs from F. johnsoniae RemA, AmyB, and ChiA to the foreign protein sfGFP that had a signal peptide at the amino terminus resulted in secretion across the outer membrane. In each case approximately 80 to 100 amino acids from the extreme carboxy-termini were needed for efficient secretion. Several type-A CTDs from distantly related members of the phylum Bacteroidetes functioned in F. johnsoniae, supporting secretion of sfGFP by the F. johnsoniae T9SS. F. johnsoniae SprB requires the T9SS for secretion but lacks a type-A CTD. It has a conserved C-terminal domain belonging to family TIGR04131, which we refer to as a type-B CTD. The CTD of SprB was required for its secretion, but attachment of C-terminal regions of SprB of up to 1182 amino acids to sfGFP failed to result in secretion. Additional features outside of the C-terminal region of SprB may be required for its secretion. Importance: Type IX protein secretion systems (T9SSs) are common in but limited to members of the phylum Bacteroidetes. Most proteins that are secreted by T9SSs have conserved carboxy-terminal domains that belong to either protein domain family TIGR04183 (type-A CTDs) or TIGR04131 (type-B CTDs). Here we identify features of T9SS CTDs of F. johnsoniae that are required for protein secretion and demonstrate that type-A CTDs from distantly related members of the phylum function with the F. johnsoniae T9SS to secrete the foreign protein sfGFP.
In contrast, type-B CTDs failed

  10. Shark myelin basic protein: amino acid sequence, secondary structure, and self-association.

    PubMed

    Milne, T J; Atkins, A R; Warren, J A; Auton, W P; Smith, R

    1990-09-01

    Myelin basic protein (MBP) from the Whaler shark (Carcharhinus obscurus) has been purified from acid extracts of a chloroform/methanol pellet from whole brains. The amino acid sequence of the majority of the protein has been determined and compared with the sequences of other MBPs. The shark protein has only 44% homology with the bovine protein, but, in common with other MBPs, it has basic residues distributed throughout the sequence and no extensive segments that are predicted to have an ordered secondary structure in solution. Shark MBP lacks the triproline sequence previously postulated to form a hairpin bend in the molecule. The region containing the putative consensus sequence for encephalitogenicity in the guinea pig contains several substitutions, thus accounting for the lack of activity of the shark protein. Studies of the secondary structure and self-association have shown that shark MBP possesses solution properties similar to those of the bovine protein, despite the extensive differences in primary structure.

  11. Evolution of a zoonotic pathogen: investigating prophage diversity in enterohaemorrhagic Escherichia coli O157 by long-read sequencing

    PubMed Central

    Shaaban, Sharif; Cowley, Lauren A.; McAteer, Sean P.; Jenkins, Claire; Dallman, Timothy J.; Bono, James L.

    2016-01-01

    Enterohaemorrhagic Escherichia coli (EHEC) O157 is a zoonotic pathogen for which colonization of cattle and virulence in humans is associated with multiple horizontally acquired genes, the majority present in active or cryptic prophages. Our understanding of the evolution and phylogeny of EHEC O157 continues to develop primarily based on core genome analyses; however, such short-read sequences have limited value for the analysis of prophage content and its chromosomal location. In this study, we applied Single Molecule Real Time (SMRT) sequencing, using the Pacific Biosciences long-read sequencing platform, to isolates selected from the main sub-clusters of this clonal group. Prophage regions were extracted from these sequences and from published reference strains. Genome position and prophage diversity were analysed along with genetic content. Prophages could be assigned to clusters, with smaller prophages generally exhibiting less diversity and preferential loss of structural genes. Prophages encoding Shiga toxin (Stx) 2a and Stx1a were the most diverse, and more variable compared to prophages encoding Stx2c, further supporting the hypothesis that Stx2c-prophage integration was ancestral to acquisition of other Stx types. The concept that phage type (PT) 21/28 (Stx2a+, Stx2c+) strains evolved from PT32 (Stx2c+) was supported by analysis of strains with excised Stx-encoding prophages. Insertion sequence elements were over-represented in prophage sequences compared to the rest of the genome, showing integration in key genes such as stx and an excisionase, the latter potentially acting to capture the bacteriophage into the genome. Prophage profiling should allow more accurate prediction of the pathogenic potential of isolates. PMID:28348836

  12. Function and evolutionary diversity of fatty acid amino acid conjugates (FACs) in Lepidopteran caterpillars

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Fatty acid amino acid conjugates (FACs) in regurgitant of larval Spodoptera exigua were initially identified as plant volatile elicitors and research has been focused on this apparent ecological disadvantage rather than on possible benefit for the caterpillar itself. Recently, we demonstrated that...

  13. Complete cDNA and derived amino acid sequence of human factor V

    SciTech Connect

    Jenny, R.J.; Pittman, D.D.; Toole, J.J.; Kriz, R.W.; Aldape, R.A.; Hewick, R.M.; Kaufman, R.J.; Mann, K.G.

    1987-07-01

    cDNA clones encoding human factor V have been isolated from an oligo(dT)-primed human fetal liver cDNA library prepared with vector Charon 21A. The cDNA sequence of factor V from three overlapping clones includes a 6672-base-pair (bp) coding region, a 90-bp 5' untranslated region, and a 163-bp 3' untranslated region within which is a poly(A) tail. The deduced amino acid sequence consists of 2224 amino acids inclusive of a 28-amino acid leader peptide. Direct comparison with human factor VIII reveals considerable homology between proteins in amino acid sequence and domain structure: a triplicated A domain and duplicated C domain show approx. 40% identity with the corresponding domains in factor VIII. As in factor VIII, the A domains of factor V share approx. 40% amino acid-sequence homology with the three highly conserved domains in ceruloplasmin. The B domain of factor V contains 35 tandem and approx. 9 additional semiconserved repeats of nine amino acids of the form Asp-Leu-Ser-Gln-Thr-Thr/Asn-Leu-Ser-Pro and 2 additional semiconserved repeats of 17 amino acids. Factor V contains 37 potential N-linked glycosylation sites, 25 of which are in the B domain, and a total of 19 cysteine residues.

  14. An analysis of amino acid sequences surrounding archaeal glycoprotein sequons.

    PubMed

    Abu-Qarn, Mehtap; Eichler, Jerry

    2007-05-01

    Despite having provided the first example of a prokaryal glycoprotein, little is known of the rules governing the N-glycosylation process in Archaea. As in Eukarya and Bacteria, archaeal N-glycosylation takes place at the Asn residues of Asn-X-Ser/Thr sequons. Since not all sequons are utilized, it is clear that other factors, including the context in which a sequon exists, affect glycosylation efficiency. As yet, the contribution to N-glycosylation made by sequon-bordering residues and other related factors in Archaea remains unaddressed. In the following, the surroundings of Asn residues confirmed by experiment as modified were analyzed in an attempt to define sequence rules and requirements for archaeal N-glycosylation.
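
    A scan for Asn-X-Ser/Thr sequons and their bordering residues can be sketched as below. This is an illustrative sketch, not the authors' method: the function name, the flank width, and the toy protein are hypothetical, and the optional exclusion of proline at position X is an assumption borrowed from the commonly cited eukaryotic rule rather than something stated in the abstract.

```python
import re


def find_sequons(protein, flank=5, exclude_proline=False):
    """Return (1-based Asn position, sequon, upstream flank, downstream flank) tuples.

    A zero-width lookahead is used so that overlapping sequons are all reported.
    """
    x = "[^P]" if exclude_proline else "."
    pattern = re.compile(f"(?=N{x}[ST])")
    hits = []
    for match in pattern.finditer(protein):
        i = match.start()
        upstream = protein[max(0, i - flank):i]
        downstream = protein[i + 3:i + 3 + flank]
        hits.append((i + 1, protein[i:i + 3], upstream, downstream))
    return hits


toy_protein = "MKNGTAANPSLLNNTS"  # made-up sequence for illustration
all_hits = find_sequons(toy_protein)
no_pro_hits = find_sequons(toy_protein, exclude_proline=True)
```

    Collecting the flanking windows for experimentally confirmed sites is the kind of input from which position-specific residue preferences could then be tallied.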

  15. Diversity of Δ12 Fatty Acid Desaturases in Santalaceae and Their Role in Production of Seed Oil Acetylenic Fatty Acids

    PubMed Central

    Okada, Shoko; Zhou, Xue-Rong; Damcevski, Katherine; Gibb, Nerida; Wood, Craig; Hamberg, Mats; Haritos, Victoria S.

    2013-01-01

    Plants in the Santalaceae family, including the native cherry Exocarpos cupressiformis and sweet quandong Santalum acuminatum, accumulate ximenynic acid (trans-11-octadecen-9-ynoic acid) in their seed oil and conjugated polyacetylenic fatty acids in root tissue. Twelve full-length genes coding for microsomal Δ12 fatty acid desaturases (FADs) from the two Santalaceae species were identified by degenerate PCR. Phylogenetic analysis of the predicted amino acid sequences placed five Santalaceae FADs with Δ12 FADs, which include Arabidopsis thaliana FAD2. When expressed in yeast, the major activity of these genes was Δ12 desaturation of oleic acid, but unusual activities were also observed: i.e. Δ15 desaturation of linoleic acid as well as trans-Δ12 and trans-Δ11 desaturations of stearolic acid (9-octadecynoic acid). The trans-12-octadecen-9-ynoic acid product was also detected in quandong seed oil. The two other FAD groups (FADX and FADY) were present in both species; in a phylogenetic tree of microsomal FAD enzymes, FADX and FADY formed a unique clade, suggesting that they are highly divergent. The FADX group enzymes had no detectable Δ12 FAD activity but instead catalyzed cis-Δ13 desaturation of stearolic acid when expressed in yeast. No products were detected for the FADY group when expressed recombinantly. Quantitative PCR analysis showed that the FADY genes were expressed in leaf rather than developing seed of the native cherry. FADs with promiscuous and unique activities have been identified in Santalaceae and explain the origin of some of the unusual lipids found in this plant family. PMID:24062307

  16. Helicobacter pylori CagA: analysis of sequence diversity in relation to phosphorylation motifs and implications for the role of CagA as a virulence factor.

    PubMed

    Evans, D J; Evans, D G

    2001-09-01

    CagA is transported into host target cells and subsequently phosphorylated. Clearly this is a mechanism by which Helicobacter pylori could take control of one or more host cell signal transduction pathways. Presumably the end result of this interaction favors survival of H. pylori, irrespective of eventual damage to the host cell. CagA is noted for its amino acid (AA) sequence diversity, both within and outside the variable region of the molecule. The primary purpose of this review is to examine how variation in the type and number of CagA phosphorylation sites might determine the outcome of infection by different strains of H. pylori. The answer to this question could help to explain the widely disparate results obtained when H. pylori CagA status has been compared to type and severity of disease outcome in different populations, that is in different countries. Analysis of all available CagA sequences revealed that CagA contains both tyrosine phosphorylation motifs (TPMs) and cyclic-AMP-dependent phosphorylation motifs (CPMs). There are two potential CPMs near the N-terminus of CagA and at least two in the repeat region; these are not all equally well conserved. We also defined a 48-residue AA sequence, which includes the N-terminal TPM at tyrosine (Y)-122, which distinguishes between Eastern (Hong Kong-Taiwan-Japan-Thailand) H. pylori isolates and those from the West (Europe-Africa-the Americas-Australia). All 28 of the Eastern type CagA proteins have a functional N-terminal TPM whereas 11 of 47 (23.4%) of the Western type contain an inactive motif, with threonine (T) replacing the critical aspartic acid (D) residue. Only 13 of 24 (54%) known CagA sequences have an active TPM in the repeat region and only one has two TPMs in this region. The potential TPM near the C-terminus of CagA is not likely to be important since only 3 of 24 (12.5%) sequences were found to be intact. 
Protein database searches revealed that the AA sequence immediately following the TPM at Y

  17. Semiconductor Sequencing Reveals the Diversity of Bacterial Communities in an Amazon Reservoir Considered as a Methane Source

    NASA Astrophysics Data System (ADS)

    Graças, D. A.; Ramos, R. T.; Sá, P. G.; Baraúna, R. A.; Schneider, M. C.; Silva, A.

    2013-05-01

    The Amazon region has enormous hydroelectric potential, which is used for power generation. Several hydroelectric power stations (HPS) are already installed, and many more are under construction or designed. The Tucuruí HPS, the fifth largest in the world, is located in the Amazon. Construction of this hydroelectric dam flooded 2,400 km² of forest, which is decomposing and releasing greenhouse gases such as methane (CH4). Methane is the most abundant organic gas in the atmosphere and the second most important greenhouse gas. In this study, we used semiconductor sequencing to assess bacterial diversity along a 70-meter-deep water column in the Tucuruí reservoir. One liter of water was collected every 10 meters along the water column for total DNA extraction. A fragment of approximately 150 base pairs of the 16S rRNA gene was amplified by polymerase chain reaction using universal primers. These fragments were then sequenced in parallel on the Ion Torrent® platform using barcodes on the 316 chip. After quality filtering, about 237 thousand reads were obtained, representing more than 300 Mbp. For bacterial diversity analysis, we used only reads longer than 100 base pairs. The taxonomic diversity was obtained from the Ribosomal Database Project Classifier, and alpha diversity analysis (diversity indices and rarefaction) was performed using the RDP pyrosequencing pipeline. Although it is recommended for pyrosequencing data, that pipeline is able to process data obtained from semiconductor sequencing, since all of them are FASTA files. Over 75% of the sequences were not classified in any phylum, which leads us to believe that this environment holds a large bacterial diversity whose function is still unclear. Among the sequences that could be classified, Proteobacteria predominated in all layers, with higher proportions in the lower layers. Cyanobacteria accounted for about 3% in the 0 m and 10 m layers, leading us to conclude that

  18. Genetic diversity in Arabidopsis thaliana L. Heynh. investigated by cleaved amplified polymorphic sequence (CAPS) and inter-simple sequence repeat (ISSR) markers.

    PubMed

    Barth, S; Melchinger, A E; Lübberstedt, Th

    2002-03-01

    In this study, we investigated genetic diversity among 37 accessions in Arabidopsis thaliana from Eurasia, North Africa and North America using morphological traits and two polymerase chain reaction (PCR)-based marker systems: cleaved amplified polymorphic sequences (CAPS) and inter-simple sequence repeats (ISSR). Cluster analysis based on genetic similarities calculated from CAPS data grouped the accessions roughly according to their geographical origin: one large group contained accessions from Western, Northern and Southern Europe as well as North Africa, a second group consisted of Eastern European and Asian continental accessions. North American accessions were interspersed into these groups. Contrary to the CAPS analysis, the dendrogram obtained from the ISSR data did not reflect the geographical origin of the accessions, and the calculated genetic distances did not match the CAPS results. This could be attributable to an uneven genomic distribution of ISSR markers as substantiated by a database search for ISSR binding sites in A. thaliana genomic DNA sequence files, or to the ISSR's different mode of evolution. We recommend CAPS markers for diversity analysis in A. thaliana because a careful selection of markers can ascertain an even representation of the entire genome.

  19. Diversity Analysis of Dairy and Nondairy Lactococcus lactis Isolates, Using a Novel Multilocus Sequence Analysis Scheme and (GTG)5-PCR Fingerprinting

    PubMed Central

    Rademaker, Jan L. W.; Herbet, Hélène; Starrenburg, Marjo J. C.; Naser, Sabri M.; Gevers, Dirk; Kelly, William J.; Hugenholtz, Jeroen; Swings, Jean; van Hylckama Vlieg, Johan E. T.

    2007-01-01

    The diversity of a collection of 102 lactococcus isolates including 91 Lactococcus lactis isolates of dairy and nondairy origin was explored using partial small subunit rRNA gene sequence analysis and limited phenotypic analyses. A subset of 89 strains of L. lactis subsp. cremoris and L. lactis subsp. lactis isolates was further analyzed by (GTG)5-PCR fingerprinting and a novel multilocus sequence analysis (MLSA) scheme. Two major genomic lineages within L. lactis were found. The L. lactis subsp. cremoris type-strain-like genotype lineage included both L. lactis subsp. cremoris and L. lactis subsp. lactis isolates. The other major lineage, with a L. lactis subsp. lactis type-strain-like genotype, comprised L. lactis subsp. lactis isolates only. A novel third genomic lineage represented two L. lactis subsp. lactis isolates of nondairy origin. The genomic lineages deviate from the subspecific classification of L. lactis that is based on a few phenotypic traits only. MLSA of six partial genes (atpA, encoding ATP synthase alpha subunit; pheS, encoding phenylalanine tRNA synthetase; rpoA, encoding RNA polymerase alpha chain; bcaT, encoding branched chain amino acid aminotransferase; pepN, encoding aminopeptidase N; and pepX, encoding X-prolyl dipeptidyl peptidase) revealed 363 polymorphic sites (total length, 1,970 bases) among 89 L. lactis subsp. cremoris and L. lactis subsp. lactis isolates with unique sequence types for most isolates. This allowed high-resolution cluster analysis in which dairy isolates form subclusters of limited diversity within the genomic lineages. The pheS DNA sequence analysis yielded two genetic groups dissimilar to the other genotyping analysis-based lineages, indicating a disparate acquisition route for this gene. PMID:17890345
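
    Counting polymorphic sites in a set of aligned sequences, as reported for the six MLSA genes, can be sketched as follows. This is a minimal illustration with a hypothetical function name and toy data; treating gap characters as non-informative is an assumption, not something the abstract specifies.

```python
def polymorphic_sites(alignment, gap="-"):
    """Return 0-based column indices where more than one non-gap state occurs."""
    if len({len(seq) for seq in alignment}) != 1:
        raise ValueError("all aligned sequences must have the same length")
    sites = []
    for col, column in enumerate(zip(*alignment)):
        states = set(column) - {gap}
        if len(states) > 1:
            sites.append(col)
    return sites


toy_alignment = ["ATGCA", "ATGTA", "AAGCA"]  # made-up sequences
variable_columns = polymorphic_sites(toy_alignment)
reported_fraction = 363 / 1970  # figures taken from the abstract
```

    With the abstract's figures, 363 polymorphic sites over 1,970 bases corresponds to roughly 18.4% of positions being variable.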

  20. SMRT sequencing provides insight into the diversity of the bovine immunoglobulin heavy chain repertoire

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The vertebrate immune system produces a diverse antibody repertoire capable of responding to a vast array of antigens. This diversity is generated through a multifaceted process of gene segment recombination and somatic hypermutation or gene conversion. Recent advances in high-throughput sequencin...

  1. Compare Identity By Sequence Relationships of the Ames Diversity Panel using TYPSimSelector [abstract]

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Maize genetic diversity has been exploited by mankind for 10,000 years. Scientific approaches applied to it by breeders for over a century transformed it into the world’s number one crop. Maize genomic diversity provides a rich resource of interest to evolutionary and population geneticists, constit...

  2. Nucleic and Amino Acid Sequences Support Structure-Based Viral Classification

    PubMed Central

    Sinclair, Robert M.; Ravantti, Janne J.

    2017-01-01

    ABSTRACT Viral capsids ensure viral genome integrity by protecting the enclosed nucleic acids. Interactions between the genome and capsid and between individual capsid proteins (i.e., capsid architecture) are intimate and are expected to be characterized by strong evolutionary conservation. For this reason, a capsid structure-based viral classification has been proposed as a way to bring order to the viral universe. The seeming lack of sufficient sequence similarity to reproduce this classification has made it difficult to reject structural convergence as the basis for the classification. We reinvestigate whether the structure-based classification for viral coat proteins making icosahedral virus capsids is in fact supported by previously undetected sequence similarity. Since codon choices can influence nascent protein folding cotranslationally, we searched for both amino acid and nucleotide sequence similarity. To demonstrate the sensitivity of the approach, we identify a candidate gene for the pandoravirus capsid protein. We show that the structure-based classification is strongly supported by amino acid and also nucleotide sequence similarities, suggesting that the similarities are due to common descent. The correspondence between structure-based and sequence-based analyses of the same proteins shown here allows them to be used in future analyses of the relationship between linear sequence information and macromolecular function, as well as between linear sequence and protein folds. IMPORTANCE Viral capsids protect nucleic acid genomes, which in turn encode capsid proteins. This tight coupling of protein shell and nucleic acids, together with strong functional constraints on capsid protein folding and architecture, leads to the hypothesis that capsid protein-coding nucleotide sequences may retain signatures of ancient viral evolution. We have been able to show that this is indeed the case, using the major capsid proteins of viruses forming icosahedral capsids

  3. Classification of mouse VK groups based on the partial amino acid sequence to the first invariant tryptophan: impact of 14 new sequences from IgG myeloma proteins.

    PubMed

    Potter, M; Newell, J B; Rudikoff, S; Haber, E

    1982-12-01

    Fourteen new VK sequences derived from BALB/c IgG myeloma proteins were determined to the first invariant tryptophan (Trp 35). These partial sequences were compared with 65 other published VK sequences using a computer program. The 79 sequences were organized according to the length of the sequence from the amino terminus to the first invariant tryptophan (Trp 35), into seven groups (33, 34, 35, 36, 39, 40 and 41aa). A distance matrix of all 79 sequences was then computed, i.e. the number of amino acid substitutions necessary to convert one sequence to another was determined. From these data a dendrogram was constructed. Most of the VK sequences fell into clusters or closely related groups. The definition of a sequence group is arbitrary but facilitates the classification of VK proteins. We used 12 substitutions as the basis for defining a sequence group based on the known number of substitutions that are found in the VK21 proteins. By this criterion there were 18 groups in the Trp 35 dendrogram. Twelve of the 14 new sequences fell into one of these sequence groups; two formed new sequence groups. Collective amino acid sequencing is still encountering new VK structures indicating more sequences will be required to attain an accurate estimate of the total number of VK groups. Updated dendrograms can be quickly generated to include newly generated sequences.
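
    The distance-matrix and grouping steps described in this abstract can be sketched in a few lines. The abstract does not name the program or the clustering rule that was used, so the following is only an illustration under stated assumptions: sequences are taken to be of equal length, distance is the count of differing positions, and sequences are joined into a group by single linkage when they differ by at most 12 substitutions. The function names are hypothetical.

```python
from itertools import combinations


def substitutions(a: str, b: str) -> int:
    """Count positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def distance_matrix(seqs):
    """Symmetric matrix of pairwise substitution counts."""
    n = len(seqs)
    d = [[0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        d[i][j] = d[j][i] = substitutions(seqs[i], seqs[j])
    return d


def group_by_threshold(seqs, max_subs=12):
    """Single-linkage groups (an assumption, not the published method).

    Two sequences end up in the same group if a chain of pairs, each
    differing by at most max_subs substitutions, connects them.
    Returns a sorted list of groups, each a list of sequence indices.
    """
    parent = list(range(len(seqs)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    d = distance_matrix(seqs)
    for i, j in combinations(range(len(seqs)), 2):
        if d[i][j] <= max_subs:
            parent[find(i)] = find(j)

    groups = {}
    for i in range(len(seqs)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())
```

    With a toy input such as group_by_threshold(["AAAA", "AAAT", "TTTT"], max_subs=1), the first two sequences fall into one group and the third stands alone. A different linkage rule would generally give different groups for real data.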

  4. Multiple and Diverse vsp and vlp Sequences in Borrelia miyamotoi, a Hard Tick-Borne Zoonotic Pathogen.

    PubMed

    Barbour, Alan G

    2016-01-01

    Based on chromosome sequences, the human pathogen Borrelia miyamotoi phylogenetically clusters with species that cause relapsing fever. But atypically for relapsing fever agents, B. miyamotoi is transmitted not by soft ticks but by hard ticks, which also are vectors of Lyme disease Borrelia species. To further assess the relationships of B. miyamotoi to species that cause relapsing fever, I investigated extrachromosomal sequences of a North American strain with specific attention on plasmid-borne vsp and vlp genes, which are the underpinnings of antigenic variation during relapsing fever. For a hybrid approach to achieve assemblies that spanned more than one of the paralogous vsp and vlp genes, a database of short-reads from next-generation sequencing was supplemented with long-reads obtained with real-time DNA sequencing from single polymerase molecules. This yielded three contigs of 31, 16, and 11 kb, which each contained multiple and diverse sequences that were homologous to vsp and vlp genes of the relapsing fever agent B. hermsii. Two plasmid fragments had coding sequences for plasmid partition proteins that differed from each other and from paralogous proteins for the megaplasmid and a small plasmid of B. miyamotoi. One of 4 vsp genes, vsp1, was present at two loci, one of which was downstream of a candidate prokaryotic promoter. A limited RNA-seq analysis of a population growing in the blood of mice indicated that of the 4 different vsp genes vsp1 was the one that was expressed. The findings indicate that B. miyamotoi has at least four types of plasmids, two or more of which bear vsp and vlp gene sequences that are as numerous and diverse as those of relapsing fever Borrelia. The database and insights from these findings provide a foundation for further investigations of the immune responses to this pathogen and of the capability of B. miyamotoi for antigenic variation.

  5. Four novel papillomavirus sequences support a broad diversity among equine papillomaviruses.

    PubMed

    Lange, Christian E; Vetsch, Elisabeth; Ackermann, Mathias; Favrot, Claude; Tobler, Kurt

    2013-06-01

    Papillomaviruses appear to be species-specific pathogens, and it was suggested that each animal species might harbour its own set of papillomaviruses. However, all approaches addressing the underlying evolutionary phenomena still suffer from very limited data about animal papillomaviruses. In case of the horse for example, only three equine papillomaviruses (EcPVs) have been identified. To further address the situation in this host, suspected papillomavirus-associated lesions were tested for EcPV DNA. Four novel EcPV types were detected and their genomes entirely cloned and sequenced. They display the characteristic organization, with early (E) and late (L) regions harbouring the seven classical open reading frames divided by non-coding regions. They were named EcPVs 4, 5, 6 and 7, according to their dissimilarity to other papillomaviruses. Most L1 nucleotide identities were shared with EcPV2 in case of EcPV4 (62 %) and EcPV5 (60 %) or with EcPV3 in case of EcPV6 (70 %) and EcPV7 (71 %). Thus, EcPVs 4 and 5 may establish novel species within the genus Dyoiota, while EcPVs 6 and 7 might fit into the genus Dyorho and belong to the same species as EcPV3. They were found in genital plaques (EcPV4), aural plaques (EcPV5, EcPV6) or penile masses (EcPV7). Interestingly, PCR analysis revealed the DNA of EcPV2 and EcPV4 as well as of EcPV3 and EcPV6 together in the same tissue samples, respectively. In conclusion, the DNA of four novel EcPV types was identified and cloned. They cluster with the known types and support broad genetic EcPV diversity in at least two of the known clades. Furthermore, PCR assays also provide evidence for EcPV co-infections in horses.

  6. Genetic diversity and molecular evolution of Naga King Chili inferred from internal transcribed spacer sequence of nuclear ribosomal DNA.

    PubMed

    Kehie, Mechuselie; Kumaria, Suman; Devi, Khumuckcham Sangeeta; Tandon, Pramod

    2016-02-01

    Sequences of the Internal Transcribed Spacer (ITS1-5.8S-ITS2) of nuclear ribosomal DNAs were explored to study the genetic diversity and molecular evolution of Naga King Chili. Our study indicated the occurrence of nucleotide polymorphism and haplotypic diversity in the ITS regions. The present study demonstrated that the variability of ITS1 with respect to nucleotide diversity and sequence polymorphism exceeded that of ITS2. Sequence analysis of the 5.8S gene revealed a highly conserved region in all the accessions of Naga King Chili. However, a distinct 13 bp deletion in the 5.8S gene provides strong phylogenetic information for this species and discriminated Naga King Chili from the rest of the Capsicum sp. Neutrality test results implied neutral variation, and the population seems to be evolving at drift-mutation equilibrium and free from directed selection pressure. Furthermore, mismatch analysis showed a multimodal curve indicating a demographic equilibrium. Phylogenetic relationships revealed by Median Joining Network (MJN) analysis denoted a clear discrimination of Naga King Chili from its closest sister species (Capsicum chinense and Capsicum frutescens). The absence of a star-like network of haplotypes suggested an ancient population expansion of this chili.

  7. Deep COI sequencing of standardized benthic samples unveils overlooked diversity of Jordanian coral reefs in the northern Red Sea.

    PubMed

    Al-Rshaidat, Mamoon M D; Snider, Allison; Rosebraugh, Sydney; Devine, Amanda M; Devine, Thomas D; Plaisance, Laetitia; Knowlton, Nancy; Leray, Matthieu

    2016-09-01

    High-throughput sequencing (HTS) of DNA barcodes (metabarcoding), particularly when combined with standardized sampling protocols, is one of the most promising approaches for censusing overlooked cryptic invertebrate communities. We present biodiversity estimates based on sequencing of the cytochrome c oxidase subunit 1 (COI) gene for coral reefs of the Gulf of Aqaba, a semi-enclosed system in the northern Red Sea. Samples were obtained from standardized sampling devices (Autonomous Reef Monitoring Structures (ARMS)) deployed for 18 months. DNA barcoding of non-sessile specimens >2 mm revealed 83 OTUs in six phyla, of which only 25% matched a reference sequence in public databases. Metabarcoding of the 2 mm - 500 μm and sessile bulk fractions revealed 1197 OTUs in 15 animal phyla, of which only 4.9% matched reference barcodes. These results highlight the scarcity of COI data for cryptobenthic organisms of the Red Sea. Compared with data obtained using similar methods, our results suggest that Gulf of Aqaba reefs are less diverse than two Pacific coral reefs but much more diverse than an Atlantic oyster reef at a similar latitude. The standardized approaches used here show promise for establishing baseline data on biodiversity, monitoring the impacts of environmental change, and quantifying patterns of diversity at regional and global scales.

  8. Development of Microsatellite Markers Derived from Expressed Sequence Tags of Polyporales for Genetic Diversity Analysis of Endangered Polyporus umbellatus

    PubMed Central

    Zhang, Yuejin; Chen, Yuanyuan; Wang, Ruihong; Zeng, Ailin; Deyholos, Michael K.; Shu, Jia; Guo, Hongbo

    2015-01-01

    A large set of EST sequences of Polyporales was screened in this investigation in order to identify EST-SSR markers for various applications. The distribution of EST sequences and SSRs in five families of Polyporales was analyzed. Mononucleotide repeats were the most abundant type, followed by trinucleotide repeats. Among the five families, Ganodermataceae contained the most SSR markers, followed by Coriolaceae. Functional prediction of SSR marker-containing EST sequences in Ganoderma lucidum yielded three main groups, namely, cellular component, biological process, and molecular function. Thirty EST-SSR primers were designed to evaluate the genetic diversity of 13 natural Polyporus umbellatus accessions. Twenty-one EST-SSRs were polymorphic, with an average PIC value of 0.33 and a transferability rate of 71%. These 13 P. umbellatus accessions showed relatively high genetic diversity. The expected heterozygosity, Nei's gene diversity, and Shannon information index were 0.41, 0.39, and 0.57, respectively. Both the UPGMA dendrogram and principal coordinate analysis (PCA) gave the same clustering result, dividing the 13 accessions into three or four groups. PMID:26146636
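
    For readers unfamiliar with the per-locus statistics reported in this abstract, the sketch below shows the textbook formulas for expected heterozygosity, polymorphism information content (PIC, in the Botstein form) and the Shannon information index, all computed from allele frequencies at one locus. The abstract does not say which software or estimator variants were used, so the function names and the example frequencies are hypothetical and no sample-size correction is applied.

```python
import math
from itertools import combinations


def expected_heterozygosity(freqs):
    """He = 1 - sum(p_i^2) for allele frequencies at one locus."""
    return 1.0 - sum(p * p for p in freqs)


def pic(freqs):
    """PIC = 1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2."""
    cross = sum(2.0 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return expected_heterozygosity(freqs) - cross


def shannon_index(freqs):
    """I = -sum(p_i * ln p_i), skipping alleles with zero frequency."""
    return -sum(p * math.log(p) for p in freqs if p > 0)


# Illustrative locus with two equally frequent alleles (made-up values).
example = [0.5, 0.5]
print(expected_heterozygosity(example), pic(example), shannon_index(example))
```

    Values reported for a marker set, such as the averages quoted above, would be means of these per-locus quantities across loci.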

  9. Massively parallel rRNA gene sequencing exacerbates the potential for biased community diversity comparisons due to variable library sizes

    SciTech Connect

    Gihring, Thomas; Green, Stefan; Schadt, Christopher Warren

    2011-01-01

    Technologies for massively parallel sequencing are revolutionizing microbial ecology and are vastly increasing the scale of ribosomal RNA (rRNA) gene studies. Although pyrosequencing has increased the breadth and depth of possible rRNA gene sampling, one drawback is that the number of reads obtained per sample is difficult to control. Pyrosequencing libraries typically vary widely in the number of sequences per sample, even within individual studies, and there is a need to revisit the behaviour of richness estimators and diversity indices with variable gene sequence library sizes. Multiple reports and review papers have demonstrated the bias in non-parametric richness estimators (e.g. Chao1 and ACE) and diversity indices when using clone libraries. However, we found that biased community comparisons are accumulating in the literature. Here we demonstrate the effects of sample size on Chao1, ACE, CatchAll, Shannon, Chao-Shen and Simpson's estimations specifically using pyrosequencing libraries. The need to equalize the number of reads being compared across libraries is reiterated, and investigators are directed towards available tools for making unbiased diversity comparisons.
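
    The recommendation to equalize read numbers before comparing libraries can be illustrated with a minimal sketch. It subsamples each library to a common depth without replacement and then computes the bias-corrected Chao1 estimate. It is not the authors' pipeline or any of the tools they point to, the function names and the toy library are hypothetical, and OTU tables are assumed to be plain dictionaries of read counts.

```python
import random
from collections import Counter


def rarefy(counts, depth, seed=0):
    """Randomly subsample `depth` reads without replacement.

    counts maps OTU label -> number of reads in one library.
    """
    reads = [otu for otu, n in counts.items() for _ in range(n)]
    if depth > len(reads):
        raise ValueError("depth exceeds library size")
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    return Counter(rng.sample(reads, depth))


def chao1(counts):
    """Bias-corrected Chao1: S_obs + F1 * (F1 - 1) / (2 * (F2 + 1)).

    F1 and F2 are the numbers of OTUs seen exactly once and twice.
    """
    abundances = [n for n in counts.values() if n > 0]
    s_obs = len(abundances)
    f1 = sum(1 for n in abundances if n == 1)
    f2 = sum(1 for n in abundances if n == 2)
    return s_obs + f1 * (f1 - 1) / (2 * (f2 + 1))


# Toy library (made-up counts): subsample to 50 reads, then estimate richness.
library = {"otu_a": 120, "otu_b": 40, "otu_c": 3, "otu_d": 1}
subsample = rarefy(library, depth=50)
print(sum(subsample.values()), chao1(subsample))
```

    In practice every library in a comparison would be subsampled to the same depth, typically that of the smallest library, before estimators are computed.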

  10. Genome Sequencing of Mycobacterium abscessus Isolates from Patients in the United States and Comparisons to Globally Diverse Clinical Strains

    PubMed Central

    Davidson, Rebecca M.; Hasan, Nabeeh A.; Reynolds, Paul R.; Totten, Sarah; Garcia, Benjamin; Levin, Adrah; Ramamoorthy, Preveen; Heifets, Leonid; Daley, Charles L.

    2014-01-01

    Nontuberculous mycobacterial infections caused by Mycobacterium abscessus are responsible for a range of disease manifestations from pulmonary to skin infections and are notoriously difficult to treat, due to innate resistance to many antibiotics. Previous population studies of clinical M. abscessus isolates utilized multilocus sequence typing or pulsed-field gel electrophoresis, but high-resolution examinations of genetic diversity at the whole-genome level have not been well characterized, particularly among clinical isolates derived in the United States. We performed whole-genome sequencing of 11 clinical M. abscessus isolates derived from eight U.S. patients with pulmonary nontuberculous mycobacterial infections, compared them to 30 globally diverse clinical isolates, and investigated intrapatient genomic diversity and evolution. Phylogenomic analyses revealed a cluster of closely related U.S. and Western European M. abscessus subsp. abscessus isolates that are genetically distinct from other European isolates and all Asian isolates. Large-scale variation analyses suggested genome content differences of 0.3 to 8.3%, relative to the reference strain ATCC 19977T. Longitudinally sampled isolates showed very few single-nucleotide polymorphisms and correlated genomic deletion patterns, suggesting homogeneous infection populations. Our study explores the genomic diversity of clinical M. abscessus strains from multiple continents and provides insight into the genome plasticity of an opportunistic pathogen. PMID:25056330

  11. Assessment of genetic diversity among four orchids based on ddRAD sequencing data for conservation purposes.

    PubMed

    Roy, Subhas Chandra; Moitra, Kaushik; De Sarker, Dilip

    2017-01-01

    Genetic diversity was assessed in the four orchid species using NGS based ddRAD sequencing data. The assembled nucleotide sequences (fastq) were deposited in the SRA archive of NCBI Database with accession number (SRP063543 for Dendrobium, SRP065790 for Geodorum, SRP072201 for Cymbidium and SRP072378 for Rhynchostylis). Total base pair read was 1.1 Mbp in case of Dendrobium sp., 553.3 Kbp for Geodorum sp., 1.6 Gbp for Cymbidium, and 1.4 Gbp for Rhynchostylis. Average GC% was 43.9% in Geodorum, 43.7% in Dendrobium, 41.2% in Cymbidium and 42.3% in Rhynchostylis. Four partial gene sequences were used in DnaSP5 program for nucleotide diversity and phylogenetic relationship determination (Ycf2 gene of Dendrobium, matK gene of Geodorum, psbD gene of Cymbidium and Ycf2 gene of Rhynchostylis). Nucleotide diversity (per site) Pi (π) was 0.10560 in Dendrobium, 0.03586 in Geodorum, 0.01364 in Cymbidium and 0.011344 in Rhynchostylis. Neutrality test statistics showed negative values in all four orchid species (Tajima's D value -2.17959 in Dendrobium, -2.01655 in Geodorum, -2.12362 in Rhynchostylis and -1.54222 in Cymbidium), indicating purifying selection. Results for these gene sequences (matK, Ycf2 and psbD) indicate that they did not evolve neutrally, signifying that selection might have played a role in the evolution of these genes in these four groups of orchids. Phylogenetic relationships were analyzed by reconstructing a dendrogram based on the matK, psbD and Ycf2 gene sequences using the maximum likelihood method in the MEGA6 program.
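
    Per-site nucleotide diversity (π), as reported in this abstract, is the average number of differences between pairs of sequences divided by the alignment length. The sketch below shows that calculation for a gap-free alignment. It is not a reimplementation of DnaSP, which handles gaps and missing data in its own way, and the function name and toy alignment are hypothetical.

```python
from itertools import combinations


def nucleotide_diversity(aligned):
    """Mean pairwise differences per site for equal-length sequences."""
    if len(aligned) < 2:
        raise ValueError("need at least two sequences")
    length = len(aligned[0])
    if length == 0 or any(len(s) != length for s in aligned):
        raise ValueError("sequences must be non-empty and equally long")
    pairs = list(combinations(aligned, 2))
    # Total number of mismatched positions summed over all pairs.
    total = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs)
    return total / len(pairs) / length


# Toy alignment (made-up): three sequences of four sites.
print(nucleotide_diversity(["ACGT", "ACGA", "ACGT"]))
```

    Here two of the three pairs differ at one of four sites, so the result is 2 / 3 / 4, or about 0.167.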

  12. Detection and isolation of nucleic acid sequences using competitive hybridization probes

    DOEpatents

    Lucas, J.N.; Straume, T.; Bogen, K.T.

    1997-04-01

    A method for detecting a target nucleic acid sequence in a sample is provided using hybridization probes which competitively hybridize to a target nucleic acid. According to the method, a target nucleic acid sequence is hybridized to first and second hybridization probes which are complementary to overlapping portions of the target nucleic acid sequence, the first hybridization probe including a first complexing agent capable of forming a binding pair with a second complexing agent and the second hybridization probe including a detectable marker. The first complexing agent attached to the first hybridization probe is contacted with a second complexing agent, the second complexing agent being attached to a solid support such that when the first and second complexing agents are attached, target nucleic acid sequences hybridized to the first hybridization probe become immobilized on to the solid support. The immobilized target nucleic acids are then separated and detected by detecting the detectable marker attached to the second hybridization probe. A kit for performing the method is also provided. 7 figs.

  13. Detection and isolation of nucleic acid sequences using competitive hybridization probes

    DOEpatents

    Lucas, Joe N.; Straume, Tore; Bogen, Kenneth T.

    1997-01-01

    A method for detecting a target nucleic acid sequence in a sample is provided using hybridization probes which competitively hybridize to a target nucleic acid. According to the method, a target nucleic acid sequence is hybridized to first and second hybridization probes which are complementary to overlapping portions of the target nucleic acid sequence, the first hybridization probe including a first complexing agent capable of forming a binding pair with a second complexing agent and the second hybridization probe including a detectable marker. The first complexing agent attached to the first hybridization probe is contacted with a second complexing agent, the second complexing agent being attached to a solid support such that when the first and second complexing agents are attached, target nucleic acid sequences hybridized to the first hybridization probe become immobilized on to the solid support. The immobilized target nucleic acids are then separated and detected by detecting the detectable marker attached to the second hybridization probe. A kit for performing the method is also provided.

  14. Estimates of Soil Bacterial Ribosome Content and Diversity Are Significantly Affected by the Nucleic Acid Extraction Method Employed.

    PubMed

    Wüst, Pia K; Nacke, Heiko; Kaiser, Kristin; Marhan, Sven; Sikorski, Johannes; Kandeler, Ellen; Daniel, Rolf; Overmann, Jörg

    2016-05-01

    Modern sequencing technologies allow high-resolution analyses of total and potentially active soil microbial communities based on their DNA and RNA, respectively. In the present study, quantitative PCR and 454 pyrosequencing were used to evaluate the effects of different extraction methods on the abundance and diversity of 16S rRNA genes and transcripts recovered from three different types of soils (leptosol, stagnosol, and gleysol). The quality and yield of nucleic acids varied considerably with respect to both the applied extraction method and the analyzed type of soil. The bacterial ribosome content (calculated as the ratio of 16S rRNA transcripts to 16S rRNA genes) can serve as an indicator of the potential activity of bacterial cells and differed by 2 orders of magnitude between nucleic acid extracts obtained by the various extraction methods. Depending on the extraction method, the relative abundances of dominant soil taxa, in particular Actinobacteria and Proteobacteria, varied by a factor of up to 10. Through this systematic approach, the present study allows guidelines to be deduced for the selection of the appropriate extraction protocol according to the specific soil properties, the nucleic acid of interest, and the target organisms.

  15. Estimates of Soil Bacterial Ribosome Content and Diversity Are Significantly Affected by the Nucleic Acid Extraction Method Employed

    PubMed Central

    Wüst, Pia K.; Nacke, Heiko; Kaiser, Kristin; Marhan, Sven; Sikorski, Johannes; Kandeler, Ellen; Daniel, Rolf

    2016-01-01

    Modern sequencing technologies allow high-resolution analyses of total and potentially active soil microbial communities based on their DNA and RNA, respectively. In the present study, quantitative PCR and 454 pyrosequencing were used to evaluate the effects of different extraction methods on the abundance and diversity of 16S rRNA genes and transcripts recovered from three different types of soils (leptosol, stagnosol, and gleysol). The quality and yield of nucleic acids varied considerably with respect to both the applied extraction method and the analyzed type of soil. The bacterial ribosome content (calculated as the ratio of 16S rRNA transcripts to 16S rRNA genes) can serve as an indicator of the potential activity of bacterial cells and differed by 2 orders of magnitude between nucleic acid extracts obtained by the various extraction methods. Depending on the extraction method, the relative abundances of dominant soil taxa, in particular Actinobacteria and Proteobacteria, varied by a factor of up to 10. Through this systematic approach, the present study allows guidelines to be deduced for the selection of the appropriate extraction protocol according to the specific soil properties, the nucleic acid of interest, and the target organisms. PMID:26896137

  16. Amino acid sequence around the active-site serine residue in the acyltransferase domain of goat mammary fatty acid synthetase.

    PubMed Central

    Mikkelsen, J; Højrup, P; Rasmussen, M M; Roepstorff, P; Knudsen, J

    1985-01-01

    Goat mammary fatty acid synthetase was labelled in the acyltransferase domain by formation of O-ester intermediates by incubation with [1-14C]acetyl-CoA and [2-14C]malonyl-CoA. Tryptic-digest and CNBr-cleavage peptides were isolated and purified by high-performance reverse-phase and ion-exchange liquid chromatography. The sequences of the malonyl- and acetyl-labelled peptides were shown to be identical. The results confirm the hypothesis that both acetyl and malonyl groups are transferred to the mammalian fatty acid synthetase complex by the same transferase. The sequence is compared with those of other fatty acid synthetase transferases. PMID:3922356

  17. Ligation with nucleic acid sequence-based amplification.

    PubMed

    Ong, Carmichael; Tai, Warren; Sarma, Aartik; Opal, Steven M; Artenstein, Andrew W; Tripathi, Anubhav

    2012-01-01

    This work presents a novel method for detecting nucleic acid targets using a ligation step along with an isothermal, exponential amplification step. We use an engineered ssDNA with two variable regions on the ends, allowing us to design the probe for optimal reaction kinetics and primer binding. This two-part probe is ligated by T4 DNA Ligase only when both parts bind adjacently to the target. The assay demonstrates that the expected 72-nt RNA product appears only when the synthetic target, T4 ligase, and both probe fragments are present during the ligation step. An extraneous 38-nt RNA product also appears due to linear amplification of unligated probe (P3), but its presence does not cause a false-positive result. In addition, 40 mmol/L KCl in the final amplification mix was found to be optimal. It was also found that increasing P5 in excess of P3 helped with ligation and reduced the extraneous 38-nt RNA product. The assay was also tested with a single nucleotide polymorphism target, changing one base at the ligation site. The assay was able to yield a negative signal despite only a single-base change. Finally, using P3 and P5 with longer binding sites results in increased overall sensitivity of the reaction, showing that increasing ligation efficiency can improve the assay overall. We believe that this method can be used effectively for a number of diagnostic assays.

  18. Multilocus sequence analyses reveal extensive diversity and multiple origins of fluconazole resistance in Candida tropicalis from tropical China

    PubMed Central

    Wu, Jin-Yan; Guo, Hong; Wang, Hua-Min; Yi, Guo-Hui; Zhou, Li-Min; He, Xiao-Wen; Zhang, Ying; Xu, Jianping

    2017-01-01

    Candida tropicalis is among the most prevalent human pathogenic yeast species, second only to C. albicans in certain geographic regions such as East Asia and Brazil. However, compared to C. albicans, relatively little is known about the patterns of genetic variation in C. tropicalis. This study analyzed the genetic diversity and relationships among isolates of C. tropicalis from the southern Chinese island of Hainan. A total of 116 isolates were obtained from seven geographic regions located across the Island. For each isolate, a total of 2677 bp from six gene loci were sequenced and 79 (2.96%) polymorphic nucleotide sites were found in our sample. Comparisons with strains reported from other parts of the world identified significant novel diversities in Hainan, including an average of six novel sequences (with a range 1 to 14) per locus and 80 novel diploid sequence types. Most of the genetic variation was found within individual strains and there was abundant evidence for gene flow among the seven geographic locations within Hainan. Interestingly, our analyses identified no significant correlation between the diploid sequence types at the six loci and fluconazole susceptibility, consistent with multiple origins of fluconazole resistance in the Hainan population of C. tropicalis. PMID:28186162

  19. Captured metagenomics: large-scale targeting of genes based on ‘sequence capture’ reveals functional diversity in soils

    PubMed Central

    Manoharan, Lokeshwaran; Kushwaha, Sandeep K.; Hedlund, Katarina; Ahrén, Dag

    2015-01-01

    Microbial enzyme diversity is key to understanding many ecosystem processes. Whole metagenome sequencing (WMG) obtains information on functional genes, but it is costly and inefficient due to the large amount of sequencing that is required. In this study, we have applied a captured metagenomics technique for functional genes in soil microorganisms, as an alternative to WMG. Large-scale targeting of functional genes, coding for enzymes related to organic matter degradation, was applied to two agricultural soil communities through captured metagenomics. Captured metagenomics uses custom-designed, hybridization-based oligonucleotide probes that enrich functional genes of interest in metagenomic libraries where only probe-bound DNA fragments are sequenced. The captured metagenomes were highly enriched with targeted genes while maintaining their target diversity, and their taxonomic distribution correlated well with the traditional ribosomal sequencing. The captured metagenomes were highly enriched with genes related to organic matter degradation; at least five times more than similar, publicly available soil WMG projects. This target enrichment technique also preserves the functional representation of the soils, thereby facilitating comparative metagenomics projects. Here, we present the first study that applies the captured metagenomics approach at large scale, and this novel method allows deep investigations of central ecosystem processes by studying functional gene abundances. PMID:26490729

  20. Thin-film technology for direct visual detection of nucleic acid sequences: applications in clinical research.

    PubMed

    Jenison, Robert D; Bucala, Richard; Maul, Diana; Ward, David C

    2006-01-01

    Certain optical conditions permit the unaided eye to detect thickness changes on surfaces on the order of 20 Å, which are of similar dimensions to monomolecular interactions between proteins or hybridization of complementary nucleic acid sequences. Such detection exploits specific interference of reflected white light, wherein thickness changes are perceived as surface color changes. This technology, termed thin-film detection, allows for the visualization of subattomole amounts of nucleic acid targets, even in complex clinical samples. Thin-film technology has been applied to a broad range of clinically relevant indications, including the detection of pathogenic bacterial and viral nucleic acid sequences and the discrimination of sequence variations in human genes causally related to susceptibility or severity of disease.

  1. Molecular characterization and sequence diversity of genes encoding the large subunit of the ADP-glucose pyrophosphorylase in wheat (Triticum aestivum L.).

    PubMed

    Rose, Meghan K; Huang, Xiu-Qiang; Brûlé-Babel, Anita

    2016-02-01

    The large subunit of ADP glucose pyrophosphorylase (AGPase), the rate limiting enzyme in starch biosynthesis in Triticum aestivum L., is encoded by the ADP glucose pyrophosphorylase large subunit (AGP-L) gene. This was the first report on the development of three genome-specific primer sets for isolating the complete genomic sequence of all three homoeologous AGP-L genes on group 1 chromosomes. All three AGP-L genes consisted of 15 introns and 15 exons. The lengths of the structural genes from start to stop codon were 3334 bp for AGP-L-A1, 3351 bp for AGP-L-B1, and 3340 bp for AGP-L-D1. The coding region was 1569 bases long in all three genomes. All three AGP-L genes encoded 522 amino acid residues including the transit peptide sequences with 62 amino acid residues and the mature protein with 460 amino acid residues. The mature protein of three AGP-L genes was highly conserved. Three AGP-L genes were sequenced in 47 diverse spring and winter wheat genotypes. One and two haplotypes were found for AGP-L-D1 and AGP-L-A1, respectively. In total, 67 SNPs (single nucleotide polymorphisms) and 13 indels (insertions or deletions) forming five haplotypes were identified for AGP-L-B1. All 13 indels and 58 of the 67 SNPs among the 47 genotypes were located in the non-coding regions, while the remaining nine SNPs were synonymous substitutions in the coding region. Significant LD was found among the 45 SNPs and ten indels located from intron 2 to intron 3. Association analysis indicated that four SNPs were strongly associated with seed number per spike and thousand kernel weight.

  2. Development of Genomic Microsatellite Markers in Carthamus tinctorius L. (Safflower) Using Next Generation Sequencing and Assessment of Their Cross-Species Transferability and Utility for Diversity Analysis

    PubMed Central

    Variath, Murali Tottekkad; Joshi, Gopal; Bali, Sapinder; Agarwal, Manu; Kumar, Amar; Jagannath, Arun; Goel, Shailendra

    2015-01-01

    Background: Safflower (Carthamus tinctorius L.), an Asteraceae member, yields high quality edible oil rich in unsaturated fatty acids and is resilient to dry conditions. The crop holds tremendous potential for improvement through concerted molecular breeding programs due to the availability of significant genetic and phenotypic diversity. Genomic resources that could facilitate such breeding programs remain largely underdeveloped in the crop. The present study was initiated to develop a large set of novel microsatellite markers for safflower using next generation sequencing. Principal Findings: Low throughput genome sequencing of safflower was performed using Illumina paired end technology providing ~3.5X coverage of the genome. Analysis of sequencing data allowed identification of 23,067 regions harboring perfect microsatellite loci. The safflower genome was found to be rich in dinucleotide repeats followed by tri-, tetra-, penta- and hexa-nucleotides. Primer pairs were designed for 5,716 novel microsatellite sequences with repeat length ≥ 20 bases and optimal flanking regions. A subset of 325 microsatellite loci was tested for amplification, of which 294 loci produced robust amplification. The validated primers were used for assessment of 23 safflower accessions belonging to diverse agro-climatic zones of the world leading to identification of 93 polymorphic primers (31.6%). The numbers of observed alleles at each locus ranged from two to four and mean polymorphism information content was found to be 0.3075. The polymorphic primers were tested for cross-species transferability on nine wild relatives of cultivated safflower. All primers except one showed amplification in at least two wild species while 25 primers amplified across all the nine species. The UPGMA dendrogram clustered C. tinctorius accessions and wild species separately into two major groups. The proposed progenitor species of safflower, C. oxyacantha and C. palaestinus were genetically closer to

  3. Conservation of Shannon's redundancy for proteins. [information theory applied to amino acid sequences]

    NASA Technical Reports Server (NTRS)

    Gatlin, L. L.

    1974-01-01

    Concepts of information theory are applied to examine various proteins in terms of their redundancy in natural originators such as animals and plants. The Monte Carlo method is used to derive information parameters for random protein sequences. Real protein sequence parameters are compared with the standard parameters of protein sequences having a specific length. The tendency of a chain to contain some amino acids more frequently than others and the tendency of a chain to contain certain amino acid pairs more frequently than other pairs are used as randomness measures of individual protein sequences. Non-periodic proteins are generally found to have random Shannon redundancies except in cases of constraints due to short chain length and genetic codes. Redundant characteristics of highly periodic proteins are discussed. A degree of periodicity parameter is derived.
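
    The composition-based randomness measure mentioned in this abstract can be illustrated with a short sketch. It is not Gatlin's published procedure: it assumes a 20-letter amino acid alphabet, covers only single-residue frequencies (not the pair frequencies or the Monte Carlo baseline), and the function names are hypothetical. Redundancy is taken here as 1 - H/H_max, with H the Shannon entropy of the residue composition.

```python
import math
from collections import Counter


def shannon_entropy(seq: str) -> float:
    """Shannon entropy, in bits per residue, of the residue composition."""
    counts = Counter(seq)
    n = len(seq)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def redundancy(seq: str, alphabet_size: int = 20) -> float:
    """Redundancy 1 - H / H_max, where H_max = log2(alphabet_size).

    alphabet_size=20 is an assumption (the standard amino acids).
    """
    return 1.0 - shannon_entropy(seq) / math.log2(alphabet_size)
```

    A chain that uses all 20 residues equally often has redundancy 0 under this definition, while a chain made of a single repeated residue has redundancy 1.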

  4. RNA internal standard synthesis by nucleic acid sequence-based amplification for competitive quantitative amplification reactions.

    PubMed

    Lo, Wan-Yu; Baeumner, Antje J

    2007-02-15

    Nucleic acid sequence-based amplification (NASBA) reactions have been demonstrated to successfully synthesize new sequences based on deletion and insertion reactions. Two RNA internal standards were synthesized for use in competitive amplification reactions in which quantitative analysis can be achieved by coamplifying the internal standard with the wild type sample. The sequences were created in two consecutive NASBA reactions using the E. coli clpB mRNA sequence as model analyte. The primer sequences of the wild type sequence were maintained, and a 20-nt-long segment inside the amplicon region was exchanged for a new segment of similar GC content and melting temperature. The new RNA sequence was thus amplifiable using the wild type primers and detectable via a new inserted sequence. In the first reaction, the forwarding primer and an additional 20-nt-long sequence was deleted and replaced by a new 20-nt-long sequence. In the second reaction, a forwarding primer containing as 5' overhang sequence the wild type primer sequence was used. The presence of pure internal standard was verified using electrochemiluminescence and RNA lateral-flow biosensor analysis. Additional sequence deletion in order to shorten the internal standard amplicons and thus generate higher detection signals was found not to be required. Finally, a competitive NASBA reaction between one internal standard and the wild type sequence was carried out proving its functionality. This new rapid construction method via NASBA provides advantages over the traditional techniques since it requires no traditional cloning procedures, no thermocyclers, and can be completed in less than 4 h.

  5. Triazine-Based Sequence-Defined Polymers with Side-Chain Diversity and Backbone-Backbone Interaction Motifs.

    PubMed

    Grate, Jay W; Mo, Kai-For; Daily, Michael D

    2016-03-14

    Sequence control in polymers, well-known in nature, encodes structure and functionality. Here we introduce a new architecture, based on the nucleophilic aromatic substitution chemistry of cyanuric chloride, that creates a new class of sequence-defined polymers dubbed TZPs. Proof of concept is demonstrated with two synthesized hexamers, having neutral and ionizable side chains. Molecular dynamics simulations show backbone-backbone interactions, including H-bonding motifs and pi-pi interactions. This architecture is arguably biomimetic while differing from sequence-defined polymers having peptide bonds. The synthetic methodology supports the structural diversity of side chains known in peptides, as well as backbone-backbone hydrogen-bonding motifs, and will thus enable new macromolecules and materials with useful functions.

  6. Complete metagenome sequencing based bacterial diversity and functional insights from basaltic hot spring of Unkeshwar, Maharashtra, India

    PubMed Central

    Mehetre, Gajanan T.; Paranjpe, Aditi S.; Dastager, Syed G.; Dharne, Mahesh S.

    2015-01-01

    Unkeshwar hot springs are located in the South East Deccan continental basalt of India. Here, we report the microbial community analysis of this hot spring using a whole metagenome shotgun sequencing approach. The analysis revealed a total of 848,096 reads with 212.87 Mbps with 50.87% G + C content. Metagenomic sequences were deposited in the SRA database with accession number SUB1242219. Community analysis revealed 99.98% of sequences belonging to bacteria, 0.01% to archaea and 0.01% to viruses. The data revealed 41 phyla, spanning bacteria and archaea, and 719 different species. In taxonomic analysis, the dominant phyla were Actinobacteria (56%), Verrucomicrobia (24%), Bacteroidetes (13%), Deinococcus-Thermus (3%), Firmicutes (2%) and viruses (2%). Furthermore, functional annotation using pathway information revealed the dynamic potential of the hot spring community in terms of metabolism, environmental information processing, cellular processes and other important aspects. Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis of each contig sequence by assigning KEGG Orthology (KO) numbers revealed contig sequences that were assigned to metabolism, organismal systems, environmental information processing, cellular processes and human diseases, with some unclassified sequences. The Unkeshwar hot springs offer rich phylogenetic diversity and metabolic potential for biotechnological applications. PMID:26981391

  7. DNA-binding proteins from marine bacteria expand the known sequence diversity of TALE-like repeats

    PubMed Central

    de Lange, Orlando; Wolf, Christina; Thiel, Philipp; Krüger, Jens; Kleusch, Christian; Kohlbacher, Oliver; Lahaye, Thomas

    2015-01-01

    Transcription Activator-Like Effectors (TALEs) of Xanthomonas bacteria are programmable DNA binding proteins with unprecedented target specificity. Comparative studies into TALE repeat structure and function are hindered by the limited sequence variation among TALE repeats. More sequence-diverse TALE-like proteins are known from Ralstonia solanacearum (RipTALs) and Burkholderia rhizoxinica (Bats), but RipTAL and Bat repeats are conserved with those of TALEs around the DNA-binding residue. We study two novel marine-organism TALE-like proteins (MOrTL1 and MOrTL2), the first to date of non-terrestrial origin. We have assessed their DNA-binding properties and modelled repeat structures. We found that repeats from these proteins mediate sequence specific DNA binding conforming to the TALE code, despite low sequence similarity to TALE repeats, and with novel residues around the BSR. However, MOrTL1 repeats show greater sequence discriminating power than MOrTL2 repeats. Sequence alignments show that there are only three residues conserved between repeats of all TALE-like proteins including the two new additions. This conserved motif could prove useful as an identifier for future TALE-likes. Additionally, comparing MOrTL repeats with those of other TALE-likes suggests a common evolutionary origin for the TALEs, RipTALs and Bats. PMID:26481363

  8. Complete genome sequence of the probiotic lactic acid bacterium Lactobacillus acidophilus NCFM

    PubMed Central

    Altermann, Eric; Russell, W. Michael; Azcarate-Peril, M. Andrea; Barrangou, Rodolphe; Buck, B. Logan; McAuliffe, Olivia; Souther, Nicole; Dobson, Alleson; Duong, Tri; Callanan, Michael; Lick, Sonja; Hamrick, Alice; Cano, Raul; Klaenhammer, Todd R.

    2005-01-01

    Lactobacillus acidophilus NCFM is a probiotic bacterium that has been produced commercially since 1972. The complete genome is 1,993,564 nt and devoid of plasmids. The average GC content is 34.71% with 1,864 predicted ORFs, of which 72.5% were functionally classified. Nine phage-related integrases were predicted, but no complete prophages were found. However, three unique regions designated as potential autonomous units (PAUs) were identified. These units resemble a unique structure and bear characteristics of both plasmids and phages. Analysis of the three PAUs revealed the presence of two R/M systems and a prophage maintenance system killer protein. A spacers interspersed direct repeat locus containing 32 nearly perfect 29-bp repeats was discovered and may provide a unique molecular signature for this organism. In silico analyses predicted 17 transposase genes and a chromosomal locus for lactacin B, a class II bacteriocin. Several mucus- and fibronectin-binding proteins, implicated in adhesion to human intestinal cells, were also identified. Gene clusters for transport of a diverse group of carbohydrates, including fructooligosaccharides and raffinose, were present and often accompanied by transcriptional regulators of the lacI family. For protein degradation and peptide utilization, the organism encoded 20 putative peptidases, homologs for PrtP and PrtM, and two complete oligopeptide transport systems. Nine two-component regulatory systems were predicted, some associated with determinants implicated in bacteriocin production and acid tolerance. Collectively, these features within the genome sequence of L. acidophilus are likely to contribute to the organisms' gastric survival and promote interactions with the intestinal mucosa and microbiota. PMID:15671160

  9. In search of actionable targets for agrigenomics and microalgal biofuel production: sequence-structural diversity studies on algal and higher plants with a focus on GPAT protein.

    PubMed

    Misra, Namrata; Panda, Prasanna Kumar

    2013-04-01

    The triacylglycerol (TAG) pathway provides several targets for genetic engineering to optimize microalgal lipid productivity. GPAT (glycerol-3-phosphate acyltransferase) is a crucial enzyme that catalyzes the initial step of TAG biosynthesis. Despite many recent biochemical studies, a comprehensive sequence-structure analysis of GPAT across diverse lipid-yielding organisms is lacking. Hence, we performed a comparative genomic analysis of plastid-located GPAT proteins from 7 microalgae and 3 higher plant species. The close evolutionary relationship observed between red algae/diatoms and green algae/plant lineages in the phylogenetic tree was further corroborated by motif and gene structure analysis. The predicted molecular weight, amino acid composition, Instability Index, and hydropathicity profile gave an overall representation of the biochemical features of GPAT protein across the species under study. Furthermore, homology models of GPAT from Chlamydomonas reinhardtii, Arabidopsis thaliana, and Glycine max provided deep insights into the protein architecture and substrate binding sites. Despite low sequence identity found between algal and plant GPATs, the developed models exhibited strikingly conserved topology consisting of 14 α-helices and 9 β-sheets arranged in two domains. However, subtle variations in amino acids of the fatty acyl binding site were identified that might influence the substrate selectivity of GPAT. Together, the results will provide useful resources to understand the functional and evolutionary relationship of GPAT and potentially benefit the development of engineered enzymes for augmenting algal biofuel production.

  10. Genetic Diversity and Phylogenetic Evolution of Tibetan Sheep Based on mtDNA D-Loop Sequences.

    PubMed

    Liu, Jianbin; Ding, Xuezhi; Zeng, Yufeng; Yue, Yaojing; Guo, Xian; Guo, Tingting; Chu, Min; Wang, Fan; Han, Jilong; Feng, Ruilin; Sun, Xiaoping; Niu, Chune; Yang, Bohui; Guo, Jian; Yuan, Chao

    2016-01-01

    The molecular and population genetic evidence of the phylogenetic status of the Tibetan sheep (Ovis aries) is not well understood, and little is known about this species' genetic diversity. This knowledge gap is partly due to the difficulty of sample collection. This is the first work to address this question. Here, the genetic diversity and phylogenetic relationship of 636 individual Tibetan sheep from fifteen populations were assessed using 642 complete sequences of the mitochondrial DNA D-loop. Samples were collected from the Qinghai-Tibetan Plateau area in China, and reference data were obtained from the six reference breed sequences available in GenBank. The length of the sequences varied considerably, between 1031 and 1259 bp. The haplotype diversity and nucleotide diversity were 0.992±0.010 and 0.019±0.001, respectively. The average number of nucleotide differences was 19.635. The mean nucleotide composition of the 350 haplotypes was 32.961% A, 29.708% T, 22.892% C, 14.439% G, 62.669% A+T, and 37.331% G+C. Phylogenetic analysis showed that all four previously defined haplogroups (A, B, C, and D) were found in the 636 individuals of the fifteen Tibetan sheep populations but that only the D haplogroup was found in Linzhou sheep. Further, the clustering analysis divided the fifteen Tibetan sheep populations into at least two clusters. The estimation of the demographic parameters from the mismatch analyses showed that haplogroups A, B, and C had at least one demographic expansion in Tibetan sheep. These results contribute to the knowledge of Tibetan sheep populations and will help inform future conservation programs about the Tibetan sheep native to the Qinghai-Tibetan Plateau.
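
    The haplotype diversity and nucleotide diversity statistics reported in this abstract can be illustrated with a minimal sketch. It assumes the commonly used estimators (haplotype diversity as n/(n-1) times one minus the sum of squared haplotype frequencies, and nucleotide diversity as the mean proportion of differing sites over all sequence pairs), and it assumes pre-aligned, equal-length sequences with no special handling of gaps. The function names are hypothetical, and the abstract does not say which software or estimator the authors used.

```python
from collections import Counter
from itertools import combinations


def haplotype_diversity(seqs):
    """n/(n-1) * (1 - sum of squared haplotype frequencies)."""
    n = len(seqs)
    freqs = [count / n for count in Counter(seqs).values()]
    return n / (n - 1) * (1.0 - sum(f * f for f in freqs))


def nucleotide_diversity(seqs):
    """Mean proportion of differing sites over all pairs of sequences.

    Assumes every sequence is aligned to the same length.
    """
    length = len(seqs[0])
    pairs = list(combinations(seqs, 2))
    diffs = sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs)
    return diffs / (len(pairs) * length)
```

    For the toy alignment ["AAAA", "AAAT", "AAAT", "CAAA"], these definitions give a haplotype diversity of 5/6 and a nucleotide diversity of 7/24.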

  11. Genetic Diversity and Phylogenetic Evolution of Tibetan Sheep Based on mtDNA D-Loop Sequences

    PubMed Central

    Liu, Jianbin; Ding, Xuezhi; Zeng, Yufeng; Yue, Yaojing; Guo, Xian; Guo, Tingting; Chu, Min; Wang, Fan; Han, Jilong; Feng, Ruilin; Sun, Xiaoping; Niu, Chune; Yang, Bohui; Guo, Jian; Yuan, Chao

    2016-01-01

    The molecular and population genetic evidence of the phylogenetic status of the Tibetan sheep (Ovis aries) is not well understood, and little is known about this species’ genetic diversity. This knowledge gap is partly due to the difficulty of sample collection. This is the first work to address this question. Here, the genetic diversity and phylogenetic relationship of 636 individual Tibetan sheep from fifteen populations were assessed using 642 complete sequences of the mitochondrial DNA D-loop. Samples were collected from the Qinghai-Tibetan Plateau area in China, and reference data were obtained from the six reference breed sequences available in GenBank. The length of the sequences varied considerably, between 1031 and 1259 bp. The haplotype diversity and nucleotide diversity were 0.992±0.010 and 0.019±0.001, respectively. The average number of nucleotide differences was 19.635. The mean nucleotide composition of the 350 haplotypes was 32.961% A, 29.708% T, 22.892% C, 14.439% G, 62.669% A+T, and 37.331% G+C. Phylogenetic analysis showed that all four previously defined haplogroups (A, B, C, and D) were found in the 636 individuals of the fifteen Tibetan sheep populations but that only the D haplogroup was found in Linzhou sheep. Further, the clustering analysis divided the fifteen Tibetan sheep populations into at least two clusters. The estimation of the demographic parameters from the mismatch analyses showed that haplogroups A, B, and C had at least one demographic expansion in Tibetan sheep. These results contribute to the knowledge of Tibetan sheep populations and will help inform future conservation programs about the Tibetan sheep native to the Qinghai-Tibetan Plateau. PMID:27463976

  12. Allelic diversity of the MHC class II DRB genes in brown bears (Ursus arctos) and a comparison of DRB sequences within the family Ursidae.

    PubMed

    Goda, N; Mano, T; Kosintsev, P; Vorobiev, A; Masuda, R

    2010-11-01

    The allelic diversity of the DRB locus in major histocompatibility complex (MHC) genes was analyzed in the brown bear (Ursus arctos) from the Hokkaido Island of Japan, Siberia, and Kodiak of Alaska. Nineteen alleles of the DRB exon 2 were identified from a total of 38 individuals of U. arctos and were highly polymorphic. Comparisons of non-synonymous and synonymous substitutions in the antigen-binding sites of deduced amino acid sequences indicated evidence for balancing selection on the bear DRB locus. The phylogenetic analysis of the DRB alleles among three genera (Ursus, Tremarctos, and Ailuropoda) in the family Ursidae revealed that DRB allelic lineages were not separated according to species. This strongly shows trans-species persistence of DRB alleles within the Ursidae.

  13. Distribution and diversity of Verrucomicrobia methanotrophs in geothermal and acidic environments.

    PubMed

    Sharp, Christine E; Smirnova, Angela V; Graham, Jaime M; Stott, Matthew B; Khadka, Roshan; Moore, Tim R; Grasby, Stephen E; Strack, Maria; Dunfield, Peter F

    2014-06-01

    Recently, methanotrophic members of the phylum Verrucomicrobia have been described, but little is known about their distribution in nature. We surveyed methanotrophic bacteria in geothermal springs and acidic wetlands via pyrosequencing of 16S rRNA gene amplicons. Putative methanotrophic Verrucomicrobia were found in samples covering a broad temperature range (22.5-81.6°C), but only in acidic conditions (pH 1.8-5.0) and only in geothermal environments, not in acidic bogs or fens. Phylogenetically, three 16S rRNA gene sequence clusters of putative methanotrophic Verrucomicrobia were observed. Those detected in high-temperature geothermal samples (44.1-81.6°C) grouped with known thermoacidiphilic 'Methylacidiphilum' isolates. A second group dominated in moderate-temperature geothermal samples (22.5-40.1°C) and a representative mesophilic methanotroph from this group was isolated (strain LP2A). Genome sequencing verified that strain LP2A possessed particulate methane monooxygenase, but its 16S rRNA gene sequence identity to 'Methylacidiphilum infernorum' strain V4 was only 90.6%. A third group clustered distantly with known methanotrophic Verrucomicrobia. Using pmoA-gene targeted quantitative polymerase chain reaction, two geothermal soil profiles showed a dominance of LP2A-like pmoA sequences in the cooler surface layers and 'Methylacidiphilum'-like pmoA sequences in deeper, hotter layers. Based on these results, there appears to be a thermophilic group and a mesophilic group of methanotrophic Verrucomicrobia. However, both were detected only in acidic geothermal environments.

  14. High-throughput DNA sequencing of the moose rumen from different geographical locations reveals a core ruminal methanogenic archaeal diversity and a differential ciliate protozoal diversity

    PubMed Central

    Ishaq, Suzanne L.; Sundset, Monica A.; Crouse, John; Wright, André-Denis G.

    2015-01-01

    Moose rumen samples from Vermont, Alaska and Norway were investigated for methanogenic archaeal and protozoal density using real-time PCR, and diversity using high-throughput sequencing of the 16S and 18S rRNA genes. Vermont moose showed the highest protozoal and methanogen densities. Alaskan samples had the highest percentages of Methanobrevibacter smithii, followed by the Norwegian samples. One Norwegian sample contained 43 % Methanobrevibacter thaueri, whilst all other samples contained < 10 %. Vermont samples had large percentages of Methanobrevibacter ruminantium, as did two Norwegian samples. Methanosphaera stadtmanae represented one-third of sequences in three samples. Samples were heterogeneous based on gender, geographical location and weight class using analysis of molecular variance (AMOVA). Two Alaskan moose contained >70 % Polyplastron multivesiculatum and one contained >75 % Entodinium spp. Protozoa from Norwegian moose belonged predominantly (>50 %) to the genus Entodinium, especially Entodinium caudatum. Norwegian moose contained a large proportion of sequences (25–97 %) which could not be classified beyond family. Protozoa from Vermont samples were predominantly Eudiplodinium rostratum (>75 %), with up to 7 % Diploplastron affine. Four of the eight Vermont samples also contained 5–12 % Entodinium spp. Samples were heterogeneous based on AMOVA, principal coordinate analysis and UniFrac. This study gives the first insight into the methanogenic archaeal diversity in the moose rumen. The high percentage of rumen archaeal species associated with high starch diets found in Alaskan moose corresponds well to previous data suggesting that they feed on plants high in starch. Similarly, the higher percentage of species related to forage diets in Vermont moose also relates well to their higher intake of fibre. PMID:28348818

  15. High-throughput DNA sequencing of the moose rumen from different geographical locations reveals a core ruminal methanogenic archaeal diversity and a differential ciliate protozoal diversity.

    PubMed

    Ishaq, Suzanne L; Sundset, Monica A; Crouse, John; Wright, André-Denis G

    2015-10-01

    Moose rumen samples from Vermont, Alaska and Norway were investigated for methanogenic archaeal and protozoal density using real-time PCR, and diversity using high-throughput sequencing of the 16S and 18S rRNA genes. Vermont moose showed the highest protozoal and methanogen densities. Alaskan samples had the highest percentages of Methanobrevibacter smithii, followed by the Norwegian samples. One Norwegian sample contained 43 % Methanobrevibacter thaueri, whilst all other samples contained < 10 %. Vermont samples had large percentages of Methanobrevibacter ruminantium, as did two Norwegian samples. Methanosphaera stadtmanae represented one-third of sequences in three samples. Samples were heterogeneous based on gender, geographical location and weight class using analysis of molecular variance (AMOVA). Two Alaskan moose contained >70 % Polyplastron multivesiculatum and one contained >75 % Entodinium spp. Protozoa from Norwegian moose belonged predominantly (>50 %) to the genus Entodinium, especially Entodinium caudatum. Norwegian moose contained a large proportion of sequences (25-97 %) which could not be classified beyond family. Protozoa from Vermont samples were predominantly Eudiplodinium rostratum (>75 %), with up to 7 % Diploplastron affine. Four of the eight Vermont samples also contained 5-12 % Entodinium spp. Samples were heterogeneous based on AMOVA, principal coordinate analysis and UniFrac. This study gives the first insight into the methanogenic archaeal diversity in the moose rumen. The high percentage of rumen archaeal species associated with high starch diets found in Alaskan moose corresponds well to previous data suggesting that they feed on plants high in starch. Similarly, the higher percentage of species related to forage diets in Vermont moose also relates well to their higher intake of fibre.

  16. Complete Genome Sequence of Rhodococcus sp. Strain IcdP1 Shows Diverse Catabolic Potential

    PubMed Central

    Qu, Jie; Miao, Li-Li; Liu, Ying

    2015-01-01

    The complete genome sequence of Rhodococcus sp. strain IcdP1 is presented here. This organism was shown to degrade a broad range of high-molecular-weight polycyclic aromatic hydrocarbons and organochlorine pesticides. The sequence data can be used to predict genes for xenobiotic biodegradation and metabolism. PMID:26139718

  17. Genome-wide survey of genetic diversity of apple using genotyping-by-sequencing

    Technology Transfer Automated Retrieval System (TEKTRAN)

    With the advent of next-generation sequencing technologies it is now possible to generate large numbers of genetic markers without the need to rely on costly microarray platforms. Genotyping-by-Sequencing (GBS) enables the simultaneous identification and genotyping of single nucleotide polymorphisms...

  18. Amino acid sequences of two nonspecific lipid-transfer proteins from germinated castor bean.

    PubMed

    Takishima, K; Watanabe, S; Yamada, M; Suga, T; Mamiya, G

    1988-11-01

    The amino acid sequences of two nonspecific lipid-transfer proteins (nsLTP), B and C, from germinated castor bean seeds have been determined. Both proteins consist of 92 residues, as for the nsLTP previously reported, and their calculated Mr values are 9847 and 9593 for nsLTP-B and nsLTP-C, respectively. The sequences of nsLTP-B and nsLTP-C, compared to the known sequence of nsLTP-A from the same source, are 68% and 35% similar, respectively. No variation was found at the positions of the cysteine residues, indicating that they might be involved in disulfide bridges.

  19. Genome sequence analysis of five Canadian isolates of strawberry mottle virus reveals extensive intra-species diversity and a longer RNA2 with increased coding capacity compared to a previously characterized European isolate.

    PubMed

    Bhagwat, Basdeo; Dickison, Virginia; Ding, Xinlun; Walker, Melanie; Bernardy, Michael; Bouthillier, Michel; Creelman, Alexa; DeYoung, Robyn; Li, Yinzi; Nie, Xianzhou; Wang, Aiming; Xiang, Yu; Sanfaçon, Hélène

    2016-06-01

    In this study, we report the genome sequence of five isolates of strawberry mottle virus (family Secoviridae, order Picornavirales) from strawberry field samples with decline symptoms collected in Eastern Canada. The Canadian isolates differed from the previously characterized European isolate 1134 in that they had a longer RNA2, resulting in a 239-amino-acid extension of the C-terminal region of the polyprotein. Sequence analysis suggests that reassortment and recombination occurred among the isolates. Phylogenetic analysis revealed that the Canadian isolates are diverse, grouping in two separate branches along with isolates from Europe and the Americas.

  20. Metabolic reprogramming by the pyruvate dehydrogenase kinase-lactic acid axis: Linking metabolism and diverse neuropathophysiologies.

    PubMed

    Jha, Mithilesh Kumar; Lee, In-Kyu; Suk, Kyoungho

    2016-09-01

    Emerging evidence indicates that there is a complex interplay between metabolism and chronic disorders in the nervous system. In particular, the pyruvate dehydrogenase (PDH) kinase (PDK)-lactic acid axis is a critical link that connects metabolic reprogramming and the pathophysiology of neurological disorders. PDKs, via regulation of PDH complex activity, orchestrate the conversion of pyruvate either aerobically to acetyl-CoA, or anaerobically to lactate. The kinases are also involved in neurometabolic dysregulation under pathological conditions. Lactate, an energy substrate for neurons, is also a recently acknowledged signaling molecule involved in neuronal plasticity, neuron-glia interactions, neuroimmune communication, and nociception. More recently, the PDK-lactic acid axis has been recognized to modulate neuronal and glial phenotypes and activities, contributing to the pathophysiologies of diverse neurological disorders. This review covers the recent advances that implicate the PDK-lactic acid axis as a novel linker of metabolism and diverse neuropathophysiologies. We finally explore the possibilities of employing the PDK-lactic acid axis and its downstream mediators as putative future therapeutic strategies aimed at prevention or treatment of neurological disorders.

  1. A classification of glycosyl hydrolases based on amino acid sequence similarities.

    PubMed Central

    Henrissat, B

    1991-01-01

    The amino acid sequences of 301 glycosyl hydrolases and related enzymes have been compared. A total of 291 sequences corresponding to 39 EC entries could be classified into 35 families. Only ten sequences (less than 5% of the sample) could not be assigned to any family. With the sequences available for this analysis, 18 families were found to be monospecific (containing only one EC number) and 17 were found to be polyspecific (containing at least two EC numbers). Implications on the folding characteristics and mechanism of action of these enzymes and on the evolution of carbohydrate metabolism are discussed. With the steady increase in sequence and structural data, it is suggested that the enzyme classification system should perhaps be revised. PMID:1747104

  2. In silico comparative analysis of DNA and amino acid sequences for prion protein gene.

    PubMed

    Kim, Y; Lee, J; Lee, C

    2008-01-01

    Genetic variability might contribute to species specificity of prion diseases in various organisms. In this study, structures of the prion protein gene (PRNP) and its amino acids were compared among species for which sequence data were available. Comparisons of PRNP DNA sequences among 12 species including human, chimpanzee, monkey, bovine, ovine, dog, mouse, rat, wallaby, opossum, chicken and zebrafish allowed us to identify candidate regulatory regions in intron 1 and the 3'-untranslated region (UTR) in addition to the coding region. Highly conserved putative binding sites for transcription factors, such as heat shock factor 2 (HSF2) and myocyte enhancer factor 2 (MEF2), were discovered in intron 1. In the 3'-UTR, the functional sequence (ATTAAA) for nucleus-specific polyadenylation was found in all the analysed species. The functional sequence (TTTTTAT) for maturation-specific polyadenylation was observed identically only in ovine, with one or two nucleotide mismatches in the other species. A comparison of the amino acid sequences in 53 species revealed a large sequence identity. In particular, the octapeptide repeat region was observed in all the species except frog and zebrafish. Functional changes and susceptibility to prion diseases with various isoforms of prion protein could be caused by numeric variability and conformational changes discovered in the repeat sequences.

  3. Sequence Based Structural Characterization and Genetic Diversity Analysis of Full Length TLR4 CDS in Crossbred and Indigenous Cattle.

    PubMed

    Mishra, Chinmoy; Kumar, Subodh; Sonwane, Arvind Asaram; Yathish, H M; Chaudhary, Rajni

    2017-01-02

    The exploration of candidate genes for immune response in cattle may be vital for improving our understanding of the species-specific response to pathogens. Toll-like receptor 4 (TLR4) is mostly involved in protection against the deleterious effects of Gram negative pathogens. An approximately 2.6 kb long cDNA sequence of the TLR4 gene covering the entire coding region was characterized in two Indian milk cattle (Vrindavani and Tharparkar). The phylogenetic analysis confirmed that the bovine TLR4 apparently evolved from an ancestral form that predated the appearance of vertebrates, and it is grouped with buffalo, yak, and mithun TLR4s. Sequence analysis revealed a 2526-nucleotide long open reading frame (ORF) encoding 841 amino acids, similar to other cattle breeds. The calculated molecular weight of the translated ORF was 96144 and 96040.9 Da; the isoelectric point was 6.35 and 6.42 in Vrindavani and Tharparkar cattle, respectively. The Simple Modular Architecture Research Tool (SMART) analysis identified 14 leucine rich repeat (LRR) motifs in bovine TLR4 protein. The deduced TLR4 amino acid sequence of Tharparkar had 4 different substitutions as compared to Bos taurus, Sahiwal, and Vrindavani. The signal peptide cleavage site was predicted to lie between the 16th and 17th amino acids of the mature peptide. The transmembrane helix was identified between amino acids 635 and 657 in the mature peptide.

  4. Complete amino acid sequence of the N-terminal extension of calf skin type III procollagen.

    PubMed Central

    Brandt, A; Glanville, R W; Hörlein, D; Bruckner, P; Timpl, R; Fietzek, P P; Kühn, K

    1984-01-01

    The N-terminal extension peptide of type III procollagen, isolated from foetal-calf skin, contains 130 amino acid residues. To determine its amino acid sequence, the peptide was reduced and carboxymethylated or aminoethylated and fragmented with trypsin, Staphylococcus aureus V8 proteinase and bacterial collagenase. Pyroglutamate aminopeptidase was used to deblock the N-terminal collagenase fragment to enable amino acid sequencing. The type III collagen extension peptide is homologous to that of the alpha 1 chain of type I procollagen with respect to a three-domain structure. The N-terminal 79 amino acids, which contain ten of the 12 cysteine residues, form a compact globular domain. The next 39 amino acids are in a collagenous triplet sequence (Gly-Xaa-Yaa)n with a high hydroxyproline content. Finally, another short non-collagenous domain of 12 amino acids ends at the cleavage site for procollagen aminopeptidase, which cleaves a proline-glutamine bond. In contrast with type I procollagen, the type III procollagen extension peptides contain interchain disulphide bridges located at the C-terminus of the triple-helical domain. PMID:6331392

  5. 37 CFR 1.824 - Form and format for nucleotide and/or amino acid sequence submissions in computer readable form.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... nucleotide and/or amino acid sequence submissions in computer readable form. 1.824 Section 1.824 Patents... And/or Amino Acid Sequences § 1.824 Form and format for nucleotide and/or amino acid sequence... readable form may be created by any means, such as word processors, nucleotide/amino acid sequence...

  6. 37 CFR 1.824 - Form and format for nucleotide and/or amino acid sequence submissions in computer readable form.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... nucleotide and/or amino acid sequence submissions in computer readable form. 1.824 Section 1.824 Patents... And/or Amino Acid Sequences § 1.824 Form and format for nucleotide and/or amino acid sequence... readable form may be created by any means, such as word processors, nucleotide/amino acid sequence...

  7. 37 CFR 1.824 - Form and format for nucleotide and/or amino acid sequence submissions in computer readable form.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... nucleotide and/or amino acid sequence submissions in computer readable form. 1.824 Section 1.824 Patents... And/or Amino Acid Sequences § 1.824 Form and format for nucleotide and/or amino acid sequence... readable form may be created by any means, such as word processors, nucleotide/amino acid sequence...

  8. Diversity analysis in Cannabis sativa based on large-scale development of expressed sequence tag-derived simple sequence repeat markers.

    PubMed

    Gao, Chunsheng; Xin, Pengfei; Cheng, Chaohua; Tang, Qing; Chen, Ping; Wang, Changbiao; Zang, Gonggu; Zhao, Lining

    2014-01-01

    Cannabis sativa L. is an important economic plant for the production of food, fiber, oils, and intoxicants. However, lack of sufficient simple sequence repeat (SSR) markers has limited the development of cannabis genetic research. Here, large-scale development of expressed sequence tag simple sequence repeat (EST-SSR) markers was performed to obtain more informative genetic markers, and to assess genetic diversity in cannabis (Cannabis sativa L.). Based on the cannabis transcriptome, 4,577 SSRs were identified from 3,624 ESTs. From there, a total of 3,442 complementary primer pairs were designed as SSR markers. Among these markers, trinucleotide repeat motifs (50.99%) were the most abundant, followed by hexanucleotide (25.13%), dinucleotide (16.34%), tetranucleotide (3.8%), and pentanucleotide (3.74%) repeat motifs, respectively. The AAG/CTT trinucleotide repeat (17.96%) was the most abundant motif detected in the SSRs. One hundred and seventeen EST-SSR markers were randomly selected to evaluate primer quality in 24 cannabis varieties. Among these 117 markers, 108 (92.31%) were successfully amplified and 87 (74.36%) were polymorphic. Forty-five polymorphic primer pairs were selected to evaluate genetic diversity and relatedness among the 115 cannabis genotypes. The results showed that 115 varieties could be divided into 4 groups primarily based on geography: Northern China, Europe, Central China, and Southern China. Moreover, the coefficient of similarity when comparing cannabis from Northern China with the European group cannabis was higher than that when comparing with cannabis from the other two groups, owing to a similar climate. This study outlines the first large-scale development of SSR markers for cannabis. These data may serve as a foundation for the development of genetic linkage, quantitative trait loci mapping, and marker-assisted breeding of cannabis.

  9. Diversity of anaerobic gut fungal populations analysed using ribosomal ITS1 sequences in faeces of wild and domesticated herbivores.

    PubMed

    Nicholson, Matthew J; McSweeney, Christopher S; Mackie, Roderick I; Brookman, Jayne L; Theodorou, Michael K

    2010-04-01

    Gut fungal-specific PCR primers have been used to selectively amplify the ITS1 region of gut fungal rDNA recovered from faeces of domestic and wild animals to investigate population diversity. Two different gel-based methods are described for separating populations of gut fungal rDNA amplicons, namely (1) denaturing gradient gel electrophoresis (DGGE) and (2) separation according to small size differences using Spreadex, a proprietary matrix for electrophoresis. Gut fungal populations were characterised by analysis of rDNA in faeces of seventeen domesticated and ten wild herbivores. Sequences derived from these gel-based characterisations were analysed and classified using a hidden Markov model-based fingerprint matching algorithm. Faecal samples contained a broad spectrum of fungi and sequences from five of the six recognised genera were identified, including Cyllamyces, the most recently described gut fungal genus, which was found to be widely distributed in the samples. Furthermore, four other novel groupings of gut fungal sequences were identified that did not cluster with sequences from any of the previously described genera. Both gel- and sequence- based profiles for gut fungal populations suggested a lack of geographical restriction on occurrence of any individual fungal type.

  10. Complete amino acid sequence of branched-chain amino acid aminotransferase (transaminase B) of Salmonella typhimurium, identification of the coenzyme-binding site and sequence comparison analysis

    SciTech Connect

    Feild, M.J.

    1988-01-01

    The complete amino acid sequence of the subunit of branched-chain amino acid aminotransferase of Salmonella typhimurium was determined by automated Edman degradation of peptide fragments generated by chemical and enzymatic digestion of S-carboxymethylated and S-pyridylethylated transaminase B. Peptide fragments of transaminase B were generated by treatment of the enzyme with trypsin, Staphylococcus aureus V8 protease, endoproteinase Lys-C, and cyanogen bromide. Protocols were developed for separation of the peptide fragments by reverse-phase high performance liquid chromatography (HPLC), ion-exchange HPLC, and SDS-urea gel electrophoresis. The enzyme subunit contains 308 amino acid residues and has a molecular weight of 33,920 daltons. The coenzyme-binding site was determined by treatment of the enzyme, containing bound pyridoxal 5-phosphate, with tritiated sodium borohydride prior to trypsin digestion. Monitoring radioactivity incorporation and peptide map comparisons with an apoenzyme tryptic digest, allowed identification of the pyridoxylated-peptide which was isolated by reverse-phase HPLC and sequenced. The coenzyme-binding site is a lysyl residue at position 159. Some peptides were further characterized by fast atom bombardment mass spectrometry.

  11. Genetic diversity of Taenia asiatica from Thailand and other geographical locations as revealed by cytochrome c oxidase subunit 1 sequences.

    PubMed

    Anantaphruti, Malinee Thairungroj; Thaenkham, Urusa; Watthanakulpanich, Dorn; Phuphisut, Orawan; Maipanich, Wanna; Yoonuan, Tippayarat; Nuamtanong, Supaporn; Pubampen, Somjit; Sanguankiat, Surapol

    2013-02-01

    Twelve 924 bp cytochrome c oxidase subunit 1 (cox1) mitochondrial DNA sequences from Taenia asiatica isolates from Thailand were aligned and compared with multiple sequence isolates from Thailand and 6 other countries from the GenBank database. The genetic divergence of T. asiatica was also compared with Taenia saginata database sequences from 6 different countries in Asia, including Thailand, and 3 countries from other continents. The results showed that there were minor genetic variations within T. asiatica species, while high intraspecies variation was found in T. saginata. There were only 2 haplotypes and 1 polymorphic site found in T. asiatica, but 8 haplotypes and 9 polymorphic sites in T. saginata. Haplotype diversity was very low, 0.067, in T. asiatica and high, 0.700, in T. saginata. The very low genetic diversity suggested that T. asiatica may be at risk due to the loss of potential adaptive alleles, resulting in reduced viability and decreased responses to environmental changes, which may endanger the species.
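
    The haplotype diversity values quoted above are commonly computed with Nei's estimator, h = n/(n-1) * (1 - sum(p_i^2)). The following is a minimal Python sketch of that formula, assuming this is the estimator in question; the abstract does not state it. The function name and the sample counts (30 sequences, one of them a singleton haplotype) are hypothetical and are not taken from the study.

```python
from collections import Counter


def haplotype_diversity(haplotypes):
    """Nei's haplotype diversity: h = n/(n-1) * (1 - sum(p_i^2)).

    `haplotypes` is a list with one haplotype label per sequence.
    """
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two sequences")
    counts = Counter(haplotypes)
    # Sum of squared haplotype frequencies.
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1 - sum_sq)


# Hypothetical sample (NOT the study's data): 29 sequences share one
# haplotype and a single sequence carries a second haplotype.
sample = ["H1"] * 29 + ["H2"]
print(round(haplotype_diversity(sample), 3))  # 0.067
```

    With two haplotypes of which one is a singleton, the estimator reduces to 2/n, so a value near 0.067 is what a sample of about 30 sequences would give under that assumption.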

  12. Diversity in prokaryotic glycosylation: an archaeal-derived N-linked glycan contains legionaminic acid.

    PubMed

    Kandiba, Lina; Aitio, Olli; Helin, Jari; Guan, Ziqiang; Permi, Perttu; Bamford, Dennis H; Eichler, Jerry; Roine, Elina

    2012-05-01

    VP4, the major structural protein of the haloarchaeal pleomorphic virus, HRPV-1, is glycosylated. To define the glycan structure attached to this protein, oligosaccharides released by β-elimination were analysed by mass spectrometry and nuclear magnetic resonance spectroscopy. Such analyses showed that the major VP4-derived glycan is a pentasaccharide comprising glucose, glucuronic acid, mannose, sulphated glucuronic acid and a terminal 5-N-formyl-legionaminic acid residue. This is the first observation of legionaminic acid, a sialic acid-like sugar, in an archaeal-derived glycan structure. The importance of this residue for viral infection was demonstrated upon incubation with N-acetylneuraminic acid, a similar monosaccharide. Such treatment reduced progeny virus production by half 4 h post infection. LC-ESI/MS analysis confirmed the presence of pentasaccharide precursors on two different VP4-derived peptides bearing the N-glycosylation signal, NTT. The same sites modified by the native host, Halorubrum sp. strain PV6, were also recognized by the Haloferax volcanii N-glycosylation apparatus, as determined by LC-ESI/MS of heterologously expressed VP4. Here, however, the N-linked pentasaccharide was the same as shown to decorate the S-layer glycoprotein in this species. Hence, N-glycosylation of the haloarchaeal viral protein, VP4, is host-specific. These results thus present additional examples of archaeal N-glycosylation diversity and show the ability of Archaea to modify heterologously expressed proteins.

  13. Recombination sequences in plant mitochondrial genomes: diversity and homologies to known mitochondrial genes.

    PubMed Central

    Stern, D B; Palmer, J D

    1984-01-01

    Several plant mitochondrial genomes contain repeated sequences that are postulated to be sites of homologous intragenomic recombination (1-3). In this report, we have used filter hybridizations to investigate sequence relationships between the cloned mitochondrial DNA (mtDNA) recombination repeats from turnip, spinach and maize and total mtDNA isolated from thirteen species of angiosperms. We find that strong sequence homologies exist between the spinach and turnip recombination repeats and essentially all other mitochondrial genomes tested, whereas a major maize recombination repeat does not hybridize to any other mtDNA. The sequences homologous to the turnip repeat do not appear to function in recombination in any other genome, whereas the spinach repeat hybridizes to reiterated sequences within the mitochondrial genomes of wheat and two species of pokeweed that do appear to be sites of recombination. Thus, although intragenomic recombination is a widespread phenomenon in plant mitochondria, it appears that different sequences either serve as substrates for this function in different species, or else surround a relatively short common recombination site which does not cross-hybridize under our experimental conditions. Identified gene sequences from maize mtDNA were used in heterologous hybridizations to show that the repeated sequences implicated in recombination in turnip and spinach/pokeweed/wheat mitochondria include, or are closely linked to genes for subunit II of cytochrome c oxidase and 26S rRNA, respectively. Together with previous studies indicating that the 18S rRNA gene in wheat mtDNA is contained within a recombination repeat (3), these results imply an unexpectedly frequent association between recombination repeats and plant mitochondrial genes. PMID:6473104

  14. Genetic diversity of wild Prunus cerasifera Ehrhart (wild cherry plum) in China revealed by simple-sequence repeat markers.

    PubMed

    Zhao, Y; Li, Y; Liu, Y; Yang, Y F

    2015-07-28

    Simple-sequence repeat (SSR) markers were employed to assess the genetic diversity of wild Prunus cerasifera Ehrhart (wild cherry plum) in China. Fourteen SSR primer pairs generated a total of 94 alleles (90 were polymorphic, accounting for 95.74%), with a mean of 6.71 alleles per locus. The number of alleles detected at each locus ranged from 2 at BPPCT 028 to 13 at BPPCT 002. Nei's genetic diversity ranged from 0.0938 to 0.4951 and Shannon's information index ranged from 0.1706 to 0.6882, with averages of 0.3295 and 0.4899, respectively. The SSR data indicated moderate genetic diversity of P. cerasifera in China. In the unweighted pair group method with arithmetic mean phylogenetic tree, the 40 forms of P. cerasifera were divided into 3 genetic clusters. However, the 3 clades determined using SSR data were not consistent with the classification based on morphological characters, such as fruit color. Because of the endangered status and the moderate genetic diversity of P. cerasifera in China, both in situ and ex situ conservation strategies should be adopted.

  15. Genetic diversity and variance of Stentor coeruleus (Ciliophora: Heterotrichea) inferred from inter-simple sequence repeat (ISSR) fingerprinting.

    PubMed

    Zhang, Wen-Jing; Lin, Yuan-Shao; Cao, Wen-Qing; Yang, Jun

    2012-01-01

    We used inter-simple sequence repeat fingerprinting to analyze the genetic structure of 16 populations of Stentor coeruleus from three lakes and three ponds in China. Using 14 polymorphic primers, a total of 99 discernible DNA fragments were detected, among which 76 (76.77%) were polymorphic, indicating median genetic diversity in these populations. Further, both Nei's gene diversity (h) and Shannon's information index (I) between the different populations revealed a median genetic diversity. At the same time, gene flow was interpreted to be low. The median level of diversity and low gene flow within populations are probably due to a low frequency of sexual recombination. Analysis of molecular variance showed that there was high genetic differentiation among the five water bodies. Both cluster analysis and a nonmetric multidimensional scaling analysis suggested that genotypes isolated from the same locations displayed a higher genetic similarity than those from different ones, separating populations into subgroups according to their geographical locations. However, there is a weak positive correlation between the genetic distance and geographical distance.

  16. Genetic Diversity of Arabica Coffee (Coffea arabica L.) in Nicaragua as Estimated by Simple Sequence Repeat Markers

    PubMed Central

    Geleta, Mulatu; Herrera, Isabel; Monzón, Arnulfo; Bryngelsson, Tomas

    2012-01-01

    Coffea arabica L. (arabica coffee), the only tetraploid species in the genus Coffea, accounts for the majority of the world's coffee production and contributes significantly to Nicaragua's economy. The present study was conducted to determine the genetic diversity of arabica coffee in Nicaragua for its conservation and breeding values. Twenty-six populations that represent eight varieties in Nicaragua were investigated using simple sequence repeat (SSR) markers. A total of 24 alleles were obtained from the 12 loci investigated across 260 individual plants. The total Nei's gene diversity (HT) and the within-population gene diversity (HS) were 0.35 and 0.29, respectively, which is comparable with that previously reported from other countries and regions. Among the varieties, the highest diversity was recorded in the variety Catimor. Analysis of molecular variance (AMOVA) revealed that about 87% of the total genetic variation was found within populations and the remaining 13% differentiated the populations (FST = 0.13; P < 0.001). The variation among the varieties was also significant. The genetic variation in Nicaraguan coffee is significant enough to be used in the breeding programs, and most of this variation can be conserved through ex situ conservation of a low number of populations from each variety. PMID:22701376

  17. The amino acid sequence of cytochromes c-551 from three species of Pseudomonas

    PubMed Central

    Ambler, R. P.; Wynn, Margaret

    1973-01-01

    The amino acid sequences of the cytochromes c-551 from three species of Pseudomonas have been determined. Each resembles the protein from Pseudomonas strain P6009 (now known to be Pseudomonas aeruginosa, not Pseudomonas fluorescens) in containing 82 amino acids in a single peptide chain, with a haem group covalently attached to cysteine residues 12 and 15. In all four sequences 43 residues are identical. Although by bacteriological criteria the organisms are closely related, the differences between pairs of sequences range from 22% to 39%. These values should be compared with the differences in the sequence of mitochondrial cytochrome c between mammals and amphibians (about 18%) or between mammals and insects (about 33%). Detailed evidence for the amino acid sequences of the proteins has been deposited as Supplementary Publication SUP 50015 at the National Lending Library for Science and Technology, Boston Spa, Yorks. LS23 7BQ, U.K., from whom copies can be obtained on the terms indicated in Biochem. J. (1973), 131, 5. PMID:4352718
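
    The pairwise differences quoted above (22% to 39%) are percentages of aligned positions at which two sequences disagree. A minimal Python sketch of that calculation follows, assuming ungapped sequences of equal length; the function name and the two short peptides are hypothetical and are not the cytochrome c-551 sequences.

```python
def percent_difference(seq_a, seq_b):
    """Percentage of aligned positions at which two sequences differ.

    Assumes the sequences are already aligned without gaps and have
    equal length.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    mismatches = sum(a != b for a, b in zip(seq_a, seq_b))
    return 100.0 * mismatches / len(seq_a)


# Hypothetical 10-residue peptides differing at two positions.
peptide_a = "EDPEVLFKNK"
peptide_b = "EDPAVLFKSK"
print(percent_difference(peptide_a, peptide_b))  # 20.0
```

    For an 82-residue chain, 18 differing positions correspond to about 22% and 32 differing positions to about 39%.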

  18. Draft Genome Sequence of Sorghum Grain Mold Fungus Epicoccum sorghinum, a Producer of Tenuazonic Acid

    PubMed Central

    Oliveira, Rodrigo C.; Davenport, Karen W.; Hovde, Blake; Silva, Danielle; Chain, Patrick S. G.; Correa, Benedito

    2017-01-01

    ABSTRACT The facultative plant pathogen Epicoccum sorghinum is associated with grain mold of sorghum and produces the mycotoxin tenuazonic acid. This fungus can have serious economic impact on sorghum production. Here, we report the draft genome sequence of E. sorghinum (USPMTOX48). PMID:28126937

  19. Snake venom. The amino acid sequence of protein A from Dendroaspis polylepis polylepis (black mamba) venom.

    PubMed

    Joubert, F J; Strydom, D J

    1980-12-01

    Protein A from Dendroaspis polylepis polylepis venom comprises 81 amino acids, including ten half-cystine residues. The complete primary structures of protein A and its variant A' were elucidated. The sequences of proteins A and A', which differ in a single position, show no homology with various neurotoxins and non-neurotoxic proteins and represent a new type of elapid venom protein.

  20. Draft Genome Sequence of Bacillus coagulans NL01, a Wonderful l-Lactic Acid Producer

    PubMed Central

    Zheng, Zhaojuan; Jiang, Ting; Lin, Xi; Zhou, Jie

    2015-01-01

    Here, we report the draft genome sequence of Bacillus coagulans NL01, which can produce l-lactic acid of high optical purity using xylose as a sole carbon source. The draft genome is 3,505,081 bp, with 144 contigs. About 3,903 protein-coding genes and 92 rRNAs are predicted from this assembly. PMID:26089419

  1. Diverse Array of New Viral Sequences Identified in Worldwide Populations of the Asian Citrus Psyllid (Diaphorina citri) Using Viral Metagenomics

    PubMed Central

    Nouri, Shahideh; Salem, Nidá; Nigg, Jared C.

    2015-01-01

    ABSTRACT The Asian citrus psyllid, Diaphorina citri, is the natural vector of the causal agent of Huanglongbing (HLB), or citrus greening disease. Together, HLB and D. citri represent a major threat to world citrus production. As there is no cure for HLB, insect vector management is considered one strategy to help control the disease, and D. citri viruses might be useful. In this study, we used a metagenomic approach to analyze viral sequences associated with the global population of D. citri. By sequencing small RNAs and the transcriptome coupled with bioinformatics analysis, we showed that the virus-like sequences of D. citri are diverse. We identified novel viral sequences belonging to the picornavirus superfamily, the Reoviridae, Parvoviridae, and Bunyaviridae families, and an unclassified positive-sense single-stranded RNA virus. Moreover, a Wolbachia prophage-related sequence was identified. This is the first comprehensive survey to assess the viral community from worldwide populations of an agricultural insect pest. Our results provide valuable information on new putative viruses, some of which may have the potential to be used as biocontrol agents. IMPORTANCE Insects have the most species of all animals, and are hosts to, and vectors of, a great variety of known and unknown viruses. Some of these most likely have the potential to be important fundamental and/or practical resources. In this study, we used high-throughput next-generation sequencing (NGS) technology and bioinformatics analysis to identify putative viruses associated with Diaphorina citri, the Asian citrus psyllid. D. citri is the vector of the bacterium causing Huanglongbing (HLB), currently the most serious threat to citrus worldwide. Here, we report several novel viral sequences associated with D. citri. PMID:26676774

  2. Evolution of phosphagen kinase V. cDNA-derived amino acid sequences of two molluscan arginine kinases from the chiton Liolophura japonica and the turbanshell Battilus cornutus.

    PubMed

    Suzuki, T; Ban, T; Furukohri, T

    1997-06-20

    The cDNAs of arginine kinases from the chiton Liolophura japonica (Polyplacophora) and the turbanshell Battilus cornutus (Gastropoda) were amplified by polymerase chain reaction (PCR), and the complete nucleotide sequences of 1669 and 1624 bp, respectively, were determined. The open reading frame for Liolophura arginine kinase is 1050 nucleotides in length and encodes a protein with 349 amino acid residues, and that for Battilus is 1077 nucleotides and 358 residues. The validity of the cDNA-derived amino acid sequence was supported by chemical sequencing of internal tryptic peptides. The molecular masses were calculated to be 39,057 and 39,795 Da, respectively. The amino acid sequence of Liolophura arginine kinase showed 65-68% identity with those of Battilus and Nordotis (abalone) arginine kinases, and the homology between Battilus and Nordotis was 79%. Molluscan arginine kinases also show lower, but significant homology (38-43%) with rabbit creatine kinase. The sequences of arginine kinases could be used as a molecular clock to elucidate the phylogeny of Mollusca, one of the most diverse animal phyla.
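
    The ORF lengths and residue counts reported above are related by simple codon arithmetic: three nucleotides per codon, with the final codon being a stop that encodes no residue. A minimal Python sketch of that check follows, assuming the quoted ORF lengths include the stop codon; the function name is hypothetical.

```python
def residues_from_orf(orf_nt, includes_stop=True):
    """Number of amino acid residues encoded by an ORF of `orf_nt` nucleotides.

    Assumes a complete ORF whose length is a multiple of three. When
    `includes_stop` is True, the terminal stop codon is not counted.
    """
    if orf_nt <= 0 or orf_nt % 3 != 0:
        raise ValueError("ORF length must be a positive multiple of 3")
    codons = orf_nt // 3
    return codons - 1 if includes_stop else codons


# ORF lengths taken from the abstract above.
for name, orf_nt in [("Liolophura", 1050), ("Battilus", 1077)]:
    print(name, residues_from_orf(orf_nt))
# Liolophura 349
# Battilus 358
```

    Both results agree with the residue counts given in the abstract (349 and 358).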

  3. Multilocus sequence typing approach for a broader range of species of Leishmania genus: describing parasite diversity in Argentina.

    PubMed

    Marco, Jorge D; Barroso, Paola A; Locatelli, Fabricio M; Cajal, S Pamela; Hoyos, Carlos L; Nevot, M Cecilia; Lauthier, Juan J; Tomasini, Nicolás; Juarez, Marisa; Estévez, J Octavio; Korenaga, Masataka; Nasser, Julio R; Hashiguchi, Yoshihisa; Ruybal, Paula

    2015-03-01

    Leishmaniasis is a vector-borne protozoan infection affecting over 350 million people around the world. In Argentina cutaneous leishmaniasis is endemic in nine provinces and visceral leishmaniasis is spreading from autochthonous transmission foci in seven provinces. However, there is limited information about the diversity of the parasite in this country. Implementation of molecular strategies for parasite typing, particularly multilocus sequence typing (MLST), represents an improved approach for genetic variability and population dynamics analyses. We selected six loci as candidates implemented in reference strains and Argentinean isolates. Phylogenetic analysis showed high correlation with taxonomic classification of the parasite. Autochthonous Leishmania (Viannia) braziliensis showed higher genetic diversity than L. (Leishmania) infantum, but low support was obtained for intra-L. braziliensis complex variants, suggesting the need for new loci that contribute to phylogenetic resolution for an improved MLST or nested-MLST scheme. This study represents the first characterization of genetic variability of Leishmania spp. in Argentina.

  4. NextGen sequencing reveals short double crossovers contribute disproportionately to genetic diversity in Toxoplasma gondii

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Toxoplasma gondii is a widespread protozoan parasite of animals that causes zoonotic disease in humans. Three clonal variants predominate in North America and Europe, while South American strains are genetically diverse, and undergo more frequent recombination. All three northern clonal variants s...

  5. Unusually high genetic diversity in COI sequences of Chimarra obscura (Trichoptera: Philopotamidae)

    EPA Science Inventory

    Chimarra obscura (Walker 1852) is a philopotamid caddisfly found throughout much of North America. Using the COI DNA barcode locus, we have found unexpectedly high amounts of genetic diversity and distances within C. obscura. Of the approximately 150 specimens sampled, we have fo...

  6. Genetic Diversity in Lens Species Revealed by EST and Genomic Simple Sequence Repeat Analysis.

    PubMed

    Dikshit, Harsh Kumar; Singh, Akanksha; Singh, Dharmendra; Aski, Muraleedhar Sidaram; Prakash, Prapti; Jain, Neelu; Meena, Suresh; Kumar, Shiv; Sarker, Ashutosh

    2015-01-01

    Low productivity of pilosae type lentils grown in South Asia is attributed to the narrow genetic base of the released cultivars, which results in susceptibility to biotic and abiotic stresses. To enhance productivity and production, broadening of the genetic base is essential. The genetic base of released cultivars can be broadened by using diverse types including bold seeded and early maturing lentils from the Mediterranean region and related wild species. Genetic diversity in eighty six accessions of three species of genus Lens was assessed based on twelve genomic and thirty one EST-SSR markers. The evaluated set of genotypes included diverse lentil varieties and advanced breeding lines from Indian programme, two early maturing ICARDA lines and five related wild subspecies/species endemic to the Mediterranean region. Genomic SSRs exhibited higher polymorphism in comparison to EST SSRs. GLLC 598 produced 5 alleles with highest gene diversity value of 0.80. Among the studied subspecies/species 43 SSRs detected maximum number of alleles in L. orientalis. Based on Nei's genetic distance cultivated lentil L. culinaris subsp. culinaris was found to be close to its wild progenitor L. culinaris subsp. orientalis. The Prichard's structure of 86 genotypes distinguished different subspecies/species. Higher variability was recorded among individuals within population than among populations.

  7. Amino acid sequences of heterotrophic and photosynthetic ferredoxins from the tomato plant (Lycopersicon esculentum Mill.).

    PubMed

    Kamide, K; Sakai, H; Aoki, K; Sanada, Y; Wada, K; Green, L S; Yee, B C; Buchanan, B B

    1995-11-01

    Several forms (isoproteins) of ferredoxin in roots, leaves, and green and red pericarps in tomato plants (Lycopersicon esculentum Mill.) were earlier identified on the basis of N-terminal amino acid sequence and chromatographic behavior (Green et al. 1991). In the present study, a large scale preparation made possible determination of the full length amino acid sequence of the two ferredoxins from leaves. The ferredoxins characteristic of fruit and root were sequenced from the amino terminus to the 30th residue or beyond. The leaf ferredoxins were confirmed to be expressed in pericarp of both green and red fruit. The ferredoxins characteristic of fruit and root appeared to be restricted to those tissues. The results extend earlier findings in demonstrating that ferredoxin occurs in the major organs of the tomato plant where it appears to function irrespective of photosynthetic competence.

  8. Amino acid sequence of myoglobin from white-tailed deer (Odocoileus virginianus).

    PubMed

    Joseph, Poulson; Suman, Surendranath P; Li, Shuting; Fontaine, Michele; Steinke, Laurey

    2012-10-01

    Our objective was to determine the primary structure of white-tailed deer myoglobin (Mb). White-tailed deer Mb was isolated from cardiac muscles employing ammonium sulfate precipitation and gel-filtration chromatography. The amino acid sequence was determined by Edman degradation. Sequence analyses of intact Mb as well as tryptic- and cyanogen bromide-peptides yielded the complete primary structure of white-tailed deer Mb, which shared 100% similarity with red deer Mb. White-tailed deer Mb consists of 153 amino acid residues and shares more than 96% sequence similarity with myoglobins from meat-producing ruminants, such as cattle, buffalo, sheep, and goat. Similar to sheep and goat myoglobins, white-tailed deer Mb contains 12 histidine residues. Proximal (position 93) and distal (position 64) histidine residues responsible for maintaining the stability of heme are conserved in white-tailed deer Mb.

  9. Linking wine lactic acid bacteria diversity with wine aroma and flavour.

    PubMed

    Cappello, Maria Stella; Zapparoli, Giacomo; Logrieco, Antonio; Bartowsky, Eveline J

    2017-02-21

    In the last two decades knowledge on lactic acid bacteria (LAB) associated with wine has increased considerably. Investigations on the genetics and biochemistry of species involved in malolactic fermentation (MLF), such as Oenococcus oeni and Lactobacillus species, have enabled a better understanding of their role in aroma modification and microbial stability of wine. In particular, the use of molecular techniques has provided evidence on the high diversity at species and strain level, thus improving the knowledge on wine LAB taxonomy and ecology. These tools have also proved useful for detecting strains with potentially desirable or undesirable traits for winemaking purposes. At the same time, advances on the enzymatic properties of wine LAB responsible for the development of wine aroma molecules have been made. Interestingly, this work has highlighted the high intraspecific variability of enzymatic activities such as glucosidase, esterase, proteases and those related to citrate metabolism within the wine LAB species. This genetic and biochemical diversity that characterizes wine LAB populations can generate a wide spectrum of wine sensory outcomes. This review examines some of these interesting aspects as a way to elucidate the link between LAB diversity and wine aroma and flavour. In particular, the correlation between inter- and intra-species diversity and bacterial metabolic traits that affect the organoleptic properties of wines is highlighted with emphasis on the importance of enzymatic potential of bacteria for the selection of starter cultures to control MLF and to enhance wine aroma.

  10. Natural variation in Brachypodium disctachyon: Deep Sequencing of Highly Diverse Natural Accessions (2013 DOE JGI Genomics of Energy and Environment 8th Annual User Meeting)

    SciTech Connect

    Gordon, Sean

    2013-03-01

    Sean Gordon of the USDA on "Natural variation in Brachypodium disctachyon: Deep Sequencing of Highly Diverse Natural Accessions" at the 8th Annual Genomics of Energy & Environment Meeting on March 27, 2013 in Walnut Creek, Calif.

  11. Staphylococcus epidermidis pan-genome sequence analysis reveals diversity of skin commensal and hospital infection-associated isolates

    PubMed Central

    2012-01-01

    Background While Staphylococcus epidermidis is commonly isolated from healthy human skin, it is also the most frequent cause of nosocomial infections on indwelling medical devices. Despite its importance, few genome sequences existed and the most frequent hospital-associated lineage, ST2, had not been fully sequenced. Results We cultivated 71 commensal S. epidermidis isolates from 15 skin sites and compared them with 28 nosocomial isolates from venous catheters and blood cultures. We produced 21 commensal and 9 nosocomial draft genomes, and annotated and compared their gene content, phylogenetic relatedness and biochemical functions. The commensal strains had an open pan-genome with 80% core genes and 20% variable genes. The variable genome was characterized by an overabundance of transposable elements, transcription factors and transporters. Biochemical diversity, as assayed by antibiotic resistance and in vitro biofilm formation, demonstrated the varied phenotypic consequences of this genomic diversity. The nosocomial isolates exhibited both large-scale rearrangements and single-nucleotide variation. We showed that S. epidermidis genomes separate into two phylogenetic groups, one consisting only of commensals. The formate dehydrogenase gene, present only in commensals, is a discriminatory marker between the two groups. Conclusions Commensal skin S. epidermidis have an open pan-genome and show considerable diversity between isolates, even when derived from a single individual or body site. For ST2, the most common nosocomial lineage, we detect variation between three independent isolates sequenced. Finally, phylogenetic analyses revealed a previously unrecognized group of S. epidermidis strains characterized by reduced virulence and formate dehydrogenase, which we propose as a clinical molecular marker. PMID:22830599

  12. Genetic diversity, genetic structure and demographic history of Cycas simplicipinna (Cycadaceae) assessed by DNA sequences and SSR markers

    PubMed Central

    2014-01-01

    Background: Cycas simplicipinna (T. Smitinand) K. Hill. (Cycadaceae) is an endangered species in China. Seven populations comprising 118 individuals were collected and genotyped in this study. Here, we assessed the genetic diversity, genetic structure and demographic history of this species. Results: DNA sequence data (two maternally inherited chloroplast intergenic spacers, cpDNA, and one biparentally inherited internal transcribed spacer region ITS4-ITS5, nrDNA) and sixteen microsatellite loci (SSR) were analysed. Of the 118 samples, 86 individuals from the seven populations were used for DNA sequencing and 115 individuals from six populations were used for the microsatellite study. We found high genetic diversity at the species level, low genetic diversity within each of the seven populations and high genetic differentiation among the populations. There was a clear genetic structure within populations of C. simplicipinna. A demographic history inferred from DNA sequencing data indicates that C. simplicipinna experienced a recent population contraction without retreating to a common refugium during the last glacial period. The results derived from SSR data also showed that C. simplicipinna underwent a past contraction in effective population size, likely during the Pleistocene. Conclusions: Some genetic features of C. simplicipinna, such as high genetic differentiation among the populations, a clear genetic structure and a recent population contraction, could provide guidelines for protecting this endangered species from extinction. Furthermore, the genetic features and population dynamics of the species described in our study may provide insights and guidelines for protecting other endangered species effectively. PMID:25016306

  13. Nucleotide sequence and the encoded amino acids of human apolipoprotein A-I mRNA.

    PubMed Central

    Law, S W; Brewer, H B

    1984-01-01

    The cDNA clones encoding the precursor form of human liver apolipoprotein A-I (apoA-I), preproapoA-I, have been isolated from a cDNA library. A 17-base synthetic oligonucleotide based on residues 108-113 of apoA-I and a 26-base primer-extended, dideoxynucleotide-terminated cDNA were used as hybridization probes to select for recombinant plasmids bearing the apoA-I sequence. The complete nucleic acid sequence of human liver preproapoA-I has been determined by analysis of the cloned cDNA. The sequence is composed of 801 nucleotides encoding 267 amino acid residues. PreproapoA-I contains an 18-amino-acid prepeptide and a 6-amino-acid propeptide connected to the amino terminus of the 243-amino acid mature apoA-I. Southern blotting analysis of chromosomal DNA obtained from peripheral blood indicated the apoA-I gene is contained in a 2.1-kilobase-pair Pst I fragment and there is no gross difference in structural organization between the normal apoA-I gene and the Tangier disease apoA-I gene. PMID:6198645
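
    The figures quoted in this abstract can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only and is not part of the original record: the constant and function names are hypothetical, and it assumes that the 801 nucleotides are read as consecutive three-base codons.

```python
# Figures quoted in the abstract above (names are illustrative, not from the paper).
CODING_NUCLEOTIDES = 801
PREPEPTIDE_RESIDUES = 18
PROPEPTIDE_RESIDUES = 6
MATURE_RESIDUES = 243


def residues_from_nucleotides(n_nucleotides: int) -> int:
    """Number of codons in a coding stretch, assuming one residue per 3 bases."""
    if n_nucleotides % 3 != 0:
        raise ValueError("coding length is not a multiple of three")
    return n_nucleotides // 3


# Residues implied by the nucleotide count.
encoded_residues = residues_from_nucleotides(CODING_NUCLEOTIDES)

# Residues implied by the prepeptide + propeptide + mature protein breakdown.
precursor_residues = PREPEPTIDE_RESIDUES + PROPEPTIDE_RESIDUES + MATURE_RESIDUES

print(encoded_residues, precursor_residues)
```

    Both routes give the 267 residues stated in the abstract.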

  14. Mathematical Characterization of Protein Sequences Using Patterns as Chemical Group Combinations of Amino Acids.

    PubMed

    Das, Jayanta Kumar; Das, Provas; Ray, Korak Kumar; Choudhury, Pabitra Pal; Jana, Siddhartha Sankar

    2016-01-01

    Comparison of amino acid sequence similarity is the fundamental concept behind the protein phylogenetic tree formation. By virtue of this method, we can explain the evolutionary relationships, but further explanations are not possible unless sequences are studied through the chemical nature of individual amino acids. Here we develop a new methodology to characterize the protein sequences on the basis of the chemical nature of the amino acids. We design various algorithms for studying the variation of chemical group transitions and various chemical group combinations as patterns in the protein sequences. The amino acid sequences of the conventional myosin II head domain from 14 family members are taken to illustrate this new approach. We find two blocks of maximum length 6 aa, 'FPKATD' and 'Y/FTNEKL', without repetition of the same chemical nature, and one block of maximum length 20 aa with repetition of chemical nature, which are common among all 14 members. We also check commonality with another motor protein sub-family, kinesin KIF1A. Based on our analysis we find a common block of length 8 aa in both myosin II and KIF1A. This motif is located in the neck linker region, which could be responsible for the generation of mechanical force, enabling us to find the unique blocks which remain chemically conserved across the family. We also validate our methodology with different protein families such as MYOI, Myosin light chain kinase (MLCK) and Rho-associated protein kinase (ROCK), Na+/K+-ATPase and Ca2+-ATPase. Altogether, our studies provide a new methodology for investigating the conserved amino acids' pattern in different proteins.

  15. 3′ terminal diversity of MRP RNA and other human noncoding RNAs revealed by deep sequencing

    PubMed Central

    2013-01-01

    Background: Post-transcriptional 3′ end processing is a key component of RNA regulation. The abundant and essential RNA subunit of RNase MRP has been proposed to function in three distinct cellular compartments and therefore may utilize this mode of regulation. Here we employ 3′ RACE coupled with high-throughput sequencing to characterize the 3′ terminal sequences of human MRP RNA and other noncoding RNAs that form RNP complexes. Results: The 3′ terminal sequence of MRP RNA from HEK293T cells has a distinctive distribution of genomically encoded termini (including an assortment of U residues) with a portion of these selectively tagged by oligo(A) tails. This profile contrasts with the relatively homogenous 3′ terminus of an in vitro transcribed MRP RNA control and the differing 3′ terminal profiles of U3 snoRNA, RNase P RNA, and telomerase RNA (hTR). Conclusions: 3′ RACE coupled with deep sequencing provides a valuable framework for the functional characterization of 3′ terminal sequences of noncoding RNAs. PMID:24053768

  16. Simultaneous Presence of Insertion Sequence Excision Enhancer and Insertion Sequence IS629 Correlates with Increased Diversity and Virulence in Shiga Toxin-Producing Escherichia coli

    PubMed Central

    Toro, M.; Rump, L. V.; Cao, G.; Meng, J.; Brown, E. W.

    2015-01-01

    Although new serotypes of enterohemorrhagic Escherichia coli (EHEC) emerge constantly, the mechanisms by which these new pathogens arise and the reasons emerging serotypes tend to carry more virulence genes than other E. coli are not understood. An insertion sequence (IS) excision enhancer (IEE) was discovered in EHEC O157:H7 that promoted the excision of IS3 family members and generated various genomic deletions. One IS3 family member, IS629, actively transposes and proliferates in EHEC O157:H7 and enterotoxigenic E. coli (ETEC) O139 and O149. The simultaneous presence of the IEE and IS629 (and other IS3 family members) may be part of a system that not only promotes adaptation and genome diversification in E. coli O157:H7 but also contributes to the development of pathogenicity among predominant serotypes. Prevalence comparisons of these elements in 461 strains, representing 72 different serotypes and 5 preassigned seropathotypes (SPT) A to E, showed that the presence of these two elements simultaneously was serotype specific and associated with highly pathogenic serotypes (O157 and top non-O157 Shiga toxin-producing Escherichia coli [STEC]) implicated in outbreaks and sporadic cases of human illness (SPT A and B). Serotypes lacking one or both elements were less likely to have been isolated from clinical cases. Our comparisons of IEE sequences showed sequence variations that could be divided into at least three clusters. Interestingly, the IEE sequences from O157 and the top 10 non-O157 STEC serotypes fell into clusters I and II, while less commonly isolated serotypes O5 and O174 fell into cluster III. These results suggest that IS629 and IEE elements may be acting synergistically to promote genome plasticity and genetic diversity among STEC strains, enhancing their abilities to adapt to hostile environments and rapidly take up virulence factors. PMID:26292302

  17. Genetic Diversity of Flavobacterium psychrophilum Isolates from Three Oncorhynchus spp. in the United States, as Revealed by Multilocus Sequence Typing

    PubMed Central

    Van Vliet, Danielle; Wiens, Gregory D.; Loch, Thomas P.; Nicolas, Pierre

    2016-01-01

    ABSTRACT The use of a multilocus sequence typing (MLST) technique has identified the intraspecific genetic diversity of U.S. Flavobacterium psychrophilum, an important pathogen of salmonids worldwide. Prior to this analysis, little U.S. F. psychrophilum genetic information was known; this is of importance when considering targeted control strategies, including vaccine development. Herein, MLST was used to investigate the genetic diversity of 96 F. psychrophilum isolates recovered from rainbow trout (Oncorhynchus mykiss), coho salmon (Oncorhynchus kisutch), and Chinook salmon (Oncorhynchus tshawytscha) that originated from nine U.S. states. The isolates fell into 34 distinct sequence types (STs) that clustered in 5 clonal complexes (CCs) (n = 63) or were singletons (n = 33). The distribution of STs varied spatially, by host species, and in association with mortality events. Several STs (i.e., ST9, ST10, ST30, and ST78) were found in multiple states, whereas the remaining STs were localized to single states. With the exception of ST256, which was recovered from rainbow trout and Chinook salmon, all STs were found to infect a single host species. Isolates that were collected during bacterial cold water disease outbreaks most frequently belonged to CC-ST10 (e.g., ST10 and ST78). Collectively, the results of this study clearly demonstrate the genetic diversity of F. psychrophilum within the United States and identify STs of clinical significance. Although the majority of STs described herein were novel, some (e.g., ST9, ST10, ST13, ST30, and ST31) were previously recovered on other continents, which demonstrates the transcontinental distribution of F. psychrophilum genotypes. IMPORTANCE Flavobacterium psychrophilum is the causative agent of bacterial cold water disease (BCWD) and rainbow trout fry syndrome (RTFS) and is an important bacterial pathogen of wild and farmed salmonids worldwide. These infections are responsible for large economic losses globally, yet the

  18. Application of Ion Torrent Sequencing to the Assessment of the Effect of Alkali Ballast Water Treatment on Microbial Community Diversity

    PubMed Central

    Fujimoto, Masanori; Moyerbrailean, Gregory A.; Noman, Sifat; Gizicki, Jason P.; Ram, Michal L.; Green, Phyllis A.; Ram, Jeffrey L.

    2014-01-01

    The impact of NaOH as a ballast water treatment (BWT) on microbial community diversity was assessed using the 16S rRNA gene based Ion Torrent sequencing with its new 400 base chemistry. Ballast water samples from a Great Lakes ship were collected from the intake and discharge of both control and NaOH (pH 12) treated tanks and were analyzed in duplicates. One set of duplicates was treated with the membrane-impermeable DNA cross-linking reagent propidium mono-azide (PMA) prior to PCR amplification to differentiate between live and dead microorganisms. Ion Torrent sequencing generated nearly 580,000 reads for 31 bar-coded samples and revealed alterations of the microbial community structure in ballast water that had been treated with NaOH. Rarefaction analysis of the Ion Torrent sequencing data showed that BWT using NaOH significantly decreased microbial community diversity relative to control discharge (p<0.001). UniFrac distance based principal coordinate analysis (PCoA) plots and UPGMA tree analysis revealed that NaOH-treated ballast water microbial communities differed from both intake communities and control discharge communities. After NaOH treatment, bacteria from the genus Alishewanella became dominant in the NaOH-treated samples, accounting for <0.5% of the total reads in intake samples but more than 50% of the reads in the treated discharge samples. The only apparent difference in microbial community structure between PMA-processed and non-PMA samples occurred in intake water samples, which exhibited a significantly higher amount of PMA-sensitive cyanobacteria/chloroplast 16S rRNA than their corresponding non-PMA total DNA samples. The community assembly obtained using Ion Torrent sequencing was comparable to that obtained from a subset of samples that were also subjected to 454 pyrosequencing. This study showed the efficacy of alkali ballast water treatment in reducing ballast water microbial diversity and demonstrated the application of new Ion Torrent

  19. Application of ion torrent sequencing to the assessment of the effect of alkali ballast water treatment on microbial community diversity.

    PubMed

    Fujimoto, Masanori; Moyerbrailean, Gregory A; Noman, Sifat; Gizicki, Jason P; Ram, Michal L; Green, Phyllis A; Ram, Jeffrey L

    2014-01-01

    The impact of NaOH as a ballast water treatment (BWT) on microbial community diversity was assessed using the 16S rRNA gene based Ion Torrent sequencing with its new 400 base chemistry. Ballast water samples from a Great Lakes ship were collected from the intake and discharge of both control and NaOH (pH 12) treated tanks and were analyzed in duplicates. One set of duplicates was treated with the membrane-impermeable DNA cross-linking reagent propidium mono-azide (PMA) prior to PCR amplification to differentiate between live and dead microorganisms. Ion Torrent sequencing generated nearly 580,000 reads for 31 bar-coded samples and revealed alterations of the microbial community structure in ballast water that had been treated with NaOH. Rarefaction analysis of the Ion Torrent sequencing data showed that BWT using NaOH significantly decreased microbial community diversity relative to control discharge (p<0.001). UniFrac distance based principal coordinate analysis (PCoA) plots and UPGMA tree analysis revealed that NaOH-treated ballast water microbial communities differed from both intake communities and control discharge communities. After NaOH treatment, bacteria from the genus Alishewanella became dominant in the NaOH-treated samples, accounting for <0.5% of the total reads in intake samples but more than 50% of the reads in the treated discharge samples. The only apparent difference in microbial community structure between PMA-processed and non-PMA samples occurred in intake water samples, which exhibited a significantly higher amount of PMA-sensitive cyanobacteria/chloroplast 16S rRNA than their corresponding non-PMA total DNA samples. The community assembly obtained using Ion Torrent sequencing was comparable to that obtained from a subset of samples that were also subjected to 454 pyrosequencing. This study showed the efficacy of alkali ballast water treatment in reducing ballast water microbial diversity and demonstrated the application of new Ion Torrent

  20. Software scripts for quality checking of high-throughput nucleic acid sequencers.

    PubMed

    Lazo, G R; Tong, J; Miller, R; Hsia, C; Rausch, C; Kang, Y; Anderson, O D

    2001-06-01

    We have developed a graphical interface to allow the researcher to view and assess the quality of sequencing results using a series of program scripts developed to process data generated by automated sequencers. The scripts are written in Perl programming language and are executable under the cgibin directory of a Web server environment. The scripts direct nucleic acid sequencing trace file data output from automated sequencers to be analyzed by the phred molecular biology program and are displayed as graphical hypertext mark-up language (HTML) pages. The scripts are mainly designed to handle 96-well microtiter dish samples, but the scripts are also able to read data from 384-well microtiter dishes 96 samples at a time. The scripts may be customized for different laboratory environments and computer configurations. Web links to the sources and discussion page are provided.
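
    The phred program named in this abstract assigns each base call a quality value Q related to the estimated error probability P by Q = -10 log10(P). The sketch below illustrates that relationship and a simple per-well summary. It is not the authors' code (their scripts are described as Perl); the function names, the Q >= 20 cutoff and the example scores are assumptions made for illustration.

```python
import math


def error_probability(q: float) -> float:
    """Estimated base-call error probability for a phred quality value q."""
    return 10 ** (-q / 10)


def phred_quality(p: float) -> float:
    """Phred quality value for an estimated error probability p (0 < p <= 1)."""
    if not 0 < p <= 1:
        raise ValueError("probability must be in the interval (0, 1]")
    return -10 * math.log10(p)


def fraction_at_least(qualities, threshold=20):
    """Fraction of base calls with quality >= threshold (cutoff is an assumption)."""
    if not qualities:
        return 0.0
    return sum(q >= threshold for q in qualities) / len(qualities)


# Hypothetical per-base quality values for one well of a 96-well plate.
example_well = [35, 32, 28, 19, 12, 40]
print(error_probability(20), fraction_at_least(example_well))
```

    A quality value of 20 corresponds to an estimated error rate of 1 in 100, and 30 to 1 in 1000, which is why cutoffs in that range are commonly used when summarising read quality.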

  1. High diversity of airborne fungi in the hospital environment as revealed by meta-sequencing-based microbiome analysis

    PubMed Central

    Tong, Xunliang; Xu, Hongtao; Zou, Lihui; Cai, Meng; Xu, Xuefeng; Zhao, Zuotao; Xiao, Fei; Li, Yanming

    2017-01-01

    Invasive fungal infections acquired in the hospital have progressively emerged as an important cause of life-threatening infection. In particular, airborne fungi in hospitals are considered critical pathogens of hospital-associated infections. To identify the causative airborne microorganisms, high-volume air samplers were utilized for collection, and species identification was performed using a culture-based method and DNA sequencing analysis with the Illumina MiSeq and HiSeq 2000 sequencing systems. Few bacteria were grown after cultivation in blood agar. However, using microbiome sequencing, the relative abundance of fungi, Archaea species, bacteria and viruses was determined. The distribution characteristics of fungi were investigated using heat map analysis of four departments, including the Respiratory Intensive Care Unit, Intensive Care Unit, Emergency Room and Outpatient Department. The prevalence of Aspergillus among fungi was the highest at the species level, approximately 17% to 61%, and the prevalence of Aspergillus fumigatus among Aspergillus species was from 34% to 50% in the four departments. Draft genomes of microorganisms isolated from the hospital environment were obtained by sequence analysis, indicating that investigation into the diversity of airborne fungi may provide reliable results for hospital infection control and surveillance. PMID:28045065

  2. Conservation of the C-type lectin fold for massive sequence variation in a Treponema diversity-generating retroelement

    SciTech Connect

    Le Coq, Johanne; Ghosh, Partho

    2012-06-19

    Anticipatory ligand binding through massive protein sequence variation is rare in biological systems, having been observed only in the vertebrate adaptive immune response and in a phage diversity-generating retroelement (DGR). Earlier work has demonstrated that the prototypical DGR variable protein, major tropism determinant (Mtd), meets the demands of anticipatory ligand binding by novel means through the C-type lectin (CLec) fold. However, because of the low sequence identity among DGR variable proteins, it has remained unclear whether the CLec fold is a general solution for DGRs. We have addressed this problem by determining the structure of a second DGR variable protein, TvpA, from the pathogenic oral spirochete Treponema denticola. Despite its weak sequence identity to Mtd (~16%), TvpA was found to also have a CLec fold, with predicted variable residues exposed in a ligand-binding site. However, this site in TvpA was markedly more variable than the one in Mtd, reflecting the unprecedented potential variability of TvpA, approximately 10^20. In addition, similarity between TvpA and Mtd with formylglycine-generating enzymes was detected. These results provide strong evidence for the conservation of the formylglycine-generating enzyme-type CLec fold among DGRs as a means of accommodating massive sequence variation.

  3. Sequencing of diverse mandarin, pummelo and orange genomes reveals complex history of admixture during citrus domestication

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Cultivated citrus are selections from, or hybrids of, wild progenitor species whose identities and contributions to citrus domestication remain controversial. Here we sequence and compare citrus genomes—a high-quality reference haploid clementine genome and mandarin, pummelo, sweet-orange and sour-o...

  4. Homologous recombination drives both sequence diversity and gene content variation in Neisseria meningitidis.

    PubMed

    Kong, Ying; Ma, Jennifer H; Warren, Keisha; Tsang, Raymond S W; Low, Donald E; Jamieson, Frances B; Alexander, David C; Hao, Weilong

    2013-01-01

    The study of genetic and phenotypic variation is fundamental for understanding the dynamics of bacterial genome evolution and untangling the evolution and epidemiology of bacterial pathogens. Neisseria meningitidis (Nm) is among the most intriguing bacterial pathogens in genomic studies due to its dynamic population structure and complex forms of pathogenicity. Extensive genomic variation within identical clonal complexes (CCs) in Nm has been recently reported and suggested to be the result of homologous recombination, but the extent to which recombination contributes to genomic variation within identical CCs has remained unclear. In this study, we sequenced two Nm strains of identical serogroup (C) and multi-locus sequence type (ST60), and conducted a systematic analysis with an additional 34 Nm genomes. Our results revealed that all gene content variation between the two ST60 genomes was introduced by homologous recombination at the conserved flanking genes, and 94.25% or more of sequence divergence was caused by homologous recombination. Recombination was found in genes associated with virulence factors, antigenic outer membrane proteins, and vaccine targets, suggesting an important role of homologous recombination in rapidly altering the pathogenicity and antigenicity of Nm. Recombination was also evident in genes of the restriction and modification systems, which may undermine barriers to DNA exchange. In conclusion, homologous recombination can drive both gene content variation and sequence divergence in Nm. These findings shed new light on the understanding of the rapid pathoadaptive evolution of Nm and other recombinogenic bacterial pathogens.

  5. Diploid Musa acuminata genetic diversity assayed with sequence-tagged microsatellite sites.

    PubMed

    Grapin, A; Noyer, J L; Carreel, F; Dambier, D; Baurens, F C; Lanaud, C; Lagoda, P J

    1998-06-01

    The sequence-tagged microsatellite site (STMS) discrimination potential was explored using nine microsatellite primer pairs. STMS polymorphism was assayed by nonradioactive urea-polyacrylamide gel electrophoresis. Genetic relationships were examined among 59 genotypes of wild or cultivated accessions of diploid Musa acuminata. The organization of the subspecies was confirmed and some clone relationships were clarified.

  6. The importance of sequence diversity in the aggregation and evolution of proteins.

    PubMed

    Wright, Caroline F; Teichmann, Sarah A; Clarke, Jane; Dobson, Christopher M

    2005-12-08

    Incorrect folding of proteins, leading to aggregation and amyloid formation, is associated with a group of highly debilitating medical conditions including Alzheimer's disease and late-onset diabetes. The issue of how unwanted protein association is normally avoided in a living system is particularly significant in the context of the evolution of multidomain proteins, which account for over 70% of all eukaryotic proteins, where the effective local protein concentration in the vicinity of each domain is very high. Here we describe the aggregation kinetics of multidomain protein constructs of immunoglobulin domains and the ability of different homologous domains to aggregate together. We show that aggregation of these proteins is a specific process and that the efficiency of coaggregation between different domains decreases markedly with decreasing sequence identity. Thus, whereas immunoglobulin domains with more than about 70% identity are highly prone to coaggregation, those with less than 30-40% sequence identity do not detectably interact. A bioinformatics analysis of consecutive homologous domains in large multidomain proteins shows that such domains almost exclusively have sequence identities of less than 40%, in other words below the level at which coaggregation is likely to be efficient. We propose that such low sequence identities could have a crucial and general role in safeguarding proteins against misfolding and aggregation.
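
    The identity thresholds discussed in this abstract can be illustrated with a small calculation. The sketch below is an assumption-laden example, not the authors' method: it takes two sequences that have already been aligned to equal length, counts identical residues over columns where neither sequence has a gap (one of several common definitions of percent identity), and labels the result using cutoffs of 70% and 40% chosen here to mirror the "more than about 70%" and "less than 30-40%" ranges in the text.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for residue_a, residue_b in zip(aligned_a, aligned_b):
        if residue_a == gap or residue_b == gap:
            continue  # skip gapped columns
        compared += 1
        if residue_a == residue_b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


def coaggregation_band(identity: float) -> str:
    """Label an identity value; the 70 and 40 cutoffs are illustrative assumptions."""
    if identity > 70:
        return "coaggregation likely"
    if identity < 40:
        return "coaggregation unlikely"
    return "intermediate"


# Toy aligned fragments (hypothetical, not real immunoglobulin domains).
identity = percent_identity("ACDE", "ACDF")
print(identity, coaggregation_band(identity))
```

    Because percent identity depends on how gaps and alignment length are treated, values computed this way are only comparable with each other, not necessarily with the figures reported in the paper.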

  7. Whole-genome sequencing reveals the diversity of cattle copy number variations and multicopy genes

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Structural and functional impacts of copy number variations (CNVs) on livestock genomes are not yet well understood. We identified 1853 CNV regions using population-scale sequencing data generated from 75 cattle representing 8 breeds (Angus, Brahman, Gir, Holstein, Jersey, Limousin, Nelore, Romagnol...

  8. Abscisic acid has a key role in modulating diverse plant-pathogen interactions.

    PubMed

    Fan, Jun; Hill, Lionel; Crooks, Casey; Doerner, Peter; Lamb, Chris

    2009-08-01

    We isolated an activation-tagged Arabidopsis (Arabidopsis thaliana) line, constitutive disease susceptibility2-1D (cds2-1D), that showed enhanced bacterial growth when challenged with various Pseudomonas syringae strains. Systemic acquired resistance and systemic PATHOGENESIS-RELATED GENE1 induction were also compromised in cds2-1D. The T-DNA insertion adjacent to NINE-CIS-EPOXYCAROTENOID DIOXYGENASE5 (NCED5), one of six genes encoding the abscisic acid (ABA) biosynthetic enzyme NCED, caused a massive increase in transcript level and enhanced ABA levels >2-fold. Overexpression of NCED genes recreated the enhanced disease susceptibility phenotype. NCED2, NCED3, and NCED5 were induced, and ABA accumulated strongly following compatible P. syringae infection. The ABA biosynthetic mutant aba3-1 showed reduced susceptibility to virulent P. syringae, and ABA, whether through exogenous application or endogenous accumulation in response to mild water stress, resulted in increased bacterial growth following challenge with virulent P. syringae, indicating that ABA suppresses resistance to P. syringae. Likewise ABA accumulation also compromised resistance to the biotrophic oomycete Hyaloperonospora arabidopsis, whereas resistance to the fungus Alternaria brassicicola was enhanced in cds2-1D plants and compromised in aba3-1 plants, indicating that ABA promotes resistance to this necrotroph. Comparison of the accumulation of salicylic acid and jasmonic acid in the wild type, cds2-1D, and aba3-1 plants challenged with P. syringae showed that ABA promotes jasmonic acid accumulation and exhibits a complex antagonistic relationship with salicylic acid. Our findings provide genetic evidence that the abiotic stress signal ABA also has profound roles in modulating diverse plant-pathogen interactions mediated at least in part by cross talk with the jasmonic acid and salicylic acid biotic stress signal pathways.

  9. Yersinia spp. Identification Using Copy Diversity in the Chromosomal 16S rRNA Gene Sequence.

    PubMed

    Hao, Huijing; Liang, Junrong; Duan, Ran; Chen, Yuhuang; Liu, Chang; Xiao, Yuchun; Li, Xu; Su, Mingming; Jing, Huaiqi; Wang, Xin

    2016-01-01

    The API 20E strip test, the standard for Enterobacteriaceae identification, is not sufficient to discriminate some Yersinia species, both because some biochemical reactions are unstable and because some species present the same biochemical profile (e.g. Yersinia frederiksenii and Yersinia intermedia); these species need a variety of molecular biology methods as auxiliaries for identification. The 16S rRNA gene is considered a valuable tool for assigning bacterial strains to species. However, the resolution of the 16S rRNA gene may be insufficient for discrimination because of the high similarity of sequences between some species and heterogeneity within copies at the intra-genomic level. In this study, we randomly selected five 16S rRNA gene clones from each of 768 Yersinia strains and collected 3,840 sequences of the 16S rRNA gene from 10 species, which were divided into 439 patterns. The similarity among the five clones of the 16S rRNA gene is over 99% for most strains. Identical sequences were found in strains of different species. A phylogenetic tree was constructed using the five 16S rRNA gene sequences for each strain; its phylogenetic classifications are consistent with biochemical tests, and species that are difficult to identify by biochemical phenotype can be differentiated. Most Yersinia strains form distinct groups within each species. However, Yersinia kristensenii, a heterogeneous species, clusters with some Yersinia enterocolitica and Yersinia frederiksenii/intermedia strains, although this does not affect the overall efficiency of the species classification. In conclusion, through analysis derived from integrated information from multiple 16S rRNA gene sequences, the discrimination ability of Yersinia species is improved using our method.

  10. Amino acid sequence of band-3 protein from rainbow trout erythrocytes derived from cDNA.

    PubMed Central

    Hübner, S; Michel, F; Rudloff, V; Appelhans, H

    1992-01-01

    In this report we present the first complete band-3 cDNA sequence of a poikilothermic lower vertebrate. The primary structure of the anion-exchange protein band 3 (AE1) from rainbow trout erythrocytes was determined by nucleotide sequencing of cDNA clones. The overlapping clones have a total length of 3827 bp with a 5'-terminal untranslated region of 150 bp, a 2754 bp open reading frame and a 3'-untranslated region of 924 bp. Band-3 protein from trout erythrocytes consists of 918 amino acid residues with a calculated molecular mass of 101 827 Da. Comparison of its amino acid sequence revealed a 60-65% identity within the transmembrane spanning sequence of band-3 proteins published so far. An additional insertion of 24 amino acid residues within the membrane-associated domain of trout band-3 protein was identified, which until now was thought to be a general feature only of mammalian band-3-related proteins. PMID:1637296

  11. Preparation of Nucleic Acid Libraries for Personalized Sequencing Systems Using an Integrated Microfluidic Hub Technology (Seventh Annual Sequencing, Finishing, Analysis in the Future (SFAF) Meeting 2012)

    ScienceCinema

    Patel, Kamlesh D. (Ken) [SNL]

    2016-07-12

    Kamlesh (Ken) Patel from Sandia National Laboratories (Livermore, California) presents "Preparation of Nucleic Acid Libraries for Personalized Sequencing Systems Using an Integrated Microfluidic Hub Technology" at the 7th Annual Sequencing, Finishing, Analysis in the Future (SFAF) Meeting held in June 2012 in Santa Fe, NM.

  12. Identification of different clonal complexes and diverse amino acid substitutions in penicillin-binding protein 2 (PBP2) associated with borderline oxacillin resistance in Canadian Staphylococcus aureus isolates.

    PubMed

    Nadarajah, Jeya; Lee, Mark J S; Louie, Lisa; Jacob, Latha; Simor, Andrew E; Louie, Marie; McGavin, Martin J

    2006-12-01

    Borderline oxacillin-resistant Staphylococcus aureus (BORSA) exhibit oxacillin MIC values of 1-8 µg/ml, but lack mecA, which encodes the low-affinity penicillin-binding protein (PBP)2a. The relationship of the BORSA phenotype with specific genetic backgrounds was assessed, as well as amino acid sequence variation in the normal PBP2. Among 38 BORSA, 26 had a common PFGE profile of genomic DNA, and were multilocus sequence type (ST)25. The other isolates were genetically diverse. Complete pbp2 sequences were determined for three BORSA, corresponding to ST25, ST1 and ST47, which were selected on the basis of lacking blaZ-encoded beta-lactamase. The essential transpeptidase-domain-encoding segment of pbp2 was also sequenced from seven additional ST25 isolates. Amino acid substitutions occurred in the transpeptidase domain of all BORSA, irrespective of clonal type. A Gln629→Pro substitution was common to all ST25 BORSA, but most could be distinguished from one another by additional unique substitutions in the transpeptidase domain. The ST1 and ST47 isolates also possessed unique substitutions in the transpeptidase domain. Plasmid-mediated expression of pbp2 from an ST25 or ST1 isolate in S. aureus RN6390 increased its oxacillin MIC from 0.25 to 4 µg/ml, while pbp2 from a susceptible strain, ATCC 25923, had no effect. Therefore, different amino acid substitutions in PBP2 of diverse BORSA lineages contribute to borderline resistance. The predominant ST25 lineage was not related to any of the five clonal complexes that contain meticillin-resistant S. aureus (MRSA), suggesting that ST25 cannot readily acquire mecA-mediated resistance.

  13. Role of the two-component leader sequence and mature amino acid sequences in extracellular export of endoglucanase EGL from Pseudomonas solanacearum.

    PubMed Central

    Huang, J Z; Schell, M A

    1992-01-01

    The egl gene of Pseudomonas solanacearum encodes a 43-kDa extracellular endoglucanase (mEGL) involved in wilt disease caused by this phytopathogen. Egl is initially translated with a 45-residue, two-part leader sequence. The first 19 residues are apparently removed by signal peptidase II during export of Egl across the inner membrane (IM); the remaining residues of the leader sequence (modified with palmitate) are removed during export across the outer membrane (OM). Localization of Egl-PhoA fusion proteins showed that the first 26 residues of the Egl leader sequence are required and sufficient to direct lipid modification, processing, and export of Egl or PhoA across the IM but not the OM. Fusions of the complete 45-residue leader sequence or of the leader and increasing portions of mEgl sequences to PhoA did not cause its export across the OM. In-frame deletion of portions of mEGL-coding sequences blocked export of the truncated polypeptides across the OM without affecting export across the IM. These results indicate that the first part of the leader sequence functions independently to direct export of Egl across the IM while the second part and sequences and structures in mEGL are involved in export across the OM. Computer analysis of the mEgl amino acid sequence obtained from its nucleotide sequence identified a region of mEGL similar in amino acid sequence to regions in other prokaryotic endoglucanases. PMID:1735723

  14. Studies on adenosine triphosphate transphosphorylases. Amino acid sequence of rabbit muscle ATP-AMP transphosphorylase.

    PubMed

    Kuby, S A; Palmieri, R H; Frischat, A; Fischer, A H; Wu, L H; Maland, L; Manship, M

    1984-05-22

    The total amino acid sequence of rabbit muscle adenylate kinase has been determined, and the single polypeptide chain of 194 amino acid residues starts with N-acetylmethionine and ends with leucyllysine at its carboxyl terminus, in agreement with the earlier data on its amino acid composition [Mahowald, T. A., Noltmann, E. A., & Kuby, S. A. (1962) J. Biol. Chem. 237, 1138-1145] and its carboxyl-terminus sequence [Olson, O. E., & Kuby, S. A. (1964) J. Biol. Chem. 239, 460-467]. Elucidation of the primary structure was based on tryptic and chymotryptic cleavages of the performic acid oxidized protein, cyanogen bromide cleavages of the 14C-labeled S-carboxymethylated protein at its five methionine sites (followed by maleylation of peptide fragments), and tryptic cleavages at its 12 arginine sites of the maleylated 14C-labeled S-carboxymethylated protein. Calf muscle myokinase, whose sequence has also been established, differs primarily from the rabbit muscle myokinase's sequence in the following: His-30 is replaced by Gln-30; Lys-56 is replaced by Met-56; Ala-84 and Asp-85 are replaced by Val-84 and Asn-85. A comparison of the four muscle-type adenylate kinases, whose covalent structures have now been determined, viz., rabbit, calf, porcine, and human [for the latter two sequences see Heil, A., Müller, G., Noda, L., Pinder, T., Schirmer, H., Schirmer, I., & Von Zabern, I. (1974) Eur. J. Biochem. 43, 131-144, and Von Zabern, I., Wittmann-Liebold, B., Untucht-Grau, R., Schirmer, R. H., & Pai, E. F. (1976) Eur. J. Biochem. 68, 281-290], demonstrates an extraordinary degree of homology. (ABSTRACT TRUNCATED AT 250 WORDS)

  15. Mathematical Characterization of Protein Sequences Using Patterns as Chemical Group Combinations of Amino Acids

    PubMed Central

    Choudhury, Pabitra Pal; Jana, Siddhartha Sankar

    2016-01-01

    Comparison of amino acid sequence similarity is the fundamental concept behind protein phylogenetic tree formation. By virtue of this method, we can explain the evolutionary relationships, but further explanations are not possible unless sequences are studied through the chemical nature of individual amino acids. Here we develop a new methodology to characterize protein sequences on the basis of the chemical nature of the amino acids. We design various algorithms for studying the variation of chemical group transitions and various chemical group combinations as patterns in the protein sequences. The amino acid sequences of the conventional myosin II head domain from 14 family members are used to illustrate this new approach. We find two blocks of maximum length 6 aa, ‘FPKATD’ and ‘Y/FTNEKL’, in which no chemical nature is repeated, and one block of maximum length 20 aa in which chemical natures are repeated; all three are common to all 14 members. We also check commonality with another motor protein sub-family, kinesin (KIF1A). Based on our analysis we find a common block of length 8 aa in both myosin II and KIF1A. This motif is located in the neck linker region, which could be responsible for the generation of mechanical force, enabling us to find the unique blocks which remain chemically conserved across the family. We also validate our methodology with different protein families such as MYOI, Myosin light chain kinase (MLCK) and Rho-associated protein kinase (ROCK), Na+/K+-ATPase and Ca2+-ATPase. Altogether, our studies provide a new methodology for investigating conserved amino acid patterns in different proteins. PMID:27930687
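
    The idea of reading a sequence as a string of chemical groups, and looking for the longest block in which no group repeats, can be sketched as follows. This is only an illustration: the eight-group classification below is a common textbook grouping assumed here, not the grouping used by the authors, and the function names `group_string` and `longest_non_repeating_block` are hypothetical.

```python
# Assumed (textbook-style) chemical grouping of the 20 amino acids.
# The abstract does not list the authors' own groups.
GROUPS = {
    "aliphatic": "GAVLI",
    "aromatic": "FYW",
    "hydroxyl": "ST",
    "sulfur": "CM",
    "acidic": "DE",
    "basic": "KRH",
    "amide": "NQ",
    "imino": "P",
}
AA_TO_GROUP = {aa: name for name, members in GROUPS.items() for aa in members}


def group_string(seq):
    """Translate a one-letter amino acid sequence into chemical group names."""
    return [AA_TO_GROUP[aa] for aa in seq.upper()]


def longest_non_repeating_block(seq):
    """Return the longest contiguous block in which no chemical group repeats.

    Uses a sliding window: when a group reappears inside the window,
    the window start jumps just past its previous occurrence.
    """
    groups = group_string(seq)
    last_seen = {}
    start = 0
    best_start, best_end = 0, 0
    for i, group in enumerate(groups):
        if group in last_seen and last_seen[group] >= start:
            start = last_seen[group] + 1
        last_seen[group] = i
        if i - start + 1 > best_end - best_start:
            best_start, best_end = start, i + 1
    return seq[best_start:best_end]


if __name__ == "__main__":
    print(group_string("FPKATD"))
    print(longest_non_repeating_block("AAFPKATDD"))
```

    Under this assumed grouping, each residue of ‘FPKATD’ falls in a different group, which is consistent with the block being reported as free of repeated chemical nature.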

  16. The complete amino acid sequence of a trypsin inhibitor from Bauhinia variegata var. candida seeds.

    PubMed

    Di Ciero, L; Oliva, M L; Torquato, R; Köhler, P; Weder, J K; Camillo Novello, J; Sampaio, C A; Oliveira, B; Marangoni, S

    1998-11-01

    Trypsin inhibitors of two varieties of Bauhinia variegata seeds have been isolated and characterized. Bauhinia variegata candida trypsin inhibitor (BvcTI) and B. variegata lilac trypsin inhibitor (BvlTI) are proteins with Mr of about 20,000 without free sulfhydryl groups. Amino acid analysis shows a high content of aspartic acid, glutamic acid, serine, and glycine, and a low content of histidine, tyrosine, methionine, and lysine in both inhibitors. Isoelectric focusing for both varieties detected three isoforms (pI 4.85, 5.00, and 5.15), which were resolved by HPLC procedure. The trypsin inhibitors show Ki values of 6.9 and 1.2 nM for BvcTI and BvlTI, respectively. The N-terminal sequences of the three trypsin inhibitor isoforms from both varieties of Bauhinia variegata and the complete amino acid sequence of B. variegata var. candida L. trypsin inhibitor isoform 3 (BvcTI-3) are presented. The sequences have been determined by automated Edman degradation of the reduced and carboxymethylated proteins of the peptides resulting from Staphylococcus aureus protease and trypsin digestion. BvcTI-3 is composed of 167 residues and has a calculated molecular mass of 18,529. Homology studies with other trypsin inhibitors show that BvcTI-3 belongs to the Kunitz family. The putative active site encompasses Arg (63)-Ile (64).

  17. Deduced amino acid sequence of human pulmonary surfactant proteolipid: SPL(pVal)

    SciTech Connect

    Whitsett, J.A.; Glasser, S.W.; Korfhagen, T.R.; Weaver, T.E.; Clark, J.; Pilot-Matias, T.; Meuth, J.; Fox, J.L.

    1987-05-01

    A hydrophobic, proteolipid-like protein of Mr 6500 was isolated from ether/ethanol extracts of human, canine and bovine pulmonary surfactant. Amino acid composition of the protein demonstrated a remarkable abundance of hydrophobic residues, particularly valine and leucine. The N-terminal amino acid sequence of the human protein was determined: N-Leu-Ile-Pro-Cys-Cys-Pro-Val-Asn-Leu-Lys-Arg-Leu-Leu-Ile-Val4... An oligonucleotide probe was used to screen an adult human lung cDNA library and resulted in detection of cDNA clones with predicted amino acid sequence with close identity to the N-terminal amino acid sequence of the human peptide. SPL(pVal) was found within the reading frame of a larger peptide. SPL(pVal) results from proteolytic processing of a larger preprotein. Northern blot analysis detected a single 1.0 kilobase SPL(pVal) RNA which was less abundant in fetal than in adult lung. Mixtures of purified canine and bovine SPL(pVal) and synthetic phospholipids display properties of rapid adsorption and surface tension lowering activity characteristic of surfactant. Human SPL(pVal) is a pulmonary surfactant proteolipid which may therefore be useful in combination with phospholipids and/or other surfactant proteins for the treatment of surfactant deficiency such as hyaline membrane disease in newborn infants.

  18. Next-Generation Sequencing Assessment of Eukaryotic Diversity in Oil Sands Tailings Ponds Sediments and Surface Water.

    PubMed

    Aguilar, Maria; Richardson, Elisabeth; Tan, BoonFei; Walker, Giselle; Dunfield, Peter F; Bass, David; Nesbø, Camilla; Foght, Julia; Dacks, Joel B

    2016-11-01

    Tailings ponds in the Athabasca oil sands (Canada) contain fluid wastes, generated by the extraction of bitumen from oil sands ores. Although the autochthonous prokaryotic communities have been relatively well characterized, almost nothing is known about microbial eukaryotes living in the anoxic soft sediments of tailings ponds or in the thin oxic layer of water that covers them. We carried out the first next-generation sequencing study of microbial eukaryotic diversity in oil sands tailings ponds. In metagenomes prepared from tailings sediment and surface water, we detected very low numbers of sequences encoding eukaryotic small subunit ribosomal RNA representing seven major taxonomic groups of protists. We also produced and analysed three amplicon-based 18S rRNA libraries prepared from sediment samples. These revealed a more diverse set of taxa, 169 different OTUs encompassing up to eleven higher order groups of eukaryotes, according to detailed classification using homology searching and phylogenetic methods. The 10 most abundant OTUs accounted for > 90% of the total of reads, vs. large numbers of rare OTUs (< 1% abundance). Despite the anoxic and hydrocarbon-enriched nature of the environment, the tailings ponds harbour complex communities of microbial eukaryotes indicating that these organisms should be taken into account when studying the microbiology of the oil sands.

  19. SUBGROUPS OF AMINO ACID SEQUENCES IN THE VARIABLE REGIONS OF IMMUNOGLOBULIN HEAVY CHAINS*

    PubMed Central

    Cunningham, Bruce A.; Pflumm, Mollie N.; Rutishauser, Urs; Edelman, Gerald M.

    1969-01-01

    The amino acid sequence of the first 133 residues of the heavy (γ) chain from a human γG immunoglobulin (He) has been determined. This γ-chain is identical in Gm type to that of protein Eu, the complete sequence of which has been reported. Comparison of the two sequences substantiates the previous suggestion that there are subgroups of variable regions of heavy chains. The variable region of Eu has been assigned to subgroup I and that of He to subgroup II; on the other hand, the constant regions of the two proteins appear to be identical. Comparison of the sequence of the heavy chain of He with the heavy chain sequences determined in other laboratories suggests that the variable region of subgroup II is at least 118 residues long. The nature and distribution of amino acid variations in this heavy chain subgroup resemble those observed in light chain subgroups. These studies provide evidence that the translocation hypothesis applies to heavy as well as to light chains, viz., genes for variable regions (V) are somatically translocated to genes for constant regions (C) to form complete VC structural genes. PMID:5264153

  20. Complete nucleic acid sequence of Penaeus stylirostris densovirus (PstDNV) from India.

    PubMed

    Rai, Praveen; Safeena, Muhammed P; Karunasagar, Iddya; Karunasagar, Indrani

    2011-06-01

    Infectious hypodermal and hematopoietic necrosis virus (IHHNV) of shrimp has recently been classified as Penaeus stylirostris densovirus (PstDNV). The complete nucleic acid sequence of PstDNV from India was obtained by cloning and sequencing different DNA fragments of the virus. The genome organisation of PstDNV revealed three major coding domains: a left ORF (NS1) of 2001 bp, a mid ORF (NS2) of 1092 bp and a right ORF (VP) of 990 bp. The complete genome and the amino acid sequences of the three proteins, viz. NS1, NS2 and VP, were compared with the genomes of the virus reported from Hawaii, China and Mexico and with partial sequences available for isolates from different regions. The phylogenetic analysis of shrimp, insect and vertebrate parvovirus sequences showed that the Indian PstDNV isolate is phylogenetically more closely related to one of the three isolates from Taiwan (AY355307), and two isolates (AY362547 and AY102034) from Thailand.

  1. Variation in seed fatty acid composition and sequence divergence in the FAD2 gene coding region between wild and cultivated sesame.

    PubMed

    Chen, Zhenbang; Tonnis, Brandon; Morris, Brad; Wang, Richard B; Zhang, Amy L; Pinnow, David; Wang, Ming Li

    2014-12-03

    Sesame germplasm harbors genetic diversity which can be useful for sesame improvement in breeding programs. Seven accessions with different levels of oleic acid were selected from the entire USDA sesame germplasm collection (1232 accessions) and planted for morphological observation and re-examination of fatty acid composition. The coding region of the FAD2 gene for fatty acid desaturase (FAD) in these accessions was also sequenced. Cultivated sesame accessions flowered and matured earlier than the wild species. The cultivated sesame seeds contained a significantly higher percentage of oleic acid (40.4%) than the seeds of the wild species (26.1%). Nucleotide polymorphisms were identified in the FAD2 gene coding region between wild and cultivated species. Some nucleotide polymorphisms led to amino acid changes, one of which was located in the enzyme active site and may contribute to the altered fatty acid composition. Based on the morphology observation, chemical analysis, and sequence analysis, it was determined that two accessions were misnamed and need to be reclassified. The results obtained from this study are useful for sesame improvement in molecular breeding programs.

  2. Diversity and genetic stability in banana genotypes in a breeding program using inter simple sequence repeats (ISSR) markers.

    PubMed

    Silva, A V C; Nascimento, A L S; Vitória, M F; Rabbani, A R C; Soares, A N R; Lédo, A S

    2017-02-23

    Banana (Musa spp.) is a fruit species frequently cultivated and consumed worldwide. Molecular markers are important for estimating genetic diversity in germplasm and between genotypes in breeding programs. The objective of this study was to analyze the genetic diversity of 21 banana genotypes (FHIA 23, PA42-44, Maçã, Pacovan Ken, Bucaneiro, YB42-47, Grand Naine, Tropical, FHIA 18, PA94-01, YB42-17, Enxerto, Japira, Pacovã, Prata-Anã, Maravilha, PV79-34, Caipira, Princesa, Garantida, and Thap Maeo), by using inter-simple sequence repeat (ISSR) markers. Material was generated from the banana breeding program of Embrapa Cassava & Fruits and evaluated at Embrapa Coastal Tablelands. The 12 primers used in this study generated 97.5% polymorphism. Four clusters were identified among the different genotypes studied, and the sum of the first two principal components was 48.91%. From the Unweighted Pair Group Method using Arithmetic averages (UPGMA) dendrogram, it was possible to identify two main clusters and subclusters. Two genotypes (Garantida and Thap Maeo) remained isolated from the others, both in the UPGMA clustering and in the principal coordinate analysis (PCoA). Using ISSR markers, we could analyze the genetic diversity of the studied material and state that these markers were efficient at detecting sufficient polymorphism to estimate the genetic variability in banana genotypes.

  3. New insights on the genetic diversity of the honeybee parasite Nosema ceranae based on multilocus sequence analysis.

    PubMed

    Roudel, Mathieu; Aufauvre, Julie; Corbara, Bruno; Delbac, Frederic; Blot, Nicolas

    2013-09-01

    The microsporidian parasite Nosema ceranae is a common pathogen of the Western honeybee (Apis mellifera) whose variable virulence could be related to its genetic polymorphism and/or its polyphenism responding to environmental cues. Since the genotyping of N. ceranae based on unique marker sequences had been unsuccessful, we tested whether a multilocus approach, assessing the diversity of ten genetic markers – encoding nine proteins and the small ribosomal RNA subunit – allowed the discrimination between N. ceranae variants isolated from single A. mellifera individuals in four distant locations. High nucleotide diversity and allele content were observed for all genes. Most importantly, the diversity was mainly present within parasite populations isolated from single honeybee individuals. In contrast the absence of isolate differentiation precluded any taxa discrimination, even through a multilocus approach, but suggested that similar populations of parasites seem to infect honeybees in distant locations. As statistical evolutionary analyses showed that the allele frequency is under selective pressure, we discuss the origin and consequences of N. ceranae heterozygosity in a single host and lack of population divergence in the context of the parasite natural and evolutionary history.

  4. Analysis of genetic diversity of Tunisian pistachio (Pistacia vera L.) using sequence-related amplified polymorphism (SRAP) markers.

    PubMed

    Guenni, K; Aouadi, M; Chatti, K; Salhi-Hannachi, A

    2016-10-17

    Sequence-related amplified polymorphism (SRAP) markers preferentially amplify open reading frames and were used to study the genetic diversity of Tunisian pistachio. In the present study, 43 Pistacia vera accessions were screened using seven SRAP primer pairs. A total of 78 markers was revealed (95.12%) with an average polymorphic information content of 0.850. The results suggest that there is strong genetic differentiation, which characterizes the local resources (GST = 0.307). High gene flow (Nm = 1.127) among groups was explained by the exchange of plant material among regions. Analysis of molecular variance revealed significant differences within groups and showed that 73.88% of the total genetic diversity occurred within groups, whereas the remaining 26.12% occurred among groups. Bayesian clustering and principal component analysis identified three pools, El Guettar, Pollenizers, and the rest of the pistachios belonging to the Gabès, Kasserine, and Sfax localities. Bayesian analysis revealed that El Guettar and male genotypes were assigned with more than 80% probability. The BayeScan method proposed that locus 59 (F13-R9) could be used in the development of sex-linked SCAR markers from SRAP since it is a commonly detected locus in comparisons involving the Pollenizers group. This is the first application of SRAP markers for the assessment of genetic diversity in Tunisian germplasm of P. vera. Such information will be useful to define conservation strategies and improvement programs for this species.

  5. Next-generation sequencing reveals cryptic mtDNA diversity of Plasmodium relictum in the Hawaiian Islands

    USGS Publications Warehouse

    Jarvi, S.I.; Farias, M.E.; Lapointe, D.A.; Belcaid, M.; Atkinson, C.T.

    2013-01-01

    Next-generation 454 sequencing techniques were used to re-examine diversity of mitochondrial cytochrome b lineages of avian malaria (Plasmodium relictum) in Hawaii. We document a minimum of 23 variant lineages of the parasite based on single nucleotide transitional changes, in addition to the previously reported single lineage (GRW4). A new, publicly available portal (Integroomer) was developed for initial parsing of 454 datasets. Mean variant prevalence and frequency was higher in low elevation Hawaii Amakihi (Hemignathus virens) with Avipoxvirus-like lesions (P = 0·001), suggesting that the variants may be biologically distinct. By contrast, variant prevalence and frequency did not differ significantly among mid-elevation Apapane (Himatione sanguinea) with or without lesions (P = 0·691). The low frequency and the lack of detection of variants independent of GRW4 suggest that multiple independent introductions of P. relictum to Hawaii are unlikely. Multiple variants may have been introduced in heteroplasmy with GRW4 or exist within the tandem repeat structure of the mitochondrial genome. The discovery of multiple mitochondrial lineages of P. relictum in Hawaii provides a measure of genetic diversity within a geographically isolated population of this parasite and suggests the origins and evolution of parasite diversity may be more complicated than previously recognized.

  6. Microbial community structure of two freshwater sponges using Illumina MiSeq sequencing revealed high microbial diversity.

    PubMed

    Gaikwad, Swapnil; Shouche, Yogesh S; Gade, Wasudev N

    2016-12-01

    Sponges are primitive metazoans that are known to harbour diverse and abundant microbes. All over the world attempts are being made to exploit these microbes for their biotechnological potential to produce bioactive compounds and antimicrobial peptides. However, the majority of studies have focussed on marine sponges, and freshwater sponges have been neglected so far. To increase our understanding of the microbial community structure of freshwater sponges, the microbiota of two freshwater sponges, namely Eunapius carteri and Corvospongilla lapidosa, is explored for the first time using Next Generation Sequencing (NGS) technology. Overall, the microbial composition of these sponges comprises 14 phyla and, on average, more than 2900 OTUs were obtained from C. lapidosa while E. carteri showed 980 OTUs, which is higher than the number of OTUs obtained in marine sponges. Thus, our study showed that freshwater sponges also possess a more highly diverse microbial community than previously thought, and that it is distinct from the marine sponge microbiota. The present study also revealed that the microbial community structures of the two sponges are significantly different from each other and from their respective water samples. In the present study, we have detected many bacterial lineages belonging to Firmicutes, Actinobacteria, Proteobacteria, Planctomycetes, etc. that are known to produce compounds of biotechnological importance. Overall, this study gives insight into the microbial composition of freshwater sponges, which is highly diverse and needs to be studied further to exploit its biotechnological capabilities.

  7. Distribution and Diversity of Bacteria and Fungi Colonization in Stone Monuments Analyzed by High-Throughput Sequencing

    PubMed Central

    Li, Qiang; Zhang, Bingjian; He, Zhang; Yang, Xiaoru

    2016-01-01

    The historical and cultural heritage of Qingxing palace and Lingyin and Kaihua temple, located in Hangzhou of China, include a large number of exquisite Buddhist statues and ancient stone sculptures which date back to the Northern Song (960–1219 A.D.) and Qing dynasties (1636–1912 A.D.) and are considered to be some of the best examples of ancient stone sculpting techniques. They were added to the World Heritage List in 2011 because of their unique craftsmanship and importance to the study of ancient Chinese Buddhist culture. However, biodeterioration of the surface of the ancient Buddhist statues and white marble pillars not only severely impairs their aesthetic value but also alters their material structure and thermo-hygric properties. In this study, high-throughput sequencing was utilized to identify the microbial communities colonizing the stone monuments. The diversity and distribution of the microbial communities in six samples collected from three different environmental conditions with signs of deterioration were analyzed by means of bioinformatics software and diversity indices. In addition, the impact of environmental factors, including temperature, light intensity, air humidity, and the concentration of NO2 and SO2, on the microbial communities’ diversity and distribution was evaluated. The results indicate that the presence of predominantly phototrophic microorganisms was correlated with light and humidity, while nitrifying bacteria and Thiobacillus were associated with NO2 and SO2 from air pollution. PMID:27658256

  8. Distribution and Diversity of Bacteria and Fungi Colonization in Stone Monuments Analyzed by High-Throughput Sequencing.

    PubMed

    Li, Qiang; Zhang, Bingjian; He, Zhang; Yang, Xiaoru

    The historical and cultural heritage of Qingxing palace and Lingyin and Kaihua temple, located in Hangzhou of China, include a large number of exquisite Buddhist statues and ancient stone sculptures which date back to the Northern Song (960-1219 A.D.) and Qing dynasties (1636-1912 A.D.) and are considered to be some of the best examples of ancient stone sculpting techniques. They were added to the World Heritage List in 2011 because of their unique craftsmanship and importance to the study of ancient Chinese Buddhist culture. However, biodeterioration of the surface of the ancient Buddhist statues and white marble pillars not only severely impairs their aesthetic value but also alters their material structure and thermo-hygric properties. In this study, high-throughput sequencing was utilized to identify the microbial communities colonizing the stone monuments. The diversity and distribution of the microbial communities in six samples collected from three different environmental conditions with signs of deterioration were analyzed by means of bioinformatics software and diversity indices. In addition, the impact of environmental factors, including temperature, light intensity, air humidity, and the concentration of NO2 and SO2, on the microbial communities' diversity and distribution was evaluated. The results indicate that the presence of predominantly phototrophic microorganisms was correlated with light and humidity, while nitrifying bacteria and Thiobacillus were associated with NO2 and SO2 from air pollution.

  9. DNA Cloning of Plasmodium falciparum Circumsporozoite Gene: Amino Acid Sequence of Repetitive Epitope

    NASA Astrophysics Data System (ADS)

    Enea, Vincenzo; Ellis, Joan; Zavala, Fidel; Arnot, David E.; Asavanich, Achara; Masuda, Aoi; Quakyi, Isabella; Nussenzweig, Ruth S.

    1984-08-01

    A clone of complementary DNA encoding the circumsporozoite (CS) protein of the human malaria parasite Plasmodium falciparum has been isolated by screening an Escherichia coli complementary DNA library with a monoclonal antibody to the CS protein. The DNA sequence of the complementary DNA insert encodes a four-amino acid sequence: proline-asparagine-alanine-asparagine, tandemly repeated 23 times. The CS β-lactamase fusion protein specifically binds monoclonal antibodies to the CS protein and inhibits the binding of these antibodies to native Plasmodium falciparum CS protein. These findings provide a basis for the development of a vaccine against Plasmodium falciparum malaria.

  10. Amino-Acid Sequence of NADP-Specific Glutamate Dehydrogenase of Neurospora crassa

    PubMed Central

    Wootton, John C.; Chambers, Geoffrey K.; Holder, Anthony A.; Baron, Andrew J.; Taylor, John G.; Fincham, John R. S.; Blumenthal, Kenneth M.; Moon, Kenneth; Smith, Emil L.

    1974-01-01

    A tentative primary structure of the NADP-specific glutamate dehydrogenase [L-glutamate: NADP oxidoreductase (deaminating), EC 1.4.1.4] from Neurospora crassa has been determined. The proposed sequence contains 452 amino-acid residues in each of the identical subunits of the hexameric enzyme. Comparison of the sequence with that of the bovine liver enzyme reveals considerable homology in the amino-terminal portion of the chain, including the vicinity of the reactive lysine, with only shorter stretches of homology within the carboxyl-terminal regions. The significance of this distribution of homologous regions is discussed. PMID:4155068

  11. Genetic diversity in two Japanese flounder populations from China seas inferred using microsatellite markers and COI sequences

    NASA Astrophysics Data System (ADS)

    Xu, Dongdong; Li, Sanlei; Lou, Bao; Zhang, Yurong; Zhan, Wei; Shi, Huilai

    2012-07-01

    Japanese flounder is one of the most important commercial species in China; however, information on the genetic background of natural populations in China seas is scarce. The lack of genetic data has hampered fishery management and aquaculture development programs for this species. In the present study, we have analyzed the genetic diversity in natural populations of Japanese flounder sampled from the Yellow Sea (Qingdao population, QD) and East China Sea (Zhoushan population, ZS) using 10 polymorphic microsatellite loci and cytochrome c oxidase subunit I (COI) sequencing data. A total of 68 different alleles were observed over 10 microsatellite loci. The total number of alleles per locus ranged from 2 to 9, and the number of genotypes per locus ranged from 3 to 45. The observed heterozygosity and expected heterozygosity in QD were 0.733 and 0.779, respectively, and in ZS the heterozygosity values were 0.708 and 0.783, respectively. Significant departures from Hardy-Weinberg equilibrium were observed in 7 of the 10 microsatellite loci in each of the two populations. The COI sequencing analysis revealed 25 polymorphic sites and 15 haplotypes in the two populations. The haplotype diversity and nucleotide diversity in the QD population were 0.746±0.0728 and 0.00334±0.00103, respectively, and in the ZS population the genetic diversity values were 0.712±0.0470 and 0.00318±0.00049, respectively. The microsatellite data (Fst = 0.0487, P < 0.001) and mitochondrial DNA data (Fst = 0.128, P < 0.001) both revealed significant genetic differentiation between the two populations. The information on the genetic variation and differentiation in Japanese flounder obtained in this study could be used to set up suitable guidelines for the management and conservation of this species, as well as for managing artificial selection programs. In future studies, more geographically diverse stocks should be used to obtain a deeper understanding of the population structure of Japanese

  12. A not-so-big crisis: re-reading Silurian conodont diversity in a sequence-stratigraphic framework

    NASA Astrophysics Data System (ADS)

    Jarochowska, Emilia; Munnecke, Axel

    2016-04-01

    Conodonts are extensively used in Ordovician through Triassic biostratigraphy and fossil-based geochemistry. However, their distribution in rock successions is commonly taken at face value, without taking into account their diverse and poorly understood ecology. Multielement taxonomy, ontogenetic and environmental variability, difficulties in extraction, and relative rarity all contribute to the general lack of quantitative studies on conodont stratigraphic distribution and temporal turnover. With respect to Silurian conodonts, the concept of recurrent conodont extinction events - the so-called Ireviken, Mulde and Lau events - has become a standard in the stratigraphic literature. The concept was proposed based on qualitative observations of local extirpations of open-marine pelagic or nekto-benthic taxa and temporary dominance of shallow-water species in the Silurian succession of the Swedish island of Gotland. These changes coincided with positive carbon isotope excursions, abrupt facies shifts, "blooms" of benthic fauna, and changes in reef communities, which have all been combined into a general view of Silurian bio-geochemical events. This view posits a deterministic, reproducible pattern in Silurian conodont diversity, attributed to recurrent ecological or geochemical conditions. The growing body of sequence-stratigraphic interpretations across these events in Gotland and other sections worldwide indicates that in all cases the Silurian "events" are associated with rapid global regressions. This suggests that faunal changes such as the dominance of shallow-water, low-diversity conodont fauna and the increase of benthic invertebrate diversity and abundance represent predictable consequences of the variation in the completeness of the rock record and preservation potential of different environments. Our studies in Poland and Ukraine indicate that the magnitude of change in the taxonomic composition of conodont assemblages across the middle Silurian global

  13. Seq2Logo: a method for construction and visualization of amino acid binding motifs and sequence profiles including sequence weighting, pseudo counts and two-sided representation of amino acid enrichment and depletion

    PubMed Central

    Thomsen, Martin Christen Frølund; Nielsen, Morten

    2012-01-01

    Seq2Logo is a web-based sequence logo generator. Sequence logos are a graphical representation of the information content stored in a multiple sequence alignment (MSA) and provide a compact and highly intuitive representation of the position-specific amino acid composition of binding motifs, active sites, etc. in biological sequences. Accurate generation of sequence logos is often compromised by sequence redundancy and a low number of observations. Moreover, most methods available for sequence logo generation focus on displaying the position-specific enrichment of amino acids, discarding the equally valuable information related to amino acid depletion. Seq2Logo aims to resolve these issues by allowing the user to include sequence weighting to correct for data redundancy, pseudo counts to correct for a low number of observations, and different logotype representations, each capturing different aspects of amino acid enrichment and depletion. Besides allowing input in the format of peptides and MSA, Seq2Logo accepts input as Blast sequence profiles, providing easy access for non-expert end-users to characterize and identify functionally conserved/variable amino acids in any given protein of interest. The output from the server is a sequence logo and a PSSM. Seq2Logo is available at http://www.cbs.dtu.dk/biotools/Seq2Logo (14 May 2012, date last accessed). PMID:22638583
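
    The information content that a sequence logo displays can be illustrated with a minimal sketch. This is not Seq2Logo's own implementation: the abstract does not specify its weighting or pseudo count scheme, so the code below assumes a uniform background over the 20 standard amino acids, a simple additive pseudo count, and no sequence weighting. The function names are hypothetical.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def column_information(column, pseudocount=1.0):
    """Information content (bits) of one alignment column.

    Assumes a uniform background and an additive pseudo count per
    residue; both are simplifications chosen for illustration.
    """
    # Count only standard residues; gaps and unknown symbols are ignored.
    counts = Counter(aa for aa in column if aa in AMINO_ACIDS)
    total = sum(counts.values()) + pseudocount * len(AMINO_ACIDS)
    if total == 0:
        return 0.0  # empty column and no pseudo counts: nothing to measure
    entropy = 0.0
    for aa in AMINO_ACIDS:
        p = (counts[aa] + pseudocount) / total
        if p > 0:
            entropy -= p * math.log2(p)
    # Maximum entropy for 20 residues minus the observed entropy.
    return math.log2(len(AMINO_ACIDS)) - entropy


def alignment_information(sequences, pseudocount=1.0):
    """Per-position information content for equal-length aligned sequences."""
    return [column_information(col, pseudocount) for col in zip(*sequences)]


if __name__ == "__main__":
    peptides = ["AC", "AD", "AE", "AF"]  # toy alignment, not real data
    print(alignment_information(peptides, pseudocount=0.0))
    print(alignment_information(peptides, pseudocount=1.0))
```

    With few observations, the pseudo count pulls each column toward the background distribution, so a conserved position scores lower than it would from raw counts alone.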

  14. Seq2Logo: a method for construction and visualization of amino acid binding motifs and sequence profiles including sequence weighting, pseudo counts and two-sided representation of amino acid enrichment and depletion.

    PubMed

    Thomsen, Martin Christen Frølund; Nielsen, Morten

    2012-07-01

    Seq2Logo is a web-based sequence logo generator. Sequence logos are a graphical representation of the information content stored in a multiple sequence alignment (MSA) and provide a compact and highly intuitive representation of the position-specific amino acid composition of binding motifs, active sites, etc. in biological sequences. Accurate generation of sequence logos is often compromised by sequence redundancy and a low number of observations. Moreover, most methods available for sequence logo generation focus on displaying the position-specific enrichment of amino acids, discarding the equally valuable information related to amino acid depletion. Seq2Logo aims to resolve these issues by allowing the user to include sequence weighting to correct for data redundancy, pseudo counts to correct for a low number of observations, and different logotype representations, each capturing different aspects of amino acid enrichment and depletion. Besides allowing input in the format of peptides and MSA, Seq2Logo accepts input as Blast sequence profiles, providing easy access for non-expert end-users to characterize and identify functionally conserved/variable amino acids in any given protein of interest. The output from the server is a sequence logo and a PSSM. Seq2Logo is available at http://www.cbs.dtu.dk/biotools/Seq2Logo (14 May 2012, date last accessed).

  15. Sequence of the rhesus monkey T-cell receptor β chain diversity and joining loci

    SciTech Connect

    Cheynier, R.; Henrichwark, S.; Wain-Hobson, S.

    1996-06-01

    Rhesus monkeys are frequently used as animal models for human diseases, most notably for simian immunodeficiency virus (SIV) infection and simian AIDS. An analysis of HIV proviruses and HIV-specific cytotoxic T cells in splenic white pulps relied heavily on the analysis of rearranged TCRBV sequences. The spleens were derived from patients with drug-insensitive idiopathic thrombocytopenic purpura and were frequently taken at an advanced stage of disease. In order to obtain some insight into the balance of forces between the virus and the immune system during earlier stages of infection, one must inevitably turn to the SIV/macaque AIDS model. As a prerequisite to undertaking similar virological and immunological studies, the nucleotide sequence of the macaque TCRBJ loci had to be established. 9 refs., 4 figs., 1 tab.

  16. Sequencing of diverse mandarin, pummelo and orange genomes reveals complex history of admixture during citrus domestication.

    PubMed

    Wu, G Albert; Prochnik, Simon; Jenkins, Jerry; Salse, Jerome; Hellsten, Uffe; Murat, Florent; Perrier, Xavier; Ruiz, Manuel; Scalabrin, Simone; Terol, Javier; Takita, Marco Aurélio; Labadie, Karine; Poulain, Julie; Couloux, Arnaud; Jabbari, Kamel; Cattonaro, Federica; Del Fabbro, Cristian; Pinosio, Sara; Zuccolo, Andrea; Chapman, Jarrod; Grimwood, Jane; Tadeo, Francisco R; Estornell, Leandro H; Muñoz-Sanz, Juan V; Ibanez, Victoria; Herrero-Ortega, Amparo; Aleza, Pablo; Pérez-Pérez, Julián; Ramón, Daniel; Brunel, Dominique; Luro, François; Chen, Chunxian; Farmerie, William G; Desany, Brian; Kodira, Chinnappa; Mohiuddin, Mohammed; Harkins, Tim; Fredrikson, Karin; Burns, Paul; Lomsadze, Alexandre; Borodovsky, Mark; Reforgiato, Giuseppe; Freitas-Astúa, Juliana; Quetier, Francis; Navarro, Luis; Roose, Mikeal; Wincker, Patrick; Schmutz, Jeremy; Morgante, Michele; Machado, Marcos Antonio; Talon, Manuel; Jaillon, Olivier; Ollitrault, Patrick; Gmitter, Frederick; Rokhsar, Daniel

    2014-07-01

    Cultivated citrus are selections from, or hybrids of, wild progenitor species whose identities and contributions to citrus domestication remain controversial. Here we sequence and compare citrus genomes--a high-quality reference haploid clementine genome and mandarin, pummelo, sweet-orange