Science.gov

Sample records for acid sequence reveals

  1. Diversity of trypsins in the Mediterranean corn borer Sesamia nonagrioides (Lepidoptera: Noctuidae), revealed by nucleic acid sequences and enzyme purification.

    PubMed

    Díaz-Mendoza, M; Ortego, F; García de Lacoba, M; Magaña, C; de la Poza, M; Farinós, G P; Castañera, P; Hernández-Crespo, P

    2005-09-01

    The existence of a diverse trypsin gene family with a main role in the proteolytic digestion process has been proved in vertebrate and invertebrate organisms. In lepidopteran insects, a diversity of trypsin-like genes expressed in midgut has also been identified. Genomic DNA and cDNA trypsin-like sequences expressed in the Mediterranean corn borer (MCB), Sesamia nonagrioides, midgut are reported in this paper. A phylogenetic analysis revealed that at least three types of trypsin-like enzymes putatively involved in digestion are conserved in MCB and other lepidopteran species. As expected, a diversity of sequences has been found, including four type-I (two subtypes), four type-II (two subtypes) and one type-III. In parallel, four different trypsins have been purified from midgut lumen of late instar MCB larvae. N-terminal sequencing and mass spectrometric analyses of purified trypsins have been performed in order to identify cDNAs coding for major trypsins among the diversity of trypsin-like sequences obtained. Thus, it is revealed that the four purified trypsins in MCB belong to the three well-defined phylogenetic groups of trypsin-like sequences detected in Lepidoptera. Major active trypsins present in late instar MCB lumen guts are trypsin-I (type-I), trypsin-IIA and trypsin-IIB (type-II), and trypsin-III (type-III). Trypsin-I, trypsin-IIA and trypsin-III showed preference for Arg over Lys, but responded differently to proteinaceous or synthetic inhibitors. As full-length cDNA clones coding for the purified trypsins were available, three-dimensional protein models were built in order to study the implication of specific residues on their response to inhibitors. Thus, it is predicted that Arg73, conserved in type-I lepidopteran trypsins, may favour reversible inhibition by E-64. Indeed, the substitution of Val213Cys, unique for type-II lepidopteran trypsins, may be responsible for their specific inhibition by HgCl2. The implication of these results on the

  2. Locked nucleic acids (LNAs) reveal sequence requirements and kinetics of Xist RNA localization to the X chromosome

    PubMed Central

    Sarma, Kavitha; Levasseur, Pierre; Aristarkhov, Alexander; Lee, Jeannie T.

    2010-01-01

    A large fraction of the mammalian genome is transcribed into long noncoding RNAs. The RNAs remain largely uncharacterized as the field awaits new technologies to aid functional analysis. Here, we describe a unique use of locked nucleic acids (LNAs) for studying nuclear long noncoding RNA, an RNA subclass that has been less amenable to traditional knockdown techniques. We target LNAs at Xist RNA and show displacement from the X chromosome with fast kinetics. Xist transcript stability is not affected. By targeting different Xist regions, we identify a localization domain and show that polycomb repressive complex 2 (PRC2) is displaced together with Xist. Thus, PRC2 depends on RNA for both initial targeting to and stable association with chromatin. H3K27-trimethyl marks and gene silencing remain stable. Time-course analysis of RNA relocalization suggests that Xist and PRC2 bind to different regions of the X at the same time but do not reach saturating levels immediately. Thus, LNAs provide a tool for studying an emerging class of regulatory RNA and offer a window of opportunity to target epigenetic modifications with possible therapeutic applications. PMID:21135235

  3. Composition for nucleic acid sequencing

    DOEpatents

    Korlach, Jonas; Webb, Watt W.; Levene, Michael; Turner, Stephen; Craighead, Harold G.; Foquet, Mathieu

    2008-08-26

    The present invention is directed to a method of sequencing a target nucleic acid molecule having a plurality of bases. In its principle, the temporal order of base additions during the polymerization reaction is measured on a molecule of nucleic acid, i.e. the activity of a nucleic acid polymerizing enzyme on the template nucleic acid molecule to be sequenced is followed in real time. The sequence is deduced by identifying which base is being incorporated into the growing complementary strand of the target nucleic acid by the catalytic activity of the nucleic acid polymerizing enzyme at each step in the sequence of base additions. A polymerase on the target nucleic acid molecule complex is provided in a position suitable to move along the target nucleic acid molecule and extend the oligonucleotide primer at an active site. A plurality of labelled types of nucleotide analogs are provided proximate to the active site, with each distinguishable type of nucleotide analog being complementary to a different nucleotide in the target nucleic acid sequence. The growing nucleic acid strand is extended by using the polymerase to add a nucleotide analog to the nucleic acid strand at the active site, where the nucleotide analog being added is complementary to the nucleotide of the target nucleic acid at the active site. The nucleotide analog added to the oligonucleotide primer as a result of the polymerizing step is identified. The steps of providing labelled nucleotide analogs, polymerizing the growing nucleic acid strand, and identifying the added nucleotide analog are repeated so that the nucleic acid strand is further extended and the sequence of the target nucleic acid is determined.

  4. High speed nucleic acid sequencing

    SciTech Connect

    Korlach, Jonas; Webb, Watt W.; Levene, Michael; Turner, Stephen; Craighead, Harold G.; Foquet, Mathieu

    2011-05-17

    The present invention is directed to a method of sequencing a target nucleic acid molecule having a plurality of bases. In its principle, the temporal order of base additions during the polymerization reaction is measured on a molecule of nucleic acid. Each type of labeled nucleotide comprises an acceptor fluorophore attached to a phosphate portion of the nucleotide such that the fluorophore is removed upon incorporation into a growing strand. Fluorescent signal is emitted via fluorescent resonance energy transfer between the donor fluorophore and the acceptor fluorophore as each nucleotide is incorporated into the growing strand. The sequence is deduced by identifying which base is being incorporated into the growing strand.

  5. Transcriptome sequencing revealed the transcriptional organization at ribosome-mediated attenuation sites in Corynebacterium glutamicum and identified a novel attenuator involved in aromatic amino acid biosynthesis.

    PubMed

    Neshat, Armin; Mentz, Almut; Rückert, Christian; Kalinowski, Jörn

    2014-11-20

    The Gram-positive bacterium Corynebacterium glutamicum belongs to the order Corynebacteriales and is used as a producer of amino acids at industrial scales. Due to its economic importance, gene expression and particularly the regulation of amino acid biosynthesis has been investigated extensively. Applying the high-resolution technique of transcriptome sequencing (RNA-seq), recently a vast amount of data has been generated that was used to comprehensively analyze the C. glutamicum transcriptome. By analyzing RNA-seq data from a small RNA cDNA library of C. glutamicum, short transcripts in the known transcriptional attenuator sites of the trp operon, the ilvBNC operon and the leuA gene were verified. Furthermore, whole transcriptome RNA-seq data were used to elucidate the transcriptional organization of these three amino acid biosynthesis operons. In addition, we discovered and analyzed the novel attenuator aroR, located upstream of the aroF gene (cg1129). The DAHP synthase encoded by aroF catalyzes the first step in aromatic amino acid synthesis. The AroR leader peptide contains the amino acid sequence motif F-Y-F, indicating a regulatory effect by phenylalanine and tyrosine. Analysis by real-time RT-PCR suggests that the attenuator regulates the transcription of aroF depending on the cellular amount of tRNA loaded with phenylalanine when comparing a phenylalanine-auxotrophic C. glutamicum mutant fed with limiting and excess amounts of a phenylalanine-containing dipeptide. Additionally, the very interesting finding was made that all analyzed attenuators are leaderless transcripts.

  6. High genetic diversity among strains of the unindustrialized lactic acid bacterium Carnobacterium maltaromaticum in dairy products as revealed by multilocus sequence typing.

    PubMed

    Rahman, Abdur; Cailliez-Grimal, Catherine; Bontemps, Cyril; Payot, Sophie; Chaillou, Stéphane; Revol-Junelles, Anne-Marie; Borges, Frédéric

    2014-07-01

    Dairy products are colonized with three main classes of lactic acid bacteria (LAB): opportunistic bacteria, traditional starters, and industrial starters. Most of the population structure studies were previously performed with LAB species belonging to these three classes and give interesting knowledge about the population structure of LAB at the stage where they are already industrialized. However, these studies give little information about the population structure of LAB prior to their use as an industrial starter. Carnobacterium maltaromaticum is a LAB colonizing diverse environments, including dairy products. Since this bacterium was discovered relatively recently, it is not yet commercialized as an industrial starter, which makes C. maltaromaticum an interesting model for the study of unindustrialized LAB population structure in dairy products. A multilocus sequence typing scheme based on an analysis of fragments of the genes dapE, ddlA, glpQ, ilvE, pyc, pyrE, and leuS was applied to a collection of 47 strains, including 28 strains isolated from dairy products. The scheme allowed detecting 36 sequence types with a discriminatory index of 0.98. The whole population was clustered in four deeply branched lineages, in which the dairy strains were spread. Moreover, the dairy strains could exhibit a high diversity within these lineages, leading to an overall dairy population with a diversity level as high as that of the nondairy population. These results are in agreement with the hypothesis according to which the industrialization of LAB leads to a diversity reduction in dairy products.
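
    The abstract reports a discriminatory index but does not say which formulation was used. As a minimal sketch, assuming the Hunter-Gaston (Simpson-type) index that is common in typing studies, the calculation could look like the following. The function name and the sequence-type labels are hypothetical and are not data from the study.

```python
from collections import Counter


def discriminatory_index(sequence_types):
    """Hunter-Gaston index: probability that two strains drawn at random
    (without replacement) belong to different sequence types.

    Assumption: this is the index meant by the abstract; the abstract
    itself does not name the formula.
    """
    counts = Counter(sequence_types)
    n = sum(counts.values())
    if n < 2:
        raise ValueError("need at least two strains")
    same_type_pairs = sum(c * (c - 1) for c in counts.values())
    return 1.0 - same_type_pairs / (n * (n - 1))


# Hypothetical sequence-type assignments for seven strains (illustration only).
example = ["ST1", "ST1", "ST2", "ST3", "ST3", "ST3", "ST4"]
index_value = discriminatory_index(example)
```

    An index near 1 means almost every pair of strains is distinguished by the scheme; an index of 0 means all strains share one sequence type.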

  7. Ancient DNA sequence revealed by error-correcting codes

    PubMed Central

    Brandão, Marcelo M.; Spoladore, Larissa; Faria, Luzinete C. B.; Rocha, Andréa S. L.; Silva-Filho, Marcio C.; Palazzo, Reginaldo

    2015-01-01

    A previously described DNA sequence generator algorithm (DNA-SGA) using error-correcting codes has been employed as a computational tool to address the evolutionary pathway of the genetic code. The code-generated sequence alignment demonstrated that a residue mutation revealed by the code can be found in the same position in sequences of distantly related taxa. Furthermore, the code-generated sequences do not promote amino acid changes in the deviant genomes through codon reassignment. A Bayesian evolutionary analysis of both code-generated and homologous sequences of the Arabidopsis thaliana malate dehydrogenase gene indicates an approximately 1 MYA divergence time from the MDH code-generated sequence node to its paralogous sequences. The DNA-SGA helps to determine the plesiomorphic state of DNA sequences because a single nucleotide alteration often occurs in distantly related taxa and can be found in the alternative codon patterns of noncanonical genetic codes. As a consequence, the algorithm may reveal an earlier stage of the evolution of the standard code. PMID:26159228

  8. Piriform spider silk sequences reveal unique repetitive elements.

    PubMed

    Perry, David J; Bittencourt, Daniela; Siltberg-Liberles, Jessica; Rech, Elibio L; Lewis, Randolph V

    2010-11-08

    Orb-weaving spider silk fibers are assembled from very large, highly repetitive proteins. The repeated segments contain, in turn, short, simple, and repetitive amino acid motifs that account for the physical and mechanical properties of the assembled fiber. Of the six orb-weaver silk fibroins, the piriform silk that makes the attachment discs, which lashes the joints of the web and attaches dragline silk to surfaces, has not been previously characterized. Piriform silk protein cDNAs were isolated from phage libraries of three species: A. trifasciata, N. clavipes, and N. cruentata. The deduced amino acid sequences from these genes revealed two new repetitive motifs: an alternating proline motif, where every other amino acid is proline, and a glutamine-rich motif of 6-8 amino acids. Similar to other spider silk proteins, the repeated segments are large (>200 amino acids) and highly homogenized within a species. There is also substantial sequence similarity across the genes from the three species, with particular conservation of the repetitive motifs. Northern blot analysis revealed that the mRNA is larger than 11 kb and is expressed exclusively in the piriform glands of the spider. Phylogenetic analysis of the C-terminal regions of the new proteins with published spidroins robustly shows that the piriform sequences form an ortholog group.

  9. Nemertean Toxin Genes Revealed through Transcriptome Sequencing

    PubMed Central

    Whelan, Nathan V.; Kocot, Kevin M.; Santos, Scott R.; Halanych, Kenneth M.

    2014-01-01

    Nemerteans are one of few animal groups that have evolved the ability to utilize toxins for both defense and subduing prey, but little is known about specific nemertean toxins. In particular, no study has identified specific toxin genes even though peptide toxins are known from some nemertean species. Information about toxin genes is needed to better understand evolution of toxins across animals and possibly provide novel targets for pharmaceutical and industrial applications. We sequenced and annotated transcriptomes of two free-living and one commensal nemertean and annotated an additional six publicly available nemertean transcriptomes to identify putative toxin genes. Approximately 63–74% of predicted open reading frames in each transcriptome were annotated with gene names, and all species had similar percentages of transcripts annotated with each higher-level GO term. Every nemertean analyzed possessed genes with high sequence similarities to known animal toxins including those from stonefish, cephalopods, and sea anemones. One toxin-like gene found in all nemerteans analyzed had high sequence similarity to Plancitoxin-1, a DNase II hepatotoxin that may function well at low pH, which suggests that the acidic body walls of some nemerteans could work to enhance the efficacy of protein toxins. The highest number of toxin-like genes found in any one species was seven and the lowest was three. The diversity of toxin-like nemertean genes found here is greater than previously documented, and these animals are likely an ideal system for exploring toxin evolution and industrial applications of toxins. PMID:25432940

  10. Sequence tagging reveals unexpected modifications in toxicoproteomics.

    PubMed

    Dasari, Surendra; Chambers, Matthew C; Codreanu, Simona G; Liebler, Daniel C; Collins, Ben C; Pennington, Stephen R; Gallagher, William M; Tabb, David L

    2011-02-18

    Toxicoproteomic samples are rich in posttranslational modifications (PTMs) of proteins. Identifying these modifications via standard database searching can incur significant performance penalties. Here, we describe the latest developments in TagRecon, an algorithm that leverages inferred sequence tags to identify modified peptides in toxicoproteomic data sets. TagRecon identifies known modifications more effectively than the MyriMatch database search engine. TagRecon outperformed state of the art software in recognizing unanticipated modifications from LTQ, Orbitrap, and QTOF data sets. We developed user-friendly software for detecting persistent mass shifts from samples. We follow a three-step strategy for detecting unanticipated PTMs in samples. First, we identify the proteins present in the sample with a standard database search. Next, identified proteins are interrogated for unexpected PTMs with a sequence tag-based search. Finally, additional evidence is gathered for the detected mass shifts with a refinement search. Application of this technology on toxicoproteomic data sets revealed unintended cross-reactions between proteins and sample processing reagents. Twenty-five proteins in rat liver showed signs of oxidative stress when exposed to potentially toxic drugs. These results demonstrate the value of mining toxicoproteomic data sets for modifications.

  11. Differences in acid tolerance between Bifidobacterium breve BB8 and its acid-resistant derivative B. breve BB8dpH, revealed by RNA-sequencing and physiological analysis.

    PubMed

    Yang, Xu; Hang, Xiaomin; Tan, Jing; Yang, Hong

    2015-06-01

    Bifidobacteria are common inhabitants of the human gastrointestinal tract, and their application has increased dramatically in recent years due to their health-promoting effects. The ability of bifidobacteria to tolerate acidic environments is particularly important for their function as probiotics because they encounter such environments in food products and during passage through the gastrointestinal tract. In this study, we generated a derivative, Bifidobacterium breve BB8dpH, which displayed a stable, acid-resistant phenotype. To investigate the possible reasons for the higher acid tolerance of B. breve BB8dpH, as compared with its parental strain B. breve BB8, a combined transcriptome and physiological approach was used to characterize differences between the two strains. An analysis of the transcriptome by RNA-sequencing indicated that the expression of 121 genes was increased by more than 2-fold, while the expression of 146 genes was reduced more than 2-fold, in B. breve BB8dpH. Validation of the RNA-sequencing data using real-time quantitative PCR analysis demonstrated that the RNA-sequencing results were highly reliable. The comparison analysis, based on differentially expressed genes, suggested that the acid tolerance of B. breve BB8dpH was enhanced by regulating the expression of genes involved in carbohydrate transport and metabolism, energy production, synthesis of cell envelope components (peptidoglycan and exopolysaccharide), synthesis and transport of glutamate and glutamine, and histidine synthesis. Furthermore, an analysis of physiological data showed that B. breve BB8dpH displayed higher production of exopolysaccharide and lower H(+)-ATPase activity than B. breve BB8. The results presented here will improve our understanding of acid tolerance in bifidobacteria, and they will lead to the development of new strategies to enhance the acid tolerance of bifidobacterial strains.
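
    The 2-fold thresholds mentioned in the abstract can be illustrated with a small sketch. The gene names and expression values below are hypothetical, and the abstract does not describe the normalization or statistical testing actually used, so this only shows the thresholding step on already-normalized values.

```python
import math


def classify_by_fold_change(expr_parent, expr_derivative, threshold=2.0):
    """Split genes into up- and down-regulated lists by fold change.

    Assumption: both inputs map gene name -> normalized expression value.
    Genes missing from either strain, or with non-positive values, are skipped.
    """
    cutoff = math.log2(threshold)
    up, down = [], []
    for gene, base in expr_parent.items():
        derived = expr_derivative.get(gene)
        if derived is None or base <= 0 or derived <= 0:
            continue
        log2_fc = math.log2(derived / base)
        if log2_fc > cutoff:       # increased by more than `threshold`-fold
            up.append(gene)
        elif log2_fc < -cutoff:    # reduced by more than `threshold`-fold
            down.append(gene)
    return up, down


# Hypothetical values for three placeholder genes (not data from the study).
parent = {"geneA": 10.0, "geneB": 10.0, "geneC": 10.0}
derivative = {"geneA": 25.0, "geneB": 4.0, "geneC": 15.0}
up_genes, down_genes = classify_by_fold_change(parent, derivative)
```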

  12. Direct sequencing of the human microbiome readily reveals community differences

    PubMed Central

    2010-01-01

    Culture-independent studies of human microbiota by direct genomic sequencing reveal quite distinct differences among communities, indicating that improved sequencing capacity can be most wisely utilized to study more samples, rather than more sequences per sample. PMID:20441597

  13. Chip-based sequencing nucleic acids

    DOEpatents

    Beer, Neil Reginald

    2014-08-26

    A system for fast DNA sequencing by amplification of genetic material within microreactors, denaturing, demulsifying, and then sequencing the material, while retaining it in a PCR/sequencing zone by a magnetic field. One embodiment includes sequencing nucleic acids on a microchip that includes a microchannel flow channel in the microchip. The nucleic acids are isolated and hybridized to magnetic nanoparticles or to magnetic polystyrene-coated beads. Microreactor droplets are formed in the microchannel flow channel. The microreactor droplets containing the nucleic acids and the magnetic nanoparticles are retained in a magnetic trap in the microchannel flow channel and sequenced.

  14. High-Quality Draft Genome Sequence of Kallotenue papyrolyticum JKG1T Reveals Broad Heterotrophic Capacity Focused on Carbohydrate and Amino Acid Metabolism.

    PubMed

    Hedlund, Brian P; Murugapiran, Senthil K; Huntemann, Marcel; Clum, Alicia; Pillay, Manoj; Palaniappan, Krishnaveni; Varghese, Neha; Mikhailova, Natalia; Stamatis, Dimitrios; Reddy, T B K; Ngan, Chew Yee; Daum, Chris; Duffy, Kecia; Shapiro, Nicole; Markowitz, Victor; Ivanova, Natalia; Kyrpides, Nikos; Williams, Amanda J; Cole, Jessica K; Dodsworth, Jeremy A; Woyke, Tanja

    2015-12-03

    The draft genome of Kallotenue papyrolyticum JKG1(T), a member of the order Kallotenuales, class Chloroflexia, consists of 4,475,263 bp in 4 contigs and encodes 4,010 predicted genes, 49 tRNA-encoding genes, and 3 rRNA operons. The genome is consistent with a heterotrophic lifestyle including catabolism of polysaccharides and amino acids.

  15. High-Quality Draft Genome Sequence of Kallotenue papyrolyticum JKG1T Reveals Broad Heterotrophic Capacity Focused on Carbohydrate and Amino Acid Metabolism

    PubMed Central

    Murugapiran, Senthil K.; Huntemann, Marcel; Clum, Alicia; Pillay, Manoj; Palaniappan, Krishnaveni; Varghese, Neha; Mikhailova, Natalia; Stamatis, Dimitrios; Reddy, T. B. K.; Ngan, Chew Yee; Daum, Chris; Duffy, Kecia; Shapiro, Nicole; Markowitz, Victor; Ivanova, Natalia; Kyrpides, Nikos; Williams, Amanda J.; Cole, Jessica K.; Dodsworth, Jeremy A.; Woyke, Tanja

    2015-01-01

    The draft genome of Kallotenue papyrolyticum JKG1T, a member of the order Kallotenuales, class Chloroflexia, consists of 4,475,263 bp in 4 contigs and encodes 4,010 predicted genes, 49 tRNA-encoding genes, and 3 rRNA operons. The genome is consistent with a heterotrophic lifestyle including catabolism of polysaccharides and amino acids. PMID:26634758

  16. Distinguishing Proteins From Arbitrary Amino Acid Sequences

    PubMed Central

    Yau, Stephen S.-T.; Mao, Wei-Guang; Benson, Max; He, Rong Lucy

    2015-01-01

    What kinds of amino acid sequences could possibly be protein sequences? From all existing databases that we can find, known proteins are only a small fraction of all possible combinations of amino acids. Beginning with Sanger's first detailed determination of a protein sequence in 1952, previous studies have focused on describing the structure of existing protein sequences in order to construct the protein universe. No one, however, has developed a criterion for determining whether an arbitrary amino acid sequence can be a protein. Here we show that when the collection of arbitrary amino acid sequences is viewed in an appropriate geometric context, the protein sequences cluster together. This leads to a new computational test, described here, that has proved to be remarkably accurate at determining whether an arbitrary amino acid sequence can be a protein. Even more, if the results of this test indicate that the sequence can be a protein, and it is indeed a protein sequence, then its identity as a protein sequence is uniquely defined. We anticipate our computational test will be useful for those who are attempting to complete the job of discovering all proteins, or constructing the protein universe. PMID:25609314

  17. Binning of shallowly sampled metagenomic sequence fragments reveals that low abundance bacteria play important roles in sulfur cycling and degradation of complex organic polymers in an acid mine drainage community

    NASA Astrophysics Data System (ADS)

    Dick, G. J.; Andersson, A.; Banfield, J. F.

    2007-12-01

    Our understanding of environmental microbiology has been greatly enhanced by community genome sequencing of DNA recovered directly from the environment. Community genomics provides insights into the diversity, community structure, metabolic function, and evolution of natural populations of uncultivated microbes, thereby revealing dynamics of how microorganisms interact with each other and their environment. Recent studies have demonstrated the potential for reconstructing near-complete genomes from natural environments while highlighting the challenges of analyzing community genomic sequence, especially from diverse environments. A major challenge of shotgun community genome sequencing is identification of DNA fragments from minor community members for which only low coverage of genomic sequence is present. We analyzed community genome sequence retrieved from biofilms in an acid mine drainage (AMD) system in the Richmond Mine at Iron Mountain, CA, with an emphasis on identification and assembly of DNA fragments from low-abundance community members. The Richmond mine hosts an extensive, relatively low diversity subterranean chemolithoautotrophic community that is sustained entirely by oxidative dissolution of pyrite. The activity of these microorganisms greatly accelerates the generation of AMD. Previous and ongoing work in our laboratory has focused on reconstructing genomes of dominant community members, including several bacteria and archaea. We binned contigs from several samples (including one new sample and two that had been previously analyzed) by tetranucleotide frequency with clustering by Self-Organizing Maps (SOM). The binning, evaluated by comparison with information from the manually curated assembly of the dominant organisms, was found to be very effective: fragments were correctly assigned with 95% accuracy. Improperly assigned fragments often contained sequences that are either evolutionarily constrained (e.g. 16S rRNA genes) or mobile elements that are
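
    The tetranucleotide-frequency step named in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: the function name is hypothetical, reverse-complement merging and any normalization beyond relative frequency are omitted, and the SOM clustering stage is not shown.

```python
from itertools import product

# All 256 possible tetranucleotides in a fixed order, so every contig
# maps to a vector with the same layout.
KMERS = ["".join(p) for p in product("ACGT", repeat=4)]
INDEX = {kmer: i for i, kmer in enumerate(KMERS)}


def tetranucleotide_frequencies(contig):
    """Return a 256-element list of relative tetranucleotide frequencies.

    Windows containing characters other than A, C, G, T (for example N)
    are skipped. A contig with no valid window yields an all-zero vector.
    """
    seq = contig.upper()
    counts = [0] * len(KMERS)
    total = 0
    for i in range(len(seq) - 3):
        idx = INDEX.get(seq[i:i + 4])
        if idx is not None:
            counts[idx] += 1
            total += 1
    if total == 0:
        return [0.0] * len(KMERS)
    return [c / total for c in counts]


# Toy contig for illustration only.
vector = tetranucleotide_frequencies("ACGTACGT")
```

    Vectors of this kind, one per contig, are what a clustering method would receive as input.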

  18. The complete amino acid sequence of prochymosin.

    PubMed Central

    Foltmann, B; Pedersen, V B; Jacobsen, H; Kauffman, D; Wybrandt, G

    1977-01-01

    The total sequence of 365 amino acid residues in bovine prochymosin is presented. Alignment with the amino acid sequence of porcine pepsinogen shows that 204 amino acid residues are common to the two zymogens. Further comparison and alignment with the amino acid sequence of penicillopepsin shows that 66 residues are located at identical positions in all three proteases. The three enzymes belong to a large group of proteases with two aspartate residues in the active center. This group forms a family derived from one common ancestor. PMID:329280

  19. Biomedical Impact of Splicing Mutations Revealed through Exome Sequencing

    PubMed Central

    Taneri, Bahar; Asilmaz, Esra; Gaasterland, Terry

    2012-01-01

    Splicing is a cellular mechanism, which dictates eukaryotic gene expression by removing the noncoding introns and ligating the coding exons in the form of a messenger RNA molecule. Alternative splicing (AS) adds a major level of complexity to this mechanism and thus to the regulation of gene expression. This widespread cellular phenomenon generates multiple messenger RNA isoforms from a single gene, by utilizing alternative splice sites and promoting different exon–intron inclusions and exclusions. AS greatly increases the coding potential of eukaryotic genomes and hence contributes to the diversity of eukaryotic proteomes. Mutations that lead to disruptions of either constitutive splicing or AS cause several diseases, among which are myotonic dystrophy and cystic fibrosis. Aberrant splicing is also well established in cancer states. Identification of rare novel mutations associated with splice-site recognition, and splicing regulation in general, could provide further insight into genetic mechanisms of rare diseases. Here, disease relevance of aberrant splicing is reviewed, and the new methodological approach of starting from disease phenotype, employing exome sequencing and identifying rare mutations affecting splicing regulation is described. Exome sequencing has emerged as a reliable method for finding sequence variations associated with various disease states. To date, genetic studies using exome sequencing to find disease-causing mutations have focused on the discovery of nonsynonymous single nucleotide polymorphisms that alter amino acids or introduce early stop codons, or on the use of exome sequencing as a means to genotype known single nucleotide polymorphisms. The involvement of splicing mutations in inherited diseases has received little attention and thus likely occurs more frequently than currently estimated. Studies of exome sequencing followed by molecular and bioinformatic analyses have great potential to reveal the high impact of splicing

  20. Amino acid sequence repertoire of the bacterial proteome and the occurrence of untranslatable sequences

    PubMed Central

    Navon, Sharon Penias; Kornberg, Guy; Chen, Jin; Schwartzman, Tali; Tsai, Albert; Puglisi, Elisabetta Viani; Puglisi, Joseph D.; Adir, Noam

    2016-01-01

    Bioinformatic analysis of Escherichia coli proteomes revealed that all possible amino acid triplet sequences occur at their expected frequencies, with four exceptions. Two of the four underrepresented sequences (URSs) were shown to interfere with translation in vivo and in vitro. Enlarging the URS by a single amino acid resulted in increased translational inhibition. Single-molecule methods revealed stalling of translation at the entrance of the peptide exit tunnel of the ribosome, adjacent to ribosomal nucleotides A2062 and U2585. Interaction with these same ribosomal residues is involved in regulation of translation by longer, naturally occurring protein sequences. The E. coli exit tunnel has evidently evolved to minimize interaction with the exit tunnel and maximize the sequence diversity of the proteome, although allowing some interactions for regulatory purposes. Bioinformatic analysis of the human proteome revealed no underrepresented triplet sequences, possibly reflecting an absence of regulation by interaction with the exit tunnel. PMID:27307442
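
    The abstract does not state how expected triplet frequencies were derived. A minimal sketch, assuming the expectation is the product of single-residue frequencies (independence between positions), could compare observed and expected counts as below. The function name and the toy sequences are hypothetical.

```python
from collections import Counter
from itertools import product


def triplet_observed_expected(proteins):
    """Map each possible amino acid triplet to its observed/expected ratio.

    Assumption: the expected count is total_triplets * p(a) * p(b) * p(c),
    with p taken from overall residue frequencies. Triplets never observed
    get a ratio of 0.0, so underrepresented triplets remain visible.
    """
    residue_counts = Counter()
    triplet_counts = Counter()
    for seq in proteins:
        residue_counts.update(seq)
        triplet_counts.update(seq[i:i + 3] for i in range(len(seq) - 2))

    total_residues = sum(residue_counts.values())
    total_triplets = sum(triplet_counts.values())
    ratios = {}
    for combo in product(sorted(residue_counts), repeat=3):
        probability = 1.0
        for residue in combo:
            probability *= residue_counts[residue] / total_residues
        expected = probability * total_triplets
        if expected > 0:
            ratios["".join(combo)] = triplet_counts["".join(combo)] / expected
    return ratios


# Toy two-letter alphabet for illustration only.
ratios = triplet_observed_expected(["ABAB"])
```

    A ratio well below 1 flags a triplet as underrepresented relative to this simple null model; a real analysis would also need a significance test.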

  1. Amino acid sequence repertoire of the bacterial proteome and the occurrence of untranslatable sequences.

    PubMed

    Navon, Sharon Penias; Kornberg, Guy; Chen, Jin; Schwartzman, Tali; Tsai, Albert; Puglisi, Elisabetta Viani; Puglisi, Joseph D; Adir, Noam

    2016-06-28

    Bioinformatic analysis of Escherichia coli proteomes revealed that all possible amino acid triplet sequences occur at their expected frequencies, with four exceptions. Two of the four underrepresented sequences (URSs) were shown to interfere with translation in vivo and in vitro. Enlarging the URS by a single amino acid resulted in increased translational inhibition. Single-molecule methods revealed stalling of translation at the entrance of the peptide exit tunnel of the ribosome, adjacent to ribosomal nucleotides A2062 and U2585. Interaction with these same ribosomal residues is involved in regulation of translation by longer, naturally occurring protein sequences. The E. coli exit tunnel has evidently evolved to minimize interaction with the exit tunnel and maximize the sequence diversity of the proteome, although allowing some interactions for regulatory purposes. Bioinformatic analysis of the human proteome revealed no underrepresented triplet sequences, possibly reflecting an absence of regulation by interaction with the exit tunnel.

  2. Method for sequencing nucleic acid molecules

    DOEpatents

    Korlach, Jonas; Webb, Watt W.; Levene, Michael; Turner, Stephen; Craighead, Harold G.; Foquet, Mathieu

    2006-05-30

    The present invention is directed to a method of sequencing a target nucleic acid molecule having a plurality of bases. In its principle, the temporal order of base additions during the polymerization reaction is measured on a molecule of nucleic acid, i.e. the activity of a nucleic acid polymerizing enzyme on the template nucleic acid molecule to be sequenced is followed in real time. The sequence is deduced by identifying which base is being incorporated into the growing complementary strand of the target nucleic acid by the catalytic activity of the nucleic acid polymerizing enzyme at each step in the sequence of base additions. A polymerase on the target nucleic acid molecule complex is provided in a position suitable to move along the target nucleic acid molecule and extend the oligonucleotide primer at an active site. A plurality of labelled types of nucleotide analogs are provided proximate to the active site, with each distinguishable type of nucleotide analog being complementary to a different nucleotide in the target nucleic acid sequence. The growing nucleic acid strand is extended by using the polymerase to add a nucleotide analog to the nucleic acid strand at the active site, where the nucleotide analog being added is complementary to the nucleotide of the target nucleic acid at the active site. The nucleotide analog added to the oligonucleotide primer as a result of the polymerizing step is identified. The steps of providing labelled nucleotide analogs, polymerizing the growing nucleic acid strand, and identifying the added nucleotide analog are repeated so that the nucleic acid strand is further extended and the sequence of the target nucleic acid is determined.

  3. Method for sequencing nucleic acid molecules

    DOEpatents

    Korlach, Jonas; Webb, Watt W.; Levene, Michael; Turner, Stephen; Craighead, Harold G.; Foquet, Mathieu

    2006-06-06

    The present invention is directed to a method of sequencing a target nucleic acid molecule having a plurality of bases. In its principle, the temporal order of base additions during the polymerization reaction is measured on a molecule of nucleic acid, i.e. the activity of a nucleic acid polymerizing enzyme on the template nucleic acid molecule to be sequenced is followed in real time. The sequence is deduced by identifying which base is being incorporated into the growing complementary strand of the target nucleic acid by the catalytic activity of the nucleic acid polymerizing enzyme at each step in the sequence of base additions. A polymerase on the target nucleic acid molecule complex is provided in a position suitable to move along the target nucleic acid molecule and extend the oligonucleotide primer at an active site. A plurality of labelled types of nucleotide analogs are provided proximate to the active site, with each distinguishable type of nucleotide analog being complementary to a different nucleotide in the target nucleic acid sequence. The growing nucleic acid strand is extended by using the polymerase to add a nucleotide analog to the nucleic acid strand at the active site, where the nucleotide analog being added is complementary to the nucleotide of the target nucleic acid at the active site. The nucleotide analog added to the oligonucleotide primer as a result of the polymerizing step is identified. The steps of providing labelled nucleotide analogs, polymerizing the growing nucleic acid strand, and identifying the added nucleotide analog are repeated so that the nucleic acid strand is further extended and the sequence of the target nucleic acid is determined.

  4. Phenolic acid esterases, coding sequences and methods

    DOEpatents

    Blum, David L.; Kataeva, Irina; Li, Xin-Liang; Ljungdahl, Lars G.

    2002-01-01

    Described herein are four phenolic acid esterases, three of which correspond to domains of previously unknown function within bacterial xylanases, from XynY and XynZ of Clostridium thermocellum and from a xylanase of Ruminococcus. The fourth specifically exemplified xylanase is a protein encoded within the genome of Orpinomyces PC-2. The amino acids of these polypeptides and nucleotide sequences encoding them are provided. Recombinant host cells, expression vectors and methods for the recombinant production of phenolic acid esterases are also provided.

  5. Deep sequencing reveals 50 novel genes for recessive cognitive disorders.

    PubMed

    Najmabadi, Hossein; Hu, Hao; Garshasbi, Masoud; Zemojtel, Tomasz; Abedini, Seyedeh Sedigheh; Chen, Wei; Hosseini, Masoumeh; Behjati, Farkhondeh; Haas, Stefan; Jamali, Payman; Zecha, Agnes; Mohseni, Marzieh; Püttmann, Lucia; Vahid, Leyla Nouri; Jensen, Corinna; Moheb, Lia Abbasi; Bienek, Melanie; Larti, Farzaneh; Mueller, Ines; Weissmann, Robert; Darvish, Hossein; Wrogemann, Klaus; Hadavi, Valeh; Lipkowitz, Bettina; Esmaeeli-Nieh, Sahar; Wieczorek, Dagmar; Kariminejad, Roxana; Firouzabadi, Saghar Ghasemi; Cohen, Monika; Fattahi, Zohreh; Rost, Imma; Mojahedi, Faezeh; Hertzberg, Christoph; Dehghan, Atefeh; Rajab, Anna; Banavandi, Mohammad Javad Soltani; Hoffer, Julia; Falah, Masoumeh; Musante, Luciana; Kalscheuer, Vera; Ullmann, Reinhard; Kuss, Andreas Walter; Tzschach, Andreas; Kahrizi, Kimia; Ropers, H Hilger

    2011-09-21

    Common diseases are often complex because they are genetically heterogeneous, with many different genetic defects giving rise to clinically indistinguishable phenotypes. This has been amply documented for early-onset cognitive impairment, or intellectual disability, one of the most complex disorders known and a very important health care problem worldwide. More than 90 different gene defects have been identified for X-chromosome-linked intellectual disability alone, but research into the more frequent autosomal forms of intellectual disability is still in its infancy. To expedite the molecular elucidation of autosomal-recessive intellectual disability, we have now performed homozygosity mapping, exon enrichment and next-generation sequencing in 136 consanguineous families with autosomal-recessive intellectual disability from Iran and elsewhere. This study, the largest published so far, has revealed additional mutations in 23 genes previously implicated in intellectual disability or related neurological disorders, as well as single, probably disease-causing variants in 50 novel candidate genes. Proteins encoded by several of these genes interact directly with products of known intellectual disability genes, and many are involved in fundamental cellular processes such as transcription and translation, cell-cycle control, energy metabolism and fatty-acid synthesis, which seem to be pivotal for normal brain development and function.

  6. Method for identifying and quantifying nucleic acid sequence aberrations

    DOEpatents

    Lucas, J.N.; Straume, T.; Bogen, K.T.

    1998-07-21

    A method is disclosed for detecting nucleic acid sequence aberrations by detecting nucleic acid sequences having both a first and a second nucleic acid sequence type, the presence of the first and second sequence type on the same nucleic acid sequence indicating the presence of a nucleic acid sequence aberration. The method uses a first hybridization probe which includes a nucleic acid sequence that is complementary to a first sequence type and a first complexing agent capable of attaching to a second complexing agent and a second hybridization probe which includes a nucleic acid sequence that selectively hybridizes to the second nucleic acid sequence type over the first sequence type and includes a detectable marker for detecting the second hybridization probe. 11 figs.

  7. Method for identifying and quantifying nucleic acid sequence aberrations

    DOEpatents

    Lucas, Joe N.; Straume, Tore; Bogen, Kenneth T.

    1998-01-01

    A method for detecting nucleic acid sequence aberrations by detecting nucleic acid sequences having both a first and a second nucleic acid sequence type, the presence of the first and second sequence type on the same nucleic acid sequence indicating the presence of a nucleic acid sequence aberration. The method uses a first hybridization probe which includes a nucleic acid sequence that is complementary to a first sequence type and a first complexing agent capable of attaching to a second complexing agent and a second hybridization probe which includes a nucleic acid sequence that selectively hybridizes to the second nucleic acid sequence type over the first sequence type and includes a detectable marker for detecting the second hybridization probe.

  8. Methods for analyzing nucleic acid sequences

    DOEpatents

    Korlach, Jonas; Webb, Watt W.; Levene, Michael; Turner, Stephen; Craighead, Harold G.; Foquet, Mathieu

    2011-05-17

    The present invention is directed to a method of sequencing a target nucleic acid. The method provides a complex comprising a polymerase enzyme, a target nucleic acid molecule, and a primer, wherein the complex is immobilized on a support. A fluorescent label is attached to a terminal phosphate group of the nucleotide or nucleotide analog. The growing nucleic acid strand is extended by using the polymerase to add a nucleotide analog to the nucleic acid strand. The nucleotide analog added to the oligonucleotide primer as a result of the polymerizing step is identified. The signal from labeled nucleotides or nucleotide analogs that become incorporated is distinguished from that of freely diffusing labels by its longer duration: incorporated nucleotides are retained longer in the observation volume than freely diffusing labels.
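
    The duration criterion in this abstract can be illustrated with a short Python sketch. The pulse format `(start_ms, end_ms, label)`, the function name `call_incorporations` and the threshold value are hypothetical; the abstract gives no numeric cut-off.

```python
def call_incorporations(pulses, min_duration_ms):
    """Keep labels of pulses that last at least min_duration_ms.

    pulses: iterable of (start_ms, end_ms, label). Short pulses are treated
    as freely diffusing labels passing through the observation volume; long
    pulses are treated as incorporation events. The threshold is an assumed
    tuning parameter.
    """
    calls = []
    for start, end, label in sorted(pulses):
        if end - start >= min_duration_ms:
            calls.append(label)
    return calls
```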

  9. Structural gene and complete amino acid sequence of Vibrio alginolyticus collagenase.

    PubMed Central

    Takeuchi, H; Shibano, Y; Morihara, K; Fukushima, J; Inami, S; Keil, B; Gilles, A M; Kawamoto, S; Okuda, K

    1992-01-01

    The DNA encoding the collagenase of Vibrio alginolyticus was cloned, and its complete nucleotide sequence was determined. When the cloned gene was ligated to pUC18, the Escherichia coli expression vector, bacteria carrying the gene exhibited both collagenase antigen and collagenase activity. The open reading frame from the ATG initiation codon was 2442 bp in length for the collagenase structural gene. The amino acid sequence, deduced from the nucleotide sequence, revealed that the mature collagenase consists of 739 amino acids with an Mr of 81875. The amino acid sequences of 20 polypeptide fragments were completely identical with the deduced amino acid sequences of the collagenase gene. The amino acid composition predicted from the DNA sequence was similar to the chemically determined composition of purified collagenase reported previously. The analyses of both the DNA and amino acid sequences of the collagenase gene were rigorously performed, but we could not detect any significant sequence similarity to other collagenases. PMID:1311172

  10. Next Generation Sequencing Reveals the Hidden Diversity of Zooplankton Assemblages

    PubMed Central

    Harmer, Rachel A.; Somerfield, Paul J.; Atkinson, Angus

    2013-01-01

    Background: Zooplankton play an important role in our oceans, in biogeochemical cycling and providing a food source for commercially important fish larvae. However, difficulties in correctly identifying zooplankton hinder our understanding of their roles in marine ecosystem functioning, and can prevent detection of long term changes in their community structure. The advent of massively parallel next generation sequencing technology allows DNA sequence data to be recovered directly from whole community samples. Here we assess the ability of such sequencing to quantify richness and diversity of a mixed zooplankton assemblage from a productive time series site in the Western English Channel. Methodology/Principal Findings: Plankton net hauls (200 µm) were taken at the Western Channel Observatory station L4 in September 2010 and January 2011. These samples were analysed by microscopy and metagenetic analysis of the 18S nuclear small subunit ribosomal RNA gene using the 454 pyrosequencing platform. Following quality control a total of 419,041 sequences were obtained for all samples. The sequences clustered into 205 operational taxonomic units using a 97% similarity cut-off. Allocation of taxonomy by comparison with the National Centre for Biotechnology Information database identified 135 OTUs to species level, 11 to genus level and 1 to order; <2.5% of sequences were classified as unknowns. By comparison, a skilled microscopic analyst was able to routinely enumerate only 58 taxonomic groups. Conclusions: Metagenetics reveals a previously hidden taxonomic richness, especially for Copepoda and hard-to-identify meroplankton such as Bivalvia, Gastropoda and Polychaeta. It also reveals rare species and parasites. We conclude that Next Generation Sequencing of 18S amplicons is a powerful tool for elucidating the true diversity and species richness of zooplankton communities. While this approach allows for broad diversity assessments of plankton it may become increasingly
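
    Clustering reads into operational taxonomic units at a 97% similarity cut-off can be sketched as follows. This is not the pipeline used in the study, which the abstract does not name. It is a simplified greedy scheme that assumes the reads are already aligned to equal length, and the names `identity` and `greedy_otus` are hypothetical.

```python
def identity(a, b):
    """Fraction of matching positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)

def greedy_otus(seqs, cutoff=0.97):
    """Assign each sequence to the first OTU whose centroid it matches.

    The first member of each OTU acts as its centroid. A sequence that
    matches no existing centroid at or above the cutoff founds a new OTU.
    """
    centroids = []
    otus = []
    for seq in seqs:
        for i, centroid in enumerate(centroids):
            if identity(seq, centroid) >= cutoff:
                otus[i].append(seq)
                break
        else:
            centroids.append(seq)
            otus.append([seq])
    return otus
```

    The result of a greedy scheme depends on input order, which is why production tools usually sort reads (for example by abundance) before clustering.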

  11. Active site amino acid sequence of human factor D.

    PubMed

    Davis, A E

    1980-08-01

    Factor D was isolated from human plasma by chromatography on CM-Sephadex C50, Sephadex G-75, and hydroxylapatite. Digestion of reduced, S-carboxymethylated factor D with cyanogen bromide resulted in three peptides which were isolated by chromatography on Sephadex G-75 (superfine) equilibrated in 20% formic acid. NH2-Terminal sequences were determined by automated Edman degradation with a Beckman 890C sequencer using a 0.1 M Quadrol program. The smallest peptide (CNBr III) consisted of the NH2-terminal 14 amino acids. The other two peptides had molecular weights of 17,000 (CNBr I) and 7000 (CNBr II). Overlap of the NH2-terminal sequence of factor D with the NH2-terminal sequence of CNBr I established the order of the peptides. The NH2-terminal 53 residues of factor D are somewhat more homologous with the group-specific protease of rat intestine than with other serine proteases. The NH2-terminal sequence of CNBr II revealed the active site serine of factor D. The typical serine protease active site sequence (Gly-Asp-Ser-Gly-Gly-Pro) was found at residues 12-17. The region surrounding the active site serine does not appear to be more highly homologous with any one of the other serine proteases. The structural data obtained point out the similarities between factor D and the other proteases. However, complete definition of the degree of relationship between factor D and other proteases will require determination of the remainder of the primary structure.
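
    Locating the active site motif Gly-Asp-Ser-Gly-Gly-Pro in a peptide can be sketched in Python. The helper names are hypothetical, and any peptide passed in for testing is an invented placeholder, since the abstract does not give the CNBr II sequence.

```python
# Three-letter to one-letter codes for the residues in the motif.
THREE_TO_ONE = {"Gly": "G", "Asp": "D", "Ser": "S", "Pro": "P"}

def to_one_letter(motif):
    """Convert a hyphenated three-letter motif to one-letter code."""
    return "".join(THREE_TO_ONE[residue] for residue in motif.split("-"))

def motif_positions(sequence, motif):
    """Return 1-based start positions of every occurrence of motif."""
    positions = []
    start = sequence.find(motif)
    while start != -1:
        positions.append(start + 1)
        start = sequence.find(motif, start + 1)
    return positions
```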

  12. Mitochondrial Genome Sequences Effectively Reveal the Phylogeny of Hylobates Gibbons

    PubMed Central

    Chan, Yi-Chiao; Roos, Christian; Inoue-Murayama, Miho; Inoue, Eiji; Shih, Chih-Chin; Pei, Kurtis Jai-Chyi; Vigilant, Linda

    2010-01-01

    Background: Uniquely among hominoids, gibbons exist as multiple geographically contiguous taxa exhibiting distinctive behavioral, morphological, and karyotypic characteristics. However, our understanding of the evolutionary relationships of the various gibbons, especially among Hylobates species, is still limited because previous studies used limited taxon sampling or short mitochondrial DNA (mtDNA) sequences. Here we use mtDNA genome sequences to reconstruct gibbon phylogenetic relationships and reveal the pattern and timing of divergence events in gibbon evolutionary history. Methodology/Principal Findings: We sequenced the mitochondrial genomes of 51 individuals representing 11 species belonging to three genera (Hylobates, Nomascus and Symphalangus) using the high-throughput 454 sequencing system with the parallel tagged sequencing approach. Three phylogenetic analyses (maximum likelihood, Bayesian analysis and neighbor-joining) depicted the gibbon phylogenetic relationships congruently and with strong support values. Most notably, we recover a well-supported phylogeny of the Hylobates gibbons. The estimation of divergence times using Bayesian analysis with relaxed clock model suggests a much more rapid speciation process in Hylobates than in Nomascus. Conclusions/Significance: Use of more than 15 kb sequences of the mitochondrial genome provided more informative and robust data than previous studies of short mitochondrial segments (e.g., control region or cytochrome b) as shown by the reliable reconstruction of divergence patterns among Hylobates gibbons. Moreover, molecular dating of the mitogenomic divergence times implied that biogeographic change during the last five million years may be a factor promoting the speciation of Sundaland animals, including Hylobates species. PMID:21203450

  13. Sequencing of Seven Haloarchaeal Genomes Reveals Patterns of Genomic Flux

    PubMed Central

    Lynch, Erin A.; Langille, Morgan G. I.; Darling, Aaron; Wilbanks, Elizabeth G.; Haltiner, Caitlin; Shao, Katie S. Y.; Starr, Michael O.; Teiling, Clotilde; Harkins, Timothy T.; Edwards, Robert A.; Eisen, Jonathan A.; Facciotti, Marc T.

    2012-01-01

    We report the sequencing of seven genomes from two haloarchaeal genera, Haloferax and Haloarcula. Ease of cultivation and the existence of well-developed genetic and biochemical tools for several diverse haloarchaeal species make haloarchaea a model group for the study of archaeal biology. The unique physiological properties of these organisms also make them good candidates for novel enzyme discovery for biotechnological applications. Seven genomes were sequenced to ∼20× coverage and assembled to an average of 50 contigs (range 5 scaffolds - 168 contigs). Comparisons of protein-coding gene complements revealed large-scale differences in COG functional group enrichment between these genera. Analysis of genes encoding machinery for DNA metabolism reveals genera-specific expansions of the general transcription factor TATA binding protein as well as a history of extensive duplication and horizontal transfer of the proliferating cell nuclear antigen. Insights gained from this study emphasize the importance of haloarchaea for investigation of archaeal biology. PMID:22848480

  14. 77 FR 65537 - Requirements for Patent Applications Containing Nucleotide Sequence and/or Amino Acid Sequence...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-10-29

    ... Amino Acid Sequence Disclosures ACTION: Proposed collection; comment request. SUMMARY: The United States....'' SUPPLEMENTARY INFORMATION: I. Abstract Patent applications that contain nucleotide and/or amino acid...

  15. Amino acid sequence of bovine heart coupling factor 6.

    PubMed Central

    Fang, J K; Jacobs, J W; Kanner, B I; Racker, E; Bradshaw, R A

    1984-01-01

    The amino acid sequence of bovine heart mitochondrial coupling factor 6 (F6) has been determined by automated Edman degradation of the whole protein and derived peptides. Preparations based on heat precipitation and ethanol extraction showed allotypic variation at three positions while material further purified by HPLC yielded only one sequence that also differed by a Phe-Thr replacement at residue 62. The mature protein contains 76 amino acids with a calculated molecular weight of 9006 and a pI of approximately 5, in good agreement with experimentally measured values. The charged amino acids are mainly clustered at the termini and in one section in the middle; these three polar segments are separated by two segments relatively rich in nonpolar residues. Chou-Fasman analysis suggests three stretches of alpha-helix coinciding with (or within) the high-charge-density sequences, with a single beta-turn at the first polar-nonpolar junction. Comparison of the F6 sequence with those of other proteins did not reveal any homologous structures. PMID:6149548

  16. Efficient analysis of mouse genome sequences reveal many nonsense variants

    PubMed Central

    Steeland, Sophie; Timmermans, Steven; Van Ryckeghem, Sara; Hulpiau, Paco; Saeys, Yvan; Van Montagu, Marc; Vandenbroucke, Roosmarijn E.; Libert, Claude

    2016-01-01

    Genetic polymorphisms in coding genes play an important role when using mouse inbred strains as research models. They have been shown to influence research results, explain phenotypical differences between inbred strains, and increase the amount of interesting gene variants present in the many available inbred lines. SPRET/Ei is an inbred strain derived from Mus spretus that has ∼1% sequence difference with the C57BL/6J reference genome. We obtained a listing of all SNPs and insertions/deletions (indels) present in SPRET/Ei from the Mouse Genomes Project (Wellcome Trust Sanger Institute) and processed these data to obtain an overview of all transcripts having nonsynonymous coding sequence variants. We identified 8,883 unique variants affecting 10,096 different transcripts from 6,328 protein-coding genes, which is about 28% of all coding genes. Because only a subset of these variants results in drastic changes in proteins, we focused on variations that are nonsense mutations that ultimately resulted in a gain of a stop codon. These genes were identified by in silico changing the C57BL/6J coding sequences to the SPRET/Ei sequences, converting them to amino acid (AA) sequences, and comparing the AA sequences. All variants and transcripts affected were also stored in a database, which can be browsed using a SPRET/Ei M. spretus variants web tool (www.spretus.org), including a manual. We validated the tool by demonstrating the loss of function of three proteins predicted to be severely truncated, namely Fas, IRAK2, and IFNγR1. PMID:27147605
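
    The in silico step described in this abstract (substituting variant bases into the reference coding sequence, translating both versions and comparing the amino acid sequences) can be sketched in Python. The sketch handles only single-nucleotide substitutions, not indels, and the function names and variant format are assumptions rather than the authors' code.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def translate(cds):
    """Translate a coding sequence codon by codon; '*' marks a stop."""
    usable = len(cds) - len(cds) % 3
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, usable, 3))

def apply_snps(cds, snps):
    """snps: {0-based position: alternate base} (hypothetical format)."""
    seq = list(cds)
    for pos, alt in snps.items():
        seq[pos] = alt
    return "".join(seq)

def gained_stops(ref_cds, snps):
    """Return 1-based codon positions where a variant introduces a stop."""
    ref_aa = translate(ref_cds)
    alt_aa = translate(apply_snps(ref_cds, snps))
    return [i + 1 for i, (r, a) in enumerate(zip(ref_aa, alt_aa))
            if a == "*" and r != "*"]
```

    For example, changing the C at 0-based position 3 of `ATGCAAGGTTAA` to T turns the second codon CAA (Gln) into the stop codon TAA.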

  17. Phylogenetic position of Myriapoda revealed by 454 transcriptome sequencing.

    PubMed

    Rehm, Peter; Meusemann, Karen; Borner, Janus; Misof, Bernhard; Burmester, Thorsten

    2014-08-01

    Myriapods had been considered closely allied to hexapods (insects and relatives). However, analyses of molecular sequence data have consistently placed Myriapoda either as a sister group of Pancrustacea, comprising crustaceans and hexapods, and thereby supporting the monophyly of Mandibulata, or retrieved Myriapoda as a sister group of Chelicerata (spiders, ticks, mites and allies). In addition, the relationships among the four myriapod groups (Pauropoda, Symphyla, Diplopoda, Chilopoda) are unclear. To resolve the phylogeny of myriapods and their relationship to other main arthropod groups, we collected transcriptome data from the symphylan Symphylella vulgaris, the centipedes Lithobius forficatus and Scolopendra dehaani, and the millipedes Polyxenus lagurus, Glomeris pustulata and Polydesmus angustus by 454 sequencing. We concatenated a multiple sequence alignment that contained 1550 orthologous single copy genes (1,109,847 amino acid positions) from 55 euarthropod and 14 outgroup taxa. The final selected alignment included 181 genes and 37,425 amino acid positions from 55 taxa, with eight myriapods and 33 other euarthropods. Bayesian analyses robustly recovered monophyletic Mandibulata, Pancrustacea and Myriapoda. Most analyses support a sister group relationship of Symphyla in respect to a clade comprising Chilopoda and Diplopoda. Inclusion of additional sequence data from nine myriapod species resulted in an alignment with poor data density, but broader taxon average. With this dataset we inferred Diplopoda+Pauropoda as closest relatives (i.e., Dignatha) and recovered monophyletic Helminthomorpha. Molecular clock calculations suggest an early Cambrian emergence of Myriapoda ∼513 million years ago and a late Cambrian divergence of myriapod classes. This implies a marine origin of the myriapods and independent terrestrialization events during myriapod evolution.

  18. Detection of nucleic acid sequences by invader-directed cleavage

    DOEpatents

    Brow, Mary Ann D.; Hall, Jeff Steven Grotelueschen; Lyamichev, Victor; Olive, David Michael; Prudent, James Robert

    1999-01-01

    The present invention relates to means for the detection and characterization of nucleic acid sequences, as well as variations in nucleic acid sequences. The present invention also relates to methods for forming a nucleic acid cleavage structure on a target sequence and cleaving the nucleic acid cleavage structure in a site-specific manner. The 5' nuclease activity of a variety of enzymes is used to cleave the target-dependent cleavage structure, thereby indicating the presence of specific nucleic acid sequences or specific variations thereof. The present invention further relates to methods and devices for the separation of nucleic acid molecules based on charge.

  19. Next generation sequencing in sporadic retinoblastoma patients reveals somatic mosaicism.

    PubMed

    Amitrano, Sara; Marozza, Annabella; Somma, Serena; Imperatore, Valentina; Hadjistilianou, Theodora; De Francesco, Sonia; Toti, Paolo; Galimberti, Daniela; Meloni, Ilaria; Cetta, Francesco; Piu, Pietro; Di Marco, Chiara; Dosa, Laura; Lo Rizzo, Caterina; Carignani, Giulia; Mencarelli, Maria Antonietta; Mari, Francesca; Renieri, Alessandra; Ariani, Francesca

    2015-11-01

    In about 50% of sporadic cases of retinoblastoma, no constitutive RB1 mutations are detected by conventional methods. However, recent research suggests that, at least in some of these cases, there is somatic mosaicism with respect to RB1 normal and mutant alleles. The increased availability of next generation sequencing improves our ability to detect the exact percentage of patients with mosaicism. Using this technology, we re-tested a series of 40 patients with sporadic retinoblastoma: 10 of them had been previously classified as constitutional heterozygotes, whereas in 30 no RB1 mutations had been found in lymphocytes. In 3 of these 30 patients, we have now identified low-level mosaic variants, varying in frequency between 8 and 24%. In 7 out of the 10 cases previously classified as heterozygous from testing blood cells, we were able to test additional tissues (ocular tissues, urine and/or oral mucosa): in three of them, next generation sequencing has revealed mosaicism. Present results thus confirm that a significant fraction (6/40; 15%) of sporadic retinoblastoma cases are due to postzygotic events and that deep sequencing is an efficient method to unambiguously distinguish mosaics. Re-testing of retinoblastoma patients through next generation sequencing can thus provide new information that may have important implications with respect to genetic counseling and family care.
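
    The mosaic variant frequencies quoted in this abstract (8 to 24%) are variant allele fractions, that is, the share of sequencing reads carrying the variant. A minimal Python sketch follows; the 0.35 upper bound used to separate a mosaic from a heterozygous call is an illustrative assumption, not a value from the study, and the function names are hypothetical.

```python
def variant_allele_fraction(alt_reads, total_reads):
    """Fraction of reads at a position that carry the variant allele."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return alt_reads / total_reads

def looks_mosaic(alt_reads, total_reads, upper=0.35):
    """Flag a variant as possibly mosaic.

    A constitutional heterozygous variant is expected near 0.5; a fraction
    clearly below that suggests mosaicism. The upper bound is an assumed
    threshold and would need calibration against sequencing error rates.
    """
    vaf = variant_allele_fraction(alt_reads, total_reads)
    return 0 < vaf < upper
```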

  20. Los Alamos sequence analysis package for nucleic acids and proteins.

    PubMed Central

    Kanehisa, M I

    1982-01-01

    An interactive system for computer analysis of nucleic acid and protein sequences has been developed for the Los Alamos DNA Sequence Database. It provides a convenient way to search or verify various sequence features, e.g., restriction enzyme sites, protein coding frames, and properties of coded proteins. Further, the comprehensive analysis package on a large-scale database can be used for comparative studies on sequence and structural homologies in order to find unnoted information stored in nucleic acid sequences. PMID:6174934
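
    Two of the sequence features named in this abstract, restriction enzyme sites and protein coding frames, can be searched for with a short Python sketch. It is not the Los Alamos package. The function names are hypothetical, and only the three forward reading frames are scanned.

```python
STOPS = {"TAA", "TAG", "TGA"}

def find_sites(dna, site):
    """Return 1-based start positions of a recognition site, e.g. GAATTC (EcoRI)."""
    width = len(site)
    return [i + 1 for i in range(len(dna) - width + 1) if dna[i:i + width] == site]

def forward_orfs(dna):
    """Return (frame, start, end) for each ATG...stop stretch.

    frame is 1, 2 or 3; start and end are 1-based and inclusive, with end
    pointing at the last base of the stop codon.
    """
    results = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOPS:
                results.append((frame + 1, start + 1, i + 3))
                start = None
    return results
```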

  1. Genome Sequencing Reveals a Phage in Helicobacter pylori

    PubMed Central

    Lehours, Philippe; Vale, Filipa F.; Bjursell, Magnus K.; Melefors, Ojar; Advani, Reza; Glavas, Steve; Guegueniat, Julia; Gontier, Etienne; Lacomme, Sabrina; Alves Matos, António; Menard, Armelle; Mégraud, Francis; Engstrand, Lars; Andersson, Anders F.

    2011-01-01

    Helicobacter pylori chronically infects the gastric mucosa in more than half of the human population; in a subset of this population, its presence is associated with development of severe disease, such as gastric cancer. Genomic analysis of several strains has revealed an extensive H. pylori pan-genome, likely to grow as more genomes are sampled. Here we describe the draft genome sequence (63 contigs; 26× mean coverage) of H. pylori strain B45, isolated from a patient with gastric mucosa-associated lymphoid tissue (MALT) lymphoma. The major finding was a 24.6-kb prophage integrated in the bacterial genome. The prophage shares most of its genes (22/27) with prophage region II of Helicobacter acinonychis strain Sheeba. After UV treatment of liquid cultures, circular DNA carrying the prophage integrase gene could be detected, and intracellular tailed phage-like particles were observed in H. pylori cells by transmission electron microscopy, indicating that phage production can be induced from the prophage. PCR amplification and sequencing of the integrase gene from 341 H. pylori strains from different geographic regions revealed a high prevalence of the prophage (21.4%). Phylogenetic reconstruction showed four distinct clusters in the integrase gene, three of which tended to be specific for geographic regions. Our study implies that phages may play important roles in the ecology and evolution of H. pylori. PMID:22086490

  2. Hybridization and sequencing of nucleic acids using base pair mismatches

    DOEpatents

    Fodor, Stephen P. A.; Lipshutz, Robert J.; Huang, Xiaohua

    2001-01-01

    Devices and techniques for hybridization of nucleic acids and for determining the sequence of nucleic acids. Arrays of nucleic acids are formed by techniques, preferably high resolution, light-directed techniques. Positions of hybridization of a target nucleic acid are determined by, e.g., epifluorescence microscopy. Devices and techniques are proposed to determine the sequence of a target nucleic acid more efficiently and more quickly through such synthesis and detection techniques.

  3. Comparative RNA sequencing reveals substantial genetic variation in endangered primates.

    PubMed

    Perry, George H; Melsted, Páll; Marioni, John C; Wang, Ying; Bainer, Russell; Pickrell, Joseph K; Michelini, Katelyn; Zehr, Sarah; Yoder, Anne D; Stephens, Matthew; Pritchard, Jonathan K; Gilad, Yoav

    2012-04-01

    Comparative genomic studies in primates have yielded important insights into the evolutionary forces that shape genetic diversity and revealed the likely genetic basis for certain species-specific adaptations. To date, however, these studies have focused on only a small number of species. For the majority of nonhuman primates, including some of the most critically endangered, genome-level data are not yet available. In this study, we have taken the first steps toward addressing this gap by sequencing RNA from the livers of multiple individuals from each of 16 mammalian species, including humans and 11 nonhuman primates. Of the nonhuman primate species, five are lemurs and two are lorisoids, for which little or no genomic data were previously available. To analyze these data, we developed a method for de novo assembly and alignment of orthologous gene sequences across species. We assembled an average of 5721 gene sequences per species and characterized diversity and divergence of both gene sequences and gene expression levels. We identified patterns of variation that are consistent with the action of positive or directional selection, including an 18-fold enrichment of peroxisomal genes among genes whose regulation likely evolved under directional selection in the ancestral primate lineage. Importantly, we found no relationship between genetic diversity and endangered status, with the two most endangered species in our study, the black and white ruffed lemur and the Coquerel's sifaka, having the highest genetic diversity among all primates. Our observations imply that many endangered lemur populations still harbor considerable genetic variation. Timely efforts to conserve these species alongside their habitats have, therefore, strong potential to achieve long-term success.

  4. Tertiary structural propensities reveal fundamental sequence/structure relationships.

    PubMed

    Zheng, Fan; Zhang, Jian; Grigoryan, Gevorg

    2015-05-05

    Extracting useful generalizations from the continually growing Protein Data Bank (PDB) is of central importance. We hypothesize that the PDB contains valuable quantitative information on the level of local tertiary structural motifs (TERMs). We show that by breaking a protein structure into its constituent TERMs, and querying the PDB to characterize the natural ensemble matching each, we can estimate the compatibility of the structure with a given amino acid sequence through a metric we term "structure score." Considering submissions from recent Critical Assessment of Structure Prediction (CASP) experiments, we found a strong correlation (R = 0.69) between structure score and model accuracy, with poorly predicted regions readily identifiable. This performance exceeds that of leading atomistic statistical energy functions. Furthermore, TERM-based analysis of two prototypical multi-state proteins rapidly produced structural insights fully consistent with prior extensive experimental studies. We thus find that TERM-based analysis should have considerable utility for protein structural biology.

  5. High throughput sequencing reveals a novel fabavirus infecting sweet cherry.

    PubMed

    Villamor, D E V; Pillai, S S; Eastwell, K C

    2017-03-01

    The genus Fabavirus currently consists of five species represented by viruses that infect a wide range of hosts but none reported from temperate climate fruit trees. A virus with genomic features resembling fabaviruses (tentatively named Prunus virus F, PrVF) was revealed by high throughput sequencing of extracts from a sweet cherry tree (Prunus avium). PrVF was subsequently shown to be graft transmissible and further identified in three other non-symptomatic Prunus spp. from different geographical locations. Two genetic variants of RNA1 and RNA2 coexisted in the same samples. RNA1 consisted of 6,165 and 6,163 nucleotides, and RNA2 consisted of 3,622 and 3,468 nucleotides.

  6. p53-Regulated Networks of Protein, mRNA, miRNA, and lncRNA Expression Revealed by Integrated Pulsed Stable Isotope Labeling With Amino Acids in Cell Culture (pSILAC) and Next Generation Sequencing (NGS) Analyses

    PubMed Central

    Hünten, Sabine; Kaller, Markus; Drepper, Friedel; Oeljeklaus, Silke; Bonfert, Thomas; Erhard, Florian; Dueck, Anne; Eichner, Norbert; Friedel, Caroline C.; Meister, Gunter; Zimmer, Ralf; Warscheid, Bettina; Hermeking, Heiko

    2015-01-01

    We determined the effect of p53 activation on de novo protein synthesis using quantitative proteomics (pulsed stable isotope labeling with amino acids in cell culture/pSILAC) in the colorectal cancer cell line SW480. This was combined with mRNA and noncoding RNA expression analyses by next generation sequencing (RNA-, miR-Seq). Furthermore, genome-wide DNA binding of p53 was analyzed by chromatin-immunoprecipitation (ChIP-Seq). Thereby, we identified differentially regulated proteins (542 up, 569 down), mRNAs (1258 up, 415 down), miRNAs (111 up, 95 down) and lncRNAs (270 up, 123 down). Changes in protein and mRNA expression levels showed a positive correlation (r = 0.50, p < 0.0001). In total, we detected 133 direct p53 target genes that were differentially expressed and displayed p53 occupancy in the vicinity of their promoter. More transcriptionally induced genes displayed occupied p53 binding sites (4.3% mRNAs, 7.2% miRNAs, 6.3% lncRNAs, 5.9% proteins) than repressed genes (2.4% mRNAs, 3.2% miRNAs, 0.8% lncRNAs, 1.9% proteins), suggesting indirect mechanisms of repression. Around 50% of the down-regulated proteins displayed seed-matching sequences of p53-induced miRNAs in the corresponding 3′-UTRs. Moreover, proteins repressed by p53 significantly overlapped with those previously shown to be repressed by miR-34a. We confirmed up-regulation of the novel direct p53 target genes LINC01021, MDFI, ST14 and miR-486 and showed that ectopic LINC01021 expression inhibits proliferation in SW480 cells. Furthermore, KLF12, HMGB1 and CIT mRNAs were confirmed as direct targets of the p53-induced miR-34a, miR-205 and miR-486–5p, respectively. In line with the loss of p53 function during tumor progression, elevated expression of KLF12, HMGB1 and CIT was detected in advanced stages of cancer. In conclusion, the integration of multiple omics methods allowed the comprehensive identification of direct and indirect effectors of p53 that provide new insights and leads into the

  7. Parallel Selection Revealed by Population Sequencing in Chicken

    PubMed Central

    Qanbari, Saber; Seidel, Michael; Strom, Tim-Mathias; Mayer, Klaus F.X.; Preisinger, Ruedi; Simianer, Henner

    2015-01-01

    Human-driven selection during domestication and subsequent breed formation has likely left detectable signatures within the genome of modern chicken. The elucidation of these signatures of selection is of interest from the perspective of evolutionary biology, and for identifying genes relevant to domestication and improvement that ultimately may help to further genetically improve this economically important animal. We used whole genome sequence data from 50 hens of commercial white (WL) and brown (BL) egg-laying chicken along with pool sequences of three meat-type chicken to perform a systematic screening of past selection in modern chicken. Evidence of positive selection was investigated in two steps. First, we explored evidence of parallel fixation in regions with overlapping elevated allele frequencies in replicated populations of layers and broilers, suggestive of selection during domestication or preimprovement ages. We confirmed parallel fixation in BCDO2 and TSHR genes and found four candidates including AGTR2, a gene heavily involved in “Ascites” in commercial birds. Next, we explored differentiated loci between layers and broilers suggestive of selection during improvement in chicken. This analysis revealed evidence of parallel differentiation in genes relevant to appearance and production traits exemplified with the candidate gene OPG, implicated in Osteoporosis, a disorder related to overconsumption of calcium in egg-laying hens. Our results illustrate the potential for population genetic techniques to identify genomic regions relevant to the phenotypes of importance to breeders. PMID:26568375

  8. Amino acid sequence of a mouse immunoglobulin mu chain.

    PubMed Central

    Kehry, M; Sibley, C; Fuhrman, J; Schilling, J; Hood, L E

    1979-01-01

    The complete amino acid sequence of the mouse mu chain from the BALB/c myeloma tumor MOPC 104E is reported. The C mu region contains four consecutive homology regions of approximately 110 residues and a COOH-terminal region of 19 residues. A comparison of this mu chain from mouse with a complete mu sequence from human (Ou) and a partial mu chain sequence from dog (Moo) reveals a striking gradient of increasing homology from the NH2-terminal to the COOH-terminal portion of these mu chains, with the former being the least and the latter the most highly conserved. Four of the five sites of carbohydrate attachment appear to be at identical residue positions when the constant regions of the mouse and human mu chains are compared. The mu chain of MOPC 104E has a carbohydrate moiety attached in the second hypervariable region. This is particularly interesting in view of the fact that MOPC 104E binds alpha-(1→3)-dextran, a simple carbohydrate. The structural and functional constraints imposed by these comparative sequence analyses are discussed. PMID:111247

  9. Methods and compositions for efficient nucleic acid sequencing

    DOEpatents

    Drmanac, Radoje

    2006-07-04

    Disclosed are novel methods and compositions for rapid and highly efficient nucleic acid sequencing based upon hybridization with two sets of small oligonucleotide probes of known sequences. Extremely large nucleic acid molecules, including chromosomes and non-amplified RNA, may be sequenced without prior cloning or subcloning steps. The methods of the invention also solve various current problems associated with sequencing technology such as, for example, high noise to signal ratios and difficult discrimination, attaching many nucleic acid fragments to a surface, preparing many, longer or more complex probes and labelling more species.
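
    The abstract above does not spell out how probe hybridization yields a sequence. As a rough illustration only, and not the patented two-probe-set method, the Python sketch below shows the textbook idea behind sequencing by hybridization: treat the set of short probes that bind as a k-mer spectrum and chain probes that overlap by k-1 bases. The function names, the example sequence, and the assumptions of an error-free spectrum with no repeated (k-1)-base overlaps are all hypothetical simplifications.

```python
def kmer_spectrum(sequence: str, k: int) -> set[str]:
    """Return the set of all length-k substrings (idealized probe hits)."""
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def reconstruct_from_spectrum(kmers: set[str]) -> str:
    """Chain probes that overlap by k-1 bases into a single sequence.

    Assumes an error-free spectrum in which every (k-1)-base prefix is
    unique, so each probe has at most one possible successor.
    """
    if not kmers:
        raise ValueError("empty spectrum")
    k = len(next(iter(kmers)))
    if k < 2 or any(len(kmer) != k for kmer in kmers):
        raise ValueError("all probes must share one length k >= 2")

    # Map each (k-1)-base prefix to the probe that starts with it.
    by_prefix = {kmer[:-1]: kmer for kmer in kmers}
    if len(by_prefix) != len(kmers):
        raise ValueError("ambiguous spectrum: two probes share a prefix")

    # The starting probe is the one whose prefix is no other probe's suffix.
    suffixes = {kmer[1:] for kmer in kmers}
    starts = [kmer for kmer in kmers if kmer[:-1] not in suffixes]
    if len(starts) != 1:
        raise ValueError("expected exactly one possible starting probe")

    sequence = starts[0]
    used = {starts[0]}
    while True:
        # Look up the probe whose prefix matches the current k-1 base tail.
        nxt = by_prefix.get(sequence[-(k - 1):])
        if nxt is None or nxt in used:
            break
        sequence += nxt[-1]  # each successor contributes one new base
        used.add(nxt)

    if used != kmers:
        raise ValueError("spectrum does not form a single chain")
    return sequence


if __name__ == "__main__":
    target = "ATGGCGTGCA"  # hypothetical example sequence
    print(reconstruct_from_spectrum(kmer_spectrum(target, 4)))
```

    Real hybridization data contain false positives and negatives, and repeated overlaps make the chain ambiguous, which is why this sketch raises an error rather than guessing in those cases.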

  10. Methods and compositions for efficient nucleic acid sequencing

    DOEpatents

    Drmanac, Radoje

    2002-01-01

    Disclosed are novel methods and compositions for rapid and highly efficient nucleic acid sequencing based upon hybridization with two sets of small oligonucleotide probes of known sequences. Extremely large nucleic acid molecules, including chromosomes and non-amplified RNA, may be sequenced without prior cloning or subcloning steps. The methods of the invention also solve various current problems associated with sequencing technology such as, for example, high noise to signal ratios and difficult discrimination, attaching many nucleic acid fragments to a surface, preparing many, longer or more complex probes and labelling more species.

  11. Kit for detecting nucleic acid sequences using competitive hybridization probes

    DOEpatents

    Lucas, Joe N.; Straume, Tore; Bogen, Kenneth T.

    2001-01-01

    A kit is provided for detecting a target nucleic acid sequence in a sample, the kit comprising: a first hybridization probe which includes a nucleic acid sequence that is sufficiently complementary to selectively hybridize to a first portion of the target sequence, the first hybridization probe including a first complexing agent for forming a binding pair with a second complexing agent; and a second hybridization probe which includes a nucleic acid sequence that is sufficiently complementary to selectively hybridize to a second portion of the target sequence to which the first hybridization probe does not selectively hybridize, the second hybridization probe including a detectable marker; a third hybridization probe which includes a nucleic acid sequence that is sufficiently complementary to selectively hybridize to a first portion of the target sequence, the third hybridization probe including the same detectable marker as the second hybridization probe; and a fourth hybridization probe which includes a nucleic acid sequence that is sufficiently complementary to selectively hybridize to a second portion of the target sequence to which the third hybridization probe does not selectively hybridize, the fourth hybridization probe including the first complexing agent for forming a binding pair with the second complexing agent; wherein the first and second hybridization probes are capable of simultaneously hybridizing to the target sequence and the third and fourth hybridization probes are capable of simultaneously hybridizing to the target sequence, the detectable marker is not present on the first or fourth hybridization probes and the first, second, third, and fourth hybridization probes each include a competitive nucleic acid sequence which is sufficiently complementary to a third portion of the target sequence that the competitive sequences of the first, second, third, and fourth hybridization probes compete with each other to hybridize to the third portion of the

  12. Replication Study: Melanoma genome sequencing reveals frequent PREX2 mutations

    PubMed Central

    Horrigan, Stephen K; Courville, Pascal; Sampey, Darryl; Zhou, Faren; Cai, Steve

    2017-01-01

    In 2015, as part of the Reproducibility Project: Cancer Biology, we published a Registered Report (Chroscinski et al., 2014) that described how we intended to replicate selected experiments from the paper "Melanoma genome sequencing reveals frequent PREX2 mutations" (Berger et al., 2012). Here we report the results of those experiments. We regenerated cells stably expressing ectopic wild-type and mutant phosphatidylinositol-3,4,5-trisphosphate-dependent Rac exchange factor 2 (PREX2) using the same immortalized human NRASG12D melanocytes as the original study. Evaluation of PREX2 expression in these newly generated stable cells revealed varying levels of expression among the PREX2 isoforms, which was also observed in the stable cells made in the original study (Figure S6A; Berger et al., 2012). Additionally, ectopically expressed PREX2 was found to be at least 5 times above endogenous PREX2 expression. The monitoring of tumor formation of these stable cells in vivo resulted in no statistically significant difference in tumor-free survival driven by PREX2 variants, whereas the original study reported that these PREX2 mutations increased the rate of tumor incidence compared to controls (Figure 3B and S6B; Berger et al., 2012). Surprisingly, the median tumor-free survival was 1 week in this replication attempt, while 70% of the control mice were reported to be tumor-free after 9 weeks in the original study. The rapid tumor onset observed in this replication attempt, compared to the original study, makes the detection of accelerated tumor growth in PREX2 expressing NRASG12D melanocytes extremely difficult. Finally, we report meta-analyses for each result. DOI: http://dx.doi.org/10.7554/eLife.21634.001 PMID:28100394

  13. Analysis and Annotation of Nucleic Acid Sequence

    SciTech Connect

    States, David J.

    2004-07-28

    The aims of this project were to develop improved methods for computational genome annotation and to apply these methods to improve the annotation of genomic sequence data with a specific focus on human genome sequencing. The project resulted in a substantial body of published work. Notable contributions of this project were the identification of basecalling and lane tracking as error processes in genome sequencing and contributions to improved methods for these steps in genome sequencing. This technology improved the accuracy and throughput of genome sequence analysis. Probabilistic methods for physical map construction were developed. Improved methods for sequence alignment, alternative splicing analysis, promoter identification and NF kappa B response gene prediction were also developed.

  14. Solid phase sequencing of double-stranded nucleic acids

    DOEpatents

    Fu, Dong-Jing; Cantor, Charles R.; Koster, Hubert; Smith, Cassandra L.

    2002-01-01

    This invention relates to methods for detecting and sequencing of target double-stranded nucleic acid sequences, to nucleic acid probes and arrays of probes useful in these methods, and to kits and systems which contain these probes. Useful methods involve hybridizing the nucleic acids or nucleic acids which represent complementary or homologous sequences of the target to an array of nucleic acid probes. These probes comprise a single-stranded portion, an optional double-stranded portion and a variable sequence within the single-stranded portion. The molecular weights of the hybridized nucleic acids of the set can be determined by mass spectroscopy, and the sequence of the target determined from the molecular weights of the fragments. Nucleic acids whose sequences can be determined include nucleic acids in biological samples such as patient biopsies and environmental samples. Probes may be fixed to a solid support such as a hybridization chip to facilitate automated determination of molecular weights and identification of the target sequence.

  15. Dipeptide Sequence Determination: Analyzing Phenylthiohydantoin Amino Acids by HPLC

    NASA Astrophysics Data System (ADS)

    Barton, Janice S.; Tang, Chung-Fei; Reed, Steven S.

    2000-02-01

    Amino acid composition and sequence determination, important techniques for characterizing peptides and proteins, are essential for predicting conformation and studying sequence alignment. This experiment presents improved, fundamental methods of sequence analysis for an upper-division biochemistry laboratory. Working in pairs, students use the Edman reagent to prepare phenylthiohydantoin derivatives of amino acids for determination of the sequence of an unknown dipeptide. With a single HPLC technique, students identify both the N-terminal amino acid and the composition of the dipeptide. This method yields good precision of retention times and allows use of a broad range of amino acids as components of the dipeptide. Students learn fundamental principles and techniques of sequence analysis and HPLC.
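
    The reasoning in this experiment is that knowing the N-terminal residue and the overall composition is enough to fix the order of a dipeptide. A minimal Python sketch of that deduction follows; the function name and the three-letter residue labels are assumptions made for illustration and are not taken from the published laboratory procedure.

```python
from collections import Counter


def dipeptide_sequence(n_terminal: str, composition: list[str]) -> tuple[str, str]:
    """Deduce (N-terminal, C-terminal) order from the two measurements.

    n_terminal:  residue identified as the PTH derivative after one Edman cycle
    composition: the two residues found in the dipeptide, in any order
    """
    if len(composition) != 2:
        raise ValueError("a dipeptide has exactly two residues")

    remaining = Counter(composition)
    if remaining[n_terminal] == 0:
        raise ValueError("N-terminal residue is not part of the composition")

    # Remove one copy of the N-terminal residue; whatever is left must be
    # the C-terminal residue (this also handles dipeptides such as Gly-Gly).
    remaining[n_terminal] -= 1
    (c_terminal,) = remaining.elements()
    return n_terminal, c_terminal


if __name__ == "__main__":
    print(dipeptide_sequence("Ala", ["Gly", "Ala"]))
```

    The same composition with a different N-terminal result gives the reversed dipeptide, which is why both measurements are needed.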

  16. Genome Sequencing Reveals the Origin of the Allotetraploid Arabidopsis suecica.

    PubMed

    Novikova, Polina Yu; Tsuchimatsu, Takashi; Simon, Samson; Nizhynska, Viktoria; Voronin, Viktor; Burns, Robin; Fedorenko, Olga M; Holm, Svante; Säll, Torbjörn; Prat, Elisa; Marande, William; Castric, Vincent; Nordborg, Magnus

    2017-04-01

    Polyploidy is an example of instantaneous speciation when it involves the formation of a new cytotype that is incompatible with the parental species. Because new polyploid individuals are likely to be rare, establishment of a new species is unlikely unless polyploids are able to reproduce through self-fertilization (selfing), or asexually. Conversely, selfing (or asexuality) makes it possible for polyploid species to originate from a single individual, a bona fide speciation event. The extent to which this happens is not known. Here, we consider the origin of Arabidopsis suecica, a selfing allopolyploid between Arabidopsis thaliana and Arabidopsis arenosa, which has hitherto been considered to be an example of a unique origin. Based on whole-genome re-sequencing of 15 natural A. suecica accessions, we identify ubiquitous shared polymorphism with the parental species, and hence conclusively reject a unique origin in favor of multiple founding individuals. We further estimate that the species originated after the last glacial maximum in Eastern Europe or central Eurasia (rather than Sweden, as the name might suggest). Finally, annotation of the self-incompatibility loci in A. suecica revealed that both loci carry non-functional alleles. The locus inherited from the selfing A. thaliana is fixed for an ancestral non-functional allele, whereas the locus inherited from the outcrossing A. arenosa is fixed for a novel loss-of-function allele. Furthermore, the allele inherited from A. thaliana is predicted to transcriptionally silence the allele inherited from A. arenosa, suggesting that loss of self-incompatibility may have been instantaneous.

  17. Amino acid sequence of mouse submaxillary gland renin.

    PubMed Central

    Misono, K S; Chang, J J; Inagami, T

    1982-01-01

    The complete amino acid sequences of the heavy chain and light chain of mouse submaxillary gland renin have been determined. The heavy chain consists of 288 amino acid residues having a Mr of 31,036 calculated from the sequence. The light chain contains 48 amino acid residues with a Mr of 5,458. The sequence of the heavy chain was determined by automated Edman degradations of the cyanogen bromide peptides and tryptic peptides generated after citraconylation, as well as other peptides generated therefrom. The sequence of the light chain was derived from sequence analyses of the peptides generated by cyanogen bromide cleavage or by digestion with Staphylococcus aureus protease. The sequences in the active site regions in renin containing two catalytically essential aspartyl residues 32 and 215 were found identical with those in pepsin, chymosin, and penicillopepsin. Comparison of the amino acid sequence of renin with that of porcine pepsin indicated a 42% sequence identity of the heavy chain with the amino-terminal and middle regions and a 46% identity of the light chain with the carboxyl-terminal region of the porcine pepsin sequence. Residues identical in renin and pepsin are distributed throughout the length of the molecules, suggesting a similarity in their overall structures. PMID:6812055
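
    The Mr values above are described as calculated from the sequence. As a minimal sketch of that kind of calculation (not the authors' own procedure), the Python below sums average residue masses and adds one water for the free termini. The function name and the mass table are assumptions introduced for illustration; the table holds standard average masses for unmodified residues, so glycosylation, disulfide bonds, and other modifications are ignored.

```python
# Average residue masses in daltons for the 20 standard amino acids,
# i.e. the free amino acid minus the water lost on peptide-bond formation.
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.01528  # one H2O accounts for the free N- and C-termini


def molecular_weight(sequence: str) -> float:
    """Return the average molecular mass of an unmodified polypeptide chain."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("empty sequence")
    unknown = set(seq) - set(RESIDUE_MASS)
    if unknown:
        raise ValueError(f"unsupported residue codes: {sorted(unknown)}")
    return sum(RESIDUE_MASS[aa] for aa in seq) + WATER


if __name__ == "__main__":
    # Hypothetical short peptide in one-letter code, for demonstration only.
    print(round(molecular_weight("GAVL"), 2))
```

    As a plausibility check on the reported figures, 31,036 over 288 residues and 5,458 over 48 residues correspond to mean residue masses of roughly 108 and 114 Da, within the range typical of proteins.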

  18. Innovations in host and microbial sialic acid biosynthesis revealed by phylogenomic prediction of nonulosonic acid structure

    PubMed Central

    Lewis, Amanda L.; Desa, Nolan; Hansen, Elizabeth E.; Knirel, Yuriy A.; Gordon, Jeffrey I.; Gagneux, Pascal; Nizet, Victor; Varki, Ajit

    2009-01-01

    Sialic acids (Sias) are nonulosonic acid (NulO) sugars prominently displayed on vertebrate cells and occasionally mimicked by bacterial pathogens using homologous biosynthetic pathways. It has been suggested that Sias were an animal innovation and later emerged in pathogens by convergent evolution or horizontal gene transfer. To better illuminate the evolutionary processes underlying the phenomenon of Sia molecular mimicry, we performed phylogenomic analyses of biosynthetic pathways for Sias and related higher sugars derived from 5,7-diamino-3,5,7,9-tetradeoxynon-2-ulosonic acids. Examination of ≈1,000 sequenced microbial genomes indicated that such biosynthetic pathways are far more widely distributed than previously realized. Phylogenetic analysis, validated by targeted biochemistry, was used to predict NulO types (i.e., neuraminic, legionaminic, or pseudaminic acids) expressed by various organisms. This approach uncovered previously unreported occurrences of Sia pathways in pathogenic and symbiotic bacteria and identified at least one instance in which a human archaeal symbiont tentatively reported to express Sias in fact expressed the related pseudaminic acid structure. Evaluation of targeted phylogenies and protein domain organization revealed that the “unique” Sia biosynthetic pathway of animals was instead a much more ancient innovation. Pathway phylogenies suggest that bacterial pathogens may have acquired Sia expression via adaptation of pathways for legionaminic acid biosynthesis, one of at least 3 evolutionary paths for de novo Sia synthesis. Together, these data indicate that some of the long-standing paradigms in Sia biology should be reconsidered in a wider evolutionary context of the extended family of NulO sugars. PMID:19666579

  19. Full-Length Isoform Sequencing Reveals Novel Transcripts and Substantial Transcriptional Overlaps in a Herpesvirus

    PubMed Central

    Tombácz, Dóra; Csabai, Zsolt; Oláh, Péter; Balázs, Zsolt; Likó, István; Zsigmond, Laura; Sharon, Donald; Snyder, Michael; Boldogkői, Zsolt

    2016-01-01

    Whole transcriptome studies have become essential for understanding the complexity of genetic regulation. However, the conventionally applied short-read sequencing platforms cannot be used to reliably distinguish between many transcript isoforms. The Pacific Biosciences (PacBio) RS II platform is capable of reading long nucleic acid stretches in a single sequencing run. The pseudorabies virus (PRV) is an excellent system to study herpesvirus gene expression and potential interactions between the transcriptional units. In this work, non-amplified and amplified isoform sequencing protocols were used to characterize the poly(A+) fraction of the lytic transcriptome of PRV, with the aim of a complete transcriptional annotation of the viral genes. The analyses revealed a previously unrecognized complexity of the PRV transcriptome including the discovery of novel protein-coding and non-coding genes, novel mono- and polycistronic transcription units, as well as extensive transcriptional overlaps between neighboring and distal genes. This study identified non-coding transcripts overlapping all three replication origins of the PRV, which might play a role in the control of DNA synthesis. We additionally established the relative expression levels of gene products. Our investigations revealed that the whole PRV genome is utilized for transcription, including both DNA strands in all coding and intergenic regions. The genome-wide occurrence of transcript overlaps suggests a crosstalk between genes through a network formed by interacting transcriptional machineries with a potential function in the control of gene expression. PMID:27685795

  20. RNA Sequencing Revealed Numerous Polyketide Synthase Genes in the Harmful Dinoflagellate Karenia mikimotoi

    PubMed Central

    Kimura, Kei; Okuda, Shujiro; Nakayama, Kei; Shikata, Tomoyuki; Takahashi, Fumio; Yamaguchi, Haruo; Sakamoto, Setsuko; Yamaguchi, Mineo; Tomaru, Yuji

    2015-01-01

    The dinoflagellate Karenia mikimotoi forms blooms in the coastal waters of temperate regions and occasionally causes massive fish and invertebrate mortality. This study aimed to elucidate the toxic effect of K. mikimotoi on marine organisms by using the genomics approach; RNA-sequence libraries were constructed, and data were analyzed to identify toxin-related genes. Next-generation sequencing produced 153,406 transcript contigs from the axenic culture of K. mikimotoi. BLASTX analysis against all assembled contigs revealed that 208 contigs were polyketide synthase (PKS) sequences. Thus, K. mikimotoi was thought to have several genes encoding PKS metabolites and to likely produce toxin-like polyketide molecules. Of all the sequences, approximately 30 encoded eight PKS genes, which were remarkably similar to those of Karenia brevis. Our phylogenetic analyses showed that these genes belonged to a new group of PKS type-I genes. Phylogenetic and active domain analyses showed that the amino acid sequence of four among eight Karenia PKS genes was not similar to any of the reported PKS genes. These PKS genes might possibly be associated with the synthesis of polyketide toxins produced by Karenia species. Further, a homology search revealed 10 contigs that were similar to a toxin gene responsible for the synthesis of saxitoxin (sxtA) in the toxic dinoflagellate Alexandrium fundyense. These contigs encoded A1–A3 domains of sxtA genes. Thus, this study identified some transcripts in K. mikimotoi that might be associated with several putative toxin-related genes. The findings of this study might help understand the mechanism of toxicity of K. mikimotoi and other dinoflagellates. PMID:26561394

  1. Coelacanth genome sequence reveals the evolutionary history of vertebrate genes.

    PubMed

    Noonan, James P; Grimwood, Jane; Danke, Joshua; Schmutz, Jeremy; Dickson, Mark; Amemiya, Chris T; Myers, Richard M

    2004-12-01

    The coelacanth is one of the nearest living relatives of tetrapods. However, a teleost species such as zebrafish or Fugu is typically used as the outgroup in current tetrapod comparative sequence analyses. Such studies are complicated by the fact that teleost genomes have undergone a whole-genome duplication event, as well as individual gene-duplication events. Here, we demonstrate the value of coelacanth genome sequence by complete sequencing and analysis of the protocadherin gene cluster of the Indonesian coelacanth, Latimeria menadoensis. We found that coelacanth has 49 protocadherin cluster genes organized in the same three ordered subclusters, alpha, beta, and gamma, as the 54 protocadherin cluster genes in human. In contrast, whole-genome and tandem duplications have generated two zebrafish protocadherin clusters comprised of at least 97 genes. Additionally, zebrafish protocadherins are far more prone to homogenizing gene conversion events than coelacanth protocadherins, suggesting that recombination- and duplication-driven plasticity may be a feature of teleost genomes. Our results indicate that coelacanth provides the ideal outgroup sequence against which tetrapod genomes can be measured. We therefore present L. menadoensis as a candidate for whole-genome sequencing.

  2. Protein sequence conservation and stable molecular evolution reveals influenza virus nucleoprotein as a universal druggable target.

    PubMed

    Babar, Mustafeez Mujtaba; Zaidi, Najam-us-Sahar Sadaf

    2015-08-01

    The high mutation rate in the influenza virus genome and the appearance of drug resistance call for a constant effort to identify alternate drug targets and develop new antiviral strategies. The internal proteins of the virus can be exploited as a potential target for therapeutic interventions. Among these, the nucleoprotein (NP) is the most abundant protein that provides structural and functional support to the viral replication machinery. The current study aims at analysis of protein sequence polymorphism patterns, degree of molecular evolution and sequence conservation as a function of potential druggability of nucleoprotein. We analyzed a universal set of amino acid sequences (n=22,000) and, in order to identify and correlate the functionally conserved, druggable regions across different parameters, classified them on the basis of host organism, strain type and continental region of sample isolation. The results indicated that around 95% of the sequence length was conserved, with at least 7 regions conserved across the protein among various classes. Moreover, the highly variable regions, though very limited in number, were found to be positively selected, thereby indicating the high degree of protein stability against various hosts and spatio-temporal references. Furthermore, on mapping the conserved regions on the protein, 7 drug binding pockets in the functionally important regions of the protein were revealed. The results, therefore, collectively indicate that nucleoprotein is a highly conserved and stable viral protein that can potentially be exploited for development of broadly effective antiviral strategies.

  3. Amino Acid Sequence of Human Cholinesterase

    DTIC Science & Technology

    1985-10-01

    liquid chromatography (HPLC). Activity testing of the aged, DFP-labeled cholinesterase showed that 99.8% of the active sites had been labeled, since...acids were quantitated by ninhydrin at the AAA Labs, or by derivatization with phenylisothiocyanate at the University of Michigan. The latter method

  4. Subclonal diversification of primary breast cancer revealed by multiregion sequencing

    PubMed Central

    Yates, Lucy R; Gerstung, Moritz; Knappskog, Stian; Desmedt, Christine; Gundem, Gunes; Loo, Peter Van; Aas, Turid; Alexandrov, Ludmil B; Larsimont, Denis; Davies, Helen; Li, Yilong; Ju, Young Seok; Ramakrishna, Manasa; Haugland, Hans Kristian; Lilleng, Peer Kaare; Nik-Zainal, Serena; McLaren, Stuart; Butler, Adam; Martin, Sancha; Glodzik, Dominic; Menzies, Andrew; Raine, Keiran; Hinton, Jonathan; Jones, David; Mudie, Laura J; Jiang, Bing; Vincent, Delphine; Greene-Colozzi, April; Adnet, Pierre-Yves; Fatima, Aquila; Maetens, Marion; Ignatiadis, Michail; Stratton, Michael R; Sotiriou, Christos; Richardson, Andrea L; Lønning, Per Eystein; Wedge, David C; Campbell, Peter J

    2015-01-01

    Sequencing cancer genomes may enable tailoring of therapeutics to the underlying biological abnormalities driving a particular patient’s tumor. However, sequencing-based strategies rely heavily on representative sampling of tumors. To understand the subclonal structure of primary breast cancer, we applied whole genome and targeted sequencing to multiple samples from each of 50 patients’ tumors (total 303). The extent of subclonal diversification varied among cases and followed spatial patterns. No strict temporal order was evident, with point mutations and rearrangements affecting the most common breast cancer genes, including PIK3CA, TP53, PTEN, BRCA2 and MYC, occurring early in some tumors and late in others. In 13/50 cancers, potentially targetable mutations were subclonal. Landmarks of disease progression, such as resisting chemotherapy and acquiring invasive or metastatic potential, arose within detectable subclones of antecedent lesions. These findings highlight the importance of including analyses of subclonal structure and tumor evolution in clinical trials of primary breast cancer. PMID:26099045

  5. Fatty acids profiling reveals potential candidate markers of semen quality.

    PubMed

    Zerbinati, C; Caponecchia, L; Rago, R; Leoncini, E; Bottaccioli, A G; Ciacciarelli, M; Pacelli, A; Salacone, P; Sebastianelli, A; Pastore, A; Palleschi, G; Boccia, S; Carbone, A; Iuliano, L

    2016-11-01

    Previous reports showed altered fatty acid content in subjects with altered sperm parameters compared to normozoospermic individuals. However, these studies focused on a limited number of fatty acids, included a small number of subjects and results varied widely. We conducted a case-control study involving 155 patients allocated into four groups, including normozoospermia (n = 33), oligoasthenoteratozoospermia (n = 32), asthenozoospermia (n = 25), and varicocoele (n = 44). Fatty acid profiling, including 30 species, was analyzed by a validated gas chromatography (GC) method on the whole seminal fluid sample. Multinomial logistic regression modeling was used to identify the associations between fatty acids and the four groups. Specimens from 15 normozoospermic subjects were also analyzed for fatty acid content in the seminal plasma and spermatozoa to study the distribution in the two compartments. The fatty acid lipidome varied markedly between the four groups. Multinomial logistic regression modeling revealed that high levels of palmitic acid, behenic acid, oleic acid, and docosahexaenoic acid (DHA) confer a low risk of falling outside the normozoospermic group. In the whole population, seminal fluid stearic acid was negatively correlated (r = -0.53), and DHA was positively correlated (r = 0.65) with sperm motility. Some fatty acids were preferentially accumulated in spermatozoa and the highest difference was observed for DHA, which was 6.2 times higher in spermatozoa than in seminal plasma. The results of this study highlight the complete fatty acid profile in patients with different semen parameters. Given the easy-to-follow and rapid method of analysis, fatty acid profiling by the GC method can be used for therapeutic purposes and to measure compliance in infertility trials using fatty acid supplements.

  6. Cystatin. Amino acid sequence and possible secondary structure.

    PubMed Central

    Schwabe, C; Anastasi, A; Crow, H; McDonald, J K; Barrett, A J

    1984-01-01

    The amino acid sequence of cystatin, the protein from chicken egg-white that is a tight-binding inhibitor of many cysteine proteinases, is reported. Cystatin is composed of 116 amino acid residues, and the Mr is calculated to be 13 143. No striking similarity to any other known sequence has been detected. The results of computer analysis of the sequence and c.d. spectrometry indicate that the secondary structure includes relatively little alpha-helix (about 20%) and that the remainder is mainly beta-structure. PMID:6712597

  7. Exome sequencing reveals novel IRX1 mutation in congenital heart disease.

    PubMed

    Guo, Changlong; Wang, Qidi; Wang, Yuting; Yang, Liping; Luo, Haiyan; Cao, Xiao Fang; An, Lisha; Qiu, Yue; Du, Meng; Ma, Xu; Li, Hui; Lu, Cailing

    2017-03-30

    Genetic variation in specific transcription factors during heart formation may lead to congenital heart disease (CHD) or even miscarriage. The aim of the present study was to identify CHD‑associated genes using next generation sequencing (NGS). The whole exome DNA sequence was obtained from a stillborn fetus diagnosed with tricuspid atresia and complete transposition of the great arteries using high‑throughput sequencing methods. Subsequently, genetic variants of CHD‑associated genes were selected and verified in 215 non‑syndromic CHD patients and 249 healthy control subjects using polymerase chain reaction combined with Sanger sequencing. Genetic variants of previously reported CHD‑inducing genes, such as cysteine rich with EGF like domains 1 and cbp/p300‑interacting transactivator with Glu/Asp rich carboxy‑terminal domain 2, were discovered through the NGS analysis. In addition, a novel non‑synonymous mutation of the iroquois homeobox 1 (IRX1) gene (p.Gln240Glu) was identified. A total of three non‑synonymous mutations (p.Gln240Glu, p.Ser298Asn and p.Ala381Glu) of the IRX1 gene were verified in 215 non‑syndromic CHD patients, but not in 249 healthy volunteers. The results demonstrated that NGS is a powerful tool to study the etiology of CHD. In addition, the results suggest that genetic variants of the IRX1 gene may contribute to the pathogenesis of CHD.

  8. Revealing the Complexity of Breast Cancer by Next Generation Sequencing

    PubMed Central

    Verigos, John; Magklara, Angeliki

    2015-01-01

    Over the last few years the increasing usage of “-omic” platforms, supported by next-generation sequencing, in the analysis of breast cancer samples has tremendously advanced our understanding of the disease. New driver and passenger mutations, rare chromosomal rearrangements and other genomic aberrations identified by whole genome and exome sequencing are providing missing pieces of the genomic architecture of breast cancer. High resolution maps of breast cancer methylomes and sequencing of the miRNA microworld are beginning to paint the epigenomic landscape of the disease. Transcriptomic profiling is giving us a glimpse into the gene regulatory networks that govern the fate of the breast cancer cell. At the same time, integrative analysis of sequencing data confirms an extensive intertumor and intratumor heterogeneity and plasticity in breast cancer arguing for a new approach to the problem. In this review, we report on the latest findings on the molecular characterization of breast cancer using NGS technologies, and we discuss their potential implications for the improvement of existing therapies. PMID:26561834

  9. Mouse Vk gene classification by nucleic acid sequence similarity.

    PubMed

    Strohal, R; Helmberg, A; Kroemer, G; Kofler, R

    1989-01-01

    Analyses of immunoglobulin (Ig) variable (V) region gene usage in the immune response, estimates of V gene germline complexity, and other nucleic acid hybridization-based studies depend on the extent to which such genes are related (i.e., sequence similarity) and their organization in gene families. While mouse Igh heavy chain V region (VH) gene families are relatively well-established, a corresponding systematic classification of Igk light chain V region (Vk) genes has not been reported. The present analysis, in the course of which we reviewed the known extent of the Vk germline gene repertoire and Vk gene usage in a variety of responses to foreign and self antigens, provides a classification of mouse Vk genes in gene families composed of members with greater than 80% overall nucleic acid sequence similarity. This classification differed in several aspects from that of VH genes: only some Vk gene families were as clearly separated (by greater than 25% sequence dissimilarity) as typical VH gene families; most Vk gene families were closely related and, in several instances, members from different families were very similar (greater than 80%) over large sequence portions; frequently, classification by nucleic acid sequence similarity diverged from existing classifications based on amino-terminal protein sequence similarity. Our data have implications for Vk gene analyses by nucleic acid hybridization and describe potentially important differences in sequence organization between VH and Vk genes.
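
    The family criterion described above (members sharing greater than 80% overall nucleic acid sequence similarity) can be illustrated with a minimal sketch. The abstract does not state how the authors computed similarity or resolved borderline cases, so everything below is an assumption made for illustration: the sequences are hypothetical toy strings that are already aligned to equal length, similarity is taken as simple percent identity, and families are formed by single-linkage grouping. The function names `percent_identity` and `group_into_families` are likewise hypothetical.

```python
from itertools import combinations


def percent_identity(a: str, b: str) -> float:
    """Percent of identical positions in two pre-aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def group_into_families(seqs: dict[str, str], threshold: float = 80.0) -> list[set[str]]:
    """Single-linkage grouping (an assumed rule, not the authors' stated method).

    Two genes end up in the same family if they are connected by a chain of
    pairs whose identity exceeds the threshold.
    """
    parent = {name: name for name in seqs}

    def find(x: str) -> str:
        # Follow parent links to the representative, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(seqs, 2):
        if percent_identity(seqs[a], seqs[b]) > threshold:
            parent[find(a)] = find(b)

    families: dict[str, set[str]] = {}
    for name in seqs:
        families.setdefault(find(name), set()).add(name)
    return list(families.values())


# Hypothetical toy sequences, not real Vk genes.
toy = {
    "geneA": "ACGTACGTAC",
    "geneB": "ACGTACGTAA",  # differs from geneA at one of ten positions
    "geneC": "TTTTGGGGCC",  # largely dissimilar to both
}
families = group_into_families(toy)
```

    With these toy inputs, geneA and geneB fall into one family and geneC forms its own. Single linkage is only one possible rule; as the abstract notes, members of different Vk families can exceed 80% similarity over large sequence portions, which is the situation where a chained rule like this one would merge groups that a curated classification keeps apart.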

  10. Amino acid sequence of anionic peroxidase from the windmill palm tree Trachycarpus fortunei.

    PubMed

    Baker, Margaret R; Zhao, Hongwei; Sakharov, Ivan Yu; Li, Qing X

    2014-12-10

    Palm peroxidases are extremely stable and have uncommon substrate specificity. This study was designed to fill in the knowledge gap about the structures of a peroxidase from the windmill palm tree Trachycarpus fortunei. The complete amino acid sequence and partial glycosylation were determined by MALDI-top-down sequencing of native windmill palm tree peroxidase (WPTP), MALDI-TOF/TOF MS/MS of WPTP tryptic peptides, and cDNA sequencing. The propeptide of WPTP contained N- and C-terminal signal sequences which contained 21 and 17 amino acid residues, respectively. Mature WPTP was 306 amino acids in length, and its carbohydrate content ranged from 21% to 29%. Comparison to closely related royal palm tree peroxidase revealed structural features that may explain differences in their substrate specificity. The results can be used to guide engineering of WPTP and its novel applications.

  11. Deep sequencing reveals the complete genome and evidence for transcriptional activity of the first virus-like sequences identified in Aristotelia chilensis (Maqui Berry).

    PubMed

    Villacreses, Javier; Rojas-Herrera, Marcelo; Sánchez, Carolina; Hewstone, Nicole; Undurraga, Soledad F; Alzate, Juan F; Manque, Patricio; Maracaja-Coutinho, Vinicius; Polanco, Victor

    2015-04-03

    Here, we report the genome sequence and evidence for transcriptional activity of a virus-like element in the native Chilean berry tree Aristotelia chilensis. We propose to name the endogenous sequence as Aristotelia chilensis Virus 1 (AcV1). High-throughput sequencing of the genome of this tree uncovered an endogenous viral element, with a size of 7122 bp, corresponding to the complete genome of AcV1. Its sequence contains three open reading frames (ORFs): ORFs 1 and 2 share 66%-73% amino acid similarity with members of the Caulimoviridae virus family, especially the Petunia vein clearing virus (PVCV), Petuvirus genus. ORF1 encodes a movement protein (MP); ORF2 a Reverse Transcriptase (RT) and a Ribonuclease H (RNase H) domain; and ORF3 showed no amino acid sequence similarity with any other known virus proteins. Analogous to other known endogenous pararetrovirus sequences (EPRVs), AcV1 is integrated in the genome of Maqui Berry and showed low viral transcriptional activity, which was detected by deep sequencing technology (DNA and RNA-seq). Phylogenetic analysis of AcV1 and other pararetroviruses revealed a closer resemblance with Petuvirus. Overall, our data suggest that AcV1 could be a new member of the Caulimoviridae family, genus Petuvirus, and the first evidence of this kind of virus in a fruit plant.

  12. Deep Sequencing Reveals the Complete Genome and Evidence for Transcriptional Activity of the First Virus-Like Sequences Identified in Aristotelia chilensis (Maqui Berry)

    PubMed Central

    Villacreses, Javier; Rojas-Herrera, Marcelo; Sánchez, Carolina; Hewstone, Nicole; Undurraga, Soledad F.; Alzate, Juan F.; Manque, Patricio; Maracaja-Coutinho, Vinicius; Polanco, Victor

    2015-01-01

    Here, we report the genome sequence and evidence for transcriptional activity of a virus-like element in the native Chilean berry tree Aristotelia chilensis. We propose to name the endogenous sequence as Aristotelia chilensis Virus 1 (AcV1). High-throughput sequencing of the genome of this tree uncovered an endogenous viral element, with a size of 7122 bp, corresponding to the complete genome of AcV1. Its sequence contains three open reading frames (ORFs): ORFs 1 and 2 share 66%–73% amino acid similarity with members of the Caulimoviridae virus family, especially the Petunia vein clearing virus (PVCV), Petuvirus genus. ORF1 encodes a movement protein (MP); ORF2 a Reverse Transcriptase (RT) and a Ribonuclease H (RNase H) domain; and ORF3 showed no amino acid sequence similarity with any other known virus proteins. Analogous to other known endogenous pararetrovirus sequences (EPRVs), AcV1 is integrated in the genome of Maqui Berry and showed low viral transcriptional activity, which was detected by deep sequencing technology (DNA and RNA-seq). Phylogenetic analysis of AcV1 and other pararetroviruses revealed a closer resemblance with Petuvirus. Overall, our data suggest that AcV1 could be a new member of the Caulimoviridae family, genus Petuvirus, and the first evidence of this kind of virus in a fruit plant. PMID:25855242

  13. Subclonal diversification of primary breast cancer revealed by multiregion sequencing

    SciTech Connect

    Yates, Lucy R.; Gerstung, Moritz; Knappskog, Stian; Desmedt, Christine; Gundem, Gunes; Van Loo, Peter; Aas, Turid; Alexandrov, Ludmil B.; Larsimont, Denis; Davies, Helen; Li, Yilong; Ju, Young Seok; Ramakrishna, Manasa; Haugland, Hans Kristian; Lilleng, Peer Kaare; Nik-Zainal, Serena; McLaren, Stuart; Butler, Adam; Martin, Sancha; Glodzik, Dominic; Menzies, Andrew; Raine, Keiran; Hinton, Jonathan; Jones, David; Mudie, Laura J.; Jiang, Bing; Vincent, Delphine; Greene-Colozzi, April; Adnet, Pierre-Yves; Fatima, Aquila; Maetens, Marion; Ignatiadis, Michail; Stratton, Michael R.; Sotiriou, Christos; Richardson, Andrea L.; Lønning, Per Eystein; Wedge, David C.; Campbell, Peter J.

    2015-06-22

    Sequencing cancer genomes may enable tailoring of therapeutics to the underlying biological abnormalities driving a particular patient's tumor. However, sequencing-based strategies rely heavily on representative sampling of tumors. To understand the subclonal structure of primary breast cancer, we applied whole-genome and targeted sequencing to multiple samples from each of 50 patients' tumors (303 samples in total). The extent of subclonal diversification varied among cases and followed spatial patterns. No strict temporal order was evident, with point mutations and rearrangements affecting the most common breast cancer genes, including PIK3CA, TP53, PTEN, BRCA2 and MYC, occurring early in some tumors and late in others. In 13 out of 50 cancers, potentially targetable mutations were subclonal. Landmarks of disease progression, such as resistance to chemotherapy and the acquisition of invasive or metastatic potential, arose within detectable subclones of antecedent lesions. These findings highlight the importance of including analyses of subclonal structure and tumor evolution in clinical trials of primary breast cancer.

  14. Subclonal diversification of primary breast cancer revealed by multiregion sequencing.

    PubMed

    Yates, Lucy R; Gerstung, Moritz; Knappskog, Stian; Desmedt, Christine; Gundem, Gunes; Van Loo, Peter; Aas, Turid; Alexandrov, Ludmil B; Larsimont, Denis; Davies, Helen; Li, Yilong; Ju, Young Seok; Ramakrishna, Manasa; Haugland, Hans Kristian; Lilleng, Peer Kaare; Nik-Zainal, Serena; McLaren, Stuart; Butler, Adam; Martin, Sancha; Glodzik, Dominic; Menzies, Andrew; Raine, Keiran; Hinton, Jonathan; Jones, David; Mudie, Laura J; Jiang, Bing; Vincent, Delphine; Greene-Colozzi, April; Adnet, Pierre-Yves; Fatima, Aquila; Maetens, Marion; Ignatiadis, Michail; Stratton, Michael R; Sotiriou, Christos; Richardson, Andrea L; Lønning, Per Eystein; Wedge, David C; Campbell, Peter J

    2015-07-01

    The sequencing of cancer genomes may enable tailoring of therapeutics to the underlying biological abnormalities driving a particular patient's tumor. However, sequencing-based strategies rely heavily on representative sampling of tumors. To understand the subclonal structure of primary breast cancer, we applied whole-genome and targeted sequencing to multiple samples from each of 50 patients' tumors (303 samples in total). The extent of subclonal diversification varied among cases and followed spatial patterns. No strict temporal order was evident, with point mutations and rearrangements affecting the most common breast cancer genes, including PIK3CA, TP53, PTEN, BRCA2 and MYC, occurring early in some tumors and late in others. In 13 out of 50 cancers, potentially targetable mutations were subclonal. Landmarks of disease progression, such as resistance to chemotherapy and the acquisition of invasive or metastatic potential, arose within detectable subclones of antecedent lesions. These findings highlight the importance of including analyses of subclonal structure and tumor evolution in clinical trials of primary breast cancer.

  15. Subclonal diversification of primary breast cancer revealed by multiregion sequencing

    DOE PAGES

    Yates, Lucy R.; Gerstung, Moritz; Knappskog, Stian; ...

    2015-06-22

    Sequencing cancer genomes may enable tailoring of therapeutics to the underlying biological abnormalities driving a particular patient's tumor. However, sequencing-based strategies rely heavily on representative sampling of tumors. To understand the subclonal structure of primary breast cancer, we applied whole-genome and targeted sequencing to multiple samples from each of 50 patients' tumors (303 samples in total). The extent of subclonal diversification varied among cases and followed spatial patterns. No strict temporal order was evident, with point mutations and rearrangements affecting the most common breast cancer genes, including PIK3CA, TP53, PTEN, BRCA2 and MYC, occurring early in some tumors and late in others. In 13 out of 50 cancers, potentially targetable mutations were subclonal. Landmarks of disease progression, such as resistance to chemotherapy and the acquisition of invasive or metastatic potential, arose within detectable subclones of antecedent lesions. These findings highlight the importance of including analyses of subclonal structure and tumor evolution in clinical trials of primary breast cancer.

  16. Amino acid sequence of homologous rat atrial peptides: natriuretic activity of native and synthetic forms.

    PubMed Central

    Seidah, N G; Lazure, C; Chrétien, M; Thibault, G; Garcia, R; Cantin, M; Genest, J; Nutt, R F; Brady, S F; Lyle, T A

    1984-01-01

    A substance called atrial natriuretic factor (ANF), localized in secretory granules of atrial cardiocytes, was isolated as four homologous natriuretic peptides from homogenates of rat atria. The complete sequence of the longest form showed that it is composed of 33 amino acids. The three other shorter forms (2-33, 3-33, and 8-33) represent amino-terminally truncated versions of the 33 amino acid parent molecule as shown by analysis of sequence, amino acid composition, or both. The proposed primary structure agrees entirely with the amino acid composition and reveals no significant sequence homology with any known protein or segment of protein. The short form ANF-(8-33) was synthesized by a multi-fragment condensation approach and the synthetic product was shown to exhibit specific activity comparable to that of the natural ANF-(3-33). PMID:6232612

  17. Widespread occurrence of the tfd-II genes in soil bacteria revealed by nucleotide sequence analysis of 2,4-dichlorophenoxyacetic acid degradative plasmids pDB1 and p712.

    PubMed

    Kim, Dong-Uk; Kim, Min-Sun; Lim, Jong-Sung; Ka, Jong-Ok

    2013-05-01

    Variovorax sp. strain DB1 and Pseudomonas pickettii strain 712 are 2,4-dichlorophenoxyacetic acid (2,4-D)-degrading bacteria, which were isolated from agricultural soils in the Republic of Korea and the USA, respectively. Each strain harbors a 2,4-D degradative plasmid and is able to utilize 2,4-D as the sole source of carbon for its growth. The 2,4-D degradative plasmid pDB1 of strain DB1 consisted of a 65,269-bp circular molecule with a G+C content of 66.23% and had 68 ORFs. The 2,4-D degradative plasmid p712 of strain 712 was composed of a 62,798-bp circular molecule with a 62.11% G+C content and had 62 ORFs. The plasmids pDB1 and p712 share significantly homologous 2,4-D degradative genes with high similarity to the tfdR, tfdB-II, tfdC-II, tfdD-II, tfdE-II, tfdF-II, tfdK and tfdA genes of plasmid pJP4 of Alcaligenes eutrophus isolated from Australia. In a phylogenetic analysis with trfA, traL, and trbA genes, pDB1 belonged to IncP-1β with pJP4, while p712 belonged to IncP-1ε with pKJK5 and pEMT3. The results indicated that, in spite of the differences in their backbone regions, the 2,4-D catabolic genes of the two plasmids were closely related and also related to the well-known 2,4-D degradative plasmid pJP4 even though all were isolated from different geographic regions. Other similarities in the genetic organization and the presence of IS1071 suggested that these catabolic genes may be on a transposable element, leading to widespread occurrence in soil bacteria.
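
    The G+C contents quoted above are simple base-composition percentages. As a minimal sketch of how such a figure is computed, the snippet below counts G and C bases in a sequence string. The plasmid sequences themselves are not part of this record, so the example sequence is a hypothetical toy string, and the helper names `gc_content` and `approx_gc_bases` are illustrative assumptions rather than anything used by the authors.

```python
def gc_content(seq: str) -> float:
    """Return the G+C content of a nucleotide sequence as a percentage."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    gc = seq.count("G") + seq.count("C")
    return 100.0 * gc / len(seq)


def approx_gc_bases(length_bp: int, gc_percent: float) -> int:
    """Approximate number of G or C bases implied by a length and a G+C percentage."""
    return round(length_bp * gc_percent / 100.0)


# Hypothetical toy sequence, not taken from pDB1 or p712.
toy_sequence = "GGCCGATCGCGA"
toy_gc = gc_content(toy_sequence)

# Base count implied by the figures reported for pDB1 (65,269 bp, 66.23% G+C).
pdb1_gc_bases = approx_gc_bases(65269, 66.23)
```

    Because the reported percentage is rounded to two decimal places, the implied base count is only an approximation of the true number of G and C positions in the plasmid.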

  18. A Siglec-like sialic-acid-binding motif revealed in an adenovirus capsid protein

    PubMed Central

    Rademacher, Christoph; Bru, Thierry; McBride, Ryan; Robison, Elizabeth; Nycholat, Corwin M; Kremer, Eric J; Paulson, James C

    2012-01-01

    Sialic-acid-binding immunoglobulin-like lectins (Siglecs) are a family of transmembrane receptors that are well documented to play roles in regulation of innate and adaptive immune responses. To see whether the features that define the molecular recognition of sialic acid were found in other sialic-acid-binding proteins, we analyzed 127 structures with bound sialic acids found in the Protein Data Bank database. Of these, the canine adenovirus 2-fiber knob protein showed close local structural relationship to Siglecs despite low sequence similarity. The fiber knob harbors a noncanonical sialic-acid recognition site, which was then explored for detailed specificity using a custom glycan microarray comprising 58 diverse sialosides. It was found that the adenoviral protein preferentially recognizes the epitope Neu5Acα2-3[6S]Galβ1-4GlcNAc, a structure previously identified as the preferred ligand for Siglec-8 in humans and Siglec-F in mice. Comparison of the Siglec and fiber knob sialic-acid-binding sites reveals conserved structural elements that are not clearly identifiable from the primary amino acid sequence, suggesting a Siglec-like sialic-acid-binding motif that comprises the consensus features of these proteins in complex with sialic acid. PMID:22522600

  19. The genome sequencing of an albino Western lowland gorilla reveals inbreeding in the wild

    PubMed Central

    2013-01-01

    Background: The only known albino gorilla, named Snowflake, was a male wild-born individual from Equatorial Guinea who lived at the Barcelona Zoo for almost 40 years. He was diagnosed with non-syndromic oculocutaneous albinism, i.e. white hair, light eyes, pink skin, photophobia and reduced visual acuity. Despite previous efforts to explain the genetic cause, this is still unknown. Here, we study the genetic cause of his albinism and, making use of whole genome sequencing data, find a higher inbreeding coefficient compared to other gorillas. Results: We successfully identified the causal genetic variant for Snowflake’s albinism, a non-synonymous single nucleotide variant located in a transmembrane region of SLC45A2. This transporter is known to be involved in oculocutaneous albinism type 4 (OCA4) in humans. We provide experimental evidence that shows that this amino acid replacement alters the membrane spanning capability of this transmembrane region. Finally, we provide a comprehensive study of genome-wide patterns of autozygosity revealing that Snowflake’s parents were related, this being the first report of inbreeding in a wild-born Western lowland gorilla. Conclusions: In this study we demonstrate how the use of whole genome sequencing can be extended to link genotype and phenotype in non-model organisms and how it can be a powerful tool in conservation genetics (e.g., inbreeding and genetic diversity) with the expected decrease in sequencing cost. PMID:23721540

  20. Sequencing the extrachromosomal circular mobilome reveals retrotransposon activity in plants

    PubMed Central

    Llauro, Christel; Jobet, Edouard; Robakowska-Hyzorek, Dagmara; Lasserre, Eric; Ghesquière, Alain; Panaud, Olivier

    2017-01-01

    Retrotransposons are mobile genetic elements abundant in plant and animal genomes. While efficiently silenced by the epigenetic machinery, they can be reactivated upon stress or during development. Because their level of transcription does not reflect their transposition ability, it is difficult to evaluate their contribution to the active mobilome. Here we applied a simple methodology based on the high throughput sequencing of extrachromosomal circular DNA (eccDNA) forms of active retrotransposons to characterize the repertoire of mobile retrotransposons in plants. This method successfully identified known active retrotransposons in both Arabidopsis and rice material where the epigenome is destabilized. When applying mobilome-seq to developmental stages in wild type rice, we identified PopRice as a highly active retrotransposon producing eccDNA forms in the wild type endosperm. The mobilome-seq strategy opens new routes for the characterization of a yet unexplored fraction of plant genomes. PMID:28212378

  1. Amino acid sequences of proteins from Leptospira serovar pomona.

    PubMed

    Alves, S F; Lefebvre, R B; Probert, W

    2000-01-01

    This report describes partial amino acid sequences from three putative outer envelope proteins from Leptospira serovar pomona. In order to obtain internal fragments for protein sequencing, enzymatic and chemical digestion was performed. The enzyme clostripain was used to digest the 32 and 45 kDa proteins. In situ digestion of the 40 kDa protein was accomplished using cyanogen bromide. The 32 kDa protein generated two fragments, one of 21 kDa and another of 10 kDa that yielded five residues. A fragment of 24 kDa that yielded nineteen amino acid residues was obtained from the 45 kDa protein. A fragment with a molecular weight of 20 kDa, yielding a twenty-amino-acid sequence, was obtained from the 40 kDa protein.

  2. Molecular Dynamic Simulations Reveal the Structural Determinants of Fatty Acid Binding to Oxy-Myoglobin

    PubMed Central

    Chintapalli, Sree V.; Bhardwaj, Gaurav; Patel, Reema; Shah, Natasha; Patterson, Randen L.; van Rossum, Damian B.; Anishkin, Andriy; Adams, Sean H.

    2015-01-01

    The mechanism(s) by which fatty acids are sequestered and transported in muscle have not been fully elucidated. A potential key player in this process is the protein myoglobin (Mb). Indeed, there is a catalogue of empirical evidence supporting direct interaction of globins with fatty acid metabolites; however, the binding pocket and regulation of the interaction remain to be established. In this study, we employed a computational strategy to elucidate the structural determinants of fatty acids (palmitic and oleic acid) binding to Mb. Sequence analysis and docking simulations with a horse (Equus caballus) structural Mb reference reveal a fatty acid-binding site in the hydrophobic cleft near the heme region in Mb. Both palmitic acid and oleic acid attain a “U” shaped structure similar to their conformation in pockets of other fatty acid-binding proteins. Specifically, we found that the carboxyl head group of palmitic acid coordinates with the amino group of Lys45, whereas the carboxyl group of oleic acid coordinates with both the amino groups of Lys45 and Lys63. The alkyl tails of both fatty acids are supported by surrounding hydrophobic residues Leu29, Leu32, Phe33, Phe43, Phe46, Val67, Val68 and Ile107. In the saturated palmitic acid, the hydrophobic tail moves freely and occasionally penetrates deeper inside the hydrophobic cleft, making additional contacts with Val28, Leu69, Leu72 and Ile111. Our simulations reveal a dynamic and stable binding pocket in which the oxygen molecule and heme group in Mb are required for additional hydrophobic interactions. Taken together, these findings support a mechanism in which Mb acts as a muscle transporter for fatty acid when it is in the oxygenated state and releases fatty acid when Mb converts to the deoxygenated state. PMID:26030763

  3. Extensive amino acid sequence homologies between animal lectins

    SciTech Connect

    Paroutaud, P.; Levi, G.; Teichberg, V.I.; Strosberg, A.D.

    1987-09-01

    The authors have established the amino acid sequence of the β-D-galactoside binding lectin from the electric eel and the sequences of several peptides from a similar lectin isolated from human placenta. These sequences were compared with the published sequences of peptides derived from the β-D-galactoside binding lectin from human lung and with sequences deduced from cDNAs assigned to the β-D-galactoside binding lectins from chicken embryo skin and human hepatomas. Significant homologies were observed. One of the highly conserved regions that contains a tryptophan residue and two glutamic acid residues is probably part of the β-D-galactoside binding site, which, on the basis of spectroscopic studies of the electric eel lectin, is expected to contain such residues. The similarity of the hydropathy profiles and the predicted secondary structure of the lectins from chicken skin and electric eel, in spite of differences in their amino acid sequences, strongly suggests that these proteins have maintained structural homologies during evolution and together with the other β-D-galactoside binding lectins were derived from a common ancestor gene.

  4. The Arthrobacter arilaitensis Re117 Genome Sequence Reveals Its Genetic Adaptation to the Surface of Cheese

    PubMed Central

    Monnet, Christophe; Loux, Valentin; Gibrat, Jean-François; Spinnler, Eric; Barbe, Valérie; Vacherie, Benoit; Gavory, Frederick; Gourbeyre, Edith; Siguier, Patricia; Chandler, Michaël; Elleuch, Rayda

    2010-01-01

    Arthrobacter arilaitensis is one of the major bacterial species found at the surface of cheeses, especially in smear-ripened cheeses, where it contributes to the typical colour, flavour and texture properties of the final product. The A. arilaitensis Re117 genome is composed of a 3,859,257 bp chromosome and two plasmids of 50,407 and 8,528 bp. The chromosome shares large regions of synteny with the chromosomes of three environmental Arthrobacter strains for which genome sequences are available: A. aurescens TC1, A. chlorophenolicus A6 and Arthrobacter sp. FB24. In contrast, however, 4.92% of the A. arilaitensis chromosome is composed of IS elements, a portion that is at least 15-fold higher than for the other Arthrobacter strains. Comparative genomic analyses reveal an extensive loss of genes associated with catabolic activities, presumably as a result of adaptation to the properties of the cheese surface habitat. Like the environmental Arthrobacter strains, A. arilaitensis Re117 is well-equipped with enzymes required for the catabolism of major carbon substrates present at cheese surfaces such as fatty acids, amino acids and lactic acid. However, A. arilaitensis has several specificities which seem to be linked to its adaptation to its particular niche. These include the ability to catabolize D-galactonate, a high number of glycine betaine and related osmolyte transporters, two siderophore biosynthesis gene clusters and a high number of Fe3+/siderophore transport systems. In model cheese experiments, addition of small amounts of iron strongly stimulated the growth of A. arilaitensis, indicating that cheese is a highly iron-restricted medium. We suggest that there is a strong selective pressure at the surface of cheese for strains with efficient iron acquisition and salt-tolerance systems together with abilities to catabolize substrates such as lactic acid, lipids and amino acids. PMID:21124797

  5. Amino acid sequence of porcine spleen cathepsin D.

    PubMed Central

    Shewale, J G; Tang, J

    1984-01-01

    The amino acid sequence of porcine spleen cathepsin D heavy chain has been determined and, hence, the complete structure of this enzyme is now known. The sequence of the heavy chain was constructed by aligning the structures of peptides generated by cyanogen bromide, trypsin, and endo-proteinase Lys C cleavages. The structure of the light chain has been published previously. The cathepsin D molecule contains 339 amino acid residues in two polypeptide chains: a 97-residue light chain and a 242-residue heavy chain, with a combined Mr of 36,779 (without carbohydrate). There are two carbohydrate units linked to asparagine residues 70 and 192. The disulfide bond arrangement in cathepsin D is probably similar to that of pepsin, because the positions of six half-cystine residues are conserved. The active site aspartyl residues, corresponding to aspartic acid-32 and -215 of pepsin, are located at residues 33 and 224 in the cathepsin D molecule. The amino acid sequence around these aspartyl residues is strongly conserved. Cathepsin D shows a strong homology with other acid proteases. When the sequences of cathepsin D, renin, and pepsin are aligned, 32.7% of the residues are identical. The homology is observed throughout the length of the molecules, indicating that three-dimensional structures of all three molecules are similar. PMID:6587385

  6. Complete cDNA and derived amino acid sequence of human factor V

    SciTech Connect

    Jenny, R.J.; Pittman, D.D.; Toole, J.J.; Kriz, R.W.; Aldape, R.A.; Hewick, R.M.; Kaufman, R.J.; Mann, K.G.

    1987-07-01

    cDNA clones encoding human factor V have been isolated from an oligo(dT)-primed human fetal liver cDNA library prepared with vector Charon 21A. The cDNA sequence of factor V from three overlapping clones includes a 6672-base-pair (bp) coding region, a 90-bp 5' untranslated region, and a 163-bp 3' untranslated region within which is a poly(A) tail. The deduced amino acid sequence consists of 2224 amino acids inclusive of a 28-amino acid leader peptide. Direct comparison with human factor VIII reveals considerable homology between proteins in amino acid sequence and domain structure: a triplicated A domain and duplicated C domain show approx. 40% identity with the corresponding domains in factor VIII. As in factor VIII, the A domains of factor V share approx. 40% amino acid-sequence homology with the three highly conserved domains in ceruloplasmin. The B domain of factor V contains 35 tandem and approx. 9 additional semiconserved repeats of nine amino acids of the form Asp-Leu-Ser-Gln-Thr-Thr/Asn-Leu-Ser-Pro and 2 additional semiconserved repeats of 17 amino acids. Factor V contains 37 potential N-linked glycosylation sites, 25 of which are in the B domain, and a total of 19 cysteine residues.
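
    The lengths reported above are mutually consistent under the standard three-nucleotides-per-codon relationship, which the following short sketch checks. All input numbers come from the abstract; the variable names are illustrative, and the derived figure for the protein without its leader peptide assumes no further processing, which the abstract does not discuss. Whether the coding-region length counts a stop codon is also not stated, so the sketch simply divides the reported length by three.

```python
CODON_LENGTH = 3  # nucleotides per codon

# Values reported in the abstract.
coding_region_bp = 6672
utr5_bp = 90
utr3_bp = 163
leader_peptide_aa = 28

# The coding region should be a whole number of codons.
assert coding_region_bp % CODON_LENGTH == 0
deduced_aa = coding_region_bp // CODON_LENGTH

# Residues left after removing the leader peptide
# (assumes no other processing; not stated in the abstract).
without_leader_aa = deduced_aa - leader_peptide_aa

# Combined length of the three reported cDNA regions.
reported_regions_bp = utr5_bp + coding_region_bp + utr3_bp

print(deduced_aa, without_leader_aa, reported_regions_bp)
```

    The codon count obtained this way matches the 2224 amino acids stated in the abstract.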

  7. Transcriptome Sequencing in Response to Salicylic Acid in Salvia miltiorrhiza

    PubMed Central

    Zhang, Xiaoru; Dong, Juane; Liu, Hailong; Wang, Jiao; Qi, Yuexin; Liang, Zongsuo

    2016-01-01

    Salvia miltiorrhiza is a traditional Chinese herbal medicine, whose quality and yield are often affected by diseases and environmental stresses during its growing season. Salicylic acid (SA) plays a significant role in plants responding to biotic and abiotic stresses, but the involved regulatory factors and their signaling mechanisms are largely unknown. In order to identify the genes involved in SA signaling, the RNA sequencing (RNA-seq) strategy was employed to evaluate the transcriptional profiles in S. miltiorrhiza cell cultures. A total of 50,778 unigenes were assembled, in which 5,316 unigenes were differentially expressed among 0-, 2-, and 8-h SA induction. The up-regulated genes were mainly involved in stimulus response and multi-organism process. A core set of candidate novel genes coding SA signaling component proteins was identified. Many transcription factors (e.g., WRKY, bHLH and GRAS) and genes involved in hormone signal transduction were differentially expressed in response to SA induction. Detailed analysis revealed that genes associated with defense signaling, such as antioxidant system genes, cytochrome P450s and ATP-binding cassette transporters, were significantly overexpressed, which can be used as genetic tools to investigate disease resistance. Our transcriptome analysis will help understand SA signaling and its mechanism of defense systems in S. miltiorrhiza. PMID:26808150

  8. Key roles for freshwater Actinobacteria revealed by deep metagenomic sequencing.

    PubMed

    Ghai, Rohit; Mizuno, Carolina Megumi; Picazo, Antonio; Camacho, Antonio; Rodriguez-Valera, Francisco

    2014-12-01

    Freshwater ecosystems are critical but fragile environments directly affecting society and its welfare. However, our understanding of genuinely freshwater microbial communities, constrained by our capacity to manipulate their prokaryotic participants in axenic cultures, remains very rudimentary. Even the most abundant components, freshwater Actinobacteria, remain largely unknown. Here, applying deep metagenomic sequencing to the microbial community of a freshwater reservoir, we were able to circumvent this traditional bottleneck and reconstruct de novo seven distinct streamlined actinobacterial genomes. These genomes represent three new groups of photoheterotrophic, planktonic Actinobacteria. We describe for the first time genomes of two novel clades, acMicro (Micrococcineae, related to Luna2) and acAMD (Actinomycetales, related to acTH1). In addition, an aggregate of contigs belonged to a new branch of the Acidimicrobiales. All are estimated to have small genomes (approximately 1.2 Mb), and their GC content varied from 40 to 61%. One of the Micrococcineae genomes encodes a proteorhodopsin, a rhodopsin type reported for the first time in Actinobacteria. The remarkable potential capacity of some of these genomes to transform recalcitrant plant detrital material, particularly lignin-derived compounds, suggests close linkages between the terrestrial and aquatic realms. Moreover, abundances of Actinobacteria correlate inversely with those of Cyanobacteria that are responsible for prolonged and frequently irretrievable damage to freshwater ecosystems. This suggests that they might serve as sentinels of impending ecological catastrophes.

  9. Nascent RNA sequencing reveals distinct features in plant transcription

    PubMed Central

    Hetzel, Jonathan; Duttke, Sascha H.; Benner, Christopher; Chory, Joanne

    2016-01-01

    Transcriptional regulation of gene expression is a major mechanism used by plants to confer phenotypic plasticity, and yet compared with other eukaryotes or bacteria, little is known about the design principles. We generated an extensive catalog of nascent and steady-state transcripts in Arabidopsis thaliana seedlings using global nuclear run-on sequencing (GRO-seq), 5′GRO-seq, and RNA-seq and reanalyzed published maize data to capture characteristics of plant transcription. De novo annotation of nascent transcripts accurately mapped start sites and unstable transcripts. Examining the promoters of coding and noncoding transcripts identified comparable chromatin signatures, a conserved “TGT” core promoter motif and unreported transcription factor-binding sites. Mapping of engaged RNA polymerases showed a lack of enhancer RNAs, promoter-proximal pausing, and divergent transcription in Arabidopsis seedlings and maize, which are commonly present in yeast and humans. In contrast, Arabidopsis and maize genes accumulate RNA polymerases in proximity of the polyadenylation site, a trend that coincided with longer genes and CpG hypomethylation. Lack of promoter-proximal pausing and a higher correlation of nascent and steady-state transcripts indicate Arabidopsis may regulate transcription predominantly at the level of initiation. Our findings provide insight into plant transcription and eukaryotic gene expression as a whole. PMID:27729530

  10. The amino acid sequence of iguana (Iguana iguana) pancreatic ribonuclease.

    PubMed

    Zhao, W; Beintema, J J; Hofsteenge, J

    1994-01-15

    The pyrimidine-specific ribonuclease superfamily constitutes a group of homologous proteins so far found only in higher vertebrates. Four separate families are found in mammals, which have resulted from gene duplications in mammalian ancestors. To learn more about the evolutionary history of this superfamily, the primary structure and other characteristics of the pancreatic enzyme from iguana (Iguana iguana), a herbivorous lizard species belonging to the reptiles, have been determined. The polypeptide chain consists of 119 amino acid residues. The positions of insertions and deletions in the sequence are identical to those in the enzyme from snapping turtle. However, the two enzymes differ at 54% of the amino acid positions. Iguana ribonuclease contains no carbohydrate, although the enzyme possesses three recognition sites for carbohydrate attachment, and has a high number of acidic residues in a localized part of the sequence.

  11. Analysis of putative nonulosonic acid biosynthesis pathways in Archaea reveals a complex evolutionary history.

    PubMed

    Kandiba, Lina; Eichler, Jerry

    2013-08-01

    Sialic acids and the other nonulosonic acid sugars, legionaminic acid and pseudaminic acid, are nine carbon-containing sugars that can be detected as components of the glycans decorating proteins and other molecules in Eukarya and Bacteria. Yet, despite the prevalence of N-glycosylation in Archaea and the variety of sugars recruited for the archaeal version of this post-translational modification, only a single report of a nonulosonic acid sugar in an archaeal N-linked glycan has appeared. Hence, to obtain a clearer picture of nonulosonic acid sugar biosynthesis capability in Archaea, 122 sequenced genomes were scanned for the presence of genes involved in the biogenesis of these sugars. The results reveal that while Archaea and Bacteria share a common route of sialic acid biosynthesis, numerous archaeal nonulosonic acid sugar biosynthesis pathway components were acquired from elsewhere via various routes. Still, the limited number of Archaea encoding components involved in the synthesis of nonulosonic acid sugars implies that such saccharides are not major components of glycans in this domain.

  12. Protein sequences from mastodon and Tyrannosaurus rex revealed by mass spectrometry.

    PubMed

    Asara, John M; Schweitzer, Mary H; Freimark, Lisa M; Phillips, Matthew; Cantley, Lewis C

    2007-04-13

    Fossilized bones from extinct taxa harbor the potential for obtaining protein or DNA sequences that could reveal evolutionary links to extant species. We used mass spectrometry to obtain protein sequences from bones of a 160,000- to 600,000-year-old extinct mastodon (Mammut americanum) and a 68-million-year-old dinosaur (Tyrannosaurus rex). The presence of T. rex sequences indicates that their peptide bonds were remarkably stable. Mass spectrometry can thus be used to determine unique sequences from ancient organisms from peptide fragmentation patterns, a valuable tool to study the evolution and adaptation of ancient taxa from which genomic sequences are unlikely to be obtained.

  13. Molecular Phylogeny of Sequenced Saccharomycetes Reveals Polyphyly of the Alternative Yeast Codon Usage

    PubMed Central

    Mühlhausen, Stefanie; Kollmar, Martin

    2014-01-01

    The universal genetic code defines the translation of nucleotide triplets, called codons, into amino acids. In many Saccharomycetes a unique alteration of this code affects the translation of the CUG codon, which is normally translated as leucine. Most of the species encoding CUG alternatively as serine belong to the Candida genus and were grouped into a so-called CTG clade. However, the “Candida genus” is not a monophyletic group and several Candida species are known to use the standard CUG translation. The codon identity could have been changed in a single branch, the ancestor of the Candida, or to several branches independently leading to a polyphyletic alternative yeast codon usage (AYCU). In order to resolve the monophyly or polyphyly of the AYCU, we performed a phylogenomics analysis of 26 motor and cytoskeletal proteins from 60 sequenced yeast species. By investigating the CUG codon positions with respect to sequence conservation at the respective alignment positions, we were able to unambiguously assign the standard code or AYCU. Quantitative analysis of the highly conserved leucine and serine alignment positions showed that 61.1% and 17% of the CUG codons coding for leucine and serine, respectively, are at highly conserved positions, whereas only 0.6% and 2.3% of the CUG codons, respectively, are at positions conserved in the respective other amino acid. Plotting the codon usage onto the phylogenetic tree revealed the polyphyly of the AYCU with Pachysolen tannophilus and the CTG clade branching independently within a time span of 30–100 Ma. PMID:25646540
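
    The effect of the alternative yeast codon usage on a translated protein can be illustrated with a short sketch. It builds the standard codon table, derives a second table in which only CUG is reassigned from leucine to serine, and translates the same message with both. The mRNA string and all names are hypothetical; this is not the analysis pipeline used in the study.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in the order U, C, A, G.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Alternative yeast codon usage: identical except that CUG encodes serine.
AYCU = {**STANDARD, "CUG": "S"}


def translate(mrna: str, table: dict) -> str:
    """Translate an mRNA in frame 0 with the given codon table ('*' = stop)."""
    codons = (mrna[i:i + 3] for i in range(0, len(mrna) - 2, 3))
    return "".join(table[codon] for codon in codons)


# Hypothetical message containing two CUG codons.
example = "AUGCUGGCUCUGUAA"
standard_protein = translate(example, STANDARD)
aycu_protein = translate(example, AYCU)
print(standard_protein, aycu_protein)
```

    Only the positions encoded by CUG differ between the two translations, which is why conservation at CUG-aligned positions can indicate which code a species uses.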

  14. Amino acid sequence and comparative antigenicity of chicken metallothionein.

    PubMed Central

    McCormick, C C; Fullmer, C S; Garvey, J S

    1988-01-01

    The complete amino acid sequence of metallothionein (MT) from chicken liver is reported. The primary structure was determined by automated sequence analysis of peptides produced by limited acid hydrolysis and by trypsin digestion. The comparative antigenicity of chicken MT was determined by radioimmunoassay using rabbit anti-rat MT polyclonal antibody. Chicken MT consists of 63 amino acids as compared to 61 found in MTs from mammals. One insertion (and two substitutions) occurs in the amino-terminal region, a region considered invariant among mammalian MTs. Eighteen of the 20 cysteines in chicken MT were aligned with cysteines from other mammalian sequences. Two cysteines near the carboxyl terminus are shifted by one residue due to the insertion of proline in that region. Overall, the chicken protein showed approximately 68% sequence identity in a comparison with various mammalian MTs. The affinity of the polyclonal antibody for chicken MT was decreased by 2 orders of magnitude in comparison to that of a mammalian MT (rat MT isoforms). This reduced affinity is attributed to major substitutions in chicken MT in the regions of the principal determinants of mammalian MTs. Theoretical analysis of the primary structure predicted the secondary structure to consist of reverse turns and random coils with no stable beta or helix conformations. There is no evidence that chicken MT differs functionally from mammalian MTs. PMID:2448773

  15. Genome Sequence of Thermofilum pendens Reveals an Exceptional Loss of Biosynthetic Pathways without Genome Reduction

    SciTech Connect

    Anderson, Iain; Rodriquez, Jason; Susanti, Dwi; Porat, I.; Reich, Claudia; Ulrich, Luke; Elkins, James G; Mavromatis, K; Lykidis, A; Kim, Edwin; Thompson, Linda S; Nolan, Matt; Land, Miriam L; Copeland, A; Lapidus, Alla L.; Lucas, Susan; Detter, J C; Zhulin, Igor B; Olsen, Gary; Whitman, W. B.; Mukhopadhyay, Biswarup; Bristow, James; Kyrpides, Nikos C

    2008-01-01

    We report the complete genome of Thermofilum pendens, a deep-branching member of class Thermoproteales of Crenarchaeota. T. pendens is a sulfur-dependent, anaerobic heterotroph isolated from a solfatara in Iceland. It was known to utilize peptides as an energy source, but the genome reveals substantial ability to grow on carbohydrates. T. pendens is the first Crenarchaeote and only the second archaeon found to have transporters of the phosphotransferase system. T. pendens is known to require an extract of Thermoproteus tenax for growth, and the genome sequence reveals that biosynthetic pathways for purines, most amino acids, and most cofactors are absent. T. pendens has fewer biosynthetic enzymes than any other free-living organism. In addition to heterotrophy, T. pendens may gain energy from sulfur reduction with hydrogen and formate as electron donors. It may also be capable of sulfur-independent growth on formate with formate hydrogenlyase. Additional novel features are the presence of a monomethylamine:corrinoid methyltransferase, the first time this enzyme has been found outside of Methanosarcinales, and a presenilin-related protein from a new subfamily. Predicted highly expressed proteins include ABC transporters for carbohydrates and peptides, and CRISPR-associated proteins, suggesting that defense against viruses is a high priority.

  16. Genome sequence of Thermofilum pendens reveals an exceptional loss of biosynthetic pathways without genome reduction

    SciTech Connect

    Kyrpides, Nikos; Anderson, Iain; Rodriguez, Jason; Susanti, Dwi; Porat, Iris; Reich, Claudia; Ulrich, Luke E.; Elkins, James G.; Mavromatis, Kostas; Lykidis, Athanasios; Kim, Edwin; Thompson, Linda S.; Nolan, Matt; Land, Miriam; Copeland, Alex; Lapidus, Alla; Lucas, Susan; Detter, Chris; Zhulin, Igor B.; Olsen, Gary J.; Whitman, William; Mukhopadhyay, Biswarup; Bristow, James; Kyrpides, Nikos

    2008-01-01

    We report the complete genome of Thermofilum pendens, a deep-branching, hyperthermophilic member of the order Thermoproteales within the archaeal kingdom Crenarchaeota. T. pendens is a sulfur-dependent, anaerobic heterotroph isolated from a solfatara in Iceland. It is an extracellular commensal, requiring an extract of Thermoproteus tenax for growth, and the genome sequence reveals that biosynthetic pathways for purines, most amino acids, and most cofactors are absent. In fact T. pendens has fewer biosynthetic enzymes than obligate intracellular parasites, although it does not display other features common among obligate parasites and thus does not appear to be in the process of becoming a parasite. It appears that T. pendens has adapted to life in an environment rich in nutrients. T. pendens was known to utilize peptides as an energy source, but the genome reveals substantial ability to grow on carbohydrates. T. pendens is the first crenarchaeote and only the second archaeon found to have a transporter of the phosphotransferase system. In addition to fermentation, T. pendens may gain energy from sulfur reduction with hydrogen and formate as electron donors. It may also be capable of sulfur-independent growth on formate with formate hydrogenlyase. Additional novel features are the presence of a monomethylamine:corrinoid methyltransferase, the first time this enzyme has been found outside of Methanosarcinales, and a presenilin-related protein. Predicted highly expressed proteins do not include housekeeping genes, and instead include ABC transporters for carbohydrates and peptides, and CRISPR-associated proteins.

  17. Genome sequence of the necrotrophic plant pathogen Pythium ultimum reveals original pathogenicity mechanisms and effector repertoire.

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The P. ultimum DAOM BR144 (=CBS 805.95 = ATCC200006) genome (42.8 Mb) encodes 15,290 genes, and has extensive sequence similarity and synteny with related Phytophthora spp., including the potato late blight pathogen Phytophthora infestans. Whole transcriptome sequencing revealed expression of 86 % o...

  18. Sequences Of Amino Acids For Human Serum Albumin

    NASA Technical Reports Server (NTRS)

    Carter, Daniel C.

    1992-01-01

    Sequences of amino acids defined for use in making polypeptides one-third to one-sixth as large as parent human serum albumin molecule. Smaller, chemically stable peptides have diverse applications including service as artificial human serum and as active components of biosensors and chromatographic matrices. In applications involving production of artificial sera from new sequences, little or no concern about viral contaminants. Smaller genetically engineered polypeptides more easily expressed and produced in large quantities, making commercial isolation and production more feasible and profitable.

  19. Nanopores and nucleic acids: prospects for ultrarapid sequencing

    NASA Technical Reports Server (NTRS)

    Deamer, D. W.; Akeson, M.

    2000-01-01

    DNA and RNA molecules can be detected as they are driven through a nanopore by an applied electric field at rates ranging from several hundred microseconds to a few milliseconds per molecule. The nanopore can rapidly discriminate between pyrimidine and purine segments along a single-stranded nucleic acid molecule. Nanopore detection and characterization of single molecules represents a new method for directly reading information encoded in linear polymers. If single-nucleotide resolution can be achieved, it is possible that nucleic acid sequences can be determined at rates exceeding a thousand bases per second.

  20. The amino acid sequence of Lady Amherst's pheasant (Chrysolophus amherstiae) and golden pheasant (Chrysolophus pictus) egg-white lysozymes.

    PubMed

    Araki, T; Kuramoto, M; Torikata, T

    1990-09-01

    The amino acids of Lady Amherst's pheasant and golden pheasant egg-white lysozymes have been sequenced. The carboxymethylated lysozymes were digested with trypsin followed by sequencing of the tryptic peptides. Lady Amherst's pheasant lysozyme proved to consist of 129 amino acid residues, and a relative molecular mass of 14,423 Da was calculated. This lysozyme had 6 amino acid substitutions when compared with hen egg-white lysozyme: Phe3 to Tyr, His15 to Leu, Gln41 to His, Asn77 to His, Gln121 to Asn, and a newly found substitution of Ile124 to Thr. The amino acid sequence of golden pheasant lysozyme was identical to that of Lady Amherst's pheasant lysozyme. The phylogenetic tree constructed by the comparison of amino acid sequences of phasianoid bird lysozymes revealed a minimum genetic distance between these pheasants and the turkey-peafowl group.

  1. In silico comparative analysis of DNA and amino acid sequences for prion protein gene.

    PubMed

    Kim, Y; Lee, J; Lee, C

    2008-01-01

    Genetic variability might contribute to species specificity of prion diseases in various organisms. In this study, structures of the prion protein gene (PRNP) and its amino acids were compared among species for which sequence data were available. Comparisons of PRNP DNA sequences among 12 species including human, chimpanzee, monkey, bovine, ovine, dog, mouse, rat, wallaby, opossum, chicken and zebrafish allowed us to identify candidate regulatory regions in intron 1 and the 3'-untranslated region (UTR) in addition to the coding region. Highly conserved putative binding sites for transcription factors, such as heat shock factor 2 (HSF2) and myocyte enhancer factor 2 (MEF2), were discovered in intron 1. In the 3'-UTR, the functional sequence (ATTAAA) for nucleus-specific polyadenylation was found in all the analysed species. The functional sequence (TTTTTAT) for maturation-specific polyadenylation was observed in identical form only in ovine, with one or two nucleotide mismatches in the other species. A comparison of the amino acid sequences in 53 species revealed a large sequence identity. In particular, the octapeptide repeat region was observed in all the species but frog and zebrafish. Functional changes and susceptibility to prion diseases with various isoforms of prion protein could be caused by numeric variability and conformational changes discovered in the repeat sequences.
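
    A scan for the two polyadenylation motifs named above, tolerating a fixed number of mismatches, can be sketched as a sliding-window comparison. The 3'-UTR fragment below is made up for illustration and is not taken from any PRNP sequence; the function name and parameters are likewise hypothetical, and the abstract does not say which tool the authors used.

```python
def find_motif(seq: str, motif: str, max_mismatches: int = 0):
    """Return (start, matched_window, mismatches) for every window of seq
    that differs from motif at no more than max_mismatches positions."""
    hits = []
    for start in range(len(seq) - len(motif) + 1):
        window = seq[start:start + len(motif)]
        mismatches = sum(a != b for a, b in zip(window, motif))
        if mismatches <= max_mismatches:
            hits.append((start, window, mismatches))
    return hits


# Hypothetical 3'-UTR fragment (not real PRNP data).
utr = "GCATTAAAGGCTTTTTGTCC"

exact_attaaa = find_motif(utr, "ATTAAA")
exact_tttttat = find_motif(utr, "TTTTTAT")
near_tttttat = find_motif(utr, "TTTTTAT", max_mismatches=1)
print(exact_attaaa, exact_tttttat, near_tttttat)
```

    In this toy fragment ATTAAA is found exactly, while TTTTTAT appears only when one mismatch is allowed, mirroring the pattern the abstract describes for species other than ovine.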

  2. Amino-Acid Sequence of NADP-Specific Glutamate Dehydrogenase of Neurospora crassa

    PubMed Central

    Wootton, John C.; Chambers, Geoffrey K.; Holder, Anthony A.; Baron, Andrew J.; Taylor, John G.; Fincham, John R. S.; Blumenthal, Kenneth M.; Moon, Kenneth; Smith, Emil L.

    1974-01-01

    A tentative primary structure of the NADP-specific glutamate dehydrogenase [L-glutamate: NADP oxidoreductase (deaminating), EC 1.4.1.4] from Neurospora crassa has been determined. The proposed sequence contains 452 amino-acid residues in each of the identical subunits of the hexameric enzyme. Comparison of the sequence with that of the bovine liver enzyme reveals considerable homology in the amino-terminal portion of the chain, including the vicinity of the reactive lysine, with only shorter stretches of homology within the carboxyl-terminal regions. The significance of this distribution of homologous regions is discussed. PMID:4155068

  3. Exhaustive T-cell repertoire sequencing of human peripheral blood samples reveals signatures of antigen selection and a directly measured repertoire size of at least 1 million clonotypes.

    PubMed

    Warren, René L; Freeman, J Douglas; Zeng, Thomas; Choe, Gina; Munro, Sarah; Moore, Richard; Webb, John R; Holt, Robert A

    2011-05-01

    Massively parallel sequencing is a useful approach for characterizing T-cell receptor diversity. However, immune receptors are extraordinarily difficult sequencing targets because any given receptor variant may be present in very low abundance and may differ legitimately by only a single nucleotide. We show that the sensitivity of sequence-based repertoire profiling is limited by both sequencing depth and sequencing accuracy. At two timepoints, 1 wk apart, we isolated bulk PBMC plus naïve (CD45RA+/CD45RO-) and memory (CD45RA-/CD45RO+) T-cell subsets from a healthy donor. From T-cell receptor beta chain (TCRB) mRNA we constructed and sequenced multiple libraries to obtain a total of 1.7 billion paired sequence reads. The sequencing error rate was determined empirically and used to inform a high stringency data filtering procedure. The error filtered data yielded 1,061,522 distinct TCRB nucleotide sequences from this subject which establishes a new, directly measured, lower limit on individual T-cell repertoire size and provides a useful reference set of sequences for repertoire analysis. TCRB nucleotide sequences obtained from two additional donors were compared to those from the first donor and revealed limited sharing (up to 1.1%) of nucleotide sequences among donors, but substantially higher sharing (up to 14.2%) of inferred amino acid sequences. For each donor, shared amino acid sequences were encoded by a much larger diversity of nucleotide sequences than were unshared amino acid sequences. We also observed a highly statistically significant association between numbers of shared sequences and shared HLA class I alleles.
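
    Sharing can be higher at the amino acid level than at the nucleotide level because different nucleotide sequences can encode the same peptide. The sketch below illustrates this with four made-up 9-nt fragments. The sharing percentage is defined here, as an assumption, as the fraction of one donor's distinct sequences that also occur in the other donor; the study's exact definition is not given in the abstract. All names are hypothetical.

```python
from itertools import product

# Standard genetic code for DNA codons, bases in the order T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate a DNA sequence in frame 0."""
    return "".join(CODON[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


def percent_shared(a: set, b: set) -> float:
    """Percentage of the distinct sequences in a that also occur in b."""
    return 100.0 * len(a & b) / len(a)


# Hypothetical fragments; real TCRB sequences are much longer.
donor_a = {"TGTGCCAGC", "TGTGCCTGG"}
donor_b = {"TGCGCAAGT", "TGTGCCAAA"}

nt_sharing = percent_shared(donor_a, donor_b)
aa_sharing = percent_shared({translate(s) for s in donor_a},
                            {translate(s) for s in donor_b})
print(nt_sharing, aa_sharing)
```

    Here no nucleotide sequence is shared, yet one translated sequence is, because two distinct nucleotide sequences encode the same three residues.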

  4. The complementary deoxyribonucleic acid sequence of guinea pig endometrial prorelaxin.

    PubMed

    Lee, Y A; Bryant-Greenwood, G D; Mandel, M; Greenwood, F C

    1992-03-01

    The nucleotide sequence of the relaxin gene transcript in the endometrium of the late pregnant guinea pig has been determined. The strategy used was a combination of polymerase chain reaction (PCR) with primers designed from the mRNA sequence of porcine preprorelaxin, rapid amplification of cDNA ends-PCR, and blunt end cloning in M13 mp18. With heterologous primers, a 226-basepair (bp) segment of the guinea pig relaxin gene sequence was obtained and was used to design a guinea pig-specific primer for use with the rapid amplification of cDNA ends-PCR method. The latter allowed completion of the sequence of 336 bp, with a 96-bp overlap. The sequence obtained shows greater homology at both the nucleotide and amino acid levels with porcine and human relaxins H1 and H2 than with rat relaxin, supporting the thesis that the guinea pig is not a rodent. The transcription of the guinea pig endometrial relaxin gene during pregnancy was confirmed by Northern analysis of guinea pig endometrial tissues with a species-specific cDNA probe. The endometrial relaxin gene is transcribed during pregnancy, but not in lactation, consistent with the observed immunostaining for relaxin.

  5. Quantum-Sequencing: Biophysics of quantum tunneling through nucleic acids

    NASA Astrophysics Data System (ADS)

    Casamada Ribot, Josep; Chatterjee, Anushree; Nagpal, Prashant

    2014-03-01

    Tunneling microscopy and spectroscopy have been used extensively in physical surface sciences to study quantum tunneling, to measure the electronic local density of states of nanomaterials, and to characterize adsorbed species. Quantum-Sequencing (Q-Seq) is a new method based on tunneling microscopy for electronic sequencing of single molecules of nucleic acids. A major goal of third-generation sequencing technologies is to develop a fast, reliable, enzyme-free single-molecule sequencing method. Here, we present the unique "electronic fingerprints" for all nucleotides on DNA and RNA using Q-Seq, along with their intrinsic biophysical parameters. We have analyzed tunneling spectra for the nucleotides at different pH conditions and analyzed the HOMO, LUMO and energy gap for all of them. In addition, we show a number of biophysical parameters to further characterize all nucleobases (electron and hole transition voltage and energy barriers). These results highlight the robustness of Q-Seq as a technique for next-generation sequencing.

  6. Comment on "Protein sequences from mastodon and Tyrannosaurus rex revealed by mass spectrometry".

    PubMed

    Buckley, Mike; Walker, Angela; Ho, Simon Y W; Yang, Yue; Smith, Colin; Ashton, Peter; Oates, Jane Thomas; Cappellini, Enrico; Koon, Hannah; Penkman, Kirsty; Elsworth, Ben; Ashford, Dave; Solazzo, Caroline; Andrews, Phillip; Strahler, John; Shapiro, Beth; Ostrom, Peggy; Gandhi, Hasand; Miller, Webb; Raney, Brian; Zylber, Maria Ines; Gilbert, M Thomas P; Prigodich, Richard V; Ryan, Michael; Rijsdijk, Kenneth F; Janoo, Anwar; Collins, Matthew J

    2008-01-04

    We used authentication tests developed for ancient DNA to evaluate claims by Asara et al. (Reports, 13 April 2007, p. 280) of collagen peptide sequences recovered from mastodon and Tyrannosaurus rex fossils. Although the mastodon samples pass these tests, absence of amino acid composition data, lack of evidence for peptide deamidation, and association of alpha1(I) collagen sequences with amphibians rather than birds suggest that T. rex does not.

  7. Preliminary whole-exome sequencing reveals mutations that imply common tumorigenicity pathways in multiple endocrine neoplasia type 1 patients

    PubMed Central

    Arenas, Minerva Angélica Romero; Fowler, Richard G.; Lucas, F. Anthony San; Shen, Jie; Rich, Thereasa A.; Grubbs, Elizabeth G.; Lee, Jeffrey E.; Scheet, Paul; Perrier, Nancy D.; Zhao, Hua

    2016-01-01

    Background: Whole-exome sequencing studies have not established definitive somatic mutation patterns among patients with sporadic hyperparathyroidism (HPT). No sequencing has evaluated multiple endocrine neoplasia type 1 (MEN1)-related HPT. We sought to perform whole-exome sequencing in HPT patients to identify somatic mutations and associated biological pathways and tumorigenic networks. Methods: Whole-exome sequencing was performed on blood and tissue from HPT patients (MEN1 and sporadic) and somatic single nucleotide variants (SNVs) were identified. Stop-gain and stop-loss SNVs were analyzed with Ingenuity Pathways Analysis (IPA). Loss of heterozygosity (LOH) was also assessed. Results: Sequencing was performed on 4 MEN1 and 10 sporadic cases. Eighteen stop-gain/stop-loss SNV mutations were identified in 3 MEN1 patients. One complex network was identified on IPA: Cellular function and maintenance, tumor morphology, and cardiovascular disease (IPA score = 49). A nonsynonymous SNV of TP53 (lysine-to-glutamic acid change at codon 81) identified in a MEN1 patient was suggested to be a driver mutation (Cancer-specific High-throughput Annotation of Somatic Mutations; P = .002). All MEN1 and 3/10 sporadic specimens demonstrated LOH of chromosome 11. Conclusion: Whole-exome sequencing revealed somatic mutations in MEN1 associated with a single tumorigenic network, whereas sporadic pathogenesis seemed to be more diverse. A somatic TP53 mutation was also identified. LOH of chromosome 11 was seen in all MEN1 and 3 of 10 sporadic patients. PMID:25456907

  8. Deep sequencing reveals microbiota dysbiosis of tongue coat in patients with liver carcinoma

    NASA Astrophysics Data System (ADS)

    Lu, Haifeng; Ren, Zhigang; Li, Ang; Zhang, Hua; Jiang, Jianwen; Xu, Shaoyan; Luo, Qixia; Zhou, Kai; Sun, Xiaoli; Zheng, Shusen; Li, Lanjuan

    2016-09-01

    Liver carcinoma (LC) is a common malignancy worldwide, associated with high morbidity and mortality. Characterizing microbiome profiles of the tongue coat may provide useful insights and potential diagnostic markers for LC patients. Herein, we investigate for the first time the tongue coat microbiome of LC patients with cirrhosis based on 16S ribosomal RNA (rRNA) gene sequencing. After strict inclusion and exclusion criteria were applied, 35 early LC patients with cirrhosis and 25 matched healthy subjects were enrolled. Microbiome diversity of the tongue coat in LC patients was significantly increased, as shown by the Shannon, Simpson and Chao 1 indexes. The tongue coat microbiome significantly distinguished LC patients from healthy subjects in principal component analysis. Tongue coat microbial profiles comprised 38 operational taxonomic units assigned to 23 different genera that distinguished LC patients. Linear discriminant analysis (LDA) effect size (LEfSe) revealed significant microbial dysbiosis of tongue coats in LC patients. Strikingly, Oribacterium and Fusobacterium could distinguish LC patients from healthy subjects. LEfSe outputs showed differences between LC patients and healthy subjects in microbial gene functions related to nickel/iron transport, amino acid transport, energy production systems and metabolism. These findings identify, for the first time, microbiota dysbiosis of the tongue coat in LC patients and may provide a novel, non-invasive potential diagnostic biomarker of LC.
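
    The three diversity indexes named in this abstract can be computed from a vector of per-taxon read counts. The sketch below uses textbook definitions and is not the authors' pipeline: Shannon as the natural-log entropy, Simpson in its 1 - sum(p^2) form, and Chao 1 in its bias-corrected form. Which variants the study used is not stated, so these choices are assumptions, and the count vectors are made up.

```python
import math


def shannon(counts):
    """Shannon index H = -sum(p * ln p) over taxa with non-zero counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def simpson(counts):
    """Simpson diversity in the 1 - sum(p^2) form (one of several conventions)."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)


def chao1(counts):
    """Bias-corrected Chao 1 richness: S_obs + F1 * (F1 - 1) / (2 * (F2 + 1)),
    where F1 and F2 are the numbers of taxa seen exactly once and twice."""
    observed = sum(1 for c in counts if c > 0)
    singletons = sum(1 for c in counts if c == 1)
    doubletons = sum(1 for c in counts if c == 2)
    return observed + singletons * (singletons - 1) / (2 * (doubletons + 1))


# Hypothetical OTU count vectors for two samples.
even_sample = [10, 10, 10, 10]
skewed_sample = [5, 1, 1, 2, 2]

print(shannon(even_sample), simpson(even_sample), chao1(even_sample))
print(chao1(skewed_sample))
```

    Shannon and Simpson respond to both richness and evenness, whereas Chao 1 estimates richness alone by using rare taxa to extrapolate how many went unobserved.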

  9. Metagenomic sequencing reveals the relationship between microbiota composition and quality of Chinese Rice Wine

    PubMed Central

    Hong, Xutao; Chen, Jing; Liu, Lin; Wu, Huan; Tan, Haiqin; Xie, Guangfa; Xu, Qian; Zou, Huijun; Yu, Wenjing; Wang, Lan; Qin, Nan

    2016-01-01

    Chinese Rice Wine (CRW) is a common alcoholic beverage in China. To investigate the influence of microbial composition on the quality of CRW, high throughput sequencing was performed for 110 wine samples on bacterial 16S rRNA gene and fungal Internal Transcribed Spacer II (ITS2). Bioinformatic analyses demonstrated that the quality of yeast starter and final wine correlated with microbial taxonomic composition, which was exemplified by our finding that wine spoilage resulted from a high proportion of genus Lactobacillus. Subsequently, based on Lactobacillus abundance of an early stage, a model was constructed to predict final wine quality. In addition, three batches of 20 representative wine samples selected from a pool of 110 samples were further analyzed in metagenomics. The results revealed that wine spoilage was due to rapid growth of Lactobacillus brevis at the early stage of fermentation. Gene functional analysis indicated the importance of some pathways such as synthesis of biotin, malolactic fermentation and production of short-chain fatty acid. These results led to a conclusion that metabolisms of microbes influence the wine quality. Thus, nurturing of beneficial microbes and inhibition of undesired ones are both important for the mechanized brewery. PMID:27241862

  10. Deep sequencing reveals microbiota dysbiosis of tongue coat in patients with liver carcinoma

    PubMed Central

    Lu, Haifeng; Ren, Zhigang; Li, Ang; Zhang, Hua; Jiang, Jianwen; Xu, Shaoyan; Luo, Qixia; Zhou, Kai; Sun, Xiaoli; Zheng, Shusen; Li, Lanjuan

    2016-01-01

    Liver carcinoma (LC) is a common malignancy worldwide, associated with high morbidity and mortality. Characterizing the microbiome profile of the tongue coat may provide useful insights and a potential diagnostic marker for LC patients. Here, we investigate for the first time the tongue coat microbiome of LC patients with cirrhosis, based on 16S ribosomal RNA (rRNA) gene sequencing. After applying strict inclusion and exclusion criteria, 35 early LC patients with cirrhosis and 25 matched healthy subjects were enrolled. Microbiome diversity of the tongue coat was significantly increased in LC patients, as shown by the Shannon, Simpson and Chao 1 indexes. Principal component analysis of the tongue coat microbiome significantly distinguished LC patients from healthy subjects. Tongue coat microbial profiles comprising 38 operational taxonomic units assigned to 23 different genera distinguished LC patients. Linear discriminant analysis (LDA) effect size (LEfSe) revealed significant microbial dysbiosis of the tongue coat in LC patients. Strikingly, Oribacterium and Fusobacterium could distinguish LC patients from healthy subjects. LEfSe outputs showed differences between LC patients and healthy subjects in microbial gene functions related to nickel/iron transport, amino acid transport, energy production systems and metabolism. These findings are the first to identify microbiota dysbiosis of the tongue coat in LC patients and may provide a novel, non-invasive potential diagnostic biomarker of LC. PMID:27605161
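
    The Shannon, Simpson and Chao 1 indexes named in this abstract can be illustrated with a minimal sketch. The abstract does not say which variants or software the authors used, so the formulas below (natural-log Shannon, Gini-Simpson, bias-corrected Chao 1) are assumptions, and the read counts are hypothetical rather than data from the study.

```python
import math

def shannon_index(counts):
    """Shannon diversity H' = -sum(p_i * ln p_i) over taxa with nonzero counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)

def simpson_index(counts):
    """Gini-Simpson diversity 1 - sum(p_i^2); other Simpson variants exist."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts if c > 0)

def chao1(counts):
    """Bias-corrected Chao 1 richness: S_obs + F1*(F1-1) / (2*(F2+1)),
    where F1 and F2 are the numbers of singleton and doubleton taxa."""
    observed = sum(1 for c in counts if c > 0)
    singletons = sum(1 for c in counts if c == 1)
    doubletons = sum(1 for c in counts if c == 2)
    return observed + singletons * (singletons - 1) / (2 * (doubletons + 1))

# Hypothetical OTU read counts for a single sample (illustration only).
sample = [50, 30, 10, 5, 2, 2, 1, 1, 1]
print("Shannon:", shannon_index(sample))
print("Simpson:", simpson_index(sample))
print("Chao 1: ", chao1(sample))
```

    Comparing groups, as the study does, would additionally require a statistical test across per-sample values, which this sketch does not attempt.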

  11. Molecular cloning and amino acid sequence of human 5-lipoxygenase

    SciTech Connect

    Matsumoto, T.; Funk, C.D.; Radmark, O.; Hoeoeg, J.O.; Joernvall, H.; Samuelsson, B.

    1988-01-01

    5-Lipoxygenase (EC 1.13.11.34), a Ca²⁺- and ATP-requiring enzyme, catalyzes the first two steps in the biosynthesis of the peptidoleukotrienes and the chemotactic factor leukotriene B₄. A cDNA clone corresponding to 5-lipoxygenase was isolated from a human lung lambda gt11 expression library by immunoscreening with a polyclonal antibody. Additional clones from a human placenta lambda gt11 cDNA library were obtained by plaque hybridization with the ³²P-labeled lung cDNA clone. Sequence data obtained from several overlapping clones indicate that the composite DNAs contain the complete coding region for the enzyme. From the deduced primary structure, 5-lipoxygenase encodes a 673 amino acid protein with a calculated molecular weight of 77,839. Direct analysis of the native protein and its proteolytic fragments confirmed the deduced composition, the amino-terminal amino acid sequence, and the structure of many internal segments. 5-Lipoxygenase has no apparent sequence homology with leukotriene A₄ hydrolase or Ca²⁺-binding proteins. RNA blot analysis indicated substantial amounts of an mRNA species of approximately 2700 nucleotides in leukocytes, lung, and placenta.

  12. Nucleic acid sequence detection using multiplexed oligonucleotide PCR

    DOEpatents

    Nolan, John P.; White, P. Scott

    2006-12-26

    Methods for rapidly detecting single or multiple sequence alleles in a sample nucleic acid are described. Provided are all of the oligonucleotide pairs capable of annealing specifically to a target allele and discriminating among possible sequences thereof, and ligating to each other to form an oligonucleotide complex when a particular sequence feature is present (or, alternatively, absent) in the sample nucleic acid. The design of each oligonucleotide pair permits the subsequent high-level PCR amplification of a specific amplicon when the oligonucleotide complex is formed, but not when the oligonucleotide complex is not formed. The presence or absence of the specific amplicon is used to detect the allele. Detection of the specific amplicon may be achieved using a variety of methods well known in the art, including without limitation, oligonucleotide capture onto DNA chips or microarrays, oligonucleotide capture onto beads or microspheres, electrophoresis, and mass spectrometry. Various labels and address-capture tags may be employed in the amplicon detection step of multiplexed assays, as further described herein.

  13. The amino acid sequence of chymopapain from Carica papaya.

    PubMed Central

    Watson, D C; Yaguchi, M; Lynn, K R

    1990-01-01

    Chymopapain is a polypeptide of 218 amino acid residues. It has considerable structural similarity with papain and papaya proteinase omega, including conservation of the catalytic site and of the disulphide bonding. Chymopapain is like papaya proteinase omega in carrying four extra residues between papain positions 168 and 169, but differs from both papaya proteinases in the composition of its S2 subsite, as well as in having a second thiol group, Cys-117. Some evidence for the amino acid sequence of chymopapain has been deposited as Supplementary Publication SUP 50153 (12 pages) at the British Library Document Supply Centre, Boston Spa., Wetherby, West Yorkshire LS23 7BQ, U.K., from whom copies may be obtained on the terms indicated in Biochem. J. (1990) 265, 5. The information comprises Supplement Tables 1-4, which contain, in order, amino acid compositions of peptides from tryptic, peptic, CNBr and mild acid cleavages, Supplement Fig. 1, showing re-fractionation of selected peaks from Fig. 2 of the main paper. Supplement Fig. 2, showing cation-exchange chromatography of the earliest-eluted peak of Fig. 3 of the main paper, Supplement Fig. 3, showing reverse-phase h.p.l.c. of the later-eluted peak from Fig. 3 of the main paper, and Supplement Fig. 4, showing the separation of peptides after mild acid hydrolysis of CNBr-cleavage fragment CB3. PMID:2106878

  14. The amino acid sequence of rabbit cardiac troponin I.

    PubMed Central

    Grand, R J; Wilkinson, J M

    1976-01-01

    The complete amino acid sequence of troponin I from rabbit cardiac muscle was determined by the isolation of four unique CNBr fragments, together with overlapping tryptic peptides containing radioactive methionine residues. Overlap data for residues 35-36, 93-94 and 140-145 are incomplete, the sequence at these positions being based on homology with the sequence of the fast-skeletal-muscle protein. Cardiac troponin I is a single polypeptide chain of 206 residues with mol.wt. 23550 and an extinction coefficient, E 1%,1cm/280, of 4.37. The protein has a net positive charge of 14 and is thus somewhat more basic than troponin I from fast-skeletal muscle. Comparison of the sequences of troponin I from cardiac and fast skeletal muscle show that the cardiac protein has 26 extra residues at the N-terminus which account for the larger size of the protein. In the remainder of sequence there is a considerable degree of homology, this being greater in the C-terminal two-thirds of the molecule. The region in the cardiac protein corresponding to the peptide with inhibitory activity from the fast-skeletal-muscle protein is very similar and it seems unlikely that this is the cause of the difference in inhibitory activity between the two proteins. The region responsible for binding troponin C, however, possesses a lower degree of homology. Detailed evidence on which the sequence is based has been deposited as Supplementary Publication SUP 50072 (20 pages), at the British Library Lending Division, Boston Spa, Wetherby, West Yorkshire LS23 7QB, U.K., from whom copies may be obtained on the terms given in Biochem. J. (1976) 153, 5. PMID:1008822

  15. Deep sequencing reveals global patterns of mRNA recruitment during translation initiation

    PubMed Central

    Gao, Rong; Yu, Kai; Nie, Jukui; Lian, Tengfei; Jin, Jianshi; Liljas, Anders; Su, Xiao-Dong

    2016-01-01

    In this work, we developed a method to systematically study the sequence preference of mRNAs during translation initiation. Traditionally, the dynamic process of translation initiation has been studied at the single molecule level with limited sequencing possibility. Using deep sequencing techniques, we identified the sequence preference at different stages of the initiation complexes. Our results provide a comprehensive and dynamic view of the initiation elements in the translation initiation region (TIR), including the S1 binding sequence, the Shine-Dalgarno (SD)/anti-SD interaction and the second codon, at the equilibrium of different initiation complexes. Moreover, our experiments reveal the conformational changes and regional dynamics throughout the dynamic process of mRNA recruitment. PMID:27460773

  16. Isolation of Hox Cluster Genes from Insects Reveals an Accelerated Sequence Evolution Rate

    PubMed Central

    Hadrys, Heike; Simon, Sabrina; Kaune, Barbara; Schmitt, Oliver; Schöner, Anja; Jakob, Wolfgang; Schierwater, Bernd

    2012-01-01

    Among gene families it is the Hox genes and among metazoan animals it is the insects (Hexapoda) that have attracted particular attention for studying the evolution of development. Surprisingly though, no Hox genes have been isolated from 26 out of 35 insect orders yet, and the existing sequences derive mainly from only two orders (61% from Hymenoptera and 22% from Diptera). We have designed insect specific primers and isolated 37 new partial homeobox sequences of Hox cluster genes (lab, pb, Hox3, ftz, Antp, Scr, abd-a, Abd-B, Dfd, and Ubx) from six insect orders, which are crucial to insect phylogenetics. These new gene sequences provide a first step towards comparative Hox gene studies in insects. Furthermore, comparative distance analyses of homeobox sequences reveal a correlation between gene divergence rate and species radiation success with insects showing the highest rate of homeobox sequence evolution. PMID:22685537

  17. Complete amino acid sequence of the A chain of human complement-classical-pathway enzyme C1r.

    PubMed Central

    Arlaud, G J; Willis, A C; Gagnon, J

    1987-01-01

    The amino acid sequence of human C1r A chain was determined, from sequence analysis performed on fragments obtained from C1r autolytic cleavage, cleavage of methionyl bonds, tryptic cleavages at arginine and lysine residues, and cleavages by staphylococcal proteinase. The polypeptide chain has an N-terminal serine residue and contains 446 amino acid residues (Mr 51,200). The sequence data allow chemical characterization of fragments alpha (positions 1-211), beta (positions 212-279) and gamma (positions 280-446) yielded from C1r autolytic cleavage, and identification of the two major cleavage sites generating these fragments. Position 150 of C1r A chain is occupied by a modified amino acid residue that, upon acid hydrolysis, yields erythro-beta-hydroxyaspartic acid, and that is located in a sequence homologous to the beta-hydroxyaspartic acid-containing regions of Factor IX, Factor X, protein C and protein Z. Sequence comparison reveals internal homology between two segments (positions 10-78 and 186-257). Two carbohydrate moieties are attached to the polypeptide chain, both via asparagine residues at positions 108 and 204. Combined with the previously determined sequence of C1r B chain [Arlaud & Gagnon (1983) Biochemistry 22, 1758-1764], these data give the complete sequence of human C1r. PMID:3036070

  18. Nucleotide and derived amino acid sequences of the major porin of Comamonas acidovorans and comparison of porin primary structures.

    PubMed Central

    Gerbl-Rieger, S; Peters, J; Kellermann, J; Lottspeich, F; Baumeister, W

    1991-01-01

    The DNA sequence of the gene which codes for the major outer membrane porin (Omp32) of Comamonas acidovorans has been determined. The structural gene encodes a precursor consisting of 351 amino acid residues with a signal peptide of 19 amino acid residues. Comparisons with amino acid sequences of outer membrane proteins and porins from several other members of the class Proteobacteria and of the Chlamydia trachomatis porin and the Neurospora crassa mitochondrial porin revealed a motif of eight regions of local homology. The results of this analysis are discussed with regard to common structural features of porins. PMID:1848840

  19. Ultrasensitive nucleic acid sequence detection by single-molecule electrophoresis

    SciTech Connect

    Castro, A; Shera, E.B.

    1996-09-01

    This is the final report of a one-year laboratory-directed research and development project at Los Alamos National Laboratory. There has been considerable interest in the development of very sensitive clinical diagnostic techniques over the last few years. Many pathogenic agents are often present in extremely small concentrations in clinical samples, especially at the initial stages of infection, making their detection very difficult. This project sought to develop a new technique for the detection and accurate quantification of specific bacterial and viral nucleic acid sequences in clinical samples. The scheme involved the use of novel hybridization probes for the detection of nucleic acids combined with our recently developed technique of single-molecule electrophoresis. This project is directly relevant to the DOE's Defense Programs strategic directions in the area of biological warfare counter-proliferation.

  20. Nucleotide Sequence of the Envelope Gene of Gardner-Arnstein Feline Leukemia Virus B Reveals Unique Sequence Homologies with a Murine Mink Cell Focus-Forming Virus

    PubMed Central

    Elder, John H.; Mullins, James I.

    1983-01-01

    The nucleotide sequence of the envelope gene and the adjacent 3′ long terminal repeat (LTR) of Gardner-Arnstein feline leukemia virus of subgroup B (GA-FeLV-B) has been determined. Comparison of the derived amino acid sequence of the gp70-p15E polyprotein to those of several previously reported murine retroviruses revealed striking homologies between GA-FeLV-B gp70 and the gp70 of a Moloney virus-derived mink cell focus-forming virus. These homologies were located within the substituted (presumably xenotropic) portion of the mink cell focus-forming virus envelope gene and comprised amino acid sequences not present in three ecotropic virus gp70s. In addition, areas of insertions and deletions, in general, were the same between GA-FeLV-B and Moloney mink cell focus-forming virus, although the sizes of the insertions and deletions differed. Homologies between GA-FeLV-B and mink cell focus-forming virus gp70s are functionally significant in that they both possess expanded host ranges, a property dictated by gp70. The amino acid sequence of FeLV-B contains 12 Asn-X-Ser/Thr sequences, indicating 12 possible sites of N-linked glycosylation as compared with 7 or 8 for its murine counterparts. Comparison of the 3′ LTR of GA-FeLV-B to AKR and Moloney virus LTRs revealed extensive conservation in several regions including the “CCAAT” and Goldberg-Hogness (TATA) boxes thought to be involved in promotion of transcription and in the repeat region of the LTR. The inverted repeats that flanked the LTR of GA-FeLV-B were identical to the murine inverted repeats, but were one base longer than the latter. The region of U3 corresponding to the approximately 75-nucleotide “enhancer sequence” is present in GA-FeLV-B, but contains deletions relative to AKR and Moloney virus and is not repeated. An interesting palindrome in the repeat region immediately 3′ to the U3 region was noted in all the LTRs, but was particularly pronounced in GA-FeLV-B. Possible roles for this

  1. Amino acid sequence of band-3 protein from rainbow trout erythrocytes derived from cDNA.

    PubMed Central

    Hübner, S; Michel, F; Rudloff, V; Appelhans, H

    1992-01-01

    In this report we present the first complete band-3 cDNA sequence of a poikilothermic lower vertebrate. The primary structure of the anion-exchange protein band 3 (AE1) from rainbow trout erythrocytes was determined by nucleotide sequencing of cDNA clones. The overlapping clones have a total length of 3827 bp with a 5'-terminal untranslated region of 150 bp, a 2754 bp open reading frame and a 3'-untranslated region of 924 bp. Band-3 protein from trout erythrocytes consists of 918 amino acid residues with a calculated molecular mass of 101 827 Da. Comparison of its amino acid sequence revealed a 60-65% identity within the transmembrane spanning sequence of band-3 proteins published so far. An additional insertion of 24 amino acid residues within the membrane-associated domain of trout band-3 protein was identified, which until now was thought to be a general feature only of mammalian band-3-related proteins. PMID:1637296

  2. Complete nucleic acid sequence of Penaeus stylirostris densovirus (PstDNV) from India.

    PubMed

    Rai, Praveen; Safeena, Muhammed P; Karunasagar, Iddya; Karunasagar, Indrani

    2011-06-01

    Infectious hypodermal and hematopoietic necrosis virus (IHHNV) of shrimp has recently been classified as Penaeus stylirostris densovirus (PstDNV). The complete nucleic acid sequence of PstDNV from India was obtained by cloning and sequencing different DNA fragments of the virus. The genome organisation of PstDNV revealed three major coding domains: a left ORF (NS1) of 2001 bp, a mid ORF (NS2) of 1092 bp and a right ORF (VP) of 990 bp. The complete genome and the amino acid sequences of the three proteins, NS1, NS2 and VP, were compared with the genomes of the virus reported from Hawaii, China and Mexico and with partial sequences available from isolates from different regions. The phylogenetic analysis of shrimp, insect and vertebrate parvovirus sequences showed that the Indian PstDNV isolate is phylogenetically most closely related to one of the three isolates from Taiwan (AY355307) and to two isolates (AY362547 and AY102034) from Thailand.

  3. Nucleic acid (cDNA) and amino acid sequences of alpha-type gliadins from wheat (Triticum aestivum).

    PubMed Central

    Kasarda, D D; Okita, T W; Bernardin, J E; Baecker, P A; Nimmo, C C; Lew, E J; Dietler, M D; Greene, F C

    1984-01-01

    The complete amino acid sequence for an alpha-type gliadin protein of wheat (Triticum aestivum Linnaeus) endosperm has been derived from a cloned cDNA sequence. An additional cDNA clone that corresponds to about 75% of a similar alpha-type gliadin has been sequenced and shows some important differences. About 97% of the composite sequence of A-gliadin (an alpha-type gliadin fraction) has also been obtained by direct amino acid sequencing. This sequence shows a high degree of similarity with amino acid sequences derived from both cDNA clones and is virtually identical to one of them. On the basis of sequence information, after loss of the signal sequence, the mature alpha-type gliadins may be divided into five different domains, two of which may have evolved from an ancestral gliadin gene, whereas the remaining three contain repeating sequences that may have developed independently. PMID:6589619

  4. The amino acid sequence around the active-site cysteine and histidine residues, and the buried cysteine residue in ficin.

    PubMed

    Husain, S S; Lowe, G

    1970-04-01

    Ficin that had been prepared from the latex of Ficus glabrata by salt fractionation and chromatography on carboxymethylcellulose was completely and irreversibly inhibited with 1,3-dibromo[2-¹⁴C]acetone and then treated with N-(4-dimethylamino-3,5-dinitrophenyl)maleimide in 6 M guanidinium chloride. After reduction and carboxymethylation of the labelled protein, it was digested with trypsin and alpha-chymotrypsin. Two radioactive peptides and two coloured peptides were isolated chromatographically and their sequences determined. The radioactive peptides revealed the amino acid sequences around the active-site cysteine and histidine residues and showed a high degree of homology with the amino acid sequence around the active-site cysteine and histidine residues in papain. The coloured peptides allowed the amino acid sequence around the buried cysteine residue in ficin to be determined.

  5. The `heavy' subunit of the photosynthetic reaction centre from Rhodopseudomonas viridis: isolation of the gene, nucleotide and amino acid sequence

    PubMed Central

    Michel, H.; Weyer, K. A.; Gruenberg, H.; Lottspeich, F.

    1985-01-01

    The gene coding for the `heavy' subunit of the photosynthetic reaction centre from Rhodopseudomonas viridis was isolated in an expression vector. Expression of the heavy subunit in Escherichia coli was detected with antibodies raised against crystalline reaction centres. The entire subunit, and not a fusion protein, was expressed in E. coli. The protein coding region of the gene was sequenced and the amino acid sequence derived. Part of the amino acid sequence was confirmed by chemical sequence analysis of the protein. The heavy subunit consists of 258 amino acids and its mol. wt. is 28 345. It possesses one membrane-spanning α-helical segment, as was revealed by the concomitant X-ray structure analysis. PMID:16453623

  6. Chromosome-specific sequencing reveals an extensive dispensable genome component in wheat

    PubMed Central

    Liu, Miao; Stiller, Jiri; Holušová, Kateřina; Vrána, Jan; Liu, Dengcai; Doležel, Jaroslav; Liu, Chunji

    2016-01-01

    The hexaploid wheat genotype Chinese Spring (CS) has been used worldwide as the reference base for wheat genetics and genomics, and significant resources have been used by the international community to generate a reference wheat genome based on this genotype. By sequencing flow-sorted 3B chromosome from a hexaploid wheat genotype CRNIL1A and comparing the obtained sequences with those available for CS, we detected that a large number of sequences in the former were missing in the latter. If the distribution of such sequences in the hexaploid wheat genome is random, CRNIL1A sequences missing in CS could be as much as 159.3 Mb even if only fragments of 50 bp or longer were considered. Analysing RNA sequences available in the public domains also revealed that dispensable genes are common in hexaploid wheat. Together with those extensive intra- and interchromosomal rearrangements in CS, the existence of such dispensable genes is another factor highlighting potential issues with the use of reference genomes in various studies. Strong deviation in distributions of these dispensable sequences among genotypes with different geographical origins provided the first evidence indicating that they could be associated with adaptation in wheat. PMID:27821854

  7. A Statistical Model of Protein Sequence Similarity and Function Similarity Reveals Overly-Specific Function Predictions

    PubMed Central

    Kolker, Eugene

    2009-01-01

    Background: Predicting protein function from primary sequence is an important open problem in modern biology. Not only are there many thousands of proteins of unknown function, current approaches for predicting function must be improved upon. One problem in particular is overly-specific function predictions which we address here with a new statistical model of the relationship between protein sequence similarity and protein function similarity. Methodology: Our statistical model is based on sets of proteins with experimentally validated functions and numeric measures of function specificity and function similarity derived from the Gene Ontology. The model predicts the similarity of function between two proteins given their amino acid sequence similarity measured by statistics from the BLAST sequence alignment algorithm. A novel aspect of our model is that it predicts the degree of function similarity shared between two proteins over a continuous range of sequence similarity, facilitating prediction of function with an appropriate level of specificity. Significance: Our model shows nearly exact function similarity for proteins with high sequence similarity (bit score >244.7, e-value >1e−62, non-redundant NCBI protein database (NRDB)) and only small likelihood of specific function match for proteins with low sequence similarity (bit score <54.6, e-value <1e−05, NRDB). For sequence similarity ranges in between our annotation model shows an increasing relationship between function similarity and sequence similarity, but with considerable variability. We applied the model to a large set of proteins of unknown function, and predicted functions for thousands of these proteins ranging from general to very specific. We also applied the model to a data set of proteins with previously assigned, specific functions that were electronically based. We show that, on average, these prior function predictions are more specific (quite possibly overly-specific) compared to

  8. The Complete Genome Sequence of the Lactic Acid Bacterium Lactococcus lactis ssp. lactis IL1403

    PubMed Central

    Bolotin, Alexander; Wincker, Patrick; Mauger, Stéphane; Jaillon, Olivier; Malarme, Karine; Weissenbach, Jean; Ehrlich, S. Dusko; Sorokin, Alexei

    2001-01-01

    Lactococcus lactis is a nonpathogenic AT-rich gram-positive bacterium closely related to the genus Streptococcus and is the most commonly used cheese starter. It is also the best-characterized lactic acid bacterium. We sequenced the genome of the laboratory strain IL1403, using a novel two-step strategy that comprises diagnostic sequencing of the entire genome and a shotgun polishing step. The genome contains 2,365,589 base pairs and encodes 2310 proteins, including 293 protein-coding genes belonging to six prophages and 43 insertion sequence (IS) elements. Nonrandom distribution of IS elements indicates that the chromosome of the sequenced strain may be a product of recent recombination between two closely related genomes. A complete set of late competence genes is present, indicating the ability of L. lactis to undergo DNA transformation. Genomic sequence revealed new possibilities for fermentation pathways and for aerobic respiration. It also indicated a horizontal transfer of genetic information from Lactococcus to gram-negative enteric bacteria of Salmonella-Escherichia group. [The sequence data described in this paper has been submitted to the GenBank data library under accession no. AE005176.] PMID:11337471

  9. An amino acid sequence motif sufficient for subnuclear localization of an arginine/serine-rich splicing factor.

    PubMed

    Hedley, M L; Amrein, H; Maniatis, T

    1995-12-05

    We have identified an amino acid sequence in the Drosophila Transformer (Tra) protein that is capable of directing a heterologous protein to nuclear speckles, regions of the nucleus previously shown to contain high concentrations of spliceosomal small nuclear RNAs and splicing factors. This sequence contains a nucleoplasmin-like bipartite nuclear localization signal (NLS) and a repeating arginine/serine (RS) dipeptide sequence adjacent to a short stretch of basic amino acids. Sequence comparisons from a number of other splicing factors that colocalize to nuclear speckles reveal the presence of one or more copies of this motif. We propose a two-step subnuclear localization mechanism for splicing factors. The first step is transport across the nuclear envelope via the nucleoplasmin-like NLS, while the second step is association with components in the speckled domain via the RS dipeptide sequence.
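
    The two sequence features described in this abstract, a bipartite nuclear localization signal and a repeating RS dipeptide, can be located with a simple scan. The sketch below is illustrative only: the abstract gives no consensus pattern, so the bipartite rule used here (two adjacent basic residues, a 10 to 12 residue spacer, then at least three basic residues in a window of five) and the minimum of three RS repeats are assumptions, and the function names are hypothetical.

```python
import re

BASIC = set("KR")  # lysine and arginine

def find_rs_repeats(seq, min_repeats=3):
    """Return (start, end) spans of runs of at least min_repeats
    consecutive 'RS' dipeptides (0-based, end exclusive)."""
    pattern = re.compile(r"(?:RS){%d,}" % min_repeats)
    return [m.span() for m in pattern.finditer(seq)]

def find_bipartite_nls(seq, spacer=(10, 12)):
    """Return start positions of candidate bipartite NLS motifs.

    Assumed simplified rule: two adjacent basic residues, a spacer of
    spacer[0]..spacer[1] residues, then a 5-residue window containing
    at least 3 basic residues."""
    hits = []
    for i in range(len(seq) - 1):
        if seq[i] in BASIC and seq[i + 1] in BASIC:
            for gap in range(spacer[0], spacer[1] + 1):
                start = i + 2 + gap
                window = seq[start:start + 5]
                if len(window) == 5 and sum(r in BASIC for r in window) >= 3:
                    hits.append(i)
                    break
    return hits

# Illustrative test strings, not sequences taken from the paper.
print(find_rs_repeats("AAARSRSRSAAA"))
print(find_bipartite_nls("KRPAATKKAGQAKKKKLD"))
```

    A pattern scan of this kind only flags candidates; the localization activity reported in the abstract was established experimentally.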

  10. Metagenomic Sequencing of the Chronic Obstructive Pulmonary Disease Upper Bronchial Tract Microbiome Reveals Functional Changes Associated with Disease Severity.

    PubMed

    Cameron, Simon J S; Lewis, Keir E; Huws, Sharon A; Lin, Wanchang; Hegarty, Matthew J; Lewis, Paul D; Mur, Luis A J; Pachebat, Justin A

    2016-01-01

    Chronic Obstructive Pulmonary Disease (COPD) is a major source of mortality and morbidity worldwide. The microbiome associated with this disease may be an important component of the disease, though studies to date have been based on sequencing of the 16S rRNA gene, and have revealed equivocal results. Here, we employed metagenomic sequencing of the upper bronchial tract (UBT) microbiome to allow for greater elucidation of its taxonomic composition and to reveal functional changes associated with the disease. The bacterial metagenomes within sputum samples from eight COPD patients and ten 'healthy' smokers (Controls) were sequenced, and suggested significant changes in the abundance of bacterial species, particularly within the Streptococcus genus. The functional capacity of the COPD UBT microbiome indicated an increased capacity for bacterial growth, which could be an important feature in bacterial-associated acute exacerbations. Regression analyses correlated COPD severity (FEV1% of predicted) with differences in the abundance of Streptococcus pneumoniae and functional classifications related to a reduced capacity for bacterial sialic acid metabolism. This study suggests that the COPD UBT microbiome could be used in patient risk stratification and in identifying novel monitoring and treatment methods, but study of a longitudinal cohort will be required to unequivocally relate these features of the microbiome with COPD severity.

  11. Genomic Analysis by Deep Sequencing of the Probiotic Lactobacillus brevis KB290 Harboring Nine Plasmids Reveals Genomic Stability

    PubMed Central

    Fukao, Masanori; Oshima, Kenshiro; Morita, Hidetoshi; Toh, Hidehiro; Suda, Wataru; Kim, Seok-Won; Suzuki, Shigenori; Yakabe, Takafumi; Hattori, Masahira; Yajima, Nobuhiro

    2013-01-01

    We determined the complete genome sequence of Lactobacillus brevis KB290, a probiotic lactic acid bacterium isolated from a traditional Japanese fermented vegetable. The genome contained a 2,395,134-bp chromosome that housed 2,391 protein-coding genes and nine plasmids that together accounted for 191 protein-coding genes. KB290 contained no virulence factor genes, and several genes related to presumptive cell wall-associated polysaccharide biosynthesis and the stress response were present in L. brevis KB290 but not in the closely related L. brevis ATCC 367. Plasmid-curing experiments revealed that the presence of plasmid pKB290-1 was essential for the strain's gastrointestinal tract tolerance and tendency to aggregate. Using next-generation deep sequencing of current and 18-year-old stock strains to detect low frequency variants, we evaluated genome stability. Deep sequencing of four periodic KB290 culture stocks with more than 1,000-fold coverage revealed 3 mutation sites and 37 minority variation sites, indicating long-term stability and providing a useful method for assessing the stability of industrial bacteria at the nucleotide level. PMID:23544154

  12. Characterization of expressed class II MHC sequences in the banner-tailed kangaroo rat (Dipodomys spectabilis) reveals multiple DRB loci.

    PubMed

    Busch, Joseph D; Waser, Peter M; DeWoody, J Andrew

    2008-11-01

    Genes of the major histocompatibility complex (MHC) are exceptionally polymorphic due to the combined effects of natural and sexual selection. Most research in wild populations has focused on the second exon of a single class II locus (DRB), but complete gene sequences can provide an illuminating backdrop for studies of intragenic selection, recombination, and organization. To this end, we characterized class II loci in the banner-tailed kangaroo rat (Dipodomys spectabilis). Seven DRB-like sequences (provisionally named MhcDisp-DRB*01 through *07) were isolated from spleen cDNA and most likely comprise ≥5 loci; this multiformity is quite unlike the situation in muroid rodents such as Mus, Rattus, and Peromyscus. In silico translation revealed the presence of important structural residues for glycosylation sites, salt bonds, and CD4+ T-cell recognition. Amino-acid distances varied widely among the seven sequences (2-34%). Nuclear DNA sequences from the Disp-DRB*07 locus (approximately 10 kb) revealed a conventional exon/intron structure as well as a number of microsatellites and short interspersed nuclear elements (B4, Alu, and IDL-Geo subfamilies). Rates of nucleotide substitution at Disp-DRB*07 are similar in both exons and introns (π = 0.015 and 0.012, respectively), which suggests relaxed selection and may indicate that this locus is an expressed pseudogene. Finally, we performed BLASTn searches against Dipodomys ordii genomic sequences (unassembled reads) and find 90-97% nucleotide similarity between the two kangaroo rat species. Collectively, these data suggest that class II diversity in heteromyid rodents is based on polylocism and departs from the muroid architecture.
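
    The nucleotide diversity statistic (pi) quoted for exons and introns in this abstract is conventionally the average number of differences per site over all pairs of aligned sequences. The abstract does not describe the authors' software or any correction they applied, so the following is a minimal sketch under that assumption; it expects equal-length, gap-free alignments, and the example sequences are hypothetical.

```python
from itertools import combinations

def nucleotide_diversity(aligned_seqs):
    """Average pairwise proportion of differing sites across all pairs
    of equal-length aligned sequences (uncorrected pi)."""
    if len(aligned_seqs) < 2:
        raise ValueError("need at least two sequences")
    length = len(aligned_seqs[0])
    if length == 0 or any(len(s) != length for s in aligned_seqs):
        raise ValueError("sequences must be non-empty and of equal length")
    pairs = list(combinations(aligned_seqs, 2))
    # Count mismatched positions for each pair, then average per pair and per site.
    total_diffs = sum(
        sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs
    )
    return total_diffs / (len(pairs) * length)

# Hypothetical toy alignment (illustration only).
alleles = ["AAAA", "AAAT", "AATT"]
print(nucleotide_diversity(alleles))
```

    Computing the value separately for exon and intron columns of an alignment would give the kind of comparison the abstract reports.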

  13. Comparative sequence and genetic analyses of asparagus BACs reveal no microsynteny with onion or rice.

    PubMed

    Jakse, Jernej; Telgmann, Alexa; Jung, Christian; Khar, Anil; Melgar, Sergio; Cheung, Foo; Town, Christopher D; Havey, Michael J

    2006-12-01

    The Poales (includes the grasses) and Asparagales [includes onion (Allium cepa L.) and asparagus (Asparagus officinalis L.)] are the two most economically important monocot orders. The Poales are a member of the commelinoid monocots, a group of orders sister to the Asparagales. Comparative genomic analyses have revealed a high degree of synteny among the grasses; however, it is not known if this synteny extends to other major monocot groups such as the Asparagales. Although we previously reported no evidence for synteny at the recombinational level between onion and rice, microsynteny may exist across shorter genomic regions in the grasses and Asparagales. We sequenced nine asparagus BACs to reveal physically linked genic-like sequences and determined their most similar positions in the onion and rice genomes. Four of the asparagus BACs were selected using molecular markers tightly linked to the sex-determining M locus on chromosome 5 of asparagus. These BACs possessed only two putative coding regions and had long tracts of degenerated retroviral elements and transposons. Five asparagus BACs were selected after hybridization of three onion cDNAs that mapped to three different onion chromosomes. Genic-like sequences that were physically linked on the cDNA-selected BACs or genetically linked on the M-linked BACs showed significant similarities (e < -20) to expressed sequences on different rice chromosomes, revealing no evidence for microsynteny between asparagus and rice across these regions. Genic-like sequences that were linked in asparagus were used to identify highly similar (e < -20) expressed sequence tags (ESTs) of onion. These onion ESTs mapped to different onion chromosomes and no relationship was observed between physical or genetic linkages in asparagus and genetic linkages in onion. 
    These results further indicate that synteny among grass genomes does not extend to a sister order in the monocots and that asparagus may not be an appropriate smaller genome

  14. Derived amino acid sequences of the nosZ gene (respiratory N2O reductase) from Alcaligenes eutrophus, Pseudomonas aeruginosa and Pseudomonas stutzeri reveal potential copper-binding residues. Implications for the CuA site of N2O reductase and cytochrome-c oxidase.

    PubMed

    Zumft, W G; Dreusch, A; Löchelt, S; Cuypers, H; Friedrich, B; Schneider, B

    1992-08-15

    The nosZ genes encoding the multicopper enzyme nitrous oxide reductase of Alcaligenes eutrophus H16 and the type strain of Pseudomonas aeruginosa were cloned and sequenced for structural comparison of their gene products with the homologous product of the nosZ gene from Pseudomonas stutzeri [Viebrock, A. & Zumft, W. G. (1988) J. Bacteriol. 170, 4658-4668] and the subunit II of cytochrome-c oxidase (COII). Both types of enzymes possess the CuA binding site. The nosZ genes were identified in cosmid libraries by hybridization with an internal 1.22-kb PstI fragment (NS220) of nosZ from P. stutzeri. The derived amino acid sequences indicate unprocessed gene products of 70084 Da (A. eutrophus) and 70695 Da (P. aeruginosa). The N-terminal sequences of the NosZ proteins have the characteristics of signal peptides for transport. A homologous domain, extending over at least 50 residues, is shared among the three derived NosZ sequences and the CuA binding region of 32 COII sequences. Only three out of nine cysteine residues of the NosZ protein (P. stutzeri) are invariant. Cys618 and Cys622 are assigned to a binuclear center, A, which is thought to represent the CuA site of NosZ and is located close to the C terminus. Two conserved histidines, one methionine, one aspartate, one valine and two aromatic residues are also part of the CuA consensus sequence, which is the domain homologous between the two enzymes. The CuA consensus sequence, however, lacks four strictly conserved residues present in all COII sequences. Cys165 is likely to be a ligand of a second binuclear center, Z, for which we assume mainly histidine coordination. Of 23 histidine residues in NosZ (P. stutzeri), 14 are invariant, 7 of which are in regions with a degree of conservation well above the 50% positional identity between the Alcaligenes and Pseudomonas sequences. Conserved tryptophan residues are located close to several potential copper ligands. Trp615 may contribute to the observed quenching of

  15. Maize (Zea mays L.) genome diversity as revealed by RNA-sequencing.

    PubMed

    Hansey, Candice N; Vaillancourt, Brieanne; Sekhon, Rajandeep S; de Leon, Natalia; Kaeppler, Shawn M; Buell, C Robin

    2012-01-01

    Maize is rich in genetic and phenotypic diversity. Understanding the sequence, structural, and expression variation that contributes to phenotypic diversity would facilitate more efficient varietal improvement. RNA based sequencing (RNA-seq) is a powerful approach for transcriptional analysis, assessing sequence variation, and identifying novel transcript sequences, particularly in large, complex, repetitive genomes such as maize. In this study, we sequenced RNA from whole seedlings of 21 maize inbred lines representing diverse North American and exotic germplasm. Single nucleotide polymorphism (SNP) detection identified 351,710 polymorphic loci distributed throughout the genome covering 22,830 annotated genes. Tight clustering of two distinct heterotic groups and exotic lines was evident using these SNPs as genetic markers. Transcript abundance analysis revealed minimal variation in the total number of genes expressed across these 21 lines (57.1% to 66.0%). However, the transcribed gene set among the 21 lines varied, with 48.7% expressed in all of the lines, 27.9% expressed in one to 20 lines, and 23.4% expressed in none of the lines. De novo assembly of RNA-seq reads that did not map to the reference B73 genome sequence revealed 1,321 high confidence novel transcripts, of which, 564 loci were present in all 21 lines, including B73, and 757 loci were restricted to a subset of the lines. RT-PCR validation demonstrated 87.5% concordance with the computational prediction of these expressed novel transcripts. Intriguingly, 145 of the novel de novo assembled loci were present in lines from only one of the two heterotic groups consistent with the hypothesis that, in addition to sequence polymorphisms and transcript abundance, transcript presence/absence variation is present and, thereby, may be a mechanism contributing to the genetic basis of heterosis.
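
    The three-way split reported above (genes expressed in all lines, in a subset of lines, or in none) can be illustrated with a small sketch. This is only an assumption about how such a tally might be made: the gene names, the three-line panel, and the boolean "expressed" calls are hypothetical, and the threshold used in the study to call a gene expressed is not modelled here.

```python
from collections import Counter
from typing import Dict, List


def classify_expression(calls_by_gene: Dict[str, List[bool]]) -> Dict[str, str]:
    """Label each gene by how many lines express it: 'all', 'subset' or 'none'."""
    labels: Dict[str, str] = {}
    for gene, calls in calls_by_gene.items():
        n_expressed = sum(calls)  # True counts as 1
        if n_expressed == len(calls):
            labels[gene] = "all"
        elif n_expressed == 0:
            labels[gene] = "none"
        else:
            labels[gene] = "subset"
    return labels


# Hypothetical presence/absence calls for four genes across three inbred lines.
example_calls = {
    "geneA": [True, True, True],
    "geneB": [True, False, True],
    "geneC": [False, False, False],
    "geneD": [False, True, False],
}
labels = classify_expression(example_calls)
category_counts = Counter(labels.values())
# Fraction of genes in each category, comparable in spirit to the
# percentages quoted in the abstract.
category_fractions = {k: v / len(labels) for k, v in category_counts.items()}
```

    Genes in the "subset" category are the ones that would show transcript presence/absence variation among lines.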

  16. Analysis of amino acid sequence variations and immunoglobulin E-binding epitopes of German cockroach tropomyosin.

    PubMed

    Jeong, Kyoung Yong; Lee, Jongweon; Lee, In-Yong; Ree, Han-Il; Hong, Chein-Soo; Yong, Tai-Soon

    2004-09-01

    The allergenicities of tropomyosins from different organisms have been reported to vary. The cDNA encoding German cockroach tropomyosin (Bla g 7) was isolated, expressed, and characterized previously. In the present study, the amino acid sequence variations in German cockroach tropomyosin were analyzed in order to investigate its influence on allergenicity. We also undertook the identification of immunodominant peptides containing immunoglobulin E (IgE) epitopes which may facilitate the development of diagnostic and immunotherapeutic strategies based on the recombinant proteins. Two-dimensional gel electrophoresis and immunoblot analysis with mouse anti-recombinant German cockroach tropomyosin serum was performed to investigate the isoforms at the protein level. Reverse transcriptase PCR (RT-PCR) was applied to examine the sequence diversity. Eleven different variants of the deduced amino acid sequences were identified by RT-PCR. German cockroach tropomyosin has only minor sequence variations that did not seem to affect its allergenicity significantly. These results support the molecular basis underlying the cross-reactivities of arthropod tropomyosins. Recombinant fragments were also generated by PCR, and IgE-binding epitopes were assessed by enzyme-linked immunosorbent assay. Sera from seven patients revealed heterogeneous IgE-binding responses. This study demonstrates multiple IgE-binding epitope regions in a single molecule, suggesting that full-length tropomyosin should be used for the development of diagnostic and therapeutic reagents.

  17. Characterisation of Drosophila CMP-sialic acid synthetase activity reveals unusual enzymatic properties

    PubMed Central

    Mertsalov, Ilya B.; Novikov, Boris N.; Scott, Hilary; Dangott, Lawrence; Panin, Vladislav M.

    2016-01-01

    CMP-sialic acid synthetase (CSAS) is a key enzyme of the sialylation pathway. CSAS produces the activated sugar donor, CMP-sialic acid, which serves as a substrate for sialyltransferases to modify glycan termini with sialic acid. Unlike other animal CMP-Sia synthetases that normally localize in the nucleus, Drosophila melanogaster CSAS (DmCSAS) localizes in the cell secretory compartment, predominantly in the Golgi, which suggests that this enzyme has properties distinct from those of its vertebrate counterparts. To test this hypothesis, we purified recombinant DmCSAS and characterised its activity in vitro. Our experiments revealed several unique features of this enzyme. DmCSAS displays specificity for N-acetylneuraminic acid as a substrate, shows preference for lower pH and can function with a broad range of metal cofactors. When tested at a pH corresponding to the Golgi compartment, the enzyme showed significant activity with several metal cations, including Zn2+, Fe2+, Co2+ and Mn2+, while the activity with Mg2+ was found to be low. Protein sequence analysis and site-specific mutagenesis identified an aspartic acid residue that is necessary for enzymatic activity and predicted to be involved in coordinating a metal cofactor. DmCSAS enzymatic activity was found to be essential in vivo for rescuing the phenotype of DmCSAS mutants. Finally, our experiments revealed a steep dependence of the enzymatic activity on temperature. Taken together, our results indicate that DmCSAS underwent evolutionary adaptation to pH and ionic environment different from that of counterpart synthetases in vertebrates. Our data also suggest that environmental temperatures can regulate Drosophila sialylation, thus modulating neural transmission. PMID:27114558

  18. Characterization of Drosophila CMP-sialic acid synthetase activity reveals unusual enzymatic properties.

    PubMed

    Mertsalov, Ilya B; Novikov, Boris N; Scott, Hilary; Dangott, Lawrence; Panin, Vladislav M

    2016-07-01

    CMP-sialic acid synthetase (CSAS) is a key enzyme of the sialylation pathway. CSAS produces the activated sugar donor, CMP-sialic acid, which serves as a substrate for sialyltransferases to modify glycan termini with sialic acid. Unlike other animal CSASs that normally localize in the nucleus, Drosophila melanogaster CSAS (DmCSAS) localizes in the cell secretory compartment, predominantly in the Golgi, which suggests that this enzyme has properties distinct from those of its vertebrate counterparts. To test this hypothesis, we purified recombinant DmCSAS and characterized its activity in vitro. Our experiments revealed several unique features of this enzyme. DmCSAS displays specificity for N-acetylneuraminic acid as a substrate, shows preference for lower pH and can function with a broad range of metal cofactors. When tested at a pH corresponding to the Golgi compartment, the enzyme showed significant activity with several metal cations, including Zn(2+), Fe(2+), Co(2+) and Mn(2+), whereas the activity with Mg(2+) was found to be low. Protein sequence analysis and site-specific mutagenesis identified an aspartic acid residue that is necessary for enzymatic activity and predicted to be involved in co-ordinating a metal cofactor. DmCSAS enzymatic activity was found to be essential in vivo for rescuing the phenotype of DmCSAS mutants. Finally, our experiments revealed a steep dependence of the enzymatic activity on temperature. Taken together, our results indicate that DmCSAS underwent evolutionary adaptation to pH and ionic environment different from that of counterpart synthetases in vertebrates. Our data also suggest that environmental temperatures can regulate Drosophila sialylation, thus modulating neural transmission.

  19. Multilocus sequence typing of Mycoplasma bovis reveals host-specific genotypes in cattle versus bison.

    PubMed

    Register, Karen B; Thole, Luke; Rosenbush, Ricardo F; Minion, F Chris

    2015-01-30

    Mycoplasma bovis is a primary agent of mastitis, pneumonia and arthritis in cattle and the bacterium most frequently isolated from the polymicrobial syndrome known as bovine respiratory disease complex. Recently, M. bovis has emerged as a significant health problem in bison, causing necrotic pharyngitis, pneumonia, dystocia and abortion. Whether isolates from cattle and bison comprise genetically distinct populations is unknown. This study describes the development of a highly discriminatory multilocus sequence typing (MLST) method for M. bovis and its use to investigate the population structure of the bacterium. Genome sequences from six M. bovis isolates were used for selection of gene targets. Seven of 44 housekeeping genes initially evaluated were selected as targets on the basis of sequence variability and distribution within the genome. For each gene target sequence, four to seven alleles could be distinguished that collectively define 32 sequence types (STs) from a collection of 94 cattle isolates and 42 bison isolates. A phylogeny based on concatenated target gene sequences of each isolate revealed that bison isolates are genetically distinct from strains that infect cattle, suggesting recent disease outbreaks in bison may be due to the emergence of unique genetic variants. No correlation was found between ST and disease presentation or geographic origin. MLST data reported here were used to populate a newly created and publicly available, curated database to which researchers can contribute. The MLST scheme and database provide novel tools for exploring the population structure of M. bovis and tracking the evolution and spread of strains.
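
    In MLST generally, each isolate is described by the allele number found at every target locus, and each distinct combination of alleles is one sequence type. A minimal sketch of that bookkeeping follows, assuming seven loci as in the abstract; the isolate names and allele numbers are hypothetical, and numbering STs in order of first appearance is an illustrative choice rather than the scheme used by the curated database.

```python
from typing import Dict, Tuple

AlleleProfile = Tuple[int, ...]


def assign_sequence_types(profiles: Dict[str, AlleleProfile]) -> Dict[str, int]:
    """Map each isolate to an ST; identical allele profiles share one ST."""
    st_of_profile: Dict[AlleleProfile, int] = {}
    st_of_isolate: Dict[str, int] = {}
    for isolate, profile in profiles.items():
        if profile not in st_of_profile:
            # New allele combination: give it the next unused ST number.
            st_of_profile[profile] = len(st_of_profile) + 1
        st_of_isolate[isolate] = st_of_profile[profile]
    return st_of_isolate


# Hypothetical allele numbers at seven housekeeping loci.
example_profiles = {
    "cattle_A": (1, 2, 1, 3, 1, 1, 2),
    "cattle_B": (1, 2, 1, 3, 1, 1, 2),
    "bison_A": (4, 2, 5, 3, 6, 1, 2),
}
sequence_types = assign_sequence_types(example_profiles)
```

    Two isolates with the same profile receive the same ST, while a difference at any single locus yields a new one.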

  20. Nucleic acid (cDNA) and amino acid sequences of the maize endosperm protein glutelin-2.

    PubMed Central

    Prat, S; Cortadas, J; Puigdomènech, P; Palau, J

    1985-01-01

    The cDNA coding for a glutelin-2 protein from maize endosperm has been cloned and the complete amino acid sequence of the protein derived for the first time. An immature maize endosperm cDNA bank was screened for the expression of a beta-lactamase:glutelin-2 (G2) fusion polypeptide by using antibodies against the purified 28 kd G2 protein. A clone corresponding to the 28 kd G2 protein was sequenced and the primary structure of this protein was derived. Five regions can be defined in the protein sequence: an 11 residue N-terminal part, a repeated region formed by eight units of the sequence Pro-Pro-Pro-Val-His-Leu, an alternating Pro-X stretch 21 residues long, a Cys rich domain and a C-terminal part rich in Gln. The protein sequence is preceded by 19 residues which have the characteristics of the signal peptide found in secreted proteins. Unlike zeins, the main maize storage proteins, 28 kd glutelin-2 has several homologous sequences in common with other cereal storage proteins. PMID:3839076

  1. 37 CFR 1.821 - Nucleotide and/or amino acid sequence disclosures in patent applications.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... 37 Patents, Trademarks, and Copyrights 1 2010-07-01 2010-07-01 false Nucleotide and/or amino acid... Biotechnology Invention Disclosures Application Disclosures Containing Nucleotide And/or Amino Acid Sequences § 1.821 Nucleotide and/or amino acid sequence disclosures in patent applications. (a) Nucleotide...

  2. 37 CFR 1.821 - Nucleotide and/or amino acid sequence disclosures in patent applications.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... 37 Patents, Trademarks, and Copyrights 1 2012-07-01 2012-07-01 false Nucleotide and/or amino acid... Biotechnology Invention Disclosures Application Disclosures Containing Nucleotide And/or Amino Acid Sequences § 1.821 Nucleotide and/or amino acid sequence disclosures in patent applications. (a) Nucleotide...

  3. 37 CFR 1.821 - Nucleotide and/or amino acid sequence disclosures in patent applications.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... 37 Patents, Trademarks, and Copyrights 1 2014-07-01 2014-07-01 false Nucleotide and/or amino acid... Biotechnology Invention Disclosures Application Disclosures Containing Nucleotide And/or Amino Acid Sequences § 1.821 Nucleotide and/or amino acid sequence disclosures in patent applications. (a) Nucleotide...

  4. 37 CFR 1.821 - Nucleotide and/or amino acid sequence disclosures in patent applications.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... 37 Patents, Trademarks, and Copyrights 1 2011-07-01 2011-07-01 false Nucleotide and/or amino acid... Biotechnology Invention Disclosures Application Disclosures Containing Nucleotide And/or Amino Acid Sequences § 1.821 Nucleotide and/or amino acid sequence disclosures in patent applications. (a) Nucleotide...

  5. 37 CFR 1.821 - Nucleotide and/or amino acid sequence disclosures in patent applications.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... 37 Patents, Trademarks, and Copyrights 1 2013-07-01 2013-07-01 false Nucleotide and/or amino acid... Biotechnology Invention Disclosures Application Disclosures Containing Nucleotide And/or Amino Acid Sequences § 1.821 Nucleotide and/or amino acid sequence disclosures in patent applications. (a) Nucleotide...

  6. Massively parallel tag sequencing reveals the complexity of anaerobic marine protistan communities

    PubMed Central

    Stoeck, Thorsten; Behnke, Anke; Christen, Richard; Amaral-Zettler, Linda; Rodriguez-Mora, Maria J; Chistoserdov, Andrei; Orsi, William; Edgcomb, Virginia P

    2009-01-01

    Background Recent advances in sequencing strategies make possible unprecedented depth and scale of sampling for molecular detection of microbial diversity. Two major paradigm-shifting discoveries include the detection of bacterial diversity that is one to two orders of magnitude greater than previous estimates, and the discovery of an exciting 'rare biosphere' of molecular signatures ('species') of poorly understood ecological significance. We applied a high-throughput parallel tag sequencing (454 sequencing) protocol adopted for eukaryotes to investigate protistan community complexity in two contrasting anoxic marine ecosystems (Framvaren Fjord, Norway; Cariaco deep-sea basin, Venezuela). Both sampling sites have previously been scrutinized for protistan diversity by traditional clone library construction and Sanger sequencing. By comparing these clone library data with 454 amplicon library data, we assess the efficiency of high-throughput tag sequencing strategies. We here present a novel, highly conservative bioinformatic analysis pipeline for the processing of large tag sequence data sets. Results The analyses of ca. 250,000 sequence reads revealed that the number of detected Operational Taxonomic Units (OTUs) far exceeded previous richness estimates from the same sites based on clone libraries and Sanger sequencing. More than 90% of this diversity was represented by OTUs with less than 10 sequence tags. We detected a substantial number of taxonomic groups like Apusozoa, Chrysomerophytes, Centroheliozoa, Eustigmatophytes, hyphochytriomycetes, Ichthyosporea, Oikomonads, Phaeothamniophytes, and rhodophytes which remained undetected by previous clone library-based diversity surveys of the sampling sites. 
    The most important innovations in our newly developed bioinformatics pipeline employ (i) BLASTN with query parameters adjusted for highly variable domains and a complete database of public ribosomal RNA (rRNA) gene sequences for taxonomic assignments of tags; (ii

  7. Reconstruction of the lipid metabolism for the microalga Monoraphidium neglectum from its genome sequence reveals characteristics suitable for biofuel production

    PubMed Central

    2013-01-01

    Background Microalgae are gaining importance as sustainable production hosts in the fields of biotechnology and bioenergy. A robust biomass accumulating strain of the genus Monoraphidium (SAG 48.87) was investigated in this work as a potential feedstock for biofuel production. The genome was sequenced, annotated, and key enzymes for triacylglycerol formation were elucidated. Results Monoraphidium neglectum was identified as an oleaginous species with favourable growth characteristics as well as a high potential for crude oil production, based on neutral lipid contents of approximately 21% (dry weight) under nitrogen starvation, composed of predominantly C18:1 and C16:0 fatty acids. Further characterization revealed growth in a relatively wide pH range and salt concentrations of up to 1.0% NaCl, in which the cells exhibited larger structures. This first full genome sequencing of a member of the Selenastraceae revealed a diploid, approximately 68 Mbp genome with a G + C content of 64.7%. The circular chloroplast genome was assembled to a 135,362 bp single contig, containing 67 protein-coding genes. The assembly of the mitochondrial genome resulted in two contigs with an approximate total size of 94 kb, the largest known mitochondrial genome within algae. 16,761 protein-coding genes were assigned to the nuclear genome. Comparison of gene sets with respect to functional categories revealed a higher gene number assigned to the category “carbohydrate metabolic process” and in “fatty acid biosynthetic process” in M. neglectum when compared to Chlamydomonas reinhardtii and Nannochloropsis gaditana, indicating a higher metabolic diversity for applications in carbohydrate conversions of biotechnological relevance. Conclusions The genome of M. neglectum, as well as the metabolic reconstruction of crucial lipid pathways, provides new insights into the diversity of the lipid metabolism in microalgae. The results of this work provide a platform to encourage the

  8. Human liver apolipoprotein B-100 cDNA: complete nucleic acid and derived amino acid sequence.

    PubMed Central

    Law, S W; Grant, S M; Higuchi, K; Hospattankar, A; Lackner, K; Lee, N; Brewer, H B

    1986-01-01

    Human apolipoprotein B-100 (apoB-100), the ligand on low density lipoproteins that interacts with the low density lipoprotein receptor and initiates receptor-mediated endocytosis and low density lipoprotein catabolism, has been cloned, and the complete nucleic acid and derived amino acid sequences have been determined. ApoB-100 cDNAs were isolated from normal human liver cDNA libraries utilizing immunoscreening as well as filter hybridization with radiolabeled apoB-100 oligodeoxynucleotides. The apoB-100 mRNA is 14.1 kilobases long encoding a mature apoB-100 protein of 4536 amino acids with a calculated amino acid molecular weight of 512,723. ApoB-100 contains 20 potential glycosylation sites, and 12 of a total of 25 cysteine residues are located in the amino-terminal region of the apolipoprotein providing a potential globular structure of the amino terminus of the protein. ApoB-100 contains relatively few regions of amphipathic helices, but compared to other human apolipoproteins it is enriched in beta-structure. The delineation of the entire human apoB-100 sequence will now permit a detailed analysis of the conformation of the protein, the low density lipoprotein receptor binding domain(s), and the structural relationship between apoB-100 and apoB-48 and will provide the basis for the study of genetic defects in apoB-100 in patients with dyslipoproteinemias. PMID:3464946

  9. Computer selection of oligonucleotide probes from amino acid sequences for use in gene library screening.

    PubMed

    Yang, J H; Ye, J H; Wallace, D C

    1984-01-11

    We present a computer program, FINPROBE, which utilizes known amino acid sequence data to deduce minimum redundancy oligonucleotide probes for use in screening cDNA or genomic libraries or in primer extension. The user enters the amino acid sequence of interest, the desired probe length, the number of probes sought, and the constraints on oligonucleotide synthesis. The computer generates a table of possible probes listed in increasing order of redundancy and provides the location of each probe in the protein and mRNA coding sequence. Activation of a next function provides the amino acid and mRNA sequences of each probe of interest as well as the complementary sequence and the minimum dissociation temperature of the probe. A final routine prints out the amino acid sequence of the protein in parallel with the mRNA sequence listing all possible codons for each amino acid.
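
    The core idea of ranking candidate probes by redundancy can be sketched as follows. This is not FINPROBE itself: it assumes the standard genetic code, measures redundancy simply as the product of the codon counts of the residues in a window, and ignores the synthesis constraints, partial codons, and dissociation-temperature calculation mentioned in the abstract. The function name and the example peptide are hypothetical.

```python
from math import prod
from typing import List, Tuple

# Number of codons per amino acid in the standard genetic code.
CODON_COUNT = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def rank_probe_windows(protein: str, residues: int) -> List[Tuple[int, int, str]]:
    """Return (redundancy, start, peptide) tuples in increasing redundancy.

    `start` is the 0-based position of the window in the protein; a window
    of `residues` amino acids corresponds to a probe of 3 * residues bases.
    """
    windows = []
    for start in range(len(protein) - residues + 1):
        peptide = protein[start:start + residues]
        # Every residue multiplies the number of possible coding sequences.
        redundancy = prod(CODON_COUNT[aa] for aa in peptide)
        windows.append((redundancy, start, peptide))
    return sorted(windows)


# Hypothetical eight-residue peptide, scanned with three-residue windows.
ranked = rank_probe_windows("MWLSRKMW", 3)
```

    Windows rich in Met and Trp, which have a single codon each, come first, while windows rich in Leu, Ser, and Arg, with six codons each, come last.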

  10. Genetic relationships among Enterococcus faecalis isolates from different sources as revealed by multilocus sequence typing.

    PubMed

    Chen, X; Song, Y Q; Xu, H Y; Menghe, B L G; Zhang, H P; Sun, Z H

    2015-08-01

    Enterococcus faecalis is part of the natural gut flora of humans and other mammals; some isolates are also used in food production. It is therefore important to evaluate the genetic diversity and phylogenetic relationships among E. faecalis isolates from different sources. A multilocus sequence typing protocol was used to compare 39 E. faecalis isolates from Chinese traditional food products (including dairy products, acidic gruel) and 4 published E. faecalis isolates from other sources, including human-derived isolates, employing 5 housekeeping genes (groEL, clpX, recA, rpoB, and pepC). A total of 23 unique sequence types were identified, which were grouped into 5 clonal complexes and 10 singletons. The value of the standardized index of association of the alleles (IA(S)=0.1465) and the network structure indicated a high frequency of intraspecies recombination across these isolates. Enterococcus faecalis lineages also exhibited clearly source-clustered distributions. The isolates from dairy sources were clustered together. However, the relationship between isolates from acidic gruel and one isolate from a human source was close. The MLST scheme presented in this study provides a sharable and continuously growing sequence database enabling global comparison of strains from different sources, and will further advance our understanding of the microbial ecology of this important species.

  11. Phylogenetic Analysis of Geographically Diverse Radopholus similis via rDNA Sequence Reveals a Monomorphic Motif.

    PubMed

    Kaplan, D T; Thomas, W K; Frisse, L M; Sarah, J L; Stanton, J M; Speijer, P R; Marin, D H; Opperman, C H

    2000-06-01

    The nucleic acid sequences of rDNA ITS1 and the rDNA D2/D3 expansion segment were compared for 57 burrowing nematode isolates collected from Australia, Cameroon, Central America, Cuba, Dominican Republic, Florida, Guadeloupe, Hawaii, Nigeria, Honduras, Indonesia, Ivory Coast, Puerto Rico, South Africa, and Uganda. Of the 57 isolates, 55 were morphologically similar to Radopholus similis and seven were citrus-parasitic. The nucleic acid sequences for PCR-amplified ITS1 and for the D2/D3 expansion segment of the 28S rDNA gene were each identical for all putative R. similis. Sequence divergence for both the ITS1 and the D2/D3 was concordant with morphological differences that distinguish R. similis from other burrowing nematode species. This result substantiates previous observations that the R. similis genome is highly conserved across geographic regions. Autapomorphies that would delimit phylogenetic lineages of non-citrus-parasitic R. similis from those that parasitize citrus were not observed. The data presented herein support the concept that R. similis is comprised of two pathotypes: one that parasitizes citrus and one that does not.

  12. 37 CFR 1.822 - Symbols and format to be used for nucleotide and/or amino acid sequence data.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... for nucleotide and/or amino acid sequence data. 1.822 Section 1.822 Patents, Trademarks, and... Amino Acid Sequences § 1.822 Symbols and format to be used for nucleotide and/or amino acid sequence data. (a) The symbols and format to be used for nucleotide and/or amino acid sequence data...

  13. 37 CFR 1.822 - Symbols and format to be used for nucleotide and/or amino acid sequence data.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... for nucleotide and/or amino acid sequence data. 1.822 Section 1.822 Patents, Trademarks, and... Amino Acid Sequences § 1.822 Symbols and format to be used for nucleotide and/or amino acid sequence data. (a) The symbols and format to be used for nucleotide and/or amino acid sequence data...

  14. 37 CFR 1.822 - Symbols and format to be used for nucleotide and/or amino acid sequence data.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... for nucleotide and/or amino acid sequence data. 1.822 Section 1.822 Patents, Trademarks, and... Amino Acid Sequences § 1.822 Symbols and format to be used for nucleotide and/or amino acid sequence data. (a) The symbols and format to be used for nucleotide and/or amino acid sequence data...

  15. 37 CFR 1.822 - Symbols and format to be used for nucleotide and/or amino acid sequence data.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... for nucleotide and/or amino acid sequence data. 1.822 Section 1.822 Patents, Trademarks, and... Amino Acid Sequences § 1.822 Symbols and format to be used for nucleotide and/or amino acid sequence data. (a) The symbols and format to be used for nucleotide and/or amino acid sequence data...

  16. 37 CFR 1.822 - Symbols and format to be used for nucleotide and/or amino acid sequence data.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... for nucleotide and/or amino acid sequence data. 1.822 Section 1.822 Patents, Trademarks, and... Amino Acid Sequences § 1.822 Symbols and format to be used for nucleotide and/or amino acid sequence data. (a) The symbols and format to be used for nucleotide and/or amino acid sequence data...

  17. Distinct Genetic Lineages of Bactrocera caudata (Insecta: Tephritidae) Revealed by COI and 16S DNA Sequences

    PubMed Central

    Lim, Phaik-Eem; Tan, Ji; Suana, I. Wayan; Eamsobhana, Praphathip; Yong, Hoi Sen

    2012-01-01

    The fruit fly Bactrocera caudata is a pest species of economic importance in Asia. Its larvae feed on the flowers of Cucurbitaceae such as Cucurbita moschata. To date, it is distinguished from related species based on morphological characters. Specimens of B. caudata from Peninsular Malaysia and Indonesia (Bali and Lombok) were analysed using the partial DNA sequences of cytochrome c oxidase subunit I (COI) and 16S rRNA genes. Both gene sequences revealed that B. caudata from Peninsular Malaysia was distinctly different from B. caudata of Bali and Lombok, with no common haplotype between them. Phylogenetic analysis revealed two distinct clades, indicating distinct genetic lineages. The uncorrected ‘p’ distance for COI sequences between B. caudata of Malaysia-Thailand-China and B. caudata of Bali-Lombok was 5.65%, for 16S sequences from 2.76 to 2.99%, and for combined COI and 16S sequences 4.45 to 4.46%. The ‘p’ values are distinctly different from intraspecific ‘p’ distance (0–0.23%). Both the B. caudata lineages are distinctly separated from related species in the subgenus Zeugodacus – B. ascita, B. scutellata, B. ishigakiensis, B. diaphora, B. tau, B. cucurbitae, and B. depressa. Molecular phylogenetic analysis indicates that the B. caudata lineages are closely related to B. ascita sp. B, and form a clade with B. scutellata, B. ishigakiensis, B. diaphora and B. ascita sp. A. This study provides additional baseline for the phylogenetic relationships of Bactrocera fruit flies of the subgenus Zeugodacus. Both the COI and 16S genes could be useful markers for the molecular differentiation and phylogenetic analysis of tephritid fruit flies. PMID:22615962
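
    The uncorrected 'p' distance quoted above is, by its usual definition, the proportion of aligned sites at which two sequences differ. A minimal sketch of that calculation is given below; the sequences are hypothetical, and skipping sites with a gap or an ambiguous base (pairwise deletion) is an assumption, since the abstract does not say how such sites were treated.

```python
def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of compared aligned sites at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for base_a, base_b in zip(seq_a.upper(), seq_b.upper()):
        # Assumed pairwise deletion: ignore sites with a gap or an N.
        if base_a in "-N" or base_b in "-N":
            continue
        compared += 1
        if base_a != base_b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


# Two hypothetical aligned fragments that differ at 2 of 10 sites.
example_distance = p_distance("ACGTACGTAC", "ACGTTCGTAA")
```

    Multiplying the result by 100 gives a percentage on the same scale as the values reported in the abstract.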

  18. Distinct genetic lineages of Bactrocera caudata (Insecta: Tephritidae) revealed by COI and 16S DNA sequences.

    PubMed

    Lim, Phaik-Eem; Tan, Ji; Suana, I Wayan; Eamsobhana, Praphathip; Yong, Hoi Sen

    2012-01-01

    The fruit fly Bactrocera caudata is a pest species of economic importance in Asia. Its larvae feed on the flowers of Cucurbitaceae such as Cucurbita moschata. To date, it has been distinguished from related species based on morphological characters. Specimens of B. caudata from Peninsular Malaysia and Indonesia (Bali and Lombok) were analysed using the partial DNA sequences of cytochrome c oxidase subunit I (COI) and 16S rRNA genes. Both gene sequences revealed that B. caudata from Peninsular Malaysia was distinctly different from B. caudata of Bali and Lombok, with no haplotypes in common. Phylogenetic analysis revealed two distinct clades, indicating distinct genetic lineages. The uncorrected 'p' distance for COI sequences between B. caudata of Malaysia-Thailand-China and B. caudata of Bali-Lombok was 5.65%, for 16S sequences from 2.76 to 2.99%, and for combined COI and 16S sequences 4.45 to 4.46%. The 'p' values are distinctly different from intraspecific 'p' distance (0-0.23%). Both the B. caudata lineages are distinctly separated from related species in the subgenus Zeugodacus - B. ascita, B. scutellata, B. ishigakiensis, B. diaphora, B. tau, B. cucurbitae, and B. depressa. Molecular phylogenetic analysis indicates that the B. caudata lineages are closely related to B. ascita sp. B, and form a clade with B. scutellata, B. ishigakiensis, B. diaphora and B. ascita sp. A. This study provides an additional baseline for the phylogenetic relationships of Bactrocera fruit flies of the subgenus Zeugodacus. Both the COI and 16S genes could be useful markers for the molecular differentiation and phylogenetic analysis of tephritid fruit flies.

  19. Whole-genome sequencing of six dog breeds from continuous altitudes reveals adaptation to high-altitude hypoxia

    PubMed Central

    Gou, Xiao; Wang, Zhen; Li, Ning; Qiu, Feng; Xu, Ze; Yan, Dawei; Yang, Shuli; Jia, Jia; Kong, Xiaoyan; Wei, Zehui; Lu, Shaoxiong; Lian, Linsheng; Wu, Changxin; Wang, Xueyan; Li, Guozhi; Ma, Teng; Jiang, Qiang; Zhao, Xue; Yang, Jiaqiang; Liu, Baohong; Wei, Dongkai; Li, Hong; Yang, Jianfa; Yan, Yulin; Zhao, Guiying; Dong, Xinxing; Li, Mingli; Deng, Weidong; Leng, Jing; Wei, Chaochun; Wang, Chuan; Mao, Huaming; Zhang, Hao; Ding, Guohui; Li, Yixue

    2014-01-01

    The hypoxic environment imposes severe selective pressure on species living at high altitude. To understand the genetic bases of adaptation to high altitude in dogs, we performed whole-genome sequencing of 60 dogs including five breeds living at continuous altitudes along the Tibetan Plateau from 800 to 5100 m as well as one European breed. More than 150× sequencing coverage for each breed provides us with a comprehensive assessment of the genetic polymorphisms of the dogs, including Tibetan Mastiffs. Comparison of the breeds from different altitudes reveals strong signals of population differentiation at the locus of hypoxia-related genes including endothelial Per-Arnt-Sim (PAS) domain protein 1 (EPAS1) and beta hemoglobin cluster. Notably, four novel nonsynonymous mutations specific to high-altitude dogs are identified at EPAS1, one of which occurred at a quite conserved site in the PAS domain. The association testing between EPAS1 genotypes and blood-related phenotypes on additional high-altitude dogs reveals that the homozygous mutation is associated with decreased blood flow resistance, which may help to improve hemorheologic fitness. Interestingly, EPAS1 was also identified as a selective target in Tibetan highlanders, though no amino acid changes were found. Thus, our results not only indicate parallel evolution of humans and dogs in adaptation to high-altitude hypoxia, but also provide a new opportunity to study the role of EPAS1 in the adaptive processes. PMID:24721644

  20. Timing of human protein evolution as revealed by massively parallel capture of Neandertal nuclear DNA sequences

    PubMed Central

    Burbano, Hernán A.; Hodges, Emily; Green, Richard E.; Briggs, Adrian W.; Krause, Johannes; Meyer, Matthias; Good, Jeffrey M.; Maricic, Tomislav; Johnson, Philipp L.F.; Xuan, Zhenyu; Rooks, Michelle; Bhattacharjee, Arindam; Brizuela, Leonardo; Albert, Frank W.; de la Rasilla, Marco; Fortea, Javier; Rosas, Antonio; Lachmann, Michael; Hannon, Gregory J.; Pääbo, Svante

    2010-01-01

    Whole genome shotgun sequencing is now possible for extinct organisms, as well as the targeted capture of specific regions. However, targeted resequencing of megabase sized parts of nuclear genomes has yet to be demonstrated for ancient DNA. Here we show that hybridization capture on microarrays can be used to generate large scale targeted data from Neandertal DNA even in the presence of ~99.8% microbial DNA. It is thus now possible to generate high quality data from large regions of the nuclear genome from Neandertals and other extinct organisms. Using this approach we have sequenced ~14,000 protein coding positions that have been inferred to have changed on the human lineage since the last common ancestor shared with chimpanzees. We identify 88 amino acid substitutions that have become fixed in all humans since the divergence from the Neandertals. PMID:20448179

  1. Fluorescence energy transfer as a probe for nucleic acid structures and sequences.

    PubMed Central

    Mergny, J L; Boutorine, A S; Garestier, T; Belloc, F; Rougée, M; Bulychev, N V; Koshkin, A A; Bourson, J; Lebedev, A V; Valeur, B

    1994-01-01

    The primary or secondary structure of single-stranded nucleic acids has been investigated with fluorescent oligonucleotides, i.e., oligonucleotides covalently linked to a fluorescent dye. Five different chromophores were used: 2-methoxy-6-chloro-9-amino-acridine, coumarin 500, fluorescein, rhodamine and ethidium. The chemical synthesis of derivatized oligonucleotides is described. Hybridization of two fluorescent oligonucleotides to adjacent nucleic acid sequences led to fluorescence excitation energy transfer between the donor and the acceptor dyes. This phenomenon was used to probe primary and secondary structures of DNA fragments and the orientation of oligodeoxynucleotides synthesized with the alpha-anomers of nucleoside units. Fluorescence energy transfer can be used to reveal the formation of hairpin structures and the translocation of genes between two chromosomes. PMID:8152922

  2. Human retroviruses and AIDS 1996. A compilation and analysis of nucleic acid and amino acid sequences

    SciTech Connect

    Myers, G.; Foley, B.; Korber, B.; Mellors, J.W.; Jeang, K.T.; Wain-Hobson, S.

    1997-04-01

    This compendium and the accompanying floppy diskettes are the result of an effort to compile and rapidly publish all relevant molecular data concerning the human immunodeficiency viruses (HIV) and related retroviruses. The scope of the compendium and database is best summarized by the five parts that it comprises: (1) Nucleic Acid Alignments and Sequences; (2) Amino Acid Alignments; (3) Analysis; (4) Related Sequences; and (5) Database Communications. Information within all the parts is updated throughout the year on the Web site, http://hiv-web.lanl.gov. While this publication could take the form of a review or sequence monograph, it is not so conceived. Instead, the literature from which the database is derived has simply been summarized and some elementary computational analyses have been performed upon the data. Interpretation and commentary have been avoided insofar as possible so that the reader can form his or her own judgments concerning the complex information. In addition to the general descriptions of the parts of the compendium, the user should read the individual introductions for each part.

  3. Multilocus Sequence Analysis of Nectar Pseudomonads Reveals High Genetic Diversity and Contrasting Recombination Patterns

    PubMed Central

    Álvarez-Pérez, Sergio; de Vega, Clara; Herrera, Carlos M.

    2013-01-01

    The genetic and evolutionary relationships among floral nectar-dwelling Pseudomonas ‘sensu stricto’ isolates associated to South African and Mediterranean plants were investigated by multilocus sequence analysis (MLSA) of four core housekeeping genes (rrs, gyrB, rpoB and rpoD). A total of 35 different sequence types were found for the 38 nectar bacterial isolates characterised. Phylogenetic analyses resulted in the identification of three main clades [nectar groups (NGs) 1, 2 and 3] of nectar pseudomonads, which were closely related to five intrageneric groups: Pseudomonas oryzihabitans (NG 1); P. fluorescens, P. lutea and P. syringae (NG 2); and P. rhizosphaerae (NG 3). Linkage disequilibrium analysis pointed to a mostly clonal population structure, even when the analysis was restricted to isolates from the same floristic region or belonging to the same NG. Nevertheless, signatures of recombination were observed for NG 3, which exclusively included isolates retrieved from the floral nectar of insect-pollinated Mediterranean plants. In contrast, the other two NGs comprised both South African and Mediterranean isolates. Analyses relating diversification to floristic region and pollinator type revealed that there has been more unique evolution of the nectar pseudomonads within the Mediterranean region than would be expected by chance. This is the first work analysing the sequence of multiple loci to reveal geno- and ecotypes of nectar bacteria. PMID:24116076

  4. Ultra Deep Sequencing of Listeria monocytogenes sRNA Transcriptome Revealed New Antisense RNAs

    PubMed Central

    Behrens, Sebastian; Widder, Stefanie; Mannala, Gopala Krishna; Qing, Xiaoxing; Madhugiri, Ramakanth; Kefer, Nathalie; Mraheil, Mobarak Abu; Rattei, Thomas; Hain, Torsten

    2014-01-01

    Listeria monocytogenes, a gram-positive pathogen and the causative agent of listeriosis, has become a widely used model organism for intracellular infections. Recent studies have identified small non-coding RNAs (sRNAs) as important factors for regulating gene expression and pathogenicity of L. monocytogenes. Increased speed and reduced costs of high throughput sequencing (HTS) techniques have made RNA sequencing (RNA-Seq) the state-of-the-art method to study bacterial transcriptomes. We created a large transcriptome dataset of L. monocytogenes containing a total of 21 million reads, using the SOLiD sequencing technology. The dataset contained cDNA sequences generated from L. monocytogenes RNA collected under intracellular and extracellular conditions and was additionally size-fractionated into three ranges: <40 nt, 40–150 nt and >150 nt. We report here the identification of nine new sRNA candidates of L. monocytogenes and a reevaluation of known sRNAs of L. monocytogenes EGD-e. Automatic comparison to known sRNAs revealed a high recovery rate of 55%, which was increased to 90% by manual revision of the data. Moreover, thorough classification of known sRNAs shed further light on their possible biological functions. Interestingly, among the newly identified sRNA candidates are antisense RNAs (asRNAs) associated with the housekeeping genes purA, fumC and pgi and potentially their regulation, emphasizing the significance of sRNAs for metabolic adaptation in L. monocytogenes. PMID:24498259

  5. Identification of Bacteria Using Phylogenetic Relationships, Revealed by MS/MS Sequencing of Tryptic Peptides Derived from Cellular Proteins

    DTIC Science & Technology

    2004-11-17

    Universal Phylogenetic Tree of Bacteria Based on SSU rRNA Sequences Aquificae Termotogae Planctomycetes Actinobacteria Firmicutes Cyanobacteria...Identification of Bacteria Using Phylogenetic Relationships Revealed by MS/MS Sequencing of Tryptic Peptides Derived from Cellular Proteins Jacek P...Bacteria Using Phylogenetic Relationships Revealed by MS/MS Sequencing of Tryptic Peptides Derived from Cellular Proteins 5a. CONTRACT NUMBER 5b. GRANT

  6. Human retroviruses and AIDS, 1992. A compilation and analysis of nucleic acid and amino acid sequences

    SciTech Connect

    Myers, G.; Korber, B.; Berzofsky, J.A.; Pavlakis, G.N.; Smith, R.F.

    1992-10-01

    This compendium and the accompanying floppy diskettes are the result of an effort to compile and rapidly publish all relevant molecular data concerning the human immunodeficiency viruses (HIV) and related retroviruses. The scope of the compendium and database is best summarized by the five parts that it comprises: (I) HIV and SIV Nucleotide Sequences; (II) Amino Acid Sequences; (III) Analyses; (IV) Related Sequences; and (V) Database Communications. Information within all the parts is updated at least twice in each year, which accounts for the modes of binding and pagination in the compendium. While this publication could take the form of a review or sequence monograph, it is not so conceived. Instead, the literature from which the database is derived has simply been summarized and some elementary computational analyses have been performed upon the data. Interpretation and commentary have been avoided insofar as possible so that the reader can form his or her own judgments concerning the complex information. In addition to the general descriptions below of the parts of the compendium, the user should read the individual introductions for each part.

  7. Complete genome sequence of Fer-de-Lance Virus reveals a novel gene in reptilian Paramyxoviruses

    USGS Publications Warehouse

    Kurath, G.; Batts, W.N.; Ahne, W.; Winton, J.R.

    2004-01-01

    The complete RNA genome sequence of the archetype reptilian paramyxovirus, Fer-de-Lance virus (FDLV), has been determined. The genome is 15,378 nucleotides in length and consists of seven nonoverlapping genes in the order 3′ N-U-P-M-F-HN-L 5′, coding for the nucleocapsid, unknown, phospho-, matrix, fusion, hemagglutinin-neuraminidase, and large polymerase proteins, respectively. The gene junctions contain highly conserved transcription start and stop signal sequences and tri-nucleotide intergenic regions similar to those of other Paramyxoviridae. The FDLV P gene expression strategy is like that of rubulaviruses, which express the accessory V protein from the primary transcript and edit a portion of the mRNA to encode P and I proteins. There is also an overlapping open reading frame potentially encoding a small basic protein in the P gene. The gene designated U (unknown), encodes a deduced protein of 19.4 kDa that has no counterpart in other paramyxoviruses and has no similarity with sequences in the National Center for Biotechnology Information database. Active transcription of the U gene in infected cells was demonstrated by Northern blot analysis, and bicistronic N-U mRNA was also evident. The genomes of two other snake paramyxovirus genotypes were also found to have U genes, with 11 to 16% nucleotide divergence from the FDLV U gene. Pairwise comparisons of amino acid identities and phylogenetic analyses of all deduced FDLV protein sequences with homologous sequences from other Paramyxoviridae indicate that FDLV represents a new genus within the subfamily Paramyxovirinae. We suggest the name Ferlavirus for the new genus, with FDLV as the type species.

  8. Deep Sequencing Reveals Novel Genetic Variants in Children with Acute Liver Failure and Tissue Evidence of Impaired Energy Metabolism

    PubMed Central

    Valencia, C. Alexander; Wang, Xinjian; Wang, Jin; Peters, Anna; Simmons, Julia R.; Moran, Molly C.; Mathur, Abhinav; Husami, Ammar; Qian, Yaping; Sheridan, Rachel; Bove, Kevin E.; Witte, David; Huang, Taosheng; Miethke, Alexander G.

    2016-01-01

    Background & Aims: The etiology of acute liver failure (ALF) remains elusive in almost half of affected children. We hypothesized that inherited mitochondrial and fatty acid oxidation disorders were occult etiological factors in patients with idiopathic ALF and impaired energy metabolism. Methods: Twelve patients with elevated blood molar lactate/pyruvate ratio and indeterminate etiology were selected from a retrospective cohort of 74 subjects with ALF because their fixed and frozen liver samples were available for histological, ultrastructural, molecular and biochemical analysis. Results: A customized next-generation sequencing panel for 26 genes associated with mitochondrial and fatty acid oxidation defects revealed mutations and sequence variants in five subjects. Variants involved the genes ACAD9, POLG, POLG2, DGUOK, and RRM2B; the latter not previously reported in subjects with ALF. The explanted livers of the patients with heterozygous, truncating insertion mutations in RRM2B showed patchy micro- and macrovesicular steatosis, decreased mitochondrial DNA (mtDNA) content <30% of controls, and reduced respiratory chain complex activity; both patients had good post-transplant outcome. One infant with severe lactic acidosis was found to carry two heterozygous variants in ACAD9, which was associated with isolated complex I deficiency and diffuse hypergranular hepatocytes. The two subjects with heterozygous variants of unknown clinical significance in POLG and DGUOK developed ALF following drug exposure. Their hepatocytes displayed abnormal mitochondria by electron microscopy. Conclusion: Targeted next generation sequencing and correlation with histological, ultrastructural and functional studies on liver tissue in children with elevated lactate/pyruvate ratio expand the spectrum of genes associated with pediatric ALF. PMID:27483465

  9. Appearances Can Be Deceptive: Revealing a Hidden Viral Infection with Deep Sequencing in a Plant Quarantine Context

    PubMed Central

    Candresse, Thierry; Filloux, Denis; Muhire, Brejnev; Julian, Charlotte; Galzi, Serge; Fort, Guillaume; Bernardo, Pauline; Daugrois, Jean-Heindrich; Fernandez, Emmanuel; Martin, Darren P.; Varsani, Arvind; Roumagnac, Philippe

    2014-01-01

    Comprehensive inventories of plant viral diversity are essential for effective quarantine and sanitation efforts. The safety of regulated plant material exchanges presently relies heavily on techniques such as PCR or nucleic acid hybridisation, which are only suited to the detection and characterisation of specific, well characterised pathogens. Here, we demonstrate the utility of sequence-independent next generation sequencing (NGS) of both virus-derived small interfering RNAs (siRNAs) and virion-associated nucleic acids (VANA) for the detailed identification and characterisation of viruses infecting two quarantined sugarcane plants. Both plants originated from Egypt and were known to be infected with Sugarcane streak Egypt Virus (SSEV; Genus Mastrevirus, Family Geminiviridae), but were revealed by the NGS approaches to also be infected by a second highly divergent mastrevirus, here named Sugarcane white streak Virus (SWSV). This novel virus had escaped detection by all routine quarantine detection assays and was found to also be present in sugarcane plants originating from Sudan. Complete SWSV genomes were cloned and sequenced from six plants and all were found to share >91% genome-wide identity. With the exception of two SWSV variants, which potentially express unusually large RepA proteins, the SWSV isolates display genome characteristics very typical to those of all other previously described mastreviruses. An analysis of virus-derived siRNAs for SWSV and SSEV showed them to be strongly influenced by secondary structures within both genomic single stranded DNA and mRNA transcripts. In addition, the distribution of siRNA size frequencies indicates that these mastreviruses are likely subject to both transcriptional and post-transcriptional gene silencing. Our study stresses the potential advantages of NGS-based virus metagenomic screening in a plant quarantine setting and indicates that such techniques could dramatically reduce the numbers of non

  10. What genomic sequence information has revealed about Vibrio ecology in the ocean--a review.

    PubMed

    Grimes, Darrell Jay; Johnson, Crystal N; Dillon, Kevin S; Flowers, Adrienne R; Noriea, Nicholas F; Berutti, Tracy

    2009-10-01

    To date, the genomes of eight Vibrio strains representing six species and three human pathogens have been fully sequenced and reported. This review compares genomic information revealed from these sequencing efforts and what we can infer about Vibrio biology and ecology from this and related genomic information. The focus of the review is on those attributes that allow the Vibrios to survive and even proliferate in their ocean habitats, which include seawater, plankton, invertebrates, fish, marine mammals, plants, man-made structures (surfaces), and particulate matter. Areas covered include general information about the eight genomes, each of which is distributed over two chromosomes; a discussion of expected and unusual genes found; attachment sites and mechanisms; utilization of particulate and dissolved organic matter; and conclusions.

  11. Sequence-defined bioactive macrocycles via an acid-catalysed cascade reaction

    NASA Astrophysics Data System (ADS)

    Porel, Mintu; Thornlow, Dana N.; Phan, Ngoc N.; Alabi, Christopher A.

    2016-06-01

    Synthetic macrocycles derived from sequence-defined oligomers are a unique structural class whose ring size, sequence and structure can be tuned via precise organization of the primary sequence. Similar to peptides and other peptidomimetics, these well-defined synthetic macromolecules become pharmacologically relevant when bioactive side chains are incorporated into their primary sequence. In this article, we report the synthesis of oligothioetheramide (oligoTEA) macrocycles via a one-pot acid-catalysed cascade reaction. The versatility of the cyclization chemistry and modularity of the assembly process was demonstrated via the synthesis of >20 diverse oligoTEA macrocycles. Structural characterization via NMR spectroscopy revealed the presence of conformational isomers, which enabled the determination of local chain dynamics within the macromolecular structure. Finally, we demonstrate the biological activity of oligoTEA macrocycles designed to mimic facially amphiphilic antimicrobial peptides. The preliminary results indicate that macrocyclic oligoTEAs with just two-to-three cationic charge centres can elicit potent antibacterial activity against Gram-positive and Gram-negative bacteria.

  12. Nucleic acid sequence of an internal image-bearing monoclonal anti-idiotype and its comparison to the sequence of the external antigen.

    PubMed Central

    Bruck, C; Co, M S; Slaoui, M; Gaulton, G N; Smith, T; Fields, B N; Mullins, J I; Greene, M I

    1986-01-01

    The monoclonal anti-idiotypic antibody (mAb2) 87.92.6 directed against the 9B.G5 antibody specific for the virus neutralizing epitope on the mammalian reovirus type 3 hemagglutinin was previously demonstrated to express an internal image of the receptor binding epitope of the reovirus type 3. Furthermore, this mAb2 has autoimmune reactivity to the cell surface receptor of the reovirus. The nucleotide and deduced amino acid sequences of the 87.92.6 mAb2 heavy and light chains are described in this report. The sequence analysis reveals that the same heavy chain variable and joining (VH and JH) gene segments are used by the 87.92.6 anti-idiotypic mAb2 and by the dominant idiotypes of the BALB/c anti-GAT (cGAT) and anti-NP (NPa) responses. [GAT; random polymer that is 60% glutamic acid, 30% alanine, and 10% tyrosine. NP; (4-hydroxy-3-nitrophenyl)-acetyl.] Despite extensive homology at the level of the heavy chain variable regions, the NPa positive BALB/c anti-NP monoclonal antibody 17.2.25 binds neither 9B.G5 nor the cellular receptor for the hemagglutinin. Amino acid sequence comparison between the viral hemagglutinin and the 87.92.6 mAb2 light chain "internal image," reveals an area of significant homology indicating that antigen mimicry by antibodies may be achieved by sharing primary structure. PMID:2428036

  13. Metagenome sequence analysis of filamentous microbial communities obtained from geochemically distinct geothermal channels reveals specialization of three aquificales lineages.

    PubMed

    Takacs-Vesbach, Cristina; Inskeep, William P; Jay, Zackary J; Herrgard, Markus J; Rusch, Douglas B; Tringe, Susannah G; Kozubal, Mark A; Hamamura, Natsuko; Macur, Richard E; Fouke, Bruce W; Reysenbach, Anna-Louise; McDermott, Timothy R; Jennings, Ryan deM; Hengartner, Nicolas W; Xie, Gary

    2013-01-01

    The Aquificales are thermophilic microorganisms that inhabit hydrothermal systems worldwide and are considered one of the earliest lineages of the domain Bacteria. We analyzed metagenome sequence obtained from six thermal "filamentous streamer" communities (∼40 Mbp per site), which targeted three different groups of Aquificales found in Yellowstone National Park (YNP). Unassembled metagenome sequence and PCR-amplified 16S rRNA gene libraries revealed that acidic, sulfidic sites were dominated by Hydrogenobaculum (Aquificaceae) populations, whereas the circum-neutral pH (6.5-7.8) sites containing dissolved sulfide were dominated by Sulfurihydrogenibium spp. (Hydrogenothermaceae). Thermocrinis (Aquificaceae) populations were found primarily in the circum-neutral sites with undetectable sulfide, and to a lesser extent in one sulfidic system at pH 8. Phylogenetic analysis of assembled sequence containing 16S rRNA genes as well as conserved protein-encoding genes revealed that the composition and function of these communities varied across geochemical conditions. Each Aquificales lineage contained genes for CO2 fixation by the reverse-TCA cycle, but only the Sulfurihydrogenibium populations perform citrate cleavage using ATP citrate lyase (Acl). The Aquificaceae populations use an alternative pathway catalyzed by two separate enzymes, citryl-CoA synthetase (Ccs), and citryl-CoA lyase (Ccl). All three Aquificales lineages contained evidence of aerobic respiration, albeit due to completely different types of heme Cu oxidases (subunit I) involved in oxygen reduction. The distribution of Aquificales populations and differences among functional genes involved in energy generation and electron transport is consistent with the hypothesis that geochemical parameters (e.g., pH, sulfide, H2, O2) have resulted in niche specialization among members of the Aquificales.

  14. Metagenome Sequence Analysis of Filamentous Microbial Communities Obtained from Geochemically Distinct Geothermal Channels Reveals Specialization of Three Aquificales Lineages

    PubMed Central

    Takacs-Vesbach, Cristina; Inskeep, William P.; Jay, Zackary J.; Herrgard, Markus J.; Rusch, Douglas B.; Tringe, Susannah G.; Kozubal, Mark A.; Hamamura, Natsuko; Macur, Richard E.; Fouke, Bruce W.; Reysenbach, Anna-Louise; McDermott, Timothy R.; Jennings, Ryan deM.; Hengartner, Nicolas W.; Xie, Gary

    2013-01-01

    The Aquificales are thermophilic microorganisms that inhabit hydrothermal systems worldwide and are considered one of the earliest lineages of the domain Bacteria. We analyzed metagenome sequence obtained from six thermal “filamentous streamer” communities (∼40 Mbp per site), which targeted three different groups of Aquificales found in Yellowstone National Park (YNP). Unassembled metagenome sequence and PCR-amplified 16S rRNA gene libraries revealed that acidic, sulfidic sites were dominated by Hydrogenobaculum (Aquificaceae) populations, whereas the circum-neutral pH (6.5–7.8) sites containing dissolved sulfide were dominated by Sulfurihydrogenibium spp. (Hydrogenothermaceae). Thermocrinis (Aquificaceae) populations were found primarily in the circum-neutral sites with undetectable sulfide, and to a lesser extent in one sulfidic system at pH 8. Phylogenetic analysis of assembled sequence containing 16S rRNA genes as well as conserved protein-encoding genes revealed that the composition and function of these communities varied across geochemical conditions. Each Aquificales lineage contained genes for CO2 fixation by the reverse-TCA cycle, but only the Sulfurihydrogenibium populations perform citrate cleavage using ATP citrate lyase (Acl). The Aquificaceae populations use an alternative pathway catalyzed by two separate enzymes, citryl-CoA synthetase (Ccs), and citryl-CoA lyase (Ccl). All three Aquificales lineages contained evidence of aerobic respiration, albeit due to completely different types of heme Cu oxidases (subunit I) involved in oxygen reduction. The distribution of Aquificales populations and differences among functional genes involved in energy generation and electron transport is consistent with the hypothesis that geochemical parameters (e.g., pH, sulfide, H2, O2) have resulted in niche specialization among members of the Aquificales. PMID:23755042

  15. Penicillium arizonense, a new, genome sequenced fungal species, reveals a high chemical diversity in secreted metabolites

    PubMed Central

    Grijseels, Sietske; Nielsen, Jens Christian; Randelovic, Milica; Nielsen, Jens; Nielsen, Kristian Fog; Workman, Mhairi; Frisvad, Jens Christian

    2016-01-01

    A new soil-borne species belonging to the Penicillium section Canescentia is described, Penicillium arizonense sp. nov. (type strain CBS 141311T = IBT 12289T). The genome was sequenced and assembled into 33.7 Mb containing 12,502 predicted genes. A phylogenetic assessment based on marker genes confirmed the grouping of P. arizonense within section Canescentia. Compared to related species, P. arizonense proved to encode a high number of proteins involved in carbohydrate metabolism, in particular hemicellulases. Mining the genome for genes involved in secondary metabolite biosynthesis resulted in the identification of 62 putative biosynthetic gene clusters. Extracts of P. arizonense were analysed for secondary metabolites and austalides, pyripyropenes, tryptoquivalines, fumagillin, pseurotin A, curvulinic acid and xanthoepocin were detected. A comparative analysis against known pathways enabled the proposal of biosynthetic gene clusters in P. arizonense responsible for the synthesis of all detected compounds except curvulinic acid. The capacity to produce biomass degrading enzymes and the identification of a high chemical diversity in secreted bioactive secondary metabolites, offers a broad range of potential industrial applications for the new species P. arizonense. The description and availability of the genome sequence of P. arizonense, further provides the basis for biotechnological exploitation of this species. PMID:27739446

  16. Combining Natural Sequence Variation with High Throughput Mutational Data to Reveal Protein Interaction Sites

    PubMed Central

    Melamed, Daniel; Young, David L.; Miller, Christina R.; Fields, Stanley

    2015-01-01

    Many protein interactions are conserved among organisms despite changes in the amino acid sequences that comprise their contact sites, a property that has been used to infer the location of these sites from protein homology. In an inter-species complementation experiment, a sequence present in a homologue is substituted into a protein and tested for its ability to support function. Therefore, substitutions that inhibit function can identify interaction sites that changed over evolution. However, most of the sequence differences within a protein family remain unexplored because of the small-scale nature of these complementation approaches. Here we use existing high throughput mutational data on the in vivo function of the RRM2 domain of the Saccharomyces cerevisiae poly(A)-binding protein, Pab1, to analyze its sites of interaction. Of 197 single amino acid differences in 52 Pab1 homologues, 17 reduce the function of Pab1 when substituted into the yeast protein. The majority of these deleterious mutations interfere with the binding of the RRM2 domain to eIF4G1 and eIF4G2, isoforms of a translation initiation factor. A large-scale mutational analysis of the RRM2 domain in a two-hybrid assay for eIF4G1 binding supports these findings and identifies peripheral residues that make a smaller contribution to eIF4G1 binding. Three single amino acid substitutions in yeast Pab1 corresponding to residues from the human orthologue are deleterious and eliminate binding to the yeast eIF4G isoforms. We create a triple mutant that carries these substitutions and other humanizing substitutions that collectively support a switch in binding specificity of RRM2 from the yeast eIF4G1 to its human orthologue. Finally, we map other deleterious substitutions in Pab1 to inter-domain (RRM2–RRM1) or protein-RNA (RRM2–poly(A)) interaction sites. Thus, the combined approach of large-scale mutational data and evolutionary conservation can be used to characterize interaction sites at single

  17. High-throughput sequencing reveals an altered T cell repertoire in X-linked agammaglobulinemia

    PubMed Central

    Ramesh, Manish; Simchoni, Noa; Hamm, David; Cunningham-Rundles, Charlotte

    2015-01-01

    To examine the T cell receptor structure in the absence of B cells, the TCR β CDR3 was sequenced from DNA of 15 X-linked agammaglobulinemia (XLA) subjects and 18 male controls, using the Illumina HiSeq platform and the ImmunoSEQ analyzer. V gene usage and the V–J combinations, derived from both productive and nonproductive sequences, were significantly different between XLA samples and controls. Although the CDR3 length was similar for XLA and control samples, the CDR3 region of the XLA T cell receptor contained significantly fewer deletions and insertions in V, D, and J gene segments, differences intrinsic to the V(D)J recombination process and not due to peripheral T cell selection. XLA CDR3s demonstrated fewer charged amino acid residues, more sharing of CDR3 sequences, and almost completely lacked a population of highly modified Vβ gene segments found in control DNA, suggesting both a skewed and contracted T cell repertoire in XLA. PMID:26360253

  18. Complete genomic sequence analysis of a highly virulent isolate revealed a novel strain of Sugarcane mosaic virus.

    PubMed

    Gao, Bo; Cui, Xiao-Wen; Li, Xiang-Dong; Zhang, Chun-Qing; Miao, Hong-Qin

    2011-12-01

    Sugarcane mosaic virus (SCMV) is the most prevalent virus causing maize dwarf mosaic disease in northern China. An SCMV isolate, BD8, was obtained from maize showing dwarf and mosaic symptoms in Baoding, China. The complete genomic sequence of BD8 is 9,576 nucleotides (nt) excluding the poly(A) tail. It contains a single open reading frame of 9,192 nt that encodes a large polyprotein of 3,063 amino acids (aa), flanked by a 5'-untranslated region (UTR) of 148 nt and a 3'-UTR of 236 nt. The entire genomic sequence of BD8 shares identities of 79.1-80.8% with those of 13 other SCMV isolates available in GenBank at the nt level, while their CP genes share identities of 76.9-82.6% and 82.8-86.9% at the nt and aa levels, respectively. Phylogenetic analysis of the complete genomic sequences reveals that SCMV isolates cluster into four groups: group I includes isolates from maize, group II consists of isolates from sugarcane or maize, and groups III and IV each contain a single isolate, AU-A (AJ278405) and BD8, respectively. Thus, BD8 represents a new strain of SCMV. Furthermore, analysis of the CP gene sequences of more isolates shows that BD8 clusters with the isolates from Thailand and Vietnam, which implies that isolates of this strain have been distributed in South Asia. In the greenhouse, BD8 caused severe symptoms with high incidence in all 12 maize varieties tested, indicating that BD8 is highly virulent.
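
    The genome figures quoted in this abstract can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, and it assumes that the quoted open reading frame length includes one terminal stop codon, which the abstract does not state explicitly.

```python
# Figures quoted in the abstract for SCMV isolate BD8 (nt = nucleotides).
UTR5_NT = 148          # 5'-untranslated region
ORF_NT = 9192          # single open reading frame
UTR3_NT = 236          # 3'-untranslated region, poly(A) tail excluded
GENOME_NT = 9576       # reported complete genome length
POLYPROTEIN_AA = 3063  # reported polyprotein length


def genome_length(utr5: int, orf: int, utr3: int) -> int:
    """Sum the three segments that make up the genome."""
    return utr5 + orf + utr3


def encoded_residues(orf_nt: int) -> int:
    """Number of amino acids encoded by an ORF.

    Assumption (not stated in the abstract): the ORF length counts
    one 3-nt stop codon, which does not encode a residue.
    """
    if orf_nt % 3 != 0:
        raise ValueError("ORF length is not a multiple of 3")
    return orf_nt // 3 - 1


total = genome_length(UTR5_NT, ORF_NT, UTR3_NT)
residues = encoded_residues(ORF_NT)
print("segments sum to reported genome length:", total == GENOME_NT)
print("ORF length matches reported polyprotein:", residues == POLYPROTEIN_AA)
```

    Under that assumption, both the segment sum and the codon count agree with the reported values.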

  19. Completion of the amino acid sequence of the alpha 1 chain from type I calf skin collagen. Amino acid sequence of alpha 1(I)B8.

    PubMed Central

    Glanville, R W; Breitkreutz, D; Meitinger, M; Fietzek, P P

    1983-01-01

    The complete amino acid sequence of the 279-residue CNBr peptide CB8 from the alpha 1 chain of type I calf skin collagen is presented. It was determined by sequencing overlapping fragments of CB8 produced by Staphylococcus aureus V8 proteinase, trypsin, Endoproteinase Arg-C and hydroxylamine. Tryptic cleavages were also made specific for lysine by blocking arginine residues with cyclohexane-1,2-dione. This completes the amino acid sequence analysis of the 1054-residue-long alpha 1(I) chain of calf skin collagen. PMID:6354180

  20. Focused Evolution of HIV-1 Neutralizing Antibodies Revealed by Structures and Deep Sequencing

    SciTech Connect

    Wu, Xueling; Zhou, Tongqing; Zhu, Jiang; Zhang, Baoshan; Georgiev, Ivelin; Wang, Charlene; Chen, Xuejun; Longo, Nancy S.; Louder, Mark; McKee, Krisha; O’Dell, Sijy; Perfetto, Stephen; Schmidt, Stephen D.; Shi, Wei; Wu, Lan; Yang, Yongping; Yang, Zhi-Yong; Yang, Zhongjia; Zhang, Zhenhai; Bonsignori, Mattia; Crump, John A.; Kapiga, Saidi H.; Sam, Noel E.; Haynes, Barton F.; Simek, Melissa; Burton, Dennis R.; Koff, Wayne C.; Doria-Rose, Nicole A.; Connors, Mark; Mullikin, James C.; Nabel, Gary J.; Roederer, Mario; Shapiro, Lawrence; Kwong, Peter D.; Mascola, John R.

    2013-03-04

    Antibody VRC01 is a human immunoglobulin that neutralizes about 90% of HIV-1 isolates. To understand how such broadly neutralizing antibodies develop, we used x-ray crystallography and 454 pyrosequencing to characterize additional VRC01-like antibodies from HIV-1-infected individuals. Crystal structures revealed a convergent mode of binding for diverse antibodies to the same CD4-binding-site epitope. A functional genomics analysis of expressed heavy and light chains revealed common pathways of antibody-heavy chain maturation, confined to the IGHV1-2*02 lineage, involving dozens of somatic changes, and capable of pairing with different light chains. Broadly neutralizing HIV-1 immunity associated with VRC01-like antibodies thus involves the evolution of antibodies to a highly affinity-matured state required to recognize an invariant viral structure, with lineages defined from thousands of sequences providing a genetic roadmap of their development.

  1. Stacking sequence and interlayer coupling in few-layer graphene revealed by in situ imaging

    SciTech Connect

    Wang, Zhu-Jun; Dong, Jichen; Cui, Yi; Eres, Gyula; Timpe, Olaf; Fu, Qiang; Ding, Feng; Willinger, Marc-Georg; Schloegl, R.

    2016-10-19

    In the transition from graphene to graphite, the addition of each individual graphene layer modifies the electronic structure and produces a different material with unique properties. Controlled growth of few-layer graphene is therefore of fundamental interest and will provide access to materials with engineered electronic structure. Here we combine isothermal growth and etching experiments with in situ scanning electron microscopy to reveal the stacking sequence and interlayer coupling strength in few-layer graphene. The observed layer-dependent etching rates reveal the relative strength of the graphene-graphene and graphene-substrate interaction and the resulting mode of adlayer growth. Scanning tunnelling microscopy and density functional theory calculations confirm a strong coupling between graphene edge atoms and platinum. Simulated etching confirms that etching can be viewed as reversed growth. This work demonstrates that real-time imaging under controlled atmosphere is a powerful method for designing synthesis protocols for sp2 carbon nanostructures in between graphene and graphite.

  2. Stacking sequence and interlayer coupling in few-layer graphene revealed by in situ imaging

    NASA Astrophysics Data System (ADS)

    Wang, Zhu-Jun; Dong, Jichen; Cui, Yi; Eres, Gyula; Timpe, Olaf; Fu, Qiang; Ding, Feng; Schloegl, R.; Willinger, Marc-Georg

    2016-10-01

    In the transition from graphene to graphite, the addition of each individual graphene layer modifies the electronic structure and produces a different material with unique properties. Controlled growth of few-layer graphene is therefore of fundamental interest and will provide access to materials with engineered electronic structure. Here we combine isothermal growth and etching experiments with in situ scanning electron microscopy to reveal the stacking sequence and interlayer coupling strength in few-layer graphene. The observed layer-dependent etching rates reveal the relative strength of the graphene-graphene and graphene-substrate interaction and the resulting mode of adlayer growth. Scanning tunnelling microscopy and density functional theory calculations confirm a strong coupling between graphene edge atoms and platinum. Simulated etching confirms that etching can be viewed as reversed growth. This work demonstrates that real-time imaging under controlled atmosphere is a powerful method for designing synthesis protocols for sp2 carbon nanostructures in between graphene and graphite.

  3. Whole Exome Sequencing Reveals Novel PHEX Splice Site Mutations in Patients with Hypophosphatemic Rickets

    PubMed Central

    Gillies, Christopher; Sampson, Matthew G.; Kher, Vijay; Sethi, Sidharth K.; Otto, Edgar A.

    2015-01-01

    Objective: Hypophosphatemic rickets (HR) is a heterogeneous genetic phosphate wasting disorder. The disease is most commonly caused by mutations in the PHEX gene located on the X-chromosome or by mutations in CLCN5, DMP1, ENPP1, FGF23, and SLC34A3. The aims of this study were to perform molecular diagnostics for four patients with HR of Indian origin (two independent families) and to describe their clinical features. Methods: We performed whole exome sequencing (WES) for the affected mother of two boys who also displayed the typical features of HR, including bone malformations and phosphate wasting. B-lymphoblast cell lines were established by EBV transformation and subsequent RT-PCR to investigate an uncommon splice site variant found by WES. An in silico analysis was done to obtain accurate nucleotide frequency occurrences of consensus splice positions other than the canonical sites of all human exons. Additionally, we applied direct Sanger sequencing for all exons and exon/intron boundaries of the PHEX gene for an affected girl from an independent second Indian family. Results: WES revealed a novel PHEX splice acceptor mutation in intron 9 (c.1080-3C>A) in a family with 3 affected individuals with HR. The effect on splicing of this mutation was further investigated by RT-PCR using RNA obtained from a patient’s EBV-transformed lymphoblast cell line. RT-PCR revealed an aberrant splice transcript skipping exons 10-14 which was not observed in control samples, confirming the diagnosis of X-linked dominant hypophosphatemia (XLH). The in silico analysis of all human splice sites adjacent to all 327,293 exons across 81,814 transcripts among 20,345 human genes revealed that cytosine is the most frequent nucleobase at the minus 3 splice acceptor position (64.3%), followed by thymidine (28.7%), adenine (6.3%), and guanine (0.8%). We generated frequency tables and pictograms for the extended donor and acceptor splice consensus regions by analyzing all human

  4. Nuclear species-diagnostic SNP markers mined from 454 amplicon sequencing reveal admixture genomic structure of modern citrus varieties.

    PubMed

    Curk, Franck; Ancillo, Gema; Ollitrault, Frédérique; Perrier, Xavier; Jacquemoud-Collet, Jean-Pierre; Garcia-Lor, Andres; Navarro, Luis; Ollitrault, Patrick

    2015-01-01

    Most cultivated Citrus species originated from interspecific hybridisation between four ancestral taxa (C. reticulata, C. maxima, C. medica, and C. micrantha) with limited further interspecific recombination due to vegetative propagation. This evolution resulted in admixture genomes with frequent interspecific heterozygosity. Moreover, a major part of the phenotypic diversity of edible citrus results from the initial differentiation between these taxa. Deciphering the phylogenomic structure of citrus germplasm is therefore essential for an efficient utilization of citrus biodiversity in breeding schemes. The objective of this work was to develop a set of species-diagnostic single nucleotide polymorphism (SNP) markers for the four Citrus ancestral taxa covering the nine chromosomes, and to use these markers to infer the phylogenomic structure of secondary species and modern cultivars. Species-diagnostic SNPs were mined from 454 amplicon sequencing of 57 gene fragments from 26 genotypes of the four basic taxa. Of the 1,053 SNPs mined from 28,507 kb sequence, 273 were found to be highly diagnostic for a single basic taxon. Species-diagnostic SNP markers (105) were used to analyse the admixture structure of varieties and rootstocks. This revealed C. maxima introgressions in most of the old and in all recent selections of mandarins, and suggested that C. reticulata × C. maxima reticulation and introgression processes were important in edible mandarin domestication. The large range of phylogenomic constitutions between C. reticulata and C. maxima revealed in mandarins, tangelos, tangors, sweet oranges, sour oranges, grapefruits, and orangelos is favourable for genetic association studies based on phylogenomic structures of the germplasm. Inferred admixture structures were in agreement with previous hypotheses regarding the origin of several secondary species and also revealed the probable origin of several acid citrus varieties. The developed species-diagnostic SNP

  5. Nuclear Species-Diagnostic SNP Markers Mined from 454 Amplicon Sequencing Reveal Admixture Genomic Structure of Modern Citrus Varieties

    PubMed Central

    Curk, Franck; Ancillo, Gema; Ollitrault, Frédérique; Perrier, Xavier; Jacquemoud-Collet, Jean-Pierre; Garcia-Lor, Andres; Navarro, Luis; Ollitrault, Patrick

    2015-01-01

    Most cultivated Citrus species originated from interspecific hybridisation between four ancestral taxa (C. reticulata, C. maxima, C. medica, and C. micrantha) with limited further interspecific recombination due to vegetative propagation. This evolution resulted in admixture genomes with frequent interspecific heterozygosity. Moreover, a major part of the phenotypic diversity of edible citrus results from the initial differentiation between these taxa. Deciphering the phylogenomic structure of citrus germplasm is therefore essential for an efficient utilization of citrus biodiversity in breeding schemes. The objective of this work was to develop a set of species-diagnostic single nucleotide polymorphism (SNP) markers for the four Citrus ancestral taxa covering the nine chromosomes, and to use these markers to infer the phylogenomic structure of secondary species and modern cultivars. Species-diagnostic SNPs were mined from 454 amplicon sequencing of 57 gene fragments from 26 genotypes of the four basic taxa. Of the 1,053 SNPs mined from 28,507 kb sequence, 273 were found to be highly diagnostic for a single basic taxon. Species-diagnostic SNP markers (105) were used to analyse the admixture structure of varieties and rootstocks. This revealed C. maxima introgressions in most of the old and in all recent selections of mandarins, and suggested that C. reticulata × C. maxima reticulation and introgression processes were important in edible mandarin domestication. The large range of phylogenomic constitutions between C. reticulata and C. maxima revealed in mandarins, tangelos, tangors, sweet oranges, sour oranges, grapefruits, and orangelos is favourable for genetic association studies based on phylogenomic structures of the germplasm. Inferred admixture structures were in agreement with previous hypotheses regarding the origin of several secondary species and also revealed the probable origin of several acid citrus varieties. The developed species-diagnostic SNP

  6. An Integrated Sequence-Structure Database incorporating matching mRNA sequence, amino acid sequence and protein three-dimensional structure data.

    PubMed Central

    Adzhubei, I A; Adzhubei, A A; Neidle, S

    1998-01-01

    We have constructed a non-homologous database, termed the Integrated Sequence-Structure Database (ISSD) which comprises the coding sequences of genes, amino acid sequences of the corresponding proteins, their secondary structure and straight phi,psi angles assignments, and polypeptide backbone coordinates. Each protein entry in the database holds the alignment of nucleotide sequence, amino acid sequence and the PDB three-dimensional structure data. The nucleotide and amino acid sequences for each entry are selected on the basis of exact matches of the source organism and cell environment. The current version 1.0 of ISSD is available on the WWW at http://www.protein.bio.msu.su/issd/ and includes 107 non-homologous mammalian proteins, of which 80 are human proteins. The database has been used by us for the analysis of synonymous codon usage patterns in mRNA sequences showing their correlation with the three-dimensional structure features in the encoded proteins. Possible ISSD applications include optimisation of protein expression, improvement of the protein structure prediction accuracy, and analysis of evolutionary aspects of the nucleotide sequence-protein structure relationship. PMID:9399866

  7. Genome sequencing reveals fine scale diversification and reticulation history during speciation in Sus

    PubMed Central

    2013-01-01

    Background: Elucidating the process of speciation requires an in-depth understanding of the evolutionary history of the species in question. Studies that rely upon a limited number of genetic loci do not always reveal actual evolutionary history, and often confuse inferences related to phylogeny and speciation. Whole-genome data, however, can overcome this issue by providing a nearly unbiased window into the patterns and processes of speciation. In order to reveal the complexity of the speciation process, we sequenced and analyzed the genomes of 10 wild pigs, representing morphologically or geographically well-defined species and subspecies of the genus Sus from insular and mainland Southeast Asia, and one African common warthog. Results: Our data highlight the importance of past cyclical climatic fluctuations in facilitating the dispersal and isolation of populations, thus leading to the diversification of suids in one of the most species-rich regions of the world. Moreover, admixture analyses revealed extensive intra- and inter-specific gene flow that explains previous conflicting results obtained from a limited number of loci. We show that these multiple episodes of gene flow resulted from both natural and human-mediated dispersal. Conclusions: Our results demonstrate the importance of past climatic fluctuations and human-mediated translocations in driving and complicating the process of speciation in island Southeast Asia. This case study demonstrates that genomics is a powerful tool to decipher the evolutionary history of a genus, and reveals the complexity of the process of speciation. PMID:24070215

  8. Sequence analysis reveals genomic factors affecting EST-SSR primer performance and polymorphism.

    PubMed

    Chen, Chunxian; Bock, Clive H; Beckman, Tom G

    2014-12-01

    This study explored genomic factors affecting the performance and polymorphism of 340 randomly selected EST-SSR (expressed sequence tag-simple sequence repeat) primers through BLAST of primer sequences to a reference genome. Genotyping showed that 111 failed and 229 succeeded. The failed types included "no peaks" (NP, 69 primers), "weak peaks" (WP, 30), and "multiple peaks" (MP, 12). The successful types were divided into HM (homozygous between two selected parents, 78 primers) and HT (heterozygous in at least one parent, 151 primers). The BLAST revealed primer alignment status, genomic amplicon size (GAS), and genomic and expressed amplicon size difference (ASD). The alignment status was categorized as: "no hits found" (NHF); "multiple partial alignments" (MPA); "single partial alignment" (SPA); "multiple full alignments" (MFA); and "single full alignment" (SFA). NHF and partial alignment (PA) mainly resulted from discrepant nucleotides in contig-derived primers. The ASD separated 247 non-NHF primers into: "deletion", "same size", "insertion", "intron (GAS ≤500)", "intron (GAS >500)", and "error" categories. Most SFA primers were successful. About 88% of "error", 53% of NHF, and 47% of "intron (GAS >500)" primers failed. The "deletion" and "insertion" primers had the higher HT rates, and the "same size" primers had the highest HM rate. Optimized primer selection criteria are discussed.
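
    The genotyping counts in this abstract are internally consistent, which a short tally makes explicit. This is a sketch for checking the arithmetic only: the dictionary layout and variable names are hypothetical, and the NHF count is inferred here as the total minus the 247 non-NHF primers rather than quoted from the abstract.

```python
# Primer counts per genotyping outcome, as quoted in the abstract.
outcomes = {
    "NP": 69,   # failed: no peaks
    "WP": 30,   # failed: weak peaks
    "MP": 12,   # failed: multiple peaks
    "HM": 78,   # succeeded: homozygous between the two parents
    "HT": 151,  # succeeded: heterozygous in at least one parent
}
FAILED_TYPES = ("NP", "WP", "MP")
SUCCESS_TYPES = ("HM", "HT")
NON_NHF_PRIMERS = 247  # primers with at least one BLAST hit

failed = sum(outcomes[k] for k in FAILED_TYPES)
succeeded = sum(outcomes[k] for k in SUCCESS_TYPES)
total = failed + succeeded

# Inferred, not quoted: primers with "no hits found" in the BLAST search.
nhf_primers = total - NON_NHF_PRIMERS

print(f"failed={failed}, succeeded={succeeded}, total={total}")
print(f"success rate: {succeeded / total:.1%}")
print(f"inferred NHF primers: {nhf_primers}")
```

    The per-type counts reproduce the reported 111 failed and 229 successful primers out of 340.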

  9. Multiple genome sequences reveal adaptations of a phototrophic bacterium to sediment microenvironments.

    SciTech Connect

    Oda, Yasuhiro; Larimer, Frank W; Chain, Patrick S. G.; Malfatti, Stephanie; Shin, Maria V; Vergez, Lisa; Hauser, Loren John; Land, Miriam L; Braatsch, Stephan; Beatty, Thomas; Pelletier, Dale A; Schaefer, Amy L; Harwood, Caroline S

    2008-11-01

    The bacterial genus Rhodopseudomonas is comprised of photosynthetic bacteria found widely distributed in aquatic sediments. Members of the genus catalyze hydrogen gas production, carbon dioxide sequestration, and biomass turnover. The genome sequence of Rhodopseudomonas palustris CGA009 revealed a surprising richness of metabolic versatility that would seem to explain its ability to live in a heterogeneous environment like sediment. However, there is considerable genotypic diversity among Rhodopseudomonas isolates. Here we report the complete genome sequences of four additional members of the genus isolated from a restricted geographical area. The sequences confirm that the isolates belong to a coherent taxonomic unit, but they also have significant differences. Whole genome alignments show that the circular chromosomes of the isolates consist of a collinear backbone with a moderate number of genomic rearrangements that impact local gene order and orientation. There are 3,319 genes, 70% of the genes in each genome, shared by four or more strains. Between 10% and 18% of the genes in each genome are strain specific. Some of these genes suggest specialized physiological traits, which we verified experimentally, that include expanded light harvesting, oxygen respiration, and nitrogen fixation capabilities, as well as anaerobic fermentation. Strain-specific adaptations include traits that may be useful in bioenergy applications. This work suggests that against a backdrop of metabolic versatility that is a defining characteristic of Rhodopseudomonas, different ecotypes have evolved to take advantage of physical and chemical conditions in sediment microenvironments that are too small for human observation.

  10. Sequencing papaya X and Yh chromosomes reveals molecular basis of incipient sex chromosome evolution.

    PubMed

    Wang, Jianping; Na, Jong-Kuk; Yu, Qingyi; Gschwend, Andrea R; Han, Jennifer; Zeng, Fanchang; Aryal, Rishi; VanBuren, Robert; Murray, Jan E; Zhang, Wenli; Navajas-Pérez, Rafael; Feltus, F Alex; Lemke, Cornelia; Tong, Eric J; Chen, Cuixia; Wai, Ching Man; Singh, Ratnesh; Wang, Ming-Li; Min, Xiang Jia; Alam, Maqsudul; Charlesworth, Deborah; Moore, Paul H; Jiang, Jiming; Paterson, Andrew H; Ming, Ray

    2012-08-21

    Sex determination in papaya is controlled by a recently evolved XY chromosome pair, with two slightly different Y chromosomes controlling the development of males (Y) and hermaphrodites (Y(h)). To study the events of early sex chromosome evolution, we sequenced the hermaphrodite-specific region of the Y(h) chromosome (HSY) and its X counterpart, yielding an 8.1-megabase (Mb) HSY pseudomolecule, and a 3.5-Mb sequence for the corresponding X region. The HSY is larger than the X region, mostly due to retrotransposon insertions. The papaya HSY differs from the X region by two large-scale inversions, the first of which likely caused the recombination suppression between the X and Y(h) chromosomes, followed by numerous additional chromosomal rearrangements. Altogether, including the X and/or HSY regions, 124 transcription units were annotated, including 50 functional pairs present in both the X and HSY. Ten HSY genes had functional homologs elsewhere in the papaya autosomal regions, suggesting movement of genes onto the HSY, whereas the X region had none. Sequence divergence between 70 transcripts shared by the X and HSY revealed two evolutionary strata in the X chromosome, corresponding to the two inversions on the HSY, the older of which evolved about 7.0 million years ago. Gene content differences between the HSY and X are greatest in the older stratum, whereas the gene content and order of the collinear regions are identical. Our findings support theoretical models of early sex chromosome evolution.

  11. Multilocus sequence analysis reveals high genetic diversity in clinical isolates of Burkholderia cepacia complex from India.

    PubMed

    Gautam, Vikas; Patil, Prashant P; Kumar, Sunil; Midha, Samriti; Kaur, Mandeep; Kaur, Satinder; Singh, Meenu; Mali, Swapna; Shastri, Jayanthi; Arora, Anita; Ray, Pallab; Patil, Prabhu B

    2016-10-21

    Burkholderia cepacia complex (Bcc) is a complex group of bacteria causing opportunistic infections in immunocompromised and cystic fibrosis (CF) patients. Herein, we report multilocus sequence typing and analysis of 57 clinical isolates of Bcc collected over a period of seven years (2005-2012) from several hospitals across India. A total of 21 sequence types (STs), including two STs from cystic fibrosis patients' isolates and twelve novel STs, were identified in the population, reflecting the extent of genetic diversity. Multilocus sequence analysis revealed two lineages in the population, a major lineage belonging to B. cenocepacia and a minor lineage belonging to B. cepacia. Split-decomposition analysis suggests an absence of interspecies recombination and indicates that intraspecies recombination contributed to generating genotypic diversity amongst isolates. Further linkage disequilibrium analysis indicates that recombination takes place at a low frequency, which is not sufficient to break down the clonal relationship. This knowledge of the genetic structure of the Bcc population from a rapidly developing country will be invaluable in the epidemiology, surveillance and understanding of the global diversity of this group of pathogens.

  12. Multilocus sequence analysis reveals high genetic diversity in clinical isolates of Burkholderia cepacia complex from India

    PubMed Central

    Gautam, Vikas; Patil, Prashant P.; Kumar, Sunil; Midha, Samriti; Kaur, Mandeep; Kaur, Satinder; Singh, Meenu; Mali, Swapna; Shastri, Jayanthi; Arora, Anita; Ray, Pallab; Patil, Prabhu B.

    2016-01-01

    Burkholderia cepacia complex (Bcc) is a complex group of bacteria causing opportunistic infections in immunocompromised and cystic fibrosis (CF) patients. Herein, we report multilocus sequence typing and analysis of 57 clinical isolates of Bcc collected over a period of seven years (2005–2012) from several hospitals across India. A total of 21 sequence types (STs), including two STs from cystic fibrosis patients’ isolates and twelve novel STs, were identified in the population, reflecting the extent of genetic diversity. Multilocus sequence analysis revealed two lineages in the population, a major lineage belonging to B. cenocepacia and a minor lineage belonging to B. cepacia. Split-decomposition analysis suggests an absence of interspecies recombination and indicates that intraspecies recombination contributed to generating genotypic diversity amongst isolates. Further linkage disequilibrium analysis indicates that recombination takes place at a low frequency, which is not sufficient to break down the clonal relationship. This knowledge of the genetic structure of the Bcc population from a rapidly developing country will be invaluable in the epidemiology, surveillance and understanding of the global diversity of this group of pathogens. PMID:27767197

  13. Not all order memory is equal: Test demands reveal dissociations in memory for sequence information.

    PubMed

    Jonker, Tanya R; MacLeod, Colin M

    2017-02-01

    Remembering the order of a sequence of events is a fundamental feature of episodic memory. Indeed, a number of formal models represent temporal context as part of the memory system, and memory for order has been researched extensively. Yet, the nature of the code(s) underlying sequence memory is still relatively unknown. Across 4 experiments that manipulated encoding task, we found evidence for 3 dissociable facets of order memory. Experiment 1 introduced a test requiring a judgment of which of 2 alternatives had immediately followed a word during encoding. This measure revealed better retention of interitem associations following relational encoding (silent reading) than relatively item-specific encoding (judging referent size), a pattern consistent with that observed in previous research using order reconstruction tests. In sharp contrast, Experiment 2 demonstrated the reverse pattern: Memory for the studied order of 2 sequentially presented items was actually better following item-specific encoding than following relational encoding. Experiment 3 reproduced this dissociation in a single experiment using both tests. Experiment 4 extended these findings by further dissociating the roles of relational encoding and item strength in the 2 tests. Taken together, these results indicate that memory for event sequence is influenced by (a) interitem associations, (b) the emphasized directionality of an association, and (c) an item's strength independent of other items. Memory for order is more complicated than has been portrayed in theories of memory and its nuances should be carefully considered when designing tests and models of temporal and relational memory.

  14. Genome Sequencing of the Behavior Manipulating Virus LbFV Reveals a Possible New Virus Family

    PubMed Central

    Lepetit, David; Gillet, Benjamin; Hughes, Sandrine; Kraaijeveld, Ken

    2016-01-01

    Parasites are sometimes able to manipulate the behavior of their hosts. However, the molecular cues underlying this phenomenon are poorly documented. We previously reported that the parasitoid wasp Leptopilina boulardi, which develops from Drosophila larvae, is often infected by an inherited DNA virus. In addition to being maternally transmitted, the virus benefits from horizontal transmission in superparasitized larvae (Drosophila that have been parasitized several times). Interestingly, the virus forces infected females to lay eggs in already parasitized larvae, thus increasing the chance of being horizontally transmitted. In a first step towards the identification of virus genes responsible for the behavioral manipulation, we present here the genome sequence of the virus, called LbFV. The sequencing revealed that its genome contains a homologous repeat sequence (hrs) found in eight regions in the genome. The presence of this hrs may explain the genomic plasticity that we observed for this genome. The genome of LbFV encodes 108 ORFs, most of them having no homologs in public databases. The virus is, however, related to Hytrosaviridae, although distantly. LbFV may thus represent a member of a new virus family. Several genes of LbFV were captured from eukaryotes, including two anti-apoptotic genes. More surprisingly, we found that LbFV captured from an ancestral wasp a protein with a Jumonji domain. This gene was afterwards duplicated in the virus genome. We hypothesized that this gene may be involved in manipulating the expression of wasp genes, and possibly in manipulating its behavior. PMID:28173110

  15. Single Nucleus Genome Sequencing Reveals High Similarity among Nuclei of an Endomycorrhizal Fungus

    PubMed Central

    Zhang, Zhonghua; Ivanov, Sergey; Saunders, Diane G. O.; Mu, Desheng; Pang, Erli; Cao, Huifen; Cha, Hwangho; Lin, Tao; Zhou, Qian; Shang, Yi; Li, Ying; Sharma, Trupti; van Velzen, Robin; de Ruijter, Norbert; Aanen, Duur K.; Win, Joe; Kamoun, Sophien; Bisseling, Ton; Geurts, René; Huang, Sanwen

    2014-01-01

    Nuclei of arbuscular endomycorrhizal fungi have been described as highly diverse due to their asexual nature and absence of a single cell stage with only one nucleus. This has raised fundamental questions concerning speciation, selection and transmission of the genetic make-up to next generations. Although this concept has become textbook knowledge, it is only based on studying a few loci, including 45S rDNA. To provide a more comprehensive insight into the genetic makeup of arbuscular endomycorrhizal fungi, we applied de novo genome sequencing of individual nuclei of Rhizophagus irregularis. This revealed a surprisingly low level of polymorphism between nuclei. In contrast, within a nucleus, the 45S rDNA repeat unit turned out to be highly diverged. This finding demystifies a long-lasting hypothesis on the complex genetic makeup of arbuscular endomycorrhizal fungi. Subsequent genome assembly resulted in the first draft reference genome sequence of an arbuscular endomycorrhizal fungus. Its length is 141 Mbps, representing over 27,000 protein-coding gene models. We used the genomic sequence to reinvestigate the phylogenetic relationships of Rhizophagus irregularis with other fungal phyla. This unambiguously demonstrated that Glomeromycota are more closely related to Mucoromycotina than to its postulated sister Dikarya. PMID:24415955

  16. Whole-Genome Sequencing Reveals Genetic Variation in the Asian House Rat

    PubMed Central

    Teng, Huajing; Zhang, Yaohua; Shi, Chengmin; Mao, Fengbiao; Hou, Lingling; Guo, Hongling; Sun, Zhongsheng; Zhang, Jianxu

    2016-01-01

    Whole-genome sequencing of wild-derived rat species can provide novel genomic resources, which may help decipher the genetics underlying complex phenotypes. As a notorious pest, reservoir of human pathogens, and colonizer, the Asian house rat, Rattus tanezumi, is successfully adapted to its habitat. However, little is known regarding genetic variation in this species. In this study, we identified over 41,000,000 single-nucleotide polymorphisms, plus insertions and deletions, through whole-genome sequencing and bioinformatics analyses. Moreover, we identified over 12,000 structural variants, including 143 chromosomal inversions. Further functional analyses revealed several fixed nonsense mutations associated with infection and immunity-related adaptations, and a number of fixed missense mutations that may be related to anticoagulant resistance. A genome-wide scan for loci under selection identified various genes related to neural activity. Our whole-genome sequencing data provide a genomic resource for future genetic studies of the Asian house rat species and have the potential to facilitate understanding of the molecular adaptations of rats to their ecological niches. PMID:27172215

  17. Serial number tagging reveals a prominent sequence preference of retrotransposon integration.

    PubMed

    Chatterjee, Atreyi Ghatak; Esnault, Caroline; Guo, Yabin; Hung, Stevephen; McQueen, Philip G; Levin, Henry L

    2014-07-01

    Transposable elements (TE) have both negative and positive impact on the biology of their host. As a result, a balance is struck between the host and the TE that relies on directing integration to specific genome territories. The extraordinary capacity of DNA sequencing can create ultra dense maps of integration that are being used to study the mechanisms that position integration. Unfortunately, the great increase in the numbers of insertion sites detected comes with the cost of not knowing which positions are rare targets and which sustain high numbers of insertions. To address this problem we developed the serial number system, a TE tagging method that measures the frequency of integration at single nucleotide positions. We sequenced 1 million insertions of retrotransposon Tf1 in the genome of Schizosaccharomyces pombe and obtained the first profile of integration with frequencies for each individual position. Integration levels at individual nucleotides varied over two orders of magnitude and revealed that sequence recognition plays a key role in positioning integration. The serial number system is a general method that can be applied to determine precise integration maps for retroviruses and gene therapy vectors.
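
    The counting idea behind per-position integration frequencies can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not describe the tag design or read format, so the tuple layout `(chromosome, position, strand, serial)` and the function name `integration_frequencies` are hypothetical. The sketch assumes each insertion event carries a distinguishing tag, so that reads sharing both site and tag are counted once.

```python
from collections import defaultdict

def integration_frequencies(reads):
    """Count independent insertion events per genomic site.

    `reads` is an iterable of (chromosome, position, strand, serial) tuples
    (a hypothetical format). Reads with the same site and the same serial tag
    are assumed to be repeated observations of one insertion event.
    """
    serials_at_site = defaultdict(set)
    for chrom, pos, strand, serial in reads:
        serials_at_site[(chrom, pos, strand)].add(serial)
    # The number of distinct tags at a site estimates its insertion count.
    return {site: len(serials) for site, serials in serials_at_site.items()}

# Made-up example reads, not data from the study.
reads = [
    ("chr1", 1050, "+", "ACGTACGT"),
    ("chr1", 1050, "+", "ACGTACGT"),  # same tag: counted once
    ("chr1", 1050, "+", "TTGCAAGC"),  # different tag: a second event
    ("chr2", 77, "-", "GGATCCAA"),
]
freqs = integration_frequencies(reads)
```

    Without the tag, the three reads at chr1:1050 could not be separated into two independent events and one repeated observation.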

  18. Revealing glacier flow and surge dynamics from animated satellite image sequences: examples from the Karakoram

    NASA Astrophysics Data System (ADS)

    Paul, F.

    2015-04-01

    Although animated images are very popular on the Internet, they have so far found only limited use for glaciological applications. With long time-series of satellite images becoming increasingly available and glaciers being well recognized for their rapid changes and variable flow dynamics, animated sequences of multiple satellite images reveal glacier dynamics in a time-lapse mode, making the otherwise slow changes of glacier movement visible and understandable for a wide public. For this study animated image sequences were created from freely available image quick-looks of orthorectified Landsat scenes for four regions in the central Karakoram mountain range. The animations play automatically in a web-browser and might help to demonstrate glacier flow dynamics for educational purposes. The animations revealed highly complex patterns of glacier flow and surge dynamics over a 15-year time period (1998-2013). In contrast to other regions, surging glaciers in the Karakoram are often small (around 10 km²), steep, debris free, and advance for several years at comparably low annual rates (a few hundred m a⁻¹). The advance periods of individual glaciers are generally out of phase, indicating a limited climatic control on their dynamics. On the other hand, nearly all other glaciers in the region are either stable or slightly advancing, indicating balanced or even positive mass budgets over the past few years to decades.

  19. Whole Genome Sequencing Reveals Potential New Targets for Improving Nitrogen Uptake and Utilization in Sorghum bicolor

    PubMed Central

    Massel, Karen; Campbell, Bradley C.; Mace, Emma S.; Tai, Shuaishuai; Tao, Yongfu; Worland, Belinda G.; Jordan, David R.; Botella, Jose R.; Godwin, Ian D.

    2016-01-01

    Nitrogen (N) fertilizers are a major agricultural input where more than 100 million tons are supplied annually. Cereals are particularly inefficient at soil N uptake, where the unrecovered nitrogen causes serious environmental damage. Sorghum bicolor (sorghum) is an important cereal crop, particularly in resource-poor semi-arid regions, and is known to have a high NUE in comparison to other major cereals under limited N conditions. This study provides the first assessment of genetic diversity and signatures of selection across 230 fully sequenced genes putatively involved in the uptake and utilization of N from a diverse panel of sorghum lines. This comprehensive analysis reveals an overall reduction in diversity as a result of domestication and a total of 128 genes displaying signatures of purifying selection, thereby revealing possible gene targets to improve NUE in sorghum and cereals alike. A number of key genes appear to have been involved in selective sweeps, reducing their sequence diversity. The ammonium transporter (AMT) genes generally had low allelic diversity, whereas a substantial number of nitrate/peptide transporter 1 (NRT1/PTR) genes had higher nucleotide diversity in domesticated germplasm. Interestingly, members of the distinct race Guinea margaritiferum contained a number of unique alleles, and along with the wild sorghum species, represent a rich resource of new variation for plant improvement of NUE in sorghum. PMID:27826302

  20. Complete amino acid sequence and structure characterization of the taste-modifying protein, miraculin.

    PubMed

    Theerasilp, S; Hitotsuya, H; Nakajo, S; Nakaya, K; Nakamura, Y; Kurihara, Y

    1989-04-25

    The taste-modifying protein, miraculin, has the unusual property of modifying sour taste into sweet taste. The complete amino acid sequence of miraculin purified from miracle fruits by a newly developed method (Theerasilp, S., and Kurihara, Y. (1988) J. Biol. Chem. 263, 11536-11539) was determined by an automatic Edman degradation method. Miraculin was a single polypeptide with 191 amino acid residues. The calculated molecular weight based on the amino acid sequence and the carbohydrate content (13.9%) was 24,600. Asn-42 and Asn-186 were linked N-glycosidically to carbohydrate chains. High homology was found between the amino acid sequences of miraculin and soybean trypsin inhibitor.
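
    The reported figures can be cross-checked with a short calculation. The sketch below assumes that the 13.9% carbohydrate content is a fraction of the total glycoprotein weight; the abstract does not state this explicitly, and the variable names are illustrative only.

```python
# Values taken from the abstract.
total_mw = 24_600             # calculated molecular weight of the glycoprotein
carbohydrate_fraction = 0.139
n_residues = 191

# Assumption: the carbohydrate percentage is by weight of the whole molecule.
carbohydrate_mw = total_mw * carbohydrate_fraction
polypeptide_mw = total_mw - carbohydrate_mw
mean_residue_mw = polypeptide_mw / n_residues

print(f"polypeptide: {polypeptide_mw:.1f}, per residue: {mean_residue_mw:.1f}")
```

    Under that assumption the polypeptide accounts for roughly 21,200 of the total, or about 111 per residue, which is in the usual range for an average amino acid residue.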

  1. Metagenome, metatranscriptome and single-cell sequencing reveal microbial response to Deepwater Horizon oil spill.

    PubMed

    Mason, Olivia U; Hazen, Terry C; Borglin, Sharon; Chain, Patrick S G; Dubinsky, Eric A; Fortney, Julian L; Han, James; Holman, Hoi-Ying N; Hultman, Jenni; Lamendella, Regina; Mackelprang, Rachel; Malfatti, Stephanie; Tom, Lauren M; Tringe, Susannah G; Woyke, Tanja; Zhou, Jizhong; Rubin, Edward M; Jansson, Janet K

    2012-09-01

    The Deepwater Horizon oil spill in the Gulf of Mexico resulted in a deep-sea hydrocarbon plume that caused a shift in the indigenous microbial community composition with unknown ecological consequences. Early in the spill history, a bloom of uncultured, thus uncharacterized, members of the Oceanospirillales was previously detected, but their role in oil disposition was unknown. Here our aim was to determine the functional role of the Oceanospirillales and other active members of the indigenous microbial community using deep sequencing of community DNA and RNA, as well as single-cell genomics. Shotgun metagenomic and metatranscriptomic sequencing revealed that genes for motility, chemotaxis and aliphatic hydrocarbon degradation were significantly enriched and expressed in the hydrocarbon plume samples compared with uncontaminated seawater collected from plume depth. In contrast, although genes coding for degradation of more recalcitrant compounds, such as benzene, toluene, ethylbenzene, total xylenes and polycyclic aromatic hydrocarbons, were identified in the metagenomes, they were expressed at low levels, or not at all based on analysis of the metatranscriptomes. Isolation and sequencing of two Oceanospirillales single cells revealed that both cells possessed genes coding for n-alkane and cycloalkane degradation. Specifically, the near-complete pathway for cyclohexane oxidation in the Oceanospirillales single cells was elucidated and supported by both metagenome and metatranscriptome data. The draft genome also included genes for chemotaxis, motility and nutrient acquisition strategies that were also identified in the metagenomes and metatranscriptomes. These data point towards a rapid response of members of the Oceanospirillales to aliphatic hydrocarbons in the deep sea.

  2. Metagenome, metatranscriptome and single-cell sequencing reveal microbial response to Deepwater Horizon oil spill

    PubMed Central

    Mason, Olivia U; Hazen, Terry C; Borglin, Sharon; Chain, Patrick S G; Dubinsky, Eric A; Fortney, Julian L; Han, James; Holman, Hoi-Ying N; Hultman, Jenni; Lamendella, Regina; Mackelprang, Rachel; Malfatti, Stephanie; Tom, Lauren M; Tringe, Susannah G; Woyke, Tanja; Zhou, Jizhong; Rubin, Edward M; Jansson, Janet K

    2012-01-01

    The Deepwater Horizon oil spill in the Gulf of Mexico resulted in a deep-sea hydrocarbon plume that caused a shift in the indigenous microbial community composition with unknown ecological consequences. Early in the spill history, a bloom of uncultured, thus uncharacterized, members of the Oceanospirillales was previously detected, but their role in oil disposition was unknown. Here our aim was to determine the functional role of the Oceanospirillales and other active members of the indigenous microbial community using deep sequencing of community DNA and RNA, as well as single-cell genomics. Shotgun metagenomic and metatranscriptomic sequencing revealed that genes for motility, chemotaxis and aliphatic hydrocarbon degradation were significantly enriched and expressed in the hydrocarbon plume samples compared with uncontaminated seawater collected from plume depth. In contrast, although genes coding for degradation of more recalcitrant compounds, such as benzene, toluene, ethylbenzene, total xylenes and polycyclic aromatic hydrocarbons, were identified in the metagenomes, they were expressed at low levels, or not at all based on analysis of the metatranscriptomes. Isolation and sequencing of two Oceanospirillales single cells revealed that both cells possessed genes coding for n-alkane and cycloalkane degradation. Specifically, the near-complete pathway for cyclohexane oxidation in the Oceanospirillales single cells was elucidated and supported by both metagenome and metatranscriptome data. The draft genome also included genes for chemotaxis, motility and nutrient acquisition strategies that were also identified in the metagenomes and metatranscriptomes. These data point towards a rapid response of members of the Oceanospirillales to aliphatic hydrocarbons in the deep sea. PMID:22717885

  3. Detection and isolation of nucleic acid sequences using a bifunctional hybridization probe

    DOEpatents

    Lucas, Joe N.; Straume, Tore; Bogen, Kenneth T.

    2000-01-01

    A method for detecting and isolating a target sequence in a sample of nucleic acids is provided using a bifunctional hybridization probe capable of hybridizing to the target sequence that includes a detectable marker and a first complexing agent capable of forming a binding pair with a second complexing agent. A kit is also provided for detecting a target sequence in a sample of nucleic acids using a bifunctional hybridization probe according to this method.

  4. Analysis of TP53 mutation spectra reveals the fingerprint of the potent environmental carcinogen, aristolochic acid.

    PubMed

    Hollstein, M; Moriya, M; Grollman, A P; Olivier, M

    2013-01-01

    Genetic alterations in cancer tissues may reflect the mutational fingerprint of environmental carcinogens. Here we review the pieces of evidence that support the role of aristolochic acid (AA) in inducing a mutational fingerprint in the tumor suppressor gene TP53 in urothelial carcinomas of the upper urinary tract (UUT). Exposure to AA, a nitrophenanthrene carboxylic acid present in certain herbal remedies and in flour prepared from wheat grain contaminated with seeds of Aristolochia clematitis, has been linked to chronic nephropathy and UUT. TP53 mutations in UUT of individuals exposed to AA reveal a unique pattern of mutations characterized by A to T transversions on the non-transcribed strand, which cluster at hotspots rarely mutated in other cancers. This unusual pattern, originally discovered in UUTs from two different populations, one in Taiwan, and one in the Balkans, has been reproduced experimentally by treating mouse cells that harbor human TP53 sequences with AA. The convergence of molecular epidemiological and experimental data establishes a clear causal association between exposure to the human carcinogen AA and UUT. Despite bans on the sale of herbs containing AA, their use continues, raising global public health concern and an urgent need to identify populations at risk.

  5. Global geno-proteomic analysis reveals cross-continental sequence conservation and druggable sites among influenza virus polymerases.

    PubMed

    Babar, Mustafeez Mujtaba; Zaidi, Najam-us-Sahar Sadaf; Tahir, Muhammad

    2014-12-01

    Influenza virus is one of the major causes of mortality and morbidity associated with respiratory diseases. The high rate of mutation in the viral proteome provides it with the ability to survive in a variety of host species. This property helps it in maintaining and developing its pathogenicity, transmission and drug resistance. Alternate drug targets, particularly the internal proteins, can potentially be exploited for addressing the resistance issues. In the current analysis, the degree of conservation of influenza virus polymerases has been studied as one of the essential elements for establishing its candidature as a potential target of antiviral therapy. We analyzed more than 130,000 nucleotide and amino acid sequences by classifying them on the basis of continental presence of host organisms. Computational analyses including genetic polymorphism study, mutation pattern determination, molecular evolution and geophylogenetic analysis were performed to establish the high degree of conservation among the sequences. These studies established the polymerases, in particular PB1, as highly conserved proteins. Moreover, we mapped the conservation percentage on the tertiary structures of proteins to identify the conserved, druggable sites. The research study, hence, revealed that the influenza virus polymerases are highly conserved (95-99%) proteins with a very slow mutation rate. Potential drug binding sites on various polymerases have also been reported. A scheme for drug target candidate development that can be applied to rapidly mutating proteins has been presented. Moreover, the research output can help in designing new therapeutic molecules against the identified targets.
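
    One common way to express a per-site conservation percentage is the share of sequences carrying the most frequent residue at each alignment column. The sketch below illustrates that definition only; the abstract does not say how the authors computed conservation, and the function name and toy alignment are hypothetical.

```python
from collections import Counter

def column_conservation(aligned_seqs):
    """Return, for each alignment column, the percentage of sequences
    that carry the most frequent residue at that column."""
    length = len(aligned_seqs[0])
    if any(len(seq) != length for seq in aligned_seqs):
        raise ValueError("all aligned sequences must have the same length")
    percentages = []
    for column in zip(*aligned_seqs):
        top_count = Counter(column).most_common(1)[0][1]
        percentages.append(100.0 * top_count / len(column))
    return percentages

# Toy alignment of four short peptides, not influenza data.
alignment = ["MKTL", "MKSL", "MRTL", "MKTL"]
conservation = column_conservation(alignment)
```

    Values computed this way could then be assigned to the corresponding residues of a structure to highlight conserved surface patches.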

  6. Formation Sequences of Iron Minerals in the Acidic Alteration Products and Variation of Hydrothermal Fluid Conditions

    NASA Astrophysics Data System (ADS)

    Isobe, H.; Yoshizawa, M.

    2008-12-01

    Iron minerals have an important role in environmental issues not only on the Earth but also on other terrestrial planets. Iron mineral species related to alteration products of primary minerals with surface or subsurface fluids are characterized by temperature, acidity and redox conditions of the fluids. We can see various iron-bearing alteration products around fumaroles in geothermal/volcanic areas. In this study, zonal structures of iron minerals in alteration products of the geothermal area are observed to elucidate temporal and spatial variation of hydrothermal fluids. Alteration of the pyroxene-amphibole andesite of Garan-dake volcano, Oita, Japan occurs by the acidic hydrothermal fluid to form cristobalite leaching out elements other than Si. Hand specimens with unaltered or weakly altered core and cristobalite crust show various sequences of layers. XRD analysis revealed that the alteration degree is represented by abundance of cristobalite. Intermediately altered layers are characterized by occurrence including alunite, pyrite, kaolinite, goethite and hematite. A specimen with reddish brown core surrounded by cristobalite-rich white crust has brown colored layers at the boundary of core and the crust. Reddish core is characterized by occurrence of crystalline hematite by XRD. Another hand specimen has light gray core, which represents reduced conditions, and white cristobalite crust with light brown and reddish brown layers of ferric iron minerals between the core and the crust. On the other hand, hornblende crystals, typical ferrous iron-bearing mineral of the host rock, are well preserved in some samples with strongly decolorized cristobalite-rich groundmass. Hydrothermal alteration experiments of iron-rich basaltic material show that iron mineral species depend on acidity and temperature of the fluid. Oxidation states of the iron-bearing mineral species are strongly influenced by the acidity and redox conditions. Variations of alteration

  7. Genetic variability of mutans streptococci revealed by wide whole-genome sequencing

    PubMed Central

    2013-01-01

    Background Mutans streptococci are a group of bacteria significantly contributing to tooth decay. Their genetic variability is however still not well understood. Results Genomes of 6 clinical S. mutans isolates of different origins, one isolate of S. sobrinus (DSM 20742) and one isolate of S. ratti (DSM 20564) were sequenced and comparatively analyzed. Genome alignment revealed a mosaic-like structure of genome arrangement. Genes related to pathogenicity are found to have high variations among the strains, whereas genes for oxidative stress resistance are well conserved, indicating the importance of this trait in the dental biofilm community. Analysis of genome-scale metabolic networks revealed significant differences in 42 pathways. A striking dissimilarity is the unique presence of two lactate oxidases in S. sobrinus DSM 20742, probably indicating an unusual capability of this strain in producing H2O2 and expanding its ecological niche. In addition, lactate oxidases may form with other enzymes a novel energetic pathway in S. sobrinus DSM 20742 that can remedy its deficiency in citrate utilization pathway. Using 67 S. mutans genomes currently available including the strains sequenced in this study, we estimated the theoretical core genome size of S. mutans, and performed modeling of S. mutans pan-genome by applying different fitting models. An “open” pan-genome was inferred. Conclusions The comparative genome analyses revealed diversities in the mutans streptococci group, especially with respect to the virulence related genes and metabolic pathways. The results are helpful for better understanding the evolution and adaptive mechanisms of these oral pathogen microorganisms and for combating them. PMID:23805886

  8. Whale phylogeny and rapid radiation events revealed using novel retroposed elements and their flanking sequences

    PubMed Central

    2011-01-01

    Background A diversity of hypotheses have been proposed based on both morphological and molecular data to reveal phylogenetic relationships within the order Cetacea (dolphins, porpoises, and whales), and great progress has been made in the past two decades. However, there is still some controversy concerning relationships among certain cetacean taxa such as river dolphins and delphinoid species, which needs to be further addressed with more markers in an effort to address unresolved portions of the phylogeny. Results An analysis of additional SINE insertions and SINE-flanking sequences supported the monophyly of the order Cetacea as well as Odontocete, Delphinoidea (Delphinidae + Phocoenidae + Monodontidae), and Delphinidae. A sister relationship between Delphinidae and Phocoenidae + Monodontidae was supported, and members of classical river dolphins and the genera Tursiops and Stenella were found to be paraphyletic. Estimates of divergence times revealed rapid divergences of basal Odontocete lineages in the Oligocene and Early Miocene, and a recent rapid diversification of Delphinidae in the Middle-Late Miocene and Pliocene within a narrow time frame. Conclusions Several novel SINEs were found to differentiate Delphinidae from the other two families (Monodontidae and Phocoenidae), whereas the sister grouping of the latter two families with exclusion of Delphinidae was further revealed using the SINE-flanking sequences. Interestingly, some anomalous PCR amplification patterns of SINE insertions were detected, which can be explained as the result of potential ancestral SINE polymorphisms and incomplete lineage sorting. Although a few loci were potentially anomalous, this study demonstrated that the SINE-based approach is a powerful tool in phylogenetic studies. Identifying additional SINE elements that resolve the relationships in the superfamily Delphinoidea and family Delphinidae will be important steps forward in completely resolving cetacean phylogenetic

  9. Targeted Sequencing Reveals Large-Scale Sequence Polymorphism in Maize Candidate Genes for Biomass Production and Composition.

    PubMed

    Muraya, Moses M; Schmutzer, Thomas; Ulpinnis, Chris; Scholz, Uwe; Altmann, Thomas

    2015-01-01

    A major goal of maize genomic research is to identify sequence polymorphisms responsible for phenotypic variation in traits of economic importance. Large-scale detection of sequence variation is critical for linking genes, or genomic regions, to phenotypes. However, due to its size and complexity, it remains expensive to generate whole genome sequences of sufficient coverage for divergent maize lines, even with access to next generation sequencing (NGS) technology. Because methods involving reduction of genome complexity, such as genotyping-by-sequencing (GBS), assess only a limited fraction of sequence variation, targeted sequencing of selected genomic loci offers an attractive alternative. We therefore designed a sequence capture assay to target 29 Mb genomic regions and surveyed a total of 4,648 genes possibly affecting biomass production in 21 diverse inbred maize lines (7 flints, 14 dents). Captured and enriched genomic DNA was sequenced using the 454 NGS platform to 19.6-fold average depth coverage, and a broad evaluation of read alignment and variant calling methods was performed to select optimal procedures for variant discovery. Sequence alignment with the B73 reference and de novo assembly identified 383,145 putative single nucleotide polymorphisms (SNPs), of which 42,685 were non-synonymous alterations and 7,139 caused frameshifts. Presence/absence variation (PAV) of genes was also detected. We found that substantial sequence variation exists among genomic regions targeted in this study, which was particularly evident within coding regions. This diversification has the potential to broaden functional diversity and generate phenotypic variation that may lead to new adaptations and the modification of important agronomic traits. Further, annotated SNPs identified here will serve as useful genetic tools and as candidates in searches for phenotype-altering DNA variation. 
In summary, we demonstrated that sequencing of captured DNA is a powerful approach for

  10. Molecular cloning, encoding sequence, and expression of vaccinia virus nucleic acid-dependent nucleoside triphosphatase gene.

    PubMed Central

    Rodriguez, J F; Kahn, J S; Esteban, M

    1986-01-01

    A rabbit poxvirus genomic library contained within the expression vector lambda gt11 was screened with polyclonal antiserum prepared against vaccinia virus nucleic acid-dependent nucleoside triphosphatase (NTPase)-I enzyme. Five positive phage clones containing from 0.72- to 2.5-kilobase-pair (kbp) inserts expressed a beta-galactosidase fusion protein that was reactive by immunoblotting with the NTPase-I antibody. Hybridization analysis allowed the location of this gene within the vaccinia HindIIID restriction fragment. From the known nucleotide sequence of the 16-kbp vaccinia HindIIID fragment, we identified a region that contains a 1896-base open reading frame coding for a 631-amino acid protein. Analysis of the complete sequence revealed a highly basic protein, with hydrophilic COOH and NH2 termini, various hydrophobic domains, and no significant homology to other known proteins. Translational studies demonstrate that NTPase-I belongs to a late class of viral genes. This protein is highly conserved among Orthopoxviruses. PMID:3025846
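
    The stated reading-frame length and protein length are mutually consistent, as the following sketch shows. It assumes that the 1896-base figure includes the stop codon, which the abstract does not say explicitly; the function name is illustrative.

```python
def orf_length_bases(n_amino_acids, include_stop=True):
    """Length in bases of an open reading frame encoding `n_amino_acids`
    residues: three bases per codon, plus one stop codon if counted."""
    return 3 * n_amino_acids + (3 if include_stop else 0)

# 631 codons * 3 bases = 1893 bases, plus a 3-base stop codon.
with_stop = orf_length_bases(631)
without_stop = orf_length_bases(631, include_stop=False)
```

    Here `with_stop` equals 1896, matching the abstract, while the coding codons alone span 1893 bases.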

  11. Partial amino acid sequences around sulfhydryl groups of soybean beta-amylase.

    PubMed

    Nomura, K; Mikami, B; Morita, Y

    1987-08-01

    Sulfhydryl (SH) groups of soybean beta-amylase were modified with 5-(iodoaceto-amidoethyl)aminonaphthalene-1-sulfonate (IAEDANS) and the SH-containing peptides exhibiting fluorescence were purified after chymotryptic digestion of the modified enzyme. The sequence analysis of the peptides derived from the modification of all SH groups in the denatured enzyme revealed the existence of six SH groups, in contrast to five reported previously. One of them was found to have extremely low reactivity toward SH-reagents without reduction. In the native state, IAEDANS reacted with 2 mol of SH groups per mol of the enzyme (SH1 and SH2) accompanied with inactivation of the enzyme owing to the modification of SH2 located near the active site of this enzyme. The selective modification of SH2 with IAEDANS was attained after the blocking of SH1 with 5,5'-dithiobis-(2-nitrobenzoic acid). The amino acid sequences of the peptides containing SH1 and SH2 were determined to be Cys-Ala-Asn-Pro-Gln and His-Gln-Cys-Gly-Gly-Asn-Val-Gly-Asp-Ile-Val-Asn-Ile-Pro-Ile-Pro-Gln-Trp, respectively.

  12. Ecological roles of dominant and rare prokaryotes in acid mine drainage revealed by metagenomics and metatranscriptomics

    SciTech Connect

    Hua, Zheng-Shuang; Han, Yu-Jiao; Chen, Lin-Xing; Liu, Jun; Hu, Min; Li, Sheng-Jin; Kuang, Jia-Liang; Chain, Patrick SG; Huang, Li-Nan; Shu, Wen-Sheng

    2014-11-07

    Here we report that high-throughput sequencing is expanding our knowledge of microbial diversity in the environment. Still, understanding the metabolic potentials and ecological roles of rare and uncultured microbes in natural communities remains a major challenge. To this end, we applied a ‘divide and conquer’ strategy that partitioned a massive metagenomic data set (>100 Gbp) into subsets based on K-mer frequency in sequence assembly to a low-diversity acid mine drainage (AMD) microbial community and, by integrating with an additional metatranscriptomic assembly, successfully obtained 11 draft genomes most of which represent yet uncultured and/or rare taxa (relative abundance <1%). We report the first genome of a naturally occurring Ferrovum population (relative abundance >90%) and its metabolic potentials and gene expression profile, providing initial molecular insights into the ecological role of these lesser known, but potentially important, microorganisms in the AMD environment. Gene transcriptional analysis of the active taxa revealed major metabolic capabilities executed in situ, including carbon- and nitrogen-related metabolisms associated with syntrophic interactions, iron and sulfur oxidation, which are key in energy conservation and AMD generation, and the mechanisms of adaptation and response to the environmental stresses (heavy metals, low pH and oxidative stress). Remarkably, nitrogen fixation and sulfur oxidation were performed by the rare taxa, indicating their critical roles in the overall functioning and assembly of the AMD community. Finally, our study demonstrates the potential of the ‘divide and conquer’ strategy in high-throughput sequencing data assembly for genome reconstruction and functional partitioning analysis of both dominant and rare species in natural microbial assemblages.
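
    Partitioning reads by K-mer frequency can be illustrated with a toy example. This is a simplified sketch, not the authors' pipeline: the abstract gives no parameters, so the K-mer size, the median-based rule, the threshold, and the function names are all assumptions made for illustration.

```python
from collections import Counter
from statistics import median

def kmers(seq, k):
    """Return all overlapping substrings of length k in `seq`."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]

def partition_reads(reads, k=4, threshold=2):
    """Split reads into high- and low-abundance bins.

    A read goes to the high bin when the median data-set-wide count of its
    K-mers is at least `threshold`; otherwise it goes to the low bin.
    """
    counts = Counter(km for read in reads for km in kmers(read, k))
    high, low = [], []
    for read in reads:
        read_kmers = kmers(read, k)
        if not read_kmers:  # read shorter than k
            low.append(read)
            continue
        med = median(counts[km] for km in read_kmers)
        (high if med >= threshold else low).append(read)
    return high, low

# Two overlapping reads from an abundant sequence and one unrelated read.
reads = ["ACGTACGTAC", "CGTACGTACG", "TTGGCCAATT"]
high, low = partition_reads(reads)
```

    Reads from abundant genomes share frequently seen K-mers, so they fall into the high bin, and each bin can then be assembled separately.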

  13. Ecological roles of dominant and rare prokaryotes in acid mine drainage revealed by metagenomics and metatranscriptomics

    DOE PAGES

    Hua, Zheng-Shuang; Han, Yu-Jiao; Chen, Lin-Xing; ...

    2014-11-07

    Here we report that high-throughput sequencing is expanding our knowledge of microbial diversity in the environment. Still, understanding the metabolic potentials and ecological roles of rare and uncultured microbes in natural communities remains a major challenge. To this end, we applied a ‘divide and conquer’ strategy that partitioned a massive metagenomic data set (>100 Gbp) into subsets based on K-mer frequency in sequence assembly to a low-diversity acid mine drainage (AMD) microbial community and, by integrating with an additional metatranscriptomic assembly, successfully obtained 11 draft genomes most of which represent yet uncultured and/or rare taxa (relative abundance <1%). We report the first genome of a naturally occurring Ferrovum population (relative abundance >90%) and its metabolic potentials and gene expression profile, providing initial molecular insights into the ecological role of these lesser known, but potentially important, microorganisms in the AMD environment. Gene transcriptional analysis of the active taxa revealed major metabolic capabilities executed in situ, including carbon- and nitrogen-related metabolisms associated with syntrophic interactions, iron and sulfur oxidation, which are key in energy conservation and AMD generation, and the mechanisms of adaptation and response to the environmental stresses (heavy metals, low pH and oxidative stress). Remarkably, nitrogen fixation and sulfur oxidation were performed by the rare taxa, indicating their critical roles in the overall functioning and assembly of the AMD community. Finally, our study demonstrates the potential of the ‘divide and conquer’ strategy in high-throughput sequencing data assembly for genome reconstruction and functional partitioning analysis of both dominant and rare species in natural microbial assemblages.

  14. Ecological roles of dominant and rare prokaryotes in acid mine drainage revealed by metagenomics and metatranscriptomics.

    PubMed

    Hua, Zheng-Shuang; Han, Yu-Jiao; Chen, Lin-Xing; Liu, Jun; Hu, Min; Li, Sheng-Jin; Kuang, Jia-Liang; Chain, Patrick S G; Huang, Li-Nan; Shu, Wen-Sheng

    2015-06-01

    High-throughput sequencing is expanding our knowledge of microbial diversity in the environment. Still, understanding the metabolic potentials and ecological roles of rare and uncultured microbes in natural communities remains a major challenge. To this end, we applied a 'divide and conquer' strategy that partitioned a massive metagenomic data set (>100 Gbp) into subsets based on K-mer frequency in sequence assembly to a low-diversity acid mine drainage (AMD) microbial community and, by integrating with an additional metatranscriptomic assembly, successfully obtained 11 draft genomes most of which represent yet uncultured and/or rare taxa (relative abundance <1%). We report the first genome of a naturally occurring Ferrovum population (relative abundance >90%) and its metabolic potentials and gene expression profile, providing initial molecular insights into the ecological role of these lesser known, but potentially important, microorganisms in the AMD environment. Gene transcriptional analysis of the active taxa revealed major metabolic capabilities executed in situ, including carbon- and nitrogen-related metabolisms associated with syntrophic interactions, iron and sulfur oxidation, which are key in energy conservation and AMD generation, and the mechanisms of adaptation and response to the environmental stresses (heavy metals, low pH and oxidative stress). Remarkably, nitrogen fixation and sulfur oxidation were performed by the rare taxa, indicating their critical roles in the overall functioning and assembly of the AMD community. Our study demonstrates the potential of the 'divide and conquer' strategy in high-throughput sequencing data assembly for genome reconstruction and functional partitioning analysis of both dominant and rare species in natural microbial assemblages.

  15. Polyploid genome of Camelina sativa revealed by isolation of fatty acid synthesis genes

    PubMed Central

    2010-01-01

    Background: Camelina sativa, an oilseed crop in the Brassicaceae family, has inspired renewed interest due to its potential for biofuels applications. Little is understood of the nature of the C. sativa genome, however. A study was undertaken to characterize two genes in the fatty acid biosynthesis pathway, fatty acid desaturase (FAD) 2 and fatty acid elongase (FAE) 1, which revealed unexpected complexity in the C. sativa genome. Results: In C. sativa, Southern analysis indicates the presence of three copies of both FAD2 and FAE1 as well as LFY, a known single copy gene in other species. All three copies of both CsFAD2 and CsFAE1 are expressed in developing seeds, and sequence alignments show that previously described conserved sites are present, suggesting that all three copies of both genes could be functional. The regions downstream of CsFAD2 and upstream of CsFAE1 demonstrate co-linearity with the Arabidopsis genome. In addition, three expressed haplotypes were observed for six predicted single-copy genes in 454 sequencing analysis and results from flow cytometry indicate that the DNA content of C. sativa is approximately three-fold that of diploid Camelina relatives. Phylogenetic analyses further support a history of duplication and indicate that C. sativa and C. microcarpa might share a parental genome. Conclusions: There is compelling evidence for triplication of the C. sativa genome, including a larger chromosome number and three-fold larger measured genome size than other Camelina relatives, three isolated copies of FAD2, FAE1, and the KCS17-FAE1 intergenic region, and three expressed haplotypes observed for six predicted single-copy genes. Based on these results, we propose that C. sativa be considered an allohexaploid. The characterization of fatty acid synthesis pathway genes will allow for the future manipulation of oil composition of this emerging biofuel crop; however, targeted manipulations of oil composition and general development of C. sativa should

  16. Giraffe genome sequence reveals clues to its unique morphology and physiology

    PubMed Central

    Agaba, Morris; Ishengoma, Edson; Miller, Webb C.; McGrath, Barbara C.; Hudson, Chelsea N.; Bedoya Reina, Oscar C.; Ratan, Aakrosh; Burhans, Rico; Chikhi, Rayan; Medvedev, Paul; Praul, Craig A.; Wu-Cavener, Lan; Wood, Brendan; Robertson, Heather; Penfold, Linda; Cavener, Douglas R.

    2016-01-01

    The origins of giraffe's imposing stature and associated cardiovascular adaptations are unknown. Okapi, which lacks these unique features, is giraffe's closest relative and provides a useful comparison, to identify genetic variation underlying giraffe's long neck and cardiovascular system. The genomes of giraffe and okapi were sequenced, and through comparative analyses genes and pathways were identified that exhibit unique genetic changes and likely contribute to giraffe's unique features. Some of these genes are in the HOX, NOTCH and FGF signalling pathways, which regulate both skeletal and cardiovascular development, suggesting that giraffe's stature and cardiovascular adaptations evolved in parallel through changes in a small number of genes. Mitochondrial metabolism and volatile fatty acids transport genes are also evolutionarily diverged in giraffe and may be related to its unusual diet that includes toxic plants. Unexpectedly, substantial evolutionary changes have occurred in giraffe and okapi in double-strand break repair and centrosome functions. PMID:27187213

  17. Giraffe genome sequence reveals clues to its unique morphology and physiology.

    PubMed

    Agaba, Morris; Ishengoma, Edson; Miller, Webb C; McGrath, Barbara C; Hudson, Chelsea N; Bedoya Reina, Oscar C; Ratan, Aakrosh; Burhans, Rico; Chikhi, Rayan; Medvedev, Paul; Praul, Craig A; Wu-Cavener, Lan; Wood, Brendan; Robertson, Heather; Penfold, Linda; Cavener, Douglas R

    2016-05-17

    The origins of giraffe's imposing stature and associated cardiovascular adaptations are unknown. Okapi, which lacks these unique features, is giraffe's closest relative and provides a useful comparison, to identify genetic variation underlying giraffe's long neck and cardiovascular system. The genomes of giraffe and okapi were sequenced, and through comparative analyses genes and pathways were identified that exhibit unique genetic changes and likely contribute to giraffe's unique features. Some of these genes are in the HOX, NOTCH and FGF signalling pathways, which regulate both skeletal and cardiovascular development, suggesting that giraffe's stature and cardiovascular adaptations evolved in parallel through changes in a small number of genes. Mitochondrial metabolism and volatile fatty acids transport genes are also evolutionarily diverged in giraffe and may be related to its unusual diet that includes toxic plants. Unexpectedly, substantial evolutionary changes have occurred in giraffe and okapi in double-strand break repair and centrosome functions.

  18. Wavelet Analysis of DNA Bending Profiles reveals Structural Constraints on the Evolution of Genomic Sequences.

    PubMed

    Audit, Benjamin; Vaillant, Cédric; Arnéodo, Alain; d'Aubenton-Carafa, Yves; Thermes, Claude

    2004-03-01

    Analyses of genomic DNA sequences have shown in previous works that base pairs are correlated at large distances with scale-invariant statistical properties. We show in the present study that these correlations between nucleotides (letters) result in fact from long-range correlations (LRC) between sequence-dependent DNA structural elements (words) involved in the packaging of DNA in chromatin. Using the wavelet transform technique, we perform a comparative analysis of the DNA text and of the corresponding bending profiles generated with curvature tables based on nucleosome positioning data. This exploration through the optics of the so-called 'wavelet transform microscope' reveals a characteristic scale of 100-200 bp that separates two regimes of different LRC. We focus here on the existence of LRC in the small-scale regime (≲ 200 bp). Analysis of genomes in the three kingdoms reveals that this regime is specifically associated with the presence of nucleosomes. Indeed, small-scale LRC are observed in eukaryotic genomes and to a lesser extent in archaeal genomes, in contrast with their absence in eubacterial genomes. Similarly, this regime is observed in eukaryotic but not in bacterial viral DNA genomes. There is one exception for genomes of Poxviruses, the only animal DNA viruses that do not replicate in the cell nucleus and do not present small-scale LRC. Furthermore, no small-scale LRC are detected in the genomes of all examined RNA viruses, with one exception in the case of retroviruses. Altogether, these results strongly suggest that small-scale LRC are a signature of the nucleosomal structure. Finally, we discuss possible interpretations of these small-scale LRC in terms of the mechanisms that govern the positioning, the stability and the dynamics of the nucleosomes along the DNA chain. This paper is mainly devoted to a pedagogical presentation of the theoretical concepts and physical methods which are well suited to perform a statistical analysis of genomic

  19. Purification, amino acid sequence and characterisation of kangaroo IGF-I.

    PubMed

    Yandell, C A; Francis, G L; Wheldrake, J F; Upton, Z

    1998-01-01

    Insulin-like growth factor-I (IGF-I) and IGF-II have been purified to homogeneity from kangaroo (Macropus fuliginosus) serum; this represents the first report of the purification, sequencing and characterisation of marsupial IGFs. N-Terminal protein sequencing reveals that there are six amino acid differences between kangaroo and human IGF-I. Kangaroo IGF-II has been partially sequenced and no differences were found between human and kangaroo IGF-II in the 53 residues identified. Thus the IGFs appear to be remarkably structurally conserved during mammalian radiation. In addition, in vitro characterisation of kangaroo IGF-I demonstrated that the functional properties of human, kangaroo and chicken IGF-I are very similar. In an assay measuring the ability of the proteins to stimulate protein synthesis in rat L6 myoblasts, all IGF-I proteins were found to be equally potent. The ability of all three proteins to compete for binding with radiolabelled human IGF-I to type-1 IGF receptors in L6 myoblasts and in Sminthopsis crassicaudata transformed lung fibroblasts, a marsupial cell line, was comparable. Furthermore, kangaroo and human IGF-I react equally in a human IGF-I RIA using a human reference standard, radiolabelled human IGF-I and a polyclonal antibody raised against recombinant human IGF-I. This study indicates that not only is the primary structure of eutherian and metatherian IGF-I conserved, but also the proteins appear to be functionally similar.

  20. High Throughput Sequencing of T Cell Antigen Receptors Reveals a Conserved TCR Repertoire.

    PubMed

    Hou, Xianliang; Lu, Chong; Chen, Sisi; Xie, Qian; Cui, Guangying; Chen, Jianing; Chen, Zhi; Wu, Zhongwen; Ding, Yulong; Ye, Ping; Dai, Yong; Diao, Hongyan

    2016-03-01

    The T-cell receptor (TCR) repertoire is a mirror of the human immune system that reflects processes caused by infections, cancer, autoimmunity, and aging. Next-generation sequencing has become a powerful tool for deep TCR profiling. Herein, we used this technology to study the repertoire features of TCR beta chain in the blood of healthy individuals. Peripheral blood samples were collected from 10 healthy donors. T cells were isolated with anti-human CD3 magnetic beads according to the manufacturer's protocol. We then combined multiplex-PCR, Illumina sequencing, and IMGT/High V-QUEST to analyze the characteristics and polymorphisms of the TCR. Most of the individual T cell clones were present at very low frequencies, suggesting that they had not undergone clonal expansion. The usage frequencies of the TCR beta variable, beta joining, and beta diversity gene segments were similar among T cells from different individuals. Notably, the usage frequency of individual nucleotides and amino acids within complementarity-determining region (CDR3) intervals was remarkably consistent between individuals. Moreover, our data show that terminal deoxynucleotidyl transferase activity was biased toward the insertion of G (31.92%) and C (27.14%) over A (21.82%) and T (19.12%) nucleotides. Some conserved features could be observed in the composition of CDR3, which may inform future studies of human TCR gene recombination.
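
    The size of the insertion bias reported above can be read off with a little arithmetic. The sketch below uses only the four percentages quoted in the abstract; the variable names are illustrative, and treating the four values as shares of all inserted nucleotides is an assumption based on their summing to 100%.

```python
# Percentages of inserted nucleotides as quoted in the abstract.
tdt_insertions = {"G": 31.92, "C": 27.14, "A": 21.82, "T": 19.12}

# Combined share of G/C versus A/T insertions, rounded to two decimals
# to hide floating-point noise.
gc_share = round(tdt_insertions["G"] + tdt_insertions["C"], 2)
at_share = round(tdt_insertions["A"] + tdt_insertions["T"], 2)

# How many G/C insertions occur per A/T insertion.
gc_to_at_ratio = gc_share / at_share

print(f"G+C: {gc_share}%  A+T: {at_share}%  ratio: {gc_to_at_ratio:.2f}")
```

    Under that reading, G and C together account for about 59% of insertions, against about 41% for A and T.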

  1. High Throughput Sequencing of T Cell Antigen Receptors Reveals a Conserved TCR Repertoire

    PubMed Central

    Hou, Xianliang; Lu, Chong; Chen, Sisi; Xie, Qian; Cui, Guangying; Chen, Jianing; Chen, Zhi; Wu, Zhongwen; Ding, Yulong; Ye, Ping; Dai, Yong; Diao, Hongyan

    2016-01-01

    The T-cell receptor (TCR) repertoire is a mirror of the human immune system that reflects processes caused by infections, cancer, autoimmunity, and aging. Next-generation sequencing has become a powerful tool for deep TCR profiling. Herein, we used this technology to study the repertoire features of TCR beta chain in the blood of healthy individuals. Peripheral blood samples were collected from 10 healthy donors. T cells were isolated with anti-human CD3 magnetic beads according to the manufacturer's protocol. We then combined multiplex-PCR, Illumina sequencing, and IMGT/High V-QUEST to analyze the characteristics and polymorphisms of the TCR. Most of the individual T cell clones were present at very low frequencies, suggesting that they had not undergone clonal expansion. The usage frequencies of the TCR beta variable, beta joining, and beta diversity gene segments were similar among T cells from different individuals. Notably, the usage frequency of individual nucleotides and amino acids within complementarity-determining region (CDR3) intervals was remarkably consistent between individuals. Moreover, our data show that terminal deoxynucleotidyl transferase activity was biased toward the insertion of G (31.92%) and C (27.14%) over A (21.82%) and T (19.12%) nucleotides. Some conserved features could be observed in the composition of CDR3, which may inform future studies of human TCR gene recombination. PMID:26962778

  2. The complete genome sequence of Staphylothermus marinus reveals differences in sulfur metabolism among heterotrophic Crenarchaeota

    SciTech Connect

    Anderson, Iain; Dharmarajan, Lakshmi; Rodriguez, Jason; Hooper, Sean; Porat, I.; Ulrich, Luke; Mavromatis, K; Sun, Hui; Land, Miriam L; Lapidus, Alla L.; Lucas, Susan; Barry, Kerrie; Huber, Harald; Zhulin, Igor B; Whitman, W. B.; Mukhopadhyay, Biswarup; Woese, Carl; Bristow, James; Kyrpides, Nikos C

    2009-01-01

    Background: Staphylothermus marinus is an anaerobic, sulfur-reducing peptide fermenter of the archaeal phylum Crenarchaeota. It is the third heterotrophic, obligate sulfur-reducing crenarchaeote to be sequenced and provides an opportunity for comparative analysis of the three genomes. Results: The 1.57 Mbp genome of the hyperthermophilic crenarchaeote Staphylothermus marinus has been completely sequenced. The main energy generating pathways likely involve 2-oxoacid:ferredoxin oxidoreductases and ADP-forming acetyl-CoA synthases. S. marinus possesses several enzymes not present in other crenarchaeotes including a sodium ion-translocating decarboxylase likely to be involved in amino acid degradation. S. marinus lacks sulfur-reducing enzymes present in the other two sulfur-reducing crenarchaeotes that have been sequenced, Thermofilum pendens and Hyperthermus butylicus. Instead it has three operons similar to the mbh and mbx operons of Pyrococcus furiosus, which may play a role in sulfur reduction and/or hydrogen production. The two marine organisms, S. marinus and H. butylicus, possess more sodium-dependent transporters than T. pendens and use symporters for potassium uptake while T. pendens uses an ATP-dependent potassium transporter. T. pendens has adapted to a nutrient-rich environment while H. butylicus is adapted to a nutrient-poor environment, and S. marinus lies between these two extremes. Conclusion: The three heterotrophic sulfur-reducing crenarchaeotes have adapted to their habitats, terrestrial vs. marine, via their transporter content, and they have also adapted to environments with differing levels of nutrients. Despite the fact that they all use sulfur as an electron acceptor, they are likely to have different pathways for sulfur reduction.

  3. The complete genome sequence of Staphylothermus marinus reveals differences in sulfur metabolism among heterotrophic Crenarchaeota

    SciTech Connect

    Anderson, Iain J.; Dharmarajan, Lakshmi; Rodriguez, Jason; Hooper, Sean; Porat, Iris; Ulrich, Luke E.; Elkins, James G.; Mavromatis, Kostas; Sun, Hui; Land, Miriam; Lapidus, Alla; Lucas, Susan; Barry, Kerrie; Huber, Harald; Zhulin, Igor B.; Whitman, William B.; Mukhopadhyay, Biswarup; Woese, Carl; Bristow, James; Kyrpides, Nikos

    2008-09-05

    Staphylothermus marinus is an anaerobic, sulfur-reducing peptide fermenter of the archaeal phylum Crenarchaeota. It is the third heterotrophic, obligate sulfur reducing crenarchaeote to be sequenced and provides an opportunity for comparative analysis of the three genomes. The 1.57 Mbp genome of the hyperthermophilic crenarchaeote Staphylothermus marinus has been completely sequenced. The main energy generating pathways likely involve 2-oxoacid:ferredoxin oxidoreductases and ADP-forming acetyl-CoA synthases. S. marinus possesses several enzymes not present in other crenarchaeotes including a sodium ion-translocating decarboxylase likely to be involved in amino acid degradation. S. marinus lacks sulfur-reducing enzymes present in the other two sulfur-reducing crenarchaeotes that have been sequenced - Thermofilum pendens and Hyperthermus butylicus. Instead it has three operons similar to the mbh and mbx operons of Pyrococcus furiosus, which may play a role in sulfur reduction and/or hydrogen production. The two marine organisms, S. marinus and H. butylicus, possess more sodium-dependent transporters than T. pendens and use symporters for potassium uptake while T. pendens uses an ATP-dependent potassium transporter. T. pendens has adapted to a nutrient-rich environment while H. butylicus is adapted to a nutrient-poor environment, and S. marinus lies between these two extremes. The three heterotrophic sulfur-reducing crenarchaeotes have adapted to their habitats, terrestrial vs. marine, via their transporter content, and they have also adapted to environments with differing levels of nutrients. Despite the fact that they all use sulfur as an electron acceptor, they are likely to have different pathways for sulfur reduction.

  4. Peptide Mass Fingerprinting and N-Terminal Amino Acid Sequencing of Glycosylated Cysteine Protease of Euphorbia nivulia Buch.-Ham.

    PubMed Central

    Badgujar, Shamkant B.; Mahajan, Raghunath T.

    2013-01-01

    A new cysteine protease named Nivulian-II has been purified from the latex of Euphorbia nivulia Buch.-Ham. The apparent molecular mass of Nivulian-II is 43670.846 Da (MALDI TOF/MS). Peptide mass fingerprint analysis revealed peptide matches to Maturase K (Q52ZV1_9MAGN) of Banksia quercifolia. The N-terminal sequence (DFPPNTCCCICC) showed partial homology with those of other cysteine proteinases of biological origin. This is the first paper to characterize a Nivulian-II of E. nivulia latex with respect to amino acid sequencing. PMID:23476742

  5. Unique features revealed by the genome sequence of Acinetobacter sp. ADP1, a versatile and naturally transformation competent bacterium

    PubMed Central

    Barbe, Valérie; Vallenet, David; Fonknechten, Nuria; Kreimeyer, Annett; Oztas, Sophie; Labarre, Laurent; Cruveiller, Stéphane; Robert, Catherine; Duprat, Simone; Wincker, Patrick; Ornston, L. Nicholas; Weissenbach, Jean; Marlière, Philippe; Cohen, Georges N.; Médigue, Claudine

    2004-01-01

    Acinetobacter sp. strain ADP1 is a nutritionally versatile soil bacterium closely related to representatives of the well-characterized Pseudomonas aeruginosa and Pseudomonas putida. Unlike these bacteria, the Acinetobacter ADP1 is highly competent for natural transformation which affords extraordinary convenience for genetic manipulation. The circular chromosome of the Acinetobacter ADP1, presented here, encodes 3325 predicted coding sequences, of which 60% have been classified based on sequence similarity to other documented proteins. The close evolutionary proximity of Acinetobacter and Pseudomonas species, as judged by the sequences of their 16S RNA genes and by the highest level of bidirectional best hits, contrasts with the extensive divergence in the GC content of their DNA (40 versus 62%). The chromosomes also differ significantly in size, with the Acinetobacter ADP1 chromosome <60% of the length of the Pseudomonas counterparts. Genome analysis of the Acinetobacter ADP1 revealed genes for metabolic pathways involved in utilization of a large variety of compounds. Almost all of these genes, with orthologs that are scattered in other species, are located in five major ‘islands of catabolic diversity’, now an apparent ‘archipelago of catabolic diversity’, within one-quarter of the overall genome. Acinetobacter ADP1 displays many features of other aerobic soil bacteria with metabolism oriented toward the degradation of organic compounds found in their natural habitat. A distinguishing feature of this genome is the absence of a gene corresponding to pyruvate kinase, the enzyme that generally catalyzes the terminal step in conversion of carbohydrates to pyruvate for respiration by the citric acid cycle. This finding supports the view that the cycle itself is centrally geared to the catabolic capabilities of this exceptionally versatile organism. PMID:15514110
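
    The abstract above contrasts the GC content of the two genera (40 versus 62%). As a reminder of how that figure is defined, here is a minimal sketch; the function name and the toy sequence are illustrative assumptions, not data from the study.

```python
def gc_content(seq):
    """Return the percentage of G and C among the A/C/G/T bases of seq.

    Ambiguous symbols such as N are ignored; an error is raised when the
    sequence contains no canonical bases at all.
    """
    seq = seq.upper()
    canonical = sum(seq.count(base) for base in "ACGT")
    if canonical == 0:
        raise ValueError("sequence contains no A, C, G or T bases")
    return 100.0 * (seq.count("G") + seq.count("C")) / canonical


# Toy 10-base sequence with 2 G and 2 C, i.e. 4 of 10 bases.
toy_gc = gc_content("ATATGCATCG")
```

    Applied to a whole chromosome sequence, the same calculation would give the genome-wide percentage that the abstract compares.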

  6. Host-Associated Genomic Features of the Novel Uncultured Intracellular Pathogen Ca. Ichthyocystis Revealed by Direct Sequencing of Epitheliocysts

    PubMed Central

    Qi, Weihong; Vaughan, Lloyd; Katharios, Pantelis; Schlapbach, Ralph; Seth-Smith, Helena M.B.

    2016-01-01

    Advances in single-cell and mini-metagenome sequencing have enabled important investigations into uncultured bacteria. In this study, we applied the mini-metagenome sequencing method to assemble genome drafts of the uncultured causative agents of epitheliocystis, an emerging infectious disease in the Mediterranean aquaculture species gilthead seabream. We sequenced multiple cyst samples and constructed 11 genome drafts from a novel beta-proteobacterial lineage, Candidatus Ichthyocystis. The draft genomes demonstrate features typical of pathogenic bacteria with an obligate intracellular lifestyle: a reduced genome of up to 2.6 Mb, reduced G + C content, and reduced metabolic capacity. Reconstruction of metabolic pathways reveals that Ca. Ichthyocystis genomes lack all amino acid synthesis pathways, compelling them to scavenge from the fish host. All genomes encode type II, III, and IV secretion systems, a large repertoire of predicted effectors, and a type IV pilus. These are all considered to be virulence factors, required for adherence, invasion, and host manipulation. However, no evidence of lipopolysaccharide synthesis could be found. Beyond the core functions shared within the genus, alignments showed distinction into different species, characterized by alternative large gene families. These comprise up to a third of each genome, appear to have arisen through duplication and diversification, encode many effector proteins, and are seemingly critical for virulence. Thus, Ca. Ichthyocystis represents a novel obligatory intracellular pathogenic beta-proteobacterial lineage. The methods used, mini-metagenome analysis and manual annotation, have generated important insights into the lifestyle and evolution of the novel, uncultured pathogens, elucidating many putative virulence factors including an unprecedented array of novel gene families. PMID:27190004

  7. RNA Sequencing Reveals Xyr1 as a Transcription Factor Regulating Gene Expression beyond Carbohydrate Metabolism

    PubMed Central

    Ma, Liang; Chen, Ling; Zhang, Lei; Zou, Gen; Liu, Rui; Jiang, Yanping

    2016-01-01

    Xyr1 has been demonstrated to be the main transcription activator of (hemi)cellulases in the well-known cellulase producer Trichoderma reesei. This study comprehensively investigates the genes regulated by Xyr1 through RNA sequencing to produce the transcription profiles of T. reesei Rut-C30 and its xyr1 deletion mutant (Δxyr1), cultured on lignocellulose or glucose. xyr1 deletion resulted in 467 differentially expressed genes on inducing medium. Almost all functional genes involved in (hemi)cellulose degradation and many transporters belonging to the sugar porter family in the major facilitator superfamily (MFS) were downregulated in Δxyr1. By contrast, all differentially expressed protease, lipase, chitinase, some ATP-binding cassette transporters, and heat shock protein-encoding genes were upregulated in Δxyr1. When cultured on glucose, a total of 281 genes were expressed differentially in Δxyr1, most of which were involved in energy, solute transport, lipid, amino acid, and monosaccharide as well as secondary metabolism. Electrophoretic mobility shift assays confirmed that the intracellular β-glucosidase bgl2, the putative nonenzymatic cellulose-attacking gene cip1, the MFS lactose transporter lp, the nmrA-like gene, endo T, the acid protease pepA, and the small heat shock protein hsp23 were probable Xyr1-targets. These results might help elucidate the regulation system for synthesis and secretion of (hemi)cellulases in T. reesei Rut-C30. PMID:28116297

  8. Neuronal subtypes and diversity revealed by single-nucleus RNA sequencing of the human brain

    PubMed Central

    Lake, Blue B.; Ai, Rizi; Kaeser, Gwendolyn E.; Salathia, Neeraj S.; Yung, Yun C.; Liu, Rui; Wildberg, Andre; Gao, Derek; Fung, Ho-Lim; Chen, Song; Vijayaraghavan, Raakhee; Wong, Julian; Chen, Allison; Sheng, Xiaoyan; Kaper, Fiona; Shen, Richard; Ronaghi, Mostafa; Fan, Jian-Bing; Wang, Wei; Chun, Jerold; Zhang, Kun

    2016-01-01

    The human brain has enormously complex cellular diversity and connectivities fundamental to our neural functions, yet difficulties in interrogating individual neurons have impeded understanding of the underlying transcriptional landscape. We developed a scalable approach to sequence and quantify RNA molecules in isolated neuronal nuclei from post-mortem brain, generating 3,227 sets of single neuron data from six distinct regions of the cerebral cortex. Using an iterative clustering and classification approach, we identified 16 neuronal subtypes that were further annotated on the basis of known markers and cortical cytoarchitecture. These data demonstrate a robust and scalable method for identifying and categorizing single nuclear transcriptomes, revealing shared genes sufficient to distinguish novel and orthologous neuronal subtypes as well as regional identity within the human brain. PMID:27339989

  9. The complete genome sequence of Chromobacterium violaceum reveals remarkable and exploitable bacterial adaptability

    PubMed Central

    2003-01-01

    Chromobacterium violaceum is one of millions of species of free-living microorganisms that populate the soil and water in the extant areas of tropical biodiversity around the world. Its complete genome sequence reveals (i) extensive alternative pathways for energy generation, (ii) ≈500 ORFs for transport-related proteins, (iii) complex and extensive systems for stress adaptation and motility, and (iv) widespread utilization of quorum sensing for control of inducible systems, all of which underpin the versatility and adaptability of the organism. The genome also contains extensive but incomplete arrays of ORFs coding for proteins associated with mammalian pathogenicity, possibly involved in the occasional but often fatal cases of human C. violaceum infection. There is, in addition, a series of previously unknown but important enzymes and secondary metabolites including paraquat-inducible proteins, drug and heavy-metal-resistance proteins, multiple chitinases, and proteins for the detoxification of xenobiotics that may have biotechnological applications. PMID:14500782

  10. Individual and population variation in invertebrates revealed by Inter-simple Sequence Repeats (ISSRs)

    PubMed Central

    Abbot, Patrick

    2001-01-01

    PCR-based molecular markers are well suited for questions requiring large scale surveys of plant and animal populations. Inter-simple Sequence Repeats or ISSRs are analyzed by a recently developed technique based on the amplification of the regions between inverse-oriented microsatellite loci with oligonucleotides anchored in microsatellites themselves. ISSRs have shown much promise for the study of the population biology of plants, but have not yet been explored for similar studies of animals. The value of ISSRs is demonstrated for the study of animal species with low levels of within-population variation. Sets of primers are identified which reveal variation in two aphid species, Acyrthosiphon pisum and Pemphigus obesinymphae, in the yellow fever mosquito Aedes aegypti, and in a rotifer in the genus Philodina. PMID:15455068

  11. Whole-exome sequencing reveals the mutational spectrum of testicular germ cell tumours.

    PubMed

    Litchfield, Kevin; Summersgill, Brenda; Yost, Shawn; Sultana, Razvan; Labreche, Karim; Dudakia, Darshna; Renwick, Anthony; Seal, Sheila; Al-Saadi, Reem; Broderick, Peter; Turner, Nicholas C; Houlston, Richard S; Huddart, Robert; Shipley, Janet; Turnbull, Clare

    2015-01-22

    Testicular germ cell tumours (TGCTs) are the most common cancer in young men. Here we perform whole-exome sequencing (WES) of 42 TGCTs to comprehensively study the cancer's mutational profile. The mutation rate is uniformly low in all of the tumours (mean 0.5 mutations per Mb) as compared with common cancers, consistent with the embryological origin of TGCT. In addition to expected copy number gain of chromosome 12p and mutation of KIT, we identify recurrent mutations in the tumour suppressor gene CDC27 (11.9%). Copy number analysis reveals recurring amplification of the spermatocyte development gene FSIP2 (15.3%) and a 0.4 Mb region at Xq28 (15.3%). Two treatment-refractory patients are shown to harbour XRCC2 mutations, a gene strongly implicated in defining cisplatin resistance. Our findings provide further insights into genes involved in the development and progression of TGCT.

  12. Whole-genome sequence comparisons reveal the evolution of Vibrio cholerae O1.

    PubMed

    Kim, Eun Jin; Lee, Chan Hee; Nair, G Balakrish; Kim, Dong Wook

    2015-08-01

    The analysis of the whole-genome sequences of Vibrio cholerae strains from previous and current cholera pandemics has demonstrated that genomic changes and alterations in phage CTX (particularly in the gene encoding the B subunit of cholera toxin) were major features in the evolution of V. cholerae. Recent studies have revealed the genetic mechanisms in these bacteria by which new variants of V. cholerae are generated from type-specific strains; these mechanisms suggest that certain strains are selected by environmental or human factors over time. By understanding the mechanisms and driving forces of historical and current changes in the V. cholerae population, it would be possible to predict the direction of such changes and the evolution of new variants; this has implications for the battle against cholera.

  13. The genetic basis for ecological adaptation of the Atlantic herring revealed by genome sequencing

    PubMed Central

    Martinez Barrio, Alvaro; Lamichhaney, Sangeet; Fan, Guangyi; Rafati, Nima; Pettersson, Mats; Zhang, He; Dainat, Jacques; Ekman, Diana; Höppner, Marc; Jern, Patric; Martin, Marcel; Nystedt, Björn; Liu, Xin; Chen, Wenbin; Liang, Xinming; Shi, Chengcheng; Fu, Yuanyuan; Ma, Kailong; Zhan, Xiao; Feng, Chungang; Gustafson, Ulla; Rubin, Carl-Johan; Sällman Almén, Markus; Blass, Martina; Casini, Michele; Folkvord, Arild; Laikre, Linda; Ryman, Nils; Ming-Yuen Lee, Simon; Xu, Xun; Andersson, Leif

    2016-01-01

    Ecological adaptation is of major relevance to speciation and sustainable population management, but the underlying genetic factors are typically hard to study in natural populations due to genetic differentiation caused by natural selection being confounded with genetic drift in subdivided populations. Here, we use whole genome population sequencing of Atlantic and Baltic herring to reveal the underlying genetic architecture at an unprecedented detailed resolution for both adaptation to a new niche environment and timing of reproduction. We identify almost 500 independent loci associated with a recent niche expansion from marine (Atlantic Ocean) to brackish waters (Baltic Sea), and more than 100 independent loci showing genetic differentiation between spring- and autumn-spawning populations irrespective of geographic origin. Our results show that both coding and non-coding changes contribute to adaptation. Haplotype blocks, often spanning multiple genes and maintained by selection, are associated with genetic differentiation. DOI: http://dx.doi.org/10.7554/eLife.12081.001 PMID:27138043

  14. What has DNA sequencing revealed about the VSG expression sites of African trypanosomes?

    PubMed

    McCulloch, Richard; Horn, David

    2009-08-01

    Antigenic variation is crucial for the survival of African trypanosomes in mammals and involves switches in expression of variant surface glycoprotein genes, which are co-transcribed with a number of expression-site-associated genes (ESAGs) from loci termed 'bloodstream expression sites' (BESs). Trypanosomes possess multiple BESs, although the reason for this (and why ESAGs are resident in these loci) has remained a subject of debate. The genome sequence of Trypanosoma brucei, released in 2005, did not include the BESs because of their telomeric disposition. This gap in our knowledge has now been bridged by two new studies, which we discuss here, asking what has been revealed about the biological significance of BES multiplicity and ESAG function and evolution.

  15. The complete genome sequences, unique mutational spectra and developmental potency of adult neurons revealed by cloning

    PubMed Central

    Rodriguez, Alberto R.; Ferguson, William C.; Shumilina, Svetlana; Clark, Royden A.; Boland, Michael J.; Martin, Greg; Chubukov, Pavel; Tsunemoto, Rachel K.; Torkamani, Ali; Kupriyanov, Sergey; Hall, Ira M.; Baldwin, Kristin K.

    2016-01-01

    Somatic mutation in neurons is linked to neurologic disease and implicated in cell type diversification. However, the origin, extent and patterns of genomic mutation in neurons remain unknown. We established a nuclear transfer method to clonally amplify the genomes of neurons from adult mice for whole genome sequencing. Comprehensive mutation detection and independent validation revealed that individual neurons harbor ~100 unique mutations from all classes, but lack recurrent rearrangements. Most neurons contain at least one gene disrupting mutation and rare (0-2) mobile element insertions. The frequency and gene bias of neuronal mutations differ from other lineages, potentially due to novel mechanisms governing post-mitotic mutation. Fertile mice were cloned from several neurons, establishing the compatibility of mutated adult neuronal genomes with reprogramming to pluripotency and development. PMID:26948891

  16. Whole-exome sequencing reveals the mutational spectrum of testicular germ cell tumours

    PubMed Central

    Litchfield, Kevin; Summersgill, Brenda; Yost, Shawn; Sultana, Razvan; Labreche, Karim; Dudakia, Darshna; Renwick, Anthony; Seal, Sheila; Al-Saadi, Reem; Broderick, Peter; Turner, Nicholas C.; Houlston, Richard S.; Huddart, Robert; Shipley, Janet; Turnbull, Clare

    2015-01-01

    Testicular germ cell tumours (TGCTs) are the most common cancer in young men. Here we perform whole-exome sequencing (WES) of 42 TGCTs to comprehensively study the cancer's mutational profile. The mutation rate is uniformly low in all of the tumours (mean 0.5 mutations per Mb) as compared with common cancers, consistent with the embryological origin of TGCT. In addition to expected copy number gain of chromosome 12p and mutation of KIT, we identify recurrent mutations in the tumour suppressor gene CDC27 (11.9%). Copy number analysis reveals recurring amplification of the spermatocyte development gene FSIP2 (15.3%) and a 0.4 Mb region at Xq28 (15.3%). Two treatment-refractory patients are shown to harbour XRCC2 mutations, a gene strongly implicated in defining cisplatin resistance. Our findings provide further insights into genes involved in the development and progression of TGCT. PMID:25609015

  17. The Complete Genome Sequences, Unique Mutational Spectra, and Developmental Potency of Adult Neurons Revealed by Cloning.

    PubMed

    Hazen, Jennifer L; Faust, Gregory G; Rodriguez, Alberto R; Ferguson, William C; Shumilina, Svetlana; Clark, Royden A; Boland, Michael J; Martin, Greg; Chubukov, Pavel; Tsunemoto, Rachel K; Torkamani, Ali; Kupriyanov, Sergey; Hall, Ira M; Baldwin, Kristin K

    2016-03-16

    Somatic mutation in neurons is linked to neurologic disease and implicated in cell-type diversification. However, the origin, extent, and patterns of genomic mutation in neurons remain unknown. We established a nuclear transfer method to clonally amplify the genomes of neurons from adult mice for whole-genome sequencing. Comprehensive mutation detection and independent validation revealed that individual neurons harbor ∼100 unique mutations from all classes but lack recurrent rearrangements. Most neurons contain at least one gene-disrupting mutation and rare (0-2) mobile element insertions. The frequency and gene bias of neuronal mutations differ from other lineages, potentially due to novel mechanisms governing postmitotic mutation. Fertile mice were cloned from several neurons, establishing the compatibility of mutated adult neuronal genomes with reprogramming to pluripotency and development.

  18. Metagenomic sequencing reveals microbiota and its functional potential associated with periodontal disease

    PubMed Central

    Wang, Jinfeng; Qi, Ji; Zhao, Hui; He, Shu; Zhang, Yifei; Wei, Shicheng; Zhao, Fangqing

    2013-01-01

    Although attempts have been made to reveal the relationships between bacteria and human health, little is known about the species and function of the microbial community associated with oral diseases. In this study, we report the sequencing of 16 metagenomic samples collected from dental swabs and plaques representing four periodontal states. Insights into the microbial community structure and the metabolic variation associated with periodontal health and disease were obtained. We observed a strong correlation between community structure and disease status, and described a core disease-associated community. A number of functional genes and metabolic pathways including bacterial chemotaxis and glycan biosynthesis were over-represented in the microbiomes of periodontal disease. A substantial number of novel species and genes were identified in the metagenomic assemblies. Our study enriches the understanding of the oral microbiome and sheds light on the contribution of microorganisms to the formation and succession of dental plaques and oral diseases. PMID:23673380

  19. Identification of tropomyosins as major allergens in antarctic krill and mantis shrimp and their amino acid sequence characteristics.

    PubMed

    Motoyama, Kanna; Suma, Yota; Ishizaki, Shoichiro; Nagashima, Yuji; Lu, Ying; Ushio, Hideki; Shiomi, Kazuo

    2008-01-01

    Tropomyosin represents a major allergen of decapod crustaceans such as shrimps and crabs, and its highly conserved amino acid sequence (>90% identity) is a molecular basis of the immunoglobulin E (IgE) cross-reactivity among decapods. At present, however, little information is available about allergens in edible crustaceans other than decapods. In this study, the major allergen in two species of edible crustaceans, Antarctic krill Euphausia superba and mantis shrimp Oratosquilla oratoria that are taxonomically distinct from decapods, was demonstrated to be tropomyosin by IgE-immunoblotting using patient sera. The cross-reactivity of the tropomyosins from both species with decapod tropomyosins was also confirmed by inhibition IgE immunoblotting. Sequences of the tropomyosins from both species were determined by complementary deoxyribonucleic acid cloning. The mantis shrimp tropomyosin has high sequence identity (>90% identity) with decapod tropomyosins, especially with fast-type tropomyosins. On the other hand, the Antarctic krill tropomyosin is characterized by diverse alterations in region 13-42, the amino acid sequence of which is highly conserved for decapod tropomyosins, and hence, it shares somewhat lower sequence identity (82.4-89.8% identity) with decapod tropomyosins than the mantis shrimp tropomyosin. Quantification by enzyme-linked immunosorbent assay revealed that Antarctic krill contains tropomyosin at almost the same level as decapods, suggesting that its allergenicity is equivalent to decapods. However, mantis shrimp was assumed to be substantially not allergenic because of the extremely low content of tropomyosin.
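
    The identity figures quoted above (>90%, 82.4-89.8%) are percentages of matching residues between aligned sequences. As a rough illustration only, the sketch below computes percent identity for two pre-aligned strings. The function name, the gap convention, and the toy fragments are all hypothetical; they are not the authors' method or real tropomyosin sequences, and other tools may use a different denominator (for example, alignment length including gaps).

```python
def percent_identity(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Percent identity over aligned columns where neither sequence has a gap.

    Assumes the two sequences were already aligned to equal length.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be pre-aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue  # skip columns containing a gap in either sequence
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable (gap-free) columns")
    return 100.0 * matches / compared


# Hypothetical 10-residue aligned fragments, differing at one position.
toy_a = "MDAIKKKMQA"
toy_b = "MDAIKKKMQG"
identity = percent_identity(toy_a, toy_b)
print(f"{identity:.1f}% identity")
```

    With one mismatch in ten compared positions, the toy pair sits exactly at the 90% level that the abstract uses as its threshold for "high" identity.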

  20. Unprecedented High-Resolution View of Bacterial Operon Architecture Revealed by RNA Sequencing

    PubMed Central

    Creecy, James P.; Maddox, Scott M.; Grissom, Joe E.; Conkle, Trevor L.; Shadid, Tyler M.; Teramoto, Jun; San Miguel, Phillip; Shimada, Tomohiro; Ishihama, Akira; Mori, Hirotada

    2014-01-01

    We analyzed the transcriptome of Escherichia coli K-12 by strand-specific RNA sequencing at single-nucleotide resolution during steady-state (logarithmic-phase) growth and upon entry into stationary phase in glucose minimal medium. To generate high-resolution transcriptome maps, we developed an organizational schema which showed that in practice only three features are required to define operon architecture: the promoter, terminator, and deep RNA sequence read coverage. We precisely annotated 2,122 promoters and 1,774 terminators, defining 1,510 operons with an average of 1.98 genes per operon. Our analyses revealed an unprecedented view of E. coli operon architecture. A large proportion (36%) of operons are complex with internal promoters or terminators that generate multiple transcription units. For 43% of operons, we observed differential expression of polycistronic genes, despite being in the same operons, indicating that E. coli operon architecture allows fine-tuning of gene expression. We found that 276 of 370 convergent operons terminate inefficiently, generating complementary 3′ transcript ends which overlap on average by 286 nucleotides, and 136 of 388 divergent operons have promoters arranged such that their 5′ ends overlap on average by 168 nucleotides. We found 89 antisense transcripts of 397-nucleotide average length, 7 unannotated transcripts within intergenic regions, and 18 sense transcripts that completely overlap operons on the opposite strand. Of 519 overlapping transcripts, 75% correspond to sequences that are highly conserved in E. coli (>50 genomes). Our data extend recent studies showing unexpected transcriptome complexity in several bacteria and suggest that antisense RNA regulation is widespread. PMID:25006232
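
    Several of the counts above are given without the corresponding percentages. The short sketch below simply derives them from the numbers stated in the abstract; the variable names are illustrative, and the per-operon ratios are plain averages over all 1,510 operons, not figures reported by the authors.

```python
# Counts taken from the abstract: (observed, total) for each operon class.
counts = {
    "convergent operons terminating inefficiently": (276, 370),
    "divergent operons with overlapping 5' ends": (136, 388),
}


def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


fractions = {label: percent(part, whole) for label, (part, whole) in counts.items()}

# Simple averages implied by the annotated totals (2,122 promoters,
# 1,774 terminators, 1,510 operons); derived here, not reported.
promoters_per_operon = round(2122 / 1510, 2)
terminators_per_operon = round(1774 / 1510, 2)

for label, value in fractions.items():
    print(f"{label}: {value}%")
print(f"promoters per operon: {promoters_per_operon}")
print(f"terminators per operon: {terminators_per_operon}")
```

    Roughly three quarters of convergent operon pairs and about a third of divergent pairs therefore overlap, and more than one promoter per operon on average fits the abstract's statement that many operons contain internal promoters.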

  1. High Quality Maize Centromere 10 Sequence Reveals Evidence of Frequent Recombination Events.

    PubMed

    Wolfgruber, Thomas K; Nakashima, Megan M; Schneider, Kevin L; Sharma, Anupma; Xie, Zidian; Albert, Patrice S; Xu, Ronghui; Bilinski, Paul; Dawe, R Kelly; Ross-Ibarra, Jeffrey; Birchler, James A; Presting, Gernot G

    2016-01-01

    The ancestral centromeres of maize contain long stretches of the tandemly arranged CentC repeat. The abundance of tandem DNA repeats and centromeric retrotransposons (CR) has presented a significant challenge to completely assembling centromeres using traditional sequencing methods. Here, we report a nearly complete assembly of the 1.85 Mb maize centromere 10 from inbred B73 using PacBio technology and BACs from the reference genome project. The error rates estimated from overlapping BAC sequences are 7 × 10⁻⁶ and 5 × 10⁻⁵ for mismatches and indels, respectively. The number of gaps in the region covered by the reassembly was reduced from 140 in the reference genome to three. Three expressed genes are located between 92 and 477 kb from the inferred ancestral CentC cluster, which lies within the region of highest centromeric repeat density. The improved assembly increased the count of full-length CR from 5 to 55 and revealed a 22.7 kb segmental duplication that occurred approximately 121,000 years ago. Our analysis provides evidence of frequent recombination events in the form of partial retrotransposons, deletions within retrotransposons, chimeric retrotransposons, segmental duplications including higher order CentC repeats, a deleted CentC monomer, centromere-proximal inversions, and insertion of mitochondrial sequences. Double-strand DNA break (DSB) repair is the most plausible mechanism for these events and may be the major driver of centromere repeat evolution and diversity. In many cases examined here, DSB repair appears to be mediated by microhomology, suggesting that tandem repeats may have evolved to efficiently repair frequent DSBs in centromeres.

  2. mRNA deep sequencing reveals 75 new genes and a complex transcriptional landscape in Mimivirus.

    PubMed

    Legendre, Matthieu; Audic, Stéphane; Poirot, Olivier; Hingamp, Pascal; Seltzer, Virginie; Byrne, Deborah; Lartigue, Audrey; Lescot, Magali; Bernadac, Alain; Poulain, Julie; Abergel, Chantal; Claverie, Jean-Michel

    2010-05-01

    Mimivirus, a virus infecting Acanthamoeba, is the prototype of the Mimiviridae, the latest addition to the nucleocytoplasmic large DNA viruses. The Mimivirus genome encodes close to 1000 proteins, many of them never before encountered in a virus, such as four amino-acyl tRNA synthetases. To explore the physiology of this exceptional virus and identify the genes involved in the building of its characteristic intracytoplasmic "virion factory," we coupled electron microscopy observations with the massively parallel pyrosequencing of the polyadenylated RNA fractions of Acanthamoeba castellanii cells at various times post-infection. We generated 633,346 reads, of which 322,904 correspond to Mimivirus transcripts. This first application of deep mRNA sequencing (454 Life Sciences [Roche] FLX) to a large DNA virus allowed the precise delineation of the 5' and 3' extremities of Mimivirus mRNAs and revealed 75 new transcripts including several noncoding RNAs. Mimivirus genes are expressed across a wide dynamic range, in a finely regulated manner broadly described by three main temporal classes: early, intermediate, and late. This RNA-seq study confirmed the AAAATTGA sequence as an early promoter element, as well as the presence of palindromes at most of the polyadenylation sites. It also revealed a new promoter element correlating with late gene expression, which is also prominent in Sputnik, the recently described Mimivirus "virophage." These results, validated genome-wide by the hybridization of total RNA extracted from infected Acanthamoeba cells on a tiling array (Agilent), will constitute the foundation on which to build subsequent functional studies of the Mimivirus/Acanthamoeba system.

  3. Genome sequence of the necrotrophic plant pathogen Pythium ultimum reveals original pathogenicity mechanisms and effector repertoire

    PubMed Central

    2010-01-01

    Background: Pythium ultimum is a ubiquitous oomycete plant pathogen responsible for a variety of diseases on a broad range of crop and ornamental species. Results: The P. ultimum genome (42.8 Mb) encodes 15,290 genes and has extensive sequence similarity and synteny with related Phytophthora species, including the potato blight pathogen Phytophthora infestans. Whole transcriptome sequencing revealed expression of 86% of genes, with detectable differential expression of suites of genes under abiotic stress and in the presence of a host. The predicted proteome includes a large repertoire of proteins involved in plant pathogen interactions, although, surprisingly, the P. ultimum genome does not encode any classical RXLR effectors and relatively few Crinkler genes in comparison to related phytopathogenic oomycetes. A lower number of enzymes involved in carbohydrate metabolism were present compared to Phytophthora species, with the notable absence of cutinases, suggesting a significant difference in virulence mechanisms between P. ultimum and more host-specific oomycete species. Although we observed a high degree of orthology with Phytophthora genomes, there were novel features of the P. ultimum proteome, including an expansion of genes involved in proteolysis and genes unique to Pythium. We identified a small gene family of cadherins, proteins involved in cell adhesion, the first report of these in a genome outside the metazoans. Conclusions: Access to the P. ultimum genome has revealed not only core pathogenic mechanisms within the oomycetes but also lineage-specific genes associated with the alternative virulence and lifestyles found within the pythiaceous lineages compared to the Peronosporaceae. PMID:20626842

  4. Revealing glacier flow and surge dynamics from animated satellite image sequences: examples from the Karakoram

    NASA Astrophysics Data System (ADS)

    Paul, F.

    2015-11-01

    Although animated images are very popular on the internet, they have so far found only limited use for glaciological applications. With long time series of satellite images becoming increasingly available and glaciers being well recognized for their rapid changes and variable flow dynamics, animated sequences of multiple satellite images reveal glacier dynamics in a time-lapse mode, making the otherwise slow changes of glacier movement visible and understandable to the wider public. For this study, animated image sequences were created for four regions in the central Karakoram mountain range over a 25-year time period (1990-2015) from freely available image quick-looks of orthorectified Landsat scenes. The animations play automatically in a web browser and reveal highly complex patterns of glacier flow and surge dynamics that are difficult to obtain by other methods. In contrast to other regions, surging glaciers in the Karakoram are often small (10 km² or less), steep, debris-free, and advance for several years to decades at relatively low annual rates (about 100 m a⁻¹). These characteristics overlap with those of non-surge-type glaciers, making a clear identification difficult. However, as in other regions, the surging glaciers in the central Karakoram also show sudden increases of flow velocity and mass waves travelling down glacier. The surges of individual glaciers are generally out of phase, indicating a limited climatic control on their dynamics. On the other hand, nearly all other glaciers in the region are either stable or slightly advancing, indicating balanced or even positive mass budgets over the past few decades.

  5. Genetic aberrations in imatinib-resistant dermatofibrosarcoma protuberans revealed by whole genome sequencing.

    PubMed

    Hong, Jung Yong; Liu, Xiao; Mao, Mao; Li, Miao; Choi, Dong Il; Kang, Shin Woo; Lee, Jeeyun; La Choi, Yoon

    2013-01-01

    Dermatofibrosarcoma protuberans (DFSP) is a very rare soft tissue sarcoma. DFSP often reveals a specific chromosome translocation, t(17;22)(q22;q13), which results in the fusion of the collagen 1 alpha 1 (COL1A1) gene and the platelet-derived growth factor-B (PDGFB) gene. The COL1A1-PDGFB fusion protein activates the PDGFB receptor, and the resultant constitutive activation of PDGFR is essential in the pathogenesis of DFSP. Thus, blocking PDGFR activation with imatinib has shown promising activity in the treatment of advanced and metastatic DFSP. Despite the success with targeted agents in cancers, acquired drug resistance eventually occurs. Here, we tried to identify potential drug resistance mechanisms against imatinib in a 46-year-old female with DFSP who initially responded well to imatinib but suffered rapid disease progression. We performed whole-genome sequencing of both pre-treatment and post-treatment tumor tissue to identify the mutational events associated with imatinib resistance. No significant copy number alterations, insertions, or deletions were identified during imatinib treatment. Of note, we identified eight newly emerged non-synonymous somatic mutations in the genes ACAP2, CARD10, KIAA0556, PAAQR7, PPP1R39, SAFB2, STARD9, and ZFYVE9 in the imatinib-resistant tumor tissue. This study revealed diverse possible candidate mechanisms by which resistance to PDGFRB inhibition by imatinib may arise in DFSP, and highlights the usefulness of whole-genome sequencing in identifying drug resistance mechanisms and in pursuing genome-directed, personalized anti-cancer therapy.

  6. Trichomonas vaginalis acidic phospholipase A2: isolation and partial amino acid sequence.

    PubMed

    Escobedo-Guajardo, Brenda L; González-Salazar, Francisco; Palacios-Corona, Rebeca; Torres de la Cruz, Víctor M; Morales-Vallarta, Mario; Mata-Cárdenas, Benito D; Garza-González, Jesús N; Rivera-Silva, Gerardo; Vargas-Villarreal, Javier

    2013-12-01

    Sexually transmitted diseases are a major cause of acute disease worldwide, and trichomoniasis is the most common and curable disease, generating more than 170 million cases annually worldwide. Trichomonas vaginalis is the causal agent of trichomoniasis and has the ability to destroy in vitro cell monolayers of the vaginal mucosa, where the phospholipases A2 (PLA2) have been reported as potential virulence factors. These enzymes have been partially characterized from the subcellular fraction S30 of pathogenic T. vaginalis strains. The main objective of this study was to purify a phospholipase A2 from T. vaginalis, make a partial characterization, obtain a partial amino acid sequence, and determine its enzymatic participation as a hemolytic factor causing lysis of erythrocytes. Trichomonas S30, RF30 and UFF30 sub-fractions from the GT-15 strain have the capacity to hydrolyze [2-¹⁴C-PA]-PC at pH 6.0. Proteins from the UFF30 sub-fraction were separated by affinity chromatography into two eluted fractions with detectable PLA2 activity. The EDTA-eluted fraction was analyzed using on-line HPLC-tandem mass spectrometry, and two protein peaks were observed at 8.2 and 13 kDa. Peptide sequences were identified from the proteins present in the EDTA-eluted UFF30 fraction; bioinformatic analysis using Protein Link Global Server loaded with the T. vaginalis protein database suggests that the eluted peptides correspond to a putative ubiquitin protein in the 8.2 kDa fraction and a phospholipase preserved in the 13 kDa fraction. The EDTA-eluted fraction hydrolyzed [2-¹⁴C-PA]-PC and lysed erythrocytes from Sprague-Dawley rats in a time- and dose-dependent manner. The acidic hemolytic activity decreased by 84% with the addition of 100 μM of Rosenthal's inhibitor.

  7. Genome-wide sequencing data reveals virulence factors implicated in banana Xanthomonas wilt.

    PubMed

    Studholme, David J; Kemen, Eric; MacLean, Daniel; Schornack, Sebastian; Aritua, Valente; Thwaites, Richard; Grant, Murray; Smith, Julian; Jones, Jonathan D G

    2010-09-01

    Banana Xanthomonas wilt is a newly emerging disease that is currently threatening the livelihoods of millions of farmers in East Africa. The causative agent is Xanthomonas campestris pathovar musacearum (Xcm), but previous work suggests that this pathogen is much more closely related to species Xanthomonas vasicola than to X. campestris. We have generated draft genome sequences for a banana-pathogenic strain of Xcm isolated in Uganda and for a very closely related strain of X. vasicola pathovar vasculorum, originally isolated from sugarcane, that is nonpathogenic on banana. The draft sequences revealed overlapping but distinct repertoires of candidate virulence effectors in the two strains. Both strains encode homologues of the Pseudomonas syringae effectors HopW, HopAF1 and RipT from Ralstonia solanacearum. The banana-pathogenic and non-banana-pathogenic strains also differed with respect to lipopolysaccharide synthesis and type-IV pili, and in at least several thousand single-nucleotide polymorphisms in the core conserved genome. We found evidence of horizontal transfer between X. vasicola and very distantly related bacteria, including members of other divisions of the Proteobacteria. The availability of these draft genomes will be an invaluable tool for further studies aimed at understanding and combating this important disease.

  8. Determinants of spontaneous mutation in the bacterium Escherichia coli as revealed by whole-genome sequencing

    PubMed Central

    Foster, Patricia L.; Lee, Heewook; Popodi, Ellen; Townes, Jesse P.; Tang, Haixu

    2015-01-01

    A complete understanding of evolutionary processes requires that factors determining spontaneous mutation rates and spectra be identified and characterized. Using mutation accumulation followed by whole-genome sequencing, we found that the mutation rates of three widely diverged commensal Escherichia coli strains differ only by about 50%, suggesting that a rate of 1–2 × 10⁻³ mutations per generation per genome is common for this bacterium. Four major forces are postulated to contribute to spontaneous mutations: intrinsic DNA polymerase errors, endogenously induced DNA damage, DNA damage caused by exogenous agents, and the activities of error-prone polymerases. To determine the relative importance of these factors, we studied 11 strains, each defective for a major DNA repair pathway. The striking result was that only loss of the ability to prevent or repair oxidative DNA damage significantly impacted mutation rates or spectra. These results suggest that, with the exception of oxidative damage, endogenously induced DNA damage does not perturb the overall accuracy of DNA replication in normally growing cells and that repair pathways may exist primarily to defend against exogenously induced DNA damage. The thousands of mutations caused by oxidative damage recovered across the entire genome revealed strong local-sequence biases of these mutations. Specifically, we found that the identity of the 3′ base can affect the mutability of a purine by oxidative damage by as much as eightfold. PMID:26460006
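
    To put the per-genome rate quoted above on a per-base-pair scale, the sketch below divides it by a genome size. The 4.6 Mb figure is an assumption (the approximate size of the E. coli K-12 genome); the abstract does not state the genome sizes of the strains studied, so the output is only an order-of-magnitude illustration, and all names are hypothetical.

```python
# Per-genome mutation rate range from the abstract
# (1-2 x 10^-3 mutations per generation per genome).
RATE_PER_GENOME_LOW = 1e-3
RATE_PER_GENOME_HIGH = 2e-3

# Assumption: ~4.6 Mb genome, roughly the size of E. coli K-12.
ASSUMED_GENOME_SIZE_BP = 4.6e6


def per_bp_rate(rate_per_genome: float, genome_size_bp: float) -> float:
    """Convert a per-genome, per-generation rate to a per-base-pair rate."""
    return rate_per_genome / genome_size_bp


def expected_mutations(rate_per_genome: float, generations: int) -> float:
    """Expected number of mutations accumulated in one lineage."""
    return rate_per_genome * generations


low = per_bp_rate(RATE_PER_GENOME_LOW, ASSUMED_GENOME_SIZE_BP)
high = per_bp_rate(RATE_PER_GENOME_HIGH, ASSUMED_GENOME_SIZE_BP)
print(f"per-bp rate: {low:.2e} to {high:.2e} per generation")

# Illustrative only: a lineage propagated for 1,000 generations.
print(expected_mutations(RATE_PER_GENOME_LOW, 1000),
      expected_mutations(RATE_PER_GENOME_HIGH, 1000))
```

    Under that assumed genome size, the rate corresponds to a few mutations per ten billion base pairs per generation, which is why mutation-accumulation experiments need many generations and whole-genome sequencing to observe enough events.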

  9. Genomic View of Bipolar Disorder Revealed by Whole Genome Sequencing in a Genetic Isolate

    PubMed Central

    Georgi, Benjamin; Craig, David; Kember, Rachel L.; Liu, Wencheng; Lindquist, Ingrid; Nasser, Sara; Brown, Christopher; Egeland, Janice A.; Paul, Steven M.; Bućan, Maja

    2014-01-01

    Bipolar disorder is a common, heritable mental illness characterized by recurrent episodes of mania and depression. Despite considerable effort to elucidate the genetic underpinnings of bipolar disorder, causative genetic risk factors remain elusive. We conducted a comprehensive genomic analysis of bipolar disorder in a large Old Order Amish pedigree. Microsatellite genotypes and high-density SNP-array genotypes of 388 family members were combined with whole genome sequence data for 50 of these subjects, comprising 18 parent-child trios. This study design permitted evaluation of candidate variants within the context of haplotype structure by resolving the phase in sequenced parent-child trios and by imputation of variants into multiple unsequenced siblings. Non-parametric and parametric linkage analysis of the entire pedigree as well as on smaller clusters of families identified several nominally significant linkage peaks, each of which included dozens of predicted deleterious variants. Close inspection of exonic and regulatory variants in genes under the linkage peaks using family-based association tests revealed additional credible candidate genes for functional studies and further replication in population-based cohorts. However, despite the in-depth genomic characterization of this unique, large and multigenerational pedigree from a genetic isolate, there was no convergence of evidence implicating a particular set of risk loci or common pathways. The striking haplotype and locus heterogeneity we observed has profound implications for the design of studies of bipolar and other related disorders. PMID:24625924

  10. Genotyping by PCR and High-Throughput Sequencing of Commercial Probiotic Products Reveals Composition Biases

    PubMed Central

    Morovic, Wesley; Hibberd, Ashley A.; Zabel, Bryan; Barrangou, Rodolphe; Stahl, Buffy

    2016-01-01

    Recent advances in microbiome research have brought renewed focus on beneficial bacteria, many of which are available in food and dietary supplements. Although probiotics have historically been defined as microorganisms that convey health benefits when ingested in sufficient viable amounts, this description now includes the stipulation “well defined strains,” encompassing definitive taxonomy for consumer consideration and regulatory oversight. Here, we evaluated 52 commercial dietary supplements covering a range of labeled species using plate counting and targeted genotyping. Strain identities were assessed using methods recently published by the United States Pharmacopeial Convention. We also determined the relative abundance of individual bacteria by high-throughput sequencing (HTS) of the 16S rRNA sequence using paired-end 2 × 250 bp Illumina MiSeq technology. Using these methods, we tested the hypothesis that products do contain the quantitative and qualitative list of labeled microbial species. We found that 17 samples (33%) were below label claim for CFU prior to their expiration dates. A multiplexed-PCR scheme showed that only 30/52 (58%) of the products contained a correctly labeled classification, with issues encompassing incorrect taxonomy, missing species, and un-labeled species. The HTS revealed that many blended products consisted predominantly of Lactobacillus acidophilus and Bifidobacterium animalis subsp. lactis. These results highlight the need for reliable methods to determine the correct taxonomy and quantify the relative amounts of mixed microbial populations in commercial probiotic products. PMID:27857709

  11. Sequence-based Network Completion Reveals the Integrality of Missing Reactions in Metabolic Networks.

    PubMed

    Krumholz, Elias W; Libourel, Igor G L

    2015-07-31

    Genome-scale metabolic models are central in connecting genotypes to metabolic phenotypes. However, even for well studied organisms, such as Escherichia coli, draft networks do not contain a complete biochemical network. Missing reactions are referred to as gaps. These gaps need to be filled to enable functional analysis, and gap-filling choices influence model predictions. To investigate whether functional networks existed where all gap-filling reactions were supported by sequence similarity to annotated enzymes, four draft networks were supplemented with all reactions from the Model SEED database for which minimal sequence similarity was found in their genomes. Quadratic programming revealed that the number of reactions that could partake in a gap-filling solution was vast: 3,270 in the case of E. coli, where 72% of the metabolites in the draft network could connect a gap-filling solution. Nonetheless, no network could be completed without the inclusion of orphaned enzymes, suggesting that parts of the biochemistry integral to biomass precursor formation are uncharacterized. However, many gap-filling reactions were well determined, and the resulting networks showed improved prediction of gene essentiality compared with networks generated through canonical gap filling. In addition, gene essentiality predictions that were sensitive to poorly determined gap-filling reactions were of poor quality, suggesting that damage to the network structure resulting from the inclusion of erroneous gap-filling reactions may be predictable.

  12. De novo sequences of Haloquadratum walsbyi from Lake Tyrrell, Australia, reveal a variable genomic landscape.

    PubMed

    Tully, Benjamin J; Emerson, Joanne B; Andrade, Karen; Brocks, Jochen J; Allen, Eric E; Banfield, Jillian F; Heidelberg, Karla B

    2015-01-01

    Hypersaline systems near salt saturation levels represent an extreme environment, in which organisms grow and survive near the limits of life. One of the abundant members of the microbial communities in hypersaline systems is the square archaeon, Haloquadratum walsbyi. Utilizing a short-read metagenome from Lake Tyrrell, a hypersaline ecosystem in Victoria, Australia, we performed a comparative genomic analysis of H. walsbyi to better understand the extent of variation between strains/subspecies. Results revealed that previously isolated strains/subspecies do not fully describe the complete repertoire of the genomic landscape present in H. walsbyi. Rearrangements, insertions, and deletions were observed for the Lake Tyrrell derived Haloquadratum genomes and were supported by environmental de novo sequences, including shifts in the dominant genomic landscape of the two most abundant strains. Analysis pertaining to halomucins indicated that homologs for this large protein are not a feature common for all species of Haloquadratum. Further, we analyzed ATP-binding cassette transporters (ABC-type transporters) for evidence of niche partitioning between different strains/subspecies. We were able to identify unique and variable transporter subunits from all five genomes analyzed and the de novo environmental sequences, suggesting that differences in nutrient and carbon source acquisition may play a role in maintaining distinct strains/subspecies.

  13. De Novo Sequences of Haloquadratum walsbyi from Lake Tyrrell, Australia, Reveal a Variable Genomic Landscape

    PubMed Central

    Tully, Benjamin J.; Emerson, Joanne B.; Andrade, Karen; Brocks, Jochen J.; Allen, Eric E.; Banfield, Jillian F.; Heidelberg, Karla B.

    2015-01-01

    Hypersaline systems near salt saturation levels represent an extreme environment, in which organisms grow and survive near the limits of life. One of the abundant members of the microbial communities in hypersaline systems is the square archaeon, Haloquadratum walsbyi. Utilizing a short-read metagenome from Lake Tyrrell, a hypersaline ecosystem in Victoria, Australia, we performed a comparative genomic analysis of H. walsbyi to better understand the extent of variation between strains/subspecies. Results revealed that previously isolated strains/subspecies do not fully describe the complete repertoire of the genomic landscape present in H. walsbyi. Rearrangements, insertions, and deletions were observed for the Lake Tyrrell derived Haloquadratum genomes and were supported by environmental de novo sequences, including shifts in the dominant genomic landscape of the two most abundant strains. Analysis pertaining to halomucins indicated that homologs for this large protein are not a feature common for all species of Haloquadratum. Further, we analyzed ATP-binding cassette transporters (ABC-type transporters) for evidence of niche partitioning between different strains/subspecies. We were able to identify unique and variable transporter subunits from all five genomes analyzed and the de novo environmental sequences, suggesting that differences in nutrient and carbon source acquisition may play a role in maintaining distinct strains/subspecies. PMID:25709557

  14. High-resolution sequencing reveals unexplored archaeal diversity in freshwater wetland soils.

    PubMed

    Narrowe, Adrienne B; Angle, Jordan C; Daly, Rebecca A; Stefanik, Kay C; Wrighton, Kelly C; Miller, Christopher S

    2017-02-20

    Despite being key contributors to biogeochemical processes, archaea are frequently outnumbered by bacteria, and consequently are underrepresented in combined molecular surveys. Here, we demonstrate an approach to concurrently survey the archaea alongside the bacteria with high-resolution 16S rRNA gene sequencing, linking these community data to geochemical parameters. We applied this integrated analysis to hydric soils sampled across a model methane-emitting freshwater wetland. Geochemical profiles, archaeal communities, and bacterial communities were independently correlated with soil depth and water cover. Centimeters of soil depth and corresponding geochemical shifts consistently affected microbial community structure more than hundreds of meters of lateral distance. Methanogens with diverse metabolisms were detected across the wetland, but displayed surprising OTU-level partitioning by depth. Candidatus Methanoperedens spp. archaea thought to perform anaerobic oxidation of methane linked to iron reduction were abundant. Domain-specific sequencing also revealed unexpectedly diverse non-methane-cycling archaeal members. OTUs within the underexplored Woesearchaeota and Bathyarchaeota were prevalent across the wetland, with subgroups and individual OTUs exhibiting distinct occupancy and abundance distributions aligned with environmental gradients. This study adds to our understanding of ecological range for key archaeal taxa in a model freshwater wetland, and links these taxa and individual OTUs to hypotheses about processes governing biogeochemical cycling.

  15. Identification of random nucleic acid sequence aberrations using dual capture probes which hybridize to different chromosome regions

    DOEpatents

    Lucas, J.N.; Straume, T.; Bogen, K.T.

    1998-03-24

    A method is provided for detecting nucleic acid sequence aberrations using two immobilization steps. According to the method, a nucleic acid sequence aberration is detected by detecting nucleic acid sequences having both a first nucleic acid sequence type (e.g., from a first chromosome) and a second nucleic acid sequence type (e.g., from a second chromosome), the presence of the first and the second nucleic acid sequence type on the same nucleic acid sequence indicating the presence of a nucleic acid sequence aberration. In the method, immobilization of a first hybridization probe is used to isolate a first set of nucleic acids in the sample which contain the first nucleic acid sequence type. Immobilization of a second hybridization probe is then used to isolate a second set of nucleic acids from within the first set of nucleic acids which contain the second nucleic acid sequence type. The second set of nucleic acids are then detected, their presence indicating the presence of a nucleic acid sequence aberration. 14 figs.

  16. Identification of random nucleic acid sequence aberrations using dual capture probes which hybridize to different chromosome regions

    DOEpatents

    Lucas, Joe N.; Straume, Tore; Bogen, Kenneth T.

    1998-01-01

    A method is provided for detecting nucleic acid sequence aberrations using two immobilization steps. According to the method, a nucleic acid sequence aberration is detected by detecting nucleic acid sequences having both a first nucleic acid sequence type (e.g., from a first chromosome) and a second nucleic acid sequence type (e.g., from a second chromosome), the presence of the first and the second nucleic acid sequence type on the same nucleic acid sequence indicating the presence of a nucleic acid sequence aberration. In the method, immobilization of a first hybridization probe is used to isolate a first set of nucleic acids in the sample which contain the first nucleic acid sequence type. Immobilization of a second hybridization probe is then used to isolate a second set of nucleic acids from within the first set of nucleic acids which contain the second nucleic acid sequence type. The second set of nucleic acids are then detected, their presence indicating the presence of a nucleic acid sequence aberration.

  17. The amino acid sequence of protein CM-3 from Dendroaspis polylepis polylepis (black mamba) venom.

    PubMed

    Joubert, F J

    1985-01-01

    Protein CM-3 from Dendroaspis polylepis polylepis venom was purified by gel filtration and ion exchange chromatography. It comprises 65 amino acids including eight half-cystines. The complete amino acid sequence of protein CM-3 has been elucidated. The sequence (residues 1-50) resembles that of the N-terminal sequence of the subunits of a synergistic type protein and residues 51-65 that of the C-terminal sequence of an angusticeps type protein. Mixtures of protein CM-3 and angusticeps type proteins showed no apparent synergistic effect, in that their toxicity in combination was no greater than the sum of their individual toxicities.

  18. The amino acid sequences of the Fd fragments of two human γ heavy chains

    PubMed Central

    Press, E. M.; Hogg, N. M.

    1970-01-01

    The amino acid sequences of the Fd fragments of two human pathological immunoglobulins of the immunoglobulin G1 class are reported. Comparison of the two sequences shows that the heavy-chain variable regions are similar in length to those of the light chains. The existence of heavy chain variable region subgroups is also deduced, from a comparison of these two sequences with those of another γ 1 chain, Eu, a μ chain, Ou, and the partial sequence of a fourth γ 1 chain, Ste. Carbohydrate has been found to be linked to an aspartic acid residue in the variable region of one of the γ 1 chains, Cor. PMID:5449120

  19. Unprecedented high-resolution view of bacterial operon architecture revealed by RNA sequencing.

    PubMed

    Conway, Tyrrell; Creecy, James P; Maddox, Scott M; Grissom, Joe E; Conkle, Trevor L; Shadid, Tyler M; Teramoto, Jun; San Miguel, Phillip; Shimada, Tomohiro; Ishihama, Akira; Mori, Hirotada; Wanner, Barry L

    2014-07-08

    We analyzed the transcriptome of Escherichia coli K-12 by strand-specific RNA sequencing at single-nucleotide resolution during steady-state (logarithmic-phase) growth and upon entry into stationary phase in glucose minimal medium. To generate high-resolution transcriptome maps, we developed an organizational schema which showed that in practice only three features are required to define operon architecture: the promoter, terminator, and deep RNA sequence read coverage. We precisely annotated 2,122 promoters and 1,774 terminators, defining 1,510 operons with an average of 1.98 genes per operon. Our analyses revealed an unprecedented view of E. coli operon architecture. A large proportion (36%) of operons are complex with internal promoters or terminators that generate multiple transcription units. For 43% of operons, we observed differential expression of polycistronic genes, despite being in the same operons, indicating that E. coli operon architecture allows fine-tuning of gene expression. We found that 276 of 370 convergent operons terminate inefficiently, generating complementary 3' transcript ends which overlap on average by 286 nucleotides, and 136 of 388 divergent operons have promoters arranged such that their 5' ends overlap on average by 168 nucleotides. We found 89 antisense transcripts of 397-nucleotide average length, 7 unannotated transcripts within intergenic regions, and 18 sense transcripts that completely overlap operons on the opposite strand. Of 519 overlapping transcripts, 75% correspond to sequences that are highly conserved in E. coli (>50 genomes). Our data extend recent studies showing unexpected transcriptome complexity in several bacteria and suggest that antisense RNA regulation is widespread. Importance: We precisely mapped the 5' and 3' ends of RNA transcripts across the E. coli K-12 genome by using a single-nucleotide analytical approach. Our resulting high-resolution transcriptome maps show that ca. one-third of E. coli operons are

  20. Amino acid sequence of rabbit kidney neutral endopeptidase 24.11 (enkephalinase) deduced from a complementary DNA.

    PubMed Central

    Devault, A; Lazure, C; Nault, C; Le Moual, H; Seidah, N G; Chrétien, M; Kahn, P; Powell, J; Mallet, J; Beaumont, A

    1987-01-01

    Neutral endopeptidase (EC 3.4.24.11) is a major constituent of kidney brush border membranes. It is also present in the brain where it has been shown to be involved in the inactivation of opioid peptides, methionine- and leucine-enkephalins. For this reason this enzyme is often called 'enkephalinase'. In order to characterize the primary structure of the enzyme, oligonucleotide probes were designed from partial amino acid sequences and used to isolate clones from kidney cDNA libraries. Sequencing of the cDNA inserts revealed the complete primary structure of the enzyme. Neutral endopeptidase consists of 750 amino acids. It contains a short N-terminal cytoplasmic domain (27 amino acids), a single membrane-spanning segment (23 amino acids) and an extracellular domain that comprises most of the protein mass. The comparison of the primary structure of neutral endopeptidase with that of thermolysin, a bacterial Zn-metallopeptidase, indicates that most of the amino acid residues involved in Zn coordination and catalytic activity in thermolysin are found within highly homologous sequences in neutral endopeptidase. PMID:2440677

  1. Integrated exome and transcriptome sequencing reveals ZAK isoform usage in gastric cancer

    PubMed Central

    Liu, Jinfeng; McCleland, Mark; Stawiski, Eric W.; Gnad, Florian; Mayba, Oleg; Haverty, Peter M.; Durinck, Steffen; Chen, Ying-Jiun; Klijn, Christiaan; Jhunjhunwala, Suchit; Lawrence, Michael; Liu, Hanbin; Wan, Yinan; Chopra, Vivek; Yaylaoglu, Murat B.; Yuan, Wenlin; Ha, Connie; Gilbert, Houston N.; Reeder, Jens; Pau, Gregoire; Stinson, Jeremy; Stern, Howard M.; Manning, Gerard; Wu, Thomas D.; Neve, Richard M.; de Sauvage, Frederic J.; Modrusan, Zora; Seshagiri, Somasekar; Firestein, Ron; Zhang, Zemin

    2014-01-01

    Gastric cancer is the second leading cause of worldwide cancer mortality, yet the underlying genomic alterations remain poorly understood. Here we perform exome and transcriptome sequencing and SNP array assays to characterize 51 primary gastric tumours and 32 cell lines. Meta-analysis of exome data and previously published data sets reveals 24 significantly mutated genes in microsatellite stable (MSS) tumours and 16 in microsatellite instable (MSI) tumours. Over half the patients in our collection could potentially benefit from targeted therapies. We identify 55 splice site mutations accompanied by aberrant splicing products, in addition to mutation-independent differential isoform usage in tumours. ZAK kinase isoform TV1 is preferentially upregulated in gastric tumours and cell lines relative to normal samples. This pattern is also observed in colorectal, bladder and breast cancers. Overexpression of this particular isoform activates multiple cancer-related transcription factor reporters, while depletion of ZAK in gastric cell lines inhibits proliferation. These results reveal the spectrum of genomic and transcriptomic alterations in gastric cancer, and identify isoform-specific oncogenic properties of ZAK. PMID:24807215

  2. Stacking sequence and interlayer coupling in few-layer graphene revealed by in situ imaging

    PubMed Central

    Wang, Zhu-Jun; Dong, Jichen; Cui, Yi; Eres, Gyula; Timpe, Olaf; Fu, Qiang; Ding, Feng; Schloegl, R.; Willinger, Marc-Georg

    2016-01-01

    In the transition from graphene to graphite, the addition of each individual graphene layer modifies the electronic structure and produces a different material with unique properties. Controlled growth of few-layer graphene is therefore of fundamental interest and will provide access to materials with engineered electronic structure. Here we combine isothermal growth and etching experiments with in situ scanning electron microscopy to reveal the stacking sequence and interlayer coupling strength in few-layer graphene. The observed layer-dependent etching rates reveal the relative strength of the graphene–graphene and graphene–substrate interaction and the resulting mode of adlayer growth. Scanning tunnelling microscopy and density functional theory calculations confirm a strong coupling between graphene edge atoms and platinum. Simulated etching confirms that etching can be viewed as reversed growth. This work demonstrates that real-time imaging under controlled atmosphere is a powerful method for designing synthesis protocols for sp2 carbon nanostructures in between graphene and graphite. PMID:27759024

  3. Stacking sequence and interlayer coupling in few-layer graphene revealed by in situ imaging

    DOE PAGES

    Wang, Zhu-Jun; Dong, Jichen; Cui, Yi; ...

    2016-10-19

    In the transition from graphene to graphite, the addition of each individual graphene layer modifies the electronic structure and produces a different material with unique properties. Controlled growth of few-layer graphene is therefore of fundamental interest and will provide access to materials with engineered electronic structure. Here we combine isothermal growth and etching experiments with in situ scanning electron microscopy to reveal the stacking sequence and interlayer coupling strength in few-layer graphene. The observed layer-dependent etching rates reveal the relative strength of the graphene–graphene and graphene–substrate interaction and the resulting mode of adlayer growth. Scanning tunnelling microscopy and density functional theory calculations confirm a strong coupling between graphene edge atoms and platinum. Simulated etching confirms that etching can be viewed as reversed growth. This work demonstrates that real-time imaging under controlled atmosphere is a powerful method for designing synthesis protocols for sp2 carbon nanostructures in between graphene and graphite.

  4. High-Throughput Sequencing Reveals Circular Substrates for an Archaeal RNA ligase.

    PubMed

    Becker, Hubert F; Heliou, Alice; Djaout, Kamel; Lestini, Roxane; Regnier, Mireille; Myllykallio, Hannu

    2017-03-09

    It is only recently that the abundant presence of circular RNAs (circRNAs) in all kingdoms of Life, including the hyperthermophilic archaeon Pyrococcus abyssi, has emerged. This led us to investigate the physiological significance of a previously observed weak intramolecular ligation activity of Pab1020 RNA ligase. Here we demonstrate that this enzyme, despite sharing significant sequence similarity with DNA ligases, is indeed an RNA-specific polynucleotide ligase efficiently acting on physiologically significant substrates. Using a combination of RNA immunoprecipitation assays and RNA-seq, our genome-wide studies revealed 133 individual circRNA loci in P. abyssi. The large majority of these loci interacted with Pab1020 in cells and circularization of selected C/D Box and 5S rRNA transcripts was confirmed biochemically. Altogether these studies revealed that Pab1020 is required for RNA circularization. Our results further suggest the functional speciation of an ancestral NTase domain and/or DNA ligase towards RNA ligase activity and prompt for further characterization of the widespread functions of circular RNAs in prokaryotes. Detailed insight into the cellular substrates of Pab1020 may facilitate the development of new biotechnological applications e.g. in ligation of preadenylated adaptors to RNA molecules.

  5. Seventeen New Complete mtDNA Sequences Reveal Extensive Mitochondrial Genome Evolution within the Demospongiae

    PubMed Central

    Wang, Xiujuan; Lavrov, Dennis V.

    2008-01-01

    Two major transitions in animal evolution–the origins of multicellularity and bilaterality–correlate with major changes in mitochondrial DNA (mtDNA) organization. Demosponges, the largest class in the phylum Porifera, underwent only the first of these transitions and their mitochondrial genomes display a peculiar combination of ancestral and animal-specific features. To get an insight into the evolution of mitochondrial genomes within the Demospongiae, we determined 17 new mtDNA sequences from this group and analyzed them with five previously published sequences. Our analysis revealed that all demosponge mtDNAs are 16- to 25-kbp circular molecules, containing 13–15 protein genes, 2 rRNA genes, and 2–27 tRNA genes. All but four pairs of sampled genomes had unique gene orders, with the number of shared gene boundaries ranging from 1 to 41. Although most demosponge species displayed low rates of mitochondrial sequence evolution, a significant acceleration in evolutionary rates occurred in the G1 group (orders Dendroceratida, Dictyoceratida, and Verticillitida). Large variation in mtDNA organization was also observed within the G0 group (order Homosclerophorida) including gene rearrangements, loss of tRNA genes, and the presence of two introns in Plakortis angulospiculatus. While introns are rare in modern-day demosponge mtDNA, we inferred that at least one intron was present in cox1 of the common ancestor of all demosponges. Our study uncovered an extensive mitochondrial genomic diversity within the Demospongiae. Although all sampled mitochondrial genomes retained some ancestral features, including a minimally modified genetic code, conserved structures of tRNA genes, and presence of multiple non-coding regions, they vary considerably in their size, gene content, gene order, and the rates of sequence evolution. Some of the changes in demosponge mtDNA, such as the loss of tRNA genes and the appearance of hairpin-containing repetitive elements, occurred in

  6. Deep sequencing analysis of the developing mouse brain reveals a novel microRNA

    PubMed Central

    2011-01-01

    Background MicroRNAs (miRNAs) are small non-coding RNAs that can exert multilevel inhibition/repression at a post-transcriptional or protein synthesis level during disease or development. Characterisation of miRNAs in adult mammalian brains by deep sequencing has been reported previously. However, to date, no small RNA profiling of the developing brain has been undertaken using this method. We have performed deep sequencing and small RNA analysis of a developing (E15.5) mouse brain. Results We identified the expression of 294 known miRNAs in the E15.5 developing mouse brain, which were mostly represented by let-7 family and other brain-specific miRNAs such as miR-9 and miR-124. We also discovered 4 putative 22-23 nt miRNAs: mm_br_e15_1181, mm_br_e15_279920, mm_br_e15_96719 and mm_br_e15_294354 each with a 70-76 nt predicted pre-miRNA. We validated the 4 putative miRNAs and further characterised one of them, mm_br_e15_1181, throughout embryogenesis. Mm_br_e15_1181 biogenesis was Dicer1-dependent and was expressed in E3.5 blastocysts and E7 whole embryos. Embryo-wide expression patterns were observed at E9.5 and E11.5 followed by a near complete loss of expression by E13.5, with expression restricted to a specialised layer of cells within the developing and early postnatal brain. Mm_br_e15_1181 was upregulated during neurodifferentiation of P19 teratocarcinoma cells. This novel miRNA has been identified as miR-3099. Conclusions We have generated and analysed the first deep sequencing dataset of small RNA sequences of the developing mouse brain. The analysis revealed a novel miRNA, miR-3099, with potential regulatory effects on early embryogenesis, and involvement in neuronal cell differentiation/function in the brain during late embryonic and early neonatal development. PMID:21466694

  7. The Chinese hamster Alu-equivalent sequence: a conserved highly repetitious, interspersed deoxyribonucleic acid sequence in mammals has a structure suggestive of a transposable element.

    PubMed Central

    Haynes, S R; Toomey, T P; Leinwand, L; Jelinek, W R

    1981-01-01

    A consensus sequence has been determined for a major interspersed deoxyribonucleic acid repeat in the genome of Chinese hamster ovary cells (CHO cells). This sequence is extensively homologous to (i) the human Alu sequence (P. L. Deininger et al., J. Mol. Biol., in press), (ii) the mouse B1 interspersed repetitious sequence (Krayev et al., Nucleic Acids Res. 8:1201-1215, 1980), (iii) an interspersed repetitious sequence from African green monkey deoxyribonucleic acid (Dhruva et al., Proc. Natl. Acad. Sci. U.S.A. 77:4514-4518, 1980) and (iv) the CHO and mouse 4.5S ribonucleic acid (this report; F. Harada and N. Kato, Nucleic Acids Res. 8:1273-1285, 1980). Because the CHO consensus sequence shows significant homology to the human Alu sequence it is termed the CHO Alu-equivalent sequence. A conserved structure surrounding CHO Alu-equivalent family members can be recognized. It is similar to that surrounding the human Alu and the mouse B1 sequences, and is represented as follows: direct repeat-CHO-Alu-A-rich sequence-direct repeat. A composite interspersed repetitious sequence has been identified. Its structure is represented as follows: direct repeat-residue 47 to 107 of CHO-Alu-non-Alu repetitious sequence-A-rich sequence-direct repeat. Because the Alu flanking sequences resemble those that flank known transposable elements, we think it likely that the Alu sequence dispersed throughout the mammalian genome by transposition. PMID:9279371

  8. Clostridium sticklandii, a specialist in amino acid degradation: revisiting its metabolism through its genome sequence

    PubMed Central

    2010-01-01

    Background Clostridium sticklandii belongs to a cluster of non-pathogenic proteolytic clostridia which utilize amino acids as carbon and energy sources. Isolated by T.C. Stadtman in 1954, it has been generally regarded as a "gold mine" for novel biochemical reactions and is used as a model organism for studying metabolic aspects such as the Stickland reaction, coenzyme-B12- and selenium-dependent reactions of amino acids. With the goal of revisiting its carbon, nitrogen, and energy metabolism, and comparing studies with other clostridia, its genome has been sequenced and analyzed. Results C. sticklandii is one of the best biochemically studied proteolytic clostridial species. Useful additional information has been obtained from the sequencing and annotation of its genome, which is presented in this paper. Besides, experimental procedures reveal that C. sticklandii degrades amino acids in a preferential and sequential way. The organism prefers threonine, arginine, serine, cysteine, proline, and glycine, whereas glutamate, aspartate and alanine are excreted. Energy conservation is primarily obtained by substrate-level phosphorylation in fermentative pathways. The reactions catalyzed by different ferredoxin oxidoreductases and the exergonic NADH-dependent reduction of crotonyl-CoA point to a possible chemiosmotic energy conservation via the Rnf complex. C. sticklandii possesses both the F-type and V-type ATPases. The discovery of an as yet unrecognized selenoprotein in the D-proline reductase operon suggests a more detailed mechanism for NADH-dependent D-proline reduction. A rather unusual metabolic feature is the presence of genes for all the enzymes involved in two different CO2-fixation pathways: C. sticklandii harbours both the glycine synthase/glycine reductase and the Wood-Ljungdahl pathways. This unusual pathway combination has retrospectively been observed in only four other sequenced microorganisms. Conclusions Analysis of the C. sticklandii genome and

  9. A Sequence-Independent Strategy for Amplification and Characterisation of Episomal Badnavirus Sequences Reveals Three Previously Uncharacterised Yam Badnaviruses

    PubMed Central

    Bömer, Moritz; Turaki, Aliyu A.; Silva, Gonçalo; Kumar, P. Lava; Seal, Susan E.

    2016-01-01

    Yam (Dioscorea spp.) plants are potentially hosts to a diverse range of badnavirus species (genus Badnavirus, family Caulimoviridae), but their detection is complicated by the existence of integrated badnavirus sequences in some yam genomes. To date, only two badnavirus genomes have been characterised, namely, Dioscorea bacilliform AL virus (DBALV) and Dioscorea bacilliform SN virus (DBSNV). A further 10 tentative species in yam have been described based on their partial reverse transcriptase (RT)-ribonuclease H (RNaseH) sequences, generically referred to here as Dioscorea bacilliform viruses (DBVs). Further characterisation of DBV species is necessary to determine which represent episomal viruses and which are only present as integrated badnavirus sequences in some yam genomes. In this study, a sequence-independent multiply-primed rolling circle amplification (RCA) method was evaluated for selective amplification of episomal DBV genomes. This resulted in the identification and characterisation of nine complete genomic sequences (7.4–7.7 kbp) of existing and previously undescribed DBV phylogenetic groups from Dioscorea alata and Dioscorea rotundata accessions. These new yam badnavirus genomes expand our understanding of the diversity and genomic organisation of DBVs, and assist the development of improved diagnostic tools. Our findings also suggest that mixed badnavirus infections occur relatively often in West African yam germplasm. PMID:27399761

  10. The amino acid sequence of goat beta-lactoglobulin.

    PubMed

    Préaux, G; Braunitzer, G; Schrank, B; Stangl, A

    1979-11-01

    The isolation of beta-lactoglobulin from milk of the goat is described. The purified protein was checked for purity and has been characterized by its gross composition and end groups. The native or the modified protein was then degraded by tryptic and cyanogen bromide cleavage. The cleavage products were isolated and sequenced in the sequenator using a Quadrol and propyne program. These data provide the complete sequence of beta-lactoglobulin of the goat. The results are discussed and compared particularly with bovine beta-lactoglobulin components AB. Some biological aspects are described.

  11. Complete genome sequence analysis of Pseudomonas aeruginosa N002 reveals its genetic adaptation for crude oil degradation.

    PubMed

    Das, Dhrubajyoti; Baruah, Reshita; Sarma Roy, Abhijit; Singh, Anil Kumar; Deka Boruah, Hari Prasanna; Kalita, Jatin; Bora, Tarun Chandra

    2015-03-01

    The present research work reports the whole genome sequence analysis of Pseudomonas aeruginosa strain N002, which was isolated from crude oil contaminated soil of Assam, India, and has a high crude oil degradation ability. The whole genome of strain N002 was sequenced by shotgun sequencing using the Ion Torrent method, and complete genome sequence analysis was performed. Strain N002 showed versatility in degrading, emulsifying and metabolizing crude oil. Analysis of clusters of orthologous groups (COG) revealed that N002 has significantly higher gene abundance for cell motility, lipid transport and metabolism, intracellular trafficking, secretion and vesicular transport, secondary metabolite biosynthesis, transport and catabolism, signal transduction mechanism and transcription than average levels found in other genome sequences of the same bacterial species. However, lower gene abundance for carbohydrate transport and metabolism, replication, recombination and repair, translation, ribosomal structure, biogenesis was observed in N002 than average levels of other bacterial species.

  12. Layered materials with coexisting acidic and basic sites for catalytic one-pot reaction sequences.

    PubMed

    Motokura, Ken; Tada, Mizuki; Iwasawa, Yasuhiro

    2009-06-17

    Acidic montmorillonite-immobilized primary amines (H-mont-NH(2)) were found to be excellent acid-base bifunctional catalysts for one-pot reaction sequences, which are the first materials with coexisting acid and base sites active for acid-base tandem reactions. For example, tandem deacetalization-Knoevenagel condensation proceeded successfully with the H-mont-NH(2), affording the corresponding condensation product in a quantitative yield. The acidity of the H-mont-NH(2) was strongly influenced by the preparation solvent, and the base-catalyzed reactions were enhanced by interlayer acid sites.

  13. Synthesis of gamma,delta-unsaturated glycolic acids via sequenced Brook and Ireland-Claisen rearrangements.

    PubMed

    Schmitt, Daniel C; Johnson, Jeffrey S

    2010-03-05

    Organozinc, -magnesium, and -lithium nucleophiles initiate a Brook/Ireland-Claisen rearrangement sequence of allylic silyl glyoxylates resulting in the formation of gamma,delta-unsaturated alpha-silyloxy acids.

  14. Computer Simulation of the Determination of Amino Acid Sequences in Polypeptides

    ERIC Educational Resources Information Center

    Daubert, Stephen D.; Sontum, Stephen F.

    1977-01-01

    Describes a computer program that generates a random string of amino acids and guides the student in determining the correct sequence of a given protein by using experimental analytic data for that protein. (MLH)

  15. Computational sequence analysis of predicted long dsRNA transcriptomes of major crops reveals sequence complementarity with human genes.

    PubMed

    Jensen, Peter D; Zhang, Yuanji; Wiggins, B Elizabeth; Petrick, Jay S; Zhu, Jin; Kerstetter, Randall A; Heck, Gregory R; Ivashuta, Sergey I

    2013-01-01

    Long double-stranded RNAs (long dsRNAs) are precursors for the effector molecules of sequence-specific RNA-based gene silencing in eukaryotes. Plant cells can contain numerous endogenous long dsRNAs. This study demonstrates that such endogenous long dsRNAs in plants have sequence complementarity to human genes. Many of these complementary long dsRNAs have perfect sequence complementarity of at least 21 nucleotides to human genes; enough complementarity to potentially trigger gene silencing in targeted human cells if delivered in functional form. However, the number and diversity of long dsRNA molecules in plant tissue from crops such as lettuce, tomato, corn, soy and rice with complementarity to human genes that have a long history of safe consumption supports a conclusion that long dsRNAs do not present a significant dietary risk.

  16. Re-sequencing regions of the ovine Y chromosome in domestic and wild sheep reveals novel paternal haplotypes.

    PubMed

    Meadows, J R S; Kijas, J W

    2009-02-01

    The male-specific region of the ovine Y chromosome (MSY) remains poorly characterized, yet sequence variants from this region have the potential to reveal the wild progenitor of domestic sheep or examples of domestic and wild paternal introgression. The 5' promoter region of the sex-determining gene SRY was re-sequenced using a subset of wild sheep including bighorn (Ovis canadensis), thinhorn (Ovis dalli spp.), urial (Ovis vignei), argali (Ovis ammon), mouflon (Ovis musimon) and domestic sheep (Ovis aries). Seven novel SNPs (oY2-oY8) were revealed; these were polymorphic between but not within species. Re-sequencing and fragment analysis were applied to the MSY microsatellite SRYM18. It contains a complex compound repeat structure and sequencing of three novel size fragments revealed that a pentanucleotide element remained fixed, whilst a dinucleotide element displayed variability within species. Comparison of the sequence between species revealed that urial and argali sheep grouped more closely to the mouflon and domestic breeds than the pachyceriforms (bighorn and thinhorn). SNP and microsatellite data were combined to define six previously undetected haplotypes. Analysis revealed the mouflon as the only species to share a haplotype with domestic sheep, consistent with its status as a feral domesticate that has undergone male-mediated exchange with domestic animals. A comparison of the remaining wild species and domestic sheep revealed that O. aries is free from signatures of wild sheep introgression.

  17. EEG microstate sequences in healthy humans at rest reveal scale-free dynamics

    PubMed Central

    Van De Ville, Dimitri; Britz, Juliane; Michel, Christoph M.

    2010-01-01

    Recent findings identified electroencephalography (EEG) microstates as the electrophysiological correlates of fMRI resting-state networks. Microstates are defined as short periods (100 ms) during which the EEG scalp topography remains quasi-stable; that is, the global topography is fixed but strength might vary and polarity invert. Microstates represent the subsecond coherent activation within global functional brain networks. Surprisingly, these rapidly changing EEG microstates correlate significantly with activity in fMRI resting-state networks after convolution with the hemodynamic response function that constitutes a strong temporal smoothing filter. We postulate here that microstate sequences should reveal scale-free, self-similar dynamics to explain this remarkable effect and thus that microstate time series show dependencies over long time ranges. To that aim, we deploy wavelet-based fractal analysis that allows determining scale-free behavior. We find strong statistical evidence that microstate sequences are scale free over six dyadic scales covering the 256-ms to 16-s range. The degree of long-range dependency is maintained when shuffling the local microstate labels but becomes indistinguishable from white noise when equalizing microstate durations, which indicates that temporal dynamics are their key characteristic. These results advance the understanding of temporal dynamics of brain-scale neuronal network models such as the global workspace model. Whereas microstates can be considered the “atoms of thoughts,” the shortest constituting elements of cognition, they carry a dynamic signature that is reminiscent at characteristic timescales up to multiple seconds. The scale-free dynamics of the microstates might be the basis for the rapid reorganization and adaptation of the functional networks of the brain. PMID:20921381

  18. Targeted next-generation sequencing of candidate genes reveals novel mutations in patients with dilated cardiomyopathy

    PubMed Central

    ZHAO, YUE; FENG, YUE; ZHANG, YUN-MEI; DING, XIAO-XUE; SONG, YU-ZHU; ZHANG, A-MEI; LIU, LI; ZHANG, HONG; DING, JIA-HUAN; XIA, XUE-SHAN

    2015-01-01

    Dilated cardiomyopathy (DCM) is a major cause of sudden cardiac death and heart failure, and it is characterized by genetic and clinical heterogeneity, even for some patients with a very poor clinical prognosis; in the majority of cases, DCM necessitates a heart transplant. Genetic mutations have long been considered to be associated with this disease. At present, mutations in over 50 genes related to DCM have been documented. This study was carried out to elucidate the characteristics of gene mutations in patients with DCM. The candidate genes that may cause DCM include MYBPC3, MYH6, MYH7, LMNA, TNNT2, TNNI3, MYPN, MYL3, TPM1, SCN5A, DES, ACTC1 and RBM20. Using next-generation sequencing (NGS) and subsequent mutation confirmation with traditional capillary Sanger sequencing analysis, possible causative non-synonymous mutations were identified in ~57% (12/21) of patients with DCM. As a result, 7 novel mutations (MYPN, p.E630K; TNNT2, p.G180A; MYH6, p.R1047C; TNNC1, p.D3V; DES, p.R386H; MYBPC3, p.C1124F; and MYL3, p.D126G), 3 variants of uncertain significance (RBM20, p.R1182H; MYH6, p.T1253M; and VCL, p.M209L), and 2 known mutations (MYH7, p.A26V and MYBPC3, p.R160W) were revealed to be associated with DCM. The mutations were most frequently found in the sarcomere (MYH6, MYBPC3, MYH7, TNNC1, TNNT2 and MYL3) and cytoskeletal (MYPN, DES and VCL) genes. As genetic testing is a useful tool in the clinical management of disease, testing for pathogenic mutations is beneficial to the treatment of patients with DCM and may assist in predicting disease risk for their family members before the onset of symptoms. PMID:26458567

  19. Genome and Transcriptome Sequences Reveal the Specific Parasitism of the Nematophagous Purpureocillium lilacinum 36-1

    PubMed Central

    Xie, Jialian; Li, Shaojun; Mo, Chenmi; Xiao, Xueqiong; Peng, Deliang; Wang, Gaofeng; Xiao, Yannong

    2016-01-01

    Purpureocillium lilacinum is a promising nematophagous ascomycete able to adapt to diverse environments and it is also an opportunistic fungus that infects humans. A microbial inoculant of P. lilacinum has been registered to control plant parasitic nematodes. However, the molecular mechanism of the toxicological processes is still unclear because of the relatively few reports on the subject. In this study, using Illumina paired-end sequencing, the draft genome sequence and the transcriptome of P. lilacinum strain 36-1 infecting nematode-eggs were determined. Whole genome alignment indicated that P. lilacinum 36-1 possessed a more dynamic genome in comparison with P. lilacinum India strain. Moreover, a phylogenetic analysis showed that the P. lilacinum 36-1 had a closer relation to entomophagous fungi. The protein-coding genes in P. lilacinum 36-1 occurred much more frequently than they did in other fungi, which was a result of the depletion of repeat-induced point mutations (RIP). Comparative genome and transcriptome analyses revealed the genes that were involved in pathogenicity, particularly in the recognition, adhesion of nematode-eggs, downstream signal transduction pathways and hydrolase genes. By contrast, certain numbers of cellulose and xylan degradation genes and a lack of polysaccharide lyase genes showed the potential of P. lilacinum 36-1 as an endophyte. Notably, the expression of appressorium-formation and antioxidants-related genes exhibited similar infection patterns in P. lilacinum strain 36-1 to those of the model entomophagous fungi Metarhizium spp. These results uncovered the specific parasitism of P. lilacinum and presented the genes responsible for the infection of nematode-eggs. PMID:27486440

  20. Genome-Wide Sequencing Reveals MicroRNAs Downregulated in Cerebral Cavernous Malformations.

    PubMed

    Kar, Souvik; Bali, Kiran Kumar; Baisantry, Arpita; Geffers, Robert; Samii, Amir; Bertalanffy, Helmut

    2017-02-01

    Cerebral cavernous malformations (CCM) are vascular lesions associated with loss-of-function mutations in one of the three genes encoding KRIT1 (CCM1), CCM2, and PDCD10. Recent understanding of the molecular mechanisms that lead to CCM development is limited. The role of microRNAs (miRNAs) has been demonstrated in vascular pathologies resulting in loss of tight junction proteins, increased vascular permeability and endothelial cell dysfunction. Since the relevance of miRNAs in CCM pathophysiology has not been elucidated, the primary aim of the study was to identify the miRNA-mRNA expression network associated with CCM. Using small RNA sequencing, we identified a total of 764 matured miRNAs expressed in CCM patients compared to the healthy brains. The expression of the selected miRNAs was validated by qRT-PCR, and the results were found to be consistent with the sequencing data. Upon application of additional statistical stringency, five miRNAs (let-7b-5p, miR-361-5p, miR-370-3p, miR-181a-2-3p, and miR-95-3p) were prioritized to be top CCM-relevant miRNAs. Further in silico analyses revealed that the prioritized miRNAs have a direct functional relation with mRNAs, such as MIB1, HIF1A, PDCD10, TJP1, OCLN, HES1, MAPK1, VEGFA, EGFL7, NF1, and ENG, which are previously characterized as key regulators of CCM pathology. To date, this is the first study to investigate the role of miRNAs in CCM pathology. By employing cutting edge molecular and in silico analyses on clinical samples, the current study reports global miRNA expression changes in CCM patients and provides a rich source of data set to understand detailed molecular machinery involved in CCM pathophysiology.

  1. Genome sequence of the acid-tolerant strain Rhizobium sp. LPU83.

    PubMed

    Wibberg, Daniel; Tejerizo, Gonzalo Torres; Del Papa, María Florencia; Martini, Carla; Pühler, Alfred; Lagares, Antonio; Schlüter, Andreas; Pistorio, Mariano

    2014-04-20

    Rhizobia are important members of the soil microbiome since they enter into nitrogen-fixing symbiosis with different legume host plants. Rhizobium sp. LPU83 is an acid-tolerant Rhizobium strain featuring a broad-host-range. However, it is ineffective in nitrogen fixation. Here, the improved draft genome sequence of this strain is reported. Genome sequence information provides the basis for analysis of its acid tolerance, symbiotic properties and taxonomic classification.

  2. Coronavirus genome: prediction of putative functional domains in the non-structural polyprotein by comparative amino acid sequence analysis.

    PubMed Central

    Gorbalenya, A E; Koonin, E V; Donchenko, A P; Blinov, V M

    1989-01-01

    Amino acid sequences of 2 giant non-structural polyproteins (F1 and F2) of infectious bronchitis virus (IBV), a member of Coronaviridae, were compared, by computer-assisted methods, to sequences of a number of other positive strand RNA viral and cellular proteins. By this approach, juxtaposed putative RNA-dependent RNA polymerase, nucleic acid binding ("finger"-like) and RNA helicase domains were identified in F2. Together, these domains might constitute the core of the protein complex involved in the primer-dependent transcription, replication and recombination of coronaviruses. In F1, two cysteine protease-like domains and a growth factor-like one were revealed. One of the putative proteases of IBV is similar to 3C proteases of picornaviruses and related enzymes of como-, nepo- and potyviruses. Search of IBV F1 and F2 sequences for sites similar to those cleaved by the latter proteases and intercomparison of the surrounding sequence stretches revealed 13 dipeptides Q/S(G) which are probably cleaved by the coronavirus 3C-like protease. Based on these observations, a partial tentative scheme for the functional organization and expression strategy of the non-structural polyproteins of IBV was proposed. It implies that, despite the general similarity to other positive strand RNA viruses, and particularly to potyviruses, coronaviruses possess a number of unique structural and functional features. PMID:2526320
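
    A first-pass scan for the Q/S(G) dipeptides mentioned above could look like the following. It is only a sketch: the function name is hypothetical, and a bare pattern match will report every Q-S or Q-G pair, whereas the record describes narrowing candidates by comparing the surrounding sequence stretches.

```python
import re


def candidate_q_sg_sites(protein):
    """Return 1-based positions of Q residues followed by S or G.

    The lookahead keeps the match on the Q itself (the residue before the
    putative scissile bond) so adjacent candidates are all reported.
    """
    return [m.start() + 1 for m in re.finditer(r"Q(?=[SG])", protein)]


if __name__ == "__main__":
    toy = "MAQSLLQGTTQA"  # made-up sequence for illustration only
    print(candidate_q_sg_sites(toy))
```

    On a real polyprotein the list would need filtering by flanking residues before any site is treated as a likely cleavage point.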

  3. Exome sequencing reveals VCP mutations as a cause of familial ALS

    PubMed Central

    Johnson, Janel O.; Mandrioli, Jessica; Benatar, Michael; Abramzon, Yevgeniya; Van Deerlin, Vivianna M.; Trojanowski, John Q.; Gibbs, J Raphael; Brunetti, Maura; Gronka, Susan; Wuu, Joanne; Ding, Jinhui; McCluskey, Leo; Martinez-Lage, Maria; Falcone, Dana; Hernandez, Dena G.; Arepalli, Sampath; Chong, Sean; Schymick, Jennifer C.; Rothstein, Jeffrey; Landi, Francesco; Wang, Michael; Calvo, Andrea; Mora, Gabriele; Sabatelli, Mario; Monsurrò, Maria Rosaria; Battistini, Stefania; Salvi, Fabrizio; Spataro, Rossella; Sola, Patrizia; Borghero, Giuseppe; Galassi, Giuliana; Scholz, Sonja W.; Taylor, J. Paul; Restagno, Gabriella; Chiò, Adriano; Traynor, Bryan J.

    2010-01-01

    Summary: Using exome sequencing, we identified a p.R191Q amino acid change in the valosin-containing protein (VCP) gene in an Italian family with autosomal dominantly inherited amyotrophic lateral sclerosis (ALS). Mutations in VCP have previously been identified in families with Inclusion Body Myopathy, Paget’s disease and Frontotemporal Dementia (IBMPFD). Screening of VCP in a cohort of 210 familial ALS cases and 78 autopsy-proven ALS cases identified four additional mutations including a p.R155H mutation in a pathologically-proven case of ALS. VCP protein is essential for maturation of ubiquitin-containing autophagosomes, and mutant VCP toxicity is partially mediated through its effect on TDP-43 protein, a major constituent of ubiquitin inclusions that neuropathologically characterize ALS. Our data broaden the phenotype of IBMPFD to include motor neuron degeneration, suggest that VCP mutations may account for ~1–2% of familial ALS, and represent the first evidence directly implicating defects in the ubiquitination/protein degradation pathway in motor neuron degeneration. PMID:21145000

  4. Exploration of the arrest peptide sequence space reveals arrest-enhanced variants.

    PubMed

    Cymer, Florian; Hedman, Rickard; Ismail, Nurzian; von Heijne, Gunnar

    2015-04-17

    Translational arrest peptides (APs) are short stretches of polypeptides that induce translational stalling when synthesized on a ribosome. Mechanical pulling forces acting on the nascent chain can weaken or even abolish stalling. APs can therefore be used as in vivo force sensors, making it possible to measure the forces that act on a nascent chain during translation with single-residue resolution. It is also possible to score the relative strengths of APs by subjecting them to a given pulling force and ranking them according to stalling efficiency. Using the latter approach, we now report an extensive mutagenesis scan of a strong mutant variant of the Mannheimia succiniciproducens SecM AP and identify mutations that further increase the stalling efficiency. Combining three such mutations, we designed an AP that withstands the strongest pulling force we are able to generate at present. We further show that diproline stretches in a nascent protein act as very strong APs when translation is carried out in the absence of elongation factor P. Our findings highlight critical residues in APs, show that certain amino acid sequences induce very strong translational arrest and provide a toolbox of APs of varying strengths that can be used for in vivo force measurements.

  5. The amino acid sequence of monal pheasant lysozyme and its activity.

    PubMed

    Araki, T; Matsumoto, T; Torikata, T

    1998-10-01

    The amino acid sequence of monal pheasant lysozyme and its activity were analyzed. Carboxymethylated lysozyme was digested with trypsin and the resulting peptides were sequenced. The established amino acid sequence had one amino acid substitution at position 102 (Arg to Gly) compared with Indian peafowl lysozyme and four amino acid substitutions at positions 3 (Phe to Tyr), 15 (His to Leu), 41 (Gln to His), and 121 (Gln to His) compared with chicken lysozyme. Analysis of the time-courses of reaction using N-acetylglucosamine pentamer as a substrate showed a difference of binding free energy change (-0.4 kcal/mol) at subsites A between monal pheasant and Indian peafowl lysozyme. This was assumed to be caused by the amino acid substitution at subsite A with loss of a positive charge at position 102 (Arg102 to Gly).

  6. Single-chain structure of human ceruloplasmin: the complete amino acid sequence of the whole molecule.

    PubMed Central

    Takahashi, N; Ortel, T L; Putnam, F W

    1984-01-01

    We have determined the amino acid sequence of the amino-terminal 67,000-dalton (67-kDa) fragment of human ceruloplasmin and have established overlapping sequences between the 67-kDa and 50-kDa fragments and between the 50-kDa and 19-kDa fragments. The 67-kDa fragment contains 480 amino acid residues and three glucosamine oligosaccharides. These results together with our previous sequence data for the 50-kDa and 19-kDa fragments complete the amino acid sequence of human ceruloplasmin. The polypeptide chain has a total of 1,046 amino acid residues (Mr 120,085) and has attachment sites for four glucosamine oligosaccharides; together these account for the total molecular mass of human ceruloplasmin (132 kDa). The sequence analysis of the peptides overlapping the fragments showed that one additional amino acid, arginine, is present between the 67-kDa and 50-kDa fragments, and another, lysine, is between the 50-kDa and 19-kDa fragments. Only two apparent sites of amino acid interchange have been identified in the polypeptide chain. Both involve a single-point interchange of glycine and lysine that would result in a difference in charge. The results of the complete sequence analysis verified that human ceruloplasmin is composed of a single polypeptide chain and that the subunit-like fragments are produced by proteolytic cleavage during purification (and possibly also in vivo). PMID:6582496
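
    The figures quoted above can be cross-checked with a little arithmetic. The constants below are taken from the abstract; the function names are hypothetical, and treating the two bridging residues (arginine and lysine) as separate from the three fragments is an assumption about how the counts were reported.

```python
TOTAL_RESIDUES = 1046        # whole polypeptide chain
POLYPEPTIDE_MR = 120_085     # Mr of the polypeptide alone
GLYCOPROTEIN_MASS = 132_000  # total molecular mass, 132 kDa
N_OLIGOSACCHARIDES = 4       # glucosamine oligosaccharide attachment sites
RESIDUES_67K = 480           # amino-terminal 67-kDa fragment
BRIDGING_RESIDUES = 2        # Arg and Lys between fragments (assumed separate)


def carbohydrate_mass():
    """Mass not accounted for by the polypeptide, in daltons."""
    return GLYCOPROTEIN_MASS - POLYPEPTIDE_MR


def mean_oligosaccharide_mass():
    """Average mass per oligosaccharide if all four were equal (assumption)."""
    return carbohydrate_mass() / N_OLIGOSACCHARIDES


def mean_residue_mass():
    """Average mass per amino acid residue, in daltons."""
    return POLYPEPTIDE_MR / TOTAL_RESIDUES


def residues_in_50k_and_19k():
    """Residues left for the 50-kDa and 19-kDa fragments combined."""
    return TOTAL_RESIDUES - RESIDUES_67K - BRIDGING_RESIDUES


if __name__ == "__main__":
    print(carbohydrate_mass(), mean_oligosaccharide_mass())
    print(round(mean_residue_mass(), 1), residues_in_50k_and_19k())
```

    Under these assumptions the carbohydrate accounts for roughly 12 kDa of the 132 kDa total.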

  7. Multiple Genome Sequences of Important Beer-Spoiling Lactic Acid Bacteria

    PubMed Central

    Geissler, Andreas J.; Vogel, Rudi F.

    2016-01-01

    Seven strains of important beer-spoiling lactic acid bacteria were sequenced using single-molecule real-time sequencing. Complete genomes were obtained for strains of Lactobacillus paracollinoides, Lactobacillus lindneri, and Pediococcus claussenii. The analysis of these genomes emphasizes the role of plasmids as the genomic foundation of beer-spoiling ability. PMID:27795248

  8. Whole-Genome Sequencing Reveals Diverse Models of Structural Variations in Esophageal Squamous Cell Carcinoma.

    PubMed

    Cheng, Caixia; Zhou, Yong; Li, Hongyi; Xiong, Teng; Li, Shuaicheng; Bi, Yanghui; Kong, Pengzhou; Wang, Fang; Cui, Heyang; Li, Yaoping; Fang, Xiaodong; Yan, Ting; Li, Yike; Wang, Juan; Yang, Bin; Zhang, Ling; Jia, Zhiwu; Song, Bin; Hu, Xiaoling; Yang, Jie; Qiu, Haile; Zhang, Gehong; Liu, Jing; Xu, Enwei; Shi, Ruyi; Zhang, Yanyan; Liu, Haiyan; He, Chanting; Zhao, Zhenxiang; Qian, Yu; Rong, Ruizhou; Han, Zhiwei; Zhang, Yanlin; Luo, Wen; Wang, Jiaqian; Peng, Shaoliang; Yang, Xukui; Li, Xiangchun; Li, Lin; Fang, Hu; Liu, Xingmin; Ma, Li; Chen, Yunqing; Guo, Shiping; Chen, Xing; Xi, Yanfeng; Li, Guodong; Liang, Jianfang; Yang, Xiaofeng; Guo, Jiansheng; Jia, JunMei; Li, Qingshan; Cheng, Xiaolong; Zhan, Qimin; Cui, Yongping

    2016-02-04

    Comprehensive identification of somatic structural variations (SVs) and understanding their mutational mechanisms in cancer might contribute to understanding biological differences and help to identify new therapeutic targets. Unfortunately, characterization of complex SVs across the whole genome and the mutational mechanisms underlying esophageal squamous cell carcinoma (ESCC) is largely unclear. To define a comprehensive catalog of somatic SVs, affected target genes, and their underlying mechanisms in ESCC, we re-analyzed whole-genome sequencing (WGS) data from 31 ESCCs using Meerkat algorithm to predict somatic SVs and Patchwork to determine copy-number changes. We found deletions and translocations with NHEJ and alt-EJ signature as the dominant SV types, and 16% of deletions were complex deletions. SVs frequently led to disruption of cancer-associated genes (e.g., CDKN2A and NOTCH1) with different mutational mechanisms. Moreover, chromothripsis, kataegis, and breakage-fusion-bridge (BFB) were identified as contributing to locally mis-arranged chromosomes that occurred in 55% of ESCCs. These genomic catastrophes led to amplification of oncogene through chromothripsis-derived double-minute chromosome formation (e.g., FGFR1 and LETM2) or BFB-affected chromosomes (e.g., CCND1, EGFR, ERBB2, MMPs, and MYC), with approximately 30% of ESCCs harboring BFB-derived CCND1 amplification. Furthermore, analyses of copy-number alterations reveal high frequency of whole-genome duplication (WGD) and recurrent focal amplification of CDCA7 that might act as a potential oncogene in ESCC. Our findings reveal molecular defects such as chromothripsis and BFB in malignant transformation of ESCCs and demonstrate diverse models of SVs-derived target genes in ESCCs. These genome-wide SV profiles and their underlying mechanisms provide preventive, diagnostic, and therapeutic implications for ESCCs.

  9. Whole-Genome Sequencing Reveals Diverse Models of Structural Variations in Esophageal Squamous Cell Carcinoma

    PubMed Central

    Cheng, Caixia; Zhou, Yong; Li, Hongyi; Xiong, Teng; Li, Shuaicheng; Bi, Yanghui; Kong, Pengzhou; Wang, Fang; Cui, Heyang; Li, Yaoping; Fang, Xiaodong; Yan, Ting; Li, Yike; Wang, Juan; Yang, Bin; Zhang, Ling; Jia, Zhiwu; Song, Bin; Hu, Xiaoling; Yang, Jie; Qiu, Haile; Zhang, Gehong; Liu, Jing; Xu, Enwei; Shi, Ruyi; Zhang, Yanyan; Liu, Haiyan; He, Chanting; Zhao, Zhenxiang; Qian, Yu; Rong, Ruizhou; Han, Zhiwei; Zhang, Yanlin; Luo, Wen; Wang, Jiaqian; Peng, Shaoliang; Yang, Xukui; Li, Xiangchun; Li, Lin; Fang, Hu; Liu, Xingmin; Ma, Li; Chen, Yunqing; Guo, Shiping; Chen, Xing; Xi, Yanfeng; Li, Guodong; Liang, Jianfang; Yang, Xiaofeng; Guo, Jiansheng; Jia, JunMei; Li, Qingshan; Cheng, Xiaolong; Zhan, Qimin; Cui, Yongping

    2016-01-01

    Comprehensive identification of somatic structural variations (SVs) and understanding their mutational mechanisms in cancer might contribute to understanding biological differences and help to identify new therapeutic targets. Unfortunately, characterization of complex SVs across the whole genome and the mutational mechanisms underlying esophageal squamous cell carcinoma (ESCC) is largely unclear. To define a comprehensive catalog of somatic SVs, affected target genes, and their underlying mechanisms in ESCC, we re-analyzed whole-genome sequencing (WGS) data from 31 ESCCs using Meerkat algorithm to predict somatic SVs and Patchwork to determine copy-number changes. We found deletions and translocations with NHEJ and alt-EJ signature as the dominant SV types, and 16% of deletions were complex deletions. SVs frequently led to disruption of cancer-associated genes (e.g., CDKN2A and NOTCH1) with different mutational mechanisms. Moreover, chromothripsis, kataegis, and breakage-fusion-bridge (BFB) were identified as contributing to locally mis-arranged chromosomes that occurred in 55% of ESCCs. These genomic catastrophes led to amplification of oncogene through chromothripsis-derived double-minute chromosome formation (e.g., FGFR1 and LETM2) or BFB-affected chromosomes (e.g., CCND1, EGFR, ERBB2, MMPs, and MYC), with approximately 30% of ESCCs harboring BFB-derived CCND1 amplification. Furthermore, analyses of copy-number alterations reveal high frequency of whole-genome duplication (WGD) and recurrent focal amplification of CDCA7 that might act as a potential oncogene in ESCC. Our findings reveal molecular defects such as chromothripsis and BFB in malignant transformation of ESCCs and demonstrate diverse models of SVs-derived target genes in ESCCs. These genome-wide SV profiles and their underlying mechanisms provide preventive, diagnostic, and therapeutic implications for ESCCs. PMID:26833333

  10. Fingerprinting the Asterid Species Using Subtracted Diversity Array Reveals Novel Species-Specific Sequences

    PubMed Central

    Mantri, Nitin; Olarte, Alexandra; Li, Chun Guang; Xue, Charlie; Pang, Edwin C. K.

    2012-01-01

    Background: Asterids is one of the major plant clades, comprising many commercially important medicinal species. One of the major concerns in the medicinal plant industry is adulteration/contamination resulting from misidentification of herbal plants. This study reports the construction and validation of a microarray capable of fingerprinting medicinally important species from the Asterids clade. Methodology/Principal Findings: Pooled genomic DNA of 104 non-asterid angiosperm and non-angiosperm species was subtracted from pooled genomic DNA of 67 asterid species. Subsequently, 283 subtracted DNA fragments were used to construct an Asterid-specific array. The validation of the Asterid-specific array revealed a high (99.5%) subtraction efficiency. Twenty-five Asterid species (mostly medicinal) representing 20 families and 9 orders within the clade were hybridized onto the array to reveal its level of species discrimination. All these species could be successfully differentiated using their hybridization patterns. A number of species-specific probes were identified for commercially important species like tea, coffee, dandelion, yarrow, motherwort, Japanese honeysuckle, valerian, wild celery, and yerba mate. Thirty-seven polymorphic probes were characterized by sequencing. A large number of probes were novel species-specific probes whilst some of them were from chloroplast region including genes like atpB, rpoB, and ndh that have extensively been used for fingerprinting and phylogenetic analysis of plants. Conclusions/Significance: Subtracted Diversity Array technique is highly efficient in fingerprinting species with little or no genomic information. The Asterid-specific array could fingerprint all 25 species assessed including three species that were not used in constructing the array. This study validates the use of chloroplast genes for bar-coding (fingerprinting) plant species. In addition, this method allowed detection of several new loci that can be explored to solve

  11. Evolutionary dynamics of influenza A nucleoprotein (NP) lineages revealed by large-scale sequence analyses.

    PubMed

    Xu, Jianpeng; Christman, Mary C; Donis, Ruben O; Lu, Guoqing

    2011-12-01

    Influenza A viral nucleoprotein (NP) plays a critical role in virus replication and host adaptation; however, the underlying molecular evolutionary dynamics of NP lineages are less well understood. In this study, large-scale analyses of 5094 NP nucleotide sequences revealed eight distinct evolutionary lineages, including three host-specific lineages (human, classical swine and equine), two cross-host lineages (Eurasian avian-like swine and swine-origin human pandemic H1N1 2009) and three geographically isolated avian lineages (Eurasian, North American and Oceanian). The average nucleotide substitution rate of the NP lineages was estimated to be 2.4 × 10^-3 substitutions per site per year, with the highest value observed in pandemic H1N1 2009 (3.4 × 10^-3) and the lowest in equine (0.9 × 10^-3). The estimated time of most recent common ancestor (TMRCA) for each lineage demonstrated that the earliest human lineage was derived around 1906, and the latest pandemic H1N1 2009 lineage dated back to December 17, 2008. A marked time gap was found between the times when the viruses emerged and were first sampled, suggesting the crucial role for long-term surveillance of newly emerging viruses. The selection analyses showed that human lineage had six positive selection sites, whereas pandemic H1N1 2009, classical swine, Eurasian avian and Eurasian swine had only one or two sites. Protein structure analyses revealed several positive selection sites located in epitope regions or host adaptation regions, indicating strong adaptation to host immune system pressures in influenza viruses. Along with previous studies, this study provides new insights into the evolutionary dynamics of influenza A NP lineages. Further lineage analyses of other gene segments will allow better understanding of influenza A virus evolution and assist in the improvement of global influenza surveillance.

  12. Genome sequencing and analysis reveals possible determinants of Staphylococcus aureus nasal carriage

    PubMed Central

    Sivaraman, Karthikeyan; Venkataraman, Nitya; Tsai, Jennifer; Dewell, Scott; Cole, Alexander M

    2008-01-01

    Background: Nasal carriage of Staphylococcus aureus is a major risk factor in clinical and community settings due to the range of etiologies caused by the organism. We have identified unique immunological and ultrastructural properties associated with nasal carriage isolates denoting a role for bacterial factors in nasal carriage. However, despite extensive molecular level characterizations by several groups suggesting factors necessary for colonization on nasal epithelium, genetic determinants of nasal carriage are unknown. Herein, we have set a genomic foundation for unraveling the bacterial determinants of nasal carriage in S. aureus. Results: MLST analysis revealed no lineage specific differences between carrier and non-carrier strains suggesting a role for mobile genetic elements. We completely sequenced a model carrier isolate (D30) and a model non-carrier strain (930918-3) to identify differential gene content. Comparison revealed the presence of 84 genes unique to the carrier strain and strongly suggests a role for Type VII secretion systems in nasal carriage. These genes, along with a putative pathogenicity island (SaPIBov) present uniquely in the carrier strains are likely important in affecting carriage. Further, PCR-based genotyping of other clinical isolates for a specific subset of these 84 genes raises the possibility of nasal carriage being caused by multiple gene sets. Conclusion: Our data suggest that carriage is likely a heterogeneous phenotypic trait and imply a role for nucleotide level polymorphism in carriage. Complete genome level analyses of multiple carriage strains of S. aureus will be important in clarifying molecular determinants of S. aureus nasal carriage. PMID:18808706

  13. Cloning and nucleotide sequencing of a novel 7 beta-(4-carboxybutanamido)cephalosporanic acid acylase gene of Bacillus laterosporus and its expression in Escherichia coli and Bacillus subtilis.

    PubMed

    Aramori, I; Fukagawa, M; Tsumura, M; Iwami, M; Ono, H; Kojo, H; Kohsaka, M; Ueda, Y; Imanaka, H

    1991-12-01

    A strain of Bacillus species which produced an enzyme named glutaryl 7-ACA acylase which converts 7 beta-(4-carboxybutanamido)cephalosporanic acid (glutaryl 7-ACA) to 7-amino cephalosporanic acid (7-ACA) was isolated from soil. The gene for the glutaryl 7-ACA acylase was cloned with pHSG298 in Escherichia coli JM109, and the nucleotide sequence was determined by the M13 dideoxy chain termination method. The DNA sequence revealed only one large open reading frame composed of 1,902 bp corresponding to 634 amino acid residues. The deduced amino acid sequence contained a potential signal sequence in its amino-terminal region. Expression of the gene for glutaryl 7-ACA acylase was performed in both E. coli and Bacillus subtilis. The enzyme preparations purified from either recombinant strain of E. coli or B. subtilis were shown to be identical with each other as regards the profile of sodium dodecyl sulfate-polyacrylamide gel electrophoresis and were composed of a single peptide with the molecular size of 70 kDa. Determination of the amino-terminal sequence of the two enzyme preparations revealed that both amino-terminal sequences (the first nine amino acids) were identical and completely coincided with residues 28 to 36 of the open reading frame. Extracellular excretion of the enzyme was observed in a recombinant strain of B. subtilis.
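
    The numbers in this abstract fit together with some simple arithmetic, sketched below. The function names are hypothetical, the 110 Da average residue mass is a common rule of thumb rather than a figure from the record, and the stop codon is assumed not to be counted in the 1,902 bp.

```python
def codons_in_orf(orf_bp):
    """Number of codons in an open reading frame of `orf_bp` base pairs."""
    if orf_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    return orf_bp // 3


def mature_length(total_residues, first_mature_residue):
    """Residues left after removing everything before `first_mature_residue`."""
    return total_residues - (first_mature_residue - 1)


def approx_mass_kda(n_residues, avg_residue_da=110.0):
    """Rough protein mass from residue count (rule-of-thumb average mass)."""
    return n_residues * avg_residue_da / 1000.0


if __name__ == "__main__":
    residues = codons_in_orf(1902)        # 634, matching the abstract
    mature = mature_length(residues, 28)  # N-terminus starts at residue 28
    print(residues, mature, approx_mass_kda(mature))
```

    With these assumptions the mature enzyme has 607 residues and an estimated mass near 67 kDa, in the same range as the 70 kDa band reported from gel electrophoresis.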

  14. Cloning and nucleotide sequencing of a novel 7 beta-(4-carboxybutanamido)cephalosporanic acid acylase gene of Bacillus laterosporus and its expression in Escherichia coli and Bacillus subtilis.

    PubMed Central

    Aramori, I; Fukagawa, M; Tsumura, M; Iwami, M; Ono, H; Kojo, H; Kohsaka, M; Ueda, Y; Imanaka, H

    1991-01-01

    A strain of Bacillus species which produced an enzyme named glutaryl 7-ACA acylase which converts 7 beta-(4-carboxybutanamido)cephalosporanic acid (glutaryl 7-ACA) to 7-amino cephalosporanic acid (7-ACA) was isolated from soil. The gene for the glutaryl 7-ACA acylase was cloned with pHSG298 in Escherichia coli JM109, and the nucleotide sequence was determined by the M13 dideoxy chain termination method. The DNA sequence revealed only one large open reading frame composed of 1,902 bp corresponding to 634 amino acid residues. The deduced amino acid sequence contained a potential signal sequence in its amino-terminal region. Expression of the gene for glutaryl 7-ACA acylase was performed in both E. coli and Bacillus subtilis. The enzyme preparations purified from either recombinant strain of E. coli or B. subtilis were shown to be identical with each other as regards the profile of sodium dodecyl sulfate-polyacrylamide gel electrophoresis and were composed of a single peptide with the molecular size of 70 kDa. Determination of the amino-terminal sequence of the two enzyme preparations revealed that both amino-terminal sequences (the first nine amino acids) were identical and completely coincided with residues 28 to 36 of the open reading frame. Extracellular excretion of the enzyme was observed in a recombinant strain of B. subtilis. PMID:1744041

  15. Complete genome sequence analysis of novel human bocavirus reveals genetic recombination between human bocavirus 2 and human bocavirus 4.

    PubMed

    Khamrin, Pattara; Okitsu, Shoko; Ushijima, Hiroshi; Maneekarn, Niwat

    2013-07-01

    Epidemiological surveillance of human bocavirus (HBoV) was conducted on fecal specimens collected from hospitalized children with diarrhea in Chiang Mai, Thailand in 2011. By partial sequence analysis of the VP1 gene, an unusual strain of HBoV (CMH-S011-11) was initially identified as HBoV4. The complete genome sequence of CMH-S011-11 was determined and analyzed further to clarify whether it was a recombinant strain or a new HBoV variant. Analysis of the complete genome sequence revealed that the coding sequence spanning NS1, NP1 and VP1/VP2 was 4795 nucleotides long. Interestingly, the nucleotide sequence of the NS1 gene of CMH-S011-11 was most closely related to the HBoV2 reference strains detected in Pakistan, which contradicted the initial genotyping result for the partial VP1 region in the previous study. In addition, comparison of the NP1 nucleotide sequence of CMH-S011-11 with those of other HBoV1-4 reference strains also revealed a high level of sequence identity with HBoV2. On the other hand, the nucleotide sequence of the VP1/VP2 gene of CMH-S011-11 was most closely related to those of HBoV4 reference strains detected in Nigeria. The overall full-length sequence analysis revealed that CMH-S011-11 grouped within the HBoV4 species, but was located on a separate branch from other HBoV4 prototype strains. Recombination analysis revealed that CMH-S011-11 was the result of recombination between HBoV2 and HBoV4 strains, with the break point located near the start codon of VP2.

  16. Revealing the sequence of interactions of PuroA peptide with Candida albicans cells by live-cell imaging

    PubMed Central

    Shagaghi, Nadin; Bhave, Mrinal; Palombo, Enzo A.; Clayton, Andrew H. A.

    2017-01-01

    To determine the mechanism(s) of action of antimicrobial peptides (AMPs) it is desirable to provide details of their interaction kinetics with cellular, sub-cellular and molecular targets. The synthetic peptide, PuroA, displays potent antimicrobial activities which have been attributed to peptide-induced membrane destabilization, or intracellular mechanisms of action (DNA-binding) or both. We used time-lapse fluorescence microscopy and fluorescence lifetime imaging microscopy (FLIM) to directly monitor the localization and interaction kinetics of a FITC-PuroA peptide on single Candida albicans cells in real time. Our results reveal the sequence of events leading to cell death. Within 1 minute, FITC-PuroA was observed to interact with SYTO-labelled nucleic acids, resulting in a noticeable quenching in the fluorescence lifetime of the peptide label at the nucleus of yeast cells, and cell-cycle arrest. A propidium iodide (PI) influx assay confirmed that peptide translocation itself did not disrupt the cell membrane integrity; however, PI entry occurred 25–45 minutes later, which correlated with an increase in fractional fluorescence of pores and an overall loss of cell size. Our results clarify that membrane disruption appears to be the mechanism by which the C. albicans cells are killed and this occurs after FITC-PuroA translocation and binding to intracellular targets. PMID:28252014

  17. Revealing the sequence of interactions of PuroA peptide with Candida albicans cells by live-cell imaging

    NASA Astrophysics Data System (ADS)

    Shagaghi, Nadin; Bhave, Mrinal; Palombo, Enzo A.; Clayton, Andrew H. A.

    2017-03-01

    To determine the mechanism(s) of action of antimicrobial peptides (AMPs) it is desirable to provide details of their interaction kinetics with cellular, sub-cellular and molecular targets. The synthetic peptide, PuroA, displays potent antimicrobial activities which have been attributed to peptide-induced membrane destabilization, or intracellular mechanisms of action (DNA-binding) or both. We used time-lapse fluorescence microscopy and fluorescence lifetime imaging microscopy (FLIM) to directly monitor the localization and interaction kinetics of a FITC-PuroA peptide on single Candida albicans cells in real time. Our results reveal the sequence of events leading to cell death. Within 1 minute, FITC-PuroA was observed to interact with SYTO-labelled nucleic acids, resulting in a noticeable quenching in the fluorescence lifetime of the peptide label at the nucleus of yeast cells, and cell-cycle arrest. A propidium iodide (PI) influx assay confirmed that peptide translocation itself did not disrupt the cell membrane integrity; however, PI entry occurred 25–45 minutes later, which correlated with an increase in fractional fluorescence of pores and an overall loss of cell size. Our results clarify that membrane disruption appears to be the mechanism by which the C. albicans cells are killed and this occurs after FITC-PuroA translocation and binding to intracellular targets.

  18. Genome sequence comparison reveals a candidate gene involved in male-hermaphrodite differentiation in papaya (Carica papaya) trees.

    PubMed

    Ueno, Hiroki; Urasaki, Naoya; Natsume, Satoshi; Yoshida, Kentaro; Tarora, Kazuhiko; Shudo, Ayano; Terauchi, Ryohei; Matsumura, Hideo

    2015-04-01

    The sex type of papaya (Carica papaya) is determined by the pair of sex chromosomes (XX, female; XY, male; and XY(h), hermaphrodite), in which there is a non-recombining genomic region in the Y and Y(h) chromosomes. This region is presumed to be involved in determination of males and hermaphrodites; it is designated as the male-specific region in the Y chromosome (MSY) and the hermaphrodite-specific region in the Y(h) chromosome (HSY). Here, we identified the genes determining male and hermaphrodite sex types by comparing MSY and HSY genomic sequences. In the MSY and HSY genomic regions, we identified 14,528 nucleotide substitutions and 965 short indels with a large gap and two highly diverged regions. In the predicted genes expressed in flower buds, we found no nucleotide differences leading to amino acid changes between the MSY and HSY. However, we found an HSY-specific transposon insertion in a gene (SVP like) showing a similarity to the Short Vegetative Phase (SVP) gene. Study of SVP-like transcripts revealed that the MSY allele encoded an intact protein, while the HSY allele encoded a truncated protein. Our findings demonstrated that the SVP-like gene is a candidate gene for male-hermaphrodite determination in papaya.

  19. PASTA: Ultra-Large Multiple Sequence Alignment for Nucleotide and Amino-Acid Sequences.

    PubMed

    Mirarab, Siavash; Nguyen, Nam; Guo, Sheng; Wang, Li-San; Kim, Junhyong; Warnow, Tandy

    2015-05-01

    We introduce PASTA, a new multiple sequence alignment algorithm. PASTA uses a new technique to produce an alignment given a guide tree that enables it to be both highly scalable and very accurate. We present a study on biological and simulated data with up to 200,000 sequences, showing that PASTA produces highly accurate alignments, improving on the accuracy and scalability of the leading alignment methods (including SATé). We also show that trees estimated on PASTA alignments are highly accurate--slightly better than SATé trees, but with substantial improvements relative to other methods. Finally, PASTA is faster than SATé, highly parallelizable, and requires relatively little memory.

  20. RNA sequencing analysis of human podocytes reveals glucocorticoid regulated gene networks targeting non-immune pathways

    PubMed Central

    Jiang, Lulu; Hindmarch, Charles C. T.; Rogers, Mark; Campbell, Colin; Waterfall, Christy; Coghill, Jane; Mathieson, Peter W.; Welsh, Gavin I.

    2016-01-01

    Glucocorticoids are steroids that reduce inflammation and are used as immunosuppressive drugs for many diseases. They are also the mainstay for the treatment of minimal change nephropathy (MCN), which is characterised by an absence of inflammation. Their mechanisms of action remain elusive. Evidence suggests that immunomodulatory drugs can directly act on glomerular epithelial cells or ‘podocytes’, the cell type which is the main target of injury in MCN. To understand the nature of glucocorticoid effects on non-immune cell functions, we generated RNA sequencing data from human podocyte cell lines and identified the genes that are significantly regulated in dexamethasone-treated podocytes compared to vehicle-treated cells. The upregulated genes are of functional relevance to cytoskeleton-related processes, whereas the downregulated genes mostly encode pro-inflammatory cytokines and growth factors. We observed a tendency for dexamethasone-upregulated genes to be downregulated in MCN patients. Integrative analysis revealed gene networks composed of critical signaling pathways that are likely targeted by dexamethasone in podocytes. PMID:27774996

  1. Methylome sequencing in triple-negative breast cancer reveals distinct methylation clusters with prognostic value.

    PubMed

    Stirzaker, Clare; Zotenko, Elena; Song, Jenny Z; Qu, Wenjia; Nair, Shalima S; Locke, Warwick J; Stone, Andrew; Armstong, Nicola J; Robinson, Mark D; Dobrovic, Alexander; Avery-Kiejda, Kelly A; Peters, Kate M; French, Juliet D; Stein, Sandra; Korbie, Darren J; Trau, Matt; Forbes, John F; Scott, Rodney J; Brown, Melissa A; Francis, Glenn D; Clark, Susan J

    2015-02-02

    Epigenetic alterations in the cancer methylome are common in breast cancer and provide novel options for tumour stratification. Here, we perform whole-genome methylation capture sequencing on small amounts of DNA isolated from formalin-fixed, paraffin-embedded tissue from triple-negative breast cancer (TNBC) and matched normal samples. We identify differentially methylated regions (DMRs) enriched with promoters associated with transcription factor binding sites and DNA hypersensitive sites. Importantly, we stratify TNBCs into three distinct methylation clusters associated with better or worse prognosis and identify 17 DMRs that show a strong association with overall survival, including DMRs located in the Wilms tumour 1 (WT1) gene, bi-directional-promoter and antisense WT1-AS. Our data reveal that coordinated hypermethylation can occur in oestrogen receptor-negative disease, and that characterizing the epigenetic framework provides a potential signature to stratify TNBCs. Together, our findings demonstrate the feasibility of profiling the cancer methylome with limited archival tissue to identify regulatory regions associated with cancer.

  2. Unique Features of a Japanese ‘Candidatus Liberibacter asiaticus’ Strain Revealed by Whole Genome Sequencing

    PubMed Central

    Katoh, Hiroshi; Miyata, Shin-ichi; Inoue, Hiromitsu; Iwanami, Toru

    2014-01-01

    Citrus greening (huanglongbing) is the most destructive disease of citrus worldwide. It is spread by citrus psyllids and is associated with phloem-limited bacteria of three species of α-Proteobacteria, namely, ‘Candidatus Liberibacter asiaticus’, ‘Ca. L. americanus’, and ‘Ca. L. africanus’. Recent findings suggested that some Japanese strains lack the bacteriophage-type DNA polymerase region (DNA pol), in contrast to the Floridian psy62 strain. The whole genome sequence of the pol-negative ‘Ca. L. asiaticus’ Japanese isolate Ishi-1 was determined by metagenomic analysis of DNA extracted from ‘Ca. L. asiaticus’-infected psyllids and leaf midribs. The 1.19-Mb genome has an average 36.32% GC content. Annotation revealed 13 operons encoding rRNA and 44 tRNA genes, but no typical bacterial pathogenesis-related genes were located within the genome, similar to the Floridian psy62 and Chinese gxpsy. In contrast to other ‘Ca. L. asiaticus’ strains, the genome of the Japanese Ishi-1 strain lacks a prophage-related region. PMID:25180586
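
    The GC content figure quoted above (36.32%) is a base-composition statistic. As a minimal sketch of how such a value is typically computed, assuming ambiguous bases such as N are simply excluded (the function name gc_content and the toy sequences are hypothetical, not taken from the study):

```python
def gc_content(seq):
    """Percentage of G and C among the unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    gc = sum(seq.count(base) for base in "GC")
    at = sum(seq.count(base) for base in "AT")
    total = gc + at
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T bases")
    return 100.0 * gc / total


# Toy inputs (hypothetical); a real genome would be read from a FASTA file.
example = gc_content("ATGC")
with_ambiguous = gc_content("ggNNat")
```

    Excluding ambiguous bases from the denominator is one common convention; other tools count them, which gives slightly lower percentages for draft assemblies.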

  3. Family Based Whole Exome Sequencing Reveals the Multifaceted Role of Notch Signaling in Congenital Heart Disease

    PubMed Central

    Chetaille, Philippe; Prince, Andrea; Godard, Beatrice; Leclerc, Severine; Sobreira, Nara; Ling, Hua; Awadalla, Philip; Thibeault, Maryse; Khairy, Paul; Samuels, Mark E.; Andelfinger, Gregor

    2016-01-01

    Left-ventricular outflow tract obstructions (LVOTO) encompass a wide spectrum of phenotypically heterogeneous heart malformations which frequently cluster in families. We performed family based whole-exome and targeted re-sequencing on 182 individuals from 51 families with multiple affected members. Central to our approach is the family unit which serves as a reference to identify causal genotype-phenotype correlations. Screening a multitude of 10 overlapping phenotypes revealed disease associated and co-segregating variants in 12 families. These rare or novel protein altering mutations cluster predominantly in genes (NOTCH1, ARHGAP31, MAML1, SMARCA4, JARID2, JAG1) along the Notch signaling cascade. This is in line with a significant enrichment (Wilcoxon, p< 0.05) of variants with a higher pathogenicity in the Notch signaling pathway in patients compared to controls. The significant enrichment of novel protein truncating and missense mutations in NOTCH1 highlights the allelic and phenotypic heterogeneity in our pediatric cohort. We identified novel co-segregating pathogenic mutations in NOTCH1 associated with left and right-sided cardiac malformations in three independent families with a total of 15 affected individuals. In summary, our results suggest that a small but highly pathogenic fraction of family specific mutations along the Notch cascade are a common cause of LVOTO. PMID:27760138

  4. Targeted next-generation sequencing reveals multiple deleterious variants in OPLL-associated genes

    PubMed Central

    Chen, Xin; Guo, Jun; Cai, Tao; Zhang, Fengshan; Pan, Shengfa; Zhang, Li; Wang, Shaobo; Zhou, Feifei; Diao, Yinze; Zhao, Yanbin; Chen, Zhen; Liu, Xiaoguang; Chen, Zhongqiang; Liu, Zhongjun; Sun, Yu; Du, Jie

    2016-01-01

    Ossification of the posterior longitudinal ligament of the spine (OPLL), which is characterized by ectopic bone formation in the spinal ligaments, can cause spinal-cord compression. To date, at least 11 susceptibility genes have been genetically linked to OPLL. In order to identify potential deleterious alleles in these OPLL-associated genes, we designed a capture array encompassing all coding regions of the target genes for next-generation sequencing (NGS) in a cohort of 55 unrelated patients with OPLL. By bioinformatics analyses, we successfully identified three novel and five extremely rare variants (MAF < 0.005). These variants, which result in missense mutations in four OPLL-associated genes (i.e., COL6A1, COL11A2, FGFR1, and BMP2), were predicted to be deleterious by various commonly used algorithms. Furthermore, the potential effect of the p.Q89E variant of BMP2 was supported by a markedly increased BMP2 level in the patient's peripheral blood samples. Notably, seven of the variants were found to be associated with the patients with continuous subtype changes by cervical spinal radiological analyses. Taken together, our findings revealed for the first time that deleterious coding variants of the four OPLL-associated genes are potentially pathogenic in the patients with OPLL. PMID:27246988

  5. Whole genome sequence of Staphylococcus saprophyticus reveals the pathogenesis of uncomplicated urinary tract infection.

    PubMed

    Kuroda, Makoto; Yamashita, Atsushi; Hirakawa, Hideki; Kumano, Miyuki; Morikawa, Kazuya; Higashide, Masato; Maruyama, Atsushi; Inose, Yumiko; Matoba, Kimio; Toh, Hidehiro; Kuhara, Satoru; Hattori, Masahira; Ohta, Toshiko

    2005-09-13

    Staphylococcus saprophyticus is a uropathogenic Staphylococcus frequently isolated from young female outpatients presenting with uncomplicated urinary tract infections. We sequenced the whole genome of S. saprophyticus type strain ATCC 15305, which harbors a circular chromosome of 2,516,575 bp with 2,446 ORFs and two plasmids. Comparative genomic analyses with the strains of two other species, Staphylococcus aureus and Staphylococcus epidermidis, as well as experimental data, revealed the following characteristics of the S. saprophyticus genome. S. saprophyticus does not possess any virulence factors found in S. aureus, such as coagulase, enterotoxins, exoenzymes, and extracellular matrix-binding proteins, although it does have a remarkable paralog expansion of transport systems related to highly variable ion contents in the urinary environment. A further unique feature is that only a single ORF is predictable as a cell wall-anchored protein, and it shows positive hemagglutination and adherence to human bladder cell associated with initial colonization in the urinary tract. It also shows significantly high urease activity in S. saprophyticus. The uropathogenicity of S. saprophyticus can be attributed to its genome that is needed for its survival in the human urinary tract by means of novel cell wall-anchored adhesin and redundant uro-adaptive transport systems, together with urease.

  6. RNA sequencing analysis of human podocytes reveals glucocorticoid regulated gene networks targeting non-immune pathways.

    PubMed

    Jiang, Lulu; Hindmarch, Charles C T; Rogers, Mark; Campbell, Colin; Waterfall, Christy; Coghill, Jane; Mathieson, Peter W; Welsh, Gavin I

    2016-10-24

    Glucocorticoids are steroids that reduce inflammation and are used as immunosuppressive drugs for many diseases. They are also the mainstay for the treatment of minimal change nephropathy (MCN), which is characterised by an absence of inflammation. Their mechanisms of action remain elusive. Evidence suggests that immunomodulatory drugs can directly act on glomerular epithelial cells or 'podocytes', the cell type which is the main target of injury in MCN. To understand the nature of glucocorticoid effects on non-immune cell functions, we generated RNA sequencing data from human podocyte cell lines and identified the genes that are significantly regulated in dexamethasone-treated podocytes compared to vehicle-treated cells. The upregulated genes are of functional relevance to cytoskeleton-related processes, whereas the downregulated genes mostly encode pro-inflammatory cytokines and growth factors. We observed a tendency for dexamethasone-upregulated genes to be downregulated in MCN patients. Integrative analysis revealed gene networks composed of critical signaling pathways that are likely targeted by dexamethasone in podocytes.

  7. Exome sequencing reveals AMER1 as a frequently mutated gene in colorectal cancer

    PubMed Central

    Sanz-Pamplona, Rebeca; Lopez-Doriga, Adriana; Paré-Brunet, Laia; Lázaro, Kira; Bellido, Fernando; Alonso, M. Henar; Aussó, Susanna; Guinó, Elisabet; Beltrán, Sergi; Castro-Giner, Francesc; Gut, Marta; Sanjuan, Xavier; Closa, Adria; Cordero, David; Morón-Duran, Francisco D.; Soriano, Antonio; Salazar, Ramón; Valle, Laura; Moreno, Victor

    2015-01-01

    PURPOSE Somatic mutations occur at early stages of adenoma and accumulate throughout colorectal cancer (CRC) progression. The aim of this study was to characterize the mutational landscape of stage II tumors and to search for novel recurrent mutations likely implicated in CRC tumorigenesis. DESIGN The exomic DNA of 42 stage II, microsatellite stable, colon tumors and their paired mucosae were sequenced. Other molecular data available in the discovery dataset (gene expression, methylation, and CNV) was used to further characterize these tumors. Additional datasets comprising 553 CRC samples were used to validate the discovered mutations. RESULTS As a result, 4,886 somatic single nucleotide variants (SNVs) were found. Almost all SNVs were private changes, with few mutations shared by more than one tumor, thus revealing tumor-specific mutational landscapes. Nevertheless, these diverse mutations converged into common cellular pathways such as cell cycle or apoptosis. Among this mutational heterogeneity, variants resulting in early stop-codons in the AMER1 (also known as FAM123B or WTX) gene emerged as recurrent mutations in CRC. Losses of AMER1 by mechanisms other than mutation, such as methylation and copy number aberrations, were also found. Tumors lacking this tumor suppressor gene exhibited a mesenchymal phenotype characterized by inhibition of the canonical Wnt pathway. CONCLUSION In silico and experimental validation in independent datasets confirmed the existence of functional mutations in AMER1 in approximately 10% of analyzed CRC tumors. Moreover, these tumors exhibited a characteristic phenotype. PMID:26071483

  8. Requirements for Efficient Correction of ΔF508 CFTR Revealed by Analyses of Evolved Sequences

    PubMed Central

    Mendoza, Juan L.; Schmidt, André; Li, Qin; Caspa, Emmanuel; Barrett, Tyler; Bridges, Robert J.; Feranchak, Andrew P.; Brautigam, Chad A.; Thomas, Philip J.

    2012-01-01

    SUMMARY Misfolding of ΔF508 CFTR underlies pathology in most CF patients. F508 resides in the first nucleotide binding domain (NBD1) of CFTR near a predicted interface with the fourth intracellular loop (ICL4). Efforts to identify small molecules that restore function by correcting the folding defect have revealed an apparent efficacy ceiling. To understand the mechanistic basis of this obstacle, positions statistically coupled to 508, in evolved sequences, were identified and assessed for their impact on both NBD1 and CFTR folding. The results indicate that both NBD1 folding and interaction with ICL4 are altered by the ΔF508 mutation and that correction of either individual process is only partially effective. By contrast, combination of mutations that counteract both defects restores ΔF508 maturation and function to wild type levels. These results provide a mechanistic rationale for the limited efficacy of extant corrector compounds and suggest approaches for identifying compounds that correct both defective steps. PMID:22265409

  9. Whole genome sequence of Desulfovibrio magneticus strain RS-1 revealed common gene clusters in magnetotactic bacteria

    PubMed Central

    Nakazawa, Hidekazu; Arakaki, Atsushi; Narita-Yamada, Sachiko; Yashiro, Isao; Jinno, Koji; Aoki, Natsuko; Tsuruyama, Ai; Okamura, Yoshiko; Tanikawa, Satoshi; Fujita, Nobuyuki; Takeyama, Haruko; Matsunaga, Tadashi

    2009-01-01

    Magnetotactic bacteria are ubiquitous microorganisms that synthesize intracellular magnetite particles (magnetosomes) by accumulating Fe ions from aquatic environments. Recent molecular studies, including comprehensive proteomic, transcriptomic, and genomic analyses, have considerably improved our hypotheses of the magnetosome-formation mechanism. However, most of these studies have been conducted using pure-cultured bacterial strains of α-proteobacteria. Here, we report the whole-genome sequence of Desulfovibrio magneticus strain RS-1, the only isolate of magnetotactic microorganisms classified under δ-proteobacteria. Comparative genomics of the RS-1 and four α-proteobacterial strains revealed the presence of three separate gene regions (nuo and mamAB-like gene clusters, and gene region of a cryptic plasmid) conserved in all magnetotactic bacteria. The nuo gene cluster, encoding NADH dehydrogenase (complex I), was also common to the genomes of three iron-reducing bacteria exhibiting uncontrolled extracellular and/or intracellular magnetite synthesis. A cryptic plasmid, pDMC1, encodes three homologous genes that exhibit high similarities with those of other magnetotactic bacterial strains. In addition, the mamAB-like gene cluster, encoding the key components for magnetosome formation such as iron transport and magnetosome alignment, was conserved only in the genomes of magnetotactic bacteria as a similar genomic island-like structure. Our findings suggest the presence of core genetic components for magnetosome biosynthesis; these genes may have been acquired into the magnetotactic bacterial genomes by multiple gene-transfer events during proteobacterial evolution. PMID:19675025

  10. RNA sequencing reveals differentially expressed genes as potential diagnostic and prognostic indicators of gallbladder carcinoma

    PubMed Central

    Jiang, Mingming; Fang, Meng; Ji, Jun; Wang, Aihua; Wang, Mengmeng; Jiang, Xiaoqing; Gao, Chunfang

    2015-01-01

    Gallbladder carcinoma (GBC) is a rare tumor with a dismal survival rate overall. Hence, there is an urgent need for exploring more specific and sensitive biomarkers for the diagnosis and treatment of GBC. At first, amplified total RNAs from two paired GBC tumors and adjacent non-tumorous tissues (ANTTs) were subjected to RNA sequencing. 161 genes were identified differentially expressed between tumors and ANTTs. Functional enrichment analysis indicated that the up-regulated genes in tumor were primarily associated with signaling molecules and enzyme modulators, and mainly involved in cell cycles and pathways in cancer. Twelve differentially expressed genes (DEGs) were further confirmed in another independent cohort of 35 GBC patients. Expression levels of BIRC5, TK1, TNNT1 and MMP9 were found to be positively related to postoperative relapse. There was also a significant correlation between BIRC5 expression and tumor-node-metastasis (TNM) stage. Besides, we observed a positive correlation between serum CA19–9 concentration and the expression levels of TNNT1, MMP9 and CLIC3. Survival analysis revealed that GBC patients with high TK1 and MMP9 expression levels had worse prognosis. These identified DEGs might not only be promising biomarkers for GBC diagnosis and prognosis, but also expedite the discovery of novel therapeutic strategies. PMID:25970782

  11. High throughput sequencing reveals alterations in the recombination signatures with diminishing Spo11 activity.

    PubMed

    Rockmill, Beth; Lefrançois, Philippe; Voelkel-Meiman, Karen; Oke, Ashwini; Roeder, G Shirleen; Fung, Jennifer C

    2013-10-01

    Spo11 is the topoisomerase-like enzyme responsible for the induction of the meiosis-specific double strand breaks (DSBs), which initiates the recombination events responsible for proper chromosome segregation. Nineteen PCR-induced alleles of SPO11 were identified and characterized genetically and cytologically. Recombination, spore viability and synaptonemal complex (SC) formation were decreased to varying extents in these mutants. Arrest by ndt80 restored these events in two severe hypomorphic mutants, suggesting that ndt80-arrested nuclei are capable of extended DSB activity. While crossing-over, spore viability and synaptonemal complex (SC) formation defects correlated, the extent of such defects was not predictive of the level of heteroallelic gene conversions (prototrophs) exhibited by each mutant. High throughput sequencing of tetrads from spo11 hypomorphs revealed that gene conversion tracts associated with COs are significantly longer and gene conversion tracts unassociated with COs are significantly shorter than in wild type. By modeling the extent of these tract changes, we could account for the discrepancy in genetic measurements of prototrophy and crossover association. These findings provide an explanation for the unexpectedly low prototroph levels exhibited by spo11 hypomorphs and have important implications for genetic studies that assume an unbiased recovery of prototrophs, such as measurements of CO homeostasis. Our genetic and physical data support previous observations of DSB-limited meioses, in which COs are disproportionally maintained over NCOs (CO homeostasis).

  12. Sequence analysis reveals genomic factors affecting EST-SSR primer performance and polymorphism

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Search for simple sequence repeat (SSR) motifs and design of flanking primers in expressed sequence tag (EST) sequences can be easily done at a large scale using bioinformatics programs. However, failed amplification and/or detection, along with lack of polymorphism, is often seen among randomly sel...
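
    The large-scale SSR motif search mentioned in this record can be illustrated with a regular-expression scan for perfect tandem repeats. This is only a sketch under stated assumptions: the function name find_ssrs, the motif-length range (2 to 6 bp) and the minimum repeat count are hypothetical defaults, not the parameters of the program used in the study, and primer design is not covered.

```python
import re


def find_ssrs(seq, min_motif=2, max_motif=6, min_repeats=4):
    """Return (start, motif, repeat_count) for perfect tandem repeats in seq."""
    seq = seq.upper()
    hits = []
    for k in range(min_motif, max_motif + 1):
        # A k-base motif followed by at least (min_repeats - 1) further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_repeats - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. "AA", "ATAT"),
            # so each repeat is reported only under its shortest motif.
            if any(motif == motif[:d] * (k // d) for d in range(1, k) if k % d == 0):
                continue
            hits.append((match.start(), motif, len(match.group(0)) // k))
    return hits


# Toy EST fragment (hypothetical): five copies of the dinucleotide AT.
ssrs = find_ssrs("GGCATATATATATCC")
```

    Each hit gives the 0-based start position, the motif, and the number of tandem copies; flanking primers would then be designed on either side of the reported interval.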

  13. SETG: Nucleic Acid Extraction and Sequencing for In Situ Life Detection on Mars

    NASA Astrophysics Data System (ADS)

    Mojarro, A.; Hachey, J.; Tani, J.; Smith, A.; Bhattaru, S. A.; Pontefract, A.; Doebler, R.; Brown, M.; Ruvkun, G.; Zuber, M. T.; Carr, C. E.

    2016-10-01

    We are developing an integrated nucleic acid extraction and sequencing instrument: the Search for Extra-Terrestrial Genomes (SETG) for in situ life detection on Mars. Our goals are to identify related or unrelated nucleic acid-based life on Mars.

  14. Draft Genome Sequence of Cyanobacterium sp. Strain IPPAS B-1200 with a Unique Fatty Acid Composition

    PubMed Central

    Starikov, Alexander Y.; Usserbaeva, Aizhan A.; Sinetova, Maria A.; Sarsekeyeva, Fariza K.; Zayadan, Bolatkhan K.; Ustinova, Vera V.; Kupriyanova, Elena V.; Los, Dmitry A.

    2016-01-01

    Here, we report the draft genome of Cyanobacterium sp. IPPAS strain B-1200, isolated from Lake Balkhash, Kazakhstan, and characterized by the unique fatty acid composition of its membrane lipids, which are enriched with myristic and myristoleic acids. The approximate genome size is 3.4 Mb, and the predicted number of coding sequences is 3,119. PMID:27856596

  15. Deep Sequencing Reveals Potential Antigenic Variants at Low Frequencies in Influenza A Virus-Infected Humans

    PubMed Central

    Dinis, Jorge M.; Florek, Nicholas W.; Fatola, Omolayo O.; Moncla, Louise H.; Mutschler, James P.; Charlier, Olivia K.; Meece, Jennifer K.; Belongia, Edward A.

    2016-01-01

    ABSTRACT Influenza vaccines must be frequently reformulated to account for antigenic changes in the viral envelope protein, hemagglutinin (HA). The rapid evolution of influenza virus under immune pressure is likely enhanced by the virus's genetic diversity within a host, although antigenic change has rarely been investigated on the level of individual infected humans. We used deep sequencing to characterize the between- and within-host genetic diversity of influenza viruses in a cohort of patients that included individuals who were vaccinated and then infected in the same season. We characterized influenza HA segments from the predominant circulating influenza A subtypes during the 2012-2013 (H3N2) and 2013-2014 (pandemic H1N1; H1N1pdm) flu seasons. We found that HA consensus sequences were similar in nonvaccinated and vaccinated subjects. In both groups, purifying selection was the dominant force shaping HA genetic diversity. Interestingly, viruses from multiple individuals harbored low-frequency mutations encoding amino acid substitutions in HA antigenic sites at or near the receptor-binding domain. These mutations included two substitutions in H1N1pdm viruses, G158K and N159K, which were recently found to confer escape from virus-specific antibodies. These findings raise the possibility that influenza antigenic diversity can be generated within individual human hosts but may not become fixed in the viral population even when they would be expected to have a strong fitness advantage. Understanding constraints on influenza antigenic evolution within individual hosts may elucidate potential future pathways of antigenic evolution at the population level. IMPORTANCE Influenza vaccines must be frequently reformulated due to the virus's rapid evolution rate. We know that influenza viruses exist within each infected host as a “swarm” of genetically distinct viruses, but the role of this within-host diversity in the antigenic evolution of influenza has been unclear

  16. Nucleotide sequence polymorphism at the apical membrane antigen-1 locus reveals population history of Plasmodium vivax in Thailand

    PubMed Central

    Putaporntip, Chaturong; Jongwutiwes, Somchai; Grynberg, Priscila; Cui, Liwang; Hughes, Austin L.

    2009-01-01

    Apical membrane antigen-1 is a candidate for inclusion in a vaccine for the human malaria parasite Plasmodium vivax. We collected 231 complete sequences of the gene encoding this antigen (pvama-1) from three regions of Thailand, the most extensive collection to date of sequences at this locus. The domain II loop (previously mentioned as a potential vaccine component) was almost completely conserved, with a single amino acid variant (I313R) observed in a single sequence. The 3′ portion of the gene (domain II through the stop codon) showed significantly lower nucleotide diversity than the 5′ portion (start codon through domain I); and a given domain I sequence might be found in a haplotype with more than one domain II sequence. These results imply a hotspot of recombination between domains I and II. We found significant geographic subdivision among the three regions of Thailand (NW, East, and South) in which collections were made in 2007. Numbers of P. vivax infections have experienced overall declines since 1990 in all three regions; but the decline has been most recent in the NW, and there has been a rebound in numbers of infections in the South since 2000. Consistent with population history, amino acid sequence diversity was greatest in the NW. The South, which had by far the lowest sequence diversity of the three regions, showed signs of a population that has expanded from a small number of founders after a bottleneck. PMID:19643205

  17. Sequencing and computational analysis of complete genome sequences of Citrus yellow mosaic badna virus from acid lime and pummelo.

    PubMed

    Borah, Basanta K; Johnson, A M Anthony; Sai Gopal, D V R; Dasgupta, Indranil

    2009-08-01

    Citrus yellow mosaic badna virus (CMBV), a member of the Family Caulimoviridae, Genus Badnavirus, is the causative agent of Citrus mosaic disease in India. Although the virus has been detected in several citrus species, only two full-length genomes, one each from Sweet orange and Rangpur lime, are available in publicly accessible databases. In order to obtain a better understanding of the genetic variability of the virus in other citrus mosaic-affected citrus species, we performed the cloning and sequence analysis of complete genomes of CMBV from two additional citrus species, Acid lime and Pummelo. We show that CMBV genomes from the two hosts share high homology with previously reported CMBV sequences and hence conclude that the new isolates represent variants of the virus present in these species. Based on in silico sequence analysis, we predict the possible function of the protein encoded by one of the five ORFs.

  18. Parvalbumins from coelacanth muscle. III. Amino acid sequence of the major component.

    PubMed

    Jauregui-Adell, J; Pechere, J F

    1978-09-26

    The primary structure of the major parvalbumin (pI = 4.52) from coelacanth muscle (Latimeria chalumnae) has been determined. Sequence analysis of the tryptic peptides, in some cases obtained with beta-trypsin, accounts for the total amino acid content of the protein. Chymotryptic peptides provide appropriate sequence overlaps, to complete the localization of the tryptic peptides. Examination of the amino acid sequence of this protein shows the typical structure of a beta-parvalbumin. Its position in the dendrogram of related calcium-binding proteins corresponds to that usually accepted for crossopterygians.

  19. Analysis of cloned cDNA and genomic sequences for phytochrome: complete amino acid sequences for two gene products expressed in etiolated Avena.

    PubMed Central

    Hershey, H P; Barker, R F; Idler, K B; Lissemore, J L; Quail, P H

    1985-01-01

    Cloned cDNA and genomic sequences have been analyzed to deduce the amino acid sequence of phytochrome from etiolated Avena. Restriction endonuclease site polymorphism between clones indicates that at least four phytochrome genes are expressed in this tissue. Sequence analysis of two complete and one partial coding region shows approximately 98% homology at both the nucleotide and amino acid levels, with the majority of amino acid changes being conservative. High sequence homology is also found in the 5'-untranslated region but significant divergence occurs in the 3'-untranslated region. The phytochrome polypeptides are 1128 amino acid residues long corresponding to a molecular mass of 125 kdaltons. The known protein sequence at the chromophore attachment site occurs only once in the polypeptide, establishing that phytochrome has a single chromophore per monomer covalently linked to Cys-321. Computer analyses of the amino acid sequences have provided predictions regarding a number of structural features of the phytochrome molecule. PMID:3001642
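
    As a rough plausibility check on the figures quoted above (1128 residues, about 125 kdaltons), the sketch below multiplies the residue count by an average residue mass. The 110 Da average is a common rule of thumb and is an assumption here, not a value from the abstract; the function name is likewise hypothetical.

```python
# Rule-of-thumb average mass of one amino acid residue in a polypeptide.
# This value is an assumption for illustration, not taken from the abstract.
AVG_RESIDUE_MASS_DA = 110.0


def estimate_mass_kda(n_residues, avg_residue_da=AVG_RESIDUE_MASS_DA):
    """Estimate polypeptide mass in kDa from its residue count."""
    return n_residues * avg_residue_da / 1000.0


# 1128 residues is the polypeptide length reported in the abstract.
print(round(estimate_mass_kda(1128), 1))  # 124.1
```

    The estimate of roughly 124 kDa is in line with the reported 125 kdaltons; an exact mass would require the actual sequence composition.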

  20. Genome sequence of the button mushroom Agaricus bisporus reveals mechanisms governing adaptation to a humic-rich ecological niche

    PubMed Central

    Morin, Emmanuelle; Kohler, Annegret; Baker, Adam R.; Foulongne-Oriol, Marie; Lombard, Vincent; Nagy, Laszlo G.; Ohm, Robin A.; Patyshakuliyeva, Aleksandrina; Brun, Annick; Aerts, Andrea L.; Bailey, Andrew M.; Billette, Christophe; Coutinho, Pedro M.; Deakin, Greg; Doddapaneni, Harshavardhan; Floudas, Dimitrios; Grimwood, Jane; Hildén, Kristiina; Kües, Ursula; LaButti, Kurt M.; Lapidus, Alla; Lindquist, Erika A.; Lucas, Susan M.; Murat, Claude; Riley, Robert W.; Salamov, Asaf A.; Schmutz, Jeremy; Subramanian, Venkataramanan; Wösten, Han A. B.; Xu, Jianping; Eastwood, Daniel C.; Foster, Gary D.; Sonnenberg, Anton S. M.; Cullen, Dan; de Vries, Ronald P.; Lundell, Taina; Hibbett, David S.; Henrissat, Bernard; Burton, Kerry S.; Kerrigan, Richard W.; Challen, Michael P.; Grigoriev, Igor V.; Martin, Francis

    2012-01-01

    Agaricus bisporus is the model fungus for the adaptation, persistence, and growth in the humic-rich leaf-litter environment. Aside from its ecological role, A. bisporus has been an important component of the human diet for over 200 y and worldwide cultivation of the “button mushroom” forms a multibillion dollar industry. We present two A. bisporus genomes, their gene repertoires and transcript profiles on compost and during mushroom formation. The genomes encode a full repertoire of polysaccharide-degrading enzymes similar to that of wood-decayers. Comparative transcriptomics of mycelium grown on defined medium, casing-soil, and compost revealed genes encoding enzymes involved in xylan, cellulose, pectin, and protein degradation are more highly expressed in compost. The striking expansion of heme-thiolate peroxidases and β-etherases is distinctive from Agaricomycotina wood-decayers and suggests a broad attack on decaying lignin and related metabolites found in humic acid-rich environment. Similarly, up-regulation of these genes together with a lignolytic manganese peroxidase, multiple copper radical oxidases, and cytochrome P450s is consistent with challenges posed by complex humic-rich substrates. The gene repertoire and expression of hydrolytic enzymes in A. bisporus is substantially different from the taxonomically related ectomycorrhizal symbiont Laccaria bicolor. A common promoter motif was also identified in genes very highly expressed in humic-rich substrates. These observations reveal genetic and enzymatic mechanisms governing adaptation to the humic-rich ecological niche formed during plant degradation, further defining the critical role such fungi contribute to soil structure and carbon sequestration in terrestrial ecosystems. Genome sequence will expedite mushroom breeding for improved agronomic characteristics. PMID:23045686

  1. Genome sequence of the button mushroom Agaricus bisporus reveals mechanisms governing adaptation to a humic-rich ecological niche.

    PubMed

    Morin, Emmanuelle; Kohler, Annegret; Baker, Adam R; Foulongne-Oriol, Marie; Lombard, Vincent; Nagy, Laszlo G; Ohm, Robin A; Patyshakuliyeva, Aleksandrina; Brun, Annick; Aerts, Andrea L; Bailey, Andrew M; Billette, Christophe; Coutinho, Pedro M; Deakin, Greg; Doddapaneni, Harshavardhan; Floudas, Dimitrios; Grimwood, Jane; Hildén, Kristiina; Kües, Ursula; Labutti, Kurt M; Lapidus, Alla; Lindquist, Erika A; Lucas, Susan M; Murat, Claude; Riley, Robert W; Salamov, Asaf A; Schmutz, Jeremy; Subramanian, Venkataramanan; Wösten, Han A B; Xu, Jianping; Eastwood, Daniel C; Foster, Gary D; Sonnenberg, Anton S M; Cullen, Dan; de Vries, Ronald P; Lundell, Taina; Hibbett, David S; Henrissat, Bernard; Burton, Kerry S; Kerrigan, Richard W; Challen, Michael P; Grigoriev, Igor V; Martin, Francis

    2012-10-23

    Agaricus bisporus is the model fungus for the adaptation, persistence, and growth in the humic-rich leaf-litter environment. Aside from its ecological role, A. bisporus has been an important component of the human diet for over 200 y and worldwide cultivation of the "button mushroom" forms a multibillion dollar industry. We present two A. bisporus genomes, their gene repertoires and transcript profiles on compost and during mushroom formation. The genomes encode a full repertoire of polysaccharide-degrading enzymes similar to that of wood-decayers. Comparative transcriptomics of mycelium grown on defined medium, casing-soil, and compost revealed genes encoding enzymes involved in xylan, cellulose, pectin, and protein degradation are more highly expressed in compost. The striking expansion of heme-thiolate peroxidases and β-etherases is distinctive from Agaricomycotina wood-decayers and suggests a broad attack on decaying lignin and related metabolites found in humic acid-rich environment. Similarly, up-regulation of these genes together with a lignolytic manganese peroxidase, multiple copper radical oxidases, and cytochrome P450s is consistent with challenges posed by complex humic-rich substrates. The gene repertoire and expression of hydrolytic enzymes in A. bisporus is substantially different from the taxonomically related ectomycorrhizal symbiont Laccaria bicolor. A common promoter motif was also identified in genes very highly expressed in humic-rich substrates. These observations reveal genetic and enzymatic mechanisms governing adaptation to the humic-rich ecological niche formed during plant degradation, further defining the critical role such fungi contribute to soil structure and carbon sequestration in terrestrial ecosystems. Genome sequence will expedite mushroom breeding for improved agronomic characteristics.

  2. Genome sequence of the button mushroom Agaricus bisporus reveals mechanisms governing adaptation to a humic-rich ecological niche

    SciTech Connect

    Morin, Emmanuelle; Kohler, Annegret; Baker, Adam R.; Foulongne-Oriol, Marie; Lombard, Vincent; Nagy, Laszlo G.; Ohm, Robin A.; Patyshakuliyeva, Aleksandrina; Brun, Annick; Aerts, Andrea L.; Bailey, Andrew M.; Billette, Christophe; Coutinho, Pedro M.; Deakin, Greg; Doddapaneni, Harshavardhan; Floudas, Dimitrios; Grimwood, Jane; Hilden, Kristiina; Kues, Ursula; LaButti, Kurt M.; Lapidus, Alla; Lindquist, Erika A.; Lucas, Susan M.; Murat, Claude; Riley, Robert W.; Salamov, Asaf A.; Schmutz, Jeremy; Subramanian, Venkataramanan; Wosten, Han A. B.; Xu, Jianping; Eastwood, Daniel C.; Foster, Gary D.; Sonnenberg, Anton S. M.; Cullen, Dan; de Vries, Ronald P.; Lundell, Taina; Hibbett, David S.; Henrissat, Bernard; Burton, Kerry S.; Kerrigan, Richard W.; Challen, Michael P.; Grigoriev, Igor V.; Martin, Francis

    2012-04-27

    Agaricus bisporus is the model fungus for the adaptation, persistence, and growth in the humic-rich leaf-litter environment. Aside from its ecological role, A. bisporus has been an important component of the human diet for over 200 y and worldwide cultivation of the button mushroom forms a multibillion dollar industry. We present two A. bisporus genomes, their gene repertoires and transcript profiles on compost and during mushroom formation. The genomes encode a full repertoire of polysaccharide-degrading enzymes similar to that of wood-decayers. Comparative transcriptomics of mycelium grown on defined medium, casing-soil, and compost revealed genes encoding enzymes involved in xylan, cellulose, pectin, and protein degradation are more highly expressed in compost. The striking expansion of heme-thiolate peroxidases and β-etherases is distinctive from Agaricomycotina wood-decayers and suggests a broad attack on decaying lignin and related metabolites found in humic acid-rich environment. Similarly, up-regulation of these genes together with a lignolytic manganese peroxidase, multiple copper radical oxidases, and cytochrome P450s is consistent with challenges posed by complex humic-rich substrates. The gene repertoire and expression of hydrolytic enzymes in A. bisporus is substantially different from the taxonomically related ectomycorrhizal symbiont Laccaria bicolor. A common promoter motif was also identified in genes very highly expressed in humic-rich substrates. These observations reveal genetic and enzymatic mechanisms governing adaptation to the humic-rich ecological niche formed during plant degradation, further defining the critical role such fungi contribute to soil structure and carbon sequestration in terrestrial ecosystems. Genome sequence will expedite mushroom breeding for improved agronomic characteristics.

  3. Differences between EcoRI nonspecific and "star" sequence complexes revealed by osmotic stress.

    PubMed

    Sidorova, Nina Y; Rau, Donald C

    2004-10-01

    The binding of the restriction endonuclease EcoRI to DNA is exceptionally specific. Even a single basepair change ("star" sequence) from the recognition sequence, GAATTC, decreases the binding free energy of EcoRI to values nearly indistinguishable from nonspecific binding. The difference in the number of waters sequestered by the protein-DNA complexes of the "star" sequences TAATTC and CAATTC and by the specific sequence complex determined from the dependence of binding free energy on water activity is also practically indistinguishable at low osmotic pressures from the 110 water molecules sequestered by nonspecific sequence complexes. Novel measurements of the dissociation rates of noncognate sequence complexes and competition equilibrium show that sequestered water can be removed from "star" sequence complexes by high osmotic pressure, but not from a nonspecific complex. By 5 Osm, the TAATTC "star" sequence complex has lost almost 90 of the approximately 110 waters initially present. It is more difficult to remove water from the CAATTC "star" sequence complex. The sequence dependence of water loss correlates with the known sequence dependence of "star" cleavage activity.
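
    To make the definition of a "star" sequence used above concrete (a single basepair change from GAATTC), the following minimal sketch enumerates all such single-substitution variants. The function name is hypothetical, and the sketch only lists sequences; it says nothing about binding energies or water release.

```python
RECOGNITION_SITE = "GAATTC"  # EcoRI recognition sequence, as given in the abstract


def star_sites(site):
    """Return every sequence that differs from `site` at exactly one position."""
    variants = []
    for i, base in enumerate(site):
        for alt in "ACGT":
            if alt != base:
                variants.append(site[:i] + alt + site[i + 1:])
    return variants


stars = star_sites(RECOGNITION_SITE)
print(len(stars))          # 18 (6 positions x 3 alternative bases)
print("TAATTC" in stars)   # True
print("CAATTC" in stars)   # True
```

    The two "star" sequences studied in the abstract, TAATTC and CAATTC, both change the first position of the site.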

  4. Purification, characterization and partial amino acid sequence of glycogen synthase from Saccharomyces cerevisiae.

    PubMed Central

    Carabaza, A; Arino, J; Fox, J W; Villar-Palasi, C; Guinovart, J J

    1990-01-01

    Glycogen synthase from Saccharomyces cerevisiae was purified to homogeneity. The enzyme showed a subunit molecular mass of 80 kDa. The holoenzyme appears to be a tetramer. Antibodies developed against purified yeast glycogen synthase inactivated the enzyme in yeast extracts and allowed the detection of the protein in Western blots. Amino acid analysis showed that the enzyme is very rich in glutamate and/or glutamine residues. The N-terminal sequence (11 amino acid residues) was determined. In addition, selected tryptic-digest peptides were purified by reverse-phase h.p.l.c. and submitted to gas-phase sequencing. Up to eight sequences (79 amino acid residues) could be aligned with the human muscle enzyme sequence. Levels of identity range between 37 and 100%, indicating that, although human and yeast glycogen synthases probably share some conserved regions, significant differences in their primary structure should be expected. PMID:2114092

  5. TranslatorX: multiple alignment of nucleotide sequences guided by amino acid translations.

    PubMed

    Abascal, Federico; Zardoya, Rafael; Telford, Maximilian J

    2010-07-01

    We present TranslatorX, a web server designed to align protein-coding nucleotide sequences based on their corresponding amino acid translations. Many comparisons between biological sequences (nucleic acids and proteins) involve the construction of multiple alignments. Alignments represent a statement regarding the homology between individual nucleotides or amino acids within homologous genes. As protein-coding DNA sequences evolve as triplets of nucleotides (codons) and it is known that sequence similarity degrades more rapidly at the DNA than at the amino acid level, alignments are generally more accurate when based on amino acids than on their corresponding nucleotides. TranslatorX novelties include: (i) use of all documented genetic codes and the possibility of assigning different genetic codes for each sequence; (ii) a battery of different multiple alignment programs; (iii) translation of ambiguous codons when possible; (iv) an innovative criterion to clean nucleotide alignments with GBlocks based on protein information; and (v) a rich output, including Jalview-powered graphical visualization of the alignments, codon-based alignments coloured according to the corresponding amino acids, measures of compositional bias and first, second and third codon position specific alignments. The TranslatorX server is freely available at http://translatorx.co.uk.
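
    The core idea described above, aligning coding nucleotide sequences by following an amino acid alignment, can be sketched as follows. This is a minimal illustration under stated assumptions, not TranslatorX's implementation: it uses only the standard genetic code, takes an already-gapped protein alignment as input (here written by hand rather than produced by an alignment program), and the function names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(nt):
    """Translate a coding sequence codon by codon; unknown codons become 'X'."""
    nt = nt.upper()
    usable = len(nt) - len(nt) % 3
    return "".join(CODON_TABLE.get(nt[i:i + 3], "X") for i in range(0, usable, 3))


def thread_codons(aligned_protein, nt):
    """Place the codons of `nt` under a gapped protein sequence.

    Each residue receives its codon and each gap '-' becomes '---'.
    """
    if len(nt) % 3 != 0:
        raise ValueError("nucleotide length is not a multiple of 3")
    if translate(nt) != aligned_protein.replace("-", ""):
        raise ValueError("protein does not match the translation of nt")
    codons = [nt[i:i + 3] for i in range(0, len(nt), 3)]
    out, idx = [], 0
    for aa in aligned_protein:
        if aa == "-":
            out.append("---")
        else:
            out.append(codons[idx])
            idx += 1
    return "".join(out)


# Toy input: two coding sequences and a hand-made protein alignment of them.
seqs = {"a": "ATGGCTAAA", "b": "ATGAAA"}
protein_alignment = {"a": "MAK", "b": "M-K"}
codon_alignment = {k: thread_codons(protein_alignment[k], seqs[k]) for k in seqs}
print(codon_alignment)  # {'a': 'ATGGCTAAA', 'b': 'ATG---AAA'}
```

    Because gaps are inserted only as whole triplets, the resulting nucleotide alignment preserves the reading frame.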

  6. Structural Conservation of Ligand Binding Reveals a Bile Acid-like Signaling Pathway in Nematodes

    PubMed Central

    Zhi, Xiaoyong; Zhou, X. Edward; Melcher, Karsten; Motola, Daniel L.; Gelmedin, Verena; Hawdon, John; Kliewer, Steven A.; Mangelsdorf, David J.; Xu, H. Eric

    2012-01-01

    Bile acid-like molecules named dafachronic acids (DAs) control the dauer formation program in Caenorhabditis elegans through the nuclear receptor DAF-12. This mechanism is conserved in parasitic nematodes to regulate their dauer-like infective larval stage, and as such, the DAF-12 ligand binding domain has been identified as an important therapeutic target in human parasitic hookworm species that infect more than 600 million people worldwide. Here, we report two x-ray crystal structures of the hookworm Ancylostoma ceylanicum DAF-12 ligand binding domain in complex with DA and cholestenoic acid (a bile acid-like metabolite), respectively. Structure analysis and functional studies reveal key residues responsible for species-specific ligand responses of DAF-12. Furthermore, DA binding to DAF-12 is mechanistically and structurally similar to bile acid binding to the mammalian bile acid receptor, farnesoid X receptor. Activation of DAF-12 by cholestenoic acid and the cholestenoic acid complex structure suggest that bile acid-like signaling pathways have been conserved in nematodes and mammals. Together, these results reveal the molecular mechanism for the interplay between parasite and host, provide a structural framework for DAF-12 as a promising target in treating nematode parasitism, and provide insight into the evolution of gut parasite hormone-signaling pathways. PMID:22170062

  7. Nucleotide and deduced amino acid sequences of a new subtilisin from an alkaliphilic Bacillus isolate.

    PubMed

    Saeki, Katsuhisa; Magallones, Marietta V; Takimura, Yasushi; Hatada, Yuji; Kobayashi, Tohru; Kawai, Shuji; Ito, Susumu

    2003-10-01

    The gene for a new subtilisin from the alkaliphilic Bacillus sp. KSM-LD1 was cloned and sequenced. The open reading frame of the gene encoded a 97 amino-acid prepro-peptide plus a 307 amino-acid mature enzyme that contained a possible catalytic triad of residues, Asp32, His66, and Ser224. The deduced amino acid sequence of the mature enzyme (LD1) showed approximately 65% identity to those of subtilisins SprC and SprD from alkaliphilic Bacillus sp. LG12. The amino acid sequence identities of LD1 to those of previously reported true subtilisins and high-alkaline proteases were below 60%. LD1 was characteristically stable during incubation with surfactants and chemical oxidants. Interestingly, an oxidizable Met residue is located next to the catalytic Ser224 of the enzyme as in the cases of the oxidation-susceptible subtilisins reported to date.

  8. Shark myelin basic protein: amino acid sequence, secondary structure, and self-association.

    PubMed

    Milne, T J; Atkins, A R; Warren, J A; Auton, W P; Smith, R

    1990-09-01

    Myelin basic protein (MBP) from the Whaler shark (Carcharhinus obscurus) has been purified from acid extracts of a chloroform/methanol pellet from whole brains. The amino acid sequence of the majority of the protein has been determined and compared with the sequences of other MBPs. The shark protein has only 44% homology with the bovine protein, but, in common with other MBPs, it has basic residues distributed throughout the sequence and no extensive segments that are predicted to have an ordered secondary structure in solution. Shark MBP lacks the triproline sequence previously postulated to form a hairpin bend in the molecule. The region containing the putative consensus sequence for encephalitogenicity in the guinea pig contains several substitutions, thus accounting for the lack of activity of the shark protein. Studies of the secondary structure and self-association have shown that shark MBP possesses solution properties similar to those of the bovine protein, despite the extensive differences in primary structure.

  9. Cloning, sequence analysis, and expression in Escherichia coli of the gene encoding an alpha-amino acid ester hydrolase from Acetobacter turbidans.

    PubMed

    Polderman-Tijmes, Jolanda J; Jekel, Peter A; de Vries, Erik J; van Merode, Annet E J; Floris, René; van der Laan, Jan-Metske; Sonke, Theo; Janssen, Dick B

    2002-01-01

    The alpha-amino acid ester hydrolase from Acetobacter turbidans ATCC 9325 is capable of hydrolyzing and synthesizing beta-lactam antibiotics, such as cephalexin and ampicillin. N-terminal amino acid sequencing of the purified alpha-amino acid ester hydrolase allowed cloning and genetic characterization of the corresponding gene from an A. turbidans genomic library. The gene, designated aehA, encodes a polypeptide with a molecular weight of 72,000. Comparison of the determined N-terminal sequence and the deduced amino acid sequence indicated the presence of an N-terminal leader sequence of 40 amino acids. The aehA gene was subcloned in the pET9 expression plasmid and expressed in Escherichia coli. The recombinant protein was purified and found to be dimeric with subunits of 70 kDa. A sequence similarity search revealed 26% identity with a glutaryl 7-ACA acylase precursor from Bacillus laterosporus, but no homology was found with other known penicillin or cephalosporin acylases. There was some similarity to serine proteases, including the conservation of the active site motif, GXSYXG. Together with database searches, this suggested that the alpha-amino acid ester hydrolase is a beta-lactam antibiotic acylase that belongs to a class of hydrolases that is different from the Ntn hydrolase superfamily to which the well-characterized penicillin acylase from E. coli belongs. The alpha-amino acid ester hydrolase of A. turbidans represents a subclass of this new class of beta-lactam antibiotic acylases.

  10. Analysis of the chromatin domain organisation around the plastocyanin gene reveals an MAR-specific sequence element in Arabidopsis thaliana.

    PubMed Central

    van Drunen, C M; Oosterling, R W; Keultjes, G M; Weisbeek, P J; van Driel, R; Smeekens, S C

    1997-01-01

    The Arabidopsis thaliana genome is currently being sequenced, eventually leading towards the unravelling of all potential genes. We wanted to gain more insight into the way this genome might be organized at the ultrastructural level. To this end we identified matrix attachment regions demarcating potential chromatin domains in a 16 kb region around the plastocyanin gene. The region was cloned and sequenced, revealing six genes in addition to the plastocyanin gene. Using a heterologous in vitro nuclear matrix binding assay to search for evolutionarily conserved matrix attachment regions (MARs), we identified three such MARs. These three MARs divide the region into two small chromatin domains of 5 kb, each containing two genes. Comparison of the sequence of the three MARs revealed a degenerate 21 bp sequence that is shared between these MARs and that is not found elsewhere in the region. A similar sequence element is also present in four other MARs of Arabidopsis. Therefore, this sequence may constitute a landmark for the position of MARs in the genome of this plant. In a genomic sequence database of Arabidopsis the 21 bp element is found approximately once every 10 kb. The compactness of the Arabidopsis genome could account for the high incidence of MARs and MRSs we observed. PMID:9380515

  11. Temporal Dynamics of Avian Populations during Pleistocene Revealed by Whole-Genome Sequences

    PubMed Central

    Nadachowska-Brzyska, Krystyna; Li, Cai; Smeds, Linnea; Zhang, Guojie; Ellegren, Hans

    2015-01-01

    Summary Global climate fluctuations have significantly influenced the distribution and abundance of biodiversity [1]. During unfavorable glacial periods, many species experienced range contraction and fragmentation, expanding again during interglacials [2–4]. An understanding of the evolutionary consequences of both historical and ongoing climate changes requires knowledge of the temporal dynamics of population numbers during such climate cycles. Variation in abundance should have left clear signatures in the patterns of intraspecific genetic variation in extant species, from which historical effective population sizes (Ne) can be estimated [3]. We analyzed whole-genome sequences of 38 avian species in a pairwise sequentially Markovian coalescent (PSMC, [5]) framework to quantitatively reveal changes in Ne from approximately 10 million to 10 thousand years ago. Significant fluctuations in Ne over time were evident for most species. The most pronounced pattern observed in many species was a severe reduction in Ne coinciding with the beginning of the last glacial period (LGP). Among species, Ne varied by at least three orders of magnitude, exceeding 1 million in the most abundant species. Several species on the IUCN Red List of Threatened Species showed long-term reduction in population size, predating recent declines. We conclude that cycles of population expansions and contractions have been a common feature of many bird species during the Quaternary period, likely coinciding with climate cycles. Population size reduction should have increased the risk of extinction but may also have promoted speciation. Species that have experienced long-term declines may be especially vulnerable to recent anthropogenic threats. PMID:25891404

  12. Analysis of Genome Sequences from Plant Pathogenic Rhodococcus Reveals Genetic Novelties in Virulence Loci

    PubMed Central

    Davis, Edward W.; Putnam, Melodie L.; Hu, Erdong; Swader-Hines, David; Mol, Adeline; Baucher, Marie; Prinsen, Els; Zdanowska, Magdalena; Givan, Scott A.; Jaziri, Mondher El; Loper, Joyce E.; Mahmud, Taifo; Chang, Jeff H.

    2014-01-01

    Members of Gram-positive Actinobacteria cause economically important diseases in plants. Within the Rhodococcus genus, some members can cause growth deformities and persist as pathogens on a wide range of host plants. The current model predicts that phytopathogenic isolates require a cluster of three loci present on a linear plasmid, with the fas operon central to virulence. The Fas proteins synthesize, modify, and activate a mixture of growth regulating cytokinins, which cause a hormonal imbalance in plants, resulting in abnormal growth. We sequenced and compared the genomes of 20 isolates of Rhodococcus to gain insights into the mechanisms and evolution of virulence in these bacteria. Horizontal gene transfer was identified as critical but limited in the scale of virulence evolution, as few loci are conserved and exclusive to phytopathogenic isolates. Although the fas operon is present in most phytopathogenic isolates, it is absent from phytopathogenic isolate A21d2. Instead, this isolate has a horizontally acquired gene chimera that encodes a novel fusion protein with isopentenyltransferase and phosphoribohydrolase domains, predicted to be capable of catalyzing and activating cytokinins, respectively. Cytokinin profiling of the archetypal D188 isolate revealed only one active cytokinin type that was specifically synthesized in a fas-dependent manner. These results suggest that only the isopentenyladenine cytokinin type is synthesized and necessary for Rhodococcus phytopathogenicity, which is not consistent with the extant model stating that a mixture of cytokinins is necessary for Rhodococcus to cause leafy gall symptoms. In all, data indicate that only four horizontally acquired functions are sufficient to confer the trait of phytopathogenicity to members of the genetically diverse clade of Rhodococcus. PMID:25010934

  13. DNA Sequence Analyses Reveal Abundant Diversity, Endemism and Evidence for Asian Origin of the Porcini Mushrooms

    PubMed Central

    Feng, Bang; Xu, Jianping; Wu, Gang; Zeng, Nian-Kai; Li, Yan-Chun; Tolgor, Bau; Kost, Gerhard W.; Yang, Zhu L.

    2012-01-01

    The wild gourmet mushroom Boletus edulis and its close allies are of significant ecological and economic importance. They are found throughout the Northern Hemisphere, but despite their ubiquity there are still many unresolved issues with regard to the taxonomy, systematics and biogeography of this group of mushrooms. Most phylogenetic studies of Boletus so far have characterized samples from North America and Europe and little information is available on samples from other areas, including the ecologically and geographically diverse regions of China. Here we analyzed DNA sequence variation in three gene markers from samples of these mushrooms from across China and compared our findings with those from other representative regions. Our results revealed fifteen novel phylogenetic species (about one-third of the known species) and a newly identified lineage represented by Boletus sp. HKAS71346 from tropical Asia. The phylogenetic analyses support eastern Asia as the center of diversity for the porcini sensu stricto clade. Within this clade, B. edulis is the only known holarctic species. The majority of the other phylogenetic species are geographically restricted in their distributions. Furthermore, molecular dating and geological evidence suggest that this group of mushrooms originated during the Eocene in eastern Asia, followed by dispersal to and subsequent speciation in other parts of Asia, Europe, and the Americas from the middle Miocene through the early Pliocene. In contrast to the ancient dispersal of porcini in the strict sense in the Northern Hemisphere, the occurrence of B. reticulatus and B. edulis sensu lato in the Southern Hemisphere was probably due to recent human-mediated introductions. PMID:22629418

  14. Complex Evolutionary History of the Aeromonas veronii Group Revealed by Host Interaction and DNA Sequence Data

    PubMed Central

    Faucher, Joshua; Horneman, Amy J.; Gogarten, J. Peter; Graf, Joerg

    2011-01-01

    Aeromonas veronii biovar sobria, Aeromonas veronii biovar veronii, and Aeromonas allosaccharophila are a closely related group of organisms, the Aeromonas veronii Group, that inhabit a wide range of host animals as a symbiont or pathogen. In this study, the ability of various strains to colonize the medicinal leech as a model for beneficial symbiosis and to kill wax worm larvae as a model for virulence was determined. Isolates cultured from the leech out-competed other strains in the leech model, while most strains were virulent in the wax worms. Three housekeeping genes, recA, dnaJ and gyrB, the gene encoding chitinase, chiA, and four loci associated with the type three secretion system, ascV, ascFG, aexT, and aexU were sequenced. The phylogenetic reconstruction failed to produce one consensus tree that was compatible with most of the individual genes. The Approximately Unbiased test and the Genetic Algorithm for Recombination Detection both provided further support for differing evolutionary histories among this group of genes. Two contrasting tests detected recombination within aexU, ascFG, ascV, dnaJ, and gyrB but not in aexT or chiA. Quartet decomposition analysis indicated a complex recent evolutionary history for these strains with a high frequency of horizontal gene transfer between several but not among all strains. In this study we demonstrate that at least for some strains, horizontal gene transfer occurs at a sufficient frequency to blur the signal from vertically inherited genes, despite strains being adapted to distinct niches. Simply increasing the number of genes included in the analysis is unlikely to overcome this challenge in organisms that occupy multiple niches and can exchange DNA between strains specialized to different niches. Instead, the detection of genes critical in the adaptation to specific niches may help to reveal the physiological specialization of these strains. PMID:21359176

  15. Complex evolutionary history of the Aeromonas veronii group revealed by host interaction and DNA sequence data.

    PubMed

    Silver, Adam C; Williams, David; Faucher, Joshua; Horneman, Amy J; Gogarten, J Peter; Graf, Joerg

    2011-02-16

    Aeromonas veronii biovar sobria, Aeromonas veronii biovar veronii, and Aeromonas allosaccharophila are a closely related group of organisms, the Aeromonas veronii Group, that inhabit a wide range of host animals as a symbiont or pathogen. In this study, the ability of various strains to colonize the medicinal leech as a model for beneficial symbiosis and to kill wax worm larvae as a model for virulence was determined. Isolates cultured from the leech out-competed other strains in the leech model, while most strains were virulent in the wax worms. Three housekeeping genes, recA, dnaJ and gyrB, the gene encoding chitinase, chiA, and four loci associated with the type three secretion system, ascV, ascFG, aexT, and aexU were sequenced. The phylogenetic reconstruction failed to produce one consensus tree that was compatible with most of the individual genes. The Approximately Unbiased test and the Genetic Algorithm for Recombination Detection both provided further support for differing evolutionary histories among this group of genes. Two contrasting tests detected recombination within aexU, ascFG, ascV, dnaJ, and gyrB but not in aexT or chiA. Quartet decomposition analysis indicated a complex recent evolutionary history for these strains with a high frequency of horizontal gene transfer between several but not among all strains. In this study we demonstrate that at least for some strains, horizontal gene transfer occurs at a sufficient frequency to blur the signal from vertically inherited genes, despite strains being adapted to distinct niches. Simply increasing the number of genes included in the analysis is unlikely to overcome this challenge in organisms that occupy multiple niches and can exchange DNA between strains specialized to different niches. Instead, the detection of genes critical in the adaptation to specific niches may help to reveal the physiological specialization of these strains.

  16. Temporal Dynamics of Avian Populations during Pleistocene Revealed by Whole-Genome Sequences.

    PubMed

    Nadachowska-Brzyska, Krystyna; Li, Cai; Smeds, Linnea; Zhang, Guojie; Ellegren, Hans

    2015-05-18

    Global climate fluctuations have significantly influenced the distribution and abundance of biodiversity. During unfavorable glacial periods, many species experienced range contraction and fragmentation, expanding again during interglacials. An understanding of the evolutionary consequences of both historical and ongoing climate changes requires knowledge of the temporal dynamics of population numbers during such climate cycles. Variation in abundance should have left clear signatures in the patterns of intraspecific genetic variation in extant species, from which historical effective population sizes (N(e)) can be estimated. We analyzed whole-genome sequences of 38 avian species in a pairwise sequentially Markovian coalescent (PSMC, [5]) framework to quantitatively reveal changes in N(e) from approximately 10 million to 10 thousand years ago. Significant fluctuations in N(e) over time were evident for most species. The most pronounced pattern observed in many species was a severe reduction in N(e) coinciding with the beginning of the last glacial period (LGP). Among species, N(e) varied by at least three orders of magnitude, exceeding 1 million in the most abundant species. Several species on the IUCN Red List of Threatened Species showed long-term reduction in population size, predating recent declines. We conclude that cycles of population expansions and contractions have been a common feature of many bird species during the Quaternary period, likely coinciding with climate cycles. Population size reduction should have increased the risk of extinction but may also have promoted speciation. Species that have experienced long-term declines may be especially vulnerable to recent anthropogenic threats.

  17. Deep small RNA sequencing from the nematode Ascaris reveals conservation, functional diversification, and novel developmental profiles.

    PubMed

    Wang, Jianbin; Czech, Benjamin; Crunk, Amanda; Wallace, Adam; Mitreva, Makedonka; Hannon, Gregory J; Davis, Richard E

    2011-09-01

    Eukaryotic cells express several classes of small RNAs that regulate gene expression and ensure genome maintenance. Endogenous siRNAs (endo-siRNAs) and Piwi-interacting RNAs (piRNAs) mainly control gene and transposon expression in the germline, while microRNAs (miRNAs) generally function in post-transcriptional gene silencing in both somatic and germline cells. To provide an evolutionary and developmental perspective on small RNA pathways in nematodes, we identified and characterized known and novel small RNA classes through gametogenesis and embryo development in the parasitic nematode Ascaris suum and compared them with known small RNAs of Caenorhabditis elegans. piRNAs, Piwi-clade Argonautes, and other proteins associated with the piRNA pathway have been lost in Ascaris. miRNAs are synthesized immediately after fertilization in utero, before pronuclear fusion, and before the first cleavage of the zygote. This is the earliest expression of small RNAs ever described at a developmental stage long thought to be transcriptionally quiescent. A comparison of the two classes of Ascaris endo-siRNAs, 22G-RNAs and 26G-RNAs, to those in C. elegans, suggests great diversification and plasticity in the use of small RNA pathways during spermatogenesis in different nematodes. Our data reveal conserved characteristics of nematode small RNAs as well as features unique to Ascaris that illustrate significant flexibility in the use of small RNA pathways, some of which are likely an adaptation to Ascaris' life cycle and parasitism. The transcriptome assembly has been submitted to NCBI Transcriptome Shotgun Assembly Sequence Database (http://www.ncbi.nlm.nih.gov/genbank/TSA.html) under accession numbers JI163767–JI182837 and JI210738–JI257410.

  18. Genome sequencing reveals insights into physiology and longevity of the naked mole rat.

    PubMed

    Kim, Eun Bae; Fang, Xiaodong; Fushan, Alexey A; Huang, Zhiyong; Lobanov, Alexei V; Han, Lijuan; Marino, Stefano M; Sun, Xiaoqing; Turanov, Anton A; Yang, Pengcheng; Yim, Sun Hee; Zhao, Xiang; Kasaikina, Marina V; Stoletzki, Nina; Peng, Chunfang; Polak, Paz; Xiong, Zhiqiang; Kiezun, Adam; Zhu, Yabing; Chen, Yuanxin; Kryukov, Gregory V; Zhang, Qiang; Peshkin, Leonid; Yang, Lan; Bronson, Roderick T; Buffenstein, Rochelle; Wang, Bo; Han, Changlei; Li, Qiye; Chen, Li; Zhao, Wei; Sunyaev, Shamil R; Park, Thomas J; Zhang, Guojie; Wang, Jun; Gladyshev, Vadim N

    2011-10-12

    The naked mole rat (Heterocephalus glaber) is a strictly subterranean, extraordinarily long-lived eusocial mammal. Although it is the size of a mouse, its maximum lifespan exceeds 30 years, making this animal the longest-living rodent. Naked mole rats show negligible senescence, no age-related increase in mortality, and high fecundity until death. In addition to delayed ageing, they are resistant to both spontaneous cancer and experimentally induced tumorigenesis. Naked mole rats pose a challenge to the theories that link ageing, cancer and redox homeostasis. Although characterized by significant oxidative stress, the naked mole rat proteome does not show age-related susceptibility to oxidative damage or increased ubiquitination. Naked mole rats naturally reside in large colonies with a single breeding female, the 'queen', who suppresses the sexual maturity of her subordinates. They also live in full darkness, at low oxygen and high carbon dioxide concentrations, and are unable to sustain thermogenesis or feel certain types of pain. Here we report the sequencing and analysis of the naked mole rat genome, which reveals unique genome features and molecular adaptations consistent with cancer resistance, poikilothermy, hairlessness and insensitivity to low oxygen, and altered visual function, circadian rhythms and taste sensing. This information provides insights into the naked mole rat's exceptional longevity and ability to live in hostile conditions, in the dark and at low oxygen. The extreme traits of the naked mole rat, together with the reported genome and transcriptome information, offer opportunities for understanding ageing and advancing other areas of biological and biomedical research.

  19. An analysis of amino acid sequences surrounding archaeal glycoprotein sequons.

    PubMed

    Abu-Qarn, Mehtap; Eichler, Jerry

    2007-05-01

    Despite having provided the first example of a prokaryal glycoprotein, little is known of the rules governing the N-glycosylation process in Archaea. As in Eukarya and Bacteria, archaeal N-glycosylation takes place at the Asn residues of Asn-X-Ser/Thr sequons. Since not all sequons are utilized, it is clear that other factors, including the context in which a sequon exists, affect glycosylation efficiency. As yet, the contribution to N-glycosylation made by sequon-bordering residues and other related factors in Archaea remains unaddressed. In the following, the surroundings of Asn residues confirmed by experiment as modified were analyzed in an attempt to define sequence rules and requirements for archaeal N-glycosylation.
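
    The kind of sequon scan described in this record can be sketched in a few lines of Python. This is a minimal illustration only, not the authors' method: the function name `find_sequons`, the `flank` parameter, and the toy sequence are hypothetical. The pattern follows the Asn-X-Ser/Thr definition given in the abstract and places no restriction on X.

```python
import re

# Asn-X-Ser/Thr in one-letter code: N, any residue, then S or T.
# The lookahead lets overlapping sequons (e.g. "NNST") be reported separately.
SEQUON = re.compile(r"(?=(N[A-Z][ST]))")

def find_sequons(protein, flank=5):
    """Return (1-based Asn position, sequon, upstream flank, downstream flank)."""
    hits = []
    for match in SEQUON.finditer(protein):
        pos = match.start()
        upstream = protein[max(0, pos - flank):pos]
        downstream = protein[pos + 3:pos + 3 + flank]
        hits.append((pos + 1, match.group(1), upstream, downstream))
    return hits

# Toy sequence (hypothetical, not taken from any archaeal protein).
for hit in find_sequons("MKNNSTAANGTQ", flank=2):
    print(hit)
```

    Collecting the flanking residues of experimentally confirmed sites in this way is one plausible starting point for tabulating which neighbouring residues are over- or under-represented.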

  20. Urinary metabolomics in Fxr-null mice reveals activated adaptive metabolic pathways upon bile acid challenge.

    PubMed

    Cho, Joo-Youn; Matsubara, Tsutomu; Kang, Dong Wook; Ahn, Sung-Hoon; Krausz, Kristopher W; Idle, Jeffrey R; Luecke, Hans; Gonzalez, Frank J

    2010-05-01

    Farnesoid X receptor (FXR) is a nuclear receptor that regulates genes involved in synthesis, metabolism, and transport of bile acids and thus plays a major role in maintaining bile acid homeostasis. In this study, metabolomic responses were investigated in urine of wild-type and Fxr-null mice fed cholic acid, an FXR ligand, using ultra-performance liquid chromatography (UPLC) coupled with electrospray time-of-flight mass spectrometry (TOFMS). Multivariate data analysis between wild-type and Fxr-null mice on a cholic acid diet revealed that the most increased ions were metabolites of p-cresol (4-methylphenol), corticosterone, and cholic acid in Fxr-null mice. The structural identities of the above metabolites were confirmed by chemical synthesis and by comparing retention time (RT) and/or tandem mass fragmentation patterns of the urinary metabolites with the authentic standards. Tauro-3alpha,6,7alpha,12alpha-tetrol (3alpha,6,7alpha,12alpha-tetrahydroxy-5beta-cholestan-26-oyltaurine), one of the most increased metabolites in Fxr-null mice on a CA diet, is a marker for efficient hydroxylation of toxic bile acids possibly through induction of Cyp3a11. A cholestatic model induced by lithocholic acid revealed that enhanced expression of Cyp3a11 is the major defense mechanism to detoxify cholestatic bile acids in Fxr-null mice. These results will be useful for identification of biomarkers for cholestasis and for determination of adaptive molecular mechanisms in cholestasis.

  1. Nucleic and Amino Acid Sequences Support Structure-Based Viral Classification

    PubMed Central

    Sinclair, Robert M.; Ravantti, Janne J.

    2017-01-01

    ABSTRACT Viral capsids ensure viral genome integrity by protecting the enclosed nucleic acids. Interactions between the genome and capsid and between individual capsid proteins (i.e., capsid architecture) are intimate and are expected to be characterized by strong evolutionary conservation. For this reason, a capsid structure-based viral classification has been proposed as a way to bring order to the viral universe. The seeming lack of sufficient sequence similarity to reproduce this classification has made it difficult to reject structural convergence as the basis for the classification. We reinvestigate whether the structure-based classification for viral coat proteins making icosahedral virus capsids is in fact supported by previously undetected sequence similarity. Since codon choices can influence nascent protein folding cotranslationally, we searched for both amino acid and nucleotide sequence similarity. To demonstrate the sensitivity of the approach, we identify a candidate gene for the pandoravirus capsid protein. We show that the structure-based classification is strongly supported by amino acid and also nucleotide sequence similarities, suggesting that the similarities are due to common descent. The correspondence between structure-based and sequence-based analyses of the same proteins shown here allows them to be used in future analyses of the relationship between linear sequence information and macromolecular function, as well as between linear sequence and protein folds. IMPORTANCE Viral capsids protect nucleic acid genomes, which in turn encode capsid proteins. This tight coupling of protein shell and nucleic acids, together with strong functional constraints on capsid protein folding and architecture, leads to the hypothesis that capsid protein-coding nucleotide sequences may retain signatures of ancient viral evolution. We have been able to show that this is indeed the case, using the major capsid proteins of viruses forming icosahedral capsids

  2. Classification of mouse VK groups based on the partial amino acid sequence to the first invariant tryptophan: impact of 14 new sequences from IgG myeloma proteins.

    PubMed

    Potter, M; Newell, J B; Rudikoff, S; Haber, E

    1982-12-01

    Fourteen new VK sequences derived from BALB/c IgG myeloma proteins were determined to the first invariant tryptophan (Trp 35). These partial sequences were compared with 65 other published VK sequences using a computer program. The 79 sequences were organized according to the length of the sequence from the amino terminus to the first invariant tryptophan (Trp 35), into seven groups (33, 34, 35, 36, 39, 40 and 41aa). A distance matrix of all 79 sequences was then computed, i.e. the number of amino acid substitutions necessary to convert one sequence to another was determined. From these data a dendrogram was constructed. Most of the VK sequences fell into clusters or closely related groups. The definition of a sequence group is arbitrary but facilitates the classification of VK proteins. We used 12 substitutions as the basis for defining a sequence group based on the known number of substitutions that are found in the VK21 proteins. By this criterion there were 18 groups in the Trp 35 dendrogram. Twelve of the 14 new sequences fell into one of these sequence groups; two formed new sequence groups. Collective amino acid sequencing is still encountering new VK structures, indicating that more sequences will be required to attain an accurate estimate of the total number of VK groups. Updated dendrograms can be quickly generated to include newly generated sequences.
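
    The distance-matrix and grouping steps in this record can be illustrated with a small Python sketch. Several points are assumptions rather than details from the abstract: the sequences are taken to be of equal length (the record does not say how length differences were scored), groups are formed by single linkage, and the threshold is treated as inclusive (`<= max_subs`). The function names and the toy sequences are hypothetical.

```python
from itertools import combinations

def substitutions(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("this sketch only compares equal-length sequences")
    return sum(x != y for x, y in zip(a, b))

def distance_matrix(seqs):
    """Symmetric matrix of pairwise substitution counts."""
    n = len(seqs)
    d = [[0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        d[i][j] = d[j][i] = substitutions(seqs[i], seqs[j])
    return d

def group_by_threshold(seqs, max_subs=12):
    """Single-linkage groups: link any pair differing by <= max_subs."""
    d = distance_matrix(seqs)
    parent = list(range(len(seqs)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(seqs)), 2):
        if d[i][j] <= max_subs:
            parent[find(j)] = find(i)

    groups = {}
    for i in range(len(seqs)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())

# Toy data with a threshold of 1 substitution instead of 12.
toy = ["AAAA", "AAAT", "TTTT", "TTTA"]
print(distance_matrix(toy))
print(group_by_threshold(toy, max_subs=1))
```

    A dendrogram, as used in the study, carries more information than a flat grouping at one threshold; the sketch only shows how a fixed cutoff partitions the distance matrix.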

  3. Sequence-specific interaction between HIV-1 matrix protein and viral genomic RNA revealed by in vitro genetic selection.

    PubMed Central

    Purohit, P; Dupont, S; Stevenson, M; Green, M R

    2001-01-01

    The human immunodeficiency virus type-1 matrix protein (HIV-1 MA) is a multifunctional structural protein synthesized as part of the Pr55 gag polyprotein. We have used in vitro genetic selection to identify an RNA consensus sequence that specifically interacts with MA (Kd = 5 × 10⁻⁷ M). This 13-nt MA binding consensus sequence bears a high degree of homology (77%) to a region (nt 1433-1446) within the POL open reading frame of the HIV-1 genome (consensus sequence from 38 HIV-1 strains). Chemical interference experiments identified the nucleotides within the MA binding consensus sequence involved in direct contact with MA. We further demonstrate that this RNA-protein interaction is mediated through a stretch of basic amino acids within MA. Mutations that disrupt the interaction between MA and its RNA binding site within the HIV-1 genome resulted in a measurable decrease in viral replication. PMID:11345436

  4. Chloroplast phylogenomic data from the green algal order Sphaeropleales (Chlorophyceae, Chlorophyta) reveal complex patterns of sequence evolution.

    PubMed

    Fučíková, Karolina; Lewis, Paul O; Lewis, Louise A

    2016-05-01

    Chloroplast sequence data are widely used to infer phylogenies of plants and algae. With the increasing availability of complete chloroplast genome sequences, the opportunity arises to resolve ancient divergences that were heretofore problematic. On the flip side, properly analyzing large multi-gene data sets can be a major challenge, as these data may be riddled with systematic biases and conflicting signals. Our study contributes new data from nine complete and four fragmentary chloroplast genome sequences across the green algal order Sphaeropleales. Our phylogenetic analyses of a 56-gene data set show that analyzing these data on a nucleotide level yields a well-supported phylogeny - yet one that is quite different from a corresponding amino acid analysis. We offer some possible explanations for this conflict through a range of analyses of modified data sets. In addition, we characterize the newly sequenced genomes in terms of their structure and content, thereby further contributing to the knowledge of chloroplast genome evolution.

  5. Single-cell Sequencing of Thiomargarita Reveals Genomic Flexibility for Adaptation to Dynamic Redox Conditions.

    PubMed

    Winkel, Matthias; Salman-Carvalho, Verena; Woyke, Tanja; Richter, Michael; Schulz-Vogt, Heide N; Flood, Beverly E; Bailey, Jake V; Mußmann, Marc

    2016-01-01

    Large, colorless sulfur-oxidizing bacteria (LSB) of the family Beggiatoaceae form thick mats at sulfidic sediment surfaces, where they efficiently detoxify sulfide before it enters the water column. The genus Thiomargarita harbors the largest known free-living bacteria with cell sizes of up to 750 μm in diameter. In addition to their ability to oxidize reduced sulfur compounds, some Thiomargarita spp. are known to store large amounts of nitrate, phosphate and elemental sulfur internally. To date, little is known about their energy yielding metabolic pathways, and how these pathways compare to other Beggiatoaceae. Here, we present a draft single-cell genome of a chain-forming "Candidatus Thiomargarita nelsonii Thio36", and conduct a comparative analysis to five draft and one full genome of other members of the Beggiatoaceae. "Ca. T. nelsonii Thio36" is able to respire nitrate to both ammonium and dinitrogen, which allows it to flexibly respond to environmental changes. Genes for sulfur oxidation and inorganic carbon fixation confirmed that "Ca. T. nelsonii Thio36" can function as a chemolithoautotroph. Carbon can be fixed via the Calvin-Benson-Bassham cycle, which is common among the Beggiatoaceae. In addition, we found key genes of the reductive tricarboxylic acid cycle that point toward an alternative CO2 fixation pathway. Surprisingly, "Ca. T. nelsonii Thio36" also encodes key genes of the C2-cycle that convert 2-phosphoglycolate to 3-phosphoglycerate during photorespiration in higher plants and cyanobacteria. Moreover, we identified a novel trait of a flavin-based energy bifurcation pathway coupled to a Na(+)-translocating membrane complex (Rnf). The coupling of these pathways may be key to surviving long periods of anoxia. Like other Beggiatoaceae, "Ca. T. nelsonii Thio36" encodes many genes similar to those of (filamentous) cyanobacteria. In summary, the genome of "Ca. T. nelsonii Thio36" provides additional insight into the ecology of giant sulfur

  6. Not All Order Memory Is Equal: Test Demands Reveal Dissociations in Memory for Sequence Information

    ERIC Educational Resources Information Center

    Jonker, Tanya R.; MacLeod, Colin M.

    2017-01-01

    Remembering the order of a sequence of events is a fundamental feature of episodic memory. Indeed, a number of formal models represent temporal context as part of the memory system, and memory for order has been researched extensively. Yet, the nature of the code(s) underlying sequence memory is still relatively unknown. Across 4 experiments that…

  7. Whole genome sequencing of a begomovirus-resistant tomato inbred reveals introgressions from wild Solanum species

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The low cost of next generation sequencing (NGS) technology and the availability of a large number of well annotated plant genomes has made sequencing technology useful to breeding programs. With the published high quality tomato reference genome of the processing cultivar Heinz 1706, we can now uti...

  8. Significance of satellite DNA revealed by conservation of a widespread repeat DNA sequence among angiosperms.

    PubMed

    Mehrotra, Shweta; Goel, Shailendra; Raina, Soom Nath; Rajpal, Vijay Rani

    2014-08-01

    The analysis of plant genome structure and evolution requires comprehensive characterization of repetitive sequences that make up the majority of plant nuclear DNA. In the present study, we analyzed the nature of the pCtKpnI-I and pCtKpnI-II tandem repeated sequences, reported earlier in Carthamus tinctorius. Interestingly, a homolog of the pCtKpnI-I repeat sequence was also found to be present in widely divergent families of angiosperms. pCtKpnI-I showed high sequence similarity but low copy number among various taxa of different families of angiosperms analyzed. In comparison, pCtKpnI-II was specific to the genus Carthamus and was not present in any other taxa analyzed. The molecular structure of pCtKpnI-I was analyzed in various unrelated taxa of angiosperms to decipher the evolutionarily conserved nature of the sequence and its possible functional role.

  9. Analysis of Human mRNAs With the Reference Genome Sequence Reveals Potential Errors, Polymorphisms, and RNA Editing

    PubMed Central

    Furey, Terrence S.; Diekhans, Mark; Lu, Yontao; Graves, Tina A.; Oddy, Lachlan; Randall-Maher, Jennifer; Hillier, LaDeana W.; Wilson, Richard K.; Haussler, David

    2004-01-01

    The NCBI Reference Sequence (RefSeq) project and the NIH Mammalian Gene Collection (MGC) together define a set of ∼30,000 nonredundant human mRNA sequences with identified coding regions representing 17,000 distinct loci. These high-quality mRNA sequences allow for the identification of transcribed regions in the human genome sequence, and many researchers accept them as the correct representation of each defined gene sequence. Computational comparison of these mRNA sequences and the recently published essentially finished human genome sequence reveals several thousand undocumented nonsynonymous substitution and frame shift discrepancies between the two resources. Additional analysis is undertaken to verify that the euchromatic human genome is sufficiently complete—containing nearly the whole mRNA collection, thus allowing for a comprehensive analysis to be undertaken. Many of the discrepancies will prove to be genuine polymorphisms in the human population, somatic cell genomic variants, or examples of RNA editing. It is observed that the genome sequence variant has significant additional support from other mRNAs and ESTs, almost four times more often than does the mRNA variant, suggesting that the genome sequence is more accurate. In ∼15% of these cases, there is substantial support for both variants, suggestive of an undocumented polymorphism. An initial screening against a 24-individual genomic DNA diversity panel verified 60% of a small set of potential single nucleotide polymorphisms from which successful results could be obtained. We also find statistical evidence that a few of these discrepancies are due to RNA editing. Overall, these results suggest that the mRNA collections may contain a substantial number of errors. For current and future mRNA collections, it may be prudent to fully reconcile each genome sequence discrepancy, classifying each as a polymorphism, site of RNA editing or somatic cell variation, or genome sequence error. PMID:15489323
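
    As a rough illustration of how substitution discrepancies between an mRNA coding sequence and the corresponding genomic coding sequence might be classified, the following Python sketch compares two aligned, equal-length coding sequences codon by codon using the standard genetic code. It is not the pipeline used in the study: the function name `classify_discrepancies` and the toy sequences are hypothetical, the inputs are assumed to be already aligned and in frame, and frame shift discrepancies (which the abstract also reports) are deliberately not handled.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (DNA alphabet).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def classify_discrepancies(mrna_cds, genome_cds):
    """List codons that differ, labelled synonymous or nonsynonymous.

    Assumes both inputs are in-frame coding sequences of equal length;
    insertions, deletions and frame shifts are outside this sketch.
    """
    if len(mrna_cds) != len(genome_cds) or len(mrna_cds) % 3 != 0:
        raise ValueError("expected equal-length, in-frame coding sequences")
    results = []
    for start in range(0, len(mrna_cds), 3):
        m_codon = mrna_cds[start:start + 3]
        g_codon = genome_cds[start:start + 3]
        if m_codon == g_codon:
            continue
        same_aa = CODON_TABLE[m_codon] == CODON_TABLE[g_codon]
        label = "synonymous" if same_aa else "nonsynonymous"
        results.append((start // 3 + 1, m_codon, g_codon, label))
    return results

# Toy sequences (hypothetical): codon 2 differs silently, codon 3 changes Lys to Glu.
print(classify_discrepancies("ATGGCTAAA", "ATGGCCGAA"))
```

    Deciding whether a given discrepancy is a polymorphism, an RNA editing site, or a sequencing error requires the additional supporting evidence described in the abstract (other mRNAs, ESTs, diversity panels); the sketch covers only the first classification step.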

  10. Detection and isolation of nucleic acid sequences using competitive hybridization probes

    DOEpatents

    Lucas, J.N.; Straume, T.; Bogen, K.T.

    1997-04-01

    A method for detecting a target nucleic acid sequence in a sample is provided using hybridization probes which competitively hybridize to a target nucleic acid. According to the method, a target nucleic acid sequence is hybridized to first and second hybridization probes which are complementary to overlapping portions of the target nucleic acid sequence, the first hybridization probe including a first complexing agent capable of forming a binding pair with a second complexing agent and the second hybridization probe including a detectable marker. The first complexing agent attached to the first hybridization probe is contacted with a second complexing agent, the second complexing agent being attached to a solid support such that when the first and second complexing agents are attached, target nucleic acid sequences hybridized to the first hybridization probe become immobilized on to the solid support. The immobilized target nucleic acids are then separated and detected by detecting the detectable marker attached to the second hybridization probe. A kit for performing the method is also provided. 7 figs.

  11. Detection and isolation of nucleic acid sequences using competitive hybridization probes

    DOEpatents

    Lucas, Joe N.; Straume, Tore; Bogen, Kenneth T.

    1997-01-01

    A method for detecting a target nucleic acid sequence in a sample is provided using hybridization probes which competitively hybridize to a target nucleic acid. According to the method, a target nucleic acid sequence is hybridized to first and second hybridization probes which are complementary to overlapping portions of the target nucleic acid sequence, the first hybridization probe including a first complexing agent capable of forming a binding pair with a second complexing agent and the second hybridization probe including a detectable marker. The first complexing agent attached to the first hybridization probe is contacted with a second complexing agent, the second complexing agent being attached to a solid support such that when the first and second complexing agents are attached, target nucleic acid sequences hybridized to the first hybridization probe become immobilized on to the solid support. The immobilized target nucleic acids are then separated and detected by detecting the detectable marker attached to the second hybridization probe. A kit for performing the method is also provided.

  12. Trans genomic capture and sequencing of primate exomes reveals new targets of positive selection.

    PubMed

    George, Renee D; McVicker, Graham; Diederich, Rachel; Ng, Sarah B; MacKenzie, Alexandra P; Swanson, Willie J; Shendure, Jay; Thomas, James H

    2011-10-01

    Comparison of protein-coding DNA sequences from diverse primates can provide insight into these species' evolutionary history and uncover the molecular basis for their phenotypic differences. Currently, the number of available primate reference genomes limits these genome-wide comparisons. Here we use targeted capture methods designed for human to sequence the protein-coding regions, or exomes, of four non-human primate species (three Old World monkeys and one New World monkey). Despite average sequence divergence of up to 4% from the human sequence probes, we are able to capture ~96% of coding sequences. Using a combination of mapping and assembly techniques, we generated high-quality full-length coding sequences for each species. Both the number of nucleotide differences and the distribution of insertion and deletion (indel) lengths indicate that the quality of the assembled sequences is very high and exceeds that of most reference genomes. Using this expanded set of primate coding sequences, we performed a genome-wide scan for genes experiencing positive selection and identified a novel class of adaptively evolving genes involved in the conversion of epithelial cells in skin, hair, and nails to keratin. Interestingly, the genes we identify under positive selection also exhibit significantly increased allele frequency differences among human populations, suggesting that they play a role in both recent and long-term adaptation. We also identify several genes that have been lost on specific primate lineages, which illustrate the broad utility of this data set for other evolutionary analyses. These results demonstrate the power of second-generation sequencing in comparative genomics and greatly expand the repertoire of available primate coding sequences.

  13. Amino acid sequence around the active-site serine residue in the acyltransferase domain of goat mammary fatty acid synthetase.

    PubMed Central

    Mikkelsen, J; Højrup, P; Rasmussen, M M; Roepstorff, P; Knudsen, J

    1985-01-01

    Goat mammary fatty acid synthetase was labelled in the acyltransferase domain by formation of O-ester intermediates by incubation with [1-14C]acetyl-CoA and [2-14C]malonyl-CoA. Tryptic-digest and CNBr-cleavage peptides were isolated and purified by high-performance reverse-phase and ion-exchange liquid chromatography. The sequences of the malonyl- and acetyl-labelled peptides were shown to be identical. The results confirm the hypothesis that both acetyl and malonyl groups are transferred to the mammalian fatty acid synthetase complex by the same transferase. The sequence is compared with those of other fatty acid synthetase transferases. PMID:3922356

  14. Draft Genome Sequence of the Deep-Sea Basidiomycetous Yeast Cryptococcus sp. Strain Mo29 Reveals Its Biotechnological Potential

    PubMed Central

    Rédou, Vanessa; Kumar, Abhishek; Hainaut, Matthieu; Henrissat, Bernard; Record, Eric; Barbier, Georges

    2016-01-01

    Cryptococcus sp. strain Mo29 was isolated from the Rainbow hydrothermal site on the Mid-Atlantic Ridge. Here, we present the draft genome sequence of this basidiomycetous yeast strain, which has highlighted its biotechnological potential as revealed by the presence of genes involved in the synthesis of secondary metabolites and biotechnologically important enzymes. PMID:27389259

  15. Metagenome Sequencing Reveals Rhodococcus Dominance in Farpuk Cave, Mizoram, India, an Eastern Himalayan Biodiversity Hot Spot Region

    PubMed Central

    De Mandal, Surajit; Sanga, Zothan

    2015-01-01

    The present study employed 16S rRNA amplicon sequencing to survey the prokaryotic microbiota in Farpuk Cave, revealing a diverse bacterial community with 4,021 operational taxonomic units (OTUs), mainly dominated by the genus Rhodococcus. Moreover, 18.17% of the OTUs were unclassified at the phylum level, suggesting the existence of novel bacterial species. PMID:26067958

  16. Identification of a Bacteria Using Phylogenetic Relationships Revealed by MS/MS Sequencing of Tryptic Peptides Derived From Cellular Proteins

    DTIC Science & Technology

    2004-12-01

    ... phylogenetic relationships between bacterial species as a part of a hierarchical decision tree process. 1. INTRODUCTION The detection and ... based on analysis of electrospray ionization (ESI)-MS/MS data for the fast classification of analyzed bacteria, using phylogenetic relationships ...

  17. Ligation with nucleic acid sequence-based amplification.

    PubMed

    Ong, Carmichael; Tai, Warren; Sarma, Aartik; Opal, Steven M; Artenstein, Andrew W; Tripathi, Anubhav

    2012-01-01

    This work presents a novel method for detecting nucleic acid targets using a ligation step along with an isothermal, exponential amplification step. We use an engineered ssDNA with two variable regions on the ends, allowing us to design the probe for optimal reaction kinetics and primer binding. This two-part probe is ligated by T4 DNA Ligase only when both parts bind adjacently to the target. The assay demonstrates that the expected 72-nt RNA product appears only when the synthetic target, T4 ligase, and both probe fragments are present during the ligation step. An extraneous 38-nt RNA product also appears due to linear amplification of unligated probe (P3), but its presence does not cause a false-positive result. In addition, 40 mmol/L KCl in the final amplification mix was found to be optimal. It was also found that increasing P5 in excess of P3 helped with ligation and reduced the extraneous 38-nt RNA product. The assay was also tested with a single nucleotide polymorphism target, changing one base at the ligation site. The assay was able to yield a negative signal despite only a single-base change. Finally, using P3 and P5 with longer binding sites results in increased overall sensitivity of the reaction, showing that increasing ligation efficiency can improve the assay overall. We believe that this method can be used effectively for a number of diagnostic assays.

  18. Transcriptome Sequencing Reveals Wide Expression Reprogramming of Basal and Unknown Genes in Leptospira biflexa Biofilms

    PubMed Central

    Spangenberg, Lucía; Lopes Bastos, Bruno; Graña, Martín; Vasconcelos, Larissa; Almeida, Áurea; Greif, Gonzalo; Robello, Carlos; Ristow, Paula

    2016-01-01

    ABSTRACT The genus Leptospira is composed of pathogenic and saprophytic spirochetes. Pathogenic Leptospira is the etiological agent of leptospirosis, a globally spread neglected disease. A key ecological feature of some pathogenic species is their ability to survive both within and outside the host. For most leptospires, the ability to persist outside the host is associated with biofilm formation, a most important bacterial strategy to face and overcome hostile environmental conditions. The architecture and biochemistry of leptospiral biofilms are rather well understood; however, the genetic program underpinning biofilm formation remains mostly unknown. In this work, we used the saprophyte Leptospira biflexa as a model organism to assess over- and underrepresented transcripts during the biofilm state, using transcriptome sequencing (RNA-seq) technology. Our results showed that some basal biological processes like DNA replication and cell division are downregulated in the mature biofilm. Additionally, we identified significant expression reprogramming for genes involved in motility, sugar/lipid metabolism, and iron scavenging, as well as for outer membrane-encoding genes. A careful manual annotation process allowed us to assign molecular functions to many previously uncharacterized genes that are probably involved in biofilm metabolism. We also provided evidence for the presence of small regulatory RNAs in this species. Finally, coexpression networks were reconstructed to pinpoint functionally related gene clusters that may explain how biofilm maintenance is regulated. Beyond elucidating some genetic aspects of biofilm formation, this work reveals a number of pathways whose functional dissection may impact our understanding of leptospiral biology, in particular how these organisms adapt to environmental changes. 
IMPORTANCE In this work, we describe the first transcriptome based on RNA-seq technology focused on studying transcriptional changes associated with biofilm

  19. Thin-film technology for direct visual detection of nucleic acid sequences: applications in clinical research.

    PubMed

    Jenison, Robert D; Bucala, Richard; Maul, Diana; Ward, David C

    2006-01-01

    Certain optical conditions permit the unaided eye to detect thickness changes on surfaces on the order of 20 Å, which are of similar dimensions to monomolecular interactions between proteins or hybridization of complementary nucleic acid sequences. Such detection exploits specific interference of reflected white light, wherein thickness changes are perceived as surface color changes. This technology, termed thin-film detection, allows for the visualization of subattomole amounts of nucleic acid targets, even in complex clinical samples. Thin-film technology has been applied to a broad range of clinically relevant indications, including the detection of pathogenic bacterial and viral nucleic acid sequences and the discrimination of sequence variations in human genes causally related to susceptibility or severity of disease.

  20. Conservation of Shannon's redundancy for proteins. [information theory applied to amino acid sequences]

    NASA Technical Reports Server (NTRS)

    Gatlin, L. L.

    1974-01-01

    Concepts of information theory are applied to examine various proteins in terms of their redundancy in natural originators such as animals and plants. The Monte Carlo method is used to derive information parameters for random protein sequences. Real protein sequence parameters are compared with the standard parameters of protein sequences having a specific length. The tendency of a chain to contain some amino acids more frequently than others and the tendency of a chain to contain certain amino acid pairs more frequently than other pairs are used as randomness measures of individual protein sequences. Non-periodic proteins are generally found to have random Shannon redundancies except in cases of constraints due to short chain length and genetic codes. Redundant characteristics of highly periodic proteins are discussed. A degree of periodicity parameter is derived.
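
The first randomness measure mentioned above (unequal amino acid frequencies within a chain) can be illustrated with a minimal sketch. It assumes a simple first-order definition of redundancy, R = 1 - H1 / log2(20), where H1 is the Shannon entropy of the single-residue frequencies. The function names and the 20-letter alphabet default are illustrative assumptions, not the exact parameters derived in the report, and the pair-frequency measure and the Monte Carlo baseline are not reproduced here.

```python
import math
from collections import Counter


def shannon_entropy(seq):
    """Shannon entropy (bits per residue) of the single-residue frequencies."""
    if not seq:
        raise ValueError("sequence must not be empty")
    n = len(seq)
    counts = Counter(seq)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def first_order_redundancy(seq, alphabet_size=20):
    """Assumed definition: R = 1 - H1 / log2(alphabet_size).

    R is 0 when all residues are equally frequent and 1 when the
    chain consists of a single repeated residue.
    """
    return 1.0 - shannon_entropy(seq) / math.log2(alphabet_size)


if __name__ == "__main__":
    # Hypothetical toy sequence, used only to exercise the functions.
    print(first_order_redundancy("MKTAYIAKQRQISFVKSHFSRQ"))
```

A real comparison along the lines of the abstract would also compute the same quantity for many random sequences of the same length, since short chains show nonzero redundancy by chance alone.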

  1. RNA internal standard synthesis by nucleic acid sequence-based amplification for competitive quantitative amplification reactions.

    PubMed

    Lo, Wan-Yu; Baeumner, Antje J

    2007-02-15

    Nucleic acid sequence-based amplification (NASBA) reactions have been demonstrated to successfully synthesize new sequences based on deletion and insertion reactions. Two RNA internal standards were synthesized for use in competitive amplification reactions in which quantitative analysis can be achieved by coamplifying the internal standard with the wild type sample. The sequences were created in two consecutive NASBA reactions using the E. coli clpB mRNA sequence as model analyte. The primer sequences of the wild type sequence were maintained, and a 20-nt-long segment inside the amplicon region was exchanged for a new segment of similar GC content and melting temperature. The new RNA sequence was thus amplifiable using the wild type primers and detectable via a new inserted sequence. In the first reaction, the forwarding primer and an additional 20-nt-long sequence was deleted and replaced by a new 20-nt-long sequence. In the second reaction, a forwarding primer containing as 5' overhang sequence the wild type primer sequence was used. The presence of pure internal standard was verified using electrochemiluminescence and RNA lateral-flow biosensor analysis. Additional sequence deletion in order to shorten the internal standard amplicons and thus generate higher detection signals was found not to be required. Finally, a competitive NASBA reaction between one internal standard and the wild type sequence was carried out proving its functionality. This new rapid construction method via NASBA provides advantages over the traditional techniques since it requires no traditional cloning procedures, no thermocyclers, and can be completed in less than 4 h.

  2. The High-throughput sequencing of Sillago japonica mitochondrial genome reveals the phylogenetic position within the genus Sillago.

    PubMed

    Niu, Sufang; Wu, Renxie; Liu, Yong; Wang, Xuefeng

    2016-09-01

    The complete mitogenome of Sillago japonica was determined through high-throughput DNA sequencing technology. The circular mtDNA molecule was 16 645 bp in size and encoded 13 protein-coding genes, 2 rRNAs, 22 tRNAs and 2 non-coding regions, with the gene arrangement and content identical to other typical vertebrate mitogenomes. The identity analysis revealed that the mitogenome sequence of S. japonica shared a relatively high sequence identity to S. asiatica (81.5%) compared with S. aeolus (77.5%), S. indica (77.1%), and S. sihama (76.3%). The neighbor-joining tree of complete mitogenome sequence showed that S. japonica firstly clustered together with S. asiatica, then grouped with S. indica and S. sihama, and finally gathered with S. aeolus. Taken together, the results absolutely supported the evolutionary position of S. japonica and provided new insights into phylogenetic relationships of Sillago.
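
Pairwise identity percentages such as those quoted above are typically computed from aligned sequences. The sketch below shows one common convention (matching columns divided by compared columns, skipping columns where both sequences have a gap). The abstract does not state which tool or convention the authors used, so the function name, the gap symbol, and the gap handling are assumptions.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity between two already-aligned sequences.

    Columns where both sequences carry a gap are ignored; a gap
    opposite a residue counts as a mismatch (an assumed convention).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap and y == gap:
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Hypothetical toy alignment, not real mitogenome data.
    print(percent_identity("ACGT-A", "ACGTTA"))
```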

  3. The phylogenetic relationship of tetrapod, coelacanth, and lungfish revealed by the sequences of forty-four nuclear genes.

    PubMed

    Takezaki, Naoko; Figueroa, Felipe; Zaleska-Rutczynska, Zofia; Takahata, Naoyuki; Klein, Jan

    2004-08-01

    The origin of tetrapods is a major outstanding issue in vertebrate phylogeny. Each of the three possible principal hypotheses (coelacanth, lungfish, or neither being the sister group of tetrapods) has found support in different sets of data. In an attempt to resolve the controversy, sequences of 44 nuclear genes encoding amino acid residues at 10,404 positions were obtained and analyzed. However, this large set of sequences did not support conclusively one of the three hypotheses. Apparently, the coelacanth, lungfish, and tetrapod lineages diverged within such a short time interval that at this level of analysis, their relationships appear to be an irresolvable trichotomy.

  4. Conserved intergenic sequences revealed by CTAG-profiling in Salmonella: thermodynamic modeling for function prediction

    PubMed Central

    Tang, Le; Zhu, Songling; Mastriani, Emilio; Fang, Xin; Zhou, Yu-Jie; Li, Yong-Guo; Johnston, Randal N.; Guo, Zheng; Liu, Gui-Rong; Liu, Shu-Lin

    2017-01-01

    Highly conserved short sequences help identify functional genomic regions and facilitate genomic annotation. We used Salmonella as the model to search the genome for evolutionarily conserved regions and focused on the tetranucleotide sequence CTAG for its potentially important functions. In Salmonella, CTAG is highly conserved across the lineages and large numbers of CTAG-containing short sequences fall in intergenic regions, strongly indicating their biological importance. Computer modeling demonstrated stable stem-loop structures in some of the CTAG-containing intergenic regions, and substitution of a nucleotide of the CTAG sequence would radically rearrange the free energy and disrupt the structure. The postulated degeneration of CTAG takes distinct patterns among Salmonella lineages and provides novel information about genomic divergence and evolution of these bacterial pathogens. Comparison of the vertically and horizontally transmitted genomic segments showed different CTAG distribution landscapes, with the genome amelioration process to remove CTAG taking place inward from both terminals of the horizontally acquired segment. PMID:28262684

  5. Conserved intergenic sequences revealed by CTAG-profiling in Salmonella: thermodynamic modeling for function prediction

    NASA Astrophysics Data System (ADS)

    Tang, Le; Zhu, Songling; Mastriani, Emilio; Fang, Xin; Zhou, Yu-Jie; Li, Yong-Guo; Johnston, Randal N.; Guo, Zheng; Liu, Gui-Rong; Liu, Shu-Lin

    2017-03-01

    Highly conserved short sequences help identify functional genomic regions and facilitate genomic annotation. We used Salmonella as the model to search the genome for evolutionarily conserved regions and focused on the tetranucleotide sequence CTAG for its potentially important functions. In Salmonella, CTAG is highly conserved across the lineages and large numbers of CTAG-containing short sequences fall in intergenic regions, strongly indicating their biological importance. Computer modeling demonstrated stable stem-loop structures in some of the CTAG-containing intergenic regions, and substitution of a nucleotide of the CTAG sequence would radically rearrange the free energy and disrupt the structure. The postulated degeneration of CTAG takes distinct patterns among Salmonella lineages and provides novel information about genomic divergence and evolution of these bacterial pathogens. Comparison of the vertically and horizontally transmitted genomic segments showed different CTAG distribution landscapes, with the genome amelioration process to remove CTAG taking place inward from both terminals of the horizontally acquired segment.

  6. Conserved intergenic sequences revealed by CTAG-profiling in Salmonella: thermodynamic modeling for function prediction.

    PubMed

    Tang, Le; Zhu, Songling; Mastriani, Emilio; Fang, Xin; Zhou, Yu-Jie; Li, Yong-Guo; Johnston, Randal N; Guo, Zheng; Liu, Gui-Rong; Liu, Shu-Lin

    2017-03-06

    Highly conserved short sequences help identify functional genomic regions and facilitate genomic annotation. We used Salmonella as the model to search the genome for evolutionarily conserved regions and focused on the tetranucleotide sequence CTAG for its potentially important functions. In Salmonella, CTAG is highly conserved across the lineages and large numbers of CTAG-containing short sequences fall in intergenic regions, strongly indicating their biological importance. Computer modeling demonstrated stable stem-loop structures in some of the CTAG-containing intergenic regions, and substitution of a nucleotide of the CTAG sequence would radically rearrange the free energy and disrupt the structure. The postulated degeneration of CTAG takes distinct patterns among Salmonella lineages and provides novel information about genomic divergence and evolution of these bacterial pathogens. Comparison of the vertically and horizontally transmitted genomic segments showed different CTAG distribution landscapes, with the genome amelioration process to remove CTAG taking place inward from both terminals of the horizontally acquired segment.

  7. Multilocus sequence typing reveals that Bacillus cereus strains isolated from clinical infections have distinct phylogenetic origins.

    PubMed

    Barker, Margaret; Thakker, Bishan; Priest, Fergus G

    2005-04-01

    Eight strains of Bacillus cereus isolated from bacteremia and soft tissue infections were assigned to seven sequence types (STs) by multilocus sequence typing (MLST). Two strains from different locations had identical STs. The concatenated sequences of the seven STs were aligned with 65 concatenated sequences from reference STs and a neighbor-joining tree was constructed. Two strains were distantly related to all reference STs. Three strains were recovered in a clade that included Bacillus anthracis, B. cereus and rare Bacillus thuringiensis strains while the other three strains were assigned to two STs that were more closely affiliated to most of the B. thuringiensis STs. We conclude that invasive B. cereus strains do not form a single clone or clonal complex of highly virulent strains.

  8. De Novo transcriptome sequencing reveals important molecular networks and metabolic pathways of the plant, Chlorophytum borivilianum.

    PubMed

    Kalra, Shikha; Puniya, Bhanwar Lal; Kulshreshtha, Deepika; Kumar, Sunil; Kaur, Jagdeep; Ramachandran, Srinivasan; Singh, Kashmir

    2013-01-01

    Chlorophytum borivilianum, an endangered medicinal plant species is highly recognized for its aphrodisiac properties provided by saponins present in the plant. The transcriptome information of this species is limited and only few hundred expressed sequence tags (ESTs) are available in the public databases. To gain molecular insight of this plant, high throughput transcriptome sequencing of leaf RNA was carried out using Illumina's HiSeq 2000 sequencing platform. A total of 22,161,444 single end reads were retrieved after quality filtering. Available (e.g., De-Bruijn/Eulerian graph) and in-house developed bioinformatics tools were used for assembly and annotation of transcriptome. A total of 101,141 assembled transcripts were obtained, with coverage size of 22.42 Mb and average length of 221 bp. Guanine-cytosine (GC) content was found to be 44%. Bioinformatics analysis, using non-redundant proteins, gene ontology (GO), enzyme commission (EC) and kyoto encyclopedia of genes and genomes (KEGG) databases, extracted all the known enzymes involved in saponin and flavonoid biosynthesis. Few genes of the alkaloid biosynthesis, along with anticancer and plant defense genes, were also discovered. Additionally, several cytochrome P450 (CYP450) and glycosyltransferase unique sequences were also found. We identified simple sequence repeat motifs in transcripts with an abundance of di-nucleotide simple sequence repeat (SSR; 43.1%) markers. Large scale expression profiling through Reads per Kilobase per Million mapped reads (RPKM) showed major genes involved in different metabolic pathways of the plant. Genes, expressed sequence tags (ESTs) and unique sequences from this study provide an important resource for the scientific community, interested in the molecular genetics and functional genomics of C. borivilianum.
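
The RPKM measure named above normalizes a transcript's read count by both transcript length and sequencing depth. A minimal sketch of the standard formula follows; the function and parameter names are illustrative, and the example numbers are hypothetical rather than taken from the study.

```python
def rpkm(reads_mapped_to_transcript, transcript_length_bp, total_mapped_reads):
    """Reads per Kilobase per Million mapped reads.

    RPKM = reads * 1e9 / (transcript length in bp * total mapped reads)
    """
    if transcript_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and total mapped reads must be positive")
    return reads_mapped_to_transcript * 1e9 / (
        transcript_length_bp * total_mapped_reads
    )


if __name__ == "__main__":
    # Hypothetical values: 100 reads on a 1,000 bp transcript
    # in a library of 1,000,000 mapped reads.
    print(rpkm(100, 1000, 1_000_000))
```

Because the value is scaled per kilobase and per million reads, transcripts of different lengths and libraries of different depths can be compared on the same scale.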

  9. De Novo Transcriptome Sequencing Reveals Important Molecular Networks and Metabolic Pathways of the Plant, Chlorophytum borivilianum

    PubMed Central

    Kalra, Shikha; Puniya, Bhanwar Lal; Kulshreshtha, Deepika; Kumar, Sunil; Kaur, Jagdeep; Ramachandran, Srinivasan; Singh, Kashmir

    2013-01-01

    Chlorophytum borivilianum, an endangered medicinal plant species is highly recognized for its aphrodisiac properties provided by saponins present in the plant. The transcriptome information of this species is limited and only few hundred expressed sequence tags (ESTs) are available in the public databases. To gain molecular insight of this plant, high throughput transcriptome sequencing of leaf RNA was carried out using Illumina's HiSeq 2000 sequencing platform. A total of 22,161,444 single end reads were retrieved after quality filtering. Available (e.g., De-Bruijn/Eulerian graph) and in-house developed bioinformatics tools were used for assembly and annotation of transcriptome. A total of 101,141 assembled transcripts were obtained, with coverage size of 22.42 Mb and average length of 221 bp. Guanine-cytosine (GC) content was found to be 44%. Bioinformatics analysis, using non-redundant proteins, gene ontology (GO), enzyme commission (EC) and kyoto encyclopedia of genes and genomes (KEGG) databases, extracted all the known enzymes involved in saponin and flavonoid biosynthesis. Few genes of the alkaloid biosynthesis, along with anticancer and plant defense genes, were also discovered. Additionally, several cytochrome P450 (CYP450) and glycosyltransferase unique sequences were also found. We identified simple sequence repeat motifs in transcripts with an abundance of di-nucleotide simple sequence repeat (SSR; 43.1%) markers. Large scale expression profiling through Reads per Kilobase per Million mapped reads (RPKM) showed major genes involved in different metabolic pathways of the plant. Genes, expressed sequence tags (ESTs) and unique sequences from this study provide an important resource for the scientific community, interested in the molecular genetics and functional genomics of C. borivilianum. PMID:24376689

  10. Characterization and phylogenetic analysis of α-gliadin gene sequences reveals significant genomic divergence in Triticeae species.

    PubMed

    Li, Guang-Rong; Lang, Tao; Yang, En-Nian; Liu, Cheng; Yang, Zu-Jun

    2014-12-01

    Although the unique properties of wheat α-gliadin gene family are well characterized, little is known about the evolution and genomic divergence of α-gliadin gene family within the Triticeae. We isolated a total of 203 α-gliadin gene sequences from 11 representative diploid and polyploid Triticeae species, and found 108 sequences putatively functional. Our results indicate that α-gliadin genes may have possibly originated from wild Secale species, where the sequences contain the shortest repetitive domains and display minimum variation. A miniature inverted-repeat transposable element insertion is reported for the first time in α-gliadin gene sequence of Thinopyrum intermedium in this study, indicating that the transposable element might have contributed to the diversification of α-gliadin genes family among Triticeae genomes. The phylogenetic analyses revealed that the α-gliadin gene sequences of Dasypyrum, Australopyrum, Lophopyrum, Eremopyrum and Pseudoroengeria species have amplified several times. A search for four typical toxic epitopes for celiac disease within the Triticeae α-gliadin gene sequences showed that the α-gliadins of wild Secale, Australopyrum and Agropyron genomes lack all four epitopes, while other Triticeae species have accumulated these epitopes, suggesting that the evolution of these toxic epitopes sequences occurred during the course of speciation, domestication or polyploidization of Triticeae.

  11. Sequence dependence of electron-induced DNA strand breakage revealed by DNA nanoarrays

    NASA Astrophysics Data System (ADS)

    Keller, Adrian; Rackwitz, Jenny; Cauët, Emilie; Liévin, Jacques; Körzdörfer, Thomas; Rotaru, Alexandru; Gothelf, Kurt V.; Besenbacher, Flemming; Bald, Ilko

    2014-12-01

    The electronic structure of DNA is determined by its nucleotide sequence, which is for instance exploited in molecular electronics. Here we demonstrate that also the DNA strand breakage induced by low-energy electrons (18 eV) depends on the nucleotide sequence. To determine the absolute cross sections for electron induced single strand breaks in specific 13-mer oligonucleotides we used atomic force microscopy analysis of DNA origami based DNA nanoarrays. We investigated the DNA sequences 5'-TT(XYX)3TT with X = A, G, C and Y = T, BrU (5-bromouracil) and found absolute strand break cross sections between 2.66 × 10^-14 cm^2 and 7.06 × 10^-14 cm^2. The highest cross section was found for 5'-TT(ATA)3TT and 5'-TT(ABrUA)3TT, respectively. BrU is a radiosensitizer, which was discussed to be used in cancer radiation therapy. The replacement of T by BrU into the investigated DNA sequences leads to a slight increase of the absolute strand break cross sections resulting in sequence-dependent enhancement factors between 1.14 and 1.66. Nevertheless, the variation of strand break cross sections due to the specific nucleotide sequence is considerably higher. Thus, the present results suggest the development of targeted radiosensitizers for cancer radiation therapy.

  12. Sequence dependence of electron-induced DNA strand breakage revealed by DNA nanoarrays

    PubMed Central

    Keller, Adrian; Rackwitz, Jenny; Cauët, Emilie; Liévin, Jacques; Körzdörfer, Thomas; Rotaru, Alexandru; Gothelf, Kurt V.; Besenbacher, Flemming; Bald, Ilko

    2014-01-01

    The electronic structure of DNA is determined by its nucleotide sequence, which is for instance exploited in molecular electronics. Here we demonstrate that also the DNA strand breakage induced by low-energy electrons (18 eV) depends on the nucleotide sequence. To determine the absolute cross sections for electron induced single strand breaks in specific 13-mer oligonucleotides we used atomic force microscopy analysis of DNA origami based DNA nanoarrays. We investigated the DNA sequences 5′-TT(XYX)3TT with X = A, G, C and Y = T, BrU (5-bromouracil) and found absolute strand break cross sections between 2.66 × 10^-14 cm^2 and 7.06 × 10^-14 cm^2. The highest cross section was found for 5′-TT(ATA)3TT and 5′-TT(ABrUA)3TT, respectively. BrU is a radiosensitizer, which was discussed to be used in cancer radiation therapy. The replacement of T by BrU into the investigated DNA sequences leads to a slight increase of the absolute strand break cross sections resulting in sequence-dependent enhancement factors between 1.14 and 1.66. Nevertheless, the variation of strand break cross sections due to the specific nucleotide sequence is considerably higher. Thus, the present results suggest the development of targeted radiosensitizers for cancer radiation therapy. PMID:25487346

  13. Microbial structures, functions, and metabolic pathways in wastewater treatment bioreactors revealed using high-throughput sequencing.

    PubMed

    Ye, Lin; Zhang, Tong; Wang, Taitao; Fang, Zhiwei

    2012-12-18

    The objective of this study was to explore microbial community structures, functional profiles, and metabolic pathways in a lab-scale and a full-scale wastewater treatment bioreactors. In order to do this, over 12 gigabases of metagenomic sequence data and 600,000 paired-end sequences of bacterial 16S rRNA gene were generated with the Illumina HiSeq 2000 platform, using DNA extracted from activated sludge in the two bioreactors. Three kinds of sequences (16S rRNA gene amplicons, 16S rRNA gene sequences obtained from metagenomic sequencing, and predicted proteins) were used to conduct taxonomic assignments. Specially, relative abundances of ammonia-oxidizing archaea (AOA) and ammonia-oxidizing bacteria (AOB) were analyzed. Compared with quantitative real-time PCR (qPCR), metagenomic sequencing was demonstrated to be a better approach to quantify AOA and AOB in activated sludge samples. It was found that AOB were more abundant than AOA in both reactors. Furthermore, the analysis of the metabolic profiles indicated that the overall patterns of metabolic pathways in the two reactors were quite similar (73.3% of functions shared). However, for some pathways (such as carbohydrate metabolism and membrane transport), the two reactors differed in the number of pathway-specific genes.

  14. Single-cell sequencing of Thiomargarita reveals genomic flexibility for adaptation to dynamic redox conditions

    DOE PAGES

    Winkel, Matthias; Salman-Carvalho, Verena; Woyke, Tanja; ...

    2016-06-21

    Large, colorless sulfur-oxidizing bacteria (LSB) of the family Beggiatoaceae form thick mats at sulfidic sediment surfaces, where they efficiently detoxify sulfide before it enters the water column. The genus Thiomargarita harbors the largest known free-living bacteria with cell sizes of up to 750 μm in diameter. In addition to their ability to oxidize reduced sulfur compounds, some Thiomargarita spp. are known to store large amounts of nitrate, phosphate and elemental sulfur internally. To date little is known about their energy yielding metabolic pathways, and how these pathways compare to other Beggiatoaceae. Here, we present a draft single-cell genome of a chain-forming "Candidatus Thiomargarita nelsonii Thio36", and conduct a comparative analysis to five draft and one full genome of other members of the Beggiatoaceae. "Ca. T. nelsonii Thio36" is able to respire nitrate to both ammonium and dinitrogen, which allows them to flexibly respond to environmental changes. Genes for sulfur oxidation and inorganic carbon fixation confirmed that "Ca. T. nelsonii Thio36" can function as a chemolithoautotroph. Carbon can be fixed via the Calvin-Benson-Bassham cycle, which is common among the Beggiatoaceae. In addition we found key genes of the reductive tricarboxylic acid cycle that point toward an alternative CO2 fixation pathway. Surprisingly, "Ca. T. nelsonii Thio36" also encodes key genes of the C2-cycle that convert 2-phosphoglycolate to 3-phosphoglycerate during photorespiration in higher plants and cyanobacteria. Moreover, we identified a novel trait of a flavin-based energy bifurcation pathway coupled to a Na+-translocating membrane complex (Rnf). The coupling of these pathways may be key to surviving long periods of anoxia. As other Beggiatoaceae "Ca. T. nelsonii Thio36" encodes many genes similar to those of (filamentous) cyanobacteria. In conclusion, the genome of "Ca. T. nelsonii Thio36" provides additional insight into the ecology of giant sulfur

  15. Single-cell Sequencing of Thiomargarita Reveals Genomic Flexibility for Adaptation to Dynamic Redox Conditions

    PubMed Central

    Winkel, Matthias; Salman-Carvalho, Verena; Woyke, Tanja; Richter, Michael; Schulz-Vogt, Heide N.; Flood, Beverly E.; Bailey, Jake V.; Mußmann, Marc

    2016-01-01

    Large, colorless sulfur-oxidizing bacteria (LSB) of the family Beggiatoaceae form thick mats at sulfidic sediment surfaces, where they efficiently detoxify sulfide before it enters the water column. The genus Thiomargarita harbors the largest known free-living bacteria with cell sizes of up to 750 μm in diameter. In addition to their ability to oxidize reduced sulfur compounds, some Thiomargarita spp. are known to store large amounts of nitrate, phosphate and elemental sulfur internally. To date little is known about their energy yielding metabolic pathways, and how these pathways compare to other Beggiatoaceae. Here, we present a draft single-cell genome of a chain-forming “Candidatus Thiomargarita nelsonii Thio36”, and conduct a comparative analysis to five draft and one full genome of other members of the Beggiatoaceae. “Ca. T. nelsonii Thio36” is able to respire nitrate to both ammonium and dinitrogen, which allows them to flexibly respond to environmental changes. Genes for sulfur oxidation and inorganic carbon fixation confirmed that “Ca. T. nelsonii Thio36” can function as a chemolithoautotroph. Carbon can be fixed via the Calvin–Benson–Bassham cycle, which is common among the Beggiatoaceae. In addition we found key genes of the reductive tricarboxylic acid cycle that point toward an alternative CO2 fixation pathway. Surprisingly, “Ca. T. nelsonii Thio36” also encodes key genes of the C2-cycle that convert 2-phosphoglycolate to 3-phosphoglycerate during photorespiration in higher plants and cyanobacteria. Moreover, we identified a novel trait of a flavin-based energy bifurcation pathway coupled to a Na+-translocating membrane complex (Rnf). The coupling of these pathways may be key to surviving long periods of anoxia. As other Beggiatoaceae “Ca. T. nelsonii Thio36” encodes many genes similar to those of (filamentous) cyanobacteria. In summary, the genome of “Ca. T. nelsonii Thio36” provides additional insight into the ecology of

  16. Complexity of gastric acid secretion revealed by targeted gene disruption in mice.

    PubMed

    Chen, Duan; Zhao, Chun-Mei

    2010-01-01

    Physiology of gastric acid secretion is one of the earliest subjects in medical research and education. Gastric acid secretion has been sometimes inadequately expressed as pH value rather than amount of gastric H(+) secreted per unit time. Gastric acid secretion is regulated by endocrine, paracrine and neurocrine signals via at least three messenger pathways: gastrin-histamine, CCK-somatostatin, and neural network. These pathways have been largely validated and further characterized by phenotyping a series of knockout mouse models. The complexity of gastric acid secretion is illustrated by both expected and unexpected phenotypes of altered acid secretion. For examples, in comparison with wild-type mice, gastrin and CCK double knockout and SSTR(2) knockout mice displayed a shift in the regulation of ECL cells from somatostatin-SSTR(2) pathway to galanin-Gal1 receptor pathway; a shift in the regulation of parietal cells from gastrin-histamine pathway to vagal pathway; and a shift in the CCK(2) receptors on parietal cells from functional silence to activation. The biological function of glycine-extended gastrin in synergizing gastrin-17 has been revealed in gastrin knockout mice. The roles of gastric acid secretion in tumorigenesis and ulceration have not been fully understood. Transgenic hypergastrinemic INS-GAS mice developed a spontaneous gastric cancer, which was associated with an impaired acid secretion. Gastrin knockout mice were still able to produce acid in response to vagal stimulation, especially after H. pylori infection. Taken together, phenotyping of a series of genetically engineered mouse models reveals a high degree of complexity of gastric acid secretion in both physiological and pathophysiological conditions.

  17. Amino acid sequences of two nonspecific lipid-transfer proteins from germinated castor bean.

    PubMed

    Takishima, K; Watanabe, S; Yamada, M; Suga, T; Mamiya, G

    1988-11-01

    The amino acid sequence of two nonspecific lipid-transfer proteins (nsLTP) B and C from germinated castor bean seeds have been determined. Both the proteins consist of 92 residues, as for nsLTP previously reported, and their calculated Mr values are 9847 and 9593 for nsLTP-B and nsLTP-C, respectively. The sequences of nsLTP-B and nsLTP-C, compared to the known sequence of nsLTP-A from the same source, are 68% and 35% similar, respectively. No variation was found at the positions of the cysteine residues, indicating that they might be involved in disulfide bridges.

  18. Genome-wide profiling of DNA methylation reveals preferred sequences of DNMTs in hepatocellular carcinoma cells.

    PubMed

    Fan, Hong; Zhao, Zhujiang; Cheng, Yuchao; Cui, He; Qiao, Fengchang; Wang, Ling; Hu, Jiaojiao; Wu, Huzhang; Song, Wei

    2016-01-01

    Aberrant DNA methylation of CpG sites is among the earliest and most frequent alterations in developmental processes and diseases including cancer. To elucidate the functional preferred sites of DNMTs, we analyzed the features of distinct methylated sequences and established the defined relationship between DNMTs and their preferred genomic DNA sequences. Small interfering RNA (siRNA) constructs of DNMT1, DNMT3A, and DNMT3B were each transfected into the human hepatocellular carcinoma cell line SMMC-7721. Pools of distinguishing methylated fragments were enriched by the SHH method in cells in which DNMT1, DNMT3A, or DNMT3B was knocked down separately. The defined binding transcription factors (TFs) containing 5' CpG islands were obtained with bioinformatics software and websites. In the SMMC-7721 hepatocellular carcinoma (HCC) cell line, DNMT1, DNMT3A, and DNMT3B were each specifically suppressed by their corresponding siRNA construct. A total of 46, 42, and 67 distinctive methylated fragments from the three different DNMTs were evaluated according to a genomic DNA database. Those separated fragments were distributed among genomic DNA regions of all chromosome complements, including coding genes, repeat sequences, and genes with unknown function. The majority of coding genes contain CpG islands in their promoter region. Cluster analysis demonstrated that the preferred sequences identified for each of the three DNMTs share their own conserved sequences. In cells depleted of different DNMTs, 80 % of 103 upregulated genes induced by DNMT1 knock-down contain CpG sites; 76 % of 25 upregulated genes induced by DNMT3A knock-down contain CpG sites; 63 % of 126 upregulated genes induced by DNMT3B knock-down contain CpG sites. Our findings suggested that distinctive DNMTs targeted DNA methylation sites according to their preferred sequences, and this targeting might be associated with diverse roles of DNMTs in tumorigenesis. Meanwhile, the analysis of preference sequences provides an alternative way to find out the individual

  19. Analysis of expression and amino acid sequence of the allergen Mag 3 in two species of house dust mites-Dermatophagoides farinae and D. pteronyssinus (Acari: Astigmata: Pyroglyphidae).

    PubMed

    Asman, Marek; Solarz, Krzysztof; Szilman, Ewa; Szilman, Piotr

    2010-01-01

    In the 1990s, two new and important house dust mite allergens were cloned and sequenced: Mag 1 and Mag 3. However, the latter allergen has so far been identified only in extracts of Dermatophagoides farinae [DF]. In this work, we aimed to detect expression of this important allergen and, for the first time, to analyze its amino acid sequence in another species of house dust mite, Dermatophagoides pteronyssinus [DP]. We were able to confirm the expression of allergen Mag 3 in DF and to exclude it in DP. By sequencing the products of DNA amplification, we determined the nucleotide sequence encoding allergen Mag 3 in DF. This analysis enabled detection of 9 single base changes. An analysis of the encoded amino acid sequence, triplet by triplet at the substituted nucleotides, revealed that 8 changes were polymorphic and 1 was a mutation substituting GTG (valine) for ATG (methionine) at position 236. The presence of an amino acid sequence difference in this allergen might suggest that other isoforms exist, which could complicate both diagnosis and immunotherapy in persons who produce an allergic response to this allergen. The variants of allergen Mag 3 (group 14) are still not known, in contrast to the well-characterized allergen variants of the other main groups 1, 2, 4, 5 and 7. Thus, the identification of Mag 3 variants and the definition of their allergenic properties need further investigation.
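
    The triplet-level comparison described in this abstract can be illustrated with a minimal sketch. It assumes the standard genetic code and treats a change as silent when both codons encode the same amino acid; the function name and the labels are hypothetical, and which codon is the reference is not stated in the abstract.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_codon_change(codon_a, codon_b):
    """Return (aa_a, aa_b, label) for two codons at the same position."""
    aa_a = CODON_TABLE[codon_a.upper()]
    aa_b = CODON_TABLE[codon_b.upper()]
    label = "synonymous" if aa_a == aa_b else "amino acid substitution"
    return aa_a, aa_b, label


# The GTG/ATG pair is the one named in the abstract; GTG/GTA is an
# illustrative silent change that is not taken from the paper.
print(classify_codon_change("GTG", "ATG"))
print(classify_codon_change("GTG", "GTA"))
```

    In this sketch the GTG/ATG pair is reported as a valine/methionine substitution, while a third-position change such as GTG/GTA leaves valine unchanged.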

  20. A classification of glycosyl hydrolases based on amino acid sequence similarities.

    PubMed Central

    Henrissat, B

    1991-01-01

    The amino acid sequences of 301 glycosyl hydrolases and related enzymes have been compared. A total of 291 sequences corresponding to 39 EC entries could be classified into 35 families. Only ten sequences (less than 5% of the sample) could not be assigned to any family. With the sequences available for this analysis, 18 families were found to be monospecific (containing only one EC number) and 17 were found to be polyspecific (containing at least two EC numbers). Implications on the folding characteristics and mechanism of action of these enzymes and on the evolution of carbohydrate metabolism are discussed. With the steady increase in sequence and structural data, it is suggested that the enzyme classification system should perhaps be revised. PMID:1747104
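
    The monospecific/polyspecific distinction used in this abstract can be sketched as follows. This is only an illustration of the definition (one EC number per family versus at least two); the family labels and EC pairings in the toy input are hypothetical and not taken from the paper.

```python
from collections import defaultdict


def classify_families(assignments):
    """Label each family by the number of distinct EC numbers it contains.

    `assignments` is an iterable of (family, ec_number) pairs, one per sequence.
    """
    ec_by_family = defaultdict(set)
    for family, ec_number in assignments:
        ec_by_family[family].add(ec_number)
    return {
        family: "monospecific" if len(ecs) == 1 else "polyspecific"
        for family, ecs in ec_by_family.items()
    }


# Hypothetical toy input: two sequences per family.
toy = [
    ("F1", "3.2.1.4"),
    ("F1", "3.2.1.91"),
    ("F2", "3.2.1.1"),
    ("F2", "3.2.1.1"),
]
labels = classify_families(toy)
print(labels)
```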

  1. Fungal communities from the calcareous deep-sea sediments in the Southwest India Ridge revealed by Illumina sequencing technology.

    PubMed

    Zhang, Likui; Kang, Manyu; Huang, Yangchao; Yang, Lixiang

    2016-05-01

    The diversity and ecological significance of bacteria and archaea in deep-sea environments have been thoroughly investigated, but eukaryotic microorganisms in these areas, such as fungi, are poorly understood. To elucidate fungal diversity in calcareous deep-sea sediments in the Southwest India Ridge (SWIR), the internal transcribed spacer (ITS) regions of rRNA genes from two sediment metagenomic DNA samples were amplified and sequenced using the Illumina sequencing platform. The results revealed that 58-63 % and 36-42 % of the ITS sequences (97 % similarity) belonged to Basidiomycota and Ascomycota, respectively. These findings suggest that Basidiomycota and Ascomycota are the predominant fungal phyla in the two samples. We also found that Agaricomycetes, Leotiomycetes, and Pezizomycetes were the major fungal classes in the two samples. At the species level, Thelephoraceae sp. and Phialocephala fortinii were major fungal species in the two samples. Despite the low relative abundance, unidentified fungal sequences were also observed in the two samples. Furthermore, we found that there were slight differences in fungal diversity between the two sediment samples, although both were collected from the SWIR. Thus, our results demonstrate that calcareous deep-sea sediments in the SWIR harbor diverse fungi, which augment the fungal groups in deep-sea sediments. This is the first report of fungal communities in calcareous deep-sea sediments in the SWIR revealed by Illumina sequencing.

  2. Cloning and sequence analysis of human breast epithelial antigen BA46 reveals an RGD cell adhesion sequence presented on an epidermal growth factor-like domain.

    PubMed

    Couto, J R; Taylor, M R; Godwin, S G; Ceriani, R L; Peterson, J A

    1996-04-01

    The BA46 antigen of the human milk fat globule (HMFG) membrane is expressed in human breast carcinomas and has been used successfully as a target for experimental breast cancer radioimmunotherapy. To characterize this antigen further, we obtained the entire cDNA sequence and focused on its possible role in cell adhesion. The derived protein sequence of BA46 encodes a 387-residue precursor composed of a putative signal peptide, an amino-terminal epidermal growth factor (EGF)-like domain containing the cell adhesion tripeptide arginine-glycine-aspartic acid (RGD), and human factor V and factor VIII C1/C2-like domains. The EGF-like domain of BA46 is similar to the calcium-binding EGF-like domains of several coagulation factors, but the BA46 domain lacks a residue required for calcium binding and the coagulation factor domains do not include an RGD sequence. Assuming that all EGF-like domains fold into a similar structure, the RGD-containing sequence in BA46 is inserted between two antiparallel beta strands. This positioning suggests a novel function for the EGF-like domain as a scaffold for RGD presentation.

  3. Dietary intake and plasma metabolomic analysis of polyunsaturated fatty acids in bipolar subjects reveal dysregulation of linoleic acid metabolism.

    PubMed

    Evans, Simon J; Ringrose, Rachel N; Harrington, Gloria J; Mancuso, Peter; Burant, Charles F; McInnis, Melvin G

    2014-10-01

    Polyunsaturated fatty acids (PUFA) profiles associate with risk for mood disorders. This poses the hypothesis of metabolic differences between patients and unaffected healthy controls that relate to the primary illness or are secondary to medication use or dietary intake. However, dietary manipulation or supplementation studies show equivocal results improving mental health outcomes. This study investigates dietary patterns and metabolic profiles relevant to PUFA metabolism, in bipolar I individuals compared to non-psychiatric controls. We collected seven-day diet records and performed metabolomic analysis of fasted plasma collected immediately after diet recording. Regression analyses adjusted for age, gender and energy intake found that bipolar individuals had significantly lower intake of selenium and PUFAs, including eicosapentaenoic acid (EPA) (n-3), docosahexaenoic acid (DHA) (n-3), arachidonic acid (AA) (n-6) and docosapentaenoic acid (DPA) (n-3/n-6 mix); and significantly increased intake of the saturated fats, eicosanoic and docosanoic acid. Regression analysis of metabolomic data derived from plasma samples, correcting for age, gender, BMI, psychiatric medication use and dietary PUFA intake, revealed that bipolar individuals had reduced 13S-HpODE, a major peroxidation product of the n-6, linoleic acid (LA), reduced eicosadienoic acid (EDA), an elongation product of LA; reduced prostaglandins G2, F2 alpha and E1, synthesized from n-6 PUFA; and reduced EPA. These observations remained significant or near significant after Bonferroni correction and are consistent with metabolic variances between bipolar and control individuals with regard to PUFA metabolism. These findings suggest that specific dietary interventions aimed towards correcting these metabolic disparities may impact health outcomes for individuals with bipolar disorder.

  4. Complete amino acid sequence of the N-terminal extension of calf skin type III procollagen.

    PubMed Central

    Brandt, A; Glanville, R W; Hörlein, D; Bruckner, P; Timpl, R; Fietzek, P P; Kühn, K

    1984-01-01

    The N-terminal extension peptide of type III procollagen, isolated from foetal-calf skin, contains 130 amino acid residues. To determine its amino acid sequence, the peptide was reduced and carboxymethylated or aminoethylated and fragmented with trypsin, Staphylococcus aureus V8 proteinase and bacterial collagenase. Pyroglutamate aminopeptidase was used to deblock the N-terminal collagenase fragment to enable amino acid sequencing. The type III collagen extension peptide is homologous to that of the alpha 1 chain of type I procollagen with respect to a three-domain structure. The N-terminal 79 amino acids, which contain ten of the 12 cysteine residues, form a compact globular domain. The next 39 amino acids are in a collagenous triplet sequence (Gly-Xaa-Yaa)n with a high hydroxyproline content. Finally, another short non-collagenous domain of 12 amino acids ends at the cleavage site for procollagen aminopeptidase, which cleaves a proline-glutamine bond. In contrast with type I procollagen, the type III procollagen extension peptides contain interchain disulphide bridges located at the C-terminus of the triple-helical domain. PMID:6331392
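
    The domain lengths quoted in this abstract add up to the stated 130 residues, and the (Gly-Xaa-Yaa)n region corresponds to 13 triplets. The sketch below checks that arithmetic and shows one way to test a segment for the repeat pattern; the helper name and the example segments are hypothetical and are not the published sequence.

```python
def is_gly_xaa_yaa(segment):
    """Return True if the segment is a whole number of triplets, each starting with glycine (G)."""
    if len(segment) == 0 or len(segment) % 3 != 0:
        return False
    return all(segment[i] == "G" for i in range(0, len(segment), 3))


# Domain lengths as stated in the abstract.
domain_lengths = {"globular": 79, "triplet_region": 39, "short_non_collagenous": 12}
total_residues = sum(domain_lengths.values())
triplet_count = domain_lengths["triplet_region"] // 3

print(total_residues, triplet_count)

# Made-up one-letter segments, used only to exercise the check.
print(is_gly_xaa_yaa("GPPGAPGKP"))
print(is_gly_xaa_yaa("GPPAGP"))
```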

  5. The Large Mitochondrial Genome of Symbiodinium minutum Reveals Conserved Noncoding Sequences between Dinoflagellates and Apicomplexans

    PubMed Central

    Shoguchi, Eiichi; Shinzato, Chuya; Hisata, Kanako; Satoh, Nori; Mungpakdee, Sutada

    2015-01-01

    Even though mitochondrial genomes, which characterize eukaryotic cells, were first discovered more than 50 years ago, mitochondrial genomics remains an important topic in molecular biology and genome sciences. The Phylum Alveolata comprises three major groups (ciliates, apicomplexans, and dinoflagellates), the mitochondrial genomes of which have diverged widely. Even though the gene content of dinoflagellate mitochondrial genomes is reportedly comparable to that of apicomplexans, the highly fragmented and rearranged genome structures of dinoflagellates have frustrated whole genomic analysis. Consequently, noncoding sequences and gene arrangements of dinoflagellate mitochondrial genomes have not been well characterized. Here we report that the continuous assembled genome (∼326 kb) of the dinoflagellate, Symbiodinium minutum, is AT-rich (∼64.3%) and that it contains three protein-coding genes. Based upon in silico analysis, the remaining 99% of the genome comprises transcriptomic noncoding sequences. RNA edited sites and unique, possible start and stop codons clarify conserved regions among dinoflagellates. Our massive transcriptome analysis shows that almost all regions of the genome are transcribed, including 27 possible fragmented ribosomal RNA genes and 12 uncharacterized small RNAs that are similar to mitochondrial RNA genes of the malarial parasite, Plasmodium falciparum. Gene map comparisons show that gene order is only slightly conserved between S. minutum and P. falciparum. However, small RNAs and intergenic sequences share sequence similarities with P. falciparum, suggesting that the function of noncoding sequences has been preserved despite development of very different genome structures. PMID:26199191

  6. 3′ terminal diversity of MRP RNA and other human noncoding RNAs revealed by deep sequencing

    PubMed Central

    2013-01-01

    Background: Post-transcriptional 3′ end processing is a key component of RNA regulation. The abundant and essential RNA subunit of RNase MRP has been proposed to function in three distinct cellular compartments and therefore may utilize this mode of regulation. Here we employ 3′ RACE coupled with high-throughput sequencing to characterize the 3′ terminal sequences of human MRP RNA and other noncoding RNAs that form RNP complexes. Results: The 3′ terminal sequence of MRP RNA from HEK293T cells has a distinctive distribution of genomically encoded termini (including an assortment of U residues) with a portion of these selectively tagged by oligo(A) tails. This profile contrasts with the relatively homogenous 3′ terminus of an in vitro transcribed MRP RNA control and the differing 3′ terminal profiles of U3 snoRNA, RNase P RNA, and telomerase RNA (hTR). Conclusions: 3′ RACE coupled with deep sequencing provides a valuable framework for the functional characterization of 3′ terminal sequences of noncoding RNAs. PMID:24053768

  7. Forward Genetics by Genome Sequencing Reveals That Rapid Cyanide Release Deters Insect Herbivory of Sorghum bicolor

    PubMed Central

    Krothapalli, Kartikeya; Buescher, Elizabeth M.; Li, Xu; Brown, Elliot; Chapple, Clint; Dilkes, Brian P.; Tuinstra, Mitchell R.

    2013-01-01

    Whole genome sequencing has allowed rapid progress in the application of forward genetics in model species. In this study, we demonstrated an application of next-generation sequencing for forward genetics in a complex crop genome. We sequenced an ethyl methanesulfonate-induced mutant of Sorghum bicolor defective in hydrogen cyanide release and identified the causal mutation. A workflow identified the causal polymorphism relative to the reference BTx623 genome by integrating data from single nucleotide polymorphism identification, prior information about candidate gene(s) implicated in cyanogenesis, mutation spectra, and polymorphisms likely to affect phenotypic changes. A point mutation resulting in a premature stop codon in the coding sequence of dhurrinase2, which encodes a protein involved in the dhurrin catabolic pathway, was responsible for the acyanogenic phenotype. Cyanogenic glucosides are not cyanogenic compounds, but their cyanohydrin derivatives do release cyanide. The mutant accumulated the glucoside, dhurrin, but failed to efficiently release cyanide upon tissue disruption. Thus, we tested the effects of cyanide release on insect herbivory in a genetic background in which accumulation of cyanogenic glucoside is unchanged. Insect preference choice experiments and herbivory measurements demonstrate a deterrent effect of cyanide release capacity, even in the presence of wild-type levels of cyanogenic glucoside accumulation. Our gene cloning method substantiates the value of (1) a sequenced genome, (2) a strongly penetrant and easily measurable phenotype, and (3) a workflow to pinpoint a causal mutation in crop genomes and accelerate the discovery of gene function in the postgenomic era. PMID:23893483

  8. 37 CFR 1.824 - Form and format for nucleotide and/or amino acid sequence submissions in computer readable form.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... nucleotide and/or amino acid sequence submissions in computer readable form. 1.824 Section 1.824 Patents... And/or Amino Acid Sequences § 1.824 Form and format for nucleotide and/or amino acid sequence... readable form may be created by any means, such as word processors, nucleotide/amino acid sequence...

  9. 37 CFR 1.824 - Form and format for nucleotide and/or amino acid sequence submissions in computer readable form.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... nucleotide and/or amino acid sequence submissions in computer readable form. 1.824 Section 1.824 Patents... And/or Amino Acid Sequences § 1.824 Form and format for nucleotide and/or amino acid sequence... readable form may be created by any means, such as word processors, nucleotide/amino acid sequence...

  10. 37 CFR 1.824 - Form and format for nucleotide and/or amino acid sequence submissions in computer readable form.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... nucleotide and/or amino acid sequence submissions in computer readable form. 1.824 Section 1.824 Patents... And/or Amino Acid Sequences § 1.824 Form and format for nucleotide and/or amino acid sequence... readable form may be created by any means, such as word processors, nucleotide/amino acid sequence...

  11. Primary structure of the merozoite surface antigen 1 of Plasmodium vivax reveals sequences conserved between different Plasmodium species.

    PubMed Central

    del Portillo, H A; Longacre, S; Khouri, E; David, P H

    1991-01-01

    Merozoite surface antigen 1 (MSA1) of several species of plasmodia has been shown to be a promising candidate for a vaccine directed against the asexual blood stages of malaria. We report the cloning and characterization of the MSA1 gene of the human malaria parasite Plasmodium vivax. This gene, which we call Pv200, encodes a polypeptide of 1726 amino acids and displays features described for MSA1 genes of other species, such as signal peptide and anchoring sequences, conserved cysteine residues, number of potential N-glycosylation sites, and repeats consisting here of 23 glutamine residues in a row. When the nucleotide and deduced amino acid sequences of the MSA1 of P. vivax are compared to those of another human malaria parasite, Plasmodium falciparum, and to those of the rodent parasite Plasmodium yoelii, 10 regions of high amino acid similarity are observed despite the very different dG + dC contents of the corresponding genes. All of the interspecies conserved regions reside within the conserved or semiconserved blocks delimited by the sequences of different alleles of the MSA1 gene of P. falciparum. PMID:2023952

  12. Complete amino acid sequence of branched-chain amino acid aminotransferase (transaminase B) of Salmonella typhimurium, identification of the coenzyme-binding site and sequence comparison analysis

    SciTech Connect

    Feild, M.J.

    1988-01-01

    The complete amino acid sequence of the subunit of branched-chain amino acid aminotransferase of Salmonella typhimurium was determined by automated Edman degradation of peptide fragments generated by chemical and enzymatic digestion of S-carboxymethylated and S-pyridylethylated transaminase B. Peptide fragments of transaminase B were generated by treatment of the enzyme with trypsin, Staphylococcus aureus V8 protease, endoproteinase Lys-C, and cyanogen bromide. Protocols were developed for separation of the peptide fragments by reverse-phase high performance liquid chromatography (HPLC), ion-exchange HPLC, and SDS-urea gel electrophoresis. The enzyme subunit contains 308 amino acid residues and has a molecular weight of 33,920 daltons. The coenzyme-binding site was determined by treatment of the enzyme, containing bound pyridoxal 5'-phosphate, with tritiated sodium borohydride prior to trypsin digestion. Monitoring radioactivity incorporation and comparing peptide maps with an apoenzyme tryptic digest allowed identification of the pyridoxylated peptide, which was isolated by reverse-phase HPLC and sequenced. The coenzyme-binding site is a lysyl residue at position 159. Some peptides were further characterized by fast atom bombardment mass spectrometry.

  13. Working Memory for Sequences of Temporal Durations Reveals a Volatile Single-Item Store

    PubMed Central

    Manohar, Sanjay G.; Husain, Masud

    2016-01-01

    When a sequence is held in working memory, different items are retained with differing fidelity. Here we ask whether a sequence of brief time intervals that must be remembered shows recency effects, similar to those observed in verbal and visuospatial working memory. It has been suggested that prioritizing some items over others can be accounted for by a “focus of attention,” maintaining some items in a privileged state. We therefore also investigated whether such benefits are vulnerable to disruption by attention or expectation. Participants listened to sequences of one to five tones, of varying durations (200 ms to 2 s). Subsequently, the length of one of the tones in the sequence had to be reproduced by holding a key. The discrepancy between the reproduced and actual durations quantified the fidelity of memory for auditory durations. Recall precision decreased with the number of items that had to be remembered, and was better for the first and last items of sequences, in line with set-size and serial position effects seen in other modalities. To test whether attentional filtering demands might impair performance, an irrelevant variation in pitch was introduced in some blocks of trials. In those blocks, memory precision was worse for sequences that consisted of only one item, i.e., the smallest memory set-size. Thus, when irrelevant information was present, the benefit of having only one item in memory was attenuated. Finally, we examined whether expectation could interfere with memory. On half the trials, the number of items in the upcoming sequence was cued. When the number of items was known in advance, performance was paradoxically worse when the sequence consisted of only one item. Thus the benefit of having only one item to remember is stronger when it is unexpectedly the only item. Our results suggest that similar mechanisms are used to hold auditory time durations in working memory, as for visual or verbal stimuli. Further, solitary items were remembered

  14. Microbial Analysis of Arctic Snow and Frost Flowers: What Next Generation Sequencing Method Can Reveal

    NASA Astrophysics Data System (ADS)

    Mortazavi, R.; Attiya, S.; Ariya, P. A.

    2014-12-01

    We herein examined and identified the microbial communities of Arctic snow types and frost flowers during the spring 2009 campaign of the Ocean-Atmosphere-Sea Ice-Snowpack (OASIS) program in Barrow, Alaska, USA. In addition to conventional microbial identification techniques (culture-isolation-PCR amplification-sequencing), we deployed a state-of-the-art genomic Next Generation Sequencing (NGS) technique to examine the true bacterial communities in Arctic samples. Our results indicate that a diverse community of microbes exists in the Arctic, with many originating from distinct ecological environments. The alterations in the texture of Arctic samples caused by microbes further signify their importance in the ecosystem.

  15. [Classification of nucleotide sequences over their frequency dictionaries reveals a relation between the structure of sequences and taxonomy of their bearers].

    PubMed

    Gorban', A N; Popova, T G; Sadovskiĭ, M G

    2003-01-01

    Classification of 16S RNA sequences over their frequency dictionaries, both real and transformed, was studied. Two entities were considered structurally close to each other if their frequency dictionaries were close in the Euclidean metric. A transformation procedure for a frequency dictionary has been implemented that reveals the peculiarities of the information structure of a nucleotide sequence. A comparative study of the classification developed over the real frequency dictionaries versus the one developed over the transformed frequency dictionaries was carried out. A strong correlation is revealed between the classification and the taxonomy of the 16S RNA bearers. For the classes isolated, the information-valuable words were identified. These words are the main factors of difference between the classes. The frequency dictionaries containing words of length 3 exhibit the best correlation between a class and a genus. A genus, as a rule, is included in a single class, and the exceptions are sporadic. Development of a hierarchical classification over the transformed frequency dictionaries separated one or two taxonomic groups at each stage of classification. The unexpectedly frequent or, on the contrary, unexpectedly rare occurrence of words (of length 3) in the entities under consideration accounts for the structural difference between the classes of nucleotide sequences.
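
    A minimal sketch of the "real" frequency dictionary and the Euclidean comparison mentioned in this abstract is given below. It assumes a frequency dictionary is the table of relative frequencies of overlapping words of length k in a sequence; the transformation procedure is not specified in the abstract and is not attempted here, and the function names and toy sequences are hypothetical.

```python
from collections import Counter
from math import sqrt


def frequency_dictionary(sequence, k=3):
    """Map each overlapping word of length k to its relative frequency."""
    counts = Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()}


def euclidean_distance(dict_a, dict_b):
    """Euclidean distance between two frequency dictionaries; missing words count as 0."""
    words = set(dict_a) | set(dict_b)
    return sqrt(sum((dict_a.get(w, 0.0) - dict_b.get(w, 0.0)) ** 2 for w in words))


# Toy sequences, for illustration only.
d1 = frequency_dictionary("ACGACG")
d2 = frequency_dictionary("ACGACGACG")
print(d1)
print(euclidean_distance(d1, d2))
```

    Sequences whose dictionaries lie at a small distance would be grouped together; any clustering method applied on top of these distances is a separate choice that the abstract does not name.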

  16. Unbiased K-mer Analysis Reveals Changes in Copy Number of Highly Repetitive Sequences During Maize Domestication and Improvement

    PubMed Central

    Liu, Sanzhen; Zheng, Jun; Migeon, Pierre; Ren, Jie; Hu, Ying; He, Cheng; Liu, Hongjun; Fu, Junjie; White, Frank F.; Toomajian, Christopher; Wang, Guoying

    2017-01-01

    The major component of complex genomes is repetitive elements, which remain recalcitrant to characterization. Using maize as a model system, we analyzed whole genome shotgun (WGS) sequences for the two maize inbred lines B73 and Mo17 using k-mer analysis to quantify the differences between the two genomes. Significant differences were identified in highly repetitive sequences, including centromere, 45S ribosomal DNA (rDNA), knob, and telomere repeats. Genotype specific 45S rDNA sequences were discovered. The B73 and Mo17 polymorphic k-mers were used to examine allele-specific expression of 45S rDNA in the hybrids. Although Mo17 contains higher copy number than B73, equivalent levels of overall 45S rDNA expression indicates that transcriptional or post-transcriptional regulation mechanisms operate for the 45S rDNA in the hybrids. Using WGS sequences of B73xMo17 doubled haploids, genomic locations showing differential repetitive contents were genetically mapped, which displayed different organization of highly repetitive sequences in the two genomes. In an analysis of WGS sequences of HapMap2 lines, including maize wild progenitor, landraces, and improved lines, decreases and increases in abundance of additional sets of k-mers associated with centromere, 45S rDNA, knob, and retrotransposons were found among groups, revealing global evolutionary trends of genomic repeats during maize domestication and improvement. PMID:28186206
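
    The k-mer comparison described in this abstract can be sketched in a few lines. The version below counts canonical k-mers (a k-mer and its reverse complement are treated as one) in a set of reads and normalizes the counts so that two samples can be compared; the use of canonical k-mers, the normalization, and all names are assumptions made for illustration and are not the authors' pipeline.

```python
from collections import Counter

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def canonical(kmer):
    """Return the lexicographically smaller of a k-mer and its reverse complement."""
    reverse_complement = kmer.translate(COMPLEMENT)[::-1]
    return min(kmer, reverse_complement)


def count_kmers(reads, k):
    """Count canonical k-mers across reads, skipping k-mers with non-ACGT characters."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            if set(kmer) <= set("ACGT"):
                counts[canonical(kmer)] += 1
    return counts


def relative_abundance(counts):
    """Normalize raw counts to fractions of all counted k-mers."""
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()}


# Toy reads, for illustration only.
toy_counts = count_kmers(["ACGT", "ACNT"], 2)
print(toy_counts)
print(relative_abundance(toy_counts))
```

    Comparing the relative abundance of the same k-mer in two samples then gives a simple measure of differential repeat content.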

  17. The RNA polymerase I transcription factor UBF is a sequence-tolerant HMG-box protein that can recognize structured nucleic acids.

    PubMed Central

    Copenhaver, G P; Putnam, C D; Denton, M L; Pikaard, C S

    1994-01-01

    Upstream Binding Factor (UBF) is important for activation of ribosomal RNA transcription and belongs to a family of proteins containing nucleic acid binding domains, termed HMG-boxes, with similarity to High Mobility Group (HMG) chromosomal proteins. Proteins in this family can be sequence-specific or highly sequence-tolerant binding proteins. We show that Xenopus UBF can be classified among the sequence-tolerant class. Methylation interference assays using enhancer DNA probes failed to reveal any critical nucleotides required for UBF binding. Selection by UBF of optimal binding sites among a population of enhancer oligonucleotides with randomized sequences also failed to reveal any consensus sequence. The minor groove specific drugs chromomycin A3, distamycin A and actinomycin D competed against UBF for enhancer binding, suggesting that UBF, like other HMG-box proteins, probably interacts with the minor groove. UBF also shares with other HMG box proteins the ability to bind synthetic cruciform DNA. However, UBF appears different from other HMG-box proteins in that it can bind both RNA (tRNA) and DNA. The sequence-tolerant nature of UBF-nucleic acid interactions may accommodate the rapid evolution of ribosomal RNA gene sequences. PMID:8041627

  18. The amino acid sequence of cytochromes c-551 from three species of Pseudomonas

    PubMed Central

    Ambler, R. P.; Wynn, Margaret

    1973-01-01

    The amino acid sequences of the cytochromes c-551 from three species of Pseudomonas have been determined. Each resembles the protein from Pseudomonas strain P6009 (now known to be Pseudomonas aeruginosa, not Pseudomonas fluorescens) in containing 82 amino acids in a single peptide chain, with a haem group covalently attached to cysteine residues 12 and 15. In all four sequences 43 residues are identical. Although by bacteriological criteria the organisms are closely related, the differences between pairs of sequences range from 22% to 39%. These values should be compared with the differences in the sequence of mitochondrial cytochrome c between mammals and amphibians (about 18%) or between mammals and insects (about 33%). Detailed evidence for the amino acid sequences of the proteins has been deposited as Supplementary Publication SUP 50015 at the National Lending Library for Science and Technology, Boston Spa, Yorks. LS23 7BQ, U.K., from whom copies can be obtained on the terms indicated in Biochem. J. (1973), 131, 5. PMID:4352718
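
    The pairwise differences quoted in this abstract are percentages of differing positions between aligned chains. A minimal sketch of that calculation follows, assuming gap-free sequences of equal length; the function name is hypothetical and the toy sequences are placeholders rather than cytochrome data.

```python
def percent_difference(seq_a, seq_b):
    """Percentage of positions at which two equal-length aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    mismatches = sum(a != b for a, b in zip(seq_a, seq_b))
    return 100.0 * mismatches / len(seq_a)


# Placeholder 82-residue chains that differ at 18 positions.
chain_a = "A" * 82
chain_b = "B" * 18 + "A" * 64
print(round(percent_difference(chain_a, chain_b)))
```

    For an 82-residue chain, 18 differing positions correspond to about 22% and 32 differing positions to about 39%, which matches the range reported above.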

  19. Comment on "Protein sequences from mastodon and Tyrannosaurus rex revealed by mass spectrometry".

    PubMed

    Pevzner, Pavel A; Kim, Sangtae; Ng, Julio

    2008-08-22

    Asara et al. (Reports, 13 April 2007, p. 280) reported sequencing of Tyrannosaurus rex proteins and used them to establish the evolutionary relationships between birds and dinosaurs. We argue that the reported T. rex peptides may represent statistical artifacts and call for complete data release to enable experimental and computational verification of their findings.

  20. Exome Sequencing Reveals Primary Immunodeficiencies in Children with Community-Acquired Pseudomonas aeruginosa Sepsis.

    PubMed

    Asgari, Samira; McLaren, Paul J; Peake, Jane; Wong, Melanie; Wong, Richard; Bartha, Istvan; Francis, Joshua R; Abarca, Katia; Gelderman, Kyra A; Agyeman, Philipp; Aebi, Christoph; Berger, Christoph; Fellay, Jacques; Schlapbach, Luregn J

    2016-01-01

    One out of three pediatric sepsis deaths in high income countries occurs in previously healthy children. Primary immunodeficiencies (PIDs) have been postulated to underlie fulminant sepsis, but this concept remains to be confirmed in clinical practice. Pseudomonas aeruginosa (P. aeruginosa) is a common bacterium mostly associated with health care-related infections in immunocompromised individuals. However, in rare cases, it can cause sepsis in previously healthy children. We used exome sequencing and bioinformatic analysis to systematically search for genetic factors underpinning severe P. aeruginosa infection in the pediatric population. We collected blood samples from 11 previously healthy children, with no family history of immunodeficiency, who presented with severe sepsis due to community-acquired P. aeruginosa bacteremia. Genomic DNA was extracted from blood or tissue samples obtained intravitam or postmortem. We obtained high-coverage exome sequencing data and searched for rare loss-of-function variants. After rigorous filtering, 12 potentially causal variants were identified. Two out of eight (25%) fatal cases were found to carry novel pathogenic variants in PID genes, including BTK and DNMT3B. This study demonstrates that exome sequencing allows the identification of rare, deleterious human genetic variants responsible for fulminant sepsis in apparently healthy children. Diagnosing PIDs in such patients is of high relevance to survivors and affected families. We propose that unusually severe and fatal sepsis cases in previously healthy children should be considered for exome/genome sequencing to search for underlying PIDs.

  1. Sequencing of Australian wild rice genomes reveals ancestral relationships with domesticated rice.

    PubMed

    Brozynska, Marta; Copetti, Dario; Furtado, Agnelo; Wing, Rod A; Crayn, Darren; Fox, Glen; Ishikawa, Ryuji; Henry, Robert J

    2016-11-27

    The related A genome species of the Oryza genus are the effective gene pool for rice. Here, we report draft genomes for two Australian wild A genome taxa: O. rufipogon-like population, referred to as Taxon A, and O. meridionalis-like population, referred to as Taxon B. These two taxa were sequenced and assembled by integration of short- and long-read next-generation sequencing (NGS) data to create a genomic platform for a wider rice gene pool. Here, we report that, despite the distinct chloroplast genome, the nuclear genome of the Australian Taxon A has a sequence that is much closer to that of domesticated rice (O. sativa) than to the other Australian wild populations. Analysis of 4643 genes in the A genome clade showed that the Australian annual, O. meridionalis, and related perennial taxa have the most divergent (around 3 million years) genome sequences relative to domesticated rice. A test for admixture showed possible introgression into the Australian Taxon A (diverged around 1.6 million years ago) especially from the wild indica/O. nivara clade in Asia. These results demonstrate that northern Australia may be the centre of diversity of the A genome Oryza and suggest the possibility that this might also be the centre of origin of this group and represent an important resource for rice improvement.

  2. Sequencing of diverse mandarin, pummelo and orange genomes reveals complex history of admixture during citrus domestication

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Cultivated citrus are selections from, or hybrids of, wild progenitor species whose identities and contributions to citrus domestication remain controversial. Here we sequence and compare citrus genomes—a high-quality reference haploid clementine genome and mandarin, pummelo, sweet-orange and sour-o...

  3. Reduced representation bisulphite sequencing of the ten bovine somatic tissues reveals DNA methylation patterns

    Technology Transfer Automated Retrieval System (TEKTRAN)

    As a major component of epigenetics, DNA methylation has been shown to function widely in individual development and various diseases. It has been well studied in model organisms and humans, but data for economically important animals are limited. Using reduced representation bisulphite sequencing (RRBS),...

  4. Protein similarity networks reveal relationships among sequence, structure, and function within the Cupin superfamily.

    PubMed

    Uberto, Richard; Moomaw, Ellen W

    2013-01-01

    The cupin superfamily is extremely diverse and includes catalytically inactive seed storage proteins, sugar-binding metal-independent epimerases, and metal-dependent enzymes possessing dioxygenase, decarboxylase, and other activities. Although numerous proteins of this superfamily have been structurally characterized, the functions of many of them have not been experimentally determined. We report the first use of protein similarity networks (PSNs) to visualize trends of sequence and structure in order to make functional inferences in this remarkably diverse superfamily. PSNs provide a way to visualize relatedness of structure and sequence among a given set of proteins. Structure- and sequence-based clustering of cupin members reflects functional clustering. Networks based only on cupin domains and networks based on the whole proteins provide complementary information. Domain-clustering supports phylogenetic conclusions that the N- and C-terminal domains of bicupin proteins evolved independently. Interestingly, although many functionally similar enzymatic cupin members bind the same active site metal ion, the structure and sequence clustering does not correlate with the identity of the bound metal. It is anticipated that the application of PSNs to this superfamily will inform experimental work and influence the functional annotation of databases.
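    The network idea described in this abstract can be sketched in a few lines. The following is only an illustration, not the authors' pipeline: the abstract does not say how similarity was scored, so fractional identity between pre-aligned toy sequences is used here as a stand-in, and the names `identity`, `similarity_network`, `clusters`, and the 0.75 threshold are all hypothetical.

```python
from itertools import combinations


def identity(a: str, b: str) -> float:
    """Fraction of identical positions between two equal-length, pre-aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for x, y in zip(a, b) if x == y) / len(a)


def similarity_network(seqs: dict[str, str], threshold: float) -> dict[str, set[str]]:
    """Build an undirected graph: proteins are nodes, and an edge joins
    two proteins whose similarity score meets the threshold."""
    graph: dict[str, set[str]] = {name: set() for name in seqs}
    for u, v in combinations(seqs, 2):
        if identity(seqs[u], seqs[v]) >= threshold:
            graph[u].add(v)
            graph[v].add(u)
    return graph


def clusters(graph: dict[str, set[str]]) -> list[set[str]]:
    """Return the connected components of the graph (depth-first search)."""
    seen: set[str] = set()
    components: list[set[str]] = []
    for start in graph:
        if start in seen:
            continue
        component: set[str] = set()
        stack = [start]
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(graph[node] - component)
        seen |= component
        components.append(component)
    return components


# Toy, made-up sequences (not real cupin proteins).
toy = {"p1": "MKTAYIAK", "p2": "MKTAYIAR", "p3": "GGSLLQWE", "p4": "GGSLLQWD"}
net = similarity_network(toy, threshold=0.75)
groups = clusters(net)
```

    Raising the threshold splits the network into more, tighter clusters; lowering it merges them, which is why such networks are usually inspected at several thresholds.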

  5. Transcription profile of boar spermatozoa as revealed by RNA-sequencing

    Technology Transfer Automated Retrieval System (TEKTRAN)

    High-throughput RNA sequencing (RNA-Seq) overcomes the limitations of the current hybridization-based techniques to detect the actual pool of RNA transcripts in spermatozoa. The application of this technology in livestock can speed the discovery of potential predictors of male fertility. As a first ...

  6. Oil palm genome sequence reveals divergence of interfertile species in Old and New worlds.

    PubMed

    Singh, Rajinder; Ong-Abdullah, Meilina; Low, Eng-Ti Leslie; Manaf, Mohamad Arif Abdul; Rosli, Rozana; Nookiah, Rajanaidu; Ooi, Leslie Cheng-Li; Ooi, Siew-Eng; Chan, Kuang-Lim; Halim, Mohd Amin; Azizi, Norazah; Nagappan, Jayanthi; Bacher, Blaire; Lakey, Nathan; Smith, Steven W; He, Dong; Hogan, Michael; Budiman, Muhammad A; Lee, Ernest K; DeSalle, Rob; Kudrna, David; Goicoechea, Jose Luis; Wing, Rod A; Wilson, Richard K; Fulton, Robert S; Ordway, Jared M; Martienssen, Robert A; Sambanthamurthi, Ravigadevi

    2013-08-15

    Oil palm is the most productive oil-bearing crop. Although it is planted on only 5% of the total world vegetable oil acreage, palm oil accounts for 33% of vegetable oil and 45% of edible oil worldwide, but increased cultivation competes with dwindling rainforest reserves. We report the 1.8-gigabase (Gb) genome sequence of the African oil palm Elaeis guineensis, the predominant source of worldwide oil production. A total of 1.535 Gb of assembled sequence and transcriptome data from 30 tissue types were used to predict at least 34,802 genes, including oil biosynthesis genes and homologues of WRINKLED1 (WRI1), and other transcriptional regulators, which are highly expressed in the kernel. We also report the draft sequence of the South American oil palm Elaeis oleifera, which has the same number of chromosomes (2n = 32) and produces fertile interspecific hybrids with E. guineensis but seems to have diverged in the New World. Segmental duplications of chromosome arms define the palaeotetraploid origin of palm trees. The oil palm sequence enables the discovery of genes for important traits as well as somaclonal epigenetic alterations that restrict the use of clones in commercial plantings, and should therefore help to achieve sustainability for biofuels and edible oils, reducing the rainforest footprint of this tropical plantation crop.

  7. Transcriptomic sequencing reveals a set of unique genes activated by butyrate-induced histone modification

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Butyrate is a nutritional element with strong epigenetic regulatory activity as an inhibitor of histone deacetylases (HDACs). Based on the analysis of differentially expressed genes induced by butyrate in the bovine epithelial cell using deep RNA-sequencing technology (RNA-seq), a set of unique gen...

  8. Exome Sequencing Reveals Primary Immunodeficiencies in Children with Community-Acquired Pseudomonas aeruginosa Sepsis

    PubMed Central

    Asgari, Samira; McLaren, Paul J.; Peake, Jane; Wong, Melanie; Wong, Richard; Bartha, Istvan; Francis, Joshua R.; Abarca, Katia; Gelderman, Kyra A.; Agyeman, Philipp; Aebi, Christoph; Berger, Christoph; Fellay, Jacques; Schlapbach, Luregn J.; Posfay-Barbe, Klara

    2016-01-01

    One out of three pediatric sepsis deaths in high-income countries occurs in previously healthy children. Primary immunodeficiencies (PIDs) have been postulated to underlie fulminant sepsis, but this concept remains to be confirmed in clinical practice. Pseudomonas aeruginosa (P. aeruginosa) is a common bacterium mostly associated with health care-related infections in immunocompromised individuals. However, in rare cases, it can cause sepsis in previously healthy children. We used exome sequencing and bioinformatic analysis to systematically search for genetic factors underpinning severe P. aeruginosa infection in the pediatric population. We collected blood samples from 11 previously healthy children, with no family history of immunodeficiency, who presented with severe sepsis due to community-acquired P. aeruginosa bacteremia. Genomic DNA was extracted from blood or tissue samples obtained intravitam or postmortem. We obtained high-coverage exome sequencing data and searched for rare loss-of-function variants. After rigorous filtering, 12 potentially causal variants were identified. Two out of eight (25%) fatal cases were found to carry novel pathogenic variants in PID genes, including BTK and DNMT3B. This study demonstrates that exome sequencing can identify rare, deleterious human genetic variants responsible for fulminant sepsis in apparently healthy children. Diagnosing PIDs in such patients is of high relevance to survivors and affected families. We propose that unusually severe and fatal sepsis cases in previously healthy children should be considered for exome/genome sequencing to search for underlying PIDs. PMID:27703454

  9. A simplified sequence-based identification scheme for Bordetella reveals several putative novel species.

    PubMed

    Spilker, Theodore; Leber, Amy L; Marcon, Mario J; Newton, Duane W; Darrah, Rebecca; Vandamme, Peter; Lipuma, John J

    2014-02-01

    The differentiation of Bordetella species, particularly those causing human infection, is problematic. We found that sequence analysis of an internal fragment of nrdA allowed differentiation of the currently named Bordetella species. Analysis of 107 "Bordetella" isolates recovered almost exclusively from human respiratory tract specimens identified several putative novel species.

  10. Population sequencing reveals breed and sub-species specific CNVs in cattle

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Individualized copy number variation (CNV) maps have highlighted the need for population surveys of cattle to detect the rare and common variants. While SNP and comparative genomic hybridization (CGH) arrays have provided preliminary data, next-generation sequence (NGS) data analysis offers an incre...

  11. Population sequencing reveals breed and sub-species specific CNVs in cattle

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Individualized copy number variation (CNV) maps have highlighted the need for population surveys of cattle to detect rare and common variants. While SNP and comparative genomic hybridization (CGH) arrays have provided preliminary data, next-generation sequence (NGS) data analysis offers an increased...

  12. Whole-genome sequencing reveals the diversity of cattle copy number variations and multicopy genes

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Structural and functional impacts of copy number variations (CNVs) on livestock genomes are not yet well understood. We identified 1853 CNV regions using population-scale sequencing data generated from 75 cattle representing 8 breeds (Angus, Brahman, Gir, Holstein, Jersey, Limousin, Nelore, Romagnol...

  13. Sequencing human-gibbon breakpoints of synteny reveals mosaic new insertions at rearrangement sites.

    PubMed

    Girirajan, Santhosh; Chen, Lin; Graves, Tina; Marques-Bonet, Tomas; Ventura, Mario; Fronick, Catrina; Fulton, Lucinda; Rocchi, Mariano; Fulton, Robert S; Wilson, Richard K; Mardis, Elaine R; Eichler, Evan E

    2009-02-01

    The gibbon genome exhibits extensive karyotypic diversity with an increased rate of chromosomal rearrangements during evolution. In an effort to understand the mechanistic origin and implications of these rearrangement events, we sequenced 24 synteny breakpoint regions in the white-cheeked gibbon (Nomascus leucogenys, NLE) in the form of high-quality BAC insert sequences (4.2 Mbp). While there is a significant deficit of breakpoints in genes, we identified seven human gene structures involved in signaling pathways (DEPDC4, GNG10), phospholipid metabolism (ENPP5, PLSCR2), beta-oxidation (ECH1), cellular structure and transport (HEATR4), and transcription (ZNF461), that have been disrupted in the NLE gibbon lineage. Notably, only three of these genes show the expected evolutionary signatures of pseudogenization. Sequence analysis of the breakpoints suggested both nonclassical nonhomologous end-joining (NHEJ) and replication-based mechanisms of rearrangement. A substantial number (11/24) of human-NLE gibbon breakpoints showed new insertions of gibbon-specific repeats and mosaic structures formed from disparate sequences including segmental duplications, LINE, SINE, and LTR elements. Analysis of these sites provides a model for a replication-dependent repair mechanism for double-strand breaks (DSBs) at rearrangement sites and insights into the structure and formation of primate segmental duplications at sites of genomic rearrangements during evolution.

  14. Genome sequence surveys of Brachiola algerae and Edhazardia aedis reveal microsporidia with low gene densities.

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Microsporidia are well known models of extreme nuclear genome reduction and compaction. The smallest microsporidian genomes have received the most attention, but with a size range of 2.3 Mb to 19.5 Mb the nature of the larger genomes remains unknown. Here we have undertaken genome sequence surveys ...

  15. Draft Genome Sequence of Sorghum Grain Mold Fungus Epicoccum sorghinum, a Producer of Tenuazonic Acid

    PubMed Central

    Oliveira, Rodrigo C.; Davenport, Karen W.; Hovde, Blake; Silva, Danielle; Chain, Patrick S. G.; Correa, Benedito

    2017-01-01

    ABSTRACT The facultative plant pathogen Epicoccum sorghinum is associated with grain mold of sorghum and produces the mycotoxin tenuazonic acid. This fungus can have serious economic impact on sorghum production. Here, we report the draft genome sequence of E. sorghinum (USPMTOX48). PMID:28126937

  16. Snake venom. The amino acid sequence of protein A from Dendroaspis polylepis polylepis (black mamba) venom.

    PubMed

    Joubert, F J; Strydom, D J

    1980-12-01

    Protein A from Dendroaspis polylepis polylepis venom comprises 81 amino acids, including ten half-cystine residues. The complete primary structures of protein A and its variant A' were elucidated. The sequences of proteins A and A', which differ in a single position, show no homology with various neurotoxins and non-neurotoxic proteins and represent a new type of elapid venom protein.

  17. Draft Genome Sequence of Bacillus coagulans NL01, a Wonderful l-Lactic Acid Producer

    PubMed Central

    Zheng, Zhaojuan; Jiang, Ting; Lin, Xi; Zhou, Jie

    2015-01-01

    Here, we report the draft genome sequence of Bacillus coagulans NL01, which can produce l-lactic acid of high optical purity using xylose as a sole carbon source. The draft genome is 3,505,081 bp, with 144 contigs. About 3,903 protein-coding genes and 92 rRNAs are predicted from this assembly. PMID:26089419

  18. Defining sequence space and reaction products within the cyanuric acid hydrolase (AtzD)/barbiturase protein family.

    PubMed

    Seffernick, Jennifer L; Erickson, Jasmine S; Cameron, Stephan M; Cho, Seunghee; Dodge, Anthony G; Richman, Jack E; Sadowsky, Michael J; Wackett, Lawrence P

    2012-09-01

    Cyanuric acid hydrolases (AtzD) and barbiturases are homologous, found almost exclusively in bacteria, and comprise a rare protein family with no discernible linkage to other protein families or an X-ray structural class. There has been confusion in the literature and in genome projects regarding the reaction products, the assignment of individual sequences as either cyanuric acid hydrolases or barbiturases, and spurious connection of this family to another protein family. The present study has addressed those issues. First, the published enzyme reaction products of cyanuric acid hydrolase are incorrectly identified as biuret and carbon dioxide. The current study employed (13)C nuclear magnetic resonance (NMR) spectroscopy and mass spectrometry to show that cyanuric acid hydrolase releases carboxybiuret, which spontaneously decarboxylates to biuret. This is significant because it revealed that homologous cyanuric acid hydrolases and barbiturases catalyze completely analogous reactions. Second, enzymes that had been annotated incorrectly in genome projects have been reassigned here by bioinformatics, gene cloning, and protein characterization studies. Third, the AtzD/barbiturase family has previously been suggested to consist of members of the amidohydrolase superfamily, a large class of metallohydrolases. Bioinformatics and the lack of bound metals both argue against a connection to the amidohydrolase superfamily. Lastly, steady-state kinetic measurements and observations of protein stability suggested that the AtzD/barbiturase family might be an undistinguished protein family that has undergone some resurgence with the recent introduction of industrial s-triazine compounds such as atrazine and melamine into the environment.

  19. Combining genomic sequencing methods to explore viral diversity and reveal potential virus-host interactions

    PubMed Central

    Chow, Cheryl-Emiliane T.; Winget, Danielle M.; White, Richard A.; Hallam, Steven J.; Suttle, Curtis A.

    2015-01-01

    Viral diversity and virus-host interactions in oxygen-starved regions of the ocean, also known as oxygen minimum zones (OMZs), remain relatively unexplored. Microbial community metabolism in OMZs alters nutrient and energy flow through marine food webs, resulting in biological nitrogen loss and greenhouse gas production. Thus, viruses infecting OMZ microbes have the potential to modulate community metabolism with resulting feedback on ecosystem function. Here, we describe viral communities inhabiting oxic surface (10 m) and oxygen-starved basin (200 m) waters of Saanich Inlet, a seasonally anoxic fjord on the coast of Vancouver Island, British Columbia using viral metagenomics and complete viral fosmid sequencing on samples collected between April 2007 and April 2010. Of 6459 open reading frames (ORFs) predicted across all 34 viral fosmids, 77.6% (n = 5010) had no homology to reference viral genomes. These fosmids recruited a higher proportion of viral metagenomic sequences from Saanich Inlet than from nearby northeastern subarctic Pacific Ocean (Line P) waters, indicating differences in the viral communities between coastal and open ocean locations. While functional annotations of fosmid ORFs were limited, recruitment to NCBI's non-redundant “nr” database and publicly available single-cell genomes identified putative viruses infecting marine thaumarchaeal and SUP05 proteobacteria to provide potential host linkages with relevance to coupled biogeochemical cycling processes in OMZ waters. Taken together, these results highlight the power of coupled analyses of multiple sequence data types, such as viral metagenomic and fosmid sequence data with prokaryotic single cell genomes, to chart viral diversity, elucidate genomic and ecological contexts for previously unclassifiable viral sequences, and identify novel host interactions in natural and engineered ecosystems. PMID:25914678

  20. Analyses of mitochondrial amino acid sequence datasets support the proposal that specimens of Hypodontus macropi from three species of macropodid hosts represent distinct species

    PubMed Central

    2013-01-01

    Background: Hypodontus macropi is a common intestinal nematode of a range of kangaroos and wallabies (macropodid marsupials). Based on previous multilocus enzyme electrophoresis (MEE) and nuclear ribosomal DNA sequence data sets, H. macropi has been proposed to be a complex of species. To test this proposal using independent molecular data, we sequenced the whole mitochondrial (mt) genomes of individuals of H. macropi from three different species of hosts (Macropus robustus robustus, Thylogale billardierii and Macropus [Wallabia] bicolor) as well as that of Macropicola ocydromi (a related nematode), and undertook a comparative analysis of the amino acid sequence datasets derived from these genomes. Results: The mt genomes sequenced by next-generation (454) technology from H. macropi from the three host species varied from 13,634 bp to 13,699 bp in size. Pairwise comparisons of the amino acid sequences predicted from these three mt genomes revealed differences of 5.8% to 18%. Phylogenetic analysis of the amino acid sequence data sets using Bayesian Inference (BI) showed that H. macropi from the three different host species formed distinct, well-supported clades. In addition, sliding window analysis of the mt genomes defined variable regions for future population genetic studies of H. macropi in different macropodid hosts and geographical regions around Australia. Conclusions: The present analyses of inferred mt protein sequence datasets clearly supported the hypothesis that H. macropi from M. robustus robustus, M. bicolor and T. billardierii represent distinct species. PMID:24261823
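    The pairwise differences and sliding window analysis mentioned in this abstract can be illustrated with a minimal sketch. It assumes the two sequences are already aligned (gaps written as "-") and that gapped columns are skipped; the abstract states neither, and the function names, window size, and toy sequences below are hypothetical rather than the authors' method or data.

```python
from typing import Optional


def percent_difference(a: str, b: str, gap: str = "-") -> Optional[float]:
    """Percentage of differing residues between two aligned sequences.

    Columns where either sequence has a gap are ignored. Returns None
    when no column can be compared.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for x, y in zip(a, b):
        if x == gap or y == gap:
            continue
        compared += 1
        if x != y:
            differing += 1
    if compared == 0:
        return None
    return 100.0 * differing / compared


def sliding_window_difference(
    a: str, b: str, window: int, step: int
) -> list[Optional[float]]:
    """Percent difference inside each window along the alignment."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return [
        percent_difference(a[start:start + window], b[start:start + window])
        for start in range(0, len(a) - window + 1, step)
    ]


# Toy aligned fragments (made up for illustration).
seq_a = "MKLVT-AGH"
seq_b = "MKIVTSAGR"
overall = percent_difference(seq_a, seq_b)
per_window = sliding_window_difference(seq_a, seq_b, window=3, step=3)
```

    Windows with a high percentage mark the more variable regions, which is the kind of signal a sliding window analysis is used to locate.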

  1. Multiregion ultra-deep sequencing reveals early intermixing and variable levels of intratumoral heterogeneity in colorectal cancer.

    PubMed

    Suzuki, Yuka; Ng, Sarah Boonhsi; Chua, Clarinda; Leow, Wei Qiang; Chng, Jermain; Liu, Shi Yang; Ramnarayanan, Kalpana; Gan, Anna; Ho, Dan Liang; Ten, Rachel; Su, Yan; Lezhava, Alexandar; Lai, Jiunn Herng; Koh, Dennis; Lim, Kiat Hon; Tan, Patrick; Rozen, Steven G; Tan, Iain Beehuat

    2017-02-01

    Intratumor heterogeneity (ITH) contributes to cancer progression and chemoresistance. We sought to comprehensively describe ITH of somatic mutations, copy number, and transcriptomic alterations involving clinically and biologically relevant gene pathways in colorectal cancer (CRC). We performed multiregion, high-depth (384× on average) sequencing of 799 cancer-associated genes in 24 spatially separated primary tumor and nonmalignant tissues from four treatment-naïve CRC patients. We then used ultra-deep sequencing (17 075× on average) to accurately verify the presence or absence of identified somatic mutations in each sector. We also digitally measured gene expression and copy number alterations using NanoString assays. We identified the subclonal point mutations and determined the mutational timing and phylogenetic relationships among spatially separated sectors of each tumor. Truncal mutations, those shared by all sectors in the tumor, affected the well-described driver genes such as APC, TP53, and KRAS. With sequencing at 17 075×, we found that mutations first detected at a sequencing depth of 384× were in fact more widely shared among sectors than originally assessed. Interestingly, ultra-deep sequencing also revealed some mutations that were present in all spatially dispersed sectors, but at subclonal levels. Ultra-high-depth validation sequencing, copy number analysis, and gene expression profiling provided a comprehensive and accurate genomic landscape of spatial heterogeneity in CRC. Ultra-deep sequencing allowed more sensitive detection of somatic mutations and a more accurate assessment of ITH. By detecting the subclonal mutations with ultra-deep sequencing, we traced the genomic histories of each tumor and the relative timing of mutational events. We found evidence of early mixing, in which the subclonal ancestral mutations intermixed across the sectors before the acquisition of subsequent nontruncal mutations. Our findings also indicate that

  2. Amino acid sequences of heterotrophic and photosynthetic ferredoxins from the tomato plant (Lycopersicon esculentum Mill.).

    PubMed

    Kamide, K; Sakai, H; Aoki, K; Sanada, Y; Wada, K; Green, L S; Yee, B C; Buchanan, B B

    1995-11-01

    Several forms (isoproteins) of ferredoxin in roots, leaves, and green and red pericarps in tomato plants (Lycopersicon esculentum Mill.) were earlier identified on the basis of N-terminal amino acid sequence and chromatographic behavior (Green et al. 1991). In the present study, a large-scale preparation made possible determination of the full-length amino acid sequence of the two ferredoxins from leaves. The ferredoxins characteristic of fruit and root were sequenced from the amino terminus to the 30th residue or beyond. The leaf ferredoxins were confirmed to be expressed in pericarp of both green and red fruit. The ferredoxins characteristic of fruit and root appeared to be restricted to those tissues. The results extend earlier findings in demonstrating that ferredoxin occurs in the major organs of the tomato plant where it appears to function irrespective of photosynthetic competence.

  3. Amino acid sequence of myoglobin from white-tailed deer (Odocoileus virginianus).

    PubMed

    Joseph, Poulson; Suman, Surendranath P; Li, Shuting; Fontaine, Michele; Steinke, Laurey

    2012-10-01

    Our objective was to determine the primary structure of white-tailed deer myoglobin (Mb). White-tailed deer Mb was isolated from cardiac muscles employing ammonium sulfate precipitation and gel-filtration chromatography. The amino acid sequence was determined by Edman degradation. Sequence analyses of intact Mb as well as tryptic- and cyanogen bromide-peptides yielded the complete primary structure of white-tailed deer Mb, which shared 100% similarity with red deer Mb. White-tailed deer Mb consists of 153 amino acid residues and shares more than 96% sequence similarity with myoglobins from meat-producing ruminants, such as cattle, buffalo, sheep, and goat. Similar to sheep and goat myoglobins, white-tailed deer Mb contains 12 histidine residues. Proximal (position 93) and distal (position 64) histidine residues responsible for maintaining the stability of heme are conserved in white-tailed deer Mb.

  4. Nucleotide sequence and the encoded amino acids of human apolipoprotein A-I mRNA.

    PubMed Central

    Law, S W; Brewer, H B

    1984-01-01

    The cDNA clones encoding the precursor form of human liver apolipoprotein A-I (apoA-I), preproapoA-I, have been isolated from a cDNA library. A 17-base synthetic oligonucleotide based on residues 108-113 of apoA-I and a 26-base primer-extended, dideoxynucleotide-terminated cDNA were used as hybridization probes to select for recombinant plasmids bearing the apoA-I sequence. The complete nucleic acid sequence of human liver preproapoA-I has been determined by analysis of the cloned cDNA. The sequence is composed of 801 nucleotides encoding 267 amino acid residues. PreproapoA-I contains an 18-amino-acid prepeptide and a 6-amino-acid propeptide connected to the amino terminus of the 243-amino acid mature apoA-I. Southern blotting analysis of chromosomal DNA obtained from peripheral blood indicated the apoA-I gene is contained in a 2.1-kilobase-pair Pst I fragment and there is no gross difference in structural organization between the normal apoA-I gene and the Tangier disease apoA-I gene. PMID:6198645
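    The figures quoted in this abstract are internally consistent, as a short check shows. The variable names are illustrative only. Because 3 x 267 equals exactly 801, the 801-nucleotide count presumably excludes the stop codon; that is an inference from the arithmetic, not something the abstract states.

```python
# Segment lengths in amino acid residues, as given in the abstract.
PREPEPTIDE = 18
PROPEPTIDE = 6
MATURE_APOA1 = 243

NUCLEOTIDES_PER_CODON = 3

# PreproapoA-I is the prepeptide plus the propeptide plus the mature protein.
total_residues = PREPEPTIDE + PROPEPTIDE + MATURE_APOA1

# Each residue is encoded by one three-nucleotide codon.
coding_nucleotides = NUCLEOTIDES_PER_CODON * total_residues

print(total_residues, coding_nucleotides)
```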

  5. COI and ITS2 sequences delimit species, reveal cryptic taxa and host specificity of fig-associated Sycophila (Hymenoptera, Eurytomidae).

    PubMed

    Li, Yanwei; Zhou, Xin; Feng, Gui; Hu, Haoyuan; Niu, Liming; Hebert, Paul D N; Huang, Dawei

    2010-01-01

    Although the genus Sycophila has broad host preferences, some species are specifically associated with figs as nonpollinator wasps. Because of their sexual dimorphism, morphological plasticity, cryptic mating behaviour and poorly known biology, species identifications are often uncertain. It is particularly difficult to match conspecific females and males. In this study, we employed two molecular markers, mitochondrial COI and nuclear ITS2, to identify Sycophila from six Chinese fig species. Morphological studies revealed 25 female and male morphs, while sequence results for both genes were consistent in supporting the presence of 15 species, of which 13 were host specialists and two used dual hosts. A single species of Sycophila was found on each of four fig species, but six species were isolated from Ficus benjamina and the same number were reared from Ficus microcarpa. Sequence results revealed three male morphs in one species and detected two species that were overlooked by morphological analysis.

  6. DNA Methylation Profiling at Single-Base Resolution Reveals Gestational Folic Acid Supplementation Influences the Epigenome of Mouse Offspring Cerebellum

    PubMed Central

    Barua, Subit; Kuizon, Salomon; Brown, W. Ted; Junaid, Mohammed A.

    2016-01-01

    It is becoming increasingly evident that lifestyle, environmental factors, and maternal nutrition during gestation can influence the epigenome of the developing fetus and thus modulate the physiological outcome. Variations in the intake of maternal nutrients affecting one-carbon metabolism may influence brain development and exert long-term effects on the health of the progeny. In this study, we investigated whether supplementation with high maternal folic acid during gestation alters DNA methylation and gene expression in the cerebellum of mouse offspring. We used reduced representation bisulfite sequencing to analyze the DNA methylation profile at the single-base resolution level. The genome-wide DNA methylation analysis revealed that supplementation with higher maternal folic acid resulted in distinct methylation patterns (P < 0.05) of CpG and non-CpG sites in the cerebellum of offspring. Such variations of methylation and gene expression in the cerebellum of offspring were highly sex-specific, including several genes of the neuronal pathways. These findings demonstrate that alterations in the level of maternal folic acid during gestation can influence methylation and gene expression in the cerebellum of offspring. Such changes in the offspring epigenome may alter neurodevelopment and influence the functional outcome of neurologic and psychiatric diseases. PMID:27199632

  7. Mathematical Characterization of Protein Sequences Using Patterns as Chemical Group Combinations of Amino Acids.

    PubMed

    Das, Jayanta Kumar; Das, Provas; Ray, Korak Kumar; Choudhury, Pabitra Pal; Jana, Siddhartha Sankar

    2016-01-01

    Comparison of amino acid sequence similarity is the fundamental concept behind the protein phylogenetic tree formation. By virtue of this method, we can explain the evolutionary relationships, but further explanations are not possible unless sequences are studied through the chemical nature of individual amino acids. Here we develop a new methodology to characterize the protein sequences on the basis of the chemical nature of the amino acids. We design various algorithms for studying the variation of chemical group transitions and various chemical group combinations as patterns in the protein sequences. The amino acid sequences of the conventional myosin II head domain from 14 family members are used to illustrate this new approach. We find two blocks of maximum length 6 aa as 'FPKATD' and 'Y/FTNEKL' without repeating the same chemical nature and one block of maximum length 20 aa with the repetition of chemical nature which are common among all 14 members. We also check commonality with another motor protein sub-family kinesin, KIF1A. Based on our analysis we find a common block of length 8 aa both in myosin II and KIF1A. This motif is located in the neck linker region which could be responsible for the generation of mechanical force, enabling us to find the unique blocks which remain chemically conserved across the family. We also validate our methodology with different protein families such as MYOI, Myosin light chain kinase (MLCK) and Rho-associated protein kinase (ROCK), Na+/K+-ATPase and Ca2+-ATPase. Altogether, our studies provide a new methodology for investigating conserved amino acid patterns in different proteins.
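    The general idea of reading a protein sequence as a string of chemical groups, and then counting group-to-group transitions, can be sketched as follows. The abstract does not list the authors' group definitions or algorithms, so the seven-group classification in `CHEMICAL_GROUPS` is an assumed textbook-style grouping and the function names are hypothetical; results under the authors' own grouping may differ.

```python
from collections import Counter

# ASSUMPTION: a simple textbook-style grouping, not the classification
# used in the paper (which the abstract does not give).
CHEMICAL_GROUPS = {
    "acidic": "DE",
    "basic": "KRH",
    "aromatic": "FWY",
    "amide": "NQ",
    "hydroxyl": "ST",
    "sulfur": "CM",
    "aliphatic": "AGILPV",
}

# Invert the table: one-letter residue code -> group name.
RESIDUE_TO_GROUP = {
    residue: group
    for group, residues in CHEMICAL_GROUPS.items()
    for residue in residues
}


def group_labels(sequence: str) -> list[str]:
    """Translate a protein sequence into its list of chemical group names."""
    return [RESIDUE_TO_GROUP[residue] for residue in sequence.upper()]


def transition_counts(sequence: str) -> Counter:
    """Count each ordered pair of chemical groups at adjacent positions."""
    labels = group_labels(sequence)
    return Counter(zip(labels, labels[1:]))


# One of the 6-aa blocks quoted in the abstract, under the assumed grouping.
labels = group_labels("FPKATD")
transitions = transition_counts("FPKATD")
```

    Comparing such transition tables across family members is one plausible way to look for blocks that stay chemically conserved even when the residues themselves change.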

  8. Sequence-Based Screening for Rare Enzymes: New Insights into the World of AMDases Reveal a Conserved Motif and 58 Novel Enzymes Clustering in Eight Distinct Families

    PubMed Central

    Maimanakos, Janine; Chow, Jennifer; Gaßmeyer, Sarah K.; Güllert, Simon; Busch, Florian; Kourist, Robert; Streit, Wolfgang R.

    2016-01-01

    Arylmalonate Decarboxylases (AMDases, EC 4.1.1.76) are very rare and mostly underexplored enzymes. Currently only four known and biochemically characterized representatives exist. However, their ability to decarboxylate α-disubstituted malonic acid derivatives to optically pure products without cofactors makes them attractive and promising candidates for use as biocatalysts in industrial processes. Until now, AMDases could not be separated from other members of the aspartate/glutamate racemase superfamily based on their gene sequences. Within this work, a search algorithm was developed that enables a reliable prediction of AMDase activity for potential candidates. Based on specific sequence patterns and screening methods 58 novel AMDase candidate genes could be identified in this work. Thereby, AMDases with the conserved sequence pattern of Bordetella bronchiseptica’s prototype appeared to be limited to the classes of Alpha-, Beta-, and Gamma-proteobacteria. Amino acid homologies and comparison of gene surrounding sequences enabled the classification of eight enzyme clusters. Particularly striking is the accumulation of genes coding for different transporters of the tripartite tricarboxylate transporters family, TRAP transporters and ABC transporters as well as genes coding for mandelate racemases/muconate lactonizing enzymes that might be involved in substrate uptake or degradation of AMDase products. Further, three novel AMDases were characterized which showed a high enantiomeric excess (>99%) of the (R)-enantiomer of flurbiprofen. These are the recombinant AmdA and AmdV from Variovorax sp. strains HH01 and HH02, originating from soil, and AmdP from Polymorphum gilvum found by a database search. Altogether our findings give new insights into the class of AMDases and reveal many previously unknown enzyme candidates with high potential for bioindustrial processes. PMID:27610105

  9. Sequencing of diverse mandarin, pummelo and orange genomes reveals complex history of admixture during citrus domestication.

    PubMed

    Wu, G Albert; Prochnik, Simon; Jenkins, Jerry; Salse, Jerome; Hellsten, Uffe; Murat, Florent; Perrier, Xavier; Ruiz, Manuel; Scalabrin, Simone; Terol, Javier; Takita, Marco Aurélio; Labadie, Karine; Poulain, Julie; Couloux, Arnaud; Jabbari, Kamel; Cattonaro, Federica; Del Fabbro, Cristian; Pinosio, Sara; Zuccolo, Andrea; Chapman, Jarrod; Grimwood, Jane; Tadeo, Francisco R; Estornell, Leandro H; Muñoz-Sanz, Juan V; Ibanez, Victoria; Herrero-Ortega, Amparo; Aleza, Pablo; Pérez-Pérez, Julián; Ramón, Daniel; Brunel, Dominique; Luro, François; Chen, Chunxian; Farmerie, William G; Desany, Brian; Kodira, Chinnappa; Mohiuddin, Mohammed; Harkins, Tim; Fredrikson, Karin; Burns, Paul; Lomsadze, Alexandre; Borodovsky, Mark; Reforgiato, Giuseppe; Freitas-Astúa, Juliana; Quetier, Francis; Navarro, Luis; Roose, Mikeal; Wincker, Patrick; Schmutz, Jeremy; Morgante, Michele; Machado, Marcos Antonio; Talon, Manuel; Jaillon, Olivier; Ollitrault, Patrick; Gmitter, Frederick; Rokhsar, Daniel

    2014-07-01

    Cultivated citrus are selections from, or hybrids of, wild progenitor species whose identities and contributions to citrus domestication remain controversial. Here we sequence and compare citrus genomes--a high-quality reference haploid clementine genome and mandarin, pummelo, sweet-orange and sour-orange genomes--and show that cultivated types derive from two progenitor species. Although cultivated pummelos represent selections from one progenitor species, Citrus maxima, cultivated mandarins are introgressions of C. maxima into the ancestral mandarin species Citrus reticulata. The most widely cultivated citrus, sweet orange, is the offspring of previously admixed individuals, but sour orange is an F1 hybrid of pure C. maxima and C. reticulata parents, thus implying that wild mandarins were part of the early breeding germplasm. A Chinese wild 'mandarin' diverges substantially from C. reticulata, thus suggesting the possibility of other unrecognized wild citrus species. Understanding citrus phylogeny through genome analysis clarifies taxonomic relationships and facilitates sequence-directed genetic improvement.

  10. Genome sequencing of chimpanzee malaria parasites reveals possible pathways of adaptation to human hosts.

    PubMed

    Otto, Thomas D; Rayner, Julian C; Böhme, Ulrike; Pain, Arnab; Spottiswoode, Natasha; Sanders, Mandy; Quail, Michael; Ollomo, Benjamin; Renaud, François; Thomas, Alan W; Prugnolle, Franck; Conway, David J; Newbold, Chris; Berriman, Matthew

    2014-09-09

    Plasmodium falciparum causes most human malaria deaths, having prehistorically evolved from parasites of African Great Apes. Here we explore the genomic basis of P. falciparum adaptation to human hosts by fully sequencing the genome of the closely related chimpanzee parasite species P. reichenowi, and obtaining partial sequence data from a more distantly related chimpanzee parasite (P. gaboni). The close relationship between P. reichenowi and P. falciparum is emphasized by almost complete conservation of genomic synteny, but against this strikingly conserved background we observe major differences at loci involved in erythrocyte invasion. The organization of most virulence-associated multigene families, including the hypervariable var genes, is broadly conserved, but P. falciparum has a smaller subset of rif and stevor genes whose products are expressed on the infected erythrocyte surface. Genome-wide analysis identifies other loci under recent positive selection, but a limited number of changes at the host-parasite interface may have mediated host switching.

  11. Nascent RNA sequencing reveals widespread pausing and divergent initiation at human promoters.

    PubMed

    Core, Leighton J; Waterfall, Joshua J; Lis, John T

    2008-12-19

    RNA polymerases are highly regulated molecular machines. We present a method (global run-on sequencing, GRO-seq) that maps the position, amount, and orientation of transcriptionally engaged RNA polymerases genome-wide. In this method, nuclear run-on RNA molecules are subjected to large-scale parallel sequencing and mapped to the genome. We show that peaks of promoter-proximal polymerase reside on approximately 30% of human genes, transcription extends beyond pre-messenger RNA 3' cleavage, and antisense transcription is prevalent. Additionally, most promoters have an engaged polymerase upstream and in an orientation opposite to the annotated gene. This divergent polymerase is associated with active genes but does not elongate effectively beyond the promoter. These results imply that the interplay between polymerases and regulators over broad promoter regions dictates the orientation and efficiency of productive transcription.

  12. Cannabis microbiome sequencing reveals several mycotoxic fungi native to dispensary grade Cannabis flowers

    PubMed Central

    McKernan, Kevin; Spangler, Jessica; Zhang, Lei; Tadigotla, Vasisht; Helbert, Yvonne; Foss, Theodore; Smith, Douglas

    2016-01-01

    The Center for Disease Control estimates 128,000 people in the U.S. are hospitalized annually due to foodborne illnesses. This has created a demand for food safety testing targeting the detection of pathogenic mold and bacteria on agricultural products. This risk extends to medical Cannabis and is of particular concern with inhaled, vaporized and even concentrated Cannabis products. As a result, third party microbial testing has become a regulatory requirement in the medical and recreational Cannabis markets, yet knowledge of the Cannabis microbiome is limited. Here we describe the first next generation sequencing survey of the fungal communities found in dispensary based Cannabis flowers by ITS2 sequencing, and demonstrate the sensitive detection of several toxigenic Penicillium and Aspergillus species, including P. citrinum and P. paxilli, that were not detected by one or more culture-based methods currently in use for safety testing. PMID:27303623

  13. Molecular Mechanisms of HIV Type 1 Prophylaxis Failure Revealed by Single-Genome Sequencing

    PubMed Central

    Li, Hui; Blair, Lily; Chen, Yalu; Learn, Gerald; Pfafferott, Katja; John, Mina; Bhattacharya, Tanmoy; Hahn, Beatrice H.; Mallal, Simon; Shaw, George M.; Bar, Katharine J.

    2013-01-01

    Trials of human immunodeficiency virus type 1 (HIV) pre- and postexposure prophylaxis show promise. Here, we describe a novel strategy for deciphering mechanisms of prophylaxis failure that could improve therapeutic outcomes. A healthcare worker began antiretroviral prophylaxis immediately after a high-risk needlestick injury but nonetheless became viremic 11 weeks later. Single-genome sequencing of plasma viral RNA identified 15 drug susceptible transmitted/founder HIV genomes responsible for productive infection. Sequences emanating from these genomes exhibited extremely low diversity, suggesting virus sequestration as opposed to low-level replication as the cause of breakthrough infection. Identification of transmitted/founder viruses allows for genome-wide assessment of molecular mechanisms of prophylaxis failure. PMID:24023257

  14. Exome Sequencing Reveals Cubilin Mutation as a Single-Gene Cause of Proteinuria

    PubMed Central

    Ovunc, Bugsu; Otto, Edgar A.; Vega-Warner, Virginia; Saisawat, Pawaree; Ashraf, Shazia; Ramaswami, Gokul; Fathy, Hanan M.; Schoeb, Dominik; Chernin, Gil; Lyons, Robert H.; Yilmaz, Engin

    2011-01-01

    In two siblings of consanguineous parents with intermittent nephrotic-range proteinuria, we identified a homozygous deleterious frameshift mutation in the gene CUBN, which encodes cubilin, using exome capture and massively parallel re-sequencing. The mutation segregated with affected members of this family and was absent from 92 healthy individuals, thereby identifying a recessive mutation in CUBN as the single-gene cause of proteinuria in this sibship. Cubilin mutations cause a hereditary form of megaloblastic anemia secondary to vitamin B12 deficiency, and proteinuria occurs in 50% of cases since cubilin is a coreceptor for both the intestinal vitamin B12-intrinsic factor complex and the tubular reabsorption of protein in the proximal tubule. In summary, we report successful use of exome capture and massively parallel re-sequencing to identify a rare, single-gene cause of nephropathy. PMID:21903995

  15. Exome sequencing reveals cubilin mutation as a single-gene cause of proteinuria.

    PubMed

    Ovunc, Bugsu; Otto, Edgar A; Vega-Warner, Virginia; Saisawat, Pawaree; Ashraf, Shazia; Ramaswami, Gokul; Fathy, Hanan M; Schoeb, Dominik; Chernin, Gil; Lyons, Robert H; Yilmaz, Engin; Hildebrandt, Friedhelm

    2011-10-01

    In two siblings of consanguineous parents with intermittent nephrotic-range proteinuria, we identified a homozygous deleterious frameshift mutation in the gene CUBN, which encodes cubilin, using exome capture and massively parallel re-sequencing. The mutation segregated with affected members of this family and was absent from 92 healthy individuals, thereby identifying a recessive mutation in CUBN as the single-gene cause of proteinuria in this sibship. Cubilin mutations cause a hereditary form of megaloblastic anemia secondary to vitamin B(12) deficiency, and proteinuria occurs in 50% of cases since cubilin is a coreceptor for both the intestinal vitamin B(12)-intrinsic factor complex and the tubular reabsorption of protein in the proximal tubule. In summary, we report successful use of exome capture and massively parallel re-sequencing to identify a rare, single-gene cause of nephropathy.

  16. Molecular mechanisms of HIV type 1 prophylaxis failure revealed by single-genome sequencing.

    PubMed

    Li, Hui; Blair, Lily; Chen, Yalu; Learn, Gerald; Pfafferott, Katja; John, Mina; Bhattacharya, Tanmoy; Hahn, Beatrice H; Mallal, Simon; Shaw, George M; Bar, Katharine J

    2013-11-15

    Trials of human immunodeficiency virus type 1 (HIV) pre- and postexposure prophylaxis show promise. Here, we describe a novel strategy for deciphering mechanisms of prophylaxis failure that could improve therapeutic outcomes. A healthcare worker began antiretroviral prophylaxis immediately after a high-risk needlestick injury but nonetheless became viremic 11 weeks later. Single-genome sequencing of plasma viral RNA identified 15 drug susceptible transmitted/founder HIV genomes responsible for productive infection. Sequences emanating from these genomes exhibited extremely low diversity, suggesting virus sequestration as opposed to low-level replication as the cause of breakthrough infection. Identification of transmitted/founder viruses allows for genome-wide assessment of molecular mechanisms of prophylaxis failure.

  17. Functional metagenomics reveals novel β-galactosidases not predictable from gene sequences

    PubMed Central

    Cheng, Jiujun; Romantsov, Tatyana; Engel, Katja; Doxey, Andrew C.; Rose, David R.; Neufeld, Josh D.

    2017-01-01

    The techniques of metagenomics have allowed researchers to access the genomic potential of uncultivated microbes, but there remain significant barriers to determination of gene function based on DNA sequence alone. Functional metagenomics, in which DNA is cloned and expressed in surrogate hosts, can overcome these barriers, and make important contributions to the discovery of novel enzymes. In this study, a soil metagenomic library carried in an IncP cosmid was used for functional complementation for β-galactosidase activity in both Sinorhizobium meliloti (α-Proteobacteria) and Escherichia coli (γ-Proteobacteria) backgrounds. One β-galactosidase, encoded by six overlapping clones that were selected in both hosts, was identified as a member of glycoside hydrolase family 2. We could not identify ORFs obviously encoding possible β-galactosidases in 19 other sequenced clones that were only able to complement S. meliloti. Based on low sequence identity to other known glycoside hydrolases, yet not β-galactosidases, three of these ORFs were examined further. Biochemical analysis confirmed that all three encoded β-galactosidase activity. Lac36W_ORF11 and Lac161_ORF7 had conserved domains, but lacked similarities to known glycoside hydrolases. Lac161_ORF10 had neither conserved domains nor similarity to known glycoside hydrolases. Bioinformatic and structural modeling implied that Lac161_ORF10 protein represented a novel enzyme family with a five-bladed propeller glycoside hydrolase domain. By discovering founding members of three novel β-galactosidase families, we have reinforced the value of functional metagenomics for isolating novel genes that could not have been predicted from DNA sequence analysis alone. PMID:28273103

  18. Next-Generation Sequencing Reveals Frequent Opportunities for Exposure to Hepatitis C Virus in Ghana.

    PubMed

    Forbi, Joseph C; Layden, Jennifer E; Phillips, Richard O; Mora, Nallely; Xia, Guo-Liang; Campo, David S; Purdy, Michael A; Dimitrova, Zoya E; Owusu, Dorcas O; Punkova, Lili T; Skums, Pavel; Owusu-Ofori, Shirley; Sarfo, Fred Stephen; Vaughan, Gilberto; Roh, Hajung; Opare-Sem, Ohene K; Cooper, Richard S; Khudyakov, Yury E

    2015-01-01

    Globally, hepatitis C virus (HCV) infection is responsible for a large proportion of persons with liver disease, including cancer. The infection is highly prevalent in sub-Saharan Africa. West Africa was identified as a geographic origin of two HCV genotypes. However, little is known about the genetic composition of HCV populations in many countries of the region. Using conventional and next-generation sequencing (NGS), we identified and genetically characterized 65 HCV strains circulating among HCV-positive blood donors in Kumasi, Ghana. Phylogenetic analysis using consensus sequences derived from 3 genomic regions of the HCV genome, 5'-untranslated region, hypervariable region 1 (HVR1) and NS5B gene, consistently classified the HCV variants (n = 65) into genotype 1 (HCV-1, 15%) and genotype 2 (HCV-2, 85%). The Ghanaian and West African HCV-2 NS5B sequences were found completely intermixed in the phylogenetic tree, indicating a substantial genetic heterogeneity of HCV-2 in Ghana. Analysis of HVR1 sequences from intra-host HCV variants obtained by NGS showed that three donors were infected with >1 HCV strain, including infections with 2 genotypes. Two other donors shared an HCV strain, indicating HCV transmission between them. The HCV-2 strain sampled from one donor was replaced with another HCV-2 strain after only 2 months of observation, indicating rapid strain switching. Bayesian analysis estimated that the HCV-2 strains in Ghana have been expanding since the 16th century. The blood donors in Kumasi, Ghana, are infected with a very heterogeneous HCV population of HCV-1 and HCV-2, with HCV-2 being prevalent. The detection of three cases of co- or super-infections and transmission linkage between 2 cases suggests frequent opportunities for HCV exposure among the blood donors and is consistent with the reported high HCV prevalence. The conditions for effective HCV-2 transmission have existed for ~3-4 centuries, indicating a long epidemic history of HCV-2 in Ghana.

  19. Dysregulation of B Cell Repertoire Formation in Myasthenia Gravis Patients Revealed through Deep Sequencing.

    PubMed

    Vander Heiden, Jason A; Stathopoulos, Panos; Zhou, Julian Q; Chen, Luan; Gilbert, Tamara J; Bolen, Christopher R; Barohn, Richard J; Dimachkie, Mazen M; Ciafaloni, Emma; Broering, Teresa J; Vigneault, Francois; Nowak, Richard J; Kleinstein, Steven H; O'Connor, Kevin C

    2017-02-15

    Myasthenia gravis (MG) is a prototypical B cell-mediated autoimmune disease affecting 20-50 people per 100,000. The majority of patients fall into two clinically distinguishable types based on whether they produce autoantibodies targeting the acetylcholine receptor (AChR-MG) or muscle specific kinase (MuSK-MG). The autoantibodies are pathogenic, but whether their generation is associated with broader defects in the B cell repertoire is unknown. To address this question, we performed deep sequencing of the BCR repertoire of AChR-MG, MuSK-MG, and healthy subjects to generate ∼518,000 unique VH and VL sequences from sorted naive and memory B cell populations. AChR-MG and MuSK-MG subjects displayed distinct gene segment usage biases in both VH and VL sequences within the naive and memory compartments. The memory compartment of AChR-MG was further characterized by reduced positive selection of somatic mutations in the VH CDR and altered VH CDR3 physicochemical properties. The VL repertoire of MuSK-MG was specifically characterized by reduced V-J segment distance in recombined sequences, suggesting diminished VL receptor editing during B cell development. Our results identify large-scale abnormalities in both the naive and memory B cell repertoires. Particular abnormalities were unique to either AChR-MG or MuSK-MG, indicating that the repertoires reflect the distinct properties of the subtypes. These repertoire abnormalities are consistent with previously observed defects in B cell tolerance checkpoints in MG, thereby offering additional insight regarding the impact of tolerance defects on peripheral autoimmune repertoires. These collective findings point toward a deformed B cell repertoire as a fundamental component of MG.

  20. Genome Sequence of Microbulbifer mangrovi DD-13(T) Reveals Its Versatility to Degrade Multiple Polysaccharides.

    PubMed

    Imran, Md; Pant, Poonam; Shanbhag, Yogini P; Sawant, Samir V; Ghadi, Sanjeev C

    2017-02-01

    Microbulbifer mangrovi strain DD-13(T) is a novel-type species isolated from the mangroves of Goa, India. The draft genome sequence of strain DD-13 comprised 4,528,106 bp with G+C content of 57.15%. Out of 3479 open reading frames, functions for 3488 protein coding sequences were predicted on the basis of similarity with the cluster of orthologous groups. In addition to protein coding sequences, 34 tRNA genes and 3 rRNA genes were detected. Analysis of the nucleotide sequences of predicted genes using a Carbohydrate-Active Enzymes (CAZymes) Analysis Toolkit indicates that strain DD-13 encodes a large set of CAZymes including 255 glycoside hydrolases, 76 carbohydrate esterases, 17 polysaccharide lyases, and 113 carbohydrate-binding modules (CBMs). Many genes from strain DD-13 were annotated as carbohydrases specific for degradation of agar, alginate, carrageenan, chitin, xylan, pullulan, cellulose, starch, β-glucan, pectin, etc. Some of the polysaccharide-degrading genes were highly modular and were appended with at least one CBM, indicating the versatility of strain DD-13 to degrade complex polysaccharides. The cell growth of strain DD-13 was validated using pure polysaccharides such as agarose or alginate as carbon source as well as by using red and brown seaweed powder as substrate. The homologous carbohydrase produced by strain DD-13 during growth degraded the polysaccharide, ensuring the production of metabolizable reducing sugars. Additionally, several other polysaccharides such as carrageenan, xylan, pullulan, pectin, starch, and carboxymethyl cellulose were also corroborated as growth substrates for strain DD-13 and were associated with concomitant production of homologous carbohydrase.

  1. Multilocus Sequence Analysis Reveals that Vibrio harveyi and V. campbellii Are Distinct Species

    PubMed Central

    Thompson, Fabiano L.; Gomez-Gil, Bruno; Vasconcelos, Ana Teresa Ribeiro; Sawabe, Tomoo

    2007-01-01

    Identification and classification of Vibrio species have relied upon band pattern methods (e.g., amplified fragment length polymorphism) and DNA-DNA hybridization. However, data generated by these methods cannot be used to build an online electronic taxonomy. In order to overcome these limitations, we developed the first standard multilocus sequence scheme focused on the ubiquitous and pathogenic Vibrio harveyi species group (i.e., V. harveyi, V. campbellii, V. rotiferianus, and a new as yet unnamed species). We examined a collection of 104 isolates from different geographical regions and hosts using segments of seven housekeeping genes. These two species formed separated clusters on the basis of topA, pyrH, ftsZ, and mreB gene sequences. The phylogenetic picture obtained by the other three loci, i.e., gyrB, recA, and gapA, was more complex though. V. campbellii appeared nested within V. harveyi in the recA trees, whereas V. harveyi formed a tight nested cluster within V. campbellii by gapA. The gyrB gene had no taxonomic resolution and grouped the two species together. The fuzziness observed in these three genes seems not to be related to recombination but to low divergence due to the accumulation of only a few substitutions. In spite of this, the concatenated sequences provided evidence that the two species form two separated clusters. These clusters did not arise by recombination but by accumulation of point mutations. V. harveyi and V. campbellii isolates can be readily identified through the open database resource developed in this study (http://www.taxvibrio.lncc.br/). We argue that the species should be defined by evolutionary criteria. Strains of the same species will share at least 95% concatenated sequence similarity using the seven loci, and, most importantly, cospecific strains will form cohesive readily recognizable phylogenetic clades. PMID:17483280
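    The 95% concatenated-similarity criterion stated above can be sketched in code. The abstract does not say how similarity was computed, so this sketch assumes simple pairwise percent identity over pre-aligned sequences, skipping gapped columns. The function names and the toy sequences are hypothetical; only the seven locus names and the 95% threshold come from the abstract. It covers only the similarity criterion, not the requirement that cospecific strains form a cohesive phylogenetic clade.

```python
# Locus names as listed in the abstract; the concatenation order is an assumption.
LOCI = ["topA", "pyrH", "ftsZ", "mreB", "gyrB", "recA", "gapA"]


def concatenate_loci(strain: dict, order=LOCI) -> str:
    """Join a strain's per-locus aligned sequences in a fixed locus order."""
    return "".join(strain[name] for name in order)


def percent_identity(a: str, b: str) -> float:
    """Percent of identical positions between two aligned, equal-length
    sequences, ignoring columns where either sequence has a gap ('-')."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = matches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" or y == "-":
            continue
        compared += 1
        matches += x == y
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


def meets_similarity_threshold(s1: dict, s2: dict, threshold: float = 95.0) -> bool:
    """True if the concatenated sequences share at least `threshold` percent identity."""
    return percent_identity(concatenate_loci(s1), concatenate_loci(s2)) >= threshold


# Toy data (not real sequences): 10 aligned bases per locus, 70 in total.
strain_a = {name: "ACGTACGTAC" for name in LOCI}
# Two substitutions overall: 68 of 70 positions identical.
strain_b = dict(strain_a, recA="ACGTACGTAA", gapA="ACGTACGTTC")
# Six substitutions overall: 64 of 70 positions identical.
strain_c = dict(strain_a, topA="ACGTACGTTT", pyrH="ACGTACGTTT", ftsZ="ACGTACGTTT")
```

    With these toy inputs, strain_a and strain_b fall above the 95% threshold and strain_a and strain_c fall below it.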

  2. Biases during DNA extraction of activated sludge samples revealed by high throughput sequencing.

    PubMed

    Guo, Feng; Zhang, Tong

    2013-05-01

    Standardization of DNA extraction is a fundamental issue of fidelity and comparability in investigations of environmental microbial communities. Commercial kits for soil or feces are often adopted for studies of activated sludge because of a lack of specific kits, but they have never been evaluated regarding their effectiveness and potential biases based on high throughput sequencing. In this study, seven common DNA extraction kits were evaluated, based on not only yield/purity but also sequencing results, using two activated sludge samples (two sub-samples each, i.e. ethanol-fixed and fresh, as-is). The results indicate that the bead-beating step is necessary for DNA extraction from activated sludge. The two kits without the bead-beating step yielded very low amounts of DNA, and the least abundant operational taxonomic units (OTUs), and significantly underestimated the Gram-positive Actinobacteria, Nitrospirae, Chloroflexi, and Alphaproteobacteria and overestimated Gammaproteobacteria, Deltaproteobacteria, Bacteroidetes, and the rare phyla whose cell walls might have been readily broken. Among the other five kits, FastDNA® SPIN Kit for Soil extracted the most and the purest DNA. Although the number of total OTUs obtained using this kit was not the highest, the abundant OTUs and abundance of Actinobacteria demonstrated its efficiency. The three MoBio kits and one ZR kit produced fair results, but had a relatively low DNA yield and/or less Actinobacteria-related sequences. Moreover, the 50% ethanol fixation increased the DNA yield, but did not change the sequenced microbial community in a significant way. Based on the present study, the FastDNA SPIN kit for Soil is recommended for DNA extraction of activated sludge samples. More importantly, the selection of the DNA extraction kit must be done carefully if the samples contain dominant lysing-resistant groups, such as Actinobacteria and Nitrospirae.

  3. beta-Keratins in crocodiles reveal amino acid homology with avian keratins.

    PubMed

    Ye, Changjiang; Wu, Xiaobing; Yan, Peng; Amato, George

    2010-03-01

    The DNA sequences encoding beta-keratin have been obtained from Marsh Mugger (Crocodylus palustris) and Orinoco Crocodiles (Crocodylus intermedius). According to the deduced amino acid sequences, these proteins are rich in glycine, proline and serine. The central region of the proteins is composed of two beta-folded regions and shows a high degree of identity with beta-keratins of aves and squamates. This central part is thought to be the site of polymerization to build the framework of beta-keratin filaments. It is believed that the beta-keratins in reptiles and birds share a common ancestry. Near the C-terminus, these beta-keratins contain a peptide rich in glycine-X and glycine-X-X, and the distinctive feature of the region is some 12-amino acid repeats, which are similar to the 13-amino acid repeats in chick scale keratin but absent from avian feather keratin. From our phylogenetic analysis, the beta-keratins in crocodile have a closer relationship with avian keratins than the other keratins in reptiles.

  4. The components of the Daphnia pulex immune system as revealed by complete genome sequencing

    PubMed Central

    McTaggart, Seanna J; Conlon, Claire; Colbourne, John K; Blaxter, Mark L; Little, Tom J

    2009-01-01

    Background: Branchiopod crustaceans in the genus Daphnia are key model organisms for investigating interactions between genes and the environment. One major theme of research on Daphnia species has been the evolution of resistance to pathogens and parasites, but lack of knowledge of the Daphnia immune system has limited the study of immune responses. Here we provide a survey of the immune-related genome of D. pulex, derived from the newly completed genome sequence. Genes likely to be involved in innate immune responses were identified by comparison to homologues from other arthropods. For each candidate, the gene model was refined, and we conducted an analysis of sequence divergence from homologues from other taxa. Results and conclusion: We found that some immune pathways, in particular the TOLL pathway, are fairly well conserved between insects and Daphnia, while other elements, in particular antimicrobial peptides, could not be recovered from the genome sequence. We also found considerable variation in gene family copy number when comparing Daphnia to insects and present phylogenetic analyses to shed light on the evolution of a range of conserved immune gene families. PMID:19386092

  5. Quantitative Analysis of Fundus-Image Sequences Reveals Phase of Spontaneous Venous Pulsations

    PubMed Central

    Moret, Fabrice; Reiff, Charlotte M.; Lagrèze, Wolf A.; Bach, Michael

    2015-01-01

    Purpose: Spontaneous venous pulsation correlates negatively with elevated intracranial pressure and papilledema, and it relates to glaucoma. Yet, its etiology remains unclear. A key element to elucidate its underlying mechanism is the time at which collapse occurs with respect to the heart cycle, but previous reports are contradictory. We assessed this question in healthy subjects using quantitative measurements of both vein diameters and artery lateral displacements; the latter being used as the marker of the ocular systole time. Methods: We recorded 5-second fundus sequences with a near-infrared scanning laser ophthalmoscope in 12 young healthy subjects. The image sequences were coregistered, cleaned from microsaccades, and filtered via a principal component analysis to remove nonpulsatile dynamic features. Time courses of arterial lateral displacement and of diameter at sites of spontaneous venous pulsation or proximal to the disk were retrieved from those image sequences and compared. Results: Four subjects displayed both arterial and venous pulsatile waveforms. On those, we observed venous diameter waveforms differing markedly among the subjects, ranging from a waveform matching the typical intraocular pressure waveform to a close replica of the arterial waveform. Conclusions: The heterogeneity in waveforms and arteriovenous phases suggests that the mechanism governing the venous outflow resistance differs among healthy subjects. Translational relevance: Further characterizations are necessary to understand the heterogeneous mechanisms governing the venous outflow resistance as this resistance is altered in glaucoma and is instrumental when monitoring intracranial hypertension based on fundus observations. PMID:26396929

  6. High diversity of picornaviruses in rats from different continents revealed by deep sequencing

    PubMed Central

    Hansen, Thomas Arn; Mollerup, Sarah; Nguyen, Nam-phuong; White, Nicole E; Coghlan, Megan; Alquezar-Planas, David E; Joshi, Tejal; Jensen, Randi Holm; Fridholm, Helena; Kjartansdóttir, Kristín Rós; Mourier, Tobias; Warnow, Tandy; Belsham, Graham J; Bunce, Michael; Willerslev, Eske; Nielsen, Lars Peter; Vinner, Lasse; Hansen, Anders Johannes

    2016-01-01

    Outbreaks of zoonotic diseases in humans and livestock are not uncommon, and an important component in containment of such emerging viral diseases is rapid and reliable diagnostics. Such methods are often PCR-based and hence require the availability of sequence data from the pathogen. Rattus norvegicus (R. norvegicus) is a known reservoir for important zoonotic pathogens. Transmission may be direct via contact with the animal, for example, through exposure to its faecal matter, or indirectly mediated by arthropod vectors. Here we investigated the viral content in rat faecal matter (n=29) collected from two continents by analyzing 2.2 billion next-generation sequencing reads derived from both DNA and RNA. Among other virus families, we found sequences from members of the Picornaviridae to be abundant in the microbiome of all the samples. Here we describe the diversity of the picornavirus-like contigs including near-full-length genomes closely related to the Boone cardiovirus and Theiler's encephalomyelitis virus. From this study, we conclude that picornaviruses within R. norvegicus are more diverse than previously recognized. The virome of R. norvegicus should be investigated further to assess the full potential for zoonotic virus transmission. PMID:27530749

  7. High diversity of picornaviruses in rats from different continents revealed by deep sequencing.

    PubMed

    Hansen, Thomas Arn; Mollerup, Sarah; Nguyen, Nam-Phuong; White, Nicole E; Coghlan, Megan; Alquezar-Planas, David E; Joshi, Tejal; Jensen, Randi Holm; Fridholm, Helena; Kjartansdóttir, Kristín Rós; Mourier, Tobias; Warnow, Tandy; Belsham, Graham J; Bunce, Michael; Willerslev, Eske; Nielsen, Lars Peter; Vinner, Lasse; Hansen, Anders Johannes

    2016-08-17

    Outbreaks of zoonotic diseases in humans and livestock are not uncommon, and an important component in containment of such emerging viral diseases is rapid and reliable diagnostics. Such methods are often PCR-based and hence require the availability of sequence data from the pathogen. Rattus norvegicus (R. norvegicus) is a known reservoir for important zoonotic pathogens. Transmission may be direct via contact with the animal, for example, through exposure to its faecal matter, or indirectly mediated by arthropod vectors. Here we investigated the viral content in rat faecal matter (n=29) collected from two continents by analyzing 2.2 billion next-generation sequencing reads derived from both DNA and RNA. Among other virus families, we found sequences from members of the Picornaviridae to be abundant in the microbiome of all the samples. Here we describe the diversity of the picornavirus-like contigs including near-full-length genomes closely related to the Boone cardiovirus and Theiler's encephalomyelitis virus. From this study, we conclude that picornaviruses within R. norvegicus are more diverse than previously recognized. The virome of R. norvegicus should be investigated further to assess the full potential for zoonotic virus transmission.

  8. Unique Features of the Loblolly Pine (Pinus taeda L.) Megagenome Revealed Through Sequence Annotation

    PubMed Central

    Wegrzyn, Jill L.; Liechty, John D.; Stevens, Kristian A.; Wu, Le-Shin; Loopstra, Carol A.; Vasquez-Gross, Hans A.; Dougherty, William M.; Lin, Brian Y.; Zieve, Jacob J.; Martínez-García, Pedro J.; Holt, Carson; Yandell, Mark; Zimin, Aleksey V.; Yorke, James A.; Crepeau, Marc W.; Puiu, Daniela; Salzberg, Steven L.; de Jong, Pieter J.; Mockaitis, Keithanne; Main, Doreen; Langley, Charles H.; Neale, David B.

    2014-01-01

    The largest genus in the conifer family Pinaceae is Pinus, with over 100 species. The size and complexity of their genomes (∼20–40 Gb, 2n = 24) have delayed the arrival of a well-annotated reference sequence. In this study, we present the annotation of the first whole-genome shotgun assembly of loblolly pine (Pinus taeda L.), which comprises 20.1 Gb of sequence. The MAKER-P annotation pipeline combined evidence-based alignments and ab initio predictions to generate 50,172 gene models, of which 15,653 are classified as high confidence. Clustering these gene models with 13 other plant species resulted in 20,646 gene families, of which 1554 are predicted to be unique to conifers. Among the conifer gene families, 159 are composed exclusively of loblolly pine members. The gene models for loblolly pine have the highest median and mean intron lengths of 24 fully sequenced plant genomes. Conifer genomes are full of repetitive DNA, with the most significant contributions from long-terminal-repeat retrotransposons. In-depth analysis of the tandem and interspersed repetitive content yielded a combined estimate of 82%. PMID:24653211

  9. Legionella oakridgensis ATCC 33761 genome sequence and phenotypic characterization reveals its replication capacity in amoebae.

    PubMed

    Brzuszkiewicz, Elzbieta; Schulz, Tino; Rydzewski, Kerstin; Daniel, Rolf; Gillmaier, Nadine; Dittmann, Christine; Holland, Gudrun; Schunder, Eva; Lautner, Monika; Eisenreich, Wolfgang; Lück, Christian; Heuner, Klaus

    2013-12-01

    Legionella oakridgensis is able to cause Legionnaires' disease, but is less virulent than L. pneumophila strains and very rarely associated with human disease. L. oakridgensis is the only species of the family legionellae which is able to grow on media without additional cysteine. In contrast to earlier publications, we found that L. oakridgensis is able to multiply in amoebae. We sequenced the genome of L. oakridgensis type strain OR-10 (ATCC 33761). The genome is smaller than the other Legionella genomes sequenced so far and has a higher G+C content of 40.9%. L. oakridgensis lacks a flagellum and it also lacks all genes of the flagellar regulon except for the alternative sigma-28 factor FliA and the anti-sigma-28 factor FlgM. Genes encoding structural components of type I, type II, type IV Lvh and type IV Dot/Icm, Sec- and Tat-secretion systems could be identified. Only a limited set of Dot/Icm effector proteins has been recognized within the genome sequence of L. oakridgensis. As in L. pneumophila strains, various proteins with eukaryotic motifs and eukaryote-like proteins were detected. We could demonstrate that the Dot/Icm system is essential for intracellular replication of L. oakridgensis. Furthermore, we identified new putative virulence factors of Legionella.

  10. Sequencing the mouse Y chromosome reveals convergent gene acquisition and amplification on both sex chromosomes.

    PubMed

    Soh, Y Q Shirleen; Alföldi, Jessica; Pyntikova, Tatyana; Brown, Laura G; Graves, Tina; Minx, Patrick J; Fulton, Robert S; Kremitzki, Colin; Koutseva, Natalia; Mueller, Jacob L; Rozen, Steve; Hughes, Jennifer F; Owens, Elaine; Womack, James E; Murphy, William J; Cao, Qing; de Jong, Pieter; Warren, Wesley C; Wilson, Richard K; Skaletsky, Helen; Page, David C

    2014-11-06

    We sequenced the MSY (male-specific region of the Y chromosome) of the C57BL/6J strain of the laboratory mouse Mus musculus. In contrast to theories that Y chromosomes are heterochromatic and gene poor, the mouse MSY is 99.9% euchromatic and contains about 700 protein-coding genes. Only 2% of the MSY derives from the ancestral autosomes that gave rise to the mammalian sex chromosomes. Instead, all but 45 of the MSY's genes belong to three acquired, massively amplified gene families that have no homologs on primate MSYs but do have acquired, amplified homologs on the mouse X chromosome. The complete mouse MSY sequence brings to light dramatic forces in sex chromosome evolution: lineage-specific convergent acquisition and amplification of X-Y gene families, possibly fueled by antagonism between acquired X-Y homologs. The mouse MSY sequence presents opportunities for experimental studies of a sex-specific chromosome in its entirety, in a genetically tractable model organism.

  11. Genome Sequencing of Autism-Affected Families Reveals Disruption of Putative Noncoding Regulatory DNA

    PubMed Central

    Turner, Tychele N.; Hormozdiari, Fereydoun; Duyzend, Michael H.; McClymont, Sarah A.; Hook, Paul W.; Iossifov, Ivan; Raja, Archana; Baker, Carl; Hoekzema, Kendra; Stessman, Holly A.; Zody, Michael C.; Nelson, Bradley J.; Huddleston, John; Sandstrom, Richard; Smith, Joshua D.; Hanna, David; Swanson, James M.; Faustman, Elaine M.; Bamshad, Michael J.; Stamatoyannopoulos, John; Nickerson, Deborah A.; McCallion, Andrew S.; Darnell, Robert; Eichler, Evan E.

    2016-01-01

    We performed whole-genome sequencing (WGS) of 208 genomes from 53 families affected by simplex autism. For the majority of these families, no copy-number variant (CNV) or candidate de novo gene-disruptive single-nucleotide variant (SNV) had been detected by microarray or whole-exome sequencing (WES). We integrated multiple CNV and SNV analyses and extensive experimental validation to identify additional candidate mutations in eight families. We report that compared to control individuals, probands showed a significant (p = 0.03) enrichment of de novo and private disruptive mutations within fetal CNS DNase I hypersensitive sites (i.e., putative regulatory regions). This effect was only observed within 50 kb of genes that have been previously associated with autism risk, including genes where dosage sensitivity has already been established by recurrent disruptive de novo protein-coding mutations (ARID1B, SCN2A, NR3C2, PRKCA, and DSCAM). In addition, we provide evidence of gene-disruptive CNVs (in DISC1, WNT7A, RBFOX1, and MBD5), as well as smaller de novo CNVs and exon-specific SNVs missed by exome sequencing in neurodevelopmental genes (e.g., CANX, SAE1, and PIK3CA). Our results suggest that the detection of smaller, often multiple CNVs affecting putative regulatory elements might help explain additional risk of simplex autism. PMID:26749308

  12. Mutational and fitness landscapes of an RNA virus revealed through population sequencing.

    PubMed

    Acevedo, Ashley; Brodsky, Leonid; Andino, Raul

    2014-01-30

    RNA viruses exist as genetically diverse populations. It is thought that diversity and genetic structure of viral populations determine the rapid adaptation observed in RNA viruses and hence their pathogenesis. However, our understanding of the mechanisms underlying virus evolution has been limited by the inability to accurately describe the genetic structure of virus populations. Next-generation sequencing technologies generate data of sufficient depth to characterize virus populations, but are limited in their utility because most variants are present at very low frequencies and are thus indistinguishable from next-generation sequencing errors. Here we present an approach that reduces next-generation sequencing errors and allows the description of virus populations with unprecedented accuracy. Using this approach, we define the mutation rates of poliovirus and uncover the mutation landscape of the population. Furthermore, by monitoring changes in variant frequencies on serially passaged populations, we determined fitness values for thousands of mutations across the viral genome. Mapping of these fitness values onto three-dimensional structures of viral proteins offers a powerful approach for exploring structure-function relationships and potentially uncovering new functions. To our knowledge, our study provides the first single-nucleotide fitness landscape of an evolving RNA virus and establishes a general experimental platform for studying the genetic changes underlying the evolution of virus populations.

  13. Gas-phase Ion Isomer Analysis Reveals the Mechanism of Peptide Sequence Scrambling

    PubMed Central

    Jia, Chenxi; Wu, Zhe; Lietz, Christopher B.; Liang, Zhidan; Cui, Qiang; Li, Lingjun

    2014-01-01

    Peptide sequence scrambling during mass spectrometry-based gas-phase fragmentation analysis causes misidentification of peptides and proteins. Thus, there is a need to develop an efficient approach to probing the gas-phase fragment ion isomers related to sequence scrambling and the underlying fragmentation mechanism, which will facilitate the development of bioinformatics algorithms for proteomics research. Herein, we report on the first use of electron transfer dissociation (ETD)-produced diagnostic fragment ions to probe the components of gas-phase peptide fragment ion isomers. In combination with ion mobility spectrometry (IMS) and formaldehyde labeling, this novel strategy enables qualitative and quantitative analysis of b-type fragment ion isomers. ETD fragmentation produced diagnostic fragment ions indicative of the precursor ion isomer components, and subsequent IMS analysis of b ion isomers provided their quantitative and structural information. The isomer components of three representative b ions (b9, b10, and b33 from three different peptides) were accurately profiled by this method. IMS analysis of the b9 ion isomers exhibited dynamic conversion among these structures. Furthermore, molecular dynamics simulation predicted theoretical drift time values which were in good agreement with experimentally measured values. Our results strongly support the mechanism of peptide sequence scrambling via b ion cyclization, and provide the first experimental evidence that the conversion from molecular precursor ion to cyclic b ion (M→cb) pathway is less energetically (or kinetically) favored. PMID:24313304

  14. Whole Genome Sequencing Reveals a De Novo SHANK3 Mutation in Familial Autism Spectrum Disorder

    PubMed Central

    Nemirovsky, Sergio I.; Córdoba, Marta; Zaiat, Jonathan J.; Completa, Sabrina P.; Vega, Patricia A.; González-Morón, Dolores; Medina, Nancy M.; Fabbro, Mónica; Romero, Soledad; Brun, Bianca; Revale, Santiago; Ogara, María Florencia; Pecci, Adali; Marti, Marcelo; Vazquez, Martin; Turjanski, Adrián; Kauffman, Marcelo A.

    2015-01-01

    Introduction: Clinical genomics promises to be especially suitable for the study of etiologically heterogeneous conditions such as Autism Spectrum Disorder (ASD). Here we present three siblings with ASD in whom we evaluated the usefulness of Whole Genome Sequencing (WGS) for the diagnostic approach to ASD. Methods: We identified a family segregating ASD in three siblings with an unidentified cause. We performed WGS in the three probands, used a state-of-the-art comprehensive bioinformatic analysis pipeline, and prioritized the identified variants located in genes likely to be related to ASD. We validated the finding by Sanger sequencing in the probands and their parents. Results: Three male siblings presented a syndrome characterized by severe intellectual disability, absence of language, autism spectrum symptoms and epilepsy with negative family history for mental retardation, language disorders, ASD or other psychiatric disorders. We found germline mosaicism for a heterozygous deletion of a cytosine in exon 21 of the SHANK3 gene, resulting in a missense sequence of 5 codons followed by a premature stop codon (NM_033517:c.3259_3259delC, p.Ser1088Profs*6). Conclusions: We reported an infrequent form of familial ASD where WGS proved useful in the clinic. We identified a mutation in SHANK3 that underscores its relevance in Autism Spectrum Disorder. PMID:25646853

  15. Diversity of the Cronobacter Genus as Revealed by Multilocus Sequence Typing