Science.gov

Sample records for acid sequence comparisons

  1. Comparisons of the Distribution of Nucleotides and Common Sequences in Deoxyribonucleic Acid from Selected Bacteriophages

    PubMed Central

    Skalka, A.; Hanson, P.

    1972-01-01

    Results from comparisons of deoxyribonucleic acid (DNA) from several classes of bacteriophages suggest that most phage chromosomes either contain a homogeneous distribution of nucleotides or are made up of a few, rather large segments of different guanine plus cytosine (G + C) contents which are internally homogeneous. Among those temperate phages tested, most contained segmented DNA. Comparisons of sequence similarities among segments from lambdoid phage DNA species revealed the following order in relatedness to λ: 82 (and 434) > 21 > 424 > φ80. Most common sequences are found in the highest G + C segments, which in λ contain head and tail genes. Hybridization tests with λ and 186 or P2 DNA species verified that the lambdoids and 186 and P2 belong to two distinct groups. There are fewer homologous sequences between the DNA species of coliphages λ and P2 or 186 than there are between the DNA species of coliphage λ and salmonella phage P22. PMID:4553679

  2. Sequence analysis of four acidic beta-crystallin subunits of amphibian lenses: phylogenetic comparison between beta- and gamma-crystallins.

    PubMed

    Lu, S F; Pan, F M; Chiou, S H

    1996-04-16

    beta-Crystallins, which comprise the most heterogeneous group of subunit chains among the three major crystallin families of vertebrates (alpha-, beta- and gamma-crystallins), are less well understood at the structural and functional levels than the other two. They form a multigene family with at least three basic (betaB1-3) and four acidic (betaA1-4) subunit polypeptides. In order to facilitate the determination of the primary sequences of these ubiquitous crystallin subunits, which are present in all vertebrate species, a cDNA mixture was synthesized from the poly(A)+ mRNA isolated from bullfrog eye lenses. A Rapid Amplification of cDNA Ends (RACE) protocol was used to amplify cDNAs encoding beta-crystallin acidic subunit polypeptides by polymerase chain reaction (PCR). Four complete full-length reading frames, two each of 597 and 648 base pairs, which cover four deduced protein sequences of 198 (betaA1-1 and betaA1-2) and 215 (betaA3-1 and betaA3-2) amino acids including the universal initiating methionine, were revealed by nucleotide sequencing. They show about 96-98% sequence similarity among themselves and 76-80% and 80-83% similarity to the homologous betaA1/A3 crystallins of bovine and human species, respectively, revealing the close structural relationship among acidic subunits of all beta-crystallins even from remotely related species. In this study a phylogenetic comparison based on amino-acid sequences of various betaA1/A3 crystallins plus the major basic beta-crystallin (betaBp) and gamma-crystallin from different vertebrate species is made using a combination of distance matrix and approximate parsimony methods, which correctly groups these betaA crystallin chains together as one family distinct from basic beta-crystallins and gamma-crystallin and further corroborates the supposition that beta- and gamma-crystallins form a superfamily with a common ancestry.

  3. Comparison of the amino acid sequence of the major immunogen from three serotypes of foot and mouth disease virus.

    PubMed Central

    Makoff, A J; Paynter, C A; Rowlands, D J; Boothroyd, J C

    1982-01-01

    Cloned cDNA molecules from three serotypes of FMDV have been sequenced around the VP1-coding region. The predicted amino acid sequences for VP1 were compared with the published sequences and variable regions identified. The amino acid sequences were also analysed for hydrophilic regions. Two of the variable regions, numbered 129-160 and 193-204, overlapped hydrophilic regions and were therefore identified as potentially immunogenic. These regions overlap regions shown by others to be immunogenic. PMID:6298715

  4. Comparison of amino acid sequence of bovine coagulation Factor IX (Christmas Factor) with that of other vitamin K-dependent plasma proteins.

    PubMed

    Katayama, K; Ericsson, L H; Enfield, D L; Walsh, K A; Neurath, H; Davie, E W; Titani, K

    1979-10-01

    The amino acid sequence of bovine blood coagulation Factor IX (Christmas Factor) is presented and compared with the sequences of other vitamin K-dependent plasma proteins and pancreatic trypsinogen. The 416-residue sequence of Factor IX was determined largely by automated Edman degradation of two large segments, containing 181 and 235 residues, isolated after activating Factor IX with a protease from Russell's viper venom. Subfragments of the two segments were produced by enzymatic digestion and by chemical cleavage of methionyl, tryptophyl, and asparaginyl-glycyl bonds. Comparison of the amino acid sequences of Factor IX, Factor X, and Protein C demonstrates that they are homologous throughout. Their homology with prothrombin, however, is restricted to the amino-terminal region, which is rich in gamma-carboxyglutamic acid, and the carboxyl-terminal region, which represents the catalytic domain of these proteins and corresponds to that of pancreatic serine proteases.

  5. Comparison of amino acid sequences of the trypsin inhibitors from taro (Colocasia esculenta), giant taro (Alocasia macrorrhiza) and giant swamp taro (Cyrtosperma chamissonis).

    PubMed

    Peng, L; Bradbury, J H; Hammer, B C; Shaw, D C

    1993-09-01

    The amino acid sequences of the trypsin inhibitors from taro Colocasia esculenta var. esculenta and giant swamp taro Cyrtosperma chamissonis have been determined and are compared with the protein sequence of the trypsin/chymotrypsin inhibitor from giant taro Alocasia macrorrhiza. Both inhibitors display polymorphism and there is evidence of two components in the giant swamp taro. The positional identity between the proteins is highest at 73-75% for the comparison of the giant taro (GT) with the polymorphic forms of the taro (T) inhibitors and lowest at 56-58% for the pairs of taro and giant swamp taro (GST) proteins. The comparisons show that the inhibitors from T and GT are more related to each other than to GST, which supports their taxonomic classification into different tribes. Location of the P1 site for the trypsin inhibitors of aroids is different from that of other Kunitz-type inhibitors and could be at Leu56.
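
    The positional identity figures quoted in this abstract are percentages of aligned positions that carry the same residue. A minimal sketch of that calculation follows, assuming two sequences that have already been aligned to equal length with "-" marking gaps. The function name is hypothetical, and skipping gapped columns is an assumption, since the abstract does not say how gaps were counted.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percentage of ungapped alignment columns with identical residues."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0  # columns where neither sequence has a gap
    matches = 0   # compared columns with the same residue
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # assumption: gapped columns are excluded
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared
```

    Other conventions divide by the full alignment length or by the shorter sequence length, which gives lower values when gaps are present.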

  6. Sequence of cDNA for rat cystathionine gamma-lyase and comparison of deduced amino acid sequence with related Escherichia coli enzymes.

    PubMed Central

    Erickson, P F; Maxwell, I H; Su, L J; Baumann, M; Glode, L M

    1990-01-01

    A cDNA clone for cystathionine gamma-lyase was isolated from a rat cDNA library in lambda gt11 by screening with a monospecific antiserum. The identity of this clone, containing 600 bp proximal to the 3'-end of the gene, was confirmed by positive hybridization selection. Northern-blot hybridization showed the expected higher abundance of the corresponding mRNA in liver than in brain. Two further cDNA clones from a plasmid pcD library were isolated by colony hybridization with the first clone and were found to contain inserts of 1600 and 1850 bp. One of these was confirmed as encoding cystathionine gamma-lyase by hybridization with two independent pools of oligodeoxynucleotides corresponding to partial amino acid sequence information for cystathionine gamma-lyase. The other clone (estimated to represent all but 8% of the 5'-end of the mRNA) was sequenced and its deduced amino acid sequence showed similarity to those of the Escherichia coli enzymes cystathionine beta-lyase and cystathionine gamma-synthase throughout its length, especially to that of the latter. PMID:2201285

  7. Composition for nucleic acid sequencing

    SciTech Connect

    Korlach, Jonas; Webb, Watt W.; Levene, Michael; Turner, Stephen; Craighead, Harold G.; Foquet, Mathieu

    2008-08-26

    The present invention is directed to a method of sequencing a target nucleic acid molecule having a plurality of bases. In its principle, the temporal order of base additions during the polymerization reaction is measured on a molecule of nucleic acid, i.e. the activity of a nucleic acid polymerizing enzyme on the template nucleic acid molecule to be sequenced is followed in real time. The sequence is deduced by identifying which base is being incorporated into the growing complementary strand of the target nucleic acid by the catalytic activity of the nucleic acid polymerizing enzyme at each step in the sequence of base additions. A polymerase on the target nucleic acid molecule complex is provided in a position suitable to move along the target nucleic acid molecule and extend the oligonucleotide primer at an active site. A plurality of labelled types of nucleotide analogs are provided proximate to the active site, with each distinguishable type of nucleotide analog being complementary to a different nucleotide in the target nucleic acid sequence. The growing nucleic acid strand is extended by using the polymerase to add a nucleotide analog to the nucleic acid strand at the active site, where the nucleotide analog being added is complementary to the nucleotide of the target nucleic acid at the active site. The nucleotide analog added to the oligonucleotide primer as a result of the polymerizing step is identified. The steps of providing labelled nucleotide analogs, polymerizing the growing nucleic acid strand, and identifying the added nucleotide analog are repeated so that the nucleic acid strand is further extended and the sequence of the target nucleic acid is determined.

  9. Sequence Comparison and Phylogeny of Nucleotide Sequence of Coat Protein and Nucleic Acid Binding Protein of a Distinct Isolate of Shallot virus X from India.

    PubMed

    Majumder, S; Baranwal, V K

    2011-06-01

    Shallot virus X (ShVX), the type species of the genus Allexivirus in the family Alfaflexiviridae, has been associated with shallot plants in India and in other shallot-growing countries such as Russia, Germany, the Netherlands, and New Zealand. The coat protein (CP) and nucleic acid binding protein (NB) regions of the virus were obtained by reverse transcriptase polymerase chain reaction from scale leaves of shallot bulbs. The partial cDNA contained two open reading frames encoding proteins with molecular weights of 28.66 and 14.18 kDa, belonging to the Flexi_CP super-family and the viral NB super-family, respectively. The percent identity and phylogenetic analysis of amino acid sequences of the CP and NB regions of the virus associated with shallot indicated that it was a distinct isolate of ShVX. PMID:23637504

  10. High speed nucleic acid sequencing

    SciTech Connect

    Korlach, Jonas; Webb, Watt W.; Levene, Michael; Turner, Stephen; Craighead, Harold G.; Foquet, Mathieu

    2011-05-17

    The present invention is directed to a method of sequencing a target nucleic acid molecule having a plurality of bases. In its principle, the temporal order of base additions during the polymerization reaction is measured on a molecule of nucleic acid. Each type of labeled nucleotide comprises an acceptor fluorophore attached to a phosphate portion of the nucleotide such that the fluorophore is removed upon incorporation into a growing strand. Fluorescent signal is emitted via fluorescent resonance energy transfer between the donor fluorophore and the acceptor fluorophore as each nucleotide is incorporated into the growing strand. The sequence is deduced by identifying which base is being incorporated into the growing strand.

  11. Chip-based sequencing nucleic acids

    SciTech Connect

    Beer, Neil Reginald

    2014-08-26

    A system for fast DNA sequencing by amplification of genetic material within microreactors, denaturing, demulsifying, and then sequencing the material, while retaining it in a PCR/sequencing zone by a magnetic field. One embodiment includes sequencing nucleic acids on a microchip that includes a microchannel flow channel in the microchip. The nucleic acids are isolated and hybridized to magnetic nanoparticles or to magnetic polystyrene-coated beads. Microreactor droplets are formed in the microchannel flow channel. The microreactor droplets containing the nucleic acids and the magnetic nanoparticles are retained in a magnetic trap in the microchannel flow channel and sequenced.

  12. A case of orthologous sequences of hemocyanin subunits for an evolutionary study of horseshoe crabs: amino acid sequence comparison of immunologically identical subunits of Carcinoscorpius rotundicauda and Tachypleus tridentatus.

    PubMed

    Sugita, H; Shishikura, F

    1995-10-01

    About 83% of the amino acid sequence of hemocyanin subunit HR6 from the Southeast Asian horseshoe crab, Carcinoscorpius rotundicauda, has been determined. There is a difference of about 43% between HR6 and complete sequences of chelicerate hemocyanin subunits from the American horseshoe crab, Limulus polyphemus, and a tarantula, Eurypelma californicum. However, the immunologically identical subunits HR6 and HT6 from Tachypleus tridentatus (Japanese horseshoe crab) show 2.7% sequence difference. Based on the amino acid sequences of HR6 and HT6, the divergence between C. rotundicauda and T. tridentatus occurred about 9.6 million years ago. In the case of horseshoe crab hemocyanin subunits, it seems that the orthologous homologues in many homologous subunits between species are immunologically detectable.

  13. Method and apparatus for biological sequence comparison

    DOEpatents

    Marr, Thomas G.; Chang, William I-Wei

    1997-01-01

    A method and apparatus for comparing biological sequences from a known source of sequences, with a subject (query) sequence. The apparatus takes as input a set of target similarity levels (such as evolutionary distances in units of PAM), and finds all fragments of known sequences that are similar to the subject sequence at each target similarity level, and are long enough to be statistically significant. The invention device filters out fragments from the known sequences that are too short, or have a lower average similarity to the subject sequence than is required by each target similarity level. The subject sequence is then compared only to the remaining known sequences to find the best matches. The filtering member divides the subject sequence into overlapping blocks, each block being sufficiently large to contain a minimum-length alignment from a known sequence. For each block, the filter member compares the block with every possible short fragment in the known sequences and determines a best match for each comparison. The determined set of short fragment best matches for the block provide an upper threshold on alignment values. Regions of a certain length from the known sequences that have a mean alignment value upper threshold greater than a target unit score are concatenated to form a union. The current block is compared to the union and provides an indication of best local alignment with the subject sequence.

  14. Method and apparatus for biological sequence comparison

    DOEpatents

    Marr, T.G.; Chang, W.I.

    1997-12-23

    A method and apparatus are disclosed for comparing biological sequences from a known source of sequences, with a subject (query) sequence. The apparatus takes as input a set of target similarity levels (such as evolutionary distances in units of PAM), and finds all fragments of known sequences that are similar to the subject sequence at each target similarity level, and are long enough to be statistically significant. The invention device filters out fragments from the known sequences that are too short, or have a lower average similarity to the subject sequence than is required by each target similarity level. The subject sequence is then compared only to the remaining known sequences to find the best matches. The filtering member divides the subject sequence into overlapping blocks, each block being sufficiently large to contain a minimum-length alignment from a known sequence. For each block, the filter member compares the block with every possible short fragment in the known sequences and determines a best match for each comparison. The determined set of short fragment best matches for the block provide an upper threshold on alignment values. Regions of a certain length from the known sequences that have a mean alignment value upper threshold greater than a target unit score are concatenated to form a union. The current block is compared to the union and provides an indication of best local alignment with the subject sequence. 5 figs.

  15. Sequence comparisons via algorithmic mutual information.

    PubMed

    Milosavljević, A

    1994-01-01

    One of the main problems in DNA and protein sequence comparisons is to decide whether the observed similarity of two sequences should be explained by their relatedness or by the mere presence of some shared internal structure, e.g., shared internal tandem repeats. The standard methods that are based on statistics or classical information theory can be used to discover either internal structure or mutual sequence similarity, but cannot take both into account. Consequently, currently used methods for sequence comparison employ "masking" techniques that simply eliminate sequences that exhibit internal repetitive structure prior to sequence comparisons. The "masking" approach precludes discovery of homologous sequences of moderate or low complexity, which abound at both DNA and protein levels. As a solution to this problem, we propose a general method that is based on algorithmic information theory and minimal length encoding. We show that algorithmic mutual information factors out the sequence similarity that is due to shared internal structure and thus enables discovery of truly related sequences. We extend the recently developed algorithmic significance method (Milosavljević & Jurka 1993) to show that significance depends exponentially on algorithmic mutual information.
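
    The idea of algorithmic mutual information can be illustrated with an off-the-shelf compressor. The sketch below is not the author's minimal length encoding scheme. It swaps in zlib as a rough stand-in for description length and estimates mutual information as C(x) + C(y) - C(xy), where C is the compressed size in bits. Both function names are hypothetical.

```python
import zlib


def compressed_bits(seq: str) -> int:
    """Compressed size of seq in bits, used as a crude description length."""
    return 8 * len(zlib.compress(seq.encode("ascii"), 9))


def approx_mutual_information(x: str, y: str) -> int:
    """Estimate shared information as C(x) + C(y) - C(xy).

    Assumption: zlib stands in for an ideal encoder. The estimate is
    noisy for short inputs because of fixed compressor overhead.
    """
    return compressed_bits(x) + compressed_bits(y) - compressed_bits(x + y)
```

    Under this estimate, two copies of the same high-complexity sequence share far more information than two unrelated sequences. Two sequences that are each just a simple tandem repeat share little, because each is already cheap to describe on its own.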

  16. Distinguishing proteins from arbitrary amino acid sequences.

    PubMed

    Yau, Stephen S-T; Mao, Wei-Guang; Benson, Max; He, Rong Lucy

    2015-01-01

    What kinds of amino acid sequences could possibly be protein sequences? From all existing databases that we can find, known proteins are only a small fraction of all possible combinations of amino acids. Beginning with Sanger's first detailed determination of a protein sequence in 1952, previous studies have focused on describing the structure of existing protein sequences in order to construct the protein universe. No one, however, has developed a criterion for determining whether an arbitrary amino acid sequence can be a protein. Here we show that when the collection of arbitrary amino acid sequences is viewed in an appropriate geometric context, the protein sequences cluster together. This leads to a new computational test, described here, that has proved to be remarkably accurate at determining whether an arbitrary amino acid sequence can be a protein. Even more, if the results of this test indicate that the sequence can be a protein, and it is indeed a protein sequence, then its identity as a protein sequence is uniquely defined. We anticipate our computational test will be useful for those who are attempting to complete the job of discovering all proteins, or constructing the protein universe. PMID:25609314

  17. Method for sequencing nucleic acid molecules

    DOEpatents

    Korlach, Jonas; Webb, Watt W.; Levene, Michael; Turner, Stephen; Craighead, Harold G.; Foquet, Mathieu

    2006-06-06

    The present invention is directed to a method of sequencing a target nucleic acid molecule having a plurality of bases. In its principle, the temporal order of base additions during the polymerization reaction is measured on a molecule of nucleic acid, i.e. the activity of a nucleic acid polymerizing enzyme on the template nucleic acid molecule to be sequenced is followed in real time. The sequence is deduced by identifying which base is being incorporated into the growing complementary strand of the target nucleic acid by the catalytic activity of the nucleic acid polymerizing enzyme at each step in the sequence of base additions. A polymerase on the target nucleic acid molecule complex is provided in a position suitable to move along the target nucleic acid molecule and extend the oligonucleotide primer at an active site. A plurality of labelled types of nucleotide analogs are provided proximate to the active site, with each distinguishable type of nucleotide analog being complementary to a different nucleotide in the target nucleic acid sequence. The growing nucleic acid strand is extended by using the polymerase to add a nucleotide analog to the nucleic acid strand at the active site, where the nucleotide analog being added is complementary to the nucleotide of the target nucleic acid at the active site. The nucleotide analog added to the oligonucleotide primer as a result of the polymerizing step is identified. The steps of providing labelled nucleotide analogs, polymerizing the growing nucleic acid strand, and identifying the added nucleotide analog are repeated so that the nucleic acid strand is further extended and the sequence of the target nucleic acid is determined.

  18. Method for sequencing nucleic acid molecules

    DOEpatents

    Korlach, Jonas; Webb, Watt W.; Levene, Michael; Turner, Stephen; Craighead, Harold G.; Foquet, Mathieu

    2006-05-30

    The present invention is directed to a method of sequencing a target nucleic acid molecule having a plurality of bases. In its principle, the temporal order of base additions during the polymerization reaction is measured on a molecule of nucleic acid, i.e. the activity of a nucleic acid polymerizing enzyme on the template nucleic acid molecule to be sequenced is followed in real time. The sequence is deduced by identifying which base is being incorporated into the growing complementary strand of the target nucleic acid by the catalytic activity of the nucleic acid polymerizing enzyme at each step in the sequence of base additions. A polymerase on the target nucleic acid molecule complex is provided in a position suitable to move along the target nucleic acid molecule and extend the oligonucleotide primer at an active site. A plurality of labelled types of nucleotide analogs are provided proximate to the active site, with each distinguishable type of nucleotide analog being complementary to a different nucleotide in the target nucleic acid sequence. The growing nucleic acid strand is extended by using the polymerase to add a nucleotide analog to the nucleic acid strand at the active site, where the nucleotide analog being added is complementary to the nucleotide of the target nucleic acid at the active site. The nucleotide analog added to the oligonucleotide primer as a result of the polymerizing step is identified. The steps of providing labelled nucleotide analogs, polymerizing the growing nucleic acid strand, and identifying the added nucleotide analog are repeated so that the nucleic acid strand is further extended and the sequence of the target nucleic acid is determined.

  19. Comparison of metagenomic samples using sequence signatures

    PubMed Central

    2012-01-01

    Background: Sequence signatures, as defined by the frequencies of k-tuples (or k-mers, k-grams), have been used extensively to compare genomic sequences of individual organisms, to identify cis-regulatory modules, and to study the evolution of regulatory sequences. Recently many next-generation sequencing (NGS) read data sets of metagenomic samples from a variety of different environments have been generated. The assembly of these reads can be difficult and analysis methods based on mapping reads to genes or pathways are also restricted by the availability and completeness of existing databases. Sequence-signature-based methods, however, do not need the complete genomes or existing databases and thus can potentially be very useful for the comparison of metagenomic samples using NGS read data. Still, the applications of sequence signature methods for the comparison of metagenomic samples have not been well studied. Results: We studied several dissimilarity measures, including d2, d2* and d2S recently developed by our group, a measure (hereinafter noted as Hao) used in CVTree developed by Hao’s group (Qi et al., 2004), measures based on relative di-, tri-, and tetra-nucleotide frequencies as in Willner et al. (2009), as well as standard lp measures between the frequency vectors, for the comparison of metagenomic samples using sequence signatures. We compared their performance using a series of extensive simulations and three real next-generation sequencing (NGS) metagenomic datasets: 39 fecal samples from 33 mammalian host species, 56 marine samples across the world, and 13 fecal samples from human individuals. Results showed that the dissimilarity measure d2S can achieve superior performance when comparing metagenomic samples by clustering them into different groups as well as recovering environmental gradients affecting microbial samples. New insights into the environmental factors affecting microbial compositions in metagenomic samples are obtained through
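
    The simplest of the measures named in this abstract, an lp distance between k-tuple frequency vectors, can be sketched directly from unassembled reads. The code below is an illustration only. The function names are hypothetical, reads are assumed to be plain strings, and k-tuples containing characters outside the alphabet (such as N) are skipped by assumption. The d2, d2* and d2S measures are not reproduced here.

```python
from collections import Counter
from itertools import product


def kmer_frequencies(reads, k, alphabet="ACGT"):
    """Relative frequency of every possible k-tuple across a set of reads."""
    counts = Counter()
    for read in reads:
        read = read.upper()
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            # assumption: k-tuples with ambiguous bases are ignored
            if all(c in alphabet for c in kmer):
                counts[kmer] += 1
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no valid k-tuples found")
    # fixed ordering so that vectors from different samples line up
    return [counts["".join(p)] / total for p in product(alphabet, repeat=k)]


def lp_distance(f, g, p=2.0):
    """Standard l_p distance between two equal-length frequency vectors."""
    return sum(abs(a - b) ** p for a, b in zip(f, g)) ** (1.0 / p)
```

    Because the vectors depend only on k-tuple counts, two samples can be compared without assembling reads or mapping them to a reference database.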

  20. Bovine Parathyroid Hormone: Amino Acid Sequence

    PubMed Central

    Brewer, H. Bryan; Ronan, Rosemary

    1970-01-01

    Bovine parathyroid hormone has been isolated in homogeneous form, and its complete amino acid sequence determined. The bovine hormone is a single chain, 84 amino acids long. It contains amino-terminal alanine, and carboxyl-terminal glutamine. The bovine parathyroid hormone is approximately three times the length of the newly discovered hormone, thyrocalcitonin, whose action is reciprocal to parathyroid hormone. PMID:5275384

  1. Supercomputers and biological sequence comparison algorithms.

    PubMed

    Core, N G; Edmiston, E W; Saltz, J H; Smith, R M

    1989-12-01

    Comparison of biological (DNA or protein) sequences provides insight into molecular structure, function, and homology and is increasingly important as the available databases become larger and more numerous. One method of increasing the speed of the calculations is to perform them in parallel. We present the results of initial investigations using two dynamic programming algorithms on the Intel iPSC hypercube and the Connection Machine as well as an inexpensive, heuristically-based algorithm on the Encore Multimax.
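
    The abstract does not name the two dynamic programming algorithms it benchmarks. As a generic illustration of the kind of computation involved, the sketch below scores a global alignment with a linear gap penalty. The scoring values and the function name are assumptions chosen for the example.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Best global alignment score of a and b by dynamic programming.

    Keeps only two rows of the score table, so memory use is O(len(b)).
    """
    # row 0: aligning an empty prefix of a against prefixes of b
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap] + [0] * len(b)
        for j in range(1, len(b) + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            curr[j] = max(
                prev[j - 1] + sub,  # align a[i-1] with b[j-1]
                prev[j] + gap,      # gap in b
                curr[j - 1] + gap,  # gap in a
            )
        prev = curr
    return prev[-1]
```

    Each cell depends only on its upper, left, and upper-left neighbours, so all cells on one anti-diagonal can be computed independently. That property is what makes this family of algorithms a natural fit for parallel machines.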

  2. Quantitation of HIV-1 RNA viral load using nucleic acid sequence based amplification methodology and comparison with other surrogate markers for disease progression.

    PubMed

    Sitnik, R; Pinho, J R

    1998-01-01

    In this study, HIV-1 viral blood quantitation determined by Nucleic Acid Sequence Based Amplification (NASBA) was compared with other surrogate disease progression markers (antigen p24, CD4/CD8 cell counts and beta-2 microglobulin) in 540 patients followed up in São Paulo, SP, Brazil. HIV-1 RNA detection was statistically associated with the presence of antigen p24, but the viral RNA was also detected in 68% of the antigen p24 negative samples, confirming that NASBA is much more sensitive than the determination of antigen p24. Regarding other surrogate markers, no statistically significant association with the detection of viral RNA was found. The reproducibility of this viral load assay was assessed by 14 runs of the same sample, using different reagent batches. Viral load values in this sample ranged from 5.83 to 6.27 log (CV = 36%), less than the range (0.5 log) established for the determination of significant viral load changes. PMID:9698880

  3. Phenolic acid esterases, coding sequences and methods

    DOEpatents

    Blum, David L.; Kataeva, Irina; Li, Xin-Liang; Ljungdahl, Lars G.

    2002-01-01

    Described herein are four phenolic acid esterases, three of which correspond to domains of previously unknown function within bacterial xylanases, from XynY and XynZ of Clostridium thermocellum and from a xylanase of Ruminococcus. The fourth specifically exemplified xylanase is a protein encoded within the genome of Orpinomyces PC-2. The amino acids of these polypeptides and nucleotide sequences encoding them are provided. Recombinant host cells, expression vectors and methods for the recombinant production of phenolic acid esterases are also provided.

  4. Molecular evolution of herpesviruses: genomic and protein sequence comparisons.

    PubMed Central

    Karlin, S; Mocarski, E S; Schachtel, G A

    1994-01-01

    Phylogenetic reconstruction of herpesvirus evolution is generally founded on amino acid sequence comparisons of specific proteins. These are relevant to the evolution of the specific gene (or set of genes), but the resulting phylogeny may vary depending on the particular sequence chosen for analysis (or comparison). In the first part of this report, we compare 13 herpesvirus genomes by using a new multidimensional methodology based on distance measures and partial orderings of dinucleotide relative abundances. The sequences were analyzed with respect to (i) genomic compositional extremes; (ii) total distances within and between genomes; (iii) partial orderings among genomes relative to a set of sequence standards; (iv) concordance correlations of genome distances; and (v) consistency with the alpha-, beta-, gammaherpesvirus classification. Distance assessments within individual herpesvirus genomes show each to be quite homogeneous relative to the comparisons between genomes. The gammaherpesviruses, Epstein-Barr virus (EBV), herpesvirus saimiri, and bovine herpesvirus 4 are both diverse and separate from other herpesvirus classes, whereas alpha- and betaherpesviruses overlap. The analysis revealed that the most central genome (closest to a consensus herpesvirus genome and most individual herpesvirus sequences of different classes) is that of human herpesvirus 6, suggesting that this genome is closest to a progenitor herpesvirus. The shorter DNA distances among alphaherpesviruses support the hypothesis that the alpha class is of relatively recent ancestry. In our collection, equine herpesvirus 1 (EHV1) stands out as the most central alphaherpesvirus, suggesting it may approximate an ancestral alphaherpesvirus. Among all herpesviruses, the EBV genome is closest to human sequences. In the DNA partial orderings, the chicken sequence collection is invariably as close as or closer to all herpesvirus sequences than the human sequence collection is, which may imply that
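
    The abstract gives no formula for dinucleotide relative abundance or for the distance built on it. The sketch below uses a commonly cited odds-ratio form, rho_xy = f_xy / (f_x * f_y), and averages the absolute differences over the 16 dinucleotides. Whether this matches the authors' exact measure is an assumption. The function names are hypothetical, and for simplicity the counts come from a single strand rather than from the sequence combined with its reverse complement.

```python
from itertools import product

BASES = "ACGT"


def dinucleotide_relative_abundance(seq):
    """Return rho_xy = f_xy / (f_x * f_y) for all 16 dinucleotides."""
    seq = seq.upper()
    mono = {b: 0 for b in BASES}
    di = {x + y: 0 for x, y in product(BASES, repeat=2)}
    for c in seq:
        if c in mono:
            mono[c] += 1
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in di:  # skips pairs containing ambiguous bases
            di[pair] += 1
    n_mono = sum(mono.values())
    n_di = sum(di.values())
    if n_mono == 0 or n_di == 0:
        raise ValueError("sequence too short or contains no valid bases")
    rho = {}
    for pair, count in di.items():
        expected = (mono[pair[0]] / n_mono) * (mono[pair[1]] / n_mono)
        rho[pair] = (count / n_di) / expected if expected > 0 else 0.0
    return rho


def abundance_distance(seq_a, seq_b):
    """Mean absolute difference between two relative-abundance profiles."""
    ra = dinucleotide_relative_abundance(seq_a)
    rb = dinucleotide_relative_abundance(seq_b)
    return sum(abs(ra[p] - rb[p]) for p in ra) / 16
```

    A value of rho_xy near 1 means the dinucleotide occurs about as often as its base composition predicts. Values well above or below 1 indicate over- or under-representation.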

  5. Method for identifying and quantifying nucleic acid sequence aberrations

    DOEpatents

    Lucas, J.N.; Straume, T.; Bogen, K.T.

    1998-07-21

    A method is disclosed for detecting nucleic acid sequence aberrations by detecting nucleic acid sequences having both a first and a second nucleic acid sequence type, the presence of the first and second sequence type on the same nucleic acid sequence indicating the presence of a nucleic acid sequence aberration. The method uses a first hybridization probe which includes a nucleic acid sequence that is complementary to a first sequence type and a first complexing agent capable of attaching to a second complexing agent and a second hybridization probe which includes a nucleic acid sequence that selectively hybridizes to the second nucleic acid sequence type over the first sequence type and includes a detectable marker for detecting the second hybridization probe. 11 figs.

  6. Method for identifying and quantifying nucleic acid sequence aberrations

    DOEpatents

    Lucas, Joe N.; Straume, Tore; Bogen, Kenneth T.

    1998-01-01

    A method for detecting nucleic acid sequence aberrations by detecting nucleic acid sequences having both a first and a second nucleic acid sequence type, the presence of the first and second sequence type on the same nucleic acid sequence indicating the presence of a nucleic acid sequence aberration. The method uses a first hybridization probe which includes a nucleic acid sequence that is complementary to a first sequence type and a first complexing agent capable of attaching to a second complexing agent and a second hybridization probe which includes a nucleic acid sequence that selectively hybridizes to the second nucleic acid sequence type over the first sequence type and includes a detectable marker for detecting the second hybridization probe.

  7. Optimization of short amino acid sequences classifier

    NASA Astrophysics Data System (ADS)

    Barcz, Aleksy; Szymański, Zbigniew

    This article describes processing methods used for the classification of short amino acid sequences. The data processed are 9-symbol string representations of amino acid sequences, divided into 49 data sets, each one containing samples labeled as reacting or not with a given enzyme. The goal of the classification is to determine, for a single enzyme, whether an amino acid sequence would react with it or not. Each data set is processed separately. Feature selection is performed to reduce the number of dimensions for each data set. The method used for feature selection consists of two phases. During the first phase, significant positions are selected using Classification and Regression Trees. Afterwards, symbols appearing at the selected positions are substituted with numeric values of amino acid properties taken from the AAindex database. In the second phase the new set of features is reduced using a correlation-based ranking formula and Gram-Schmidt orthogonalization. Finally, the preprocessed data is used for training LS-SVM classifiers. SPDE, an evolutionary algorithm, is used to obtain optimal hyperparameters for the LS-SVM classifier, such as the error penalty parameter C and kernel-specific hyperparameters. A simple score penalty is used to adapt the SPDE algorithm to the task of selecting classifiers with the best performance measure values.
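
    The substitution step described above (symbols at selected positions replaced by numeric amino acid property values) can be sketched roughly as follows. This is only an illustration: the abstract does not say which AAindex properties or which positions were used, so the Kyte-Doolittle hydropathy scale, the function name `encode_peptide`, and the example positions are all assumptions.

```python
# Kyte-Doolittle hydropathy values, chosen here purely as an example property.
# The abstract does not state which AAindex entries the authors actually used.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def encode_peptide(peptide, positions, scale=KYTE_DOOLITTLE):
    """Map the residues at the selected 0-based positions to property values.

    `positions` stands in for the positions that a first-phase selector
    (e.g. a decision tree) would have marked as significant.
    """
    return [scale[peptide[i]] for i in positions]


# Hypothetical 9-symbol sample with three hypothetical selected positions.
features = encode_peptide("ACDEFGHIK", positions=[0, 4, 8])
print(features)
```

    With several property scales, each selected position would contribute one feature per scale, which is where a second reduction phase such as the ranking and orthogonalization mentioned in the abstract would come in.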

  8. Methods for analyzing nucleic acid sequences

    DOEpatents

    Korlach, Jonas; Webb, Watt W.; Levene, Michael; Turner, Stephen; Craighead, Harold G.; Foquet, Mathieu

    2011-05-17

    The present invention is directed to a method of sequencing a target nucleic acid. The method provides a complex comprising a polymerase enzyme, a target nucleic acid molecule, and a primer, wherein the complex is immobilized on a support. A fluorescent label is attached to a terminal phosphate group of the nucleotide or nucleotide analog. The growing nucleic acid strand is extended by using the polymerase to add a nucleotide analog to the nucleic acid strand. The nucleotide analog added to the oligonucleotide primer as a result of the polymerizing step is identified. The time duration of the signal from labeled nucleotides or nucleotide analogs that become incorporated is distinguished from freely diffusing labels by a longer retention in the observation volume for the nucleotides or nucleotide analogs that become incorporated than for the freely diffusing labels.

  9. Development of an expert system for amino acid sequence identification.

    PubMed

    Hu, L; Saulinskas, E F; Johnson, P; Harrington, P B

    1996-08-01

    An expert system for amino acid sequence identification has been developed. The algorithm uses heuristic rules developed by human experts in protein sequencing. The system is applied to the chromatographic data of phenylthiohydantoin-amino acids acquired from an automated sequencer. The peak intensities in the current cycle are compared with those in the previous cycle, while the calibration and succeeding cycles are used as ancillary identification criteria when necessary. The retention time for each chromatographic peak in each cycle is corrected by the corresponding peak in the calibration cycle in the same run. The main improvement of our system compared with the onboard software used by the Applied Biosystems 477A Protein/Peptide Sequencer is that each peak in each cycle is assigned an identification name according to the corrected retention time to be used for the comparison with different cycles. The system was developed from analyses of ribonuclease A and evaluated by runs of four other protein samples that were not used in rule development. This paper demonstrates that rules developed by human experts can be automatically applied to sequence assignment. The expert system performed more accurately than the onboard software of the protein sequencer, in that the misidentification rates for the expert system were around 7%, whereas those for the onboard software were between 13 and 21%.

  10. Protein sequence comparison and protein evolution

    SciTech Connect

    Pearson, W.R.

    1995-12-31

    This tutorial was one of eight tutorials selected to be presented at the Third International Conference on Intelligent Systems for Molecular Biology which was held in the United Kingdom from July 16 to 19, 1995. This tutorial examines how the information conserved during the evolution of a protein molecule can be used to reliably infer homology, and thus a shared protein fold and possibly a shared active site or function. The authors start by reviewing a geological/evolutionary time scale. Next they look at the evolution of several protein families. During the tutorial, these families will be used to demonstrate that homologous protein ancestry can be inferred with confidence. They also examine different modes of protein evolution and consider some hypotheses that have been presented to explain the very earliest events in protein evolution. The next part of the tutorial will examine the technical aspects of protein sequence comparison. Both optimal and heuristic algorithms and their associated parameters that are used to characterize protein sequence similarities are discussed. Perhaps more importantly, they survey the statistics of local similarity scores, and how these statistics can be used both to improve the selectivity of a search and to evaluate the significance of a match. They then examine distantly related members of three protein families, the serine proteases, the glutathione transferases, and the G-protein-coupled receptors (GCRs). Finally, they discuss how sequence similarity can be used to examine internal repeated or mosaic structures in proteins.

  11. 77 FR 65537 - Requirements for Patent Applications Containing Nucleotide Sequence and/or Amino Acid Sequence...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-10-29

    ... Amino Acid Sequence Disclosures ACTION: Proposed collection; comment request. SUMMARY: The United States....'' SUPPLEMENTARY INFORMATION: I. Abstract Patent applications that contain nucleotide and/or amino acid...

  12. Single-channel studies on linear gramicidins with altered amino acid sequences. A comparison of phenylalanine, tryptophane, and tyrosine substitutions at positions 1 and 11.

    PubMed Central

    Mazet, J L; Andersen, O S; Koeppe, R E

    1984-01-01

    The relation between chemical structure and permeability characteristics of transmembrane channels has been investigated with the linear gramicidins (A, B, and C), where the amino acid at position 1 was chemically replaced by phenylalanine, tryptophane or tyrosine. The purity of most of the compounds was estimated to be greater than 99.99%. The modifications resulted in a wide range of conductance changes in NaCl solutions: sixfold from tryptophane gramicidin A to tyrosine gramicidin B. The conductance changes induced by a given amino acid substitution at position 1 are not the same as at position 11. The only important change in the Na+ affinity was observed when the first amino acid was tyrosine. No major conformational changes of the polypeptide backbone structure could be detected on the basis of experiments with mixtures of different analogues and valine gramicidin A (except possibly with tyrosine at position 1), as all the compounds investigated could form hybrid channels with valine gramicidin A. The side chains are not in direct contact with the permeating ions. The results were therefore interpreted in terms of modifications of the energy profile for ion movement through the channel, possibly due to an electrostatic interaction between the dipoles of the side chains and ions in the channel. PMID:6201199

  13. Computational methods for protein sequence comparison and search.

    PubMed

    Xu, Dong

    2009-04-01

    Protein sequence comparison and search has become commonplace not only for bioinformatics researchers but also for experimentalists in many cases. Because of the exponential growth in sequence data, sequence comparison in particular has become an increasingly important tool. Relating a new gene sequence to other known sequences often reveals its function, structure, and evolution. Many sequence comparison and search tools are available through public Web servers, and biologists can use them easily with little knowledge of computers or bioinformatics. This unit provides some theoretical background and describes popular tools for dot plot, sequence search against a database, multiple sequence alignments, protein tree construction, and protein family and motif search. Step-by-step examples are provided to illustrate how to use some of the most well-known tools. Finally, some general advice is given on combining different sequence analysis tools for biological inference.
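
    Of the techniques listed above, the dot plot is the simplest to illustrate. The following is a minimal sketch, not taken from the cited unit: it marks every pair of positions where two sequences share an identical window. The function names `dot_plot` and `render`, the exact-match criterion, and the example sequence are assumptions made for illustration; practical tools usually score windows with a substitution matrix and a threshold instead.

```python
def dot_plot(seq_a, seq_b, window=3):
    """Return a boolean matrix; cell (i, j) is True when the windows of
    length `window` starting at seq_a[i] and seq_b[j] are identical."""
    rows = len(seq_a) - window + 1
    cols = len(seq_b) - window + 1
    return [
        [seq_a[i:i + window] == seq_b[j:j + window] for j in range(cols)]
        for i in range(rows)
    ]


def render(matrix):
    """Draw the matrix as text: '*' for a match, '.' otherwise."""
    return "\n".join(
        "".join("*" if cell else "." for cell in row) for row in matrix
    )


# Comparing a sequence with itself gives a main diagonal; internal repeats
# would show up as additional off-diagonal runs.
print(render(dot_plot("MKTAYIAKQR", "MKTAYIAKQR", window=3)))
```

    Larger windows suppress chance matches at the cost of missing short or diverged similarities, which is the main parameter trade-off when reading such plots.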

  14. Detection of nucleic acid sequences by invader-directed cleavage

    DOEpatents

    Brow, Mary Ann D.; Hall, Jeff Steven Grotelueschen; Lyamichev, Victor; Olive, David Michael; Prudent, James Robert

    1999-01-01

    The present invention relates to means for the detection and characterization of nucleic acid sequences, as well as variations in nucleic acid sequences. The present invention also relates to methods for forming a nucleic acid cleavage structure on a target sequence and cleaving the nucleic acid cleavage structure in a site-specific manner. The 5' nuclease activity of a variety of enzymes is used to cleave the target-dependent cleavage structure, thereby indicating the presence of specific nucleic acid sequences or specific variations thereof. The present invention further relates to methods and devices for the separation of nucleic acid molecules based on charge.

  15. Performance comparison of Next Generation sequencing platforms.

    PubMed

    Erguner, Bekir; Ustek, Duran; Sagiroglu, Mahmut S

    2015-01-01

    Next Generation DNA Sequencing technologies offer ultra-high sequencing throughput at very low prices. The increase in throughput and diminished costs open up new research areas. Moreover, the number of clinicians utilizing DNA sequencing keeps growing. One of the main concerns for researchers and clinicians who are adopting these platforms is their sequencing accuracy. We compared three of the most commonly used Next Generation Sequencing platforms: Ion Torrent from Life Technologies, GS FLX+ from Roche and HiSeq 2000 from Illumina.

  16. Hybridization and sequencing of nucleic acids using base pair mismatches

    DOEpatents

    Fodor, Stephen P. A.; Lipshutz, Robert J.; Huang, Xiaohua

    2001-01-01

    Devices and techniques for hybridization of nucleic acids and for determining the sequence of nucleic acids. Arrays of nucleic acids are formed by techniques, preferably high resolution, light-directed techniques. Positions of hybridization of a target nucleic acid are determined by, e.g., epifluorescence microscopy. Devices and techniques are proposed to determine the sequence of a target nucleic acid more efficiently and more quickly through such synthesis and detection techniques.

  17. Comparison of next-generation sequencing systems.

    PubMed

    Liu, Lin; Li, Yinhu; Li, Siliang; Hu, Ni; He, Yimin; Pong, Ray; Lin, Danni; Lu, Lihua; Law, Maggie

    2012-01-01

    With the fast development and wide application of next-generation sequencing (NGS) technologies, genomic sequence information is within reach to aid the achievement of goals to decode life mysteries, make better crops, detect pathogens, and improve quality of life. NGS systems are typically represented by SOLiD/Ion Torrent PGM from Life Sciences, Genome Analyzer/HiSeq 2000/MiSeq from Illumina, and GS FLX Titanium/GS Junior from Roche. Beijing Genomics Institute (BGI), which possesses the world's biggest sequencing capacity, has multiple NGS systems including 137 HiSeq 2000, 27 SOLiD, one Ion Torrent PGM, one MiSeq, and one 454 sequencer. We have accumulated extensive experience in sample handling, sequencing, and bioinformatics analysis. In this paper, technologies of these systems are reviewed, and first-hand data from extensive experience is summarized and analyzed to discuss the advantages and specifics associated with each sequencing system. Finally, applications of NGS are summarized.

  18. Predicting intrinsic disorder from amino acid sequence.

    PubMed

    Obradovic, Zoran; Peng, Kang; Vucetic, Slobodan; Radivojac, Predrag; Brown, Celeste J; Dunker, A Keith

    2003-01-01

    Blind predictions of intrinsic order and disorder were made on 42 proteins subsequently revealed to contain 9,044 ordered residues, 284 disordered residues in 26 segments of length 30 residues or less, and 281 disordered residues in 2 disordered segments of length greater than 30 residues. The accuracies of the six predictors used in this experiment ranged from 77% to 91% for the ordered regions and from 56% to 78% for the disordered segments. The average of the order and disorder predictions ranged from 73% to 77%. The prediction of disorder in the shorter segments was poor, from 25% to 66% correct, while the prediction of disorder in the longer segments was better, from 75% to 95% correct. Four of the predictors were composed of ensembles of neural networks. This enabled them to deal more efficiently with the large asymmetry in the training data through diversified sampling from the significantly larger ordered set and achieve better accuracy on ordered and long disordered regions. The exclusive use of long disordered regions for predictor training likely contributed to the disparity of the predictions on long versus short disordered regions, while averaging the output values over 61-residue windows to eliminate short predictions of order or disorder probably contributed to the even greater disparity for three of the predictors. This experiment supports the predictability of intrinsic disorder from amino acid sequence. PMID:14579347
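
    The window averaging mentioned above can be sketched as a centered moving average over per-residue scores. This is an illustrative sketch only: the abstract gives the window length (61 residues) but not how the predictors treated chain termini, so shrinking the window at the ends, the function name `smooth`, and the toy scores are assumptions.

```python
def smooth(scores, window=61):
    """Centered moving average of per-residue scores.

    Near the termini the window is truncated to the residues that exist
    (an assumption; the source does not describe the boundary handling).
    """
    half = window // 2
    smoothed = []
    for i in range(len(scores)):
        lo = max(0, i - half)
        hi = min(len(scores), i + half + 1)
        smoothed.append(sum(scores[lo:hi]) / (hi - lo))
    return smoothed


# Toy example: a single isolated high score is spread out and damped,
# which illustrates why short disordered segments become hard to detect.
raw = [0.0] * 5 + [1.0] + [0.0] * 5
print(smooth(raw, window=5))
```

    A short segment contributes only a small fraction of a 61-residue window, so its averaged score rarely crosses a decision threshold, consistent with the disparity between long and short segments reported above.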

  19. Methods and compositions for efficient nucleic acid sequencing

    DOEpatents

    Drmanac, Radoje

    2002-01-01

    Disclosed are novel methods and compositions for rapid and highly efficient nucleic acid sequencing based upon hybridization with two sets of small oligonucleotide probes of known sequences. Extremely large nucleic acid molecules, including chromosomes and non-amplified RNA, may be sequenced without prior cloning or subcloning steps. The methods of the invention also solve various current problems associated with sequencing technology such as, for example, high noise to signal ratios and difficult discrimination, attaching many nucleic acid fragments to a surface, preparing many, longer or more complex probes and labelling more species.

  20. Methods and compositions for efficient nucleic acid sequencing

    DOEpatents

    Drmanac, Radoje

    2006-07-04

    Disclosed are novel methods and compositions for rapid and highly efficient nucleic acid sequencing based upon hybridization with two sets of small oligonucleotide probes of known sequences. Extremely large nucleic acid molecules, including chromosomes and non-amplified RNA, may be sequenced without prior cloning or subcloning steps. The methods of the invention also solve various current problems associated with sequencing technology such as, for example, high noise to signal ratios and difficult discrimination, attaching many nucleic acid fragments to a surface, preparing many, longer or more complex probes and labelling more species.

  1. Kit for detecting nucleic acid sequences using competitive hybridization probes

    DOEpatents

    Lucas, Joe N.; Straume, Tore; Bogen, Kenneth T.

    2001-01-01

    A kit is provided for detecting a target nucleic acid sequence in a sample, the kit comprising: a first hybridization probe which includes a nucleic acid sequence that is sufficiently complementary to selectively hybridize to a first portion of the target sequence, the first hybridization probe including a first complexing agent for forming a binding pair with a second complexing agent; and a second hybridization probe which includes a nucleic acid sequence that is sufficiently complementary to selectively hybridize to a second portion of the target sequence to which the first hybridization probe does not selectively hybridize, the second hybridization probe including a detectable marker; a third hybridization probe which includes a nucleic acid sequence that is sufficiently complementary to selectively hybridize to a first portion of the target sequence, the third hybridization probe including the same detectable marker as the second hybridization probe; and a fourth hybridization probe which includes a nucleic acid sequence that is sufficiently complementary to selectively hybridize to a second portion of the target sequence to which the third hybridization probe does not selectively hybridize, the fourth hybridization probe including the first complexing agent for forming a binding pair with the second complexing agent; wherein the first and second hybridization probes are capable of simultaneously hybridizing to the target sequence and the third and fourth hybridization probes are capable of simultaneously hybridizing to the target sequence, the detectable marker is not present on the first or fourth hybridization probes and the first, second, third, and fourth hybridization probes each include a competitive nucleic acid sequence which is sufficiently complementary to a third portion of the target sequence that the competitive sequences of the first, second, third, and fourth hybridization probes compete with each other to hybridize to the third portion of the

  2. Geometric Aspects of Biological Sequence Comparison

    PubMed Central

    Stojmirović, Aleksandar

    2009-01-01

    We introduce a geometric framework suitable for studying the relationships among biological sequences. In contrast to previous works, our formulation allows asymmetric distances (quasi-metrics), originating from uneven weighting of strings, which may induce non-trivial partial orders on sets of biosequences. The distances considered are more general than traditional generalized string edit distances. In particular, our framework enables non-trivial conversion between sequence similarities, both local and global, and distances. Our constructions apply to a wide class of scoring schemes and require much less restrictive gap penalties than the ones regularly used. Numerous examples are provided to illustrate the concepts introduced and their potential applications. PMID:19361329

  3. Analysis and Annotation of Nucleic Acid Sequence

    SciTech Connect

    States, David J.

    2004-07-28

    The aims of this project were to develop improved methods for computational genome annotation and to apply these methods to improve the annotation of genomic sequence data with a specific focus on human genome sequencing. The project resulted in a substantial body of published work. Notable contributions of this project were the identification of basecalling and lane tracking as error processes in genome sequencing and contributions to improved methods for these steps in genome sequencing. This technology improved the accuracy and throughput of genome sequence analysis. Probabilistic methods for physical map construction were developed. Improved methods for sequence alignment, alternative splicing analysis, promoter identification and NF kappa B response gene prediction were also developed.

  4. Solid phase sequencing of double-stranded nucleic acids

    DOEpatents

    Fu, Dong-Jing; Cantor, Charles R.; Koster, Hubert; Smith, Cassandra L.

    2002-01-01

    This invention relates to methods for detecting and sequencing of target double-stranded nucleic acid sequences, to nucleic acid probes and arrays of probes useful in these methods, and to kits and systems which contain these probes. Useful methods involve hybridizing the nucleic acids or nucleic acids which represent complementary or homologous sequences of the target to an array of nucleic acid probes. These probes comprise a single-stranded portion, an optional double-stranded portion and a variable sequence within the single-stranded portion. The molecular weights of the hybridized nucleic acids of the set can be determined by mass spectroscopy, and the sequence of the target determined from the molecular weights of the fragments. Nucleic acids whose sequences can be determined include nucleic acids in biological samples such as patient biopsies and environmental samples. Probes may be fixed to a solid support such as a hybridization chip to facilitate automated determination of molecular weights and identification of the target sequence.

  5. Heterogeneity of amino acid sequence in hippopotamus cytochrome c.

    PubMed

    Thompson, R B; Borden, D; Tarr, G E; Margoliash, E

    1978-12-25

    The amino acid sequences of chymotryptic and tryptic peptides of Hippopotamus amphibius cytochrome c were determined by a recent modification of the manual Edman sequential degradation procedure. They were ordered by comparison with the structure of the hog protein. The hippopotamus protein differs in three positions: serine, alanine, and glutamine replace alanine, glutamic acid, and lysine in positions 43, 92, and 100, respectively. Since the artiodactyl suborders diverged in the mid-Eocene some 50 million years ago, the fact that representatives of some of them show no differences in their cytochromes c (cow, sheep, and hog), while another exhibits as many as three such differences, verifies that even in relatively closely related lines of descent the rate at which cytochrome c changes in the course of evolution is not constant. Furthermore, 10.6% of the hippopotamus cytochrome c preparation was shown to contain isoleucine instead of valine at position 3, indicating that one of the four animals from which the protein was obtained was heterozygous in the cytochrome c gene. Such heterogeneity is a necessary condition of evolutionary variation and has not been previously observed in the cytochrome c of a wild mammalian population.

  6. Amino acid sequence of anionic peroxidase from the windmill palm tree Trachycarpus fortunei.

    PubMed

    Baker, Margaret R; Zhao, Hongwei; Sakharov, Ivan Yu; Li, Qing X

    2014-12-10

    Palm peroxidases are extremely stable and have uncommon substrate specificity. This study was designed to fill in the knowledge gap about the structures of a peroxidase from the windmill palm tree Trachycarpus fortunei. The complete amino acid sequence and partial glycosylation were determined by MALDI-top-down sequencing of native windmill palm tree peroxidase (WPTP), MALDI-TOF/TOF MS/MS of WPTP tryptic peptides, and cDNA sequencing. The propeptide of WPTP contained N- and C-terminal signal sequences which contained 21 and 17 amino acid residues, respectively. Mature WPTP was 306 amino acids in length, and its carbohydrate content ranged from 21% to 29%. Comparison to closely related royal palm tree peroxidase revealed structural features that may explain differences in their substrate specificity. The results can be used to guide engineering of WPTP and its novel applications.

  7. From Artificial Amino Acids to Sequence-Defined Targeted Oligoaminoamides.

    PubMed

    Morys, Stephan; Wagner, Ernst; Lächelt, Ulrich

    2016-01-01

    Artificial oligoamino acids with appropriate protecting groups can be used for the sequential assembly of oligoaminoamides on solid-phase. With the help of these oligoamino acids multifunctional nucleic acid (NA) carriers can be designed and produced in highly defined topologies. Here we describe the synthesis of the artificial oligoamino acid Fmoc-Stp(Boc3)-OH, the subsequent assembly into sequence-defined oligomers and the formulation of tumor-targeted plasmid DNA (pDNA) polyplexes. PMID:27436323

  8. Homology of amino acid sequences of rat liver cathepsins B and H with that of papain.

    PubMed Central

    Takio, K; Towatari, T; Katunuma, N; Teller, D C; Titani, K

    1983-01-01

    The amino acid sequences of rat liver lysosomal thiol endopeptidases, cathepsins B and H, are presented and compared with that of the plant thiol protease papain. The 252-residue sequence of cathepsin B and the 220-residue sequence of cathepsin H were determined largely by automated Edman degradation of their intact polypeptide chains and of the two chains of each enzyme generated by limited proteolysis. Subfragments of the chains were produced by enzymatic digestion and by chemical cleavage of methionyl and tryptophanyl bonds. Comparison of the amino acid sequences of cathepsins B and H with each other and with that of papain demonstrates a striking homology among their primary structures. Sequence identity is extremely high in regions which, according to the three-dimensional structure of papain, constitute the catalytic site. The results not only reveal the first structural features of mammalian thiol endopeptidases but also provide insight into the evolutionary relationships among plant and mammalian thiol proteases. PMID:6574504

  9. Intra-species sequence comparisons for annotating genomes

    SciTech Connect

    Boffelli, Dario; Weer, Claire V.; Weng, Li; Lewis, Keith D.; Shoukry, Malak I.; Pachter, Lior; Keys, David N.; Rubin, Edward M.

    2004-07-15

    Analysis of sequence variation among members of a single species offers a potential approach to identify functional DNA elements responsible for biological features unique to that species. Due to its high rate of allelic polymorphism and ease of genetic manipulability, we chose the sea squirt, Ciona intestinalis, to explore intra-species sequence comparisons for genome annotation. A large number of C. intestinalis specimens were collected from four continents and a set of genomic intervals amplified, resequenced and analyzed to determine the mutation rates at each nucleotide in the sequence. We found that regions with low mutation rates efficiently demarcated functionally constrained sequences: these include a set of noncoding elements, which we showed in C. intestinalis transgenic assays to act as tissue-specific enhancers, as well as the location of coding sequences. This illustrates that comparisons of multiple members of a species can be used for genome annotation, suggesting a path for the annotation of the sequenced genomes of organisms occupying uncharacterized phylogenetic branches of the animal kingdom, and raises the possibility that the resequencing of a large number of Homo sapiens individuals might be used to annotate the human genome and identify sequences defining traits unique to our species. The sequence data from this study has been submitted to GenBank under accession nos. AY667278-AY667407.

  10. Segments of amino acid sequence similarity in beta-amylases.

    PubMed

    Friedberg, F; Rhodes, C

    1988-01-01

    In alpha-amylases from animals, plants and bacteria and in beta-amylases from plants and bacteria a number of segments exhibit amino acid sequence similarity specific to the alpha or to the beta type, respectively. In the case of the beta-amylases the similar sequence regions are extensive and they are disrupted only by short interspersed dissimilar regions. Close to the C terminus, however, no such sequence similarity exists. PMID:2464171

  11. TranslatorX: multiple alignment of nucleotide sequences guided by amino acid translations.

    PubMed

    Abascal, Federico; Zardoya, Rafael; Telford, Maximilian J

    2010-07-01

    We present TranslatorX, a web server designed to align protein-coding nucleotide sequences based on their corresponding amino acid translations. Many comparisons between biological sequences (nucleic acids and proteins) involve the construction of multiple alignments. Alignments represent a statement regarding the homology between individual nucleotides or amino acids within homologous genes. As protein-coding DNA sequences evolve as triplets of nucleotides (codons) and it is known that sequence similarity degrades more rapidly at the DNA than at the amino acid level, alignments are generally more accurate when based on amino acids than on their corresponding nucleotides. TranslatorX novelties include: (i) use of all documented genetic codes and the possibility of assigning different genetic codes for each sequence; (ii) a battery of different multiple alignment programs; (iii) translation of ambiguous codons when possible; (iv) an innovative criterion to clean nucleotide alignments with GBlocks based on protein information; and (v) a rich output, including Jalview-powered graphical visualization of the alignments, codon-based alignments coloured according to the corresponding amino acids, measures of compositional bias and first, second and third codon position specific alignments. The TranslatorX server is freely available at http://translatorx.co.uk.
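
    The core idea of aligning nucleotides through their amino acid translations can be sketched as threading codons onto an existing protein alignment: each aligned residue is replaced by its codon and each protein gap by a three-nucleotide gap. The sketch below is not TranslatorX code; the function name `thread_codons` is hypothetical, and it assumes the protein alignment was computed elsewhere and that the coding sequence has no trailing stop codon.

```python
def thread_codons(aligned_protein, nucleotides, gap="-"):
    """Build a codon-level alignment row from an aligned protein row.

    aligned_protein: protein sequence with gap characters, as produced by
                     any multiple alignment program.
    nucleotides:     the ungapped coding sequence for that protein
                     (assumed in frame, without a terminal stop codon).
    """
    if len(nucleotides) % 3 != 0:
        raise ValueError("coding sequence length is not a multiple of 3")
    codons = [nucleotides[i:i + 3] for i in range(0, len(nucleotides), 3)]
    residues = sum(1 for ch in aligned_protein if ch != gap)
    if residues != len(codons):
        raise ValueError("protein and nucleotide sequences do not correspond")
    next_codon = iter(codons)
    # Each residue consumes one codon; each protein gap becomes three gaps.
    return "".join(
        gap * 3 if ch == gap else next(next_codon) for ch in aligned_protein
    )


# Two hypothetical sequences whose protein alignment places a gap in the first.
print(thread_codons("M-K", "ATGAAA"))
print(thread_codons("MTK", "ATGACCAAA"))
```

    Because gaps are only ever inserted in multiples of three, the reading frame is preserved in the nucleotide alignment, which is the property that makes the result usable for codon-position-specific analyses.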

  12. Comparison of mitochondrial genome sequences of pangolins (Mammalia, Pholidota).

    PubMed

    Hassanin, Alexandre; Hugot, Jean-Pierre; van Vuuren, Bettine Jansen

    2015-04-01

    The complete mitochondrial genome was sequenced for three species of pangolins, Manis javanica, Phataginus tricuspis, and Smutsia temminckii, and comparisons were made with two other species, Manis pentadactyla and Phataginus tetradactyla. The genome of Manidae contains the 37 genes found in a typical mammalian genome, and the structure of the control region is highly conserved among species. In Manis, the overall base composition differs from that found in African genera. Phylogenetic analyses support the monophyly of the genera Manis, Phataginus, and Smutsia, as well as the basal division between Maninae and Smutsiinae. Comparisons with GenBank sequences reveal that the reference genomes of M. pentadactyla and P. tetradactyla (accession numbers NC_016008 and NC_004027) were sequenced from misidentified taxa, and that a new species of tree pangolin should be described in Gabon. PMID:25746396

  14. Nucleotide sequence of a cloned duck hepatitis B virus genome: comparison with woodchuck and human hepatitis B virus sequences.

    PubMed Central

    Mandart, E; Kay, A; Galibert, F

    1984-01-01

    The nucleotide sequence of an EcoRI duck hepatitis B virus (DHBV) clone was elucidated by using the Maxam and Gilbert method. This sequence, which is 3,021 nucleotides long, was compared with the two previously analyzed hepatitis B-like viruses (human and woodchuck). From this comparison, it was shown that DHBV is derived from an ancestor common to the two others but has a slightly different genomic organization. There was no intergenic region between genes 5 and 8, which were fused into a single open reading frame in DHBV. Genes for the surface and core proteins were assigned to open reading frames 7 and 5/8. Amino acid comparisons showed some structural relationship between gene 6 product and avian reverse transcriptase, suggesting either evolution from a common ancestor or convergence to some particular structure to fulfill a specific function. This should be correlated with the synthesis of an RNA intermediate during DNA replication. This is also taken as an argument in favor of the hypothesis that gene 6 codes for the DNA polymerase that is found within the virion. DNA sequence comparison also showed that the two mammalian hepatitis B viruses are more homologous to each other than they are to DHBV, indicating that DHBV started to evolve on its own earlier than the two other viruses, as did birds compared with mammals. From this it is proposed that the viruses evolved in a fashion parallel to the species they infect. PMID:6699938

  15. Complete cDNA and derived amino acid sequence of human factor V.

    PubMed Central

    Jenny, R J; Pittman, D D; Toole, J J; Kriz, R W; Aldape, R A; Hewick, R M; Kaufman, R J; Mann, K G

    1987-01-01

    cDNA clones encoding human factor V have been isolated from an oligo(dT)-primed human fetal liver cDNA library prepared with vector Charon 21A. The cDNA sequence of factor V from three overlapping clones includes a 6672-base-pair (bp) coding region, a 90-bp 5' untranslated region, and a 163-bp 3' untranslated region within which is a poly(A) tail. The deduced amino acid sequence consists of 2224 amino acids inclusive of a 28-amino acid leader peptide. Direct comparison with human factor VIII reveals considerable homology between proteins in amino acid sequence and domain structure: a triplicated A domain and duplicated C domain show approximately 40% identity with the corresponding domains in factor VIII. As in factor VIII, the A domains of factor V share approximately 40% amino acid-sequence homology with the three highly conserved domains in ceruloplasmin. The B domain of factor V contains 35 tandem and approximately 9 additional semiconserved repeats of nine amino acids of the form Asp-Leu-Ser-Gln-Thr-Thr/Asn-Leu-Ser-Pro and 2 additional semiconserved repeats of 17 amino acids. Factor V contains 37 potential N-linked glycosylation sites, 25 of which are in the B domain, and a total of 19 cysteine residues. PMID:3110773

  16. Complete cDNA and derived amino acid sequence of human factor V

    SciTech Connect

    Jenny, R.J.; Pittman, D.D.; Toole, J.J.; Kriz, R.W.; Aldape, R.A.; Hewick, R.M.; Kaufman, R.J.; Mann, K.G.

    1987-07-01

    cDNA clones encoding human factor V have been isolated from an oligo(dT)-primed human fetal liver cDNA library prepared with vector Charon 21A. The cDNA sequence of factor V from three overlapping clones includes a 6672-base-pair (bp) coding region, a 90-bp 5' untranslated region, and a 163-bp 3' untranslated region within which is a poly(A) tail. The deduced amino acid sequence consists of 2224 amino acids inclusive of a 28-amino acid leader peptide. Direct comparison with human factor VIII reveals considerable homology between proteins in amino acid sequence and domain structure: a triplicated A domain and duplicated C domain show approx. 40% identity with the corresponding domains in factor VIII. As in factor VIII, the A domains of factor V share approx. 40% amino acid-sequence homology with the three highly conserved domains in ceruloplasmin. The B domain of factor V contains 35 tandem and approx. 9 additional semiconserved repeats of nine amino acids of the form Asp-Leu-Ser-Gln-Thr-Thr/Asn-Leu-Ser-Pro and 2 additional semiconserved repeats of 17 amino acids. Factor V contains 37 potential N-linked glycosylation sites, 25 of which are in the B domain, and a total of 19 cysteine residues.

  17. In silico comparative analysis of DNA and amino acid sequences for prion protein gene.

    PubMed

    Kim, Y; Lee, J; Lee, C

    2008-01-01

    Genetic variability might contribute to species specificity of prion diseases in various organisms. In this study, structures of the prion protein gene (PRNP) and its amino acids were compared among species for which sequence data were available. Comparisons of PRNP DNA sequences among 12 species including human, chimpanzee, monkey, bovine, ovine, dog, mouse, rat, wallaby, opossum, chicken and zebrafish allowed us to identify candidate regulatory regions in intron 1 and the 3'-untranslated region (UTR) in addition to the coding region. Highly conserved putative binding sites for transcription factors, such as heat shock factor 2 (HSF2) and myocyte enhancer factor 2 (MEF2), were discovered in intron 1. In the 3'-UTR, the functional sequence (ATTAAA) for nucleus-specific polyadenylation was found in all the analysed species. The functional sequence (TTTTTAT) for maturation-specific polyadenylation was observed intact only in ovine, with one or two nucleotide mismatches in the other species. A comparison of the amino acid sequences in 53 species revealed a large sequence identity. In particular, the octapeptide repeat region was observed in all the species except frog and zebrafish. Functional changes and susceptibility to prion diseases with various isoforms of prion protein could be caused by numeric variability and conformational changes discovered in the repeat sequences.

  18. Beyond Linear Sequence Comparisons: The use of genome-level characters for phylogenetic reconstruction

    SciTech Connect

    Boore, Jeffrey L.

    2004-11-27

    Although the phylogenetic relationships of many organisms have been convincingly resolved by the comparisons of nucleotide or amino acid sequences, others have remained equivocal despite great effort. Now that large-scale genome sequencing projects are sampling many lineages, it is becoming feasible to compare large data sets of genome-level features and to develop this as a tool for phylogenetic reconstruction that has advantages over conventional sequence comparisons. Although it is unlikely that these will address a large number of evolutionary branch points across the broad tree of life due to the infeasibility of such sampling, they have great potential for convincingly resolving many critical, contested relationships for which no other data seems promising. However, it is important that we recognize potential pitfalls, establish reasonable standards for acceptance, and employ rigorous methodology to guard against a return to earlier days of scenario-driven evolutionary reconstructions.

  19. 3D representations of amino acids—applications to protein sequence comparison and classification

    PubMed Central

    Li, Jie; Koehl, Patrice

    2014-01-01

    The amino acid sequence of a protein is the key to understanding its structure and ultimately its function in the cell. This paper addresses the fundamental issue of encoding amino acids in ways that the representation of such a protein sequence facilitates the decoding of its information content. We show that a feature-based representation in a three-dimensional (3D) space derived from amino acid substitution matrices provides an adequate representation that can be used for direct comparison of protein sequences based on geometry. We measure the performance of such a representation in the context of the protein structural fold prediction problem. We compare the results of classifying different sets of proteins belonging to distinct structural folds against classifications of the same proteins obtained from sequence alone or directly from structural information. We find that sequence alone performs poorly as a structure classifier. We show in contrast that the use of the three dimensional representation of the sequences significantly improves the classification accuracy. We conclude with a discussion of the current limitations of such a representation and with a description of potential improvements. PMID:25379143

  20. Amino acid and cDNA sequences of lysozyme from Hyalophora cecropia

    PubMed Central

    Engström, Å.; Xanthopoulos, K. G.; Boman, H. G.; Bennich, H.

    1985-01-01

    The amino acid and cDNA sequences of lysozyme from the giant silk moth Hyalophora cecropia have been determined. This enzyme is one of several immune proteins produced by the diapausing pupae after injection of bacteria. Cecropia lysozyme is composed of 120 amino acids, has a mol. wt. of 13.8 kd and shows great similarity with vertebrate lysozymes of the chicken type. The amino acid residues responsible for the catalytic activity and for the binding of substrate are essentially conserved. Three allelic variants of the Cecropia enzyme are identified. A comparison of the chicken and the Cecropia lysozymes shows that there is a 40% identity at both the amino acid and the nucleotide level. Some evolutionary aspects of the sequence data are discussed. PMID:16453632

  1. Sequence information signal processor for local and global string comparisons

    DOEpatents

    Peterson, John C.; Chow, Edward T.; Waterman, Michael S.; Hunkapillar, Timothy J.

    1997-01-01

    A sequence information signal processing integrated circuit chip is designed to perform high speed calculation of a dynamic programming algorithm based upon the algorithm defined by Waterman and Smith. The signal processing chip of the present invention is designed to be a building block of a linear systolic array, the performance of which can be increased by connecting additional sequence information signal processing chips to the array. The chip provides a high speed, low cost linear array processor that can locate highly similar global sequences or segments thereof such as contiguous subsequences from two different DNA or protein sequences. The chip is implemented in a preferred embodiment using CMOS VLSI technology to provide the equivalent of about 400,000 transistors or 100,000 gates. Each chip provides 16 processing elements, and is designed to provide 16 bit, two's complement operation for maximum score precision of between -32,768 and +32,767. It is designed to provide a comparison between sequences as long as 4,194,304 elements without external software and between sequences of unlimited numbers of elements with the aid of external software. Each sequence can be assigned different deletion and insertion weight functions. Each processor is provided with a similarity measure device which is independently variable. Thus, each processor can contribute to maximum value score calculation using a different similarity measure.
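
    As a software illustration of the kind of dynamic programming recurrence the abstract refers to, the sketch below computes a local alignment score in the style of Smith and Waterman, keeping every cell within the 16-bit two's complement range quoted above. It is only a sketch under stated assumptions: the scoring values (`match`, `mismatch`, `gap`), the single linear gap penalty, and the saturating `clamp16` helper are illustrative choices, not details taken from the patent, which allows separate deletion and insertion weight functions and does not say here how overflow is handled.

```python
INT16_MIN, INT16_MAX = -32768, 32767


def clamp16(value: int) -> int:
    """Saturate a score to the 16-bit two's complement range (assumed behaviour)."""
    return max(INT16_MIN, min(INT16_MAX, value))


def local_alignment_score(a: str, b: str,
                          match: int = 2, mismatch: int = -1, gap: int = -2) -> int:
    """Return the best local alignment score between sequences a and b.

    Only two rows of the dynamic programming matrix are kept, since each
    cell depends on its left, upper, and upper-left neighbours only.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for ca in a:
        curr = [0]
        for j, cb in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ca == cb else mismatch)
            up = prev[j] + gap          # gap in b
            left = curr[j - 1] + gap    # gap in a
            score = clamp16(max(0, diag, up, left))
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best


if __name__ == "__main__":
    print(local_alignment_score("XXACGTXX", "YYACGTYY"))
```

    Cells on the same anti-diagonal of the matrix do not depend on one another, which is the property a linear systolic array of processing elements can exploit to evaluate them in parallel.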

  2. Amino acid sequences of proteins from Leptospira serovar pomona.

    PubMed

    Alves, S F; Lefebvre, R B; Probert, W

    2000-01-01

    This report describes partial amino acid sequences from three putative outer envelope proteins from Leptospira serovar pomona. In order to obtain internal fragments for protein sequencing, enzymatic and chemical digestion was performed. The enzyme clostripain was used to digest the 32 and 45 kDa proteins. In situ digestion of the 40 kDa protein was accomplished using cyanogen bromide. The 32 kDa protein generated two fragments, one of 21 kDa and another of 10 kDa that yielded five residues. A fragment of 24 kDa that yielded nineteen amino acid residues was obtained from the 45 kDa protein. A fragment with a molecular weight of 20 kDa, yielding a sequence of twenty amino acids, was obtained from the 40 kDa protein.

  3. The amino acid sequence of Staphylococcus aureus penicillinase.

    PubMed Central

    Ambler, R P

    1975-01-01

    The amino acid sequence of the penicillinase (penicillin amido-beta-lactamhydrolase, EC 3.5.2.6) from Staphylococcus aureus strain PC1 was determined. The protein consists of a single polypeptide chain of 257 residues, and the sequence was determined by characterization of tryptic, chymotryptic, peptic and CNBr peptides, with some additional evidence from thermolysin and S. aureus proteinase peptides. A mistake in the preliminary report of the sequence is corrected; residues 113-116 are now thought to be -Lys-Lys-Val-Lys- rather than -Lys-Val-Lys-Lys-. Detailed evidence for the amino acid sequence has been deposited as Supplementary Publication SUP 50056 (91 pages) at the British Library (Lending Division), Boston Spa, Wetherby, West Yorkshire LS23 7BQ, U.K., from whom copies may be obtained on the terms given in Biochem. J. (1975) 145, 5. PMID:1218078

  4. Nucleotide sequence of dengue 2 RNA and comparison of the encoded proteins with those of other flaviviruses.

    PubMed

    Hahn, Y S; Galler, R; Hunkapiller, T; Dalrymple, J M; Strauss, J H; Strauss, E G

    1988-01-01

    We have determined the complete sequence of the RNA of dengue 2 virus (S1 candidate vaccine strain derived from the PR-159 isolate) with the exception of about 15 nucleotides at the 5' end. The genome organization is the same as that deduced earlier for other flaviviruses and the amino acid sequences of the encoded dengue 2 proteins show striking homology to those of other flaviviruses. The overall amino acid sequence similarity between dengue 2 and yellow fever virus is 44.7%, whereas that between dengue 2 and West Nile virus is 50.7%. These viruses represent three different serological subgroups of mosquito-borne flaviviruses. Comparison of the amino acid sequences shows that amino acid sequence homology is not uniformly distributed among the proteins; highest homology is found in some domains of nonstructural protein NS5 and lowest homology in the hydrophobic polypeptides ns2a and 2b. In general the structural proteins are less well conserved than the nonstructural proteins. Hydrophobicity profiles, however, are remarkably similar throughout the translated region. Comparison of the dengue 2 PR-159 sequence to partial sequence data from dengue 4 and another strain of dengue 2 virus reveals amino acid sequence homologies of about 64 and 96%, respectively, in the structural protein region. Thus as a general rule for flaviviruses examined to date, members of different serological subgroups demonstrate 50% or less amino acid sequence homology, members of the same subgroup average 65-75% homology, and strains of the same virus demonstrate greater than 95% amino acid sequence similarity.

  6. Prebiotically plausible mechanisms increase compositional diversity of nucleic acid sequences

    PubMed Central

    Derr, Julien; Manapat, Michael L.; Rajamani, Sudha; Leu, Kevin; Xulvi-Brunet, Ramon; Joseph, Isaac; Nowak, Martin A.; Chen, Irene A.

    2012-01-01

    During the origin of life, the biological information of nucleic acid polymers must have increased to encode functional molecules (the RNA world). Ribozymes tend to be compositionally unbiased, as is the vast majority of possible sequence space. However, ribonucleotides vary greatly in synthetic yield, reactivity and degradation rate, and their non-enzymatic polymerization results in compositionally biased sequences. While natural selection could lead to complex sequences, molecules with some activity are required to begin this process. Was the emergence of compositionally diverse sequences a matter of chance, or could prebiotically plausible reactions counter chemical biases to increase the probability of finding a ribozyme? Our in silico simulations using a two-letter alphabet show that template-directed ligation and high concatenation rates counter compositional bias and shift the pool toward longer sequences, permitting greater exploration of sequence space and stable folding. We verified experimentally that unbiased DNA sequences are more efficient templates for ligation, thus increasing the compositional diversity of the pool. Our work suggests that prebiotically plausible chemical mechanisms of nucleic acid polymerization and ligation could predispose toward a diverse pool of longer, potentially structured molecules. Such mechanisms could have set the stage for the appearance of functional activity very early in the emergence of life. PMID:22319215

  7. The amino-acid sequence of kangaroo pancreatic ribonuclease.

    PubMed

    Gaastra, W; Welling, G W; Beintema, J J

    1978-05-01

    Red kangaroo (Macropus rufus) ribonuclease was isolated from pancreatic tissue by affinity chromatography. The amino acid sequence was determined by automatic sequencing of overlapping large fragments and by analysis of shorter peptides obtained by digestion with a number of proteolytic enzymes. The polypeptide chain consists of 122 amino acid residues. Compared to other ribonucleases, the N-terminal residue and residue 114 are deleted. In other pancreatic ribonucleases position 114 is occupied by a cis proline residue in an external loop at the surface of the molecule. Other remarkable substitutions are the presence of a tyrosine residue at position 123 instead of a serine which forms a hydrogen bond with the pyrimidine ring of a nucleotide substrate, and a number of hydrophobic-hydrophilic interchanges in the sequence 51-55, which forms part of an alpha-helix in bovine ribonuclease and exhibits few substitutions in the placental mammals. Kangaroo ribonuclease contains no carbohydrate, although the enzyme possesses a recognition site for carbohydrate attachment in the sequence Asn-Val-Thr (62-64). The enzyme differs at about 35-40% of the positions from all other mammalian pancreatic ribonucleases sequenced to date, which is in agreement with the early divergence between the marsupials and the placental mammals. From fragmentary data a tentative sequence of red-necked wallaby (Macropus rufogriseus) pancreatic ribonuclease has been derived. Eight differences with the kangaroo sequence were found. PMID:658039

  9. The amino acid sequence of Lady Amherst's pheasant (Chrysolophus amherstiae) and golden pheasant (Chrysolophus pictus) egg-white lysozymes.

    PubMed

    Araki, T; Kuramoto, M; Torikata, T

    1990-09-01

    The amino acid sequences of Lady Amherst's pheasant and golden pheasant egg-white lysozymes have been determined. The carboxymethylated lysozymes were digested with trypsin, followed by sequencing of the tryptic peptides. Lady Amherst's pheasant lysozyme proved to consist of 129 amino acid residues, and a relative molecular mass of 14,423 Da was calculated. This lysozyme had 6 amino acid substitutions when compared with hen egg-white lysozyme: Phe3 to Tyr, His15 to Leu, Gln41 to His, Asn77 to His, Gln121 to Asn, and a newly found substitution of Ile124 to Thr. The amino acid sequence of golden pheasant lysozyme was identical to that of Lady Amherst's pheasant lysozyme. The phylogenetic tree constructed by the comparison of amino acid sequences of phasianoid bird lysozymes revealed a minimum genetic distance between these pheasants and the turkey-peafowl group. PMID:1368578

  10. Spaced words and kmacs: fast alignment-free sequence comparison based on inexact word matches.

    PubMed

    Horwege, Sebastian; Lindner, Sebastian; Boden, Marcus; Hatje, Klas; Kollmar, Martin; Leimeister, Chris-André; Morgenstern, Burkhard

    2014-07-01

    In this article, we present a user-friendly web interface for two alignment-free sequence-comparison methods that we recently developed. Most alignment-free methods rely on exact word matches to estimate pairwise similarities or distances between the input sequences. By contrast, our new algorithms are based on inexact word matches. The first of these approaches uses the relative frequencies of so-called spaced words in the input sequences, i.e. words containing 'don't care' or 'wildcard' symbols at certain pre-defined positions. Various distance measures can then be defined on sequences based on their different spaced-word composition. Our second approach defines the distance between two sequences by estimating for each position in the first sequence the length of the longest substring at this position that also occurs in the second sequence with up to k mismatches. Both approaches take a set of deoxyribonucleic acid (DNA) or protein sequences as input and return a matrix of pairwise distance values that can be used as a starting point for clustering algorithms or distance-based phylogeny reconstruction. The two alignment-free programmes are accessible through a web interface at 'Göttingen Bioinformatics Compute Server (GOBICS)': http://spaced.gobics.de and http://kmacs.gobics.de, and the source codes can be downloaded.
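
    The first approach described above, comparing sequences by the relative frequencies of spaced words, can be sketched roughly as follows. This is an illustrative sketch rather than the authors' implementation: the pattern notation ('1' for a match position, '0' for a don't-care position), the function names, and the choice of Euclidean distance are all assumptions; the abstract only says that various distance measures can be defined.

```python
from collections import Counter
from math import sqrt


def spaced_word_counts(seq: str, pattern: str) -> Counter:
    """Count spaced words in seq; '0' positions in pattern become wildcards ('*')."""
    counts = Counter()
    for i in range(len(seq) - len(pattern) + 1):
        window = seq[i:i + len(pattern)]
        word = "".join(c if p == "1" else "*" for c, p in zip(window, pattern))
        counts[word] += 1
    return counts


def relative_frequencies(counts: Counter) -> dict:
    """Normalise counts so that the frequencies sum to 1 (empty input gives {})."""
    total = sum(counts.values())
    return {w: n / total for w, n in counts.items()} if total else {}


def spaced_word_distance(seq_a: str, seq_b: str, pattern: str) -> float:
    """Euclidean distance between the spaced-word frequency vectors of two sequences."""
    fa = relative_frequencies(spaced_word_counts(seq_a, pattern))
    fb = relative_frequencies(spaced_word_counts(seq_b, pattern))
    words = set(fa) | set(fb)
    return sqrt(sum((fa.get(w, 0.0) - fb.get(w, 0.0)) ** 2 for w in words))


if __name__ == "__main__":
    print(spaced_word_counts("ACGTACGT", "101"))
    print(spaced_word_distance("ACGTACGT", "ACGTTCGT", "101"))
```

    Applying `spaced_word_distance` to every pair in a set of sequences yields the kind of pairwise distance matrix that the abstract mentions as input for clustering or distance-based phylogeny reconstruction.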

  11. AcalPred: a sequence-based tool for discriminating between acidic and alkaline enzymes.

    PubMed

    Lin, Hao; Chen, Wei; Ding, Hui

    2013-01-01

    The structure and activity of enzymes are influenced by the pH value of their surroundings. Although many enzymes work well in the pH range from 6 to 8, some specific enzymes have good efficiencies only in acidic (pH<5) or alkaline (pH>9) solution. Studies have demonstrated that the activities of enzymes correlate with their primary sequences. Judging an enzyme's adaptation to an acidic or alkaline environment from its amino acid sequence is crucial for clarifying molecular mechanisms and for designing highly efficient enzymes. In this study, we developed a sequence-based method to discriminate acidic enzymes from alkaline enzymes. Analysis of variance was used to choose the optimized discriminating features derived from g-gap dipeptide compositions, and a support vector machine was used to establish the prediction model. In the rigorous jackknife cross-validation, an overall accuracy of 96.7% was achieved. The method can correctly predict 96.3% of acidic and 97.1% of alkaline enzymes. Comparison with previous methods demonstrates that the proposed method is more accurate. On the basis of this proposed method, we have built an online web-server called AcalPred which can be freely accessed from the website (http://lin.uestc.edu.cn/server/AcalPred). We believe that AcalPred will become a powerful tool to study enzyme adaptation to acidic or alkaline environments.

  12. AcalPred: A Sequence-Based Tool for Discriminating between Acidic and Alkaline Enzymes

    PubMed Central

    Lin, Hao; Chen, Wei; Ding, Hui

    2013-01-01

    The structure and activity of enzymes are influenced by the pH value of their surroundings. Although many enzymes work well in the pH range from 6 to 8, some specific enzymes have good efficiencies only in acidic (pH<5) or alkaline (pH>9) solution. Studies have demonstrated that the activities of enzymes correlate with their primary sequences. Judging an enzyme's adaptation to an acidic or alkaline environment from its amino acid sequence is crucial for clarifying molecular mechanisms and for designing highly efficient enzymes. In this study, we developed a sequence-based method to discriminate acidic enzymes from alkaline enzymes. Analysis of variance was used to choose the optimized discriminating features derived from g-gap dipeptide compositions, and a support vector machine was used to establish the prediction model. In the rigorous jackknife cross-validation, an overall accuracy of 96.7% was achieved. The method can correctly predict 96.3% of acidic and 97.1% of alkaline enzymes. Comparison with previous methods demonstrates that the proposed method is more accurate. On the basis of this proposed method, we have built an online web-server called AcalPred which can be freely accessed from the website (http://lin.uestc.edu.cn/server/AcalPred). We believe that AcalPred will become a powerful tool to study enzyme adaptation to acidic or alkaline environments. PMID:24130738
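
    The abstract does not define the g-gap dipeptide composition, so the sketch below relies on a common reading of the term: the frequency of each ordered pair of residues separated by g intervening positions, giving a 400-dimensional vector over the 20 standard amino acids. The function name and the normalisation by the number of residue pairs are assumptions, not details confirmed by the source, and the feature selection and support vector machine steps are not shown.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def g_gap_dipeptide_composition(seq: str, g: int) -> dict:
    """Return the frequency of every residue pair separated by g positions.

    g = 0 corresponds to adjacent dipeptides. Pairs containing
    non-standard symbols are skipped but still count toward the total.
    """
    n_pairs = len(seq) - g - 1
    if g < 0 or n_pairs <= 0:
        raise ValueError("sequence too short for the requested gap")

    counts = {a + b: 0 for a, b in product(AMINO_ACIDS, repeat=2)}
    for i in range(n_pairs):
        pair = seq[i] + seq[i + g + 1]
        if pair in counts:
            counts[pair] += 1
    return {pair: n / n_pairs for pair, n in counts.items()}


if __name__ == "__main__":
    composition = g_gap_dipeptide_composition("ACDA", g=1)
    print({pair: f for pair, f in composition.items() if f > 0})
```

    A vector of this kind, computed for one or several values of g, is the sort of fixed-length feature representation that can be passed to a classifier regardless of protein length.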

  13. Nucleotide sequence of Crithidia fasciculata cytosol 5S ribosomal ribonucleic acid.

    PubMed

    MacKay, R M; Gray, M W; Doolittle, W F

    1980-11-11

    The complete nucleotide sequence of the cytosol 5S ribosomal ribonucleic acid of the trypanosomatid protozoan Crithidia fasciculata has been determined by a combination of T1-oligonucleotide catalog and gel sequencing techniques. The sequence is: GAGUACGACCAUACUUGAGUGAAAACACCAUAUCCCGUCCGAUUUGUGAAGUUAAGCACC CACAGGCUUAGUUAGUACUGAGGUCAGUGAUGACUCGGGAACCCUGAGUGCCGUACUCCCOH. This 5S ribosomal RNA is unique in having GAUU in place of the GAAC or GAUC found in all other prokaryotic and eukaryotic 5S RNAs, and thought to be involved in interactions with tRNAs. Comparisons to other eukaryotic cytosol 5S ribosomal RNA sequences indicate that the four major eukaryotic kingdoms (animals, plants, fungi, and protists) are about equally remote from each other, and that the latter kingdom may be the most internally diverse.

  14. Evaluation of integrated anaerobic/aerobic fixed-bed sequencing batch biofilm reactor for decolorization and biodegradation of azo dye acid red 18: comparison of using two types of packing media.

    PubMed

    Hosseini Koupaie, E; Alavi Moghaddam, M R; Hashemi, S H

    2013-01-01

    Two integrated anaerobic/aerobic fixed-bed sequencing batch biofilm reactors (FB-SBBRs) were operated to evaluate decolorization and biodegradation of the azo dye Acid Red 18 (AR18). Volcanic pumice stones and a type of plastic media made of polyethylene were used as packing media in FB-SBBR1 and FB-SBBR2, respectively. Decolorization of AR18 in both reactors followed first-order kinetics with respect to dye concentration. More than 63.7% and 71.3% of anaerobically formed 1-naphthylamine-4-sulfonate (1N-4S), one of the main sulfonated aromatic constituents of AR18, was removed during the aerobic reaction phase in FB-SBBR1 and FB-SBBR2, respectively. Based on statistical analysis, the performance of FB-SBBR2 in terms of COD removal as well as biodegradation of 1N-4S was significantly higher than that of FB-SBBR1. Spherical and rod-shaped bacteria were the dominant species of bacteria in the biofilm grown on the pumice stone surfaces, while the biofilm grown on surfaces of the polyethylene media had a fluffy structure.

  15. A statistical physics perspective on alignment-independent protein sequence comparison

    PubMed Central

    Chattopadhyay, Amit K.; Nasiev, Diar; Flower, Darren R.

    2015-01-01

    Motivation: Within bioinformatics, the textual alignment of amino acid sequences has long dominated the determination of similarity between proteins, with all that implies for shared structure, function and evolutionary descent. Despite the relative success of modern-day sequence alignment algorithms, so-called alignment-free approaches offer a complementary means of determining and expressing similarity, with potential benefits in certain key applications, such as regression analysis of protein structure-function studies, where alignment-based similarity has performed poorly. Results: Here, we offer a fresh, statistical physics-based perspective focusing on the question of alignment-free comparison, in the process adapting results from ‘first passage probability distribution’ to summarize statistics of ensemble averaged amino acid propensity values. In this article, we introduce and elaborate this approach. Contact: d.r.flower@aston.ac.uk PMID:25810434

  16. The genome of RNA tumor viruses contains polyadenylic acid sequences.

    PubMed

    Green, M; Cartas, M

    1972-04-01

    The 70S genome of two RNA tumor viruses, murine sarcoma virus and avian myeloblastosis virus, binds to Millipore filters in buffer with high salt concentration and to glass fiber filters containing poly(U). These observations suggest that 70S RNA contains adenylic acid-rich sequences. When digested by pancreatic RNase, 70S RNA of murine sarcoma virus yielded poly(A) sequences that contain 91% adenylic acid. These poly(A) sequences sedimented as a relatively homogenous peak in sucrose gradients with a sedimentation coefficient of 4-5 S, but had a mobility during polyacrylamide gel electrophoresis that corresponds to molecules that sediment at 6-7 S. If we estimate a molecular weight for each sequence of 30,000-60,000 (100-200 nucleotides) and a molecular weight for viral 70S RNA of 3-12 million, each viral genome could contain 1-8 poly(A) sequences. Possible functions of poly(A) in the infecting viral RNA may include a role in the initiation of viral DNA or RNA synthesis, in protein maturation, or in the assembly of the viral genome.

  17. Protein sequence comparison based on K-string dictionary.

    PubMed

    Yu, Chenglong; He, Rong L; Yau, Stephen S-T

    2013-10-25

    The current K-string-based protein sequence comparisons require large amounts of computer memory because the dimension of the protein vector representation grows exponentially with K. In this paper, we propose a novel concept, the "K-string dictionary", to solve this high-dimensional problem. It allows us to use a much lower dimensional K-string-based frequency or probability vector to represent a protein, and thus significantly reduce the computer memory requirements for their implementation. Furthermore, based on this new concept, we use Singular Value Decomposition to analyze real protein datasets, and the improved protein vector representation allows us to obtain accurate gene trees.
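
    The memory problem described above can be illustrated with a minimal sketch. It is not the authors' K-string dictionary construction, which the abstract does not spell out; it only assumes that the "dictionary" is the set of K-strings actually observed in the dataset, so that each protein is represented over that set rather than over all 20^K possible strings. The function names and the toy sequences are hypothetical.

```python
from collections import Counter


def k_strings(seq, k):
    """Return all overlapping length-k substrings of seq."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def build_dictionary(seqs, k):
    """Sorted list of the K-strings that actually occur in the dataset (assumed meaning of 'dictionary')."""
    return sorted({w for s in seqs for w in k_strings(s, k)})


def frequency_vector(seq, dictionary, k):
    """Relative frequency of each dictionary K-string in seq."""
    counts = Counter(k_strings(seq, k))
    total = sum(counts.values())
    if total == 0:  # sequence shorter than k
        return [0.0] * len(dictionary)
    return [counts[w] / total for w in dictionary]


proteins = ["MKVLA", "MKVLG", "AKVLA"]  # toy sequences, not real data
k = 3
dictionary = build_dictionary(proteins, k)
vectors = [frequency_vector(p, dictionary, k) for p in proteins]
print(len(dictionary), "observed K-strings versus", 20 ** k, "possible")
```

    With real datasets the observed set is typically far smaller than 20^K for large K, which is the intuition behind working in a lower-dimensional vector space.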

  18. Bioinformatics comparison of sulfate-reducing metabolism nucleotide sequences

    NASA Astrophysics Data System (ADS)

    Tremberger, G.; Dehipawala, Sunil; Nguyen, A.; Cheung, E.; Sullivan, R.; Holden, T.; Lieberman, D.; Cheung, T.

    2015-09-01

    The sulfate-reducing bacteria can be traced back to 3.5 billion years ago. The thermodynamic details of the sulfur cycle have been well documented. A recent sulfate-reducing bacteria report (Robator, Jungbluth, et al., 2015 Jan, Front. Microbiol) with GenBank nucleotide data has been analyzed in terms of the sulfite reductase (dsrAB) via fractal dimension and entropy values. Comparison to oil field sulfate-reducing sequences was included. The AUCG translational mass fractal dimension versus ATCG transcriptional mass fractal dimension for the low temperature dsrB and dsrA sequences reported in Reference Thirteen shows correlation R-sq ~ 0.79, with a probability of about 3% in simulation. A recent report of using Cystathionine gamma-lyase sequence to produce CdS quantum dots by a biological method, where the sulfur is reduced just like in the H2S production process, was included for comparison. The AUCG mass fractal dimension versus ATCG mass fractal dimension for the Cystathionine gamma-lyase sequences was found to have R-sq of 0.72, similar to the low temperature dissimilatory sulfite reductase dsr group with 3% probability, in contrast to the oil field group having R-sq ~ 0.94, a highly probable outcome in the simulation. The other two simulation histograms, namely, fractal dimension versus entropy R-sq outcome values, and di-nucleotide entropy versus mono-nucleotide entropy R-sq outcome values are also discussed in the data analysis focusing on low probability outcomes.

  19. Sequences Of Amino Acids For Human Serum Albumin

    NASA Technical Reports Server (NTRS)

    Carter, Daniel C.

    1992-01-01

    Sequences of amino acids defined for use in making polypeptides one-third to one-sixth as large as parent human serum albumin molecule. Smaller, chemically stable peptides have diverse applications including service as artificial human serum and as active components of biosensors and chromatographic matrices. In applications involving production of artificial sera from new sequences, little or no concern about viral contaminants. Smaller genetically engineered polypeptides more easily expressed and produced in large quantities, making commercial isolation and production more feasible and profitable.

  20. Nucleic acid sequence design via efficient ensemble defect optimization.

    PubMed

    Zadeh, Joseph N; Wolfe, Brian R; Pierce, Niles A

    2011-02-01

    We describe an algorithm for designing the sequence of one or more interacting nucleic acid strands intended to adopt a target secondary structure at equilibrium. Sequence design is formulated as an optimization problem with the goal of reducing the ensemble defect below a user-specified stop condition. For a candidate sequence and a given target secondary structure, the ensemble defect is the average number of incorrectly paired nucleotides at equilibrium evaluated over the ensemble of unpseudoknotted secondary structures. To reduce the computational cost of accepting or rejecting mutations to a random initial sequence, candidate mutations are evaluated on the leaf nodes of a tree-decomposition of the target structure. During leaf optimization, defect-weighted mutation sampling is used to select each candidate mutation position with probability proportional to its contribution to the ensemble defect of the leaf. As subsequences are merged moving up the tree, emergent structural defects resulting from crosstalk between sibling sequences are eliminated via reoptimization within the defective subtree starting from new random subsequences. Using a Θ(N^3) dynamic program to evaluate the ensemble defect of a target structure with N nucleotides, this hierarchical approach implies an asymptotic optimality bound on design time: for sufficiently large N, the cost of sequence design is bounded below by 4/3 the cost of a single evaluation of the ensemble defect for the full sequence. Hence, the design algorithm has time complexity Ω(N^3). For target structures containing N ∈ {100,200,400,800,1600,3200} nucleotides and duplex stems ranging from 1 to 30 base pairs, RNA sequence designs at 37°C typically succeed in satisfying a stop condition with ensemble defect less than N/100. Empirically, the sequence design algorithm exhibits asymptotic optimality and the exponent in the time complexity bound is sharp.
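
    The definition of the ensemble defect given above can be written as a short sketch. It assumes the equilibrium pair probabilities are already available (in the paper they come from the dynamic program, which is not reproduced here), and the matrix layout, function name and toy numbers are all hypothetical. Each nucleotide contributes the probability that it is in its target state (paired with its target partner, or unpaired), and the defect is N minus the sum of those probabilities.

```python
def ensemble_defect(pair_probs, target_partner):
    """Average number of incorrectly paired nucleotides at equilibrium.

    pair_probs[i][j] is the probability that base i pairs with base j;
    pair_probs[i][n] (extra last column) is the probability that base i is unpaired.
    target_partner[i] is the index of i's partner in the target, or None if unpaired.
    """
    n = len(target_partner)
    correct = 0.0
    for i, partner in enumerate(target_partner):
        j = n if partner is None else partner
        correct += pair_probs[i][j]
    return n - correct


# Toy 4-nucleotide target: base 0 pairs with base 3, bases 1 and 2 are unpaired.
target = [3, None, None, 0]
# Made-up probabilities for illustration; each row sums to 1.
probs = [
    [0.0, 0.0, 0.0, 0.9, 0.1],
    [0.0, 0.0, 0.2, 0.0, 0.8],
    [0.0, 0.2, 0.0, 0.0, 0.8],
    [0.9, 0.0, 0.0, 0.0, 0.1],
]
defect = ensemble_defect(probs, target)
meets_stop_condition = defect < len(target) / 100  # stop condition N/100 from the abstract
print(round(defect, 3), meets_stop_condition)
```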

  1. rasbhari: Optimizing Spaced Seeds for Database Searching, Read Mapping and Alignment-Free Sequence Comparison

    PubMed Central

    Hahn, Lars; Leimeister, Chris-André; Morgenstern, Burkhard

    2016-01-01

    Many algorithms for sequence analysis rely on word matching or word statistics. Often, these approaches can be improved if binary patterns representing match and don’t-care positions are used as a filter, such that only those positions of words are considered that correspond to the match positions of the patterns. The performance of these approaches, however, depends on the underlying patterns. Herein, we show that the overlap complexity of a pattern set that was introduced by Ilie and Ilie is closely related to the variance of the number of matches between two evolutionarily related sequences with respect to this pattern set. We propose a modified hill-climbing algorithm to optimize pattern sets for database searching, read mapping and alignment-free sequence comparison of nucleic-acid sequences; our implementation of this algorithm is called rasbhari. Depending on the application at hand, rasbhari can either minimize the overlap complexity of pattern sets, maximize their sensitivity in database searching or minimize the variance of the number of pattern-based matches in alignment-free sequence comparison. We show that, for database searching, rasbhari generates pattern sets with slightly higher sensitivity than existing approaches. In our Spaced Words approach to alignment-free sequence comparison, pattern sets calculated with rasbhari led to more accurate estimates of phylogenetic distances than the randomly generated pattern sets that we previously used. Finally, we used rasbhari to generate patterns for short read classification with CLARK-S. Here too, the sensitivity of the results could be improved, compared to the default patterns of the program. We integrated rasbhari into Spaced Words; the source code of rasbhari is freely available at http://rasbhari.gobics.de/ PMID:27760124
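
    The pattern-based filtering described above can be sketched as follows. This is not rasbhari itself, which optimizes sets of patterns; it only illustrates, under the assumption that a pattern is written as a string of '1' (match) and '0' (don't-care) characters, how a spaced word is read off a sequence and how pattern-based matches between two sequences can be counted. Function names are hypothetical.

```python
from collections import Counter


def spaced_words(seq, pattern):
    """Read the characters at the match ('1') positions of pattern at every offset of seq."""
    match_pos = [i for i, c in enumerate(pattern) if c == "1"]
    span = len(pattern)
    return ["".join(seq[start + p] for p in match_pos)
            for start in range(len(seq) - span + 1)]


def count_spaced_word_matches(seq_a, seq_b, pattern):
    """Number of position pairs (i, j) whose spaced words under pattern are identical."""
    counts_a = Counter(spaced_words(seq_a, pattern))
    counts_b = Counter(spaced_words(seq_b, pattern))
    return sum(n * counts_b[w] for w, n in counts_a.items())


a, b = "ACGTAC", "ATGTCC"  # toy sequences differing at two positions
print(count_spaced_word_matches(a, b, "101"))  # middle position is don't-care
print(count_spaced_word_matches(a, b, "111"))  # contiguous words for comparison
```

    In this toy case the spaced pattern still finds matches where contiguous words of the same span find none, because mismatches falling on don't-care positions are ignored.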

  2. On combining protein sequences and nucleic acid sequences in phylogenetic analysis: the homeobox protein case.

    PubMed

    Agosti, D; Jacobs, D; DeSalle, R

    1996-01-01

    Amino acid encoding genes contain character state information that may be useful for phylogenetic analysis on at least two levels. The nucleotide sequence and the translated amino acid sequences have both been employed separately as character states for cladistic studies of various taxa, including studies of the genealogy of genes in multigene families. In essence, amino acid sequences and nucleic acid sequences are two different ways of character coding the information in a gene. Silent positions in the nucleotide sequence (first or third positions in codons that can accrue change without changing the identity of the amino acid that the triplet codes for) may accrue change relatively rapidly and become saturated, losing the pattern of historical divergence. On the other hand, non-silent nucleotide alterations and their accompanying amino acid changes may evolve too slowly to reveal relationships among closely related taxa. In general, the dynamics of sequence change in silent and non-silent positions in protein coding genes result in homoplasy and lack of resolution, respectively. We suggest that the combination of nucleic acid and the translated amino acid coded character states into the same data matrix for phylogenetic analysis addresses some of the problems caused by the rapid change of silent nucleotide positions and overall slow rate of change of non-silent nucleotide positions and slowly changing amino acid positions. One major theoretical problem with this approach is the apparent non-independence of the two sources of characters. However, there are at least three possible outcomes when comparing protein coding nucleic acid sequences with their translated amino acids in a phylogenetic context on a codon by codon basis. First, the two character sets for a codon may be entirely congruent with respect to the information they convey about the relationships of a certain set of taxa. Second, one character set may display no information concerning a phylogenetic

  3. Nanopores and nucleic acids: prospects for ultrarapid sequencing

    NASA Technical Reports Server (NTRS)

    Deamer, D. W.; Akeson, M.

    2000-01-01

    DNA and RNA molecules can be detected as they are driven through a nanopore by an applied electric field at rates ranging from several hundred microseconds to a few milliseconds per molecule. The nanopore can rapidly discriminate between pyrimidine and purine segments along a single-stranded nucleic acid molecule. Nanopore detection and characterization of single molecules represents a new method for directly reading information encoded in linear polymers. If single-nucleotide resolution can be achieved, it is possible that nucleic acid sequences can be determined at rates exceeding a thousand bases per second.

  4. Further Examples of Evolution by Gene Duplication Revealed through DNA Sequence Comparisons

    PubMed Central

    Ohta, T.

    1994-01-01

    To test the theory that evolution by gene duplication occurs as a result of positive Darwinian selection that accompanies the acceleration of mutant substitutions, DNA sequences of recent duplication were analyzed by estimating the numbers of synonymous and nonsynonymous substitutions. For the troponin C family, at the period of differentiation of the fast and slow isoforms, amino acid substitutions were shown to have been accelerated relative to synonymous substitutions. Comparison of the first exon of α-actin genes revealed that amino acid substitutions were accelerated when the smooth muscle, skeletal and cardiac isoforms differentiated. Analysis of members of the heat shock protein 70 gene family of mammals indicates that heat shock responsive genes including duplicated copies are evolving rapidly, contrary to the cognate genes which have been evolutionarily conservative. For the α(1)-antitrypsin reactive center, the acceleration of amino acid substitution has been found for gene pairs of recent duplication. PMID:7896112

  5. The amino acid sequence of Escherichia coli cyanase.

    PubMed

    Chin, C C; Anderson, P M; Wold, F

    1983-01-10

    The amino acid sequence of the enzyme cyanase (cyanate hydrolase) from Escherichia coli has been determined by automatic Edman degradation of the intact protein and of its component peptides. The primary peptides used in the sequencing were produced by cyanogen bromide cleavage at the methionine residues, yielding 4 peptides plus free homoserine from the NH2-terminal methionine, and by trypsin cleavage at the 7 arginine residues after acetylation of the lysines. Secondary peptides required for overlaps and COOH-terminal sequences were produced by chymotrypsin or clostripain cleavage of some of the larger peptides. The complete sequence of the cyanase subunit consists of 156 amino acid residues (Mr 16,350). Based on the observation that the cysteine-containing peptide is obtained as a disulfide-linked dimer, it is proposed that the covalent structure of cyanase is made up of two subunits linked by a disulfide bond between the single cysteine residue in each subunit. The native enzyme (Mr 150,000) then appears to be a complex of four or five such subunit dimers.
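
    The "four or five" estimate follows directly from the two Mr values quoted above, as the short calculation below shows. It is only a back-of-the-envelope check that uses the numbers in the abstract and ignores the negligible mass change on forming the disulfide bond; the variable names are illustrative.

```python
subunit_mr = 16_350   # Mr of one cyanase subunit, from the abstract
native_mr = 150_000   # Mr of the native enzyme, from the abstract

dimer_mr = 2 * subunit_mr                 # disulfide-linked dimer of two subunits
dimers_per_enzyme = native_mr / dimer_mr  # falls between four and five

print(f"dimer Mr = {dimer_mr}, dimers per native enzyme = {dimers_per_enzyme:.2f}")
```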

  6. Evolution of an Enzyme from a Noncatalytic Nucleic Acid Sequence.

    PubMed

    Gysbers, Rachel; Tram, Kha; Gu, Jimmy; Li, Yingfu

    2015-01-01

    The mechanism by which enzymes arose from both abiotic and biological worlds remains an unsolved natural mystery. We postulate that an enzyme can emerge from any sequence of any functional polymer under permissive evolutionary conditions. To support this premise, we have arbitrarily chosen a 50-nucleotide DNA fragment encoding for the Bos taurus (cattle) albumin mRNA and subjected it to test-tube evolution to derive a catalytic DNA (DNAzyme) with RNA-cleavage activity. After only a few weeks, a DNAzyme with significant catalytic activity has surfaced. Sequence comparison reveals that seven nucleotides are responsible for the conversion of the noncatalytic sequence into the enzyme. Deep sequencing analysis of DNA pools along the evolution trajectory has identified individual mutations as the progressive drivers of the molecular evolution. Our findings demonstrate that an enzyme can indeed arise from a sequence of a functional polymer via permissive molecular evolution, a mechanism that may have been exploited by nature for the creation of the enormous repertoire of enzymes in the biological world today. PMID:26091540

  7. Sequence comparisons in the aminoacyl-tRNA synthetases with emphasis on regions of likely homology with sequences in the Rossmann fold in the methionyl and tyrosyl enzymes.

    PubMed

    Walker, E J; Jeffrey, P D

    1988-02-01

    Amino acid sequences of aminoacyl-tRNA synthetases specific for 12 different amino acids have now been published. Differences in origin at the species and organelle level result in 20 distinct sequences being available for comparison. Some of these were compared in small groups as they were determined and, although some homologies were detected, it was generally concluded that there was surprisingly little sequence homology in this functionally related group of enzymes. We have made comparisons of all of the available sequences by using a combination of computer and manual alignment methods and knowledge of the sequences in the Rossmann fold region of methionyl-tRNA synthetase from E. coli and tyrosyl-tRNA synthetase from B. stearothermophilus, enzymes whose three-dimensional structures have been described. It emerges that all of the aminoacyl-tRNA synthetase sequences thus examined show considerable homology with each other over at least parts of this region, some over virtually all of it. We conclude that a great deal more similarity than had previously been suspected exists in these proteins. In particular, the alignments we have made strongly imply the existence of a mononucleotide binding site of the Rossmann fold configuration in all of the synthetases compared. PMID:3283733

  8. Amino acid sequence of human cholinesterase. Annual report, 30 September 1984-30 September 1985

    SciTech Connect

    Lockridge, O.

    1985-10-01

    The active-site serine residue is located 198 amino acids from the N-terminal. The active-site peptide was isolated from three different genetic types of human serum cholinesterase: from usual, atypical, and atypical-silent genotypes. It was found that the amino acid sequence of the active-site peptide was identical in all three genotypes. Comparison of the complete sequences of cholinesterase from human serum and acetylcholinesterase from the electric organ of Torpedo californica shows an identity of 53%. Cholinesterase is of interest to the Department of Defense because cholinesterase protects against organophosphate poisons of the type used in chemical warfare. The structural results presented here will serve as the basis for cloning the gene for cholinesterase. The potential uses of large amounts of cholinesterase would be for cleaning up spills of organophosphates and possibly for detoxifying exposed personnel.

  9. Comparison of DNA Quantification Methods for Next Generation Sequencing

    PubMed Central

    Robin, Jérôme D.; Ludlow, Andrew T.; LaRanger, Ryan; Wright, Woodring E.; Shay, Jerry W.

    2016-01-01

    Next Generation Sequencing (NGS) is a powerful tool that depends on loading a precise amount of DNA onto a flowcell. NGS strategies have expanded our ability to investigate genomic phenomena by referencing mutations in cancer and diseases through large-scale genotyping, developing methods to map rare chromatin interactions (4C; 5C and Hi-C) and identifying chromatin features associated with regulatory elements (ChIP-seq, Bis-Seq, ChiA-PET). While many methods are available for DNA library quantification, there is no unambiguous gold standard. Most techniques use PCR to amplify DNA libraries to obtain sufficient quantities for optical density measurement. However, increased PCR cycles can distort the library’s heterogeneity and prevent the detection of rare variants. In this analysis, we compared new digital PCR technologies (droplet digital PCR; ddPCR, ddPCR-Tail) with standard methods for the titration of NGS libraries. DdPCR-Tail is comparable to qPCR and fluorometry (QuBit) and allows sensitive quantification by analysis of barcode repartition after sequencing of multiplexed samples. This study provides a direct comparison between quantification methods throughout a complete sequencing experiment and provides the impetus to use ddPCR-based quantification for improvement of NGS quality. PMID:27048884

  10. Quantum-Sequencing: Biophysics of quantum tunneling through nucleic acids

    NASA Astrophysics Data System (ADS)

    Casamada Ribot, Josep; Chatterjee, Anushree; Nagpal, Prashant

    2014-03-01

    Tunneling microscopy and spectroscopy have been used extensively in the physical surface sciences to study quantum tunneling, to measure the electronic local density of states of nanomaterials and to characterize adsorbed species. Quantum-Sequencing (Q-Seq) is a new method based on tunneling microscopy for electronic sequencing of single molecules of nucleic acids. A major goal of third-generation sequencing technologies is to develop a fast, reliable, enzyme-free single-molecule sequencing method. Here, we present the unique "electronic fingerprints" for all nucleotides on DNA and RNA using Q-Seq along with their intrinsic biophysical parameters. We have analyzed tunneling spectra for the nucleotides at different pH conditions and analyzed the HOMO, LUMO and energy gap for all of them. In addition we show a number of biophysical parameters to further characterize all nucleobases (electron and hole transition voltage and energy barriers). These results highlight the robustness of Q-Seq as a technique for next-generation sequencing.

  11. Thermal adaptation analyzed by comparison of protein sequences from mesophilic and extremely thermophilic Methanococcus species

    NASA Technical Reports Server (NTRS)

    Haney, P. J.; Badger, J. H.; Buldak, G. L.; Reich, C. I.; Woese, C. R.; Olsen, G. J.

    1999-01-01

    The genome sequence of the extremely thermophilic archaeon Methanococcus jannaschii provides a wealth of data on proteins from a thermophile. In this paper, sequences of 115 proteins from M. jannaschii are compared with their homologs from mesophilic Methanococcus species. Although the growth temperatures of the mesophiles are about 50 degrees C below that of M. jannaschii, their genomic G+C contents are nearly identical. The properties most correlated with the proteins of the thermophile include higher residue volume, higher residue hydrophobicity, more charged amino acids (especially Glu, Arg, and Lys), and fewer uncharged polar residues (Ser, Thr, Asn, and Gln). These are recurring themes, with all trends applying to 83-92% of the proteins for which complete sequences were available. Nearly all of the amino acid replacements most significantly correlated with the temperature change are the same relatively conservative changes observed in all proteins, but in the case of the mesophile/thermophile comparison there is a directional bias. We identify 26 specific pairs of amino acids with a statistically significant (P < 0.01) preferred direction of replacement.

  12. Nucleic acid sequence detection using multiplexed oligonucleotide PCR

    SciTech Connect

    Nolan, John P.; White, P. Scott

    2006-12-26

    Methods for rapidly detecting single or multiple sequence alleles in a sample nucleic acid are described. Provided are all of the oligonucleotide pairs capable of annealing specifically to a target allele and discriminating among possible sequences thereof, and ligating to each other to form an oligonucleotide complex when a particular sequence feature is present (or, alternatively, absent) in the sample nucleic acid. The design of each oligonucleotide pair permits the subsequent high-level PCR amplification of a specific amplicon when the oligonucleotide complex is formed, but not when the oligonucleotide complex is not formed. The presence or absence of the specific amplicon is used to detect the allele. Detection of the specific amplicon may be achieved using a variety of methods well known in the art, including without limitation, oligonucleotide capture onto DNA chips or microarrays, oligonucleotide capture onto beads or microspheres, electrophoresis, and mass spectrometry. Various labels and address-capture tags may be employed in the amplicon detection step of multiplexed assays, as further described herein.

  13. Molecular cloning and amino acid sequence of human 5-lipoxygenase

    SciTech Connect

    Matsumoto, T.; Funk, C.D.; Radmark, O.; Höög, J.O.; Jörnvall, H.; Samuelsson, B.

    1988-01-01

    5-Lipoxygenase (EC 1.13.11.34), a Ca2+- and ATP-requiring enzyme, catalyzes the first two steps in the biosynthesis of the peptidoleukotrienes and the chemotactic factor leukotriene B4. A cDNA clone corresponding to 5-lipoxygenase was isolated from a human lung lambda gt11 expression library by immunoscreening with a polyclonal antibody. Additional clones from a human placenta lambda gt11 cDNA library were obtained by plaque hybridization with the 32P-labeled lung cDNA clone. Sequence data obtained from several overlapping clones indicate that the composite DNAs contain the complete coding region for the enzyme. From the deduced primary structure, 5-lipoxygenase encodes a 673 amino acid protein with a calculated molecular weight of 77,839. Direct analysis of the native protein and its proteolytic fragments confirmed the deduced composition, the amino-terminal amino acid sequence, and the structure of many internal segments. 5-Lipoxygenase has no apparent sequence homology with leukotriene A4 hydrolase or Ca2+-binding proteins. RNA blot analysis indicated substantial amounts of an mRNA species of approximately 2700 nucleotides in leukocytes, lung, and placenta.

  14. Characterization and amino acid sequence of a fatty acid-binding protein from human heart.

    PubMed Central

    Offner, G D; Brecher, P; Sawlivich, W B; Costello, C E; Troxler, R F

    1988-01-01

    The complete amino acid sequence of a fatty acid-binding protein from human heart was determined by automated Edman degradation of CNBr, BNPS-skatole [3'-bromo-3-methyl-2-(2-nitrobenzenesulphenyl)indolenine], hydroxylamine, Staphylococcus aureus V8 proteinase, tryptic and chymotryptic peptides, and by digestion of the protein with carboxypeptidase A. The sequence of the blocked N-terminal tryptic peptide from citraconylated protein was determined by collisionally induced decomposition mass spectrometry. The protein contains 132 amino acid residues, is enriched with respect to threonine and lysine, lacks cysteine, has an acetylated valine residue at the N-terminus, and has an Mr of 14768 and an isoelectric point of 5.25. This protein contains two short internal repeated sequences from residues 48-54 and from residues 114-119 located within regions of predicted beta-structure and decreasing hydrophobicity. These short repeats are contained within two longer repeated regions from residues 48-60 and residues 114-125, which display 62% sequence similarity. These regions could accommodate the charged and uncharged moieties of long-chain fatty acids and may represent fatty acid-binding domains consistent with the finding that human heart fatty acid-binding protein binds 2 mol of oleate or palmitate/mol of protein. Detailed evidence for the amino acid sequences of the peptides has been deposited as Supplementary Publication SUP 50143 (23 pages) at the British Library Lending Division, Boston Spa, Yorkshire LS23 7BQ, U.K., from whom copies may be obtained as indicated in Biochem. J. (1988) 249, 5. PMID:3421901

  15. Comparison of Buffer Effect of Different Acids During Sandstone Acidizing

    NASA Astrophysics Data System (ADS)

    Umer Shafiq, Mian; Khaled Ben Mahmud, Hisham; Hamid, Mohamed Ali

    2015-04-01

    The most important concern of sandstone matrix acidizing is to increase the formation permeability by removing the silica particles. To accomplish this, the mud acid (HF: HCl) has been utilized successfully for many years to stimulate the sandstone formations, but still it has many complexities. This paper presents the results of laboratory investigations of different acid combinations (HF: HCl, HF: H3PO4 and HF: HCOOH). Hydrofluoric acid and fluoboric acid are used to dissolve clays and feldspar. Phosphoric and formic acids are added as a buffer to maintain the pH of the solution; also it allows the maximum penetration of acid into the core sample. Different tests have been performed on the core samples before and after the acidizing to do the comparative study on the buffer effect of these acids. The analysis consists of permeability, porosity, color change and pH value tests. There is more increase in permeability and porosity while less change in pH when phosphoric and formic acids were used compared to mud acid. From these results it has been found that the buffer effect of phosphoric acid and formic acid is better than hydrochloric acid.

  16. Studies on adenosine triphosphate transphosphorylases. Amino acid sequence of rabbit muscle ATP-AMP transphosphorylase.

    PubMed

    Kuby, S A; Palmieri, R H; Frischat, A; Fischer, A H; Wu, L H; Maland, L; Manship, M

    1984-05-22

    The total amino acid sequence of rabbit muscle adenylate kinase has been determined, and the single polypeptide chain of 194 amino acid residues starts with N-acetylmethionine and ends with leucyllysine at its carboxyl terminus, in agreement with the earlier data on its amino acid composition [Mahowald, T. A., Noltmann, E. A., & Kuby, S. A. (1962) J. Biol. Chem. 237, 1138-1145] and its carboxyl-terminus sequence [Olson, O. E., & Kuby, S. A. (1964) J. Biol. Chem. 239, 460-467]. Elucidation of the primary structure was based on tryptic and chymotryptic cleavages of the performic acid oxidized protein, cyanogen bromide cleavages of the 14C-labeled S-carboxymethylated protein at its five methionine sites (followed by maleylation of peptide fragments), and tryptic cleavages at its 12 arginine sites of the maleylated 14C-labeled S-carboxymethylated protein. Calf muscle myokinase, whose sequence has also been established, differs from the rabbit muscle myokinase sequence primarily in the following: His-30 is replaced by Gln-30; Lys-56 is replaced by Met-56; Ala-84 and Asp-85 are replaced by Val-84 and Asn-85. A comparison of the four muscle-type adenylate kinases, whose covalent structures have now been determined, viz., rabbit, calf, porcine, and human [for the latter two sequences see Heil, A., Müller, G., Noda, L., Pinder, T., Schirmer, H., Schirmer, I., & Von Zabern, I. (1974) Eur. J. Biochem. 43, 131-144, and Von Zabern, I., Wittmann-Liebold, B., Untucht-Grau, R., Schirmer, R. H., & Pai, E. F. (1976) Eur. J. Biochem. 68, 281-290], demonstrates an extraordinary degree of homology. (ABSTRACT TRUNCATED AT 250 WORDS)

  17. Relationships amongst bluetongue viruses revealed by comparisons of capsid and outer coat protein nucleotide sequences.

    PubMed

    Gould, A R; Pritchard, L I

    1990-08-01

    Sequence data from the gene segments coding for the capsid protein, VP3, of all eight Australian bluetongue virus serotypes were compared. The high degree of nucleotide sequence homology for VP3 genes amongst BTV isolates from the same geographic region supported previous studies (Gould, 1987; 1988b, c; Gould et al., 1988b) and was proposed as a basis for "topotyping" a bluetongue virus isolate (Gould et al., 1989). The complete nucleotide sequences which coded for the VP2 outer coat proteins of South African BTV serotypes 1 and 3 (vaccine strains) were determined and compared to cognate gene sequences from North American and Australian BTVs. These VP2 comparisons demonstrated that BTVs of the same serotype, but from different geographical regions, were closely related at the nucleotide and amino acid levels. However, close inter-relationships were also demonstrated amongst other BTVs irrespective of serotype or geographic origin. These data enabled phylogenetic relationships of the BTV serotypes to be analysed using VP2 nucleotide sequences as a determinant.

  18. The sequence of carnation etched ring virus DNA: comparison with cauliflower mosaic virus and retroviruses

    PubMed Central

    Hull, R.; Sadler, J.; Longstaff, M.

    1986-01-01

    Carnation etched ring virus (CERV) DNA comprises 7932 bp. CERV primer binding sites and overall genome organization are similar to those of the related cauliflower mosaic virus (CaMV). The six open reading frames of CERV showed amino acid homology (50-80%) with CaMV ORFs I-VI; no homologues of CaMV ORFs VII or VIII were found. CERV ORFs 1-5 interface each other with the sequence ATGA. The comparison of CERV ORF5 with CaMV ORFV highlighted regions which show homologies to retrovirus gag/pol protease, RNase H and DNA polymerase domains; the possibility that the DNA polymerase domain comprises two subdomains, operating off different templates, is discussed. Both CERV and CaMV ORFs I have sequence homology to tobacco mosaic virus P30 and plastocyanin. PMID:16453731

  19. New approaches for computer analysis of nucleic acid sequences.

    PubMed

    Karlin, S; Ghandour, G; Ost, F; Tavare, S; Korn, L J

    1983-09-01

    A new high-speed computer algorithm is outlined that ascertains within and between nucleic acid and protein sequences all direct repeats, dyad symmetries, and other structural relationships. Large repeats, repeats of high frequency, dyad symmetries of specified stem length and loop distance, and their distributions are determined. Significance of homologies is assessed by a hierarchy of permutation procedures. Applications are made to papovaviruses, the human papillomavirus HPV, lambda phage, the human and mouse mitochondrial genomes, and the human and mouse immunoglobulin kappa-chain genes. PMID:6577449
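
    Two of the structural relationships named above, direct repeats and dyad symmetries with a given stem length and loop distance, can be found with a simple sketch. This is a naive illustration for DNA only, not the high-speed algorithm of the paper, and it omits the permutation-based significance assessment. The function names, parameters and toy sequence are hypothetical.

```python
from collections import defaultdict

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA string over the alphabet A, C, G, T."""
    return seq.translate(COMPLEMENT)[::-1]


def direct_repeats(seq, k):
    """Map each length-k word that occurs more than once to its start positions."""
    positions = defaultdict(list)
    for i in range(len(seq) - k + 1):
        positions[seq[i:i + k]].append(i)
    return {w: p for w, p in positions.items() if len(p) > 1}


def dyad_symmetries(seq, stem, max_loop):
    """Return (left_start, right_start) pairs where the right arm is the reverse
    complement of the left arm and the loop between them is at most max_loop long."""
    hits = []
    for i in range(len(seq) - 2 * stem + 1):
        target = reverse_complement(seq[i:i + stem])
        for loop in range(max_loop + 1):
            j = i + stem + loop
            if j + stem > len(seq):
                break
            if seq[j:j + stem] == target:
                hits.append((i, j))
    return hits


dna = "GACCTTTGGTCAGACC"  # toy sequence built to contain both features
print(direct_repeats(dna, 4))
print(dyad_symmetries(dna, stem=4, max_loop=5))
```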

  20. The amino-acid sequence of the alpha-crystallin A chains of red kangaroo and Virginia opossum.

    PubMed

    De Jong, W W; Terwindt, E C

    1976-08-16

    The amino acid sequence of the A chain of the eye lens protein alpha-crystallin from the red kangaroo (Macropus rufus) was completely determined by manual Edman degradation of tryptic, thermolytic and cyanogen bromide peptides. The sequence of the alpha-crystallin A chain from the Virginia opossum (Didelphis marsupialis) was deduced from amino acid analyses and partial Edman degradation of peptides. The 173-residue A chains of kangaroo and opossum differ in six positions, whereas comparison with the bovine alpha-crystallin A chain reveals 17 and 22 substitutions, respectively. Most substitutions occur in the COOH-terminal part of the chain.

  1. Metazoan remaining genes for essential amino acid biosynthesis: sequence conservation and evolutionary analyses.

    PubMed

    Costa, Igor R; Thompson, Julie D; Ortega, José Miguel; Prosdocimi, Francisco

    2014-12-24

    Essential amino acids (EAA) consist of a group of nine amino acids that animals are unable to synthesize via de novo pathways. Recently, it has been found that most metazoans lack the same set of enzymes responsible for the de novo EAA biosynthesis. Here we investigate the sequence conservation and evolution of all the metazoan remaining genes for EAA pathways. Initially, the set of all 49 enzymes responsible for the EAA de novo biosynthesis in yeast was retrieved. These enzymes were used as BLAST queries to search for similar sequences in a database containing 10 complete metazoan genomes. Eight enzymes typically attributed to EAA pathways were found to be ubiquitous in metazoan genomes, suggesting a conserved functional role. In this study, we address the question of how these genes evolved after losing their pathway partners. To do this, we compared metazoan genes with their fungal and plant orthologs. Using phylogenetic analysis with maximum likelihood, we found that acetolactate synthase (ALS) and betaine-homocysteine S-methyltransferase (BHMT) diverged from the expected Tree of Life (ToL) relationships. High sequence conservation in the paraphyletic group Plant-Fungi was identified for these two genes using a newly developed Python algorithm. Selective pressure analysis of ALS and BHMT protein sequences showed higher non-synonymous mutation ratios in comparisons between metazoans/fungi and metazoans/plants, supporting the hypothesis that these two genes have undergone non-ToL evolution in animals.

  2. The amino acid sequence of the aspartate aminotransferase from baker's yeast (Saccharomyces cerevisiae).

    PubMed Central

    Cronin, V B; Maras, B; Barra, D; Doonan, S

    1991-01-01

    1. The single (cytosolic) aspartate aminotransferase was purified in high yield from baker's yeast (Saccharomyces cerevisiae). 2. Amino-acid-sequence analysis was carried out by digestion of the protein with trypsin and with CNBr; some of the peptides produced were further subdigested with Staphylococcus aureus V8 proteinase or with pepsin. Peptides were sequenced by the dansyl-Edman method and/or by automated gas-phase methods. The amino acid sequence obtained was complete except for a probable gap of two residues as indicated by comparison with the structures of counterpart proteins in other species. 3. The N-terminus of the enzyme is blocked. Fast-atom-bombardment m.s. was used to identify the blocking group as an acetyl one. 4. Alignment of the sequence of the enzyme with those of vertebrate cytosolic and mitochondrial aspartate aminotransferases and with the enzyme from Escherichia coli showed that about 25% of residues are conserved between these distantly related forms. 5. Experimental details and confirmatory data for the results presented here are given in a Supplementary Publication (SUP 50164, 25 pages) that has been deposited at the British Library Document Supply Centre, Boston Spa, Wetherby, West Yorkshire LS23 7BQ, U.K., from whom copies can be obtained on the terms indicated in Biochem. J. (1991) 273, 5. PMID:1859361

  3. Evaluation of intra- and interspecific divergence of satellite DNA sequences by nucleotide frequency calculation and pairwise sequence comparison

    PubMed Central

    2003-01-01

    Satellite DNA sequences are known to be highly variable and to have been subjected to concerted evolution that homogenizes member sequences within species. We have analyzed the mode of evolution of satellite DNA sequences in four fishes from the genus Diplodus by calculating the nucleotide frequency of the sequence array and the phylogenetic distances between member sequences. Calculation of nucleotide frequency and pairwise sequence comparison enabled us to characterize the divergence among member sequences in this satellite DNA family. The results suggest that the evolutionary rate of satellite DNA in D. bellottii is about two-fold greater than the average of the other three fishes, and that the sequence homogenization event occurred in D. puntazzo more recently than in the others. The procedures described here are effective to characterize mode of evolution of satellite DNA. PMID:12734555

  4. Evaluation of intra- and interspecific divergence of satellite DNA sequences by nucleotide frequency calculation and pairwise sequence comparison.

    PubMed

    Kato, Mikio

    2003-01-01

    Satellite DNA sequences are known to be highly variable and to have been subjected to concerted evolution that homogenizes member sequences within species. We have analyzed the mode of evolution of satellite DNA sequences in four fishes from the genus Diplodus by calculating the nucleotide frequency of the sequence array and the phylogenetic distances between member sequences. Calculation of nucleotide frequency and pairwise sequence comparison enabled us to characterize the divergence among member sequences in this satellite DNA family. The results suggest that the evolutionary rate of satellite DNA in D. bellottii is about two-fold greater than the average of the other three fishes, and that the sequence homogenization event occurred in D. puntazzo more recently than in the others. The procedures described here are effective to characterize mode of evolution of satellite DNA. PMID:12734555
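
    The two calculations named in this abstract, nucleotide frequency across the sequence array and pairwise comparison of member sequences, can be sketched as follows. The sketch assumes pre-aligned, equal-length sequences with "-" as the gap character and uses the simple proportion of differing sites (p-distance) as the distance; the abstract does not state which distance the author used, so that choice and all function names are assumptions.

```python
from collections import Counter
from itertools import combinations


def column_frequencies(aligned):
    """Per-position symbol frequencies across equal-length aligned sequences."""
    length = len(aligned[0])
    if any(len(s) != length for s in aligned):
        raise ValueError("sequences must be aligned to the same length")
    freqs = []
    for column in zip(*aligned):
        counts = Counter(column)
        freqs.append({base: n / len(column) for base, n in counts.items()})
    return freqs


def p_distance(a, b):
    """Proportion of differing sites, skipping positions gapped in either sequence."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not compared:
        raise ValueError("no ungapped positions to compare")
    return sum(x != y for x, y in compared) / len(compared)


def mean_pairwise_distance(aligned):
    """Average p-distance over all pairs of member sequences."""
    pairs = list(combinations(aligned, 2))
    return sum(p_distance(a, b) for a, b in pairs) / len(pairs)
```

    A lower mean pairwise distance within one species than between species is the kind of pattern that sequence homogenization would be expected to produce; the toy input `["ACGT", "ACGA", "ACGT"]` gives a mean pairwise distance of 1/6.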

  5. Comparison of exon 5 sequences from 35 class I genes of the BALB/c mouse

    PubMed Central

    1989-01-01

    DNA sequences of the fifth exon, which encodes the transmembrane domain, were determined for the BALB/c mouse class I MHC genes and used to study the relationships between them. Based on nucleotide sequence similarity, the exon 5 sequences can be divided into seven groups. Although most members within each group are at least 80% similar to each other, comparison between groups reveals that the groups share little similarity. However, in spite of the extensive variation of the fifth exon sequences, analysis of their predicted amino acid translations reveals that only four class I gene fifth exons have frameshifts or stop codons that terminate their translation and prevent them from encoding a domain that is both hydrophobic and long enough to span a lipid bilayer. Exactly 27 of the remaining fifth exons could encode a domain that is similar to those of the transplantation antigens in that it consists of a proline-rich connecting peptide, a transmembrane segment, and a cytoplasmic portion with membrane-anchoring basic residues. The conservation of this motif in the majority of the fifth exon translations in spite of extensive variation suggests that selective pressure exists for these exons to maintain their ability to encode a functional transmembrane domain, raising the possibility that many of the nonclassical class I genes encode functionally important products. PMID:2584927

  6. Characterization of the microbial acid mine drainage microbial community using culturing and direct sequencing techniques.

    PubMed

    Auld, Ryan R; Myre, Maxine; Mykytczuk, Nadia C S; Leduc, Leo G; Merritt, Thomas J S

    2013-05-01

    We characterized the bacterial community from an AMD tailings pond using both classical culturing and modern direct sequencing techniques and compared the two methods. Acid mine drainage (AMD) is produced by the environmental and microbial oxidation of minerals dissolved from mining waste. Surprisingly, we know little about the microbial communities associated with AMD, despite the fundamental ecological roles of these organisms and large-scale economic impact of these waste sites. AMD microbial communities have classically been characterized by laboratory culturing-based techniques and more recently by direct sequencing of marker gene sequences, primarily the 16S rRNA gene. In our comparison of the techniques, we find that their results are complementary, overall indicating very similar community structure with similar dominant species, but with each method identifying some species that were missed by the other. We were able to culture the majority of species that our direct sequencing results indicated were present, primarily species within the Acidithiobacillus and Acidiphilium genera, although estimates of relative species abundance were only obtained from direct sequencing. Interestingly, our culture-based methods recovered four species that had been overlooked from our sequencing results because of the rarity of the marker gene sequences, likely members of the rare biosphere. Further, direct sequencing indicated that a single genus, completely missed in our culture-based study, Legionella, was a dominant member of the microbial community. Our results suggest that while either method does a reasonable job of identifying the dominant members of the AMD microbial community, together the methods combine to give a more complete picture of the true diversity of this environment. PMID:23485423

  7. A comparison of chromic acid and sulfuric acid anodizing

    NASA Technical Reports Server (NTRS)

    Danford, M. D.

    1992-01-01

    Because of federal and state mandates restricting the use of hexavalent chromium, it was deemed worthwhile to compare the corrosion protection afforded 2219-T87 aluminum alloy by both Type I chromic acid and Type II sulfuric acid anodizing per MIL-A-8625. Corrosion measurements were made on large, flat 2219-T87 aluminum alloy sheet material with an area of 1 cm² exposed to a corrosive medium of 3.5-percent sodium chloride at pH 5.5. Both ac electrochemical impedance spectroscopy and the dc polarization resistance techniques were employed. The results clearly indicate that the corrosion protection obtained by Type II sulfuric acid anodizing is superior, and no problems should result by substituting Type II sulfuric acid anodizing for Type I chromic acid anodizing.

  8. Sequence comparison on a cluster of workstations using the PVM system

    SciTech Connect

    Guan, X.; Mural, R.J.; Uberbacher, E.C.

    1995-02-01

    We have implemented a distributed sequence comparison algorithm on a cluster of workstations using the PVM paradigm. This implementation has achieved performance similar to that of the Intel iPSC/860 Hypercube, a massively parallel computer. The distributed sequence comparison algorithm serves as a search tool for two Internet servers, GRAIL and GENQUEST. This paper describes the implementation and the performance of the algorithm.

  9. Predicting protein disorder by analyzing amino acid sequence

    PubMed Central

    Yang, Jack Y; Yang, Mary Qu

    2008-01-01

    Background: Many protein regions and some entire proteins have no definite tertiary structure, presenting instead as dynamic, disordered ensembles under different physicochemical circumstances. These proteins and regions are known as Intrinsically Unstructured Proteins (IUP). IUP have been associated with a wide range of protein functions, along with roles in diseases characterized by protein misfolding and aggregation. Results: Identifying IUP is an important task in structural and functional genomics. We extract useful features from sequences and develop machine learning algorithms for this task. We compare our IUP predictor with PONDRs (mainly neural-network-based predictors), disEMBL (also based on neural networks) and Globplot (based on disorder propensity). Conclusion: We find that augmenting features derived from physicochemical properties of amino acids (such as hydrophobicity, complexity, etc.) and using an ensemble method proved beneficial. The IUP predictor is a viable alternative software tool for identifying IUP protein regions and proteins. PMID:18831799

  10. Comparison and analysis of the nucleotide sequences of pilin genes from Haemophilus influenzae type b strains Eagan and M43.

    PubMed Central

    Forney, L J; Marrs, C F; Bektesh, S L; Gilsdorf, J R

    1991-01-01

    Previous studies have demonstrated antigenic differences among the pili expressed by various strains of Haemophilus influenzae type b (Hib). In order to understand the molecular basis for these differences, the structural gene for pilin was cloned from Hib strain Eagan (p+) and the nucleotide sequence was compared to those of strains M43 (p+) and 770235 b0f+, which had been previously determined. The pilin gene of Hib strain Eagan (p+) had a 648-bp open reading frame that encoded a 20-amino-acid leader sequence followed by the 196 amino acids found in mature pilin. The translated sequence was three amino acids larger than pilins of strains M43 (p+) and 770235 b0f+ and was 78% identical and 95% homologous when conservative amino acid substitutions were considered. Differences between the amino acid sequences were not localized to any one region but rather were distributed throughout the proteins. Comparison of protein hydrophilicity profiles showed several hydrophilic regions with sequences that were conserved between strain Eagan (p+) and pilins of other Hib strains, and these regions represent potentially conserved antigenic domains. Southern blot analyses using an intragenic probe from the pilin gene of strain Eagan (p+) showed that the pilin gene was conserved among all type b and nontypeable strains of H. influenzae examined, and only a single copy was present in these strains. Homologous genes were not present in the phylogenetically related species Pasteurella multocida, Pasteurella haemolytica, and Actinobacillus pleuropneumoniae. These data indicate that the pilin gene was highly conserved among different strains of H. influenzae and that small differences in the pilin amino acid sequences account for the observed antigenic differences of assembled pili from these strains. PMID:2037360

  11. A novel statistical measure for sequence comparison on the basis of k-word counts.

    PubMed

    Yang, Xiwu; Wang, Tianming

    2013-02-01

    Numerous efficient methods based on word counts for sequence analysis have been proposed to characterize DNA sequences to help in comparison, retrieval from the databases and reconstructing evolutionary relations. However, most of them seem unrelated to any intrinsic characteristics of DNA. In this paper, we proposed a novel statistical measure for sequence comparison on the basis of k-word counts. This new measure removed the influence of sequences' lengths and uncovered bulk property of DNA sequences. The proposed measure was tested by similarity search and phylogenetic analysis. The experimental assessment demonstrated that our similarity measure was efficient.
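
    The general idea of comparing sequences through k-word counts can be sketched as below. The abstract does not give the formula of the proposed measure, so this sketch substitutes a generic stand-in: k-word counts are converted to relative frequencies (to reduce the effect of sequence length) and compared by Euclidean distance. The function names and the choice of distance are assumptions, not the authors' method.

```python
from collections import Counter
from math import sqrt


def kword_frequencies(seq, k):
    """Relative frequency of every overlapping k-letter word in seq."""
    total = len(seq) - k + 1
    if total <= 0:
        raise ValueError("sequence is shorter than k")
    counts = Counter(seq[i:i + k] for i in range(total))
    return {word: n / total for word, n in counts.items()}


def kword_distance(a, b, k):
    """Euclidean distance between the k-word frequency vectors of two sequences."""
    fa = kword_frequencies(a, k)
    fb = kword_frequencies(b, k)
    words = set(fa) | set(fb)
    return sqrt(sum((fa.get(w, 0.0) - fb.get(w, 0.0)) ** 2 for w in words))
```

    Identical sequences score 0, while sequences that share no k-words, such as "AAAA" and "CCCC" with k = 2, score the square root of 2. No alignment is needed, which is what makes word-count measures attractive for database retrieval.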

  12. Genomic Sequence Comparisons, 1987-2003 Final Report

    SciTech Connect

    George M. Church

    2004-07-29

    This project was to develop new DNA sequencing and RNA and protein quantitation methods and related genome annotation tools. The project began in 1987 with the development of multiplex sequencing (published in Science in 1988), and one of the first automated sequencing methods. This led to the first commercial genome sequence in 1994 and to the establishment of the main commercial participants (GTC then Agencourt) in the public DOE/NIH genome project. In collaboration with GTC we contributed to one of the first complete DOE genome sequences, in 1997, that of Methanobacterium thermoautotrophicum, a species of great relevance to energy-rich gas production.

  13. Amino acid substitutions in genetic variants of human serum albumin and in sequences inferred from molecular cloning

    SciTech Connect

    Takahashi, N.; Takahashi, Y.; Blumberg, B.S.; Putnam, F.W.

    1987-07-01

    The structural changes in four genetic variants of human serum albumin were analyzed by tandem high-pressure liquid chromatography (HPLC) of the tryptic peptides, HPLC mapping and isoelectric focusing of the CNBr fragments, and amino acid sequence analysis of the purified peptides. Lysine-372 of normal (common) albumin A was changed to glutamic acid both in albumin Naskapi, a widespread polymorphic variant of North American Indians, and in albumin Mersin found in Eti Turks. The two variants also exhibited anomalous migration in NaDodSO4/PAGE, which is attributed to a conformational change. The identity of albumins Naskapi and Mersin may have originated through descent from a common mid-Asiatic founder of the two migrating ethnic groups, or it may represent identical but independent mutations of the albumin gene. In albumin Adana, from Eti Turks, the substitution site was not identified but was localized to the region from positions 447 through 548. The substitution of aspartic acid-550 by glycine was found in albumin Mexico-2 from four individuals of the Pima tribe. Although only single-point substitutions have been found in these and in certain other genetic variants of human albumin, five differences exist in the amino acid sequences inferred from cDNA sequences by workers in three other laboratories. However, our results on albumin A and on 14 different genetic variants accord with the amino acid sequence of albumin deduced from the genomic sequence. The apparent amino acid substitutions inferred from comparison of individual cDNA sequences probably reflect artifacts in cloning or in cDNA sequence analysis rather than polymorphism of the coding sections of the albumin gene.

  14. Amino acid substitutions in genetic variants of human serum albumin and in sequences inferred from molecular cloning.

    PubMed

    Takahashi, N; Takahashi, Y; Blumberg, B S; Putnam, F W

    1987-07-01

    The structural changes in four genetic variants of human serum albumin were analyzed by tandem high-pressure liquid chromatography (HPLC) of the tryptic peptides, HPLC mapping and isoelectric focusing of the CNBr fragments, and amino acid sequence analysis of the purified peptides. Lysine-372 of normal (common) albumin A was changed to glutamic acid both in albumin Naskapi, a widespread polymorphic variant of North American Indians, and in albumin Mersin found in Eti Turks. The two variants also exhibited anomalous migration in NaDodSO4/PAGE, which is attributed to a conformational change. The identity of albumins Naskapi and Mersin may have originated through descent from a common mid-Asiatic founder of the two migrating ethnic groups, or it may represent identical but independent mutations of the albumin gene. In albumin Adana, from Eti Turks, the substitution site was not identified but was localized to the region from positions 447 through 548. The substitution of aspartic acid-550 by glycine was found in albumin Mexico-2 from four individuals of the Pima tribe. Although only single-point substitutions have been found in these and in certain other genetic variants of human albumin, five differences exist in the amino acid sequences inferred from cDNA sequences by workers in three other laboratories. However, our results on albumin A and on 14 different genetic variants accord with the amino acid sequence of albumin deduced from the genomic sequence. The apparent amino acid substitutions inferred from comparison of individual cDNA sequences probably reflect artifacts in cloning or in cDNA sequence analysis rather than polymorphism of the coding sections of the albumin gene.

  15. Human liver apolipoprotein B-100 cDNA: complete nucleic acid and derived amino acid sequence.

    PubMed Central

    Law, S W; Grant, S M; Higuchi, K; Hospattankar, A; Lackner, K; Lee, N; Brewer, H B

    1986-01-01

    Human apolipoprotein B-100 (apoB-100), the ligand on low density lipoproteins that interacts with the low density lipoprotein receptor and initiates receptor-mediated endocytosis and low density lipoprotein catabolism, has been cloned, and the complete nucleic acid and derived amino acid sequences have been determined. ApoB-100 cDNAs were isolated from normal human liver cDNA libraries utilizing immunoscreening as well as filter hybridization with radiolabeled apoB-100 oligodeoxynucleotides. The apoB-100 mRNA is 14.1 kilobases long encoding a mature apoB-100 protein of 4536 amino acids with a calculated amino acid molecular weight of 512,723. ApoB-100 contains 20 potential glycosylation sites, and 12 of a total of 25 cysteine residues are located in the amino-terminal region of the apolipoprotein providing a potential globular structure of the amino terminus of the protein. ApoB-100 contains relatively few regions of amphipathic helices, but compared to other human apolipoproteins it is enriched in beta-structure. The delineation of the entire human apoB-100 sequence will now permit a detailed analysis of the conformation of the protein, the low density lipoprotein receptor binding domain(s), and the structural relationship between apoB-100 and apoB-48 and will provide the basis for the study of genetic defects in apoB-100 in patients with dyslipoproteinemias. PMID:3464946

  16. Buffalo (Bubalus bubalis) interleukin-2: sequence analysis reveals high nucleotide and amino acid identity with interleukin-2 of cattle and other ruminants.

    PubMed

    Sreekumar, E; Premraj, A; Saravanakumar, M; Rasool, T J

    2002-08-01

    A 4400-bp genomic sequence and a 332-bp truncated cDNA sequence of the interleukin-2 (IL-2) gene of Indian water buffalo (Bubalus bubalis) were amplified by polymerase chain reaction and cloned. The coding sequence of the buffalo IL-2 gene was assembled from the 5' end of the genomic clone and the truncated cDNA clone. This sequence had 98.5% nucleotide identity and 98% amino acid identity with cattle IL-2. Three amino acid substitutions were observed at positions 63, 124 and 135. Comparison of the predicted protein structure of buffalo IL-2 with that of human and cattle IL-2 did not reveal significant differences. The putative amino acids responsible for IL-2 receptor binding were conserved in buffalo, cattle and human IL-2. The amino acid sequence of buffalo IL-2 also showed very high identity with that of other ruminants, indicating functional cross-reactivity.

  17. 3-d structure-based amino acid sequence alignment of esterases, lipases and related proteins

    SciTech Connect

    Gentry, M.K.; Doctor, B.P.; Cygler, M.; Schrag, J.D.; Sussman, J.L.

    1993-05-13

    Acetylcholinesterase and butyrylcholinesterase, enzymes with potential as pretreatment drugs for organophosphate toxicity, are members of a larger family of homologous proteins that includes carboxylesterases, cholesterol esterases, lipases, and several nonhydrolytic proteins. A computer-generated alignment of 18 of the proteins, the acetylcholinesterases, butyrylcholinesterases, carboxylesterases, some esterases, and the nonenzymatic proteins has been previously presented. More recently, the three-dimensional structures of two enzymes in this group, acetylcholinesterase from Torpedo californica and lipase from Geotrichum candidum, have been determined. Based on the x-ray structures and the superposition of these two enzymes, it was possible to obtain an improved amino acid sequence alignment of 32 members of this family of proteins. Examination of this alignment reveals that 24 amino acids are invariant in all of the hydrolytic proteins, and an additional 49 are well conserved. Conserved amino acids include those of the active site, the disulfide bridges, the salt bridges, in the core of the proteins, and at the edges of secondary structural elements. Comparison of the three-dimensional structures makes it possible to find a well-defined structural basis for the conservation of many of these amino acids.

  18. Homology analyses of the protein sequences of fatty acid synthases from chicken liver, rat mammary gland, and yeast

    SciTech Connect

    Chang, Soo-Ik; Hammes, G.G.

    1989-11-01

    Homology analyses of the protein sequences of chicken liver and rat mammary gland fatty acid synthases were carried out. The amino acid sequences of the chicken and rat enzymes are 67% identical. If conservative substitutions are allowed, 78% of the amino acids are matched. A region of low homology exists between the functional domains, in particular around amino acid residues 1059-1264 of the chicken enzyme. Homologies between the active sites of chicken and rat and of chicken and yeast enzymes have been analyzed by an alignment method. A high degree of homology exists between the active sites of the chicken and rat enzymes. However, the chicken and yeast enzymes show a lower degree of homology. The NADPH-binding dinucleotide folds of the β-ketoacyl reductase and the enoyl reductase sites were identified by comparison with a known consensus sequence for the NADP- and FAD-binding dinucleotide folds. The active sites of all of the enzymes are primarily in hydrophobic regions of the protein. This study suggests that the genes for the functional domains of fatty acid synthase were originally separated, and these genes were connected to each other by using different connecting nucleotide sequences in different species. An alternative explanation for the differences in rat and chicken is a common ancestry and mutations in the joining regions during evolution.
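
    The distinction this abstract draws between percent identity and percent matched "if conservative substitutions are allowed" can be sketched as below. The sketch assumes two already-aligned protein sequences with "-" as the gap character; the residue groups treated as conservative are a hypothetical choice for illustration, since the abstract does not say which substitutions the authors counted.

```python
# Hypothetical conservative-substitution groups; not taken from the study.
CONSERVATIVE_GROUPS = [
    set("ILVM"),  # aliphatic / hydrophobic
    set("FYW"),   # aromatic
    set("KRH"),   # basic
    set("DE"),    # acidic
    set("NQ"),    # amides
    set("ST"),    # hydroxyl
    set("AG"),    # small
]


def is_conservative(x, y):
    """True when two different residues fall in the same group."""
    return any(x in group and y in group for group in CONSERVATIVE_GROUPS)


def identity_and_similarity(a, b):
    """Return (percent identical, percent identical or conservative) over ungapped columns."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = identical = similar = 0
    for x, y in zip(a, b):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x == y:
            identical += 1
            similar += 1
        elif is_conservative(x, y):
            similar += 1
    if compared == 0:
        raise ValueError("no ungapped positions to compare")
    return 100.0 * identical / compared, 100.0 * similar / compared
```

    For the toy pair "ILKDE" and "IVKEE", three of five positions are identical and the remaining two are conservative under these groups, giving 60% identity and 100% similarity.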

  19. Cloning, sequencing, and expression of the Zymomonas mobilis fructokinase gene and structural comparison of the enzyme with other hexose kinases.

    PubMed Central

    Zembrzuski, B; Chilco, P; Liu, X L; Liu, J; Conway, T; Scopes, R

    1992-01-01

    The frk gene encoding the enzyme fructokinase (fructose 6-phosphotransferase [EC 2.7.1.4]) from Zymomonas mobilis has been isolated on a partial TaqI digest fragment of the genome and sequenced. An open reading frame of 906 bp corresponding to 302 amino acids was identified on a 3-kbp TaqI fragment. The deduced amino acid sequence corresponds to the first 20 amino acids (including an N-terminal methionine) determined by amino acid sequencing of the purified protein. The 118 bp preceding the methionine codon on this fragment does not appear to contain a promoter sequence. There was weak expression of the active enzyme in the recombinant Escherichia coli clone under control of the lac promoter on the pUC plasmid. Comparison of the amino acid sequence with that of the glucokinase enzyme (EC 2.7.1.2) from Z. mobilis reveals relatively little homology, despite the fact that fructokinase also binds glucose and has kinetic and structural properties similar to those of glucokinase. Also, there is little homology with hexose kinases that have been sequenced from other organisms. Northern (RNA) blot analysis showed that the frk transcript is 1.2 kb long. Fructokinase activity was elevated up to twofold when Z. mobilis was grown on fructose instead of glucose, and there was a parallel increase in frk mRNA levels. Differential mRNA stability was not a factor, since the half-lives of the frk transcript were 6.2 min for glucose-grown cells and 6.6 min for fructose-grown cells. PMID:1317376

  20. Direct Chloroplast Sequencing: Comparison of Sequencing Platforms and Analysis Tools for Whole Chloroplast Barcoding

    PubMed Central

    Brozynska, Marta; Furtado, Agnelo; Henry, Robert James

    2014-01-01

    Direct sequencing of total plant DNA using next generation sequencing technologies generates a whole chloroplast genome sequence that has the potential to provide a barcode for use in plant and food identification. Advances in DNA sequencing platforms may make this an attractive approach for routine plant identification. The HiSeq (Illumina) and Ion Torrent (Life Technology) sequencing platforms were used to sequence total DNA from rice to identify polymorphisms in the whole chloroplast genome sequence of a wild rice plant relative to cultivated rice (cv. Nipponbare). Consensus chloroplast sequences were produced by mapping sequence reads to the reference rice chloroplast genome or by de novo assembly and mapping of the resulting contigs to the reference sequence. A total of 122 polymorphisms (SNPs and indels) between the wild and cultivated rice chloroplasts were predicted by these different sequencing and analysis methods. Of these, a total of 102 polymorphisms including 90 SNPs were predicted by both platforms. Indels were more variable with different sequencing methods, with almost all discrepancies found in homopolymers. The Ion Torrent platform gave no apparent false SNP but was less reliable for indels. The methods should be suitable for routine barcoding using appropriate combinations of sequencing platform and data analysis. PMID:25329378

  1. Direct chloroplast sequencing: comparison of sequencing platforms and analysis tools for whole chloroplast barcoding.

    PubMed

    Brozynska, Marta; Furtado, Agnelo; Henry, Robert James

    2014-01-01

    Direct sequencing of total plant DNA using next generation sequencing technologies generates a whole chloroplast genome sequence that has the potential to provide a barcode for use in plant and food identification. Advances in DNA sequencing platforms may make this an attractive approach for routine plant identification. The HiSeq (Illumina) and Ion Torrent (Life Technology) sequencing platforms were used to sequence total DNA from rice to identify polymorphisms in the whole chloroplast genome sequence of a wild rice plant relative to cultivated rice (cv. Nipponbare). Consensus chloroplast sequences were produced by mapping sequence reads to the reference rice chloroplast genome or by de novo assembly and mapping of the resulting contigs to the reference sequence. A total of 122 polymorphisms (SNPs and indels) between the wild and cultivated rice chloroplasts were predicted by these different sequencing and analysis methods. Of these, a total of 102 polymorphisms including 90 SNPs were predicted by both platforms. Indels were more variable with different sequencing methods, with almost all discrepancies found in homopolymers. The Ion Torrent platform gave no apparent false SNP but was less reliable for indels. The methods should be suitable for routine barcoding using appropriate combinations of sequencing platform and data analysis.

  2. Human retroviruses and AIDS 1996. A compilation and analysis of nucleic acid and amino acid sequences

    SciTech Connect

    Myers, G.; Foley, B.; Korber, B.; Mellors, J.W.; Jeang, K.T.; Wain-Hobson, S.

    1997-04-01

    This compendium and the accompanying floppy diskettes are the result of an effort to compile and rapidly publish all relevant molecular data concerning the human immunodeficiency viruses (HIV) and related retroviruses. The scope of the compendium and database is best summarized by the five parts that it comprises: (1) Nucleic Acid Alignments and Sequences; (2) Amino Acid Alignments; (3) Analysis; (4) Related Sequences; and (5) Database Communications. Information within all the parts is updated throughout the year on the Web site, http://hiv-web.lanl.gov. While this publication could take the form of a review or sequence monograph, it is not so conceived. Instead, the literature from which the database is derived has simply been summarized and some elementary computational analyses have been performed upon the data. Interpretation and commentary have been avoided insofar as possible so that the reader can form his or her own judgments concerning the complex information. In addition to the general descriptions of the parts of the compendium, the user should read the individual introductions for each part.

  3. Purification of a marsupial insulin: amino-acid sequence of insulin from the eastern grey kangaroo Macropus giganteus.

    PubMed

    Treacy, G B; Shaw, D C; Griffiths, M E; Jeffrey, P D

    1989-03-24

    Insulin has been purified from kangaroo pancreas by acidic ethanol extraction, diethyl ether precipitation and gel filtration. The amino-acid sequence of this, the first marsupial insulin to be studied, is reported. It differs from human insulin by only four amino-acid substitutions, all in regions of the molecule previously known to be variable. However, it should be noted that one of these, asparagine for threonine at A8, has not been reported before. Computer comparisons of all 43 insulin sequences reported to date with kangaroo insulin show it to be most closely related to a group of mammalian insulins (dog, pig, cow, human) known to be of high biological potency. The measurement of blood glucose lowering in the rabbit by kangaroo insulin is consistent with this conclusion. Comparisons of amino-acid sequences of other proteins with their kangaroo counterparts show a greater difference, in line with the time of divergence of marsupials. The limited differences observed in insulin and cytochrome c suggest that their structures need to be closely conserved in order to maintain function.

  4. Transcriptome Sequencing in Response to Salicylic Acid in Salvia miltiorrhiza.

    PubMed

    Zhang, Xiaoru; Dong, Juane; Liu, Hailong; Wang, Jiao; Qi, Yuexin; Liang, Zongsuo

    2016-01-01

    Salvia miltiorrhiza is a traditional Chinese herbal medicine, whose quality and yield are often affected by diseases and environmental stresses during its growing season. Salicylic acid (SA) plays a significant role in plants responding to biotic and abiotic stresses, but the involved regulatory factors and their signaling mechanisms are largely unknown. In order to identify the genes involved in SA signaling, the RNA sequencing (RNA-seq) strategy was employed to evaluate the transcriptional profiles in S. miltiorrhiza cell cultures. A total of 50,778 unigenes were assembled, in which 5,316 unigenes were differentially expressed among 0-, 2-, and 8-h SA induction. The up-regulated genes were mainly involved in stimulus response and multi-organism process. A core set of candidate novel genes coding SA signaling component proteins was identified. Many transcription factors (e.g., WRKY, bHLH and GRAS) and genes involved in hormone signal transduction were differentially expressed in response to SA induction. Detailed analysis revealed that genes associated with defense signaling, such as antioxidant system genes, cytochrome P450s and ATP-binding cassette transporters, were significantly overexpressed, which can be used as genetic tools to investigate disease resistance. Our transcriptome analysis will help understand SA signaling and its mechanism of defense systems in S. miltiorrhiza. PMID:26808150

  5. Transcriptome Sequencing in Response to Salicylic Acid in Salvia miltiorrhiza

    PubMed Central

    Zhang, Xiaoru; Dong, Juane; Liu, Hailong; Wang, Jiao; Qi, Yuexin; Liang, Zongsuo

    2016-01-01

    Salvia miltiorrhiza is a traditional Chinese herbal medicine, whose quality and yield are often affected by diseases and environmental stresses during its growing season. Salicylic acid (SA) plays a significant role in plants responding to biotic and abiotic stresses, but the involved regulatory factors and their signaling mechanisms are largely unknown. In order to identify the genes involved in SA signaling, the RNA sequencing (RNA-seq) strategy was employed to evaluate the transcriptional profiles in S. miltiorrhiza cell cultures. A total of 50,778 unigenes were assembled, in which 5,316 unigenes were differentially expressed among 0-, 2-, and 8-h SA induction. The up-regulated genes were mainly involved in stimulus response and multi-organism process. A core set of candidate novel genes coding SA signaling component proteins was identified. Many transcription factors (e.g., WRKY, bHLH and GRAS) and genes involved in hormone signal transduction were differentially expressed in response to SA induction. Detailed analysis revealed that genes associated with defense signaling, such as antioxidant system genes, cytochrome P450s and ATP-binding cassette transporters, were significantly overexpressed, which can be used as genetic tools to investigate disease resistance. Our transcriptome analysis will help understand SA signaling and its mechanism of defense systems in S. miltiorrhiza. PMID:26808150

  6. Natural vs. random protein sequences: Discovering combinatorics properties on amino acid words.

    PubMed

    Santoni, Daniele; Felici, Giovanni; Vergni, Davide

    2016-02-21

    Random mutations and natural selection have driven the evolution of the protein amino acid sequences that we observe in nature today. Which of the two is the dominant force in protein evolution still lacks an unambiguous answer. Random mutations tend to randomize protein sequences, whereas selection is expected to impose rigid constraints on amino acid sequences in order to preserve correct function. Moreover, the space of all possible amino acid sequences is so large that a well-tuned amino acid sequence could plausibly be indistinguishable from a random one. To study whether random and natural amino acid sequences can be discriminated, we introduce different measures of association between pairs of amino acids in a sequence and apply them to a dataset of 1047 natural protein sequences and 10,470 random sequences, carefully generated to preserve the relative length and amino acid distribution of the natural proteins. We analyze the multidimensional measures with machine learning techniques and show that, to a reasonable extent, natural protein sequences can be differentiated from random ones.

  7. Natural vs. random protein sequences: Discovering combinatorics properties on amino acid words.

    PubMed

    Santoni, Daniele; Felici, Giovanni; Vergni, Davide

    2016-02-21

    Random mutations and natural selection have driven the evolution of the protein amino acid sequences that we observe in nature today. Which of the two is the dominant force in protein evolution still lacks an unambiguous answer. Random mutations tend to randomize protein sequences, whereas selection is expected to impose rigid constraints on amino acid sequences in order to preserve correct function. Moreover, the space of all possible amino acid sequences is so large that a well-tuned amino acid sequence could plausibly be indistinguishable from a random one. To study whether random and natural amino acid sequences can be discriminated, we introduce different measures of association between pairs of amino acids in a sequence and apply them to a dataset of 1047 natural protein sequences and 10,470 random sequences, carefully generated to preserve the relative length and amino acid distribution of the natural proteins. We analyze the multidimensional measures with machine learning techniques and show that, to a reasonable extent, natural protein sequences can be differentiated from random ones. PMID:26656109
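
    One simple way to build control sequences that keep the length and amino acid composition of a natural protein is to shuffle its residues. The sketch below illustrates that idea only; the abstract does not say how the authors generated their random set, so the function name `shuffled_controls`, the seed, and the example peptide are all hypothetical. The default of ten controls per protein mirrors the 1047 to 10,470 ratio reported above.

```python
import random
from collections import Counter


def shuffled_controls(protein, n=10, seed=0):
    """Return n shuffled copies of `protein`.

    Each copy has exactly the same length and residue counts as the input,
    but a randomized residue order. This is an illustrative assumption,
    not the generation procedure used in the cited study.
    """
    rng = random.Random(seed)  # local generator so results are reproducible
    controls = []
    for _ in range(n):
        residues = list(protein)
        rng.shuffle(residues)
        controls.append("".join(residues))
    return controls


if __name__ == "__main__":
    natural = "MKTAYIAKQRQISFVKSHFSRQ"  # hypothetical example peptide
    for control in shuffled_controls(natural, n=3):
        # Composition is preserved by construction.
        print(control, Counter(control) == Counter(natural))
```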

  8. Detection of piscine nodaviruses by real-time nucleic acid sequence based amplification (NASBA).

    PubMed

    Starkey, William G; Millar, Rose Mary; Jenkins, Mary E; Ireland, Jacqueline H; Muir, K Fiona; Richards, Randolph H

    2004-05-01

    Nucleic acid sequence based amplification (NASBA) is an isothermal nucleic acid amplification procedure based on target-specific primers and probes, and the co-ordinated activity of 3 enzymes: AMV reverse transcriptase, RNase H, and T7 RNA polymerase. We have developed a real-time NASBA procedure for detection of piscine nodaviruses, which have emerged as major pathogens of marine fish. Viral RNA was isolated by guanidine thiocyanate lysis followed by purification on silica particles. Primers were designed to target sequences in the nodavirus capsid protein gene, yielding an amplification product of 120 nucleotides. Amplification products were detected in real-time with a molecular beacon (FAM labelled/methyl-red quenched) that recognised an internal region of the target amplicon. Amplification and detection were performed at 41 degrees C for 90 min in a Corbett Research Rotorgene. Based on the detection of cell culture-derived nodavirus, and a synthetic RNA target, the real-time NASBA procedure was approximately 100-fold more sensitive than single-tube RT-PCR. When used to test a panel of 37 clinical samples (negative, n = 18; positive, n = 19), the real-time NASBA assay correctly identified all 18 negative and 19 positive samples. In comparison, the RT-PCR procedure identified all 18 negative samples, but only 16 of the positive samples. These results suggest that real-time NASBA may represent a sensitive and specific diagnostic procedure for piscine nodaviruses.
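
    The panel results quoted above translate directly into sensitivity and specificity figures. The following sketch works through that arithmetic using only the counts given in the abstract (19 positive and 18 negative samples); the function and variable names are illustrative.

```python
def sensitivity(true_pos, false_neg):
    """Fraction of truly positive samples that the assay detects."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Fraction of truly negative samples that the assay reports as negative."""
    return true_neg / (true_neg + false_pos)


# Counts taken from the abstract: 19 positives and 18 negatives in the panel.
# Real-time NASBA identified all of them; RT-PCR missed 3 of the 19 positives.
assays = {
    "real-time NASBA": {"tp": 19, "fn": 0, "tn": 18, "fp": 0},
    "RT-PCR": {"tp": 16, "fn": 3, "tn": 18, "fp": 0},
}

if __name__ == "__main__":
    for name, c in assays.items():
        sens = sensitivity(c["tp"], c["fn"])
        spec = specificity(c["tn"], c["fp"])
        print(f"{name}: sensitivity={sens:.1%}, specificity={spec:.1%}")
```

    On this panel both assays are fully specific, and the difference lies in sensitivity: 19 of 19 for NASBA against 16 of 19 (about 84%) for RT-PCR.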

  9. Two Dimensional Yau-Hausdorff Distance with Applications on Comparison of DNA and Protein Sequences

    PubMed Central

    Tian, Kun; Yang, Xiaoqian; Kong, Qin; Yin, Changchuan; He, Rong L.; Yau, Stephen S.-T.

    2015-01-01

    Comparing DNA or protein sequences plays an important role in the functional analysis of genomes. Despite many methods available for sequence comparison, few methods retain the information content of sequences. We propose a new approach, the Yau-Hausdorff method, which considers all translations and rotations when seeking the best match of graphical curves of DNA or protein sequences. The complexity of this method is lower than that of any other two dimensional minimum Hausdorff algorithm. The Yau-Hausdorff method can be used for measuring the similarity of DNA sequences based on two important tools: the Yau-Hausdorff distance and graphical representation of DNA sequences. The graphical representations of DNA sequences conserve all sequence information and the Yau-Hausdorff distance is mathematically proved as a true metric. Therefore, the proposed distance can precisely measure the similarity of DNA sequences. The phylogenetic analyses of DNA sequences by the Yau-Hausdorff distance show the accuracy and stability of our approach in similarity comparison of DNA or protein sequences. This study demonstrates that Yau-Hausdorff distance is a natural metric for DNA and protein sequences with high level of stability. The approach can also be applied to similarity analysis of protein sequences by graphic representations, as well as general two dimensional shape matching. PMID:26384293
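
    For orientation, the classical Hausdorff distance between two planar point sets can be sketched as below. This is the plain, fixed-position distance only: it does not minimize over translations and rotations as the Yau-Hausdorff distance described above does, and the example curves are made-up points rather than graphical representations of real sequences.

```python
import math


def directed_hausdorff(a, b):
    """Largest distance from any point in `a` to its nearest point in `b`."""
    return max(min(math.dist(p, q) for q in b) for p in a)


def hausdorff(a, b):
    """Symmetric Hausdorff distance between two non-empty 2D point sets."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))


if __name__ == "__main__":
    # Hypothetical curves given as lists of (x, y) points.
    curve_a = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
    curve_b = [(0.0, 1.0), (1.0, 1.0), (2.0, 3.0)]
    print(hausdorff(curve_a, curve_b))
```

    A minimum Hausdorff distance would wrap this computation in a search over rigid motions of one curve, which is the expensive step the abstract says the Yau-Hausdorff method reduces.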

  10. Multiple Comparison Analysis of Two New Genomic Sequences of ILTV Strains from China with Other Strains from Different Geographic Regions.

    PubMed

    Zhao, Yan; Kong, Congcong; Wang, Yunfeng

    2015-01-01

    To date, twenty complete genome sequences of ILTV strains have been published in GenBank, including one strain from China, and nineteen strains from Australia and the United States. To investigate the genomic information on ILTVs from different geographic regions, two additional individual complete genome sequences of WG and K317 strains from China were determined. The genomes of WG and K317 strains were 153,505 and 153,639 bp in length, respectively. Alignments performed on the amino acid sequences of the twelve glycoproteins showed that 13 out of 116 mutational sites were present only among the Chinese strain WG and the Australian strains SA2 and A20. The phylogenetic tree analysis suggested that the WG strain established close relationships with the Australian strain SA2. The recombination events were detected and confirmed in different subregions of the WG strain with the sequences of SA2 and K317 strains as parental. In this study, two new complete genome sequences of Chinese ILTV strains were used in comparative analysis with other complete genome sequences of ILTV strains from China, the United States, and Australia. The analysis of genome comparison, phylogenetic trees, and recombination events showed close relationships between the Chinese strain WG and the Australian strain SA2. The information of the two new complete genome sequences from China will help to facilitate the analysis of phylogenetic relationships and the molecular differences among ILTV strains from different geographic regions.

  11. Multiple Comparison Analysis of Two New Genomic Sequences of ILTV Strains from China with Other Strains from Different Geographic Regions

    PubMed Central

    Zhao, Yan; Kong, Congcong; Wang, Yunfeng

    2015-01-01

    To date, twenty complete genome sequences of ILTV strains have been published in GenBank, including one strain from China, and nineteen strains from Australia and the United States. To investigate the genomic information on ILTVs from different geographic regions, two additional individual complete genome sequences of WG and K317 strains from China were determined. The genomes of WG and K317 strains were 153,505 and 153,639 bp in length, respectively. Alignments performed on the amino acid sequences of the twelve glycoproteins showed that 13 out of 116 mutational sites were present only among the Chinese strain WG and the Australian strains SA2 and A20. The phylogenetic tree analysis suggested that the WG strain established close relationships with the Australian strain SA2. The recombination events were detected and confirmed in different subregions of the WG strain with the sequences of SA2 and K317 strains as parental. In this study, two new complete genome sequences of Chinese ILTV strains were used in comparative analysis with other complete genome sequences of ILTV strains from China, the United States, and Australia. The analysis of genome comparison, phylogenetic trees, and recombination events showed close relationships between the Chinese strain WG and the Australian strain SA2. The information of the two new complete genome sequences from China will help to facilitate the analysis of phylogenetic relationships and the molecular differences among ILTV strains from different geographic regions. PMID:26186451

  12. Phylogenetic analysis of evolutionary relationships of the planctomycete division of the domain bacteria based on amino acid sequences of elongation factor Tu.

    PubMed

    Jenkins, C; Fuerst, J A

    2001-05-01

    Sequences from the tuf gene coding for the elongation factor EF-Tu were amplified and sequenced from the genomic DNA of Pirellula marina and Isosphaera pallida, two species of bacteria within the order Planctomycetales. A near-complete (1140-bp) sequence was obtained from Pi. marina and a partial (759-bp) sequence was obtained for I. pallida. Alignment of the deduced Pi. marina EF-Tu amino acid sequence against reference sequences demonstrated the presence of a unique 11-amino acid sequence motif not present in any other division of the domain Bacteria. Pi. marina shared the highest percentage amino acid sequence identity with I. pallida but showed only a low percentage identity with other members of the domain Bacteria. This is consistent with the concept of the planctomycetes as a unique division of the Bacteria. Neither primary sequence comparison of EF-Tu nor phylogenetic analysis supports any close relationship between planctomycetes and the chlamydiae, which has previously been postulated on the basis of 16S rRNA. Phylogenetic analysis of aligned EF-Tu amino acid sequences performed using distance, maximum-parsimony, and maximum-likelihood approaches yielded contradictory results with respect to the position of planctomycetes relative to other bacteria. It is hypothesized that long-branch attraction effects due to unequal evolutionary rates and mutational saturation effects may account for some of the contradictions. PMID:11443344

  13. Nucleotide sequence of the DNA polymerase gene of herpes simplex virus type 2 and comparison with the type 1 counterpart.

    PubMed

    Tsurumi, T; Maeno, K; Nishiyama, Y

    1987-01-01

    The complete nucleotide sequence of the DNA polymerase gene of herpes simplex virus (HSV) type 2 strain 186 has been determined. The gene included a 3720-bp major open reading frame capable of encoding 1240 amino acids. The predicted primary translation product had an Mr of 137,354, which was slightly larger than its HSV-1 counterpart. A comparison of the predicted functional amino acid sequences of the HSV-1 and HSV-2 DNA polymerases revealed 95.5% overall amino acid homology, the highest value among the known polypeptides encoded by both HSV-1 and HSV-2. The functional amino acid changes were spread over the N-terminal one-third of the protein, whereas the C-terminal two-thirds was almost identical between the two types except for a particular hydrophilic region. A highly conserved sequence of 6 aa, YGDTDS, which has been observed in DNA polymerases of HSV-1, Epstein-Barr virus, adenovirus, and vaccinia virus, was also present at positions 889 to 894 in the C-terminal region of HSV-2 DNA polymerase.
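
    Two of the figures above can be checked or illustrated with a few lines of code: a 3720-bp reading frame corresponds to 3720 / 3 = 1240 codons, and a percentage such as the 95.5% quoted is typically obtained by counting matching positions in a pairwise alignment. The sketch below shows one common way to compute percent identity over aligned, equal-length strings; how the authors actually derived their homology value is not stated, so this is an assumption, and the variant motif in the example is invented for illustration.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of identical residues over aligned columns without gaps.

    Both inputs must already be aligned to the same length. Columns where
    either sequence has a gap are skipped (one of several conventions).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap or y == gap:
            continue
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


if __name__ == "__main__":
    orf_length_bp = 3720
    print(orf_length_bp // 3)  # codons in the reading frame described above

    # YGDTDS is the conserved motif named in the abstract; the second string
    # is a hypothetical variant used only to exercise the function.
    print(percent_identity("YGDTDS", "YGDTDS"))
    print(round(percent_identity("YGDTDS", "YGDSDS"), 1))
```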

  14. Comparison of simple sequence repeats in 19 Archaea.

    PubMed

    Trivedi, S

    2006-01-01

    All organisms that have been studied until now have been found to have differential distribution of simple sequence repeats (SSRs), with more SSRs in intergenic than in coding sequences. SSR distribution was investigated in Archaea genomes where complete chromosome sequences of 19 Archaea were analyzed with the program SPUTNIK to find di- to penta-nucleotide repeats. The number of repeats was determined for the complete chromosome sequences and for the coding and non-coding sequences. Different from what has been found for other groups of organisms, there is an abundance of SSRs in coding regions of the genome of some Archaea. Dinucleotide repeats were rare and CG repeats were found in only two Archaea. In general, trinucleotide repeats are the most abundant SSR motifs; however, pentanucleotide repeats are abundant in some Archaea. Some of the tetranucleotide and pentanucleotide repeat motifs are organism specific. In general, repeats are short and CG-rich repeats are present in Archaea having a CG-rich genome. Among the 19 Archaea, SSR density was not correlated with genome size or with optimum growth temperature. Pentanucleotide density had an inverse correlation with the CG content of the genome. PMID:17183484
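
    A search for di- to penta-nucleotide repeats of the kind described above can be sketched with regular-expression backreferences. This is not SPUTNIK and does not reproduce its scoring or mismatch handling; it finds only perfect tandem repeats, and the threshold of three repeat units, the function name, and the test sequence are assumptions chosen for illustration.

```python
import re


def find_ssrs(seq, min_unit=2, max_unit=5, min_repeats=3):
    """Find perfect tandem repeats with unit lengths min_unit..max_unit.

    Returns a list of (start_index, unit, repeat_count) tuples. Units that
    are themselves repeats of a shorter unit (e.g. "ACAC" or "AAA") are
    skipped so that each repeat is reported at its shortest unit length.
    """
    seq = seq.upper()
    hits = []
    for k in range(min_unit, max_unit + 1):
        # ([ACGT]{k}) captures one unit; \1{n,} requires further copies of it.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_repeats - 1))
        for match in pattern.finditer(seq):
            unit = match.group(1)
            is_compound = any(
                unit == unit[:d] * (k // d) for d in range(1, k) if k % d == 0
            )
            if is_compound:
                continue
            hits.append((match.start(), unit, len(match.group(0)) // k))
    return hits


if __name__ == "__main__":
    # Hypothetical sequence containing one CG dinucleotide repeat.
    print(find_ssrs("AACGCGCGTT"))
```

    Splitting the hits by whether their start index falls inside an annotated coding region would then give the coding versus non-coding counts that the study compares.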

  15. Amino acid sequence of rabbit kidney neutral endopeptidase 24.11 (enkephalinase) deduced from a complementary DNA.

    PubMed Central

    Devault, A; Lazure, C; Nault, C; Le Moual, H; Seidah, N G; Chrétien, M; Kahn, P; Powell, J; Mallet, J; Beaumont, A

    1987-01-01

    Neutral endopeptidase (EC 3.4.24.11) is a major constituent of kidney brush border membranes. It is also present in the brain where it has been shown to be involved in the inactivation of opioid peptides, methionine- and leucine-enkephalins. For this reason this enzyme is often called 'enkephalinase'. In order to characterize the primary structure of the enzyme, oligonucleotide probes were designed from partial amino acid sequences and used to isolate clones from kidney cDNA libraries. Sequencing of the cDNA inserts revealed the complete primary structure of the enzyme. Neutral endopeptidase consists of 750 amino acids. It contains a short N-terminal cytoplasmic domain (27 amino acids), a single membrane-spanning segment (23 amino acids) and an extracellular domain that comprises most of the protein mass. The comparison of the primary structure of neutral endopeptidase with that of thermolysin, a bacterial Zn-metallopeptidase, indicates that most of the amino acid residues involved in Zn coordination and catalytic activity in thermolysin are found within highly homologous sequences in neutral endopeptidase. PMID:2440677

  16. Detection and isolation of nucleic acid sequences using a bifunctional hybridization probe

    DOEpatents

    Lucas, Joe N.; Straume, Tore; Bogen, Kenneth T.

    2000-01-01

    A method for detecting and isolating a target sequence in a sample of nucleic acids is provided using a bifunctional hybridization probe capable of hybridizing to the target sequence that includes a detectable marker and a first complexing agent capable of forming a binding pair with a second complexing agent. A kit is also provided for detecting a target sequence in a sample of nucleic acids using a bifunctional hybridization probe according to this method.

  17. Quantitative comparison between a multiecho sequence and a single-echo sequence for susceptibility-weighted phase imaging.

    PubMed

    Gilbert, Guillaume; Savard, Geneviève; Bard, Céline; Beaudoin, Gilles

    2012-06-01

    The aim of this study was to investigate the benefits arising from the use of a multiecho sequence for susceptibility-weighted phase imaging using a quantitative comparison with a standard single-echo acquisition. Four healthy adult volunteers were imaged on a clinical 3-T system using a protocol comprising two different three-dimensional susceptibility-weighted gradient-echo sequences: a standard single-echo sequence and a multiecho sequence. Both sequences were repeated twice in order to evaluate the local noise contribution by a subtraction of the two acquisitions. For the multiecho sequence, the phase information from each echo was independently unwrapped, and the background field contribution was removed using either homodyne filtering or the projection onto dipole fields method. The phase information from all echoes was then combined using a weighted linear regression. R2 maps were also calculated from the multiecho acquisitions. The noise standard deviation in the reconstructed phase images was evaluated for six manually segmented regions of interest (frontal white matter, posterior white matter, globus pallidus, putamen, caudate nucleus and lateral ventricle). The use of the multiecho sequence for susceptibility-weighted phase imaging led to a reduction of the noise standard deviation for all subjects and all regions of interest investigated in comparison to the reference single-echo acquisition. On average, the noise reduction ranged from 18.4% for the globus pallidus to 47.9% for the lateral ventricle. In addition, the amount of noise reduction was found to be strongly inversely correlated to the estimated R2 value (R=-0.92). In conclusion, the use of a multiecho sequence is an effective way to decrease the noise contribution in susceptibility-weighted phase images, while preserving both contrast and acquisition time. The proposed approach additionally permits the calculation of R2 maps.

  18. Phylogenetic relationships of Cryptosporidium determined by ribosomal RNA sequence comparison.

    PubMed

    Johnson, A M; Fielke, R; Lumb, R; Baverstock, P R

    1990-04-01

    Reverse transcription of total cellular RNA was used to obtain a partial sequence of the small subunit ribosomal RNA of Cryptosporidium, a protist currently placed in the phylum Apicomplexa. The semi-conserved regions were aligned with homologous sequences in a range of other eukaryotes, and the evolutionary relationships of Cryptosporidium were determined by two different methods of phylogenetic analysis. The prokaryotes Escherichia coli and Halobacterium cuti were included as outgroups. The results do not show an especially close relationship of Cryptosporidium to other members of the phylum Apicomplexa. PMID:2332273

  19. Amino acid sequence of horseshoe crab, Tachypleus tridentatus, striated muscle troponin C.

    PubMed

    Kobayashi, T; Kagami, O; Takagi, T; Konishi, K

    1989-05-01

    The amino acid sequence of troponin C obtained from horseshoe crab, Tachypleus tridentatus, striated muscle was determined by sequence analysis and alignments of chemically and enzymatically cleaved peptides. Troponin C is composed of 153 amino acid residues with a blocked N-terminus and contains no tryptophan or cysteine residue. The site I, one of the four Ca2+-binding sites, is considered to have lost its ability to bind Ca2+ owing to the replacements of certain amino acid residues.

  20. Meiofaunal community analysis by high-throughput sequencing: comparison of extraction, quality filtering, and clustering methods.

    PubMed

    Brannock, Pamela M; Halanych, Kenneth M

    2015-10-01

    Using molecular tools to examine community composition of meiofauna, animals 45μm to 1mm in size living between sediment grains in aquatic environments, is relatively new in comparison to bacterial and archaeal microbial studies. Although high-throughput molecular approaches are starting to be applied to these communities, effectiveness of different approaches for nucleic acid extraction from meiofauna is poorly known and bioinformatic pipelines vary between studies. Given this situation, there is a need for protocols to be developed that promote consistency in sample collection and processing, sequence quality filtering, and Operational Taxonomic Unit (OTU) clustering methods. Herein, we assess different approaches used for DNA extraction (DNA extracted directly from sediment versus elutriated material retained on a 45μm sieve) as well as how different quality filtering methods of sequences and OTU clustering algorithms impact genetic assessment of meiofauna community composition. DNA extracted directly from sediment resulted in higher presence of non-metazoan eukaryotic taxa; in contrast, an elutriation (resuspension with decanting) approach increased meiofauna abundance and enriched metazoan OTUs. With regard to bioinformatics analyses, the number of overall OTUs varied by clustering algorithm, primarily due to the applied method of sequence quality filtering. However, alpha and beta diversity analyses showed similar trends regardless of bioinformatics pipeline utilized. Based on our results, we recommend that studies of meiofauna communities elutriate samples prior to DNA extraction and include multiple biological replicates to account for variation in community-level composition. The quality filtering method should be carefully considered, as this step accounted for a large discrepancy in the number of OTUs inferred.

  1. The complete genome sequences of poxviruses isolated from a penguin and a pigeon in South Africa and comparison to other sequenced avipoxviruses

    PubMed Central

    2014-01-01

    Background: Two novel avipoxviruses from South Africa have been sequenced, one from a Feral Pigeon (Columba livia) (FeP2) and the other from an African penguin (Spheniscus demersus) (PEPV). We present a purpose-designed bioinformatics pipeline for analysis of next generation sequence data of avian poxviruses and compare the different avipoxviruses sequenced to date with specific emphasis on their evolution and gene content. Results: The FeP2 (282 kbp) and PEPV (306 kbp) genomes encode 271 and 284 open reading frames respectively and are more closely related to one another (94.4%) than to either fowlpox virus (FWPV) (85.3% and 84.0% respectively) or Canarypox virus (CNPV) (62.0% and 63.4% respectively). Overall, FeP2, PEPV and FWPV have syntenic gene arrangements; however, major differences exist throughout their genomes. The most striking difference between FeP2 and the FWPV-like avipoxviruses is a large deletion of ~16 kbp from the central region of the genome of FeP2 deleting a cc-chemokine-like gene, two Variola virus B22R orthologues, an N1R/p28-like gene and a V-type Ig domain family gene. FeP2 and PEPV both encode orthologues of vaccinia virus C7L and Interleukin 10. PEPV contains a 77 amino acid long orthologue of Ubiquitin sharing 97% amino acid identity to human ubiquitin. Conclusions: The genome sequences of FeP2 and PEPV have greatly added to the limited repository of genomic information available for the Avipoxvirus genus. In the comparison of FeP2 and PEPV to existing sequences, FWPV and CNPV, we have established insights into African avipoxvirus evolution. Our data support the independent evolution of these South African avipoxviruses from a common ancestral virus to FWPV and CNPV. PMID:24919868

  2. Protein sequence comparisons show that the 'pseudoproteases' encoded by poxviruses and certain retroviruses belong to the deoxyuridine triphosphatase family.

    PubMed Central

    McGeoch, D J

    1990-01-01

    Amino acid sequence comparisons show extensive similarities among the deoxyuridine triphosphatases (dUTPases) of Escherichia coli and of herpesviruses, and the 'protease-like' or 'pseudoprotease' sequences encoded by certain retroviruses in the oncovirus and lentivirus families and by poxviruses. These relationships suggest strongly that the 'pseudoproteases' actually are dUTPases, and have not arisen by duplication of an oncovirus protease gene as had been suggested. The herpesvirus dUTPase sequences differ from the others in that they are longer (about 370 residues, against around 140) and one conserved element ('Motif 3') is displaced relative to its position in the other sequences; a model involving internal duplication of the herpesvirus gene can account effectively for these observations. Sequences closely similar to Motif 3 are also found in phosphofructokinases, where they form part of the active site and fructose phosphate binding structure; thus these sequences may represent a class of structural element generally involved in phosphate transfer to and from glycosides. PMID:2165588

  3. 3D reconstruction software comparison for short sequences

    NASA Astrophysics Data System (ADS)

    Strupczewski, Adam; Czupryński, Błażej

    2014-11-01

    Large scale multiview reconstruction has recently become a very popular area of research. There are many open source tools that can be downloaded and run on a personal computer. However, there are few, if any, comparisons of the available software in terms of accuracy on small datasets that a single user can create. The typical datasets used for testing the software are archeological sites or cities, comprising thousands of images. This paper presents a comparison of currently available open source multiview reconstruction software for small datasets. It also compares the open source solutions with a simple structure from motion pipeline developed by the authors from scratch with the use of OpenCV and Eigen libraries.

  4. Molecular cloning and sequence analysis of complementary DNA encoding rat mammary gland medium-chain S-acyl fatty acid synthetase thio ester hydrolase

    SciTech Connect

    Safford, R.; de Silva, J.; Lucas, C.; Windust, J.H.C.; Shedden, J.; James, C.M.; Sidebottom, C.M.; Slabas, A.R.; Tombs, M.P.; Hughes, S.G.

    1987-03-10

    Poly(A)+ RNA from pregnant rat mammary glands was size-fractionated by sucrose gradient centrifugation, and fractions enriched in medium-chain S-acyl fatty acid synthetase thio ester hydrolase (MCH) were identified by in vitro translation and immunoprecipitation. A cDNA library was constructed, in pBR322, from enriched poly(A)+ RNA and screened with two oligonucleotide probes deduced from rat MCH amino acid sequence data. Cross-hybridizing clones were isolated and found to contain cDNA inserts ranging from approx. 1100 to 1550 base pairs (bp). A 1550-bp cDNA insert, from clone 43H09, was confirmed to encode MCH by hybrid-select translation/immunoprecipitation studies and by comparison of the amino acid sequence deduced from the DNA sequence of the clone to the amino acid sequence of the MCH peptides. Northern blot analysis revealed the size of the MCH mRNA to be 1500 nucleotides, and it is therefore concluded that the 1550-bp insert (including G·C tails) of clone 43H09 represents a full- or near-full-length copy of the MCH gene. The rat MCH sequence is the first reported sequence of a thioesterase from a mammalian source, but comparison of the deduced amino acid sequences of MCH and the recently published mallard duck medium-chain S-acyl fatty acid synthetase thioesterase reveals significant homology. In particular, a seven amino acid sequence containing the proposed active serine of the duck thioesterase is found to be perfectly conserved in rat MCH.

  5. Trichomonas vaginalis acidic phospholipase A2: isolation and partial amino acid sequence.

    PubMed

    Escobedo-Guajardo, Brenda L; González-Salazar, Francisco; Palacios-Corona, Rebeca; Torres de la Cruz, Víctor M; Morales-Vallarta, Mario; Mata-Cárdenas, Benito D; Garza-González, Jesús N; Rivera-Silva, Gerardo; Vargas-Villarreal, Javier

    2013-12-01

    Sexually transmitted diseases are a major cause of acute disease worldwide, and trichomoniasis is the most common and curable disease, generating more than 170 million cases annually worldwide. Trichomonas vaginalis is the causal agent of trichomoniasis and has the ability to destroy in vitro cell monolayers of the vaginal mucosa, where the phospholipases A2 (PLA2) have been reported as potential virulence factors. These enzymes have been partially characterized from the subcellular fraction S30 of pathogenic T. vaginalis strains. The main objective of this study was to purify a phospholipase A2 from T. vaginalis, make a partial characterization, obtain a partial amino acid sequence, and determine its enzymatic participation as a hemolytic factor causing lysis of erythrocytes. Trichomonas S30, RF30 and UFF30 sub-fractions from the GT-15 strain have the capacity to hydrolyze [2-(14)C-PA]-PC at pH 6.0. Proteins from the UFF30 sub-fraction were separated by affinity chromatography into two eluted fractions with detectable PLA2 activity. The EDTA-eluted fraction was analyzed by on-line HPLC-tandem mass spectrometry and two protein peaks were observed at 8.2 and 13 kDa. Peptide sequences were identified from the proteins present in the EDTA-eluted UFF30 fraction; bioinformatic analysis using Protein Link Global Server loaded with the T. vaginalis protein database suggests that the eluted peptides correspond to a putative ubiquitin protein in the 8.2 kDa fraction and a phospholipase preserved in the 13 kDa fraction. The EDTA-eluted fraction hydrolyzed [2-(14)C-PA]-PC and lysed erythrocytes from Sprague-Dawley rats in a time- and dose-dependent manner. The acidic hemolytic activity decreased by 84% with the addition of 100 μM of Rosenthal's inhibitor. PMID:24338313

  7. Attenuation of very virulent infectious bursal disease virus and comparison of full sequences of virulent and attenuated strains.

    PubMed

    Lazarus, D; Pasmanik-Chor, M; Gutter, B; Gallili, G; Barbakov, M; Krispel, S; Pitcovski, J

    2008-04-01

    A very virulent strain of infectious bursal disease virus (IBDVks) was isolated from the bursae of Fabricius of IBDV-affected broiler chickens. Following 43 serial passages in specific pathogen-free embryonated eggs, an attenuated strain was established (IBDVmb). Dosages of IBDVmb in the range 10(2) to 10(4) embryo infective dose of 50% were found to be safe and protective for commercial chicks. Chickens vaccinated with live vaccine containing IBDVmb responded with precipitating and type-specific neutralizing antibodies, and were immune to subsequent challenge with a very virulent IBDV. IBDVmb has been used as an attenuated vaccine throughout the world since 1993. A comparison of the full sequences of the virulent and attenuated strains (IBDVks and IBDVmb, respectively) revealed seven nucleotides that were different, four of them leading to changes in the amino-acid sequence. Comparison of the protein sequences of these strains with published sequences of very virulent and attenuated phenotypes led us to suggest that the novel differences responsible for the virulence of the Israeli strains are: residue 272 (VP2, very conserved site) and residue 527 (VP4), both in segment A, and in segment B (VP1) residues 96 and 161 (both conserved). Our study strengthens the possibility that more than one protein is involved in IBDV attenuation. In all reports, including ours, virulence was reduced without affecting antigenicity of the neutralizing epitopes in VP2. This could have practical implications for attenuated-vaccine development.

  8. Comparison and combination effects on antioxidant power of curcumin with gallic acid, ascorbic acid, and xanthone.

    PubMed

    Naksuriya, Ornchuma; Okonogi, Siriporn

    2015-04-01

    Curcumin has been extensively reported as a potential natural antioxidant. However, there have been no data comparing its activity with that of other natural antioxidants or describing its biological interactions with them. The aim of the present study was to investigate the antioxidant power of curcumin in comparison with three important natural antioxidants (gallic acid, ascorbic acid, and xanthone) in terms of free radical scavenging action, and to examine their combination effects on this activity. The results indicated that the activities of these compounds were dose-dependent. The 50% effective concentration (EC50) of curcumin was found to be 11 μg/mL. Curcumin showed significantly higher antioxidant activity than ascorbic acid and xanthone but less than gallic acid. Interestingly, curcumin showed a synergistic antioxidant effect when combined with gallic acid, whereas an antagonistic effect occurred when curcumin was combined with ascorbic acid or xanthone. These results suggest that the curcumin-gallic acid combination is a potential antioxidant mixture to be used in place of the individual substances, whereas the use of curcumin in combination with ascorbic acid or xanthone should be avoided.

  9. tax and rex Sequences of bovine leukaemia virus from globally diverse isolates: rex amino acid sequence more variable than tax.

    PubMed

    McGirr, K M; Buehring, G C

    2005-02-01

    Bovine leukaemia virus (BLV) is an important agricultural problem with high costs to the dairy industry. Here, we examine the variation of the tax and rex genes of BLV. The tax and rex genes share 420 bases and have overlapping reading frames. The tax gene encodes a protein that functions as a transactivator of the BLV promoter, is required for viral replication, acts on cellular promoters, and is responsible for oncogenesis. The rex gene product facilitates the export of viral mRNAs from the nucleus and regulates transcription. We have sequenced five new isolates of the tax/rex gene. We examined the five new and three previously published tax/rex DNA and predicted amino acid sequences of BLV isolates from cattle in representative regions worldwide. The highest variation among nucleic acid sequences for tax and rex was 7% and 5%, respectively; among predicted amino acid sequences for Tax and Rex, 9% and 11%, respectively. Significantly more nucleotide changes resulted in predicted amino acid changes in the rex gene than in the tax gene (P ≤ 0.0006). This variability is higher than previously reported for any region of the viral genome. This research may also have implications for the development of Tax-based vaccines. PMID:15702995

  10. A nucleic acid sequence-based amplification system for detection of Listeria monocytogenes hlyA sequences.

    PubMed Central

    Blais, B W; Turner, G; Sooknanan, R; Malek, L T

    1997-01-01

    A nucleic acid sequence-based amplification system primarily targeting mRNA from the Listeria monocytogenes hlyA gene was developed. This system enabled the detection of low numbers (< 10 CFU/g) of L. monocytogenes cells inoculated into a variety of dairy and egg products after 48 h of enrichment in modified listeria enrichment broth. PMID:8979357

  11. Identification of random nucleic acid sequence aberrations using dual capture probes which hybridize to different chromosome regions

    DOEpatents

    Lucas, Joe N.; Straume, Tore; Bogen, Kenneth T.

    1998-01-01

    A method is provided for detecting nucleic acid sequence aberrations using two immobilization steps. According to the method, a nucleic acid sequence aberration is detected by detecting nucleic acid sequences having both a first nucleic acid sequence type (e.g., from a first chromosome) and a second nucleic acid sequence type (e.g., from a second chromosome), the presence of the first and the second nucleic acid sequence type on the same nucleic acid sequence indicating the presence of a nucleic acid sequence aberration. In the method, immobilization of a first hybridization probe is used to isolate a first set of nucleic acids in the sample which contain the first nucleic acid sequence type. Immobilization of a second hybridization probe is then used to isolate a second set of nucleic acids from within the first set of nucleic acids which contain the second nucleic acid sequence type. The second set of nucleic acids are then detected, their presence indicating the presence of a nucleic acid sequence aberration.

  12. Identification of random nucleic acid sequence aberrations using dual capture probes which hybridize to different chromosome regions

    DOEpatents

    Lucas, J.N.; Straume, T.; Bogen, K.T.

    1998-03-24

    A method is provided for detecting nucleic acid sequence aberrations using two immobilization steps. According to the method, a nucleic acid sequence aberration is detected by detecting nucleic acid sequences having both a first nucleic acid sequence type (e.g., from a first chromosome) and a second nucleic acid sequence type (e.g., from a second chromosome), the presence of the first and the second nucleic acid sequence type on the same nucleic acid sequence indicating the presence of a nucleic acid sequence aberration. In the method, immobilization of a first hybridization probe is used to isolate a first set of nucleic acids in the sample which contain the first nucleic acid sequence type. Immobilization of a second hybridization probe is then used to isolate a second set of nucleic acids from within the first set of nucleic acids which contain the second nucleic acid sequence type. The second set of nucleic acids are then detected, their presence indicating the presence of a nucleic acid sequence aberration. 14 figs.

  13. The amino acid sequence of elephant (Elephas maximus) myoglobin and the phylogeny of Proboscidea.

    PubMed

    Dene, H; Goodman, M; Romero-Herrera, A E

    1980-02-13

    The complete amino acid sequence of skeletal myoglobin from the Asian elephant (Elephas maximus) is reported. The functional significance of variations seen when this sequence is compared with that of sperm whale myoglobin is explored in the light of the crystallographic model available for the latter molecule. The phylogenetic implications of the elephant myoglobin amino acid sequence are evaluated by using the maximum parsimony technique. A similar analysis is also presented which incorporates all of the proteins sequenced from the elephant. These results are discussed with respect to current views on proboscidean phylogeny.

  14. Close Sequence Comparisons are Sufficient to Identify Human cis-Regulatory Elements

    SciTech Connect

    Prabhakar, Shyam; Poulin, Francis; Shoukry, Malak; Afzal, Veena; Rubin, Edward M.; Couronne, Olivier; Pennacchio, Len A.

    2005-12-01

    Cross-species DNA sequence comparison is the primary method used to identify functional noncoding elements in human and other large genomes. However, little is known about the relative merits of evolutionarily close and distant sequence comparisons, due to the lack of a universal metric for sequence conservation, and also the paucity of empirically defined benchmark sets of cis-regulatory elements. To address this problem, we developed a general-purpose algorithm (Gumby) that detects slowly-evolving regions in primate, mammalian and more distant comparisons without requiring adjustment of parameters, and ranks conserved elements by P-value using Karlin-Altschul statistics. We benchmarked Gumby predictions against previously identified cis-regulatory elements at diverse genomic loci, and also tested numerous extremely conserved human-rodent sequences for transcriptional enhancer activity using reporter-gene assays in transgenic mice. Human regulatory elements were identified with acceptable sensitivity and specificity by comparison with 1-5 other eutherian mammals or 6 other simian primates. More distant comparisons (marsupial, avian, amphibian and fish) failed to identify many of the empirically defined functional noncoding elements. We derived an intuitive relationship between ancient and recent noncoding sequence conservation from whole genome comparative analysis, which explains some of these findings. Lastly, we determined that, in addition to strength of conservation, genomic location and/or density of surrounding conserved elements must also be considered in selecting candidate enhancers for testing at embryonic time points.
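
    The abstract states that conserved elements are ranked by P-value using Karlin-Altschul statistics. As a minimal sketch of that general formula (not Gumby's actual implementation), the P-value of a local score S over a search space of m x n positions is 1 - exp(-K m n exp(-lambda S)). The function name and the parameter values used in the example call are illustrative assumptions; real values of lambda and K depend on the scoring scheme.

```python
import math


def karlin_altschul_pvalue(score, m, n, lam, k):
    """P-value of a local alignment score under Karlin-Altschul statistics.

    score : raw alignment (conservation) score S
    m, n  : lengths of the two compared sequences (search space m * n)
    lam, k: scale parameters of the scoring system (assumed known here)
    """
    expected = k * m * n * math.exp(-lam * score)  # E-value: expected chance hits
    return -math.expm1(-expected)                  # 1 - exp(-E), stable for small E


# Hypothetical parameters, chosen only to show that higher scores give smaller P-values.
for s in (40, 50, 60):
    print(s, karlin_altschul_pvalue(s, m=1000, n=1000, lam=0.3, k=0.1))
```

    Ranking by P-value rather than raw score is what lets elements from comparisons at different evolutionary distances be placed on one scale.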

  15. Basal Murphy belt and Chilhowee Group -- Sequence stratigraphic comparison

    SciTech Connect

    Aylor, J.G. Jr. (Dept. of Geology)

    1994-03-01

    The lower Murphy belt in the central western Blue Ridge is interpreted to be correlative to the Early Cambrian Chilhowee Group of the westernmost Blue Ridge and Appalachian fold and thrust belt. Basal Murphy belt depositional sequence stratigraphy represents a second-order, type-2 transgressive systems tract initiated with deposition of lowstand turbidites of the Dean Formation. These transgressive deposits of the Nantahala and Brasstown Formations are interpreted as middle to outer continental shelf deposits. Cyclic and stacked third-order regressive, coarsening-upward sequences of the Nantahala Formation display an overall increase in feldspar content stratigraphically upsection. These transgressive siliciclastic deposits are interpreted to be conformably overlain by a carbonate highstand systems tract of the Murphy Marble. Palinspastic reconstruction indicates that the Nantahala and Brasstown Formations possibly represent a basinward extension of a siliciclastic wedge up to 3 km thick. The wedge tapers to the southwest along the strike of the Murphy belt at 10° and thins northwestward to 2 km in the Tennessee depocenter, where it is represented by the Chilhowee Group. The Murphy belt basin is believed to represent a transitional rift-to-drift facies deposited on the lower plate of the southern Blue Ridge rift zone.

  16. Reconstruction of an ancestral Yersinia pestis genome and comparison with an ancient sequence

    PubMed Central

    2015-01-01

    Background: We propose the computational reconstruction of a whole bacterial ancestral genome at the nucleotide scale, and its validation by a sequence of ancient DNA. This rare possibility is offered by an ancient sequence of the late Middle Ages plague agent. It has been hypothesized to be ancestral to extant Yersinia pestis strains based on the pattern of nucleotide substitutions. But the dynamics of indels, duplications, insertion sequences and rearrangements has impacted all genomes much more than the substitution process, which makes the ancestral reconstruction task challenging. Results: We use a set of gene families from 13 Yersinia species, construct reconciled phylogenies for all of them, and determine gene orders in ancestral species. Gene trees integrate information from the sequence, the species tree and gene order. We reconstruct ancestral sequences for ancestral genic and intergenic regions, providing nearly a complete genome sequence for the ancestor, containing a chromosome and three plasmids. Conclusion: The comparison of the ancestral and ancient sequences provides a unique opportunity to assess the quality of ancestral genome reconstruction methods. But the quality of the sequencing and assembly of the ancient sequence can also be questioned by this comparison. PMID:26450112

  17. Facile Analysis and Sequencing of Linear and Branched Peptide Boronic Acids by MALDI Mass Spectrometry

    PubMed Central

    Crumpton, Jason; Zhang, Wenyu; Santos, Webster

    2011-01-01

    Interest in peptides incorporating boronic acid moieties is increasing due to their potential as therapeutics/diagnostics for a variety of diseases such as cancer. The utility of peptide boronic acids may be expanded with access to vast libraries that can be deconvoluted rapidly and economically. Unfortunately, current detection protocols using mass spectrometry are laborious and confounded by boronic acid trimerization, which requires time consuming analysis of dehydration products. These issues are exacerbated when the peptide sequence is unknown, as with de novo sequencing, and especially when multiple boronic acid moieties are present. Thus, a rapid, reliable and simple method for peptide identification is of utmost importance. Herein, we report the identification and sequencing of linear and branched peptide boronic acids containing up to five boronic acid groups by matrix-assisted laser desorption/ionization mass spectrometry (MALDI-MS). Protocols for preparation of pinacol boronic esters were adapted for efficient MALDI analysis of peptides. Additionally, a novel peptide boronic acid detection strategy was developed in which 2,5-dihydroxybenzoic acid (DHB) served as both matrix and derivatizing agent in a convenient, in situ, on-plate esterification. Finally, we demonstrate that DHB-modified peptide boronic acids from a single bead can be analyzed by MALDI-MSMS analysis, validating our approach for the identification and sequencing of branched peptide boronic acid libraries. PMID:21449540

  18. Nucleotide and derived amino acid sequences of a cDNA coding for pre-uteroglobin from the lung of the hare (Lepus capensis).

    PubMed Central

    López de Haro, M S; Nieto, A

    1986-01-01

    An almost full-length cDNA coding for pre-uteroglobin from hare lung was cloned and sequenced. The derived amino acid sequence indicated that hare pre-uteroglobin contained 91 amino acids, including a signal peptide of 21 residues. Comparison of the nucleotide sequence of hare pre-uteroglobin cDNA with that previously reported for the rabbit gene indicated five silent point substitutions and six others leading to amino acid changes in the coding region. The untranslated regions of both pre-uteroglobin mRNAs were very similar. The amino acid changes observed are discussed in relation to the different progesterone-binding abilities of both homologous proteins. PMID:3019311

  19. Triose phosphate isomerase from the coelacanth. An approach to the rapid determination of an amino acid sequence with small amounts of material.

    PubMed

    Kolb, E; Harris, J I; Bridgen, J

    1974-02-01

    The preparation and purification of cyanogen bromide fragments from [(14)C]carboxymethylated coelacanth triose phosphate isomerase is presented. The automated sequencing of these fragments, the lysine-blocked tryptic peptides derived from them, and also of the intact protein, is described. Combination with results from manual sequence analysis has given the 247-residue amino acid sequence of coelacanth triose phosphate isomerase in 4 months, by using 100 mg of enzyme. (Two small adjacent peptides were placed by homology with the rabbit enzyme.) Comparison of this sequence with that of the rabbit muscle enzyme shows that 207 (84%) of the residues are identical. This slow rate of evolutionary change (corresponding to two amino acid substitutions per 100 residues per 100 million years) is similar to that found for glyceraldehyde 3-phosphate dehydrogenase. The reliability of sequence information obtained by automated methods is discussed.
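
    The identity figure quoted above is simple arithmetic on the two numbers in the abstract (207 identical residues out of 247). The short sketch below reproduces it and also expresses the remaining differences per 100 residues; the function names are illustrative, and no divergence time is assumed or derived.

```python
def percent_identity(identical, total):
    """Percentage of aligned residues that are identical."""
    return 100.0 * identical / total


def differences_per_100(identical, total):
    """Observed residue differences normalised to 100 aligned residues."""
    return 100.0 * (total - identical) / total


# Values taken from the abstract: 207 of 247 residues are identical.
print(round(percent_identity(207, 247)))  # 84
print(differences_per_100(207, 247))
```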

  20. Computer Simulation of the Determination of Amino Acid Sequences in Polypeptides

    ERIC Educational Resources Information Center

    Daubert, Stephen D.; Sontum, Stephen F.

    1977-01-01

    Describes a computer program that generates a random string of amino acids and guides the student in determining the correct sequence of a given protein by using experimental analytic data for that protein. (MLH)
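
    The first step described, generating a random string of amino acids, might look like the sketch below in a modern language. This is not the 1977 program: the function names, the use of one-letter codes, and the composition table (a stand-in for one kind of analytic data a student could be given) are all assumptions made for illustration.

```python
import random

# One-letter codes of the 20 standard amino acids.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def random_peptide(length, seed=None):
    """Return a random peptide of the given length (seed makes it repeatable)."""
    rng = random.Random(seed)
    return "".join(rng.choice(AMINO_ACIDS) for _ in range(length))


def composition(peptide):
    """Residue counts, the kind of data an amino acid analysis would report."""
    return {aa: peptide.count(aa) for aa in sorted(set(peptide))}


peptide = random_peptide(12, seed=1)
print(composition(peptide))  # the student sees the composition, not the order
```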

  1. α-Amylase of Clostridium thermosulfurogenes EM1: Nucleotide sequence of the gene, processing of the enzyme, and comparison to other α-amylases

    SciTech Connect

    Bahl, H.; Burchhardt, G.; Spreinat, A.; Haeckel, K.; Wienecke, A.; Antranikian, G.; Schmidt, B.

    1991-05-01

    The nucleotide sequence of the α-amylase gene (amyA) from Clostridium thermosulfurogenes EM1 cloned in Escherichia coli was determined. The reading frame of the gene consisted of 2,121 bp. Comparison of the DNA sequence data with the amino acid sequence of the N terminus of the purified secreted protein of C. thermosulfurogenes EM1 suggested that the α-amylase is translated from mRNA as a secretory precursor with a signal peptide of 27 amino acid residues. The deduced amino acid sequence of the mature α-amylase contained 679 residues, resulting in a protein with a molecular mass of 75,112 Da. In E. coli the enzyme was transported to the periplasmic space and the signal peptide was cleaved at exactly the same site between two alanine residues. Comparison of the amino acid sequence of the C. thermosulfurogenes EM1 α-amylase with those from other bacterial and eukaryotic α-amylases showed several homologous regions, probably in the enzymatically functioning regions. The tentative Ca2+-binding site (consensus region I) of this Ca2+-independent enzyme showed only limited homology. The deduced amino acid sequence of a second obviously truncated open reading frame showed significant homology to the malG gene product of E. coli. Comparison of the α-amylase gene region of C. thermosulfurogenes EM1 (DSM3896) with the β-amylase gene region of C. thermosulfurogenes (ATCC 33743) indicated that both genes have been exchanged with each other at identical sites in the chromosomes of these strains.

  2. Naked but not Hairless: the pitfalls of analyses of molecular adaptation based on few genome sequence comparisons.

    PubMed

    Delsuc, Frédéric; Tilak, Marie-Ka

    2015-02-20

    The naked mole-rat (Heterocephalus glaber) is the only rodent species that naturally lacks fur. Genome sequencing of this atypical rodent species recently shed light on a number of its morphological and physiological adaptations. More specifically, its hairless phenotype has been traced back to a single amino acid change (C397W) in the hair growth associated (HR) protein (or Hairless). By considering the available species diversity, we show that this specific position is in fact variable across mammals, including in the horse that was misleadingly reported to have the ancestral Cysteine. Moreover, by sequencing the corresponding HR exon in additional rodent species, we demonstrate that the C397W substitution is actually not a peculiarity of the naked mole-rat. Instead, this specific amino acid substitution is present in all hystricognath rodents investigated, which are all fully furred, including the naked mole-rat's closest relative, the Damaraland mole-rat (Fukomys damarensis). Overall, we found no statistical correlation between amino acid changes at position 397 of the HR protein and reduced pilosity across the mammalian phylogeny. This demonstrates that this single amino acid change does not explain the naked mole-rat hairless phenotype. Our case study calls for caution before making strong claims regarding the molecular basis of phenotypic adaptation based on the screening of specific amino acid substitutions using only a few model species in genome sequence comparisons. It also exposes the more general problem of the dilution of essential information in the supplementary material of genome papers, thereby increasing the probability that misleading results will escape the scrutiny of editors, reviewers, and ultimately readers.

  4. The amino acid sequence of monal pheasant lysozyme and its activity.

    PubMed

    Araki, T; Matsumoto, T; Torikata, T

    1998-10-01

    The amino acid sequence of monal pheasant lysozyme and its activity were analyzed. Carboxymethylated lysozyme was digested with trypsin and the resulting peptides were sequenced. The established amino acid sequence had one amino acid substitution at position 102 (Arg to Gly) compared with Indian peafowl lysozyme and four amino acid substitutions at positions 3 (Phe to Tyr), 15 (His to Leu), 41 (Gln to His), and 121 (Gln to His) compared with chicken lysozyme. Analysis of the time-courses of reaction using N-acetylglucosamine pentamer as a substrate showed a difference in binding free energy change (-0.4 kcal/mol) at subsite A between monal pheasant and Indian peafowl lysozyme. This was assumed to be caused by the amino acid substitution at subsite A with loss of a positive charge at position 102 (Arg102 to Gly). PMID:9836434
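
    To give a sense of scale for the -0.4 kcal/mol difference, the standard relation between a free energy difference and a ratio of binding constants, ratio = exp(-ddG / RT), can be evaluated as below. The temperature (298.15 K) is an assumption, since the abstract does not state one, and the sign convention (a negative value meaning tighter binding) is likewise assumed.

```python
import math

R_KCAL_PER_MOL_K = 1.987e-3  # gas constant in kcal/(mol*K)


def binding_ratio(ddg_kcal_per_mol, temperature_k=298.15):
    """Ratio of binding constants implied by a free energy difference.

    Uses ratio = exp(-ddG / (R * T)); temperature defaults to an
    assumed 25 degrees C because the abstract gives none.
    """
    return math.exp(-ddg_kcal_per_mol / (R_KCAL_PER_MOL_K * temperature_k))


print(binding_ratio(-0.4))  # roughly a twofold difference at subsite A
```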

  5. cDNA-derived amino acid sequences of myoglobins from nine species of whales and dolphins.

    PubMed

    Iwanami, Kentaro; Mita, Hajime; Yamamoto, Yasuhiko; Fujise, Yoshihiro; Yamada, Tadasu; Suzuki, Tomohiko

    2006-10-01

    We determined the myoglobin (Mb) cDNA sequences of nine cetaceans, of which six are the first reports of Mb sequences: sei whale (Balaenoptera borealis), Bryde's whale (Balaenoptera edeni), pygmy sperm whale (Kogia breviceps), Stejneger's beaked whale (Mesoplodon stejnegeri), Longman's beaked whale (Indopacetus pacificus), and melon-headed whale (Peponocephala electra), and three confirm the previously determined chemical amino acid sequences: sperm whale (Physeter macrocephalus), common minke whale (Balaenoptera acutorostrata) and pantropical spotted dolphin (Stenella attenuata). We found two types of Mb in the skeletal muscle of pantropical spotted dolphin: Mb I with the same amino acid sequence as that deposited in the protein database, and Mb II, which differs at two amino acid residues compared with Mb I. Using an alignment of the amino acid or cDNA sequences of cetacean Mb, we constructed a phylogenetic tree by the NJ method. Clustering of cetacean Mb amino acid and cDNA sequences essentially follows the classical taxonomy of cetaceans, suggesting that Mb sequence data is valid for classification of cetaceans at least to the family level. PMID:16962803
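
    A neighbor-joining tree is built from a matrix of pairwise distances computed over an alignment. The sketch below shows one common choice, the uncorrected p-distance (fraction of differing aligned positions). The abstract does not say which distance measure the authors used, and the short sequences in the example are invented placeholders, not real myoglobin data.

```python
from itertools import combinations


def p_distance(seq_a, seq_b):
    """Fraction of aligned positions at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(seq_a, seq_b)) / len(seq_a)


def distance_matrix(aligned):
    """Symmetric pairwise p-distance table for a dict of name -> aligned sequence."""
    names = sorted(aligned)
    dist = {(name, name): 0.0 for name in names}
    for first, second in combinations(names, 2):
        d = p_distance(aligned[first], aligned[second])
        dist[(first, second)] = dist[(second, first)] = d
    return dist


# Hypothetical toy alignment, for illustration only.
toy = {"taxon_a": "MGLSD", "taxon_b": "MGLSE", "taxon_c": "MALSE"}
print(distance_matrix(toy)[("taxon_a", "taxon_c")])
```

    The resulting table is the input a neighbor-joining routine would take to join the closest pairs iteratively.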

  6. Studies on monotreme proteins. VII. Amino acid sequence of myoglobin from the platypus, Ornithorhynchus anatinus.

    PubMed

    Fisher, W K; Thompson, E O

    1976-03-01

    Myoglobin isolated from skeletal muscle of the platypus contains 153 amino acid residues. The complete amino acid sequence has been determined following cleavage with cyanogen bromide and further digestion of the four fragments with trypsin, chymotrypsin, pepsin and thermolysin. Sequences of the purified peptides were determined by the dansyl-Edman procedure. The amino acid sequence showed 25 differences from human myoglobin and 24 from kangaroo myoglobin. Amino acid sequences in myoglobins are more conserved than sequences in the alpha- and beta-globin chains, and platypus myoglobin shows a similar number of variations in sequence to kangaroo myoglobin when compared with myoglobin of other species. The date of divergence of the platypus from other mammals was estimated at 102 +/- 31 million years, based on the number of amino acid differences between species and allowing for mutations during the evolutionary period. This estimate differs widely from the estimate given by similar treatment of the alpha- and beta-chain sequences and a constant rate of mutation of globin chains is not supported. PMID:962722

  7. Secure distributed genome analysis for GWAS and sequence comparison computation

    PubMed Central

    2015-01-01

    Background: The rapid increase in the availability and volume of genomic data makes significant advances in biomedical research possible, but sharing of genomic data poses challenges due to the highly sensitive nature of such data. To address the challenges, a competition for secure distributed processing of genomic data was organized by the iDASH research center. Methods: In this work we propose techniques for securing computation with real-life genomic data for minor allele frequency and chi-squared statistics computation, as well as distance computation between two genomic sequences, as specified by the iDASH competition tasks. We put forward novel optimizations, including a generalization of a version of mergesort, which might be of independent interest. Results: We provide implementation results of our techniques based on secret sharing that demonstrate practicality of the suggested protocols and also report on performance improvements due to our optimization techniques. Conclusions: This work describes our techniques, findings, and experimental results developed and obtained as part of the iDASH 2015 research competition to secure real-life genomic computations, and shows the feasibility of securely computing with genomic data in practice. PMID:26733307
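
    For orientation, the two statistics named in the Methods can be computed in the clear as sketched below. This is only the plaintext arithmetic that a secure protocol would have to reproduce; it contains no secret sharing and is not the authors' method. The genotype encoding (0, 1 or 2 copies of the alternate allele per individual) and the 2x2 case/control allele table are assumptions.

```python
def minor_allele_frequency(genotypes):
    """MAF from per-individual alternate-allele counts (each 0, 1 or 2)."""
    alt_alleles = sum(genotypes)
    total_alleles = 2 * len(genotypes)   # diploid: two alleles per individual
    freq = alt_alleles / total_alleles
    return min(freq, 1.0 - freq)         # the rarer allele defines the MAF


def chi_squared_2x2(a, b, c, d):
    """Pearson chi-squared statistic for the 2x2 table [[a, b], [c, d]].

    Rows could be case/control and columns the two alleles (assumed layout).
    """
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0                       # a zero margin carries no signal
    return n * (a * d - b * c) ** 2 / denom


print(minor_allele_frequency([0, 0, 1, 0]))
print(chi_squared_2x2(10, 20, 20, 10))
```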

  8. Sequence analysis of frog alpha B-crystallin cDNA: sequence homology and evolutionary comparison of alpha A, alpha B and heat shock proteins.

    PubMed

    Lu, S F; Pan, F M; Chiou, S H

    1995-11-22

    alpha-Crystallin is a major lens protein present in the lenses of all vertebrate species. Recent studies have revealed that bovine alpha-crystallins possess genuine chaperone activity similar to small heat-shock proteins. In order to facilitate the determination of the primary sequence of amphibian alpha B-crystallin, cDNA encoding alpha B subunit chain was amplified using a new "Rapid Amplification of cDNA Ends" (RACE) protocol of Polymerase Chain Reaction (PCR). PCR-amplified product corresponding to alpha B subunit was then subcloned into pUC18 vector and transformed into E. coli strain JM109. Plasmids purified from the positive clones were prepared for nucleotide sequencing by the automatic fluorescence-based dideoxynucleotide chain-termination method. Sequencing more than five clones containing DNA inserts coding for alpha B-crystallin subunit constructed only one complete full-length reading frame of 522 base pairs similar to that of alpha A subunit, covering a deduced protein sequence of 173 amino acids including the universal translation-initiating methionine. The frog alpha B crystallin shows 69, 66 and 56% whereas alpha A crystallin shows 83, 81 and 69% sequence similarity to the homologous chains of bovine, chicken and dogfish, respectively, revealing a more divergent structural relationship among these alpha B subunits as compared to alpha A subunits. Structural analysis and comparison of alpha A- and alpha B-crystallin subunits from eye lenses of different classes of vertebrates also shed some light on the evolutionary relatedness between alpha B/alpha A crystallins and the small heat-shock proteins.

  9. Multiple Genome Sequences of Important Beer-Spoiling Lactic Acid Bacteria

    PubMed Central

    Geissler, Andreas J.; Vogel, Rudi F.

    2016-01-01

    Seven strains of important beer-spoiling lactic acid bacteria were sequenced using single-molecule real-time sequencing. Complete genomes were obtained for strains of Lactobacillus paracollinoides, Lactobacillus lindneri, and Pediococcus claussenii. The analysis of these genomes emphasizes the role of plasmids as the genomic foundation of beer-spoiling ability. PMID:27795248

  10. The nucleotide sequence of the mitochondrial DNA molecule of the grey seal, Halichoerus grypus, and a comparison with mitochondrial sequences of other true seals.

    PubMed

    Arnason, U; Gullberg, A; Johnsson, E; Ledje, C

    1993-10-01

    The sequence of the mtDNA of the grey seal, Halichoerus grypus, was determined. The length of the molecule was 16,797 base pairs. The organization of the molecule conformed with that of other eutherian mammals but the control region was unusually long due to the presence of two types of repeated motifs. The grey seal and the previously reported harbor seal, Phoca vitulina, belong to different but closely related genera of family Phocidae, true (or earless) seals. In order to determine the degree of differences that may occur between mtDNAs of closely related mammalian genera, the 2 rRNA genes, the 13 peptide coding genes, and the 22 tRNA genes of the 2 species were compared. Total nucleotide difference in the peptide coding genes was 2.0-6.1%. The range of conservative difference was 0.0-1.5%. In the inferred peptide sequences the amino acid difference was 0.0-4.5%, and the difference with respect to chemical properties of amino acids was 0.0-3.0%. A gene that showed a limited degree of difference in one mode of comparison did not necessarily show a corresponding limited difference in another mode. The ratio for differences in codon positions 1, 2, and 3 was approximately 2.7:1:16. The corresponding ratio for conservative differences was approximately 1.8:1.1:1. The evolutionary separation of the two species was calculated to have taken place 2-2.5 million years ago. This dating gives the figure approximately 8 × 10^-9 as the mean rate of substitution per site and year in the entire mtDNA molecule. Comparison with the cytochrome b gene of the Hawaiian monk seal and the Weddell seal suggested that the lineage of these two species and that of the grey and harbor seals separated approximately 8 million years ago. PMID:8308902

  11. Genome-Wide SNP Calling from Genotyping by Sequencing (GBS) Data: A Comparison of Seven Pipelines and Two Sequencing Technologies.

    PubMed

    Torkamaneh, Davoud; Laroche, Jérôme; Belzile, François

    2016-01-01

    Next-generation sequencing (NGS) has revolutionized plant and animal research in many ways including new methods of high throughput genotyping. Genotyping-by-sequencing (GBS) has been demonstrated to be a robust and cost-effective genotyping method capable of producing thousands to millions of SNPs across a wide range of species. Undoubtedly, the greatest barrier to its broader use is the challenge of data analysis. Herein we describe a comprehensive comparison of seven GBS bioinformatics pipelines developed to process raw GBS sequence data into SNP genotypes. We compared five pipelines requiring a reference genome (TASSEL-GBS v1 & v2, Stacks, IGST, and Fast-GBS) and two de novo pipelines that do not require a reference genome (UNEAK and Stacks). Using Illumina sequence data from a set of 24 re-sequenced soybean lines, we performed SNP calling with these pipelines and compared the GBS SNP calls with the re-sequencing data to assess their accuracy. The number of SNPs called without a reference genome was lower (13k to 24k) than with a reference genome (25k to 54k SNPs) while accuracy was high (92.3 to 98.7%) for all but one pipeline (TASSEL-GBSv1, 76.1%). Among pipelines offering a high accuracy (>95%), Fast-GBS called the greatest number of polymorphisms (close to 35,000 SNPs + Indels) and yielded the highest accuracy (98.7%). Using Ion Torrent sequence data for the same 24 lines, we compared the performance of Fast-GBS with that of TASSEL-GBSv2. It again called more polymorphisms (25.8K vs 22.9K) and these proved more accurate (95.2 vs 91.1%). Typically, SNP catalogues called from the same sequencing data using different pipelines resulted in highly overlapping SNP catalogues (79-92% overlap). In contrast, overlap between SNP catalogues obtained using the same pipeline but different sequencing technologies was less extensive (~50-70%). PMID:27547936

  12. Genome-Wide SNP Calling from Genotyping by Sequencing (GBS) Data: A Comparison of Seven Pipelines and Two Sequencing Technologies

    PubMed Central

    Torkamaneh, Davoud; Laroche, Jérôme; Belzile, François

    2016-01-01

    Next-generation sequencing (NGS) has revolutionized plant and animal research in many ways including new methods of high throughput genotyping. Genotyping-by-sequencing (GBS) has been demonstrated to be a robust and cost-effective genotyping method capable of producing thousands to millions of SNPs across a wide range of species. Undoubtedly, the greatest barrier to its broader use is the challenge of data analysis. Herein we describe a comprehensive comparison of seven GBS bioinformatics pipelines developed to process raw GBS sequence data into SNP genotypes. We compared five pipelines requiring a reference genome (TASSEL-GBS v1 & v2, Stacks, IGST, and Fast-GBS) and two de novo pipelines that do not require a reference genome (UNEAK and Stacks). Using Illumina sequence data from a set of 24 re-sequenced soybean lines, we performed SNP calling with these pipelines and compared the GBS SNP calls with the re-sequencing data to assess their accuracy. The number of SNPs called without a reference genome was lower (13k to 24k) than with a reference genome (25k to 54k SNPs) while accuracy was high (92.3 to 98.7%) for all but one pipeline (TASSEL-GBSv1, 76.1%). Among pipelines offering a high accuracy (>95%), Fast-GBS called the greatest number of polymorphisms (close to 35,000 SNPs + Indels) and yielded the highest accuracy (98.7%). Using Ion Torrent sequence data for the same 24 lines, we compared the performance of Fast-GBS with that of TASSEL-GBSv2. It again called more polymorphisms (25.8K vs 22.9K) and these proved more accurate (95.2 vs 91.1%). Typically, SNP catalogues called from the same sequencing data using different pipelines resulted in highly overlapping SNP catalogues (79–92% overlap). In contrast, overlap between SNP catalogues obtained using the same pipeline but different sequencing technologies was less extensive (~50–70%). PMID:27547936
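
    The overlap figures quoted in this abstract can be illustrated with a small sketch. The abstract does not say how overlap was normalized, so the choice below (shared positions divided by the union of both catalogues) is an assumption, as are the function name and the toy chromosome/position tuples.

```python
def catalogue_overlap(snps_a, snps_b):
    """Percentage of SNP positions shared by two catalogues, relative to
    the union of both (an assumed definition, not taken from the paper)."""
    a, b = set(snps_a), set(snps_b)
    union = a | b
    if not union:
        return 0.0
    return 100.0 * len(a & b) / len(union)


if __name__ == "__main__":
    # Hypothetical (chromosome, position) calls from two pipelines.
    pipeline_1 = {("Gm01", 1045), ("Gm01", 2210), ("Gm02", 512), ("Gm03", 77)}
    pipeline_2 = {("Gm01", 1045), ("Gm01", 2210), ("Gm02", 512), ("Gm04", 901)}
    print(catalogue_overlap(pipeline_1, pipeline_2))
```

    Dividing by the smaller catalogue instead of the union would give a higher percentage for the same data, so the normalization should be stated whenever such figures are compared.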

  13. Draft Genome Sequences of Two Novel Acidimicrobiaceae Members from an Acid Mine Drainage Biofilm Metagenome

    PubMed Central

    Pinto, Ameet J.; Sharp, Jonathan O.; Yoder, Michael J.

    2016-01-01

    Bacteria belonging to the family Acidimicrobiaceae are frequently encountered in heavy metal-contaminated acidic environments. However, their phylogenetic and metabolic diversity is poorly resolved. We present draft genome sequences of two novel and phylogenetically distinct Acidimicrobiaceae members assembled from an acid mine drainage biofilm metagenome. PMID:26769942

  14. Complete Genome Sequence of Streptomyces clavuligerus F613-1, an Industrial Producer of Clavulanic Acid.

    PubMed

    Cao, Guangxiang; Zhong, Chuanqing; Zong, Gongli; Fu, Jiafang; Liu, Zhong; Zhang, Guimin; Qin, Ronghuo

    2016-01-01

    Streptomyces clavuligerus strain F613-1 is an industrial strain with high-yield clavulanic acid production. In this study, the complete genome sequence of S. clavuligerus strain F613-1 was determined, including one linear chromosome and one linear plasmid, carrying numerous sets of genes involved in the biosynthesis of clavulanic acid.

  15. Complete Genome Sequence of Streptomyces clavuligerus F613-1, an Industrial Producer of Clavulanic Acid.

    PubMed

    Cao, Guangxiang; Zhong, Chuanqing; Zong, Gongli; Fu, Jiafang; Liu, Zhong; Zhang, Guimin; Qin, Ronghuo

    2016-01-01

    Streptomyces clavuligerus strain F613-1 is an industrial strain with high-yield clavulanic acid production. In this study, the complete genome sequence of S. clavuligerus strain F613-1 was determined, including one linear chromosome and one linear plasmid, carrying numerous sets of genes involved in the biosynthesis of clavulanic acid. PMID:27660792

  16. Complete Genome Sequence of Streptomyces clavuligerus F613-1, an Industrial Producer of Clavulanic Acid

    PubMed Central

    Zhong, Chuanqing; Zong, Gongli; Fu, Jiafang; Liu, Zhong; Zhang, Guimin; Qin, Ronghuo

    2016-01-01

    Streptomyces clavuligerus strain F613-1 is an industrial strain with high-yield clavulanic acid production. In this study, the complete genome sequence of S. clavuligerus strain F613-1 was determined, including one linear chromosome and one linear plasmid, carrying numerous sets of genes involved in the biosynthesis of clavulanic acid. PMID:27660792

  17. Circular Helix-Like Curve: An Effective Tool of Biological Sequence Analysis and Comparison

    PubMed Central

    Li, Yushuang

    2016-01-01

    This paper constructed a novel injection from a DNA sequence to a 3D graph, named circular helix-like curve (CHC). The presented graphical representation is available for visualizing characterizations of a single DNA sequence and identifying similarities and differences among several DNAs. A 12-dimensional vector extracted from CHC, as a numerical characterization of CHC, was applied to analyze phylogenetic relationships of 11 species, 74 ribosomal RNAs, 48 Hepatitis E viruses, and 18 eutherian mammals, respectively. Successful experiments illustrated that CHC is an effective tool of biological sequence analysis and comparison. PMID:27403205

  18. Circular Helix-Like Curve: An Effective Tool of Biological Sequence Analysis and Comparison.

    PubMed

    Li, Yushuang; Xiao, Wenli

    2016-01-01

    This paper constructed a novel injection from a DNA sequence to a 3D graph, named circular helix-like curve (CHC). The presented graphical representation is available for visualizing characterizations of a single DNA sequence and identifying similarities and differences among several DNAs. A 12-dimensional vector extracted from CHC, as a numerical characterization of CHC, was applied to analyze phylogenetic relationships of 11 species, 74 ribosomal RNAs, 48 Hepatitis E viruses, and 18 eutherian mammals, respectively. Successful experiments illustrated that CHC is an effective tool of biological sequence analysis and comparison. PMID:27403205

  19. Parvalbumins from coelacanth muscle. III. Amino acid sequence of the major component.

    PubMed

    Jauregui-Adell, J; Pechere, J F

    1978-09-26

    The primary structure of the major parvalbumin (pI = 4.52) from coelacanth muscle (Latimeria chalumnae) has been determined. Sequence analysis of the tryptic peptides, in some cases obtained with beta-trypsin, accounts for the total amino acid content of the protein. Chymotryptic peptides provide appropriate sequence overlaps, to complete the localization of the tryptic peptides. Examination of the amino acid sequence of this protein shows the typical structure of a beta-parvalbumin. Its position in the dendrogram of related calcium-binding proteins corresponds to that usually accepted for crossopterygians.

  20. Comparison and trend study on acidity and acidic buffering capacity of particulate matter in China

    NASA Astrophysics Data System (ADS)

    Ren, Lihong; Wang, Wei; Wang, Qingyue; Yang, XiaoYang; Tang, Dagang

    2011-12-01

    The acidity of about 2000 particulate matter samples from aircraft and ground-based monitoring is analyzed by a method similar to that used for soil acidity determination. The ground-based samples were collected at about 50 urban or background sites in northern and southern China. Moreover, the acidic buffering capacity of those samples is also analyzed by the method of micro acid-base titration. Results indicate that the acidity level is lower in most northern areas than in the south, and the acidic buffering capacity shows the inverse tendency. This is the most important reason why acid precipitation pollution is much more serious in southern China than in northern China. The acidity increases and the acidic buffering capacity drops with decreasing particle size, indicating that fine particles are the main factor influencing acidification. The ionic results show that Ca salt is the main alkaline substance in particulate matter, whereas the acidification of particulate matter is due to the SO2 and NOx emitted from fossil fuel burning. Of these, coal burning is the main contributor of SO2; however, the contribution of NOx emitted from fuel burning in motor vehicles has increased in recent years. By comparison of the experimental results during the past 20 years, it can be concluded that the acid precipitation of particulate matter has not been well controlled, and it even shows an increasing tendency in China lately. The acid precipitation of particulate matter has begun to occur frequently in parts of the northern areas. Multiple regression analysis indicates that the coefficient value of the ions is lowest at the urban sites and highest at the regional sites, whereas the aircraft measurement results are intermediate between those two kinds of sites.

  1. Sequencing and computational analysis of complete genome sequences of Citrus yellow mosaic badna virus from acid lime and pummelo.

    PubMed

    Borah, Basanta K; Johnson, A M Anthony; Sai Gopal, D V R; Dasgupta, Indranil

    2009-08-01

    Citrus yellow mosaic badna virus (CMBV), a member of the Family Caulimoviridae, Genus Badnavirus, is the causative agent of Citrus mosaic disease in India. Although the virus has been detected in several citrus species, only two full-length genomes, one each from Sweet orange and Rangpur lime, are available in publicly accessible databases. In order to obtain a better understanding of the genetic variability of the virus in other citrus mosaic-affected citrus species, we performed the cloning and sequence analysis of complete genomes of CMBV from two additional citrus species, Acid lime and Pummelo. We show that CMBV genomes from the two hosts share high homology with previously reported CMBV sequences and hence conclude that the new isolates represent variants of the virus present in these species. Based on in silico sequence analysis, we predict the possible function of the protein encoded by one of the five ORFs.

  2. Amino acid sequence of a new mitochondrially synthesized proteolipid of the ATP synthase of Saccharomyces cerevisiae.

    PubMed Central

    Velours, J; Esparza, M; Hoppe, J; Sebald, W; Guerin, B

    1984-01-01

    The purification and the amino acid sequence of a proteolipid translated on ribosomes in yeast mitochondria is reported. This protein, which is a subunit of the ATP synthase, was purified by extraction with chloroform/methanol (2/1) and subsequent chromatography on phosphocellulose and reverse phase h.p.l.c. A mol. wt. of 5500 was estimated by chromatography on Bio-Gel P-30 in 80% formic acid. The complete amino acid sequence of this protein was determined by automated solid phase Edman degradation of the whole protein and of fragments obtained after cleavage with cyanogen bromide. The sequence analysis indicates a length of 48 amino acid residues. The calculated mol. wt. of 5870 corresponds to the value found by gel chromatography. This polypeptide contains three basic residues and no negatively charged side chain. The three basic residues are clustered at the C terminus. The primary structure of this protein is in full agreement with the predicted amino acid sequence of the putative polypeptide encoded by the mitochondrial aap1 gene recently discovered in Saccharomyces cerevisiae. Moreover, this protein shows 50% homology with the amino acid sequence of a putative polypeptide encoded by an unidentified reading frame also discovered near the mitochondrial ATPase subunit 6 gene in Aspergillus nidulans. PMID:6323165

  3. Complete amino acid sequence and structure characterization of the taste-modifying protein, miraculin.

    PubMed

    Theerasilp, S; Hitotsuya, H; Nakajo, S; Nakaya, K; Nakamura, Y; Kurihara, Y

    1989-04-25

    The taste-modifying protein, miraculin, has the unusual property of modifying sour taste into sweet taste. The complete amino acid sequence of miraculin purified from miracle fruits by a newly developed method (Theerasilp, S., and Kurihara, Y. (1988) J. Biol. Chem. 263, 11536-11539) was determined by an automatic Edman degradation method. Miraculin was a single polypeptide with 191 amino acid residues. The calculated molecular weight based on the amino acid sequence and the carbohydrate content (13.9%) was 24,600. Asn-42 and Asn-186 were linked N-glycosidically to carbohydrate chains. High homology was found between the amino acid sequences of miraculin and soybean trypsin inhibitor. PMID:2708331

  4. Assessing the Drosophila melanogaster and Anopheles gambiae Genome Annotations Using Genome-Wide Sequence Comparisons

    PubMed Central

    Jaillon, Olivier; Dossat, Carole; Eckenberg, Ralph; Eiglmeier, Karin; Segurens, Béatrice; Aury, Jean-Marc; Roth, Charles W.; Scarpelli, Claude; Brey, Paul T.; Weissenbach, Jean; Wincker, Patrick

    2003-01-01

    We performed genome-wide sequence comparisons at the protein coding level between the genome sequences of Drosophila melanogaster and Anopheles gambiae. Such comparisons detect evolutionarily conserved regions (ecores) that can be used for a qualitative and quantitative evaluation of the available annotations of both genomes. They also provide novel candidate features for annotation. The percentage of ecores mapping outside annotations in the A. gambiae genome is about fourfold higher than in D. melanogaster. The A. gambiae genome assembly also contains a high proportion of duplicated ecores, possibly resulting from artefactual sequence duplications in the genome assembly. The occurrence of 4063 ecores in the D. melanogaster genome outside annotations suggests that some genes are not yet or only partially annotated. The present work illustrates the power of comparative genomics approaches towards an exhaustive and accurate establishment of gene models and gene catalogues in insect genomes. PMID:12840038

  5. Detection of Dengue Viral RNA Using a Nucleic Acid Sequence-Based Amplification Assay

    PubMed Central

    Wu, Shuenn-Jue L.; Lee, Eun Mi; Putvatana, Ravithat; Shurtliff, Roxanne N.; Porter, Kevin R.; Suharyono, Wuryadi; Watts, Douglas M.; King, Chwan-Chuen; Murphy, Gerald S.; Hayes, Curtis G.; Romano, Joseph W.

    2001-01-01

    Faster techniques are needed for the early diagnosis of dengue fever and dengue hemorrhagic fever during the acute viremic phase of infection. An isothermal nucleic acid sequence-based amplification (NASBA) assay was optimized to amplify viral RNA of all four dengue virus serotypes by a set of universal primers and to type the amplified products by serotype-specific capture probes. The NASBA assay involved the use of silica to extract viral nucleic acid, which was amplified without thermocycling. The amplified product was detected by a probe-hybridization method that utilized electrochemiluminescence. Using normal human plasma spiked with dengue viruses, the NASBA assay had a detection threshold of 1 to 10 PFU/ml. The sensitivity and specificity of the assay were determined by testing 67 dengue virus-positive and 21 dengue virus-negative human serum or plasma samples. The “gold standard” used for comparison and evaluation was the mosquito C6/36 cell culture assay followed by an immunofluorescent assay. Viral infectivity titers in test samples were also determined by a direct plaque assay in Vero cells. The NASBA assay was able to detect dengue viral RNA in the clinical samples at plaque titers below 25 PFU/ml (the detection limit of the plaque assay). Of the 67 samples found positive by the C6/36 assay, 66 were found positive by the NASBA assay, for a sensitivity of 98.5%. The NASBA assay had a specificity of 100% based on the negative test results for the 21 normal human serum or plasma samples. These results indicate that the NASBA assay is a promising assay for the early diagnosis of dengue infections. PMID:11473994
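
    The sensitivity and specificity reported in this abstract follow directly from the sample counts it gives (66 of 67 positives detected, 21 of 21 negatives correctly negative). A minimal sketch of that arithmetic, with function names chosen here for illustration:

```python
def sensitivity(true_pos, false_neg):
    """Percentage of reference-positive samples that the assay detects."""
    return 100.0 * true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Percentage of reference-negative samples that the assay calls negative."""
    return 100.0 * true_neg / (true_neg + false_pos)


if __name__ == "__main__":
    # Counts from the abstract: 66 of 67 C6/36-positive samples were
    # NASBA-positive; all 21 negative samples were NASBA-negative.
    print(round(sensitivity(66, 1), 1))  # 98.5
    print(round(specificity(21, 0), 1))  # 100.0
```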

  6. Detection of Weakly Conserved Ancestral Mammalian Regulatory Sequences by Primate Comparisons

    SciTech Connect

    Wang, Qian-fei; Prabhakar, Shyam; Chanan, Sumita; Cheng, Jan-Fang; Rubin, Edward M.; Boffelli, Dario

    2006-06-01

    Genomic comparisons between human and distant, non-primate mammals are commonly used to identify cis-regulatory elements based on constrained sequence evolution. However, these methods fail to detect cryptic functional elements, which are too weakly conserved among mammals to distinguish from nonfunctional DNA. To address this problem, we explored the potential of deep intra-primate sequence comparisons. We sequenced the orthologs of 558 kb of human genomic sequence, covering multiple loci involved in cholesterol homeostasis, in 6 nonhuman primates. Our analysis identified 6 noncoding DNA elements displaying significant conservation among primates, but undetectable in more distant comparisons. In vitro and in vivo tests revealed that at least three of these 6 elements have regulatory function. Notably, the mouse orthologs of these three functional human sequences had regulatory activity despite their lack of significant sequence conservation, indicating that they are cryptic ancestral cis-regulatory elements. These regulatory elements could still be detected in a smaller set of three primate species including human, rhesus and marmoset. Since the human and rhesus genome sequences are already available, and the marmoset genome is actively being sequenced, the primate-specific conservation analysis described here can be applied in the near future on a whole-genome scale, to complement the annotation provided by more distant species comparisons.

  7. Sequence comparison alignment-free approach based on suffix tree and L-words frequency.

    PubMed

    Soares, Inês; Goios, Ana; Amorim, António

    2012-01-01

    The vast majority of methods available for sequence comparison rely on a first sequence alignment step, which requires a number of assumptions on evolutionary history and is sometimes very difficult or impossible to perform due to the abundance of gaps (insertions/deletions). In such cases, an alternative alignment-free method would prove valuable. Our method starts with the computation of a generalized suffix tree of all sequences, which is completed in linear time. Using this tree, the frequency of all possible words with a preset length L (L-words) in each sequence is rapidly calculated. Based on the L-words frequency profile of each sequence, a pairwise standard Euclidean distance is then computed, producing a symmetric genetic distance matrix, which can be used to generate a neighbor joining dendrogram or a multidimensional scaling graph. We present an improvement to word counting alignment-free approaches for sequence comparison, by determining a single optimal word length and combining suffix tree structures with the word counting tasks. Our approach is, thus, a fast and simple application that proved to be efficient and powerful when applied to mitochondrial genomes. The algorithm was implemented in Python and is freely available on the web.
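
    The distance step described in this abstract can be sketched in a few lines of Python. This is not the authors' implementation: it counts words with a simple sliding window rather than a generalized suffix tree, and the use of relative frequencies, the DNA alphabet, and all function names are assumptions made for illustration.

```python
from collections import Counter
from itertools import product
from math import sqrt


def lword_profile(seq, L, alphabet="ACGT"):
    """Relative frequency of every possible L-word in seq, in a fixed order."""
    windows = len(seq) - L + 1
    counts = Counter(seq[i:i + L] for i in range(windows))
    words = ("".join(p) for p in product(alphabet, repeat=L))
    return [counts[w] / windows if windows > 0 else 0.0 for w in words]


def euclidean(p, q):
    """Standard Euclidean distance between two equal-length profiles."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(p, q)))


def distance_matrix(seqs, L):
    """Symmetric matrix of pairwise distances between L-word profiles."""
    profiles = [lword_profile(s, L) for s in seqs]
    return [[euclidean(p, q) for q in profiles] for p in profiles]


if __name__ == "__main__":
    for row in distance_matrix(["ACGTACGT", "ACGTTCGT", "AAAAAAAA"], L=2):
        print([round(d, 3) for d in row])
```

    The resulting matrix has zeros on the diagonal and could be passed to any neighbor joining or multidimensional scaling routine. The sliding window costs time proportional to sequence length per sequence, which is adequate for a sketch; the suffix tree in the abstract serves the same counting purpose across all sequences at once.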

  8. Amino acid sequence heterogeneity of the chromosomal encoded Borrelia burgdorferi sensu lato major antigen P100.

    PubMed

    Fellinger, W; Farencena, A; Redl, B; Sambri, V; Cevenini, R; Stöffler, G

    1995-04-01

    The entire nucleotide sequence of the chromosomal encoded major antigen p100 of the European Borrelia garinii isolate B29 was determined and the deduced amino acid sequence was compared to the homologous antigen p83 of the North American Borrelia burgdorferi sensu stricto strain B31 and the p100 of the European Borrelia afzelii (group VS461) strain PKo. p100 of strain B29 shows 87% amino acid sequence identity to strain B31 and 79.2% to strain PKo; p100 of strains B31 and PKo show 62.5% identity to each other. In addition, partial nucleotide sequences of the most heterogeneous region of the p100 gene of two other Borrelia garinii isolates (PBi and VS286) have been determined and the deduced amino acid sequences were compared with all p100 of Borrelia garinii published so far. We found an amino acid sequence identity between 88.6 and 100% within the same genospecies. The N-terminal part of the p100 proteins is highly conserved whereas a striking heterogeneous region within the C-terminal part of the proteins was observed.

  9. Detection and isolation of nucleic acid sequences using competitive hybridization probes

    DOEpatents

    Lucas, Joe N.; Straume, Tore; Bogen, Kenneth T.

    1997-01-01

    A method for detecting a target nucleic acid sequence in a sample is provided using hybridization probes which competitively hybridize to a target nucleic acid. According to the method, a target nucleic acid sequence is hybridized to first and second hybridization probes which are complementary to overlapping portions of the target nucleic acid sequence, the first hybridization probe including a first complexing agent capable of forming a binding pair with a second complexing agent and the second hybridization probe including a detectable marker. The first complexing agent attached to the first hybridization probe is contacted with a second complexing agent, the second complexing agent being attached to a solid support such that when the first and second complexing agents are attached, target nucleic acid sequences hybridized to the first hybridization probe become immobilized on to the solid support. The immobilized target nucleic acids are then separated and detected by detecting the detectable marker attached to the second hybridization probe. A kit for performing the method is also provided.

  10. Detection and isolation of nucleic acid sequences using competitive hybridization probes

    DOEpatents

    Lucas, J.N.; Straume, T.; Bogen, K.T.

    1997-04-01

    A method for detecting a target nucleic acid sequence in a sample is provided using hybridization probes which competitively hybridize to a target nucleic acid. According to the method, a target nucleic acid sequence is hybridized to first and second hybridization probes which are complementary to overlapping portions of the target nucleic acid sequence, the first hybridization probe including a first complexing agent capable of forming a binding pair with a second complexing agent and the second hybridization probe including a detectable marker. The first complexing agent attached to the first hybridization probe is contacted with a second complexing agent, the second complexing agent being attached to a solid support such that when the first and second complexing agents are attached, target nucleic acid sequences hybridized to the first hybridization probe become immobilized on to the solid support. The immobilized target nucleic acids are then separated and detected by detecting the detectable marker attached to the second hybridization probe. A kit for performing the method is also provided. 7 figs.

  11. Genomic 3' terminal sequence comparison of three isolates of rabbit haemorrhagic disease virus.

    PubMed

    Milton, I D; Vlasak, R; Nowotny, N; Rodak, L; Carter, M J

    1992-05-15

    Comparison of sequence data is necessary in order to investigate virus origins, identify features common to virulent strains, and characterize genomic organization within virus families. A virulent caliciviral disease of rabbits recently emerged in China. We have sequenced 1100 bases from the 3' ends of two independent European isolates of this virus, and compared these with previously determined calicivirus sequences. Rabbit caliciviruses were closely related, despite the different countries in which isolation was made. This supports the rapid spread of a new virus across Europe. The capsid protein sequences of these rabbit viruses differ markedly from those determined for feline calicivirus, but a hypothetical 3' open reading frame is relatively well conserved between the caliciviruses of these two different hosts and argues for a functional role.

  12. Dynamite: a flexible code generating language for dynamic programming methods used in sequence comparison.

    PubMed

    Birney, E; Durbin, R

    1997-01-01

    We have developed a code generating language, called Dynamite, specialised for the production and subsequent manipulation of complex dynamic programming methods for biological sequence comparison. From a relatively simple text definition file Dynamite will produce a variety of implementations of a dynamic programming method, including database searches and linear space alignments. The speed of the generated code is comparable to hand written code, and the additional flexibility has proved invaluable in designing and testing new algorithms. An innovation is a flexible labelling system, which can be used to annotate the original sequences with biological information. We illustrate the Dynamite syntax and flexibility by showing definitions for dynamic programming routines (i) to align two protein sequences under the assumption that they are both poly-topic transmembrane proteins, with the simultaneous assignment of transmembrane helices and (ii) to align protein information to genomic DNA, allowing for introns and sequencing error.
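
    For readers unfamiliar with the kind of routine such a generator emits, the following is a hand-written sketch of a basic dynamic programming recurrence for pairwise sequence comparison (Needleman-Wunsch global alignment score with a linear gap penalty). It is not Dynamite syntax or Dynamite output; the scoring values and function name are assumptions chosen for illustration.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment score with a linear gap penalty.

    Only the previous row of the matrix is kept, so memory use is
    proportional to len(b); this returns the score, not the alignment.
    """
    prev = [j * gap for j in range(len(b) + 1)]  # row 0: all gaps in a
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # column 0: all gaps in b
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    print(global_alignment_score("GATTACA", "GATTACA"))
    print(global_alignment_score("ACGT", "AGT"))
```

    Recovering the alignment itself in linear space requires a divide-and-conquer traceback, which is omitted here.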

  13. Analyses of mitochondrial amino acid sequence datasets support the proposal that specimens of Hypodontus macropi from three species of macropodid hosts represent distinct species

    PubMed Central

    2013-01-01

    Background Hypodontus macropi is a common intestinal nematode of a range of kangaroos and wallabies (macropodid marsupials). Based on previous multilocus enzyme electrophoresis (MEE) and nuclear ribosomal DNA sequence data sets, H. macropi has been proposed to be a complex of species. To test this proposal using independent molecular data, we sequenced the whole mitochondrial (mt) genomes of individuals of H. macropi from three different species of hosts (Macropus robustus robustus, Thylogale billardierii and Macropus [Wallabia] bicolor) as well as that of Macropicola ocydromi (a related nematode), and undertook a comparative analysis of the amino acid sequence datasets derived from these genomes. Results The mt genomes sequenced by next-generation (454) technology from H. macropi from the three host species varied from 13,634 bp to 13,699 bp in size. Pairwise comparisons of the amino acid sequences predicted from these three mt genomes revealed differences of 5.8% to 18%. Phylogenetic analysis of the amino acid sequence data sets using Bayesian Inference (BI) showed that H. macropi from the three different host species formed distinct, well-supported clades. In addition, sliding window analysis of the mt genomes defined variable regions for future population genetic studies of H. macropi in different macropodid hosts and geographical regions around Australia. Conclusions The present analyses of inferred mt protein sequence datasets clearly supported the hypothesis that H. macropi from M. robustus robustus, M. bicolor and T. billardierii represent distinct species. PMID:24261823

  14. Comparison of Dixon Sequences for Estimation of Percent Breast Fibroglandular Tissue

    PubMed Central

    Ledger, Araminta E. W.; Scurr, Erica D.; Hughes, Julie; Macdonald, Alison; Wallace, Toni; Thomas, Karen; Wilson, Robin; Leach, Martin O.; Schmidt, Maria A.

    2016-01-01

    Objectives To evaluate sources of error in the Magnetic Resonance Imaging (MRI) measurement of percent fibroglandular tissue (%FGT) using two-point Dixon sequences for fat-water separation. Methods Ten female volunteers (median age: 31 yrs, range: 23–50 yrs) gave informed consent following Research Ethics Committee approval. Each volunteer was scanned twice following repositioning to enable an estimation of measurement repeatability from high-resolution gradient-echo (GRE) proton-density (PD)-weighted Dixon sequences. Differences in measures of %FGT attributable to resolution, T1 weighting and sequence type were assessed by comparison of this Dixon sequence with low-resolution GRE PD-weighted Dixon data, and against gradient-echo (GRE) or spin-echo (SE) based T1-weighted Dixon datasets, respectively. Results %FGT measurement from high-resolution PD-weighted Dixon sequences had a coefficient of repeatability of ±4.3%. There was no significant difference in %FGT between high-resolution and low-resolution PD-weighted data. Values of %FGT from GRE and SE T1-weighted data were strongly correlated with that derived from PD-weighted data (r = 0.995 and 0.96, respectively). However, both sequences exhibited higher mean %FGT by 2.9% (p < 0.0001) and 12.6% (p < 0.0001), respectively, in comparison with PD-weighted data; the increase in %FGT from the SE T1-weighted sequence was significantly larger at lower breast densities. Conclusion Although measurement of %FGT at low resolution is feasible, T1 weighting and sequence type impact on the accuracy of Dixon-based %FGT measurements; Dixon MRI protocols for %FGT measurement should be carefully considered, particularly for longitudinal or multi-centre studies. PMID:27011312

  15. Ligation with nucleic acid sequence-based amplification.

    PubMed

    Ong, Carmichael; Tai, Warren; Sarma, Aartik; Opal, Steven M; Artenstein, Andrew W; Tripathi, Anubhav

    2012-01-01

    This work presents a novel method for detecting nucleic acid targets using a ligation step along with an isothermal, exponential amplification step. We use an engineered ssDNA with two variable regions on the ends, allowing us to design the probe for optimal reaction kinetics and primer binding. This two-part probe is ligated by T4 DNA Ligase only when both parts bind adjacently to the target. The assay demonstrates that the expected 72-nt RNA product appears only when the synthetic target, T4 ligase, and both probe fragments are present during the ligation step. An extraneous 38-nt RNA product also appears due to linear amplification of unligated probe (P3), but its presence does not cause a false-positive result. In addition, 40 mmol/L KCl in the final amplification mix was found to be optimal. It was also found that increasing P5 in excess of P3 helped with ligation and reduced the extraneous 38-nt RNA product. The assay was also tested with a single nucleotide polymorphism target, changing one base at the ligation site. The assay was able to yield a negative signal despite only a single-base change. Finally, using P3 and P5 with longer binding sites results in increased overall sensitivity of the reaction, showing that increasing ligation efficiency can improve the assay overall. We believe that this method can be used effectively for a number of diagnostic assays. PMID:22449695

  16. Sequence analysis and comparison of cDNAs of the zein multigene family.

    PubMed Central

    Geraghty, D E; Messing, J; Rubenstein, I

    1982-01-01

    The nucleotide sequences of two zein cDNAs in hybrid plasmids A20 and B49 have been determined. The insert in A20 is 921 bp long including a 5' non-coding region of 60 nucleotides, preceded by what is believed to be an artifactual sequence of 41 nucleotides, and a 3' non-coding region of 87 nucleotides. The B49 insert is 467 bp long and includes approximately one-half the protein coding sequence as well as a 3' non-coding region of 97 nucleotides. These sequences have been compared with the previously published sequence of another zein clone, A30. A20 and A30, both encoding 19 000 mol. wt. zeins, have approximately 85% homology at the nucleotide level. The B49 sequence, corresponding to a 22 000 mol. wt. zein, has approximately 65% homology to either A20 or A30. All three zeins share common features including nearly identical amino acid compositions. In addition, the tandem repeats of 20 amino acids first seen in A30 are also present in A20 and B49. PMID:6897917

  17. The amino acid sequence of mitogenic lectin-B from the roots of pokeweed (Phytolacca americana).

    PubMed

    Yamaguchi, K; Yurino, N; Kino, M; Ishiguro, M; Funatsu, G

    1997-04-01

    The complete amino acid sequence of pokeweed lectin-B (PL-B) has been analyzed by first sequencing seven lysylendopeptidase peptides derived from the reduced and S-pyridylethylated PL-B and then connecting them by analyzing the arginylendopeptidase peptides from the reduced and S-carboxymethylated PL-B. PL-B consists of 295 amino acid residues and two oligosaccharides linked to Asn96 and Asn139, and has a molecular mass of 34,493 Da. PL-B is composed of seven repetitive chitin-binding domains having 48-79% sequence homology with each other. Twelve amino acid residues including eight cysteine residues in these domains are absolutely conserved in all other chitin-binding domains of plant lectins and class I chitinases. Also, it was strongly suggested that the extremely high hemagglutinating and mitogenic activities of PL-B may be ascribed to its seven-domain structure.

  18. ConSurf 2010: calculating evolutionary conservation in sequence and structure of proteins and nucleic acids.

    PubMed

    Ashkenazy, Haim; Erez, Elana; Martz, Eric; Pupko, Tal; Ben-Tal, Nir

    2010-07-01

    It is informative to detect highly conserved positions in proteins and nucleic acid sequence/structure since they are often indicative of structural and/or functional importance. ConSurf (http://consurf.tau.ac.il) and ConSeq (http://conseq.tau.ac.il) are two well-established web servers for calculating the evolutionary conservation of amino acid positions in proteins using an empirical Bayesian inference, starting from protein structure and sequence, respectively. Here, we present the new version of the ConSurf web server that combines the two independent servers, providing an easier and more intuitive step-by-step interface, while offering the user more flexibility during the process. In addition, the new version of ConSurf calculates the evolutionary rates for nucleic acid sequences. The new version is freely available at: http://consurf.tau.ac.il/.

  19. Amino acid repeats cause extraordinary coding sequence variation in the social amoeba Dictyostelium discoideum.

    PubMed

    Scala, Clea; Tian, Xiangjun; Mehdiabadi, Natasha J; Smith, Margaret H; Saxer, Gerda; Stephens, Katie; Buzombo, Prince; Strassmann, Joan E; Queller, David C

    2012-01-01

    Protein sequences are normally the most conserved elements of genomes owing to purifying selection to maintain their functions. We document an extraordinary amount of within-species protein sequence variation in the model eukaryote Dictyostelium discoideum stemming from triplet DNA repeats coding for long strings of single amino acids. D. discoideum has a very large number of such strings, many of which are polyglutamine repeats, the same sequence that causes various neurological disorders in humans, such as Huntington's disease. We show here that D. discoideum coding repeat loci are highly variable among individuals, making D. discoideum a candidate for the most variable proteome. The coding repeat loci are not significantly less variable than similar non-coding triplet repeats. This pattern is consistent with these amino-acid repeats being largely non-functional sequences evolving primarily by mutation and drift. PMID:23029418

  20. Conservation of Shannon's redundancy for proteins. [information theory applied to amino acid sequences]

    NASA Technical Reports Server (NTRS)

    Gatlin, L. L.

    1974-01-01

    Concepts of information theory are applied to examine various proteins in terms of their redundancy in natural originators such as animals and plants. The Monte Carlo method is used to derive information parameters for random protein sequences. Real protein sequence parameters are compared with the standard parameters of protein sequences having a specific length. The tendency of a chain to contain some amino acids more frequently than others and the tendency of a chain to contain certain amino acid pairs more frequently than other pairs are used as randomness measures of individual protein sequences. Non-periodic proteins are generally found to have random Shannon redundancies except in cases of constraints due to short chain length and genetic codes. Redundant characteristics of highly periodic proteins are discussed. A degree of periodicity parameter is derived.
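
    The two randomness measures named in this record (uneven residue composition and uneven residue-pair usage) can be illustrated with a short sketch. It is not the report's own procedure: the function names are hypothetical, the redundancy is normalised against an assumed 20-letter alphabet, and the Monte Carlo baseline for random sequences is not reproduced.

```python
import math
from collections import Counter

ALPHABET_SIZE = 20  # assumption: the 20 standard amino acids


def entropy_bits(counts):
    """Shannon entropy (in bits) of the distribution given by a Counter."""
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def composition_redundancy(seq):
    """Redundancy from residue composition alone: 1 - H1 / log2(20)."""
    return 1.0 - entropy_bits(Counter(seq)) / math.log2(ALPHABET_SIZE)


def conditional_entropy(seq):
    """Entropy of a residue given its predecessor: H(pairs) - H(first residues)."""
    pairs = Counter(zip(seq, seq[1:]))
    firsts = Counter(seq[:-1])
    return entropy_bits(pairs) - entropy_bits(firsts)
```

    Under these assumptions a chain that uses all 20 residues equally has a composition redundancy near zero, while a strictly periodic chain such as "ABABABAB" has a conditional entropy near zero because each residue fully determines the next.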

  1. Shark myoglobins. II. Isolation, characterization and amino acid sequence of myoglobin from Galeorhinus japonicus.

    PubMed

    Suzuki, T; Suzuki, T; Yata, T

    1985-01-01

    Native oxymyoglobin (MbO2) was isolated from red muscle of G. japonicus by chromatographic separation from metmyoglobin (metMb) on DEAE-cellulose and the amino acid sequence of the major chain was determined with the aid of sequence homology with that of G. australis. It was shown to differ in amino acid sequence from that of G. australis by 10 replacements, to be acetylated at the amino terminus and to contain glutamine at the distal (E7) residue. It was also shown to have a spectrum very similar to that of mammalian MbO2. However, the pH-dependence for the autoxidation of MbO2 was seen to be quite different from that of sperm whale (Physeter catodon) MbO2. Although the sequence homology between sperm whale and G. japonicus myoglobins is about 40%, their hydropathy profiles were very similar, indicating that they have a similar geometry in their globin folding.
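
    A hydropathy profile of the kind compared in this record can be sketched as a sliding-window average of per-residue hydropathy values. The abstract does not say which scale or window was used; the sketch below assumes the Kyte-Doolittle scale and a 9-residue window, and the function name is hypothetical.

```python
# Kyte-Doolittle hydropathy values (assumed scale; not stated in the abstract).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq, window=9):
    """Mean hydropathy of each window of `window` consecutive residues."""
    scores = [KYTE_DOOLITTLE[residue] for residue in seq]
    if window > len(scores):
        return []  # sequence too short for even one window
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]
```

    Two proteins with low sequence identity can still give similar profiles, since the average depends on the hydropathic character of the residues and not on their identity.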

  2. Comparison between optimized GRE and RARE sequences for 19F MRI studies

    NASA Astrophysics Data System (ADS)

    Soffientini, Chiara D.; Mastropietro, Alfonso; Caffini, Matteo; Cocco, Sara; Zucca, Ileana; Scotti, Alessandro; Baselli, Giuseppe; Bruzzone, Maria Grazia

    2014-03-01

    In 19F-MRI studies limiting factors are the presence of a low signal due to the low concentration of 19F-nuclei, necessary for biological applications, and the inherent low sensitivity of MRI. Hence, acquiring images using the pulse sequence with the best signal to noise ratio (SNR) by optimizing the acquisition parameters specifically to a 19F compound is a core issue. In 19F-MRI, multiple-spin-echo (RARE) and gradient-echo (GRE) are the two most frequently used pulse sequence families; therefore we performed an optimization study of GRE pulse sequences based on numerical simulations and experimental acquisitions on fluorinated compounds. We compared GRE performance to an optimized RARE sequence. Images were acquired on a 7T MRI preclinical scanner on phantoms containing different fluorinated compounds. Actual relaxation times (T1, T2, T2*) were evaluated in order to predict SNR dependence on sequence parameters. Experimental comparisons between spoiled GRE and RARE, obtained at a fixed acquisition time and in steady state condition, showed RARE sequence outperforming the spoiled GRE (up to 406% higher). Conversely, the use of the unbalanced-SSFP showed a significant increase in SNR compared to RARE (up to 28% higher). Moreover, this sequence (as GRE in general) was confirmed to be virtually insensitive to T1 and T2 relaxation times, after proper optimization, thus improving marker independence from the biological environment. These results confirm the efficacy of the proposed optimization tool and foster further investigation addressing in-vivo applicability.

  3. mtDNAprofiler: a Web application for the nomenclature and comparison of human mitochondrial DNA sequences.

    PubMed

    Yang, In Seok; Lee, Hwan Young; Yang, Woo Ick; Shin, Kyoung-Jin

    2013-07-01

    Mitochondrial DNA (mtDNA) is a valuable tool in the fields of forensic, population, and medical genetics. However, recording and comparing mtDNA control region or entire genome sequences would be difficult if researchers are not familiar with mtDNA nomenclature conventions. Therefore, mtDNAprofiler, a Web application, was designed for the analysis and comparison of mtDNA sequences in a string format or as a list of mtDNA single-nucleotide polymorphisms (mtSNPs). mtDNAprofiler which comprises four mtDNA sequence-analysis tools (mtDNA nomenclature, mtDNA assembly, mtSNP conversion, and mtSNP concordance-check) supports not only the accurate analysis of mtDNA sequences via an automated nomenclature function, but also consistent management of mtSNP data via direct comparison and validity-check functions. Since mtDNAprofiler consists of four tools that are associated with key steps of mtDNA sequence analysis, mtDNAprofiler will be helpful for researchers working with mtDNA. mtDNAprofiler is freely available at http://mtprofiler.yonsei.ac.kr. PMID:23682804

  4. Effect of k-tuple length on sample-comparison with high-throughput sequencing data.

    PubMed

    Wang, Ying; Lei, Xiaoye; Wang, Shun; Wang, Zicheng; Song, Nianfeng; Zeng, Feng; Chen, Ting

    2016-01-22

    High-throughput metagenomic sequencing offers a powerful technique to compare microbial communities. Without requiring extra reference sequences, alignment-free models with short k-tuples (k = 2-10 bp) yielded promising results. Short k-tuples describe the overall statistical distribution, but struggle to capture the specific characteristics inside one microbial community. Longer k-tuples contain more abundant information. However, because the frequency vector of long k-tuples (k ≥ 30 bp) is sparse, the statistical measures designed for short k-tuples are not applicable. In our study, we considered each tuple as a meaningful word and each sequencing dataset as a document composed of those words. The comparison between two sequencing datasets is therefore processed as "topic analysis of documents" in text mining. We designed a pipeline with long k-tuple features to compare metagenomic samples using a combination of algorithms from text mining and pattern recognition. The pipeline is available at http://culotuple.codeplex.com/. Experiments show that our pipeline with long k-tuple features: ① separates genomes with high similarity; ② outperforms short k-tuple models in all experiments. When k ≥ 12, the short k-tuple measures are no longer applicable. When k is between 20 and 40, the long k-tuple pipeline obtains much better grouping results; ③ is free from the effect of sequencing platforms/protocols. ④ We obtained meaningful and supported biological results on the 40-tuples selected for comparison.
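
    The "tuple as word, sample as document" idea can be illustrated with a minimal sketch that counts k-tuples in a set of reads and compares two samples by cosine similarity of their sparse count vectors. This is not the authors' pipeline: the topic-analysis step is omitted, cosine similarity is a stand-in comparison chosen for illustration, and the function names are hypothetical.

```python
import math
from collections import Counter


def ktuple_counts(reads, k):
    """Count every overlapping k-tuple ("word") across a sample's reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def cosine_similarity(a, b):
    """Cosine similarity between two sparse k-tuple count vectors."""
    dot = sum(a[word] * b[word] for word in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # at least one sample has no k-tuples of this length
    return dot / (norm_a * norm_b)
```

    Storing counts in a dictionary keeps only the tuples that actually occur, which matters for long k, where almost all of the 4^k possible tuples are absent from any one sample.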

  5. Conversion of amino-acid sequence in proteins to classical music: search for auditory patterns

    PubMed Central

    2007-01-01

    We have converted genome-encoded protein sequences into musical notes to reveal auditory patterns without compromising musicality. We derived a reduced range of 13 base notes by pairing similar amino acids and distinguishing them using variations of three-note chords and codon distribution to dictate rhythm. The conversion will help make genomic coding sequences more approachable for the general public, young children, and vision-impaired scientists. PMID:17477882

  6. Enzyme sequence similarity improves the reaction alignment method for cross-species pathway comparison

    SciTech Connect

    Ovacik, Meric A.; Androulakis, Ioannis P.

    2013-09-15

    Pathway-based information has become an important source of information for both establishing evolutionary relationships and understanding the mode of action of a chemical or pharmaceutical among species. Cross-species comparison of pathways can address two broad questions: comparison in order to inform evolutionary relationships and to extrapolate species differences used in a number of different applications including drug and toxicity testing. Cross-species comparison of metabolic pathways is complex as there are multiple features of a pathway that can be modeled and compared. Among the various methods that have been proposed, reaction alignment has emerged as the most successful at predicting phylogenetic relationships based on NCBI taxonomy. We propose an improvement of the reaction alignment method by accounting for sequence similarity in addition to reaction alignment method. Using nine species, including human and some model organisms and test species, we evaluate the standard and improved comparison methods by analyzing glycolysis and citrate cycle pathways conservation. In addition, we demonstrate how organism comparison can be conducted by accounting for the cumulative information retrieved from nine pathways in central metabolism as well as a more complete study involving 36 pathways common in all nine species. Our results indicate that reaction alignment with enzyme sequence similarity results in a more accurate representation of pathway specific cross-species similarities and differences based on NCBI taxonomy.

  7. Draft Genome Sequence of Ustilago trichophora RK089, a Promising Malic Acid Producer.

    PubMed

    Zambanini, Thiemo; Buescher, Joerg M; Meurer, Guido; Wierckx, Nick; Blank, Lars M

    2016-01-01

    The basidiomycetous smut fungus Ustilago trichophora RK089 produces malate from glycerol. De novo genome sequencing revealed a 20.7-Mbp genome (301 gap-closed contigs, 246 scaffolds). A comparison to the genome of Ustilago maydis 521 revealed all essential genes for malate production from glycerol contributing to metabolic engineering for improving malate production. PMID:27469969

  8. Draft Genome Sequence of Ustilago trichophora RK089, a Promising Malic Acid Producer

    PubMed Central

    Zambanini, Thiemo; Buescher, Joerg M.; Meurer, Guido; Blank, Lars M.

    2016-01-01

    The basidiomycetous smut fungus Ustilago trichophora RK089 produces malate from glycerol. De novo genome sequencing revealed a 20.7-Mbp genome (301 gap-closed contigs, 246 scaffolds). A comparison to the genome of Ustilago maydis 521 revealed all essential genes for malate production from glycerol contributing to metabolic engineering for improving malate production. PMID:27469969

  9. Visible sensing of nucleic acid sequences using a genetically encodable unmodified mRNA probe.

    PubMed

    Narita, Atsushi; Ogawa, Kazumasa; Sando, Shinsuke; Aoyama, Yasuhiro

    2006-01-01

    We previously reported a molecular beacon-mRNA (MB-mRNA) strategy for nucleic acid detection/sensing in a cell-free translation system using unmodified RNA as a probe. Here in this presentation, we report that a combination with RNase H activity, which induces an additional process of irreversible cleavage of MB-domain, achieves an improved sequence selectivity (one nucleotide selectivity) and an enhanced sensitivity. This improved system finally enabled visible sensing of target nucleic acid sequence at a single nucleotide resolution under isothermal conditions.

  10. Definition and Analysis of a System for the Automated Comparison of Curriculum Sequencing Algorithms in Adaptive Distance Learning

    ERIC Educational Resources Information Center

    Limongelli, Carla; Sciarrone, Filippo; Temperini, Marco; Vaste, Giulia

    2011-01-01

    LS-Lab provides automatic support to comparison/evaluation of the Learning Object Sequences produced by different Curriculum Sequencing Algorithms. Through this framework a teacher can verify the correspondence between the behaviour of different sequencing algorithms and her pedagogical preferences. In fact the teacher can compare algorithms…

  11. Draft genome sequence of the docosahexaenoic acid producing thraustochytrid Aurantiochytrium sp. T66.

    PubMed

    Liu, Bin; Ertesvåg, Helga; Aasen, Inga Marie; Vadstein, Olav; Brautaset, Trygve; Heggeset, Tonje Marita Bjerkan

    2016-06-01

    Thraustochytrids are unicellular, marine protists, and there is a growing industrial interest in these organisms, particularly because some species, including strains belonging to the genus Aurantiochytrium, accumulate high levels of docosahexaenoic acid (DHA). Here, we report the draft genome sequence of Aurantiochytrium sp. T66 (ATCC PRA-276), with a size of 43 Mbp, and 11,683 predicted protein-coding sequences. The data has been deposited at DDBJ/EMBL/Genbank under the accession LNGJ00000000. The genome sequence will contribute new insight into DHA biosynthesis and regulation, providing a basis for metabolic engineering of thraustochytrids. PMID:27222814

  12. Draft genome sequence of the docosahexaenoic acid producing thraustochytrid Aurantiochytrium sp. T66.

    PubMed

    Liu, Bin; Ertesvåg, Helga; Aasen, Inga Marie; Vadstein, Olav; Brautaset, Trygve; Heggeset, Tonje Marita Bjerkan

    2016-06-01

    Thraustochytrids are unicellular, marine protists, and there is a growing industrial interest in these organisms, particularly because some species, including strains belonging to the genus Aurantiochytrium, accumulate high levels of docosahexaenoic acid (DHA). Here, we report the draft genome sequence of Aurantiochytrium sp. T66 (ATCC PRA-276), with a size of 43 Mbp, and 11,683 predicted protein-coding sequences. The data has been deposited at DDBJ/EMBL/Genbank under the accession LNGJ00000000. The genome sequence will contribute new insight into DHA biosynthesis and regulation, providing a basis for metabolic engineering of thraustochytrids.

  13. Shotgun sequencing analysis of Trypanosoma cruzi I Sylvio X10/1 and comparison with T. cruzi VI CL Brener.

    PubMed

    Franzén, Oscar; Ochaya, Stephen; Sherwood, Ellen; Lewis, Michael D; Llewellyn, Martin S; Miles, Michael A; Andersson, Björn

    2011-03-08

    Trypanosoma cruzi is the causative agent of Chagas disease, which affects more than 9 million people in Latin America. We have generated a draft genome sequence of the TcI strain Sylvio X10/1 and compared it to the TcVI reference strain CL Brener to identify lineage-specific features. We found virtually no differences in the core gene content of CL Brener and Sylvio X10/1 by presence/absence analysis, but 6 open reading frames from CL Brener were missing in Sylvio X10/1. Several multicopy gene families, including DGF, mucin, MASP and GP63 were found to contain substantially fewer genes in Sylvio X10/1, based on sequence read estimations. 1,861 small insertion-deletion events and 77,349 nucleotide differences, 23% of which were non-synonymous and associated with radical amino acid changes, further distinguish these two genomes. There were 336 genes indicated as under positive selection, 145 unique to T. cruzi in comparison to T. brucei and Leishmania. This study provides a framework for further comparative analyses of two major T. cruzi lineages and also highlights the need for sequencing more strains to understand fully the genomic composition of this parasite.

  14. Complete mitochondrial DNA sequence of the yellowfin seabream Acanthopagrus latus and a genomic comparison among closely related sparid species.

    PubMed

    Xia, Junhong; Xia, Kuaifei; Jiang, Shigui

    2008-08-01

    The complete mitochondrial genome of the yellowfin seabream Acanthopagrus latus was determined in the present study. The genome was 16,609 bp in length and contained 37 genes (2 ribosomal RNA, 22 transfer RNA and 13 protein-coding genes) and the control region (CR), with the content and order of genes being similar to those in typical teleosts. Comparisons of the 37 genes and CR among species indicate that the CR was the most divergent (0.3341), whereas tRNA(Gly) showed the lowest genetic variation (0.0542). Much greater p-genetic distances [mean = 0.1559, standard deviation (SD) = 0.0235; n = 1653] for the interspecies level with high frequency (99.4%) than those of the intraspecies level (mean = 0.0098, SD = 0.0090; n = 20) were inferred from 212 Cyt b sequence data, suggesting the Cyt b gene is conserved within Sparidae species and supporting the barcoding validity of Cyt b sequence data for Sparidae species identification. Phylogenetic analysis using amino acid sequences of 13 protein-coding genes supported that the genus Pagrus was not monophyletic, showing the need to re-evaluate the morphological characteristics of Pagrus fishes.
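
    The p-distance reported in this record is the proportion of aligned sites at which two sequences differ. A minimal sketch follows; it assumes the sequences are already aligned to equal length and that sites with a gap character in either sequence are skipped, a handling choice the abstract does not specify. The function names are hypothetical.

```python
from itertools import combinations


def p_distance(a, b, gap="-"):
    """Proportion of differing sites between two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for x, y in zip(a, b):
        if x == gap or y == gap:
            continue  # assumption: gapped sites are excluded
        compared += 1
        if x != y:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


def pairwise_p_distances(sequences):
    """p-distance for every unordered pair of aligned sequences."""
    return [p_distance(a, b) for a, b in combinations(sequences, 2)]
```

    Grouping the pairwise values by whether the two sequences come from the same species gives the intraspecies and interspecies distributions whose means and standard deviations are compared above.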

  15. The new sequencer on the block: comparison of Life Technology's Proton sequencer to an Illumina HiSeq for whole-exome sequencing.

    PubMed

    Boland, Joseph F; Chung, Charles C; Roberson, David; Mitchell, Jason; Zhang, Xijun; Im, Kate M; He, Ji; Chanock, Stephen J; Yeager, Meredith; Dean, Michael

    2013-10-01

    We assessed the performance of the new Life Technologies Proton sequencer by comparing whole-exome sequence data in a Centre d'Etude du Polymorphisme Humain trio (family 1463) to the Illumina HiSeq instrument. To simulate a typical user's results, we utilized the standard capture, alignment and variant calling methods specific to each platform. We restricted data analysis to include the capture region common to both methods. The Proton produced high quality data at a comparable average depth and read length, and the Ion Reporter variant caller identified 96 % of single nucleotide polymorphisms (SNPs) detected by the HiSeq and GATK pipeline. However, only 40 % of small insertion and deletion variants (indels) were identified by both methods. Usage of the trio structure and segregation of platform-specific alleles supported this result. Further comparison of the trio data with Complete Genomics sequence data and Illumina SNP microarray genotypes documented high concordance and accurate SNP genotyping of both Proton and Illumina platforms. However, our study underscored the problem of accurate detection of indels for both the Proton and HiSeq platforms.

  16. Antibody-specific model of amino acid substitution for immunological inferences from alignments of antibody sequences.

    PubMed

    Mirsky, Alexander; Kazandjian, Linda; Anisimova, Maria

    2015-03-01

    Antibodies are glycoproteins produced by the immune system as a dynamically adaptive line of defense against invading pathogens. Very elegant and specific mutational mechanisms allow B lymphocytes to produce a large and diversified repertoire of antibodies, which is modified and enhanced throughout adulthood. One of these mechanisms is somatic hypermutation, which stochastically mutates nucleotides in the antibody genes, forming new sequences with different properties and, eventually, higher affinity and selectivity to the pathogenic target. As somatic hypermutation involves fast mutation of antibody sequences, this process can be described using a Markov substitution model of molecular evolution. Here, using large sets of antibody sequences from mice and humans, we infer an empirical amino acid substitution model AB, which is specific to antibody sequences. Compared with existing general amino acid models, we show that the AB model provides a significantly better description of the somatic evolution of mouse and human antibody sequences, as demonstrated on large next generation sequencing (NGS) antibody data. General amino acid models are reflective of conservation at the protein level due to functional constraints, with the most frequent amino acid exchanges taking place between residues with the same or similar physicochemical properties. In contrast, within the variable part of antibody sequences we observed an elevated frequency of exchanges between amino acids with distinct physicochemical properties. This is indicative of a sui generis mutational mechanism, specific to antibody somatic hypermutation. We illustrate this property of antibody sequences by a comparative analysis of the network modularity implied by the AB model and general amino acid substitution models.
We recommend using the new model for computational studies of antibody sequence maturation, including inference of alignments and phylogenetic trees describing antibody somatic hypermutation in

  17. The value of short amino acid sequence matches for prediction of protein allergenicity.

    PubMed

    Silvanovich, Andre; Nemeth, Margaret A; Song, Ping; Herman, Rod; Tagliani, Laura; Bannon, Gary A

    2006-03-01

    Typically, genetically engineered crops contain traits encoded by one or a few newly expressed proteins. The allergenicity assessment of newly expressed proteins is an important component in the safety evaluation of genetically engineered plants. One aspect of this assessment involves sequence searches that compare the amino acid sequence of the protein to all known allergens. Analyses are performed to determine the potential for immunologically based cross-reactivity where IgE directed against a known allergen could bind to the protein and elicit a clinical reaction in sensitized individuals. Bioinformatic searches are designed to detect global sequence similarity and short contiguous amino acid sequence identity. It has been suggested that potential allergen cross-reactivity may be predicted by identifying matches as short as six to eight contiguous amino acids between the protein of interest and a known allergen. A series of analyses were performed, and match probabilities were calculated for different size peptides to determine if there was a scientifically justified search window size that identified allergen sequence characteristics. Four probability modeling methods were tested: (1) a mock protein and a mock allergen database, (2) a mock protein and genuine allergen database, (3) a genuine allergen and genuine protein database, and (4) a genuine allergen and genuine protein database combined with a correction for repeating peptides. These analyses indicated that identifying proteins as potential cross-reactive allergens through searches for short amino acid sequence matches of eight amino acids or fewer is a product of chance and adds little value to allergy assessments for newly expressed proteins.
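
    The short-match search described in this record amounts to sliding a fixed-size window along the query protein and checking each peptide for an exact match in an allergen sequence. The sketch below shows that search together with a rough estimate of how many matches chance alone would produce. The estimate assumes all 20 amino acids are equally frequent and independent, which is a simplification and not one of the four probability models tested in the study. Function names and the example sequences are hypothetical.

```python
def shared_peptides(query, allergen, window=8):
    """Return (position, peptide) for each query window found in the allergen."""
    allergen_words = {
        allergen[i:i + window] for i in range(len(allergen) - window + 1)
    }
    return [
        (i, query[i:i + window])
        for i in range(len(query) - window + 1)
        if query[i:i + window] in allergen_words
    ]


def expected_chance_matches(query_len, allergen_len, window):
    """Expected exact matches between two random sequences.

    Assumption: uniform, independent residue frequencies (1/20 each).
    """
    query_windows = max(query_len - window + 1, 0)
    allergen_windows = max(allergen_len - window + 1, 0)
    return query_windows * allergen_windows / 20 ** window


# Made-up sequences for illustration only.
hits = shared_peptides("MKTAYIAKQR", "GGTAYIAKQW", window=6)
```

    Because the chance expectation grows with the product of the query length and the total allergen database length and shrinks by a factor of 20 for each extra residue in the window, shortening the window against a large database quickly raises the number of matches expected by chance.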

  18. A Comparison of the First Two Sequenced Chloroplast Genomes in Asteraceae: Lettuce and Sunflower

    SciTech Connect

    Timme, Ruth E.; Kuehl, Jennifer V.; Boore, Jeffrey L.; Jansen, Robert K.

    2006-01-20

    Asteraceae is the second largest family of plants, with over 20,000 species. For the past few decades, numerous phylogenetic studies have contributed to our understanding of the evolutionary relationships within this family, including comparisons of the fast evolving chloroplast gene, ndhF, rbcL, as well as non-coding DNA from the trnL intron plus the trnL-trnF intergenic spacer, matK, and, with lesser resolution, psbA-trnH. This culminated in a study by Panero and Funk in 2002 that used over 13,000 bp per taxon for the largest taxonomic revision of Asteraceae in over a hundred years. Still, some uncertainties remain, and it would be very useful to have more information on the relative rates of sequence evolution among various genes and on genome structure as a potential set of phylogenetic characters to help guide future phylogenetic structures. By way of contributing to this, we report the first two complete chloroplast genome sequences from members of the Asteraceae, those of Helianthus annuus and Lactuca sativa. These plants belong to two distantly related subfamilies, Asteroideae and Cichorioideae, respectively. In addition to these, there is only one other published chloroplast genome sequence for any plant within the larger group called Eusterids II, that of Panax ginseng (Araliaceae, 156,318 bps, AY582139). Early chloroplast genome mapping studies demonstrated that H. annuus and L. sativa share a 22 kb inversion relative to members of the subfamily Barnadesioideae. By comparison to outgroups, this inversion was shown to be derived, indicating that the Asteroideae and Cichorioideae are more closely related than either is to the Barnadesioideae. A later sequencing study found that taxa that share this 22 kb inversion also contain within this region a second, smaller, 3.3 kb inversion. These sequences also enable an analysis of patterns of shared repeats in the genomes at a fine level and of RNA editing by comparison to available EST sequences.
In addition, since

  19. Quantitative detection of Aspergillus spp. by real-time nucleic acid sequence-based amplification.

    PubMed

    Zhao, Yanan; Perlin, David S

    2013-01-01

    Rapid and quantitative detection of Aspergillus from clinical samples may facilitate an early diagnosis of invasive pulmonary aspergillosis (IPA). As nucleic acid-based detection is a viable option, we demonstrate that Aspergillus burdens can be rapidly and accurately detected by a novel real-time nucleic acid assay other than qPCR by using the combination of nucleic acid sequence-based amplification (NASBA) and the molecular beacon (MB) technology. Here, we detail a real-time NASBA assay to determine quantitative Aspergillus burdens in lungs and bronchoalveolar lavage (BAL) fluids of rats with experimental IPA.

  20. Draft Genome Sequence of the Butyric Acid Producer Clostridium tyrobutyricum Strain CIP I-776 (IFP923)

    PubMed Central

    Clément, Benjamin; Lopes Ferreira, Nicolas

    2016-01-01

    Here, we report the draft genome sequence of Clostridium tyrobutyricum CIP I-776 (IFP923), an efficient producer of butyric acid. The genome consists of a single chromosome of 3.19 Mb and provides useful data concerning the metabolic capacities of the strain. PMID:26941139

  1. Amino acid sequence of the encephalitogenic basic protein from human myelin

    PubMed Central

    Carnegie, P. R.

    1971-01-01

    Myelin from the central nervous system contains an unusual basic protein, which can induce experimental autoimmune encephalomyelitis. The basic protein from human brain was digested with trypsin and other enzymes and the sequence of the 170 amino acids was determined. The localization of the encephalitogenic determinants was described. Possible roles for the protein in the structure and function of myelin are discussed. PMID:4108501

  2. Sequence-specific formation of d-amino acids in a monoclonal antibody during light exposure.

    PubMed

    Mozziconacci, Olivier; Schöneich, Christian

    2014-11-01

    The photoirradiation of a monoclonal antibody 1 (mAb1) at λ = 254 nm and λmax = 305 nm resulted in the sequence-specific generation of d-Val, d-Tyr, and potentially d-Ala and d-Arg, in the heavy chain sequence [95-101] YCARVVY. d-Amino acid formation is most likely the product of reversible intermediary carbon-centered radical formation at the (α)C-positions of the respective amino acids ((α)C(•) radicals) through the action of Cys thiyl radicals (CysS(•)). The latter can be generated photochemically either through direct homolysis of cystine or through photoinduced electron transfer from Trp and/or Tyr residues. The potential of mAb1 sequences to undergo epimerization was first evaluated through covalent H/D exchange during photoirradiation in D2O, and proteolytic peptides exhibiting deuterium incorporation were monitored by HPLC-MS/MS analysis. Subsequently, mAb1 was photoirradiated in H2O, and peptides, for which deuterium incorporation in D2O had been documented, were purified by HPLC and subjected to hydrolysis and amino acid analysis. Importantly, not all peptide sequences which incorporated deuterium during photoirradiation in D2O also exhibited photoinduced d-amino acid formation. For example, the heavy chain sequence [12-18] VQPGGSL showed significant deuterium incorporation during photoirradiation in D2O, but no photoinduced formation of d-amino acids was detected. Instead this sequence contained ca. 22% d-Val in both a photoirradiated and a control sample. This observation could indicate that d-Val may have been generated either during production and/or storage or during sample preparation. While sample preparation did not lead to the formation of d-Val or other d-amino acids in the control sample for the heavy chain sequence [95-101] YCARVVY, we may have to consider that during hydrolysis N-terminal residues (such as in VQPGGSL) may be more prone to epimerization. We conclude that the photoinduced, radical-dependent formation of d-amino acids

  3. The complete amino acid sequence of chitinase-B from the leaves of pokeweed (Phytolacca americana).

    PubMed

    Tanigawa, M; Yamagami, T; Funatsu, G

    1995-05-01

    The complete amino acid sequence of pokeweed leaf chitinase-B (PLC-B) has been determined by first sequencing all 19 tryptic peptides derived from the reduced and S-carboxymethylated (RCm-) PLC-B and then connecting them by analyzing the chymotryptic peptides from three fragments produced by cyanogen bromide cleavage of RCm-PLC-B. PLC-B consists of 274 amino acid residues and has a molecular mass of 29,473 Da. Six cysteine residues are linked by disulfide bonds between Cys20 and Cys67, Cys50 and Cys57, and Cys159 and Cys188. From 58-68% sequence homology of PLC-B with five class III chitinases, it was concluded that PLC-B is a basic class III chitinase.

  4. Pyruvate decarboxylase from Pisum sativum. Properties, nucleotide and amino acid sequences.

    PubMed

    Mücke, U; Wohlfarth, T; Fiedler, U; Bäumlein, H; Rücknagel, K P; König, S

    1996-04-15

    To study the molecular structure and function of pyruvate decarboxylase (PDC) from plants the protein was isolated from pea seeds and partially characterised. The active enzyme which occurs in the form of higher oligomers consists of two different subunits appearing in SDS/PAGE and mass spectroscopy experiments. For further experiments, like X-ray crystallography, it was necessary to elucidate the protein sequence. Partial cDNA clones encoding pyruvate decarboxylase from seeds of Pisum sativum cv. Miko have been obtained by means of polymerase chain reaction techniques. The first sequences were found using degenerate oligonucleotide primers designed according to conserved amino acid sequences of known pyruvate decarboxylases. The missing parts of one cDNA were amplified applying the 3'- and 5'-rapid amplification of cDNA ends systems. The amino acid sequence deduced from the entire cDNA sequence displays strong similarity to pyruvate decarboxylases from other organisms, especially from plants. A molecular mass of 64 kDa was calculated for this protein correlating with estimations for the smaller subunit of the oligomeric enzyme. The PCR experiments led to at least three different clones representing the middle part of the PDC cDNA indicating the existence of three isozymes. Two of these isoforms could be confirmed on the protein level by sequencing tryptic peptides. Only anaerobically treated roots showed a positive signal for PDC mRNA in Northern analysis although the cDNA from imbibed seeds was successfully used for PCR.

  5. Octopus S-crystallins with endogenous glutathione S-transferase (GST) activity: sequence comparison and evolutionary relationships with authentic GST enzymes.

    PubMed Central

    Chiou, S H; Yu, C W; Lin, C W; Pan, F M; Lu, S F; Lee, H J; Chang, G G

    1995-01-01

    S-Crystallin is a major protein present in the lenses of cephalopods (octopus and squid). To facilitate the cloning of this crystallin gene, cDNA was constructed from the poly(A)+ mRNA of octopus lenses, and amplified by PCR for nucleotide sequencing. Sequencing of 10 of 15 positive clones coding for this crystallin revealed three distinct S-crystallin isoforms with 61-64% identity in nucleotide sequences and 42-58% similarity in amino acid sequences when compared with homologous crystallins in squid lenses. These charge-isomeric crystallins also show between 26 and 33% amino acid sequence identity to four major classes of glutathione S-transferase (GST), a major detoxification enzyme present in most mammalian tissues. For further analysis, expression of one of the S-crystallin cDNAs was carried out in the bacterial expression system pQE-30, and the S-crystallin protein produced in Escherichia coli was purified to homogeneity to determine the enzymic properties. We found that the expressed octopus S-crystallin possessed much lower GST activity than the authentic GSTs from other tissues. Sequence comparison and construction of phylogenetic trees for S-crystallins from squid and octopus lenses and various classes of GSTs revealed that S-crystallins represent a multigene family which is structurally related to Alpha-class GSTs and probably derived from the ancestral GST by gene duplication and subsequent multiple mutational substitutions. PMID:7639695

  6. Allelic polymorphism in arabian camel ribonuclease and the amino acid sequence of bactrian camel ribonuclease.

    PubMed

    Welling, G W; Mulder, H; Beintema, J J

    1976-04-01

    Pancreatic ribonucleases from several species (whitetail deer, roe deer, guinea pig, and arabian camel) exhibit more than one amino acid at particular positions in their amino acid sequences. Since these enzymes were isolated from pooled pancreas, the origin of this heterogeneity is not clear. The pancreatic ribonucleases from 11 individual arabian camels (Camelus dromedarius) have been investigated with respect to the lysine-glutamine heterogeneity at position 103 (Welling et al., 1975). Six ribonucleases showed only one basic band and five showed two bands after polyacrylamide gel electrophoresis, suggesting a gene frequency of about 0.75 for the Lys gene and about 0.25 for the Gln gene. The amino acid sequence of bactrian camel (Camelus bactrianus) ribonuclease isolated from individual pancreatic tissue was determined and compared with that of arabian camel ribonuclease. The only difference was observed at position 103. In the ribonucleases from two unrelated bactrian camels, only glutamine was observed at that position. PMID:962846
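
    The gene frequencies quoted above can be checked with a short allele-counting sketch. It assumes, as an interpretation rather than something the abstract states outright, that the six single-band animals are Lys/Lys homozygotes and the five two-band animals are Lys/Gln heterozygotes; the function name and its parameters are hypothetical.

```python
def allele_frequencies(n_hom_lys, n_het, n_hom_gln=0):
    """Return (freq_lys, freq_gln) from genotype counts at one two-allele locus.

    Assumption: each animal carries two copies of the gene, so a homozygote
    contributes two identical alleles and a heterozygote one of each.
    """
    total_alleles = 2 * (n_hom_lys + n_het + n_hom_gln)
    if total_alleles == 0:
        raise ValueError("at least one individual is required")
    freq_lys = (2 * n_hom_lys + n_het) / total_alleles
    freq_gln = (2 * n_hom_gln + n_het) / total_alleles
    return freq_lys, freq_gln


# 6 single-band camels (assumed Lys/Lys) and 5 two-band camels (assumed Lys/Gln)
lys, gln = allele_frequencies(n_hom_lys=6, n_het=5)
print(f"Lys: {lys:.2f}  Gln: {gln:.2f}")
```

    Under these assumptions the counts give 17/22 and 5/22, in line with the approximate values of 0.75 and 0.25 reported in the abstract.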

  7. Pattern recognition in nucleic acid sequences. II. An efficient method for finding locally stable secondary structures.

    PubMed Central

    Kanehisa, M I; Goad, W B

    1982-01-01

    We present a method for calculating all possible single hairpin loop secondary structures in a nucleic acid sequence on the order of N^2 operations, where N is the total number of bases. Each structure may contain any number of bulges and internal loops. Most natural sequences are found to be indistinguishable from random sequences in the potential of forming secondary structures, which is defined by the frequency of possible secondary structures calculated by the method. There is a strong correlation between the higher G+C content and the higher structure forming potential. Interestingly, the removal of intervening sequences in mRNAs is almost always accompanied by an increase in the G+C content, which may suggest an involvement of structural stabilization in the mRNA maturation. PMID:6174936
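
    To illustrate why a hairpin search can run in about N^2 operations, the sketch below fills a table of perfect stem lengths for every pair of positions. It is a deliberately simplified stand-in, not the authors' method: it allows no bulges or internal loops, uses no energy model, and the pairing rules, the minimum loop size, and all function names are assumptions made for the example.

```python
# Watson-Crick pairs plus the G-U wobble pair (an assumption for this sketch)
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def stem_lengths(seq, min_loop=3):
    """stem[i][j] = number of stacked pairs (i,j), (i+1,j-1), ... closing a hairpin.

    Each cell is computed once from the cell one step inward, so the whole
    table costs O(N^2) time and memory.
    """
    n = len(seq)
    stem = [[0] * n for _ in range(n)]
    # Visit pairs by increasing span so the inner pair is already known.
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            if (seq[i], seq[j]) in PAIRS:
                inner = stem[i + 1][j - 1] if j - 1 > i + 1 else 0
                stem[i][j] = 1 + inner
    return stem


def longest_stem(seq, min_loop=3):
    """Return (length, i, j) for the longest perfect stem; (0, -1, -1) if none."""
    best = (0, -1, -1)
    for i, row in enumerate(stem_lengths(seq, min_loop)):
        for j, length in enumerate(row):
            if length > best[0]:
                best = (length, i, j)
    return best


print(longest_stem("GGGAAAACCC"))
```

    Here the outermost pair of the best stem joins positions 0 and 9, with three stacked G-C pairs enclosing a four-base loop.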

  8. Pleistocene glaciation of volcano Ajusco, central Mexico, and comparison with the standard Mexican glacial sequence

    NASA Astrophysics Data System (ADS)

    White, Sidney E.; Valastro, Salvatore

    1984-01-01

    Three Pleistocene glaciations and two Holocene Neoglacial advances occurred on volcano Ajusco in central Mexico. Lateral moraines of the oldest glaciation, the Marqués, above 3250 m are made of light-gray indurated till and are extensively modified by erosion. Below 3200 m the till is dark red, decomposed, and buried beneath volcanic colluvium and tephra. Very strongly to strongly developed soil profiles (Inceptisols) have formed in the Marqués till and in overlying colluvia and tephra. Large sharp-crested moraines of the second glaciation, the Santo Tomás, above 3300 m are composed of pale-brown firm till and are somewhat eroded by gullies. Below 3250 m the till is light reddish brown, cemented, and weathered. Less-strongly developed soil profiles (Inceptisols) have formed in the Santo Tomás till and in overlying colluvia and tephra. Narrow-crested moraines of yellowish-brown loose till of the third glaciation, the Albergue, are uneroded. Weakly developed soil profiles (Inceptisols) in the Albergue till have black ash in the upper horizon. Two small Neoglacial moraines of yellowish-brown bouldery till on the cirque floor of the largest valley support weakly developed soil profiles with only A and Cox horizons and no ash in the upper soil horizons. Radiocarbon dating of organic matter of the B horizons developed in tills, volcanic ash, and colluvial volcanic sand includes ages for both the soil-organic residue and the humic-acid fraction, with differences from 140 to 660 yr. The dating provides minimum ages of about 27,000 yr for the Marqués glaciation and about 25,000 yr for the Santo Tomás glaciation. Dates for the overlying tephra indicate a complex volcanic history for at least another 15,000 yr. Comparison of the Ajusco glacial sequence with that on Iztaccíhuatl to the east suggests that the Marqués and Santo Tomás glaciations may be equivalent to the Diamantes glaciation First and Second advances, the Albergue to the Alcalican glaciations, and the

  9. Amino acid racemization dating of fossil bones, I. inter-laboratory comparison of racemization measurements

    USGS Publications Warehouse

    Bada, J.L.; Hoopes, E.; Darling, D.; Dungworth, G.; Kessels, H.J.; Kvenvolden, K.A.; Blunt, D.J.

    1979-01-01

    Enantiomeric measurements for aspartic acid, glutamic acid, and alanine in twenty-one different fossil bone samples have been carried out by three different laboratories using different analytical methods. These inter-laboratory comparisons demonstrate that D/L aspartic acid measurements are highly reproducible, whereas the enantiomeric measurements for the other amino acids show a wide variation between the three laboratories. At present, aspartic acid measurements are the most suitable for racemization dating of bone because of their superior analytical precision.

  10. Effect of k-tuple length on sample-comparison with high-throughput sequencing data.

    PubMed

    Wang, Ying; Lei, Xiaoye; Wang, Shun; Wang, Zicheng; Song, Nianfeng; Zeng, Feng; Chen, Ting

    2016-01-22

    High-throughput metagenomic sequencing offers a powerful technique to compare microbial communities. Without requiring extra reference sequences, alignment-free models with short k-tuples (k = 2-10 bp) yielded promising results. Short k-tuples describe the overall statistical distribution, but struggle to capture the specific characteristics within one microbial community. Longer k-tuples contain more information. However, because the frequency vector of long k-tuples (k ≥ 30 bp) is sparse, the statistical measures designed for short k-tuples are not applicable. In our study, we considered each tuple as a meaningful word and each sequencing data set as a document composed of these words. The comparison between two sequencing data sets is therefore processed as "topic analysis of documents" in text mining. We designed a pipeline with long k-tuple features to compare metagenomic samples using algorithms from text mining and pattern recognition. The pipeline is available at http://culotuple.codeplex.com/. Experiments show that our pipeline with long k-tuple features: ① separates genomes with high similarity; ② outperforms short k-tuple models in all experiments. When k ≥ 12, the short k-tuple measures are not applicable anymore. When k is between 20 and 40, the long k-tuple pipeline obtains much better grouping results; ③ is free from the effect of sequencing platforms/protocols; ④ yields meaningful and supported biological results on the 40-tuples selected for comparison. PMID:26721429
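
    The idea of treating k-tuples as words can be sketched as follows. This is only an illustration of sparse k-tuple counting with a generic similarity score; cosine similarity is a stand-in chosen for brevity and is not the topic-analysis pipeline the abstract describes. The function names and the toy reads are hypothetical.

```python
from collections import Counter
from math import sqrt


def ktuple_counts(reads, k):
    """Count every k-tuple (k-mer) in a list of reads, skipping ambiguous bases.

    A Counter stores only the tuples that occur, which matters for long k,
    where the full 4**k frequency vector would be almost entirely zeros.
    """
    counts = Counter()
    for read in reads:
        read = read.upper()
        for i in range(len(read) - k + 1):
            word = read[i:i + k]
            if set(word) <= set("ACGT"):
                counts[word] += 1
    return counts


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors (0.0 if either is empty)."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


sample_a = ktuple_counts(["ACGTACGTAC", "GTACGTACGT"], k=4)
sample_b = ktuple_counts(["ACGTACGTTT", "CCCCACGTAC"], k=4)
print(round(cosine_similarity(sample_a, sample_b), 3))
```

    Raising k makes shared tuples rarer, which is the sparsity problem the abstract gives as the reason short-k statistics stop being applicable.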

  11. Efficient Nucleic Acid Extraction and 16S rRNA Gene Sequencing for Bacterial Community Characterization.

    PubMed

    Anahtar, Melis N; Bowman, Brittany A; Kwon, Douglas S

    2016-01-01

    There is a growing appreciation for the role of microbial communities as critical modulators of human health and disease. High throughput sequencing technologies have allowed for the rapid and efficient characterization of bacterial communities using 16S rRNA gene sequencing from a variety of sources. Although readily available tools for 16S rRNA sequence analysis have standardized computational workflows, sample processing for DNA extraction remains a continued source of variability across studies. Here we describe an efficient, robust, and cost-effective method for extracting nucleic acid from swabs. We also delineate downstream methods for 16S rRNA gene sequencing, including generation of sequencing libraries, data quality control, and sequence analysis. The workflow can accommodate multiple sample types, including stool and swabs collected from a variety of anatomical locations and host species. Additionally, recovered DNA and RNA can be separated and used for other applications, including whole genome sequencing or RNA-seq. The method described allows for a common processing approach for multiple sample types and accommodates downstream analysis of genomic, metagenomic and transcriptional information. PMID:27168460

  12. Efficient Nucleic Acid Extraction and 16S rRNA Gene Sequencing for Bacterial Community Characterization

    PubMed Central

    Anahtar, Melis N.; Bowman, Brittany A.; Kwon, Douglas S.

    2016-01-01

    There is a growing appreciation for the role of microbial communities as critical modulators of human health and disease. High throughput sequencing technologies have allowed for the rapid and efficient characterization of bacterial communities using 16S rRNA gene sequencing from a variety of sources. Although readily available tools for 16S rRNA sequence analysis have standardized computational workflows, sample processing for DNA extraction remains a continued source of variability across studies. Here we describe an efficient, robust, and cost-effective method for extracting nucleic acid from swabs. We also delineate downstream methods for 16S rRNA gene sequencing, including generation of sequencing libraries, data quality control, and sequence analysis. The workflow can accommodate multiple sample types, including stool and swabs collected from a variety of anatomical locations and host species. Additionally, recovered DNA and RNA can be separated and used for other applications, including whole genome sequencing or RNA-seq. The method described allows for a common processing approach for multiple sample types and accommodates downstream analysis of genomic, metagenomic and transcriptional information. PMID:27168460

  13. Design of nucleic acid sequences for DNA computing based on a thermodynamic approach.

    PubMed

    Tanaka, Fumiaki; Kameda, Atsushi; Yamamoto, Masahito; Ohuchi, Azuma

    2005-01-01

    We have developed an algorithm for designing multiple sequences of nucleic acids that have a uniform melting temperature between the sequence and its complement and that do not hybridize non-specifically with each other based on the minimum free energy (DeltaG (min)). Sequences that satisfy these constraints can be utilized in computations, various engineering applications such as microarrays, and nano-fabrications. Our algorithm is a random generate-and-test algorithm: it generates a candidate sequence randomly and tests whether the sequence satisfies the constraints. The novelty of our algorithm is that the filtering method uses a greedy search to calculate DeltaG (min). This effectively excludes inappropriate sequences before DeltaG (min) is calculated, thereby reducing computation time drastically when compared with an algorithm without the filtering. Experimental results in silico showed the superiority of the greedy search over the traditional approach based on the Hamming distance. In addition, experimental results in vitro demonstrated that the experimental free energy (DeltaG (exp)) of 126 sequences correlated better with DeltaG (min) (|R| = 0.90) than with the Hamming distance (|R| = 0.80). These results validate the rationality of a thermodynamic approach. We implemented our algorithm in a graphic user interface-based program written in Java.
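
    The random generate-and-test loop described above can be sketched in a few lines. This sketch deliberately swaps in simpler tests than the authors use: melting temperature is estimated with the rough Wallace rule (2 degrees per A/T, 4 degrees per G/C) and cross-hybridization is screened by Hamming distance, which is the traditional baseline the abstract compares against, not the DeltaG (min) criterion. All names, thresholds, and defaults are hypothetical.

```python
import random

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def wallace_tm(seq):
    """Rough melting temperature estimate (Wallace rule), in degrees Celsius."""
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def design_sequences(n, length=20, tm_target=60, tm_tol=2, min_dist=8,
                     max_tries=100_000, seed=0):
    """Generate-and-test: keep random candidates that pass every constraint."""
    rng = random.Random(seed)
    accepted = []
    for _ in range(max_tries):
        if len(accepted) == n:
            break
        cand = "".join(rng.choice("ACGT") for _ in range(length))
        # Test 1: near-uniform melting temperature across the set
        if abs(wallace_tm(cand) - tm_target) > tm_tol:
            continue
        # Test 2: far from every accepted sequence, and its reverse complement
        # is far too, so the candidate should not pair with an accepted one
        rc = reverse_complement(cand)
        if all(hamming(cand, s) >= min_dist and hamming(rc, s) >= min_dist
               for s in accepted):
            accepted.append(cand)
    return accepted


for s in design_sequences(5):
    print(s, wallace_tm(s))
```

    The loop structure would stay the same if the Hamming test were replaced by a free-energy calculation; only the acceptance test changes, which is where the abstract locates the cost that its filtering step reduces.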

  14. Preparation of Nucleic Acid Libraries for Personalized Sequencing Systems Using an Integrated Microfluidic Hub Technology (Seventh Annual Sequencing, Finishing, Analysis in the Future (SFAF) Meeting 2012)

    ScienceCinema

    Patel, Kamlesh D [Ken]; SNL,

    2016-07-12

    Kamlesh (Ken) Patel from Sandia National Laboratories (Livermore, California) presents "Preparation of Nucleic Acid Libraries for Personalized Sequencing Systems Using an Integrated Microfluidic Hub Technology" at the 7th Annual Sequencing, Finishing, Analysis in the Future (SFAF) Meeting held in June 2012 in Santa Fe, NM.

  15. Preparation of Nucleic Acid Libraries for Personalized Sequencing Systems Using an Integrated Microfluidic Hub Technology (Seventh Annual Sequencing, Finishing, Analysis in the Future (SFAF) Meeting 2012)

    SciTech Connect

    Patel, Kamlesh D; SNL,

    2012-06-01

    Kamlesh (Ken) Patel from Sandia National Laboratories (Livermore, California) presents "Preparation of Nucleic Acid Libraries for Personalized Sequencing Systems Using an Integrated Microfluidic Hub Technology" at the 7th Annual Sequencing, Finishing, Analysis in the Future (SFAF) Meeting held in June 2012 in Santa Fe, NM.

  16. Deduced amino acid sequence of human pulmonary surfactant proteolipid: SPL(pVal)

    SciTech Connect

    Whitsett, J.A.; Glasser, S.W.; Korfhagen, T.R.; Weaver, T.E.; Clark, J.; Pilot-Matias, T.; Meuth, J.; Fox, J.L.

    1987-05-01

    A hydrophobic, proteolipid-like protein of Mr 6500 was isolated from ether/ethanol extracts of human, canine and bovine pulmonary surfactant. Amino acid composition of the protein demonstrated a remarkable abundance of hydrophobic residues, particularly valine and leucine. The N-terminal amino acid sequence of the human protein was determined: N-Leu-Ile-Pro-Cys-Cys-Pro-Val-Asn-Leu-Lys-Arg-Leu-Leu-Ile-Val4... An oligonucleotide probe was used to screen an adult human lung cDNA library and resulted in detection of cDNA clones with predicted amino acid sequence with close identity to the N-terminal amino acid sequence of the human peptide. SPL(pVal) was found within the reading frame of a larger peptide. SPL(pVal) results from proteolytic processing of a larger preprotein. Northern blot analysis detected a single 1.0 kilobase SPL(pVal) RNA which was less abundant in fetal than in adult lung. Mixtures of purified canine and bovine SPL(pVal) and synthetic phospholipids display properties of rapid adsorption and surface tension lowering activity characteristic of surfactant. Human SPL(pVal) is a pulmonary surfactant proteolipid which may therefore be useful in combination with phospholipids and/or other surfactant proteins for the treatment of surfactant deficiency such as hyaline membrane disease in newborn infants.

  17. Complete amino acid sequence of a human monocyte chemoattractant, a putative mediator of cellular immune reactions.

    PubMed Central

    Robinson, E A; Yoshimura, T; Leonard, E J; Tanaka, S; Griffin, P R; Shabanowitz, J; Hunt, D F; Appella, E

    1989-01-01

    In a study of the structural basis for leukocyte specificity of chemoattractants, we determined the complete amino acid sequence of human glioma-derived monocyte chemotactic factor (GDCF-2), a peptide that attracts human monocytes but not neutrophils. The choice of a tumor cell product for analysis was dictated by its relative abundance and an amino acid composition indistinguishable from that of lymphocyte-derived chemotactic factor (LDCF), the agonist thought to account for monocyte accumulation in cellular immune reactions. By a combination of Edman degradation and mass spectrometry, it was established that GDCF-2 comprises 76 amino acid residues, commencing at the N terminus with pyroglutamic acid. The peptide contains four half-cystines, at positions 11, 12, 36, and 52, which create a pair of loops, clustered at the disulfide bridges. The relative positions of the half-cystines are almost identical to those of monocyte-derived neutrophil chemotactic factor (MDNCF), a peptide of similar mass but with only 24% sequence identity to GDCF. Thus, GDCF and MDNCF have a similar gross secondary structure because of the loops formed by the clustered disulfides, and their different leukocyte specificities are most likely determined by the large differences in primary sequence. PMID:2648385

  18. Amino acid sequences of lower vertebrate parvalbumins and their evolution: parvalbumins of boa, turtle, and salamander.

    PubMed

    Maeda, N; Zhu, D X; Fitch, W M

    1984-11-01

    One major parvalbumin each was isolated from the skeletal muscle of two reptiles, a boa snake, Boa constrictor, and a map turtle, Graptemys geographica, while two parvalbumins were isolated from an amphibian, the salamander Amphiuma means. The amino acid sequences of all four parvalbumins were determined from the sequences of their tryptic peptides, which were ordered partially by homology to other parvalbumins. Phylogenetic study of these and 16 other parvalbumin sequences revealed that the turtle parvalbumin belongs to the beta lineage, while the salamander sequences belong, one each, to the alpha and beta lineages defined by Goodman and Pechère (1977). Boa parvalbumin, however, while belonging to the beta lineage, clusters within the fish in all reasonably parsimonious trees. The most parsimonious trees show many parallel or back mutations in the evolution of many parvalbumin residues, although the residues responsible for Ca2+ binding are very well conserved. These most parsimonious trees show an actinopterygian rather than a crossopterygian origin of the tetrapods in both the alpha and beta groups. One of two electric eel parvalbumins is evolving more than 10 times faster than its paralogous partner, suggesting it may be on its way to becoming a pseudogene. It is concluded that varying rates of amino acid replacement, much homoplasy, considerable gene duplication, plus complicated lineages make the set of parvalbumin sequences unsuitable for systematic study of the origin of the tetrapods and other higher-taxa divergence, although it may be suitable within a genus or family.

  19. DNA Cloning of Plasmodium falciparum Circumsporozoite Gene: Amino Acid Sequence of Repetitive Epitope

    NASA Astrophysics Data System (ADS)

    Enea, Vincenzo; Ellis, Joan; Zavala, Fidel; Arnot, David E.; Asavanich, Achara; Masuda, Aoi; Quakyi, Isabella; Nussenzweig, Ruth S.

    1984-08-01

    A clone of complementary DNA encoding the circumsporozoite (CS) protein of the human malaria parasite Plasmodium falciparum has been isolated by screening an Escherichia coli complementary DNA library with a monoclonal antibody to the CS protein. The DNA sequence of the complementary DNA insert encodes a four-amino acid sequence: proline-asparagine-alanine-asparagine, tandemly repeated 23 times. The CS β-lactamase fusion protein specifically binds monoclonal antibodies to the CS protein and inhibits the binding of these antibodies to native Plasmodium falciparum CS protein. These findings provide a basis for the development of a vaccine against Plasmodium falciparum malaria.

  20. Nucleotide and amino acid sequences of human intestinal alkaline phosphatase: close homology to placental alkaline phosphatase

    SciTech Connect

    Henthorn, P.S.; Raducha, M.; Edwards, Y.H.; Weiss, M.J.; Slaughter, C.; Lafferty, M.A.; Harris, H.

    1987-03-01

    A cDNA clone for human adult intestinal alkaline phosphatase (ALP) (orthophosphoric-monoester phosphohydrolase (alkaline optimum); EC 3.1.3.1) was isolated from a λgt11 expression library. The cDNA insert of this clone is 2513 base pairs in length and contains an open reading frame that encodes a 528-amino acid polypeptide. This deduced polypeptide contains the first 40 amino acids of human intestinal ALP, as determined by direct protein sequencing. Intestinal ALP shows 86.5% amino acid identity to placental (type 1) ALP and 56.6% amino acid identity to liver/bone/kidney ALP. In the 3'-untranslated regions, intestinal and placental ALP cDNAs are 73.5% identical (excluding gaps). The evolution of this multigene enzyme family is discussed.
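
    Identity figures such as "73.5% identical (excluding gaps)" depend on how gapped columns are treated. The sketch below shows one common convention, in which alignment columns containing a gap in either sequence are left out of both the numerator and the denominator. Whether the authors used exactly this convention is an assumption, and the function name and toy alignment are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap or y == gap:
            continue  # gapped columns are excluded entirely
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy alignment: 9 columns, 2 of them gapped, 6 matches in the remaining 7
print(f"{percent_identity('ACGT-ACGT', 'ACGTTAC-A'):.1f}%")
```

    Counting gapped columns as mismatches instead would lower the figure for the same alignment, so the convention should be stated whenever identities are compared across studies.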

  1. De novo Sequencing, Characterization, and Comparison of Inflorescence Transcriptomes of Cornus canadensis and C. florida (Cornaceae)

    PubMed Central

    Zhang, Jian; Franks, Robert G.; Liu, Xiang; Kang, Ming; Keebler, Jonathan E. M.; Schaff, Jennifer E.; Huang, Hong-Wen; Xiang, Qiu-Yun (Jenny)

    2013-01-01

    Background: Transcriptome sequencing analysis is a powerful tool in molecular genetics and evolutionary biology. Here we report the results of de novo 454 sequencing, characterization, and comparison of inflorescence transcriptomes of two closely related dogwood species, Cornus canadensis and C. florida (Cornaceae). Our goals were to build a preliminary source of genome sequence data, and to identify genes potentially expressed differentially between the inflorescence transcriptomes for these important horticultural species. Results: The sequencing of cDNAs from inflorescence buds of C. canadensis (cc) and C. florida (cf), and normalized cDNAs from leaves of C. canadensis resulted in 251799 (ccBud), 96245 (ccLeaf) and 114648 (cfBud) raw reads, respectively. The de novo assembly of the high quality (HQ) reads resulted in 36088, 17802 and 21210 unigenes for ccBud, ccLeaf and cfBud. A reference transcriptome for C. canadensis was built by assembling HQ reads of ccBud and ccLeaf, containing 40884 unigenes. Reference mapping and comparative analyses found 10926 sequences were putatively specific to ccBud, and 6979 putatively specific to cfBud. Putative differentially expressed genes between ccBud and cfBud that are related to flower development and/or stress response were identified among 7718 sequences shared by ccBud and cfBud. Bi-directional BLAST found 87 (41.83% of 208) of Arabidopsis genes related to inflorescence development had putative orthologs in the dogwood transcriptomes. Comparisons of the sequences shared by ccBud and cfBud yielded 65931 high quality SNPs between the two species. The twenty unigenes with the most SNPs are listed as potential genetic markers for evolutionary studies. Conclusions: The data provide an important, although preliminary, information platform for functional genomics and evolutionary developmental biology in Cornus. The study identified putative candidates potentially involved in the genetic regulation of inflorescence evolution and

  2. Method for high-volume sequencing of nucleic acids: random and directed priming with libraries of oligonucleotides

    DOEpatents

    Studier, F.W.

    1995-04-18

    Random and directed priming methods for determining nucleotide sequences by enzymatic sequencing techniques, using libraries of primers of lengths 8, 9 or 10 bases, are disclosed. These methods permit direct sequencing of nucleic acids as large as 45,000 base pairs or larger without the necessity for subcloning. Individual primers are used repeatedly to prime sequence reactions in many different nucleic acid molecules. Libraries containing as few as 10,000 octamers, 14,200 nonamers, or 44,000 decamers would have the capacity to determine the sequence of almost any cosmid DNA. Random priming with a fixed set of primers from a smaller library can also be used to initiate the sequencing of individual nucleic acid molecules, with the sequence being completed by directed priming with primers from the library. In contrast to random cloning techniques, a combined random and directed priming strategy is far more efficient. 2 figs.

  3. Method for high-volume sequencing of nucleic acids: random and directed priming with libraries of oligonucleotides

    DOEpatents

    Studier, F. William

    1995-04-18

    Random and directed priming methods for determining nucleotide sequences by enzymatic sequencing techniques, using libraries of primers of lengths 8, 9 or 10 bases, are disclosed. These methods permit direct sequencing of nucleic acids as large as 45,000 base pairs or larger without the necessity for subcloning. Individual primers are used repeatedly to prime sequence reactions in many different nucleic acid molecules. Libraries containing as few as 10,000 octamers, 14,200 nonamers, or 44,000 decamers would have the capacity to determine the sequence of almost any cosmid DNA. Random priming with a fixed set of primers from a smaller library can also be used to initiate the sequencing of individual nucleic acid molecules, with the sequence being completed by directed priming with primers from the library. In contrast to random cloning techniques, a combined random and directed priming strategy is far more efficient.

  4. In Silico Genome Comparison and Distribution Analysis of Simple Sequences Repeats in Cassava

    PubMed Central

    Vásquez, Andrea; López, Camilo

    2014-01-01

    We conducted an SSR density analysis in different cassava genomic regions. The information obtained was useful to establish comparisons between cassava's genomic SSR distribution and those of poplar, flax, and Jatropha. In general, cassava has a low SSR density (~50 SSRs/Mbp) and a high proportion of pentanucleotides (24.2 SSRs/Mbp). It was found that coding sequences have 15.5 SSRs/Mbp, introns have 82.3 SSRs/Mbp, 5′ UTRs have 196.1 SSRs/Mbp, and 3′ UTRs have 50.5 SSRs/Mbp. Through motif analysis of cassava's genome SSRs, the most abundant motif was AT/AT while in intron sequences and UTR regions it was AG/CT. In addition, in coding sequences the motif AAG/CTT was also found to occur most frequently; in fact, it is the third most used codon in cassava. Sequences containing SSRs were classified according to their functional annotation of Gene Ontology categories. The SSRs identified here may be a valuable addition for genetic mapping and future studies in phylogenetic analyses and genomic evolution. PMID:25374887
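
    A density figure such as "SSRs/Mbp" can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the regular expression only finds perfect tandem repeats, and the thresholds (unit length 2 to 6, at least 3 copies) and function names are assumptions chosen for the example.

```python
import re


def find_ssrs(seq, min_unit=2, max_unit=6, min_repeats=3):
    """Return (start, unit, copies) for each perfect tandem repeat in seq.

    The lazy quantifier makes the regex try the shortest repeat unit first,
    so ATATAT is reported as (AT)3 rather than as a longer unit.
    """
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(seq.upper()):
        unit = match.group(1)
        if len(set(unit)) == 1:
            continue  # skip homopolymer runs such as AAAAAA
        hits.append((match.start(), unit, len(match.group(0)) // len(unit)))
    return hits


def ssr_density_per_mbp(n_ssrs, region_length_bp):
    """SSR count normalised to one million base pairs."""
    return n_ssrs * 1_000_000 / region_length_bp


# Toy sequence containing (AG)5 and (AAG)4.
toy = "GGC" + "AG" * 5 + "TTGAC" + "AAG" * 4 + "C"
print(find_ssrs(toy))
print(ssr_density_per_mbp(len(find_ssrs(toy)), len(toy)))
```

    Real SSR surveys typically use unit-specific minimum copy numbers and may tolerate imperfect repeats, so densities from this sketch would not be directly comparable with the values reported above.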

  5. Reaction sequences in simulated neutralized current acid waste slurry during processing with formic acid

    SciTech Connect

    Smith, H.D.; Wiemers, K.D.; Langowski, M.H.; Powell, M.R.; Larson, D.E.

    1993-11-01

    The Hanford Waste Vitrification Plant (HWVP) is being designed for the Department of Energy to immobilize high-level and transuranic wastes as glass for permanent disposal. Pacific Northwest Laboratory is supporting the HWVP design activities by conducting laboratory-scale studies using a HWVP simulated waste slurry. Conditions which affect the slurry processing chemistry were evaluated in terms of offgas composition and peak generation rate and changes in slurry composition. A standard offgas profile defined in terms of three reaction phases (decomposition of H₂CO₃, destruction of NO₂⁻, and production of H₂ and NH₃) was used as a baseline against which changes were evaluated. The test variables include nitrite concentration, acid neutralization capacity, temperature, and formic acid addition rate. Results to date indicate that pH is an important parameter influencing the N₂O/NOₓ generation ratio; nitrite can both inhibit and activate rhodium as a catalyst for formic acid decomposition to CO₂ and H₂; and a separate reduced metal phase forms in the reducing environment. These data are being compiled to provide a basis for predicting the HWVP feed processing chemistry as a function of feed composition and operation variables, recommending criteria for chemical adjustments, and providing guidelines with respect to important control parameters to consider during routine and upset plant operation.

  6. The Evidence for α-Linolenic Acid and Cardiovascular Disease Benefits: Comparisons with Eicosapentaenoic Acid and Docosahexaenoic Acid

    PubMed Central

    Fleming, Jennifer A.; Kris-Etherton, Penny M.

    2014-01-01

    Our understanding of the cardiovascular disease (CVD) benefits of α-linolenic acid (ALA, 18:3n–3) has advanced markedly during the past decade. It is now evident that ALA benefits CVD risk. The expansion of the ALA evidence base has occurred in parallel with ongoing research on eicosapentaenoic acid (EPA, 20:5n–3) and docosahexaenoic acid (DHA, 22:6n–3) and CVD. The available evidence enables comparisons to be made for ALA vs. EPA + DHA for CVD risk reduction. The epidemiologic evidence suggests comparable benefits of plant-based and marine-derived n–3 (omega-3) PUFAs. The clinical trial evidence for ALA is not as extensive; however, there have been CVD event benefits reported. Those that have been reported for EPA + DHA are stronger because only EPA + DHA differed between the treatment and control groups, whereas in the ALA studies there were diet differences beyond ALA between the treatment and control groups. Despite this, the evidence suggests many comparable CVD benefits of ALA vs. EPA + DHA. Thus, we believe that it is time to revisit what the contemporary dietary recommendation should be for ALA to decrease the risk of CVD. Our perspective is that increasing dietary ALA will decrease CVD risk; however, randomized controlled clinical trials are necessary to confirm this and to determine what the recommendation should be. With a stronger evidence base, the nutrition community will be better positioned to revise the dietary recommendation for ALA for CVD risk reduction. PMID:25398754

  7. The complete amino acid sequence of lectin-C from the roots of pokeweed (Phytolacca americana).

    PubMed

    Yamaguchi, K; Mori, A; Funatsu, G

    1995-07-01

    The complete amino acid sequence of pokeweed lectin-C (PL-C) consisting of 126 residues has been determined. PL-C is an acidic simple protein with molecular mass of 13,747 Da and consists of three cysteine-rich domains with 51-63% homology. PL-C shows homology to chitin-binding proteins such as wheat germ agglutinin, and all eight cysteine residues in the three domains of PL-C are completely conserved in all other chitin-binding domains.

  8. Amino-acid sequence of a cooperative, dimeric myoglobin from the gastropod mollusc, Buccinum undatum L.

    PubMed

    Wen, D; Laursen, R A

    1994-10-19

    The complete amino-acid sequence of a dimeric myoglobin from the radular muscle of the gastropod mollusc, Buccinum undatum L. has been determined. The globin, which shows cooperative binding of oxygen, contains 146 amino acids, is N-terminal aminoacetylated, and has histidine residues at positions 65 and 97, corresponding to the heme-binding histidines seen in mammalian myoglobins. It shows about 75% and 50% homology, respectively, with the dimeric molluscan myoglobins from Busycon canaliculatum and Cerithidea rhizophorarum, the former of which also shows weak cooperativity, but much less similarity to other species of myoglobin and hemoglobin.

  9. The Complete Genome Sequence of the Lactic Acid Bacterium Lactococcus lactis ssp. lactis IL1403

    PubMed Central

    Bolotin, Alexander; Wincker, Patrick; Mauger, Stéphane; Jaillon, Olivier; Malarme, Karine; Weissenbach, Jean; Ehrlich, S. Dusko; Sorokin, Alexei

    2001-01-01

    Lactococcus lactis is a nonpathogenic AT-rich gram-positive bacterium closely related to the genus Streptococcus and is the most commonly used cheese starter. It is also the best-characterized lactic acid bacterium. We sequenced the genome of the laboratory strain IL1403, using a novel two-step strategy that comprises diagnostic sequencing of the entire genome and a shotgun polishing step. The genome contains 2,365,589 base pairs and encodes 2310 proteins, including 293 protein-coding genes belonging to six prophages and 43 insertion sequence (IS) elements. Nonrandom distribution of IS elements indicates that the chromosome of the sequenced strain may be a product of recent recombination between two closely related genomes. A complete set of late competence genes is present, indicating the ability of L. lactis to undergo DNA transformation. Genomic sequence revealed new possibilities for fermentation pathways and for aerobic respiration. It also indicated a horizontal transfer of genetic information from Lactococcus to gram-negative enteric bacteria of Salmonella-Escherichia group. [The sequence data described in this paper has been submitted to the GenBank data library under accession no. AE005176.] PMID:11337471

  10. Amino acid sequence differences in pancreatic ribonucleases from water buffalo breeds from Indonesia and Italy.

    PubMed

    Sidik, A; Martena, B; Beintema, J J

    1979-12-01

    The amino acid sequences of the pancreatic ribonucleases from river-breed water buffaloes from Italy and swamp-breed water buffaloes from Indonesia differ at three positions. One of the differences involves a replacement of asparagine-34, with covalently attached carbohydrate on all molecules, in the river-breed enzyme by serine in the swamp-breed enzyme. The ribonuclease content of the pancreas differs considerably between breeds and is lower in river buffaloes. A ribonuclease preparation from two swamp buffaloes contained a minor glycosylated component. Preliminary evidence was obtained that the amino acid sequence of this component has factors in common with the main component of the swamp-breed ribonuclease and with the river-breed enzyme.

  11. Stereochemical Sequence Ion Selectivity: Proline versus Pipecolic-acid-containing Protonated Peptides

    NASA Astrophysics Data System (ADS)

    Abutokaikah, Maha T.; Guan, Shanshan; Bythell, Benjamin J.

    2016-10-01

    Substitution of proline by pipecolic acid, the six-membered ring congener of proline, results in vastly different tandem mass spectra. The well-known proline effect is eliminated and amide bond cleavage C-terminal to pipecolic acid dominates instead. Why do these two ostensibly similar residues produce dramatically differing spectra? Recent evidence indicates that the proton affinities of these residues are similar, so are unlikely to explain the result [Raulfs et al., J. Am. Soc. Mass Spectrom. 25, 1705-1715 (2014)]. An additional hypothesis based on increased flexibility was also advocated. Here, we provide a computational investigation of the "pipecolic acid effect," to test this and other hypotheses to determine if theory can shed additional light on this fascinating result. Our calculations provide evidence for both the increased flexibility of pipecolic-acid-containing peptides, and structural changes in the transition structures necessary to produce the sequence ions. The most striking computational finding is inversion of the stereochemistry of the transition structures leading to "proline effect"-type amide bond fragmentation between the proline/pipecolic acid-congeners: R (proline) to S (pipecolic acid). Additionally, our calculations predict substantial stabilization of the amide bond cleavage barriers for the pipecolic acid congeners by reduction in deleterious steric interactions and provide evidence for the importance of experimental energy regime in rationalizing the spectra.

  12. On human disease-causing amino acid variants: statistical study of sequence and structural patterns

    PubMed Central

    Alexov, Emil

    2015-01-01

    Statistical analysis was carried out on a large set of naturally occurring human amino acid variations and it was demonstrated that there is a preference for some amino acid substitutions to be associated with diseases. At an amino acid sequence level, it was shown that the disease-causing variants frequently involve drastic changes of amino acid physico-chemical properties of proteins such as charge, hydrophobicity and geometry. Structural analysis of variants involved in diseases and being frequently observed in the human population showed similar trends: disease-causing variants tend to cause more changes of hydrogen bond network and salt bridges as compared with harmless amino acid mutations. Analysis of thermodynamic data reported in the literature, both experimental and computational, indicated that disease-causing variants tend to destabilize proteins and their interactions, which prompted us to investigate the effects of amino acid mutations on large databases of experimentally measured energy changes in unrelated proteins. Although the experimental datasets were linked neither to diseases nor exclusive to human proteins, the observed trends were the same: amino acid mutations tend to destabilize proteins and their interactions. Bearing in mind that structural and thermodynamic properties are interrelated, it is pointed out that any large change of any of them is anticipated to cause a disease. PMID:25689729

  13. Comparison of Rapid Methods for Analysis of Bacterial Fatty Acids

    PubMed Central

    Moss, C. Wayne; Lambert, M. A.; Merwin, W. H.

    1974-01-01

    When rapid gas-liquid chromatography methods for determination of bacterial fatty acids were compared, results showed that saponification was required for total fatty acid analysis. Transesterification with boron-trihalide reagents (BF3-CH3OH, BCl3-CH3OH) caused extensive degradation of cyclopropane acids and was less effective than saponification in releasing cellular hydroxy fatty acids. Digestion of cells with tetramethylammonium hydroxide was unsatisfactory because of extraneous gas-liquid chromatography peaks and because of lower recovery of branched-chain and hydroxy fatty acids. A simple, rapid saponification procedure which can be used for total cellular fatty acid analysis of freshly grown cells is described. PMID:4844271

  14. Self-sequencing of amino acids and origins of polyfunctional protocells

    NASA Technical Reports Server (NTRS)

    Fox, S. W.

    1984-01-01

    The role of proteins in the origin of living things is discussed. It has been experimentally established that amino acids can sequence themselves under simulated geological conditions with highly nonrandom products which accordingly contain diverse information. Multiple copies of each type of macromolecule are formed, resulting in greater power for any protoenzymic molecule than would accrue from a single copy of each type. Thermal proteins are readily incorporated into laboratory protocells. The experimental evidence for original polyfunctional protocells is discussed.

  15. Structure of the fully modified left-handed cyclohexene nucleic acid sequence GTGTACAC.

    PubMed

    Robeyns, Koen; Herdewijn, Piet; Van Meervelt, Luc

    2008-02-13

    CeNA oligonucleotides consist of a phosphorylated backbone where the deoxyribose sugars are replaced by cyclohexene moieties. The X-ray structure determination and analysis of a fully modified octamer sequence GTGTACAC, which is the first crystal structure of a carbocyclic-based nucleic acid, is presented. This particular sequence was built with left-handed building blocks and crystallizes as a left-handed double helix. The helix can be characterized as belonging to the (mirrored) A-type family. Crystallographic data were processed up to 1.53 Å, and the octamer sequence crystallizes in the space group R32. The sugar puckering is found to adopt the 3H2 half-chair conformation which mimics the C3'-endo conformation of the ribose sugar. The double helices stack on top of each other to form continuous helices, and static disorder is observed due to this end-to-end stacking.
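
    One property of the octamer GTGTACAC that is easy to check is that it equals its own reverse complement, which is consistent with two identical strands pairing into a duplex. The sketch below assumes ordinary Watson-Crick pairing (A with T, G with C) for the modified backbone; the helper names are hypothetical.

```python
# Watson-Crick complement table for DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Complement every base, then reverse the strand."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_self_complementary(seq: str) -> bool:
    """True when a strand can pair with a second copy of itself."""
    return seq.upper() == reverse_complement(seq)


octamer = "GTGTACAC"
print(octamer, reverse_complement(octamer), is_self_complementary(octamer))
```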

  16. Amino acid sequence of a protease inhibitor isolated from Sarcophaga bullata determined by mass spectrometry.

    PubMed

    Papayannopoulos, I A; Biemann, K

    1992-02-01

    The amino acid sequence of a protease inhibitor isolated from the hemolymph of Sarcophaga bullata larvae was determined by tandem mass spectrometry. Homology considerations with respect to other protease inhibitors with known primary structures assisted in the choice of the procedure followed in the sequence determination and in the alignment of the various peptides obtained from specific chemical cleavage at cysteines and enzyme digests of the S. bullata protease inhibitor. The resulting sequence of 57 residues is as follows: Val Asp Lys Ser Ala Cys Leu Gln Pro Lys Glu Val Gly Pro Cys Arg Lys Ser Asp Phe Val Phe Phe Tyr Asn Ala Asp Thr Lys Ala Cys Glu Glu Phe Leu Tyr Gly Gly Cys Arg Gly Asn Asp Asn Arg Phe Asn Thr Lys Glu Glu Cys Glu Lys Leu Cys Leu.
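
    The three-letter sequence above can be converted to one-letter code and checked against the stated length of 57 residues. This is a small sketch using the standard amino acid abbreviations; the variable and function names are hypothetical.

```python
# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}

# Sequence as printed in the abstract.
INHIBITOR_3LETTER = (
    "Val Asp Lys Ser Ala Cys Leu Gln Pro Lys Glu Val Gly Pro Cys Arg Lys "
    "Ser Asp Phe Val Phe Phe Tyr Asn Ala Asp Thr Lys Ala Cys Glu Glu Phe "
    "Leu Tyr Gly Gly Cys Arg Gly Asn Asp Asn Arg Phe Asn Thr Lys Glu Glu "
    "Cys Glu Lys Leu Cys Leu"
)


def to_one_letter(three_letter_seq: str) -> str:
    """Translate a space-separated three-letter sequence to one-letter code."""
    return "".join(THREE_TO_ONE[residue] for residue in three_letter_seq.split())


inhibitor = to_one_letter(INHIBITOR_3LETTER)
print(inhibitor)
print("residues:", len(inhibitor), "cysteines:", inhibitor.count("C"))
```

    The one-letter form is the usual input for alignment against other protease inhibitors of known primary structure.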

  17. Fatty Acid Profile and Unigene-Derived Simple Sequence Repeat Markers in Tung Tree (Vernicia fordii)

    PubMed Central

    Zhang, Lin; Jia, Baoguang; Tan, Xiaofeng; Thammina, Chandra S.; Long, Hongxu; Liu, Min; Wen, Shanna; Song, Xianliang; Cao, Heping

    2014-01-01

    Tung tree (Vernicia fordii) provides the sole source of tung oil widely used in industry. Lack of fatty acid composition data and molecular markers hinders biochemical, genetic and breeding research. The objectives of this study were to determine fatty acid profiles and develop unigene-derived simple sequence repeat (SSR) markers in tung tree. Fatty acid profiles of 41 accessions showed that the ratio of α-eleostearic acid was increasing continuously with a parallel trend to the amount of tung oil accumulation while the ratios of other fatty acids were decreasing in different stages of the seeds and that α-eleostearic acid (18:3) constituted 77% of the total fatty acids in tung oil. Transcriptome sequencing identified 81,805 unigenes from a tung cDNA library constructed using seed mRNA and discovered 6,366 SSRs in 5,404 unigenes. The di- and tri-nucleotide microsatellites accounted for 92% of the SSRs with AG/CT and AAG/CTT being the most abundant SSR motifs. Fifteen polymorphic genic-SSR markers were developed from 98 unigene loci tested in 41 cultivated tung accessions by agarose gel and capillary electrophoresis. A GenBank database search identified 10 of them putatively coding for functional proteins. Quantitative PCR demonstrated that all 15 polymorphic SSR-associated unigenes were expressed in tung seeds and some of them were highly correlated with oil composition in the seeds. A dendrogram revealed that most of the 41 accessions were clustered according to the geographic region. These new polymorphic genic-SSR markers will facilitate future studies on genetic diversity, molecular fingerprinting, comparative genomics and genetic mapping in tung tree. The lipid profiles in the seeds of 41 tung accessions will be valuable for biochemical and breeding studies. PMID:25167054

  18. Some properties and amino acid sequence of plastocyanin from a green alga, Ulva arasakii.

    PubMed

    Yoshizaki, F; Fukazawa, T; Mishina, Y; Sugimura, Y

    1989-08-01

    Plastocyanin was purified from a multicellular, marine green alga, Ulva arasakii, by conventional methods to homogeneity. The oxidized plastocyanin showed absorption maxima at 252, 276.8, 460, 595.3, and 775 nm, and shoulders at 259, 265, 269, and 282.5 nm; the ratio A276.8/A595.3 was 1.5. The midpoint redox potential was determined to be 0.356 V at pH 7.0 with a ferri- and ferrocyanide system. The molecular weight was estimated to be 10,200 and 11,000 by SDS-PAGE and by gel filtration, respectively. U. arasakii also has a small amount of cytochrome c6, like Enteromorpha prolifera. The amino acid sequence of U. arasakii plastocyanin was determined by Edman degradation and by carboxypeptidase digestion of the plastocyanin, six tryptic peptides, and five staphylococcal protease peptides. The plastocyanin contained 98 amino acid residues, giving a molecular weight of 10,236 including one copper atom. The complete sequence is as follows: AQIVKLGGDDGALAFVPSKISVAAGEAIEFVNNAGFPHNIVFDEDAVPAGVDADAISYDDYLNSKGETVVRKLSTPGVYGVYCEPHAGAGMKMTITVQ. The sequence of U. arasakii plastocyanin is closest to that of the E. prolifera protein (85% homology). A phylogenetic tree of five algal and two higher plant plastocyanins was constructed by comparing the amino acid differences. The branching order is considered to be as follows: a blue-green alga, unicellular green algae, multicellular green algae, and higher plants. PMID:2509442

  19. Complete amino acid sequence of chitinase-A from leaves of pokeweed (Phytolacca americana).

    PubMed

    Yamagami, T; Tanigawa, M; Ishiguro, M; Funatsu, G

    1998-04-01

    The complete amino acid sequence of pokeweed leaf chitinase-A was determined. First all 11 tryptic peptides from the reduced and S-carboxymethylated form of the enzyme were sequenced. Then the same form of the enzyme was cleaved with cyanogen bromide, giving three fragments. The fragments were digested with chymotrypsin or Staphylococcus aureus V8 protease. Last, the 11 tryptic peptides were put in order. Of seven cysteine residues, six were linked by disulfide bonds (between Cys25 and Cys74, Cys89 and Cys98, and Cys195 and Cys208); Cys176 was free. The enzyme consisted of 208 amino acid residues and had a molecular weight of 22,391. It consisted of only one polypeptide chain without a chitin-binding domain. The length of the chain was almost the same as that of the catalytic domains of class IL chitinases. These findings suggested that this enzyme is a new kind of class IIL chitinase, although its sequence resembles that of catalytic domains of class IL chitinases more than that of the class IIL chitinases reported so far. A discussion of the involvement of a specific tryptophan residue in the active site of PLC-A is also given, based on the sequence similarity with rye seed chitinase-c.

  20. [MOLECULAR EVOLUTION OF ION CHANNELS: AMINO ACID SEQUENCES AND 3D STRUCTURES].

    PubMed

    Korkosh, V S; Zhorov, B S; Tikhonov, D B

    2016-01-01

    An integral part of modern evolutionary biology is comparative analysis of structure and function of macromolecules such as proteins. The first and critical step to understand evolution of homologous proteins is their amino acid sequence alignment. However, standard algorithms do not provide unambiguous sequence alignments for proteins of poor homology. More reliable results can be obtained by comparing experimental 3D structures obtained at atomic resolution, for instance, with the aid of X-ray structural analysis. If such structures are lacking, homology modeling is used, which may take into account indirect experimental data on functional roles of individual amino-acid residues. An important problem is that the sequence alignment, which reflects genetic modifications, does not necessarily correspond to the functional homology. The latter depends on three-dimensional structures which are critical for natural selection. Since alignment techniques relying only on the analysis of primary structures carry no information on the functional properties of proteins, including 3D structures into consideration is very important. Here we consider several examples involving ion channels and demonstrate that alignment of their three-dimensional structures can significantly improve sequence alignments obtained by traditional methods.

  1. nWayComp: a genome-wide sequence comparison tool for multiple strains/species of phylogenetically related microorganisms.

    PubMed

    Yao, Jiqiang; Lin, Hong; Doddapaneni, Harshavardhan; Civerolo, Edwin L

    2007-01-01

    The increasing number of whole genomic sequences of microorganisms has led to the complexity of genome-wide annotation and gene sequence comparison among multiple microorganisms. To address this problem, we have developed nWayComp software that compares DNA and protein sequences of phylogenetically-related microorganisms. This package integrates a series of bioinformatics tools such as BLAST, ClustalW, ALIGN, PHYLIP and PRIMER3 for sequence comparison. It searches for homologous sequences among multiple organisms and identifies genes that are unique to a particular organism. The homologous gene sets are then ranked in the descending order of the sequence similarity. For each set of homologous sequences, a table of sequence identity among homologous genes along with sequence variations such as SNPs and INDELS is developed, and a phylogenetic tree is constructed. In addition, a common set of primers that can amplify all the homologous sequences are generated. The nWayComp package provides users with a quick and convenient tool to compare genomic sequences among multiple organisms at the whole-genome level. PMID:17688445
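
    The identity table and ranking step described above can be sketched for sequences that are already aligned. This is an illustration only and not nWayComp code, which delegates alignment to tools such as BLAST and ClustalW; the strain names, toy sequences and function names are hypothetical, and "-" is assumed to mark a gap.

```python
from itertools import combinations


def compare_aligned(a: str, b: str) -> dict:
    """Percent identity, SNP count and indel columns for two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    matches = snps = indels = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" and y == "-":
            continue            # column is a gap in both sequences
        if x == "-" or y == "-":
            indels += 1         # gap in exactly one sequence
        elif x == y:
            matches += 1
        else:
            snps += 1           # substitution
    compared = matches + snps + indels
    identity = 100.0 * matches / compared if compared else 0.0
    return {"identity": identity, "snps": snps, "indels": indels}


def rank_pairs(aligned: dict) -> list:
    """All pairwise comparisons, sorted by descending percent identity."""
    rows = [(n1, n2, compare_aligned(aligned[n1], aligned[n2]))
            for n1, n2 in combinations(aligned, 2)]
    return sorted(rows, key=lambda row: row[2]["identity"], reverse=True)


# Hypothetical aligned homologs from three strains.
toy_alignment = {
    "strainA": "ATGGCTAAGC",
    "strainB": "ATGGCAAAGC",
    "strainC": "ATG-CAAAGC",
}
for name1, name2, stats in rank_pairs(toy_alignment):
    print(name1, name2, stats)
```

    Counting each gap column as one indel is a simplification; a tool working on real genomes would usually merge adjacent gap columns into a single event.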

  2. Comparison of the complete genome sequences of Pseudomonas syringae pv. syringae B728a and pv. tomato DC3000

    SciTech Connect

    Feil, H; Feil, W S; Chain, P; Larimer, F; DiBartolo, G; Copeland, A; Lykidis, A; Trong, S; Nolan, M; Goltsman, E; Thiel, J; Malfatti, S; Loper, J E; Lapidus, A; Detter, J C; Land, M; Richardson, P M; Kyrpides, N C; Ivanova, N; Lindow, S E

    2005-07-14

    The complete genomic sequence of Pseudomonas syringae pathovar syringae B728a (Pss B728a) has been determined and is compared with that of Pseudomonas syringae pv. tomato DC3000 (Pst DC3000). The two pathovars of this economically important species of plant pathogenic bacteria differ in host range and other interactions with plants, with Pss having a more pronounced epiphytic stage of growth and higher abiotic stress tolerance and Pst DC3000 having a more pronounced apoplastic growth habitat. The Pss B728a genome (6.1 megabases) contains a circular chromosome and no plasmid, whereas the Pst DC3000 genome is 6.5 Mbp in size, composed of a circular chromosome and two plasmids. While a high degree of similarity exists between the two sequenced Pseudomonads, 976 protein-encoding genes are unique to Pss B728a when compared to Pst DC3000, including large genomic islands likely to contribute to virulence and host specificity. Over 375 repetitive extragenic palindromic sequences (REPs) unique to Pss B728a when compared to Pst DC3000 are widely distributed throughout the chromosome except in 14 genomic islands, which generally had lower GC content than the genome as a whole. Content of the genomic islands varies, with one containing a prophage and another the plasmid pKLC102 of P. aeruginosa PAO1. Among the 976 genes of Pss B728a with no counterpart in Pst DC3000 are those encoding for syringopeptin (SP), syringomycin (SR), indole acetic acid biosynthesis, arginine degradation, and production of ice nuclei. The genomic comparison suggests that several unique genes for Pss B728a such as ectoine synthase, DNA repair, and antibiotic production may contribute to epiphytic fitness and stress tolerance of this organism.
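
    The observation that genomic islands tend to have lower GC content than the genome as a whole suggests a simple screening idea, sketched below with a sliding window. This is not the method used in the study; the window size, step, margin and function names are assumptions for the example, and a toy sequence stands in for a real chromosome.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of bases in seq that are G or C."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return sum(1 for base in seq if base in "GC") / len(seq)


def low_gc_windows(genome: str, window: int, step: int, margin: float) -> list:
    """Windows whose GC fraction is more than `margin` below the genome-wide value."""
    overall = gc_fraction(genome)
    flagged = []
    for start in range(0, len(genome) - window + 1, step):
        fraction = gc_fraction(genome[start:start + window])
        if fraction < overall - margin:
            flagged.append((start, fraction))
    return flagged


# Toy "genome": a GC-rich backbone with an AT-rich insert at positions 100-139.
toy_genome = "GC" * 50 + "AT" * 20 + "GC" * 50
print(low_gc_windows(toy_genome, window=20, step=20, margin=0.2))
```

    Low GC content alone does not establish that a region is a genomic island; in practice it would be combined with other evidence such as gene content.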

  3. Comparison of the complete genome sequences of Pseudomonas syringae pv. syringae B728a and pv. tomato DC3000

    SciTech Connect

    Feil, Helene; Feil, William; Chain, Patrick S. G.; Larimer, Frank W; DiBartolo, Genevieve; Copeland, A; Lykidis, A; Trong, Stephen; Nolan, Matt; Goltsman, Eugene; Thiel, James; Malfatti, Stephanie; Loper, Joyce E.; Detter, J C; Lapidus, Alla L.; Land, Miriam L; Richardson, P M; Kyrpides, Nikos C; Ivanova, N; Lindow, Steven E.

    2005-01-01

    The complete genomic sequence of Pseudomonas syringae pv. syringae B728a (Pss B728a) has been determined and is compared with that of P. syringae pv. tomato DC3000 (Pst DC3000). The two pathovars of this economically important species of plant pathogenic bacteria differ in host range and other interactions with plants, with Pss having a more pronounced epiphytic stage of growth and higher abiotic stress tolerance and Pst DC3000 having a more pronounced apoplastic growth habitat. The Pss B728a genome (6.1 Mb) contains a circular chromosome and no plasmid, whereas the Pst DC3000 genome is 6.5 Mbp in size, composed of a circular chromosome and two plasmids. Although a high degree of similarity exists between the two sequenced Pseudomonads, 976 protein-encoding genes are unique to Pss B728a when compared with Pst DC3000, including large genomic islands likely to contribute to virulence and host specificity. Over 375 repetitive extragenic palindromic sequences unique to Pss B728a when compared with Pst DC3000 are widely distributed throughout the chromosome except in 14 genomic islands, which generally had lower GC content than the genome as a whole. Content of the genomic islands varies, with one containing a prophage and another the plasmid pKLC102 of Pseudomonas aeruginosa PAO1. Among the 976 genes of Pss B728a with no counterpart in Pst DC3000 are those encoding for syringopeptin, syringomycin, indole acetic acid biosynthesis, arginine degradation, and production of ice nuclei. The genomic comparison suggests that several unique genes for Pss B728a such as ectoine synthase, DNA repair, and antibiotic production may contribute to the epiphytic fitness and stress tolerance of this organism.

  4. Implicit Sequence Learning in Dyslexia: A Within-Sequence Comparison of First- and Higher-Order Information

    ERIC Educational Resources Information Center

    Du, Wenchong; Kelly, Steve W.

    2013-01-01

    The present study examines implicit sequence learning in adult dyslexics with a focus on comparing sequence transitions with different statistical complexities. Learning of a 12-item deterministic sequence was assessed in 12 dyslexic and 12 non-dyslexic university students. Both groups showed equivalent standard reaction time increments when the…

  5. Complete Genome Sequence of a thermotolerant sporogenic lactic acid bacterium, Bacillus coagulans strain 36D1

    PubMed Central

    Rhee, Mun Su; Moritz, Brélan E.; Xie, Gary; Glavina del Rio, T.; Dalin, E.; Tice, H.; Bruce, D.; Goodwin, L.; Chertkov, O.; Brettin, T.; Han, C.; Detter, C.; Pitluck, S.; Land, Miriam L.; Patel, Milind; Ou, Mark; Harbrucker, Roberta; Ingram, Lonnie O.; Shanmugam, K. T.

    2011-01-01

    Bacillus coagulans is a ubiquitous soil bacterium that grows at 50-55 °C and pH 5.0 and ferments various sugars that constitute plant biomass to L(+)-lactic acid. The ability of this sporogenic lactic acid bacterium to grow at 50-55 °C and pH 5.0 makes this organism an attractive microbial biocatalyst for production of optically pure lactic acid at industrial scale not only from glucose derived from cellulose but also from xylose, a major constituent of hemicellulose. This bacterium is also considered a potential probiotic. The complete genome sequence of a representative strain, B. coagulans strain 36D1, is presented and discussed. PMID:22675583

  6. BeadCons: detection of nucleic acid sequences by flow cytometry.

    PubMed

    Horejsh, Douglas; Martini, Federico; Capobianchi, Maria Rosaria

    2005-11-01

    Molecular beacons are single-stranded nucleic acid structures with a terminal fluorophore and a distal, terminal quencher. These molecules are typically used in real-time PCR assays, but have also been conjugated with solid matrices. This unit describes protocols related to molecular beacon-conjugated beads (BeadCons), whose specific hybridization with complementary target sequences can be resolved by cytometry. Assay sensitivity is achieved through the concentration of fluorescence signal on discrete particles. By using molecular beacons with different fluorophores and microspheres of different sizes, it is possible to construct a fluid array system with each bead corresponding to a specific target nucleic acid. Methods are presented for the design, construction, and use of BeadCons for the specific, multiplexed detection of unlabeled nucleic acids in solution. The use of bead-based detection methods will likely lead to the design of new multiplex molecular diagnostic tools.

  7. Measuring nanometer distances in nucleic acids using a sequence-independent nitroxide probe

    PubMed Central

    Qin, Peter Z; Haworth, Ian S; Cai, Qi; Kusnetzow, Ana K; Grant, Gian Paola G; Price, Eric A; Sowa, Glenna Z; Popova, Anna; Herreros, Bruno; He, Honghang

    2008-01-01

    This protocol describes the procedures for measuring nanometer distances in nucleic acids using a nitroxide probe that can be attached to any nucleotide within a given sequence. Two nitroxides are attached to phosphorothioates that are chemically substituted at specific sites of DNA or RNA. Inter-nitroxide distances are measured using a four-pulse double electron–electron resonance technique, and the measured distances are correlated to the parent structures using a Web-accessible computer program. Four to five days are needed for sample labeling, purification and distance measurement. The procedures described herein provide a method for probing global structures and studying conformational changes of nucleic acids and protein/nucleic acid complexes. PMID:17947978

  8. [Partial sequence homology of FtsZ in phylogenetics analysis of lactic acid bacteria].

    PubMed

    Zhang, Bin; Dong, Xiu-zhu

    2005-10-01

    FtsZ is a structurally conserved protein, which is universal among the prokaryotes. It plays a key role in prokaryote cell division. A partial fragment of the ftsZ gene about 800 bp in length was amplified and sequenced and a partial FtsZ protein phylogenetic tree for the lactic acid bacteria was constructed. By comparing the FtsZ phylogenetic tree with the 16S rDNA tree, it was shown that the two trees were similar in topology. Both trees revealed that Pediococcus spp. were closely related to the L. casei group of Lactobacillus spp., but less closely related to other lactic acid cocci such as Enterococcus and Streptococcus. The results also showed that the discriminative power of FtsZ was higher than that of 16S rDNA at both the inter-species and inter-genus levels and that FtsZ could be a very useful tool in species identification of lactic acid bacteria. PMID:16342751

  9. Comparison of ribotyping and sequence-based typing for discriminating among isolates of Bordetella bronchiseptica.

    PubMed

    Register, Karen B; Nicholson, Tracy L; Brunelle, Brian W

    2016-10-01

    PvuII ribotyping and MLST are each highly discriminatory methods for genotyping Bordetella bronchiseptica, but a direct comparison between these approaches has not been undertaken. The goal of this study was to directly compare the discriminatory power of PvuII ribotyping and MLST, using a single set of geographically and genetically diverse strains, and to determine whether subtyping based on repeat region sequences of the pertactin gene (prn) provides additional resolution. One hundred twenty-two isolates were analyzed, representing 11 mammalian or avian hosts, sourced from the United States, Europe, Israel and Australia. Thirty-two ribotype patterns were identified; one isolate could not be typed. In comparison, all isolates were typeable by MLST and a total of 30 sequence types was identified. An analysis based on Simpson's Index of Diversity (SID) revealed that ribotyping and MLST are nearly equally discriminatory, with SIDs of 0.920 for ribotyping and 0.919 for MLST. Nonetheless, for ten ribotypes and eight MLST sequence types, the alternative method discriminates among isolates that otherwise type identically. Pairing prn repeat region typing with ribotyping yielded 54 genotypes and increased the SID to 0.954. Repeat region typing combined with MLST resulted in 47 genotypes and an SID of 0.944. Given the technical and practical advantages of MLST over ribotyping, and the nominal difference in their SIDs, we conclude MLST is the preferred primary typing tool. We recommend the combination of MLST and prn repeat region typing as a high-resolution, objective and standardized approach valuable for investigating the population structure and epidemiology of B. bronchiseptica. PMID:27542997
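
    The Simpson's Index of Diversity values quoted above can be illustrated with a minimal sketch. The abstract does not state which formula the authors used; the Hunter-Gaston form, a common choice for comparing typing methods, is assumed here. The function name and the toy labels are hypothetical and are not data from the study.

```python
from collections import Counter


def simpsons_index_of_diversity(type_labels):
    """Hunter-Gaston form of Simpson's Index of Diversity (assumed formula).

    D = 1 - sum(n_j * (n_j - 1)) / (N * (N - 1)), where n_j is the number
    of isolates assigned to type j and N is the total number of isolates.
    """
    counts = Counter(type_labels)
    n = sum(counts.values())
    if n < 2:
        raise ValueError("need at least two isolates")
    same_type_pairs = sum(c * (c - 1) for c in counts.values())
    return 1.0 - same_type_pairs / (n * (n - 1))


# Hypothetical toy labels: six isolates assigned to three types.
toy_labels = ["ST1", "ST1", "ST2", "ST3", "ST3", "ST3"]
print(round(simpsons_index_of_diversity(toy_labels), 3))
```

    The index is the probability that two isolates drawn at random without replacement are assigned to different types, so a method that gives every isolate its own type scores 1.0.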

  10. Substrate-Driven Mapping of the Degradome by Comparison of Sequence Logos

    PubMed Central

    Fuchs, Julian E.; von Grafenstein, Susanne; Huber, Roland G.; Kramer, Christian; Liedl, Klaus R.

    2013-01-01

    Sequence logos are frequently used to illustrate substrate preferences and specificity of proteases. Here, we employed the compiled substrates of the MEROPS database to introduce a novel metric for comparison of protease substrate preferences. The constructed similarity matrix of 62 proteases can be used to intuitively visualize similarities in protease substrate readout via principal component analysis and construction of protease specificity trees. Since our new metric is solely based on substrate data, we can engraft the protease tree including proteolytic enzymes of different evolutionary origin. Thereby, our analyses confirm pronounced overlaps in substrate recognition not only between proteases closely related on sequence basis but also between proteolytic enzymes of different evolutionary origin and catalytic type. To illustrate the applicability of our approach, we analyze the distribution of targets of small molecules from the ChEMBL database in our substrate-based protease specificity trees. We observe a striking clustering of annotated targets in tree branches even though these grouped targets do not necessarily share similarity on protein sequence level. This highlights the value and applicability of knowledge acquired from peptide substrates in drug design of small molecules, e.g., for the prediction of off-target effects or drug repurposing. Consequently, our similarity metric makes it possible to map the degradome and its associated drug target network via comparison of known substrate peptides. The substrate-driven view of protein-protein interfaces is not limited to the field of proteases but can be applied to any target class where a sufficient amount of known substrate data is available. PMID:24244149

  12. Dickeya species relatedness and clade structure determined by comparison of recA sequences.

    PubMed

    Parkinson, Neil; Stead, David; Bew, Janice; Heeney, John; Tsror Lahkim, Leah; Elphinstone, John

    2009-10-01

    Using sequences from the recA locus, we have produced a phylogeny of 188 Dickeya strains from culture collections and identified species relatedness and subspecies clade structure within the genus. Of the six recognized species, Dickeya paradisiaca, D. chrysanthemi and D. zeae were discriminated with long branch lengths. The clade containing the D. paradisiaca type strain included just one additional strain, isolated from banana in Colombia. Strains isolated from Chrysanthemum and Parthenium species made up most of the clade containing the D. chrysanthemi type strain, and the host range of this species was extended to include potato. The D. zeae clade had the largest number of sequevars and branched into two major sister clades that contained all of the Zea mays isolates, and were identified as phylotypes PI and PII. The host range was increased from six to 13 species, including potato. The recA sequence of an Australian sugar-cane strain was sufficiently distinct to rank as a new species-level branch. In contrast to these species, Dickeya dadantii, D. dianthicola and D. dieffenbachiae were distinguished with shorter branch lengths, indicating relatively closer relatedness. The recA sequence for the type strain of D. dadantii clustered separately from other strains of the species. However, sequence comparison of three additional loci revealed that the D. dadantii type strain grouped together with the six other D. dadantii strains that were sequenced. Analysis of all four loci indicated that the D. dadantii strains were most closely related to D. dieffenbachiae. Three further branches (DUC-1, -2 and -3) were associated with these three species, which all diverged from a common origin and can be considered as a species complex. The large clade containing the D. dianthicola type strain comprised 58 strains and had little sequence diversity. One sequevar accounted for the majority of these strains, which were isolated nearly exclusively from eight hosts from Europe

  13. Comparison of the effects of three different (-)-hydroxycitric acid preparations on food intake in rats: response.

    PubMed

    Preuss, Harry G; Bagchi, Manashi; Bagchi, Debasis

    2006-01-01

    A response to Louter-van de Haar J, Wielinga PY, Scheurink AJ, Nieuwenhuizen AG: Comparison of the effects of three different (-)-hydroxycitric acid preparations on food intake in rats. Nutr Metabol 2005, 2:23. PMID:16846513

  14. International interlaboratory study comparing single organism 16S rRNA gene sequencing data: Beyond consensus sequence comparisons.

    PubMed

    Olson, Nathan D; Lund, Steven P; Zook, Justin M; Rojas-Cornejo, Fabiola; Beck, Brian; Foy, Carole; Huggett, Jim; Whale, Alexandra S; Sui, Zhiwei; Baoutina, Anna; Dobeson, Michael; Partis, Lina; Morrow, Jayne B

    2015-03-01

    This study presents the results from an interlaboratory sequencing study for which we developed a novel high-resolution method for comparing data from different sequencing platforms for a multi-copy, paralogous gene. The combination of PCR amplification and 16S ribosomal RNA gene (16S rRNA) sequencing has revolutionized bacteriology by enabling rapid identification, frequently without the need for culture. To assess variability between laboratories in sequencing 16S rRNA, six laboratories sequenced the gene encoding the 16S rRNA from Escherichia coli O157:H7 strain EDL933 and Listeria monocytogenes serovar 4b strain NCTC11994. Participants performed sequencing methods and protocols available in their laboratories: Sanger sequencing, Roche 454 pyrosequencing®, or Ion Torrent PGM®. The sequencing data were evaluated on three levels: (1) identity of biologically conserved positions, (2) ratio of 16S rRNA gene copies featuring identified variants, and (3) the collection of variant combinations in a set of 16S rRNA gene copies. The same set of biologically conserved positions was identified for each sequencing method. Analytical methods using Bayesian and maximum likelihood statistics were developed to estimate variant copy ratios, which describe the ratio of nucleotides at each identified biologically variable position, as well as the likely set of variant combinations present in 16S rRNA gene copies. Our results indicate that estimated variant copy ratios at biologically variable positions were only reproducible for high throughput sequencing methods. Furthermore, the likely variant combination set was only reproducible with increased sequencing depth and longer read lengths. We also demonstrate novel methods for evaluating variable positions when comparing multi-copy gene sequence data from multiple laboratories generated using multiple sequencing technologies. PMID:27077030

  16. N-terminal amino acid sequences and some characteristics of fibrinolytic/hemorrhagic metalloproteinases purified from Bothrops jararaca venom.

    PubMed

    Maruyama, Masugi; Sugiki, Masahiko; Anai, Keita; Yoshida, Etsuo

    2002-08-01

    We determined the N-terminal amino acid sequences of the fibrinolytic/hemorrhagic metalloproteinases (jararafibrases I, III and IV) purified from Bothrops jararaca venom. The N-terminal amino acid sequences of jararafibrase I and its degradation products were identical to those of jararhagin, another hemorrhagic metalloproteinase purified from the same snake venom. Together with enzymatic and immunological properties, we concluded that those two enzymes are identical. The N-terminal amino acid sequence of jararafibrase III was quite similar to C-type lectin isolated from Crotalus atrox, and the protein had a hemagglutinating activity on intact rat red blood cells. PMID:12165326

  17. Protein sequence analysis by incorporating modified chaos game and physicochemical properties into Chou's general pseudo amino acid composition.

    PubMed

    Xu, Chunrui; Sun, Dandan; Liu, Shenghui; Zhang, Yusen

    2016-10-01

    In this contribution we introduced a novel graphical method to compare protein sequences. By mapping a protein sequence into 3D space based on codons and physicochemical properties of 20 amino acids, we are able to get a unique P-vector from the 3D curve. This approach is consistent with wobble theory of amino acids. We compute the distance between sequences by their P-vectors to measure similarities/dissimilarities among protein sequences. Finally, we use our method to analyze four datasets and get better results compared with previous approaches. PMID:27375218

  18. Isolation and amino acid sequence of crustacean hyperglycemic hormone precursor-related peptides.

    PubMed

    Tensen, C P; Verhoeven, A H; Gaus, G; Janssen, K P; Keller, R; Van Herp, F

    1991-01-01

    The crustacean hyperglycemic hormone (CHH) is synthesized as part of a larger preprohormone in which the sequence of CHH is N-terminally flanked by a peptide for which the name CPRP (CHH precursor-related peptide) is proposed. Both CHH and CPRP are present in the sinus gland, the neurohemal organ of neurosecretory cells located in the eyestalk of decapod crustaceans. This paper describes the isolation and sequence analysis of CPRPs isolated from sinus glands of the crab Carcinus maenas, the crayfish Orconectes limosus and the lobster Homarus americanus. The published sequence of "peptide H" isolated from the land crab, Cardisoma carnifex, has now been recognized as a CPRP in this species. Sequence comparison reveals a high level of identity for the N-terminal region (residues 1-13) between all four peptides, while identity in the C-terminal domain is high between lobster and crayfish CPRP on the one hand, and between both crab species on the other. Conserved N-terminal residues include a putative monobasic processing site at position 11, which suggests that CPRP may be a biosynthetic intermediate from which a potentially bioactive decapeptide can be derived.

  19. Purification to homogeneity and amino acid sequence analysis of two anionic species of human interleukin 1

    PubMed Central

    1986-01-01

    Two anionic species of human IL-1 have been purified to homogeneity. These molecules were characterized as having pI of 5.4 and 5.2 and molecular weights identical to IL-1/6.8 (17,500). The specific activities of IL-1/5.4 and IL-1/5.2, as measured in the mouse thymocyte co-mitogenic assay, were identical to that of IL-1/6.8, namely 1.2 × 10^7 U/mg, with half-maximal stimulation observed at 2 × 10^-11 M. IL-1/5.4 and IL-1/5.2 were found to be antigenically distinct from IL-1/6.8 in an ELISA. IL-1/5.4 was structurally distinct from IL-1/6.8 based on reverse-phase HPLC or CNBr peptides. Intact IL-1/5.2 and three intact CNBr peptides of IL-1/5.4 were sequenced, with the identification of 74 amino acid residues. These sequences were found to correspond exactly with the amino acid sequence deduced from the IL-1-alpha cDNA reported by March et al. PMID:3487613

  20. Protein meta-functional signatures from combining sequence, structure, evolution, and amino acid property information.

    PubMed

    Wang, Kai; Horst, Jeremy A; Cheng, Gong; Nickle, David C; Samudrala, Ram

    2008-09-26

    Protein function is mediated by different amino acid residues, both their positions and types, in a protein sequence. Some amino acids are responsible for the stability or overall shape of the protein, playing an indirect role in protein function. Others play a functionally important role as part of active or binding sites of the protein. For a given protein sequence, the residues and their degree of functional importance can be thought of as a signature representing the function of the protein. We have developed a combination of knowledge- and biophysics-based function prediction approaches to elucidate the relationships between the structural and the functional roles of individual residues and positions. Such a meta-functional signature (MFS), which is a collection of continuous values representing the functional significance of each residue in a protein, may be used to study proteins of known function in greater detail and to aid in experimental characterization of proteins of unknown function. We demonstrate the superior performance of MFS in predicting protein functional sites and also present four real-world examples to apply MFS in a wide range of settings to elucidate protein sequence-structure-function relationships. Our results indicate that the MFS approach, which can combine multiple sources of information and also give biological interpretation to each component, greatly facilitates the understanding and characterization of protein function.

  1. A nomenclature for the mammalian flavin-containing monooxygenase gene family based on amino acid sequence identities.

    PubMed

    Lawton, M P; Cashman, J R; Cresteil, T; Dolphin, C T; Elfarra, A A; Hines, R N; Hodgson, E; Kimura, T; Ozols, J; Phillips, I R

    1994-01-01

    A nomenclature based on comparisons of amino acid sequences is proposed for the members of the mammalian flavin-containing monooxygenase (FMO) gene family. This nomenclature is based on evidence of a single gene family composed of five genes. The percentage identities of the amino acid sequences of the five known forms of mammalian FMO are between 52 and 57% in rabbit and between 50 and 58% across species lines. The identities of all orthologs are greater than 82%. There is no evidence for multiple, highly related forms of the enzyme or for more than one mammalian FMO gene family. In the proposed system, the mammalian flavin-containing monooxygenase gene family is designated as "FMO" and the individual genes are distinguished by an Arabic numeral. The FMOs known as the "liver" and "lung" enzymes become FMO1 and FMO2, and the more recently described forms of the enzymes become FMO3, FMO4, and FMO5. Human FMO gene designations, FMO1 and FMO3, remain unchanged, but the gene designated FMO2 becomes FMO4. Following convention, the genes and cDNA designations will be italicized and the mRNA and protein designations will be nonitalicized. The purpose of the proposed nomenclature is to provide for the unambiguous identification of orthologous forms of mammalian FMOs, regardless of the species or tissue in question. The proposed classification considers only members of the mammalian flavin-containing monooxygenase gene family and has no bearing on the generally accepted definition of a multisubstrate flavin-containing monooxygenase.
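
    The percentage identities that underpin this nomenclature can be illustrated with a minimal sketch. The abstract does not say how the identities were computed; this version assumes two pre-aligned sequences of equal length and counts identical residues over the columns where neither sequence has a gap. The function name, the gap convention and the toy sequences are all hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity between two pre-aligned sequences.

    Assumed convention: columns where either sequence has a gap are
    excluded from both the numerator and the denominator.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # skip gapped columns
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Hypothetical toy alignment, not real FMO sequences.
print(round(percent_identity("MKT-AYL", "MRTGAYI"), 1))
```

    Other denominator conventions (alignment length, or length of the shorter sequence) give different percentages, so thresholds such as the 82% ortholog cutoff are only comparable when the same convention is used.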

  2. Comparison of mapping algorithms used in high-throughput sequencing: application to Ion Torrent data

    PubMed Central

    2014-01-01

    Background: The rapid evolution in high-throughput sequencing (HTS) technologies has opened up new perspectives in several research fields and led to the production of large volumes of sequence data. A fundamental step in HTS data analysis is the mapping of reads onto reference sequences. Choosing a suitable mapper for a given technology and a given application is a subtle task because of the difficulty of evaluating mapping algorithms. Results: In this paper, we present a benchmark procedure to compare mapping algorithms used in HTS using both real and simulated datasets and considering four evaluation criteria: computational resource and time requirements, robustness of mapping, ability to report positions for reads in repetitive regions, and ability to retrieve true genetic variation positions. To measure robustness, we introduced a new definition for a correctly mapped read taking into account not only the expected start position of the read but also the end position and the number of indels and substitutions. We developed CuReSim, a new read simulator, that is able to generate customized benchmark data for any kind of HTS technology by adjusting parameters to the error types. CuReSim and CuReSimEval, a tool to evaluate the mapping quality of the CuReSim simulated reads, are freely available. We applied our benchmark procedure to evaluate 14 mappers in the context of whole genome sequencing of small genomes with Ion Torrent data for which such a comparison has not yet been established. Conclusions: A benchmark procedure to compare HTS data mappers is introduced with a new definition for the mapping correctness as well as tools to generate simulated reads and evaluate mapping quality. The application of this procedure to Ion Torrent data from the whole genome sequencing of small genomes has allowed us to validate our benchmark procedure and demonstrate that it is helpful for selecting a mapper based on the intended application, questions to be addressed, and the
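
    The definition of a correctly mapped read described in this abstract (start position, end position, and number of indels and substitutions) can be sketched as a simple predicate. This is not the CuReSimEval implementation, whose exact criteria are not given here; the class, the function and both tolerance parameters are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReadPlacement:
    """Where a read sits on the reference and how it differs from it."""
    start: int
    end: int
    indels: int
    substitutions: int


def is_correctly_mapped(expected, observed, pos_tolerance=0, edit_tolerance=0):
    """Return True when the observed mapping agrees with the simulated truth.

    Assumption: agreement means start, end, indel count and substitution
    count each differ by no more than the given (hypothetical) tolerances.
    """
    return (
        abs(observed.start - expected.start) <= pos_tolerance
        and abs(observed.end - expected.end) <= pos_tolerance
        and abs(observed.indels - expected.indels) <= edit_tolerance
        and abs(observed.substitutions - expected.substitutions) <= edit_tolerance
    )


# Hypothetical simulated read and two candidate mappings.
truth = ReadPlacement(start=1000, end=1199, indels=2, substitutions=1)
good = ReadPlacement(start=1000, end=1199, indels=2, substitutions=1)
shifted_end = ReadPlacement(start=1000, end=1205, indels=2, substitutions=1)
print(is_correctly_mapped(truth, good))         # start, end and edits all agree
print(is_correctly_mapped(truth, shifted_end))  # end position disagrees
```

    A check based on the start position alone would accept the second mapping, which is the kind of case the stricter definition is meant to catch.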

  3. Bacteria obtained from a sequencing batch reactor that are capable of growth on dehydroabietic acid.

    PubMed

    Mohn, W W

    1995-06-01

    Eleven isolates capable of growth on the resin acid dehydroabietic acid (DhA) were obtained from a sequencing batch reactor designed to treat a high-strength process stream from a paper mill. The isolates belonged to two groups, represented by strains DhA-33 and DhA-35, which were characterized. In the bioreactor, bacteria like DhA-35 were more abundant than those like DhA-33. The population in the bioreactor of organisms capable of growth on DhA was estimated to be 1.1 × 10^6 propagules per ml, based on a most-probable-number determination. Analysis of small-subunit rRNA partial sequences indicated that DhA-33 was most closely related to Sphingomonas yanoikuyae (Sab = 0.875) and that DhA-35 was most closely related to Zoogloea ramigera (Sab = 0.849). Both isolates additionally grew on other abietanes, i.e., abietic and palustric acids, but not on the pimaranes, pimaric and isopimaric acids. For DhA-33 and DhA-35 with DhA as the sole organic substrate, doubling times were 2.7 and 2.2 h, respectively, and growth yields were 0.30 and 0.25 g of protein per g of DhA, respectively. Glucose as a cosubstrate stimulated growth of DhA-33 on DhA and stimulated DhA degradation by the culture. Pyruvate as a cosubstrate did not stimulate growth of DhA-35 on DhA and reduced the specific rate of DhA degradation of the culture. DhA induced DhA and abietic acid degradation activities in both strains, and these activities were heat labile. Cell suspensions of both strains consumed DhA at a rate of 6 µmol mg of protein^-1 h^-1. (ABSTRACT TRUNCATED AT 250 WORDS)
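
    The doubling times reported above can be converted to specific growth rates with a short worked example. The sketch assumes simple exponential growth, for which the specific growth rate is ln 2 divided by the doubling time; the abstract itself reports only the doubling times, and the function name is hypothetical.

```python
import math


def specific_growth_rate(doubling_time_h):
    """Specific growth rate (per hour), assuming exponential growth.

    mu = ln(2) / t_d, where t_d is the doubling time in hours.
    """
    if doubling_time_h <= 0:
        raise ValueError("doubling time must be positive")
    return math.log(2) / doubling_time_h


# Doubling times quoted in the abstract for growth on DhA.
for strain, t_d in (("DhA-33", 2.7), ("DhA-35", 2.2)):
    print(f"{strain}: mu = {specific_growth_rate(t_d):.3f} per hour")
```

    Under this assumption the two strains grow at roughly 0.26 and 0.32 per hour, respectively.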

  4. Development of a SCAR (sequence-characterised amplified region) marker for acid resistance-related gene in Lactobacillus plantarum.

    PubMed

    Liu, Shu-Wen; Li, Kai; Yang, Shi-Ling; Tian, Shu-Fen; He, Ling

    2015-03-01

    A sequence-characterised amplified region (SCAR) marker was developed to detect an acid resistance-related gene in Lactobacillus plantarum. A random amplified polymorphic DNA marker named S116-680 was reported to be closely related to the acid resistance of the strains. The DNA band corresponding to this marker was cloned and sequenced, and specific PCR primers were designed from it. PCR with these primers amplified a clear, specific band of 680 bp in the tested acid-resistant strains. The S116-680 marker would be useful to explore the acid-resistance mechanism of L. plantarum and to screen desirable malolactic fermentation strains.

  5. Nucleic and amino acid sequences relating to a novel transketolase, and methods for the expression thereof

    DOEpatents

    Croteau, Rodney Bruce; Wildung, Mark Raymond; Lange, Bernd Markus; McCaskill, David G.

    2001-01-01

    cDNAs encoding 1-deoxyxylulose-5-phosphate synthase from peppermint (Mentha piperita) have been isolated and sequenced, and the corresponding amino acid sequences have been determined. Accordingly, isolated DNA sequences (SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7) are provided which code for the expression of 1-deoxyxylulose-5-phosphate synthase from plants. In another aspect the present invention provides for isolated, recombinant DXPS proteins, such as the proteins having the sequences set forth in SEQ ID NO:4, SEQ ID NO:6 and SEQ ID NO:8. In other aspects, replicable recombinant cloning vehicles are provided which code for plant 1-deoxyxylulose-5-phosphate synthases, or for a base sequence sufficiently complementary to at least a portion of 1-deoxyxylulose-5-phosphate synthase DNA or RNA to enable hybridization therewith. In yet other aspects, modified host cells are provided that have been transformed, transfected, infected and/or injected with a recombinant cloning vehicle and/or DNA sequence encoding a plant 1-deoxyxylulose-5-phosphate synthase. Thus, systems and methods are provided for the recombinant expression of the aforementioned recombinant 1-deoxyxylulose-5-phosphate synthase that may be used to facilitate its production, isolation and purification in significant amounts. Recombinant 1-deoxyxylulose-5-phosphate synthase may be used to obtain expression or enhanced expression of 1-deoxyxylulose-5-phosphate synthase in plants in order to enhance the production of 1-deoxyxylulose-5-phosphate, or its derivatives such as isopentenyl diphosphate (IPP), or may be otherwise employed for the regulation or expression of 1-deoxyxylulose-5-phosphate synthase, or the production of its products.

  6. Guanine nucleotide-binding proteins that enhance choleragen ADP-ribosyltransferase activity: nucleotide and deduced amino acid sequence of an ADP-ribosylation factor cDNA.

    PubMed Central

    Price, S R; Nightingale, M; Tsai, S C; Williamson, K C; Adamik, R; Chen, H C; Moss, J; Vaughan, M

    1988-01-01

    Three (two soluble and one membrane) guanine nucleotide-binding proteins (G proteins) that enhance ADP-ribosylation of the Gs alpha stimulatory subunit of the adenylyl cyclase (EC 4.6.1.1) complex by choleragen have recently been purified from bovine brain. To further define the structure and function of these ADP-ribosylation factors (ARFs), we isolated a cDNA clone (lambda ARF2B) from a bovine retinal library by screening with a mixed heptadecanucleotide probe whose sequence was based on the partial amino acid sequence of one of the soluble ARFs from bovine brain. Comparison of the deduced amino acid sequence of lambda ARF2B with sequences of peptides from the ARF protein (total of 60 amino acids) revealed only two differences. Whether these are cloning artifacts or reflect the existence of more than one ARF protein remains to be determined. Deduced amino acid sequences of ARF, Go alpha (the alpha subunit of a G protein that may be involved in regulation of ion fluxes), and c-Ha-ras gene product p21 show similarities in regions believed to be involved in guanine nucleotide binding and GTP hydrolysis. ARF apparently lacks a site analogous to that ADP-ribosylated by choleragen in G-protein alpha subunits. Although both the ARF proteins and the alpha subunits bind guanine nucleotides and serve as choleragen substrates, they must interact with the toxin A1 peptide in different ways. In addition to serving as an ADP-ribose acceptor, ARF interacts with the toxin in a manner that modifies its catalytic properties. PMID:3135549

  7. Genome Sequence Analysis of the Naphthenic Acid Degrading and Metal Resistant Bacterium Cupriavidus gilardii CR3

    PubMed Central

    Wang, Xiaoyu; Chen, Meili; Xiao, Jingfa; Hao, Lirui; Crowley, David E.; Zhang, Zhewen; Yu, Jun; Huang, Ning; Huo, Mingxin; Wu, Jiayan

    2015-01-01

    Cupriavidus sp. are generally heavy metal tolerant bacteria with the ability to degrade a variety of aromatic hydrocarbon compounds, although the degradation pathways and substrate versatilities remain largely unknown. Here we studied the bacterium Cupriavidus gilardii strain CR3, which was isolated from a natural asphalt deposit, and which was shown to utilize naphthenic acids as a sole carbon source. Genome sequencing of C. gilardii CR3 was carried out to elucidate possible mechanisms for the naphthenic acid biodegradation. The genome of C. gilardii CR3 was composed of two circular chromosomes chr1 and chr2 of respectively 3,539,530 bp and 2,039,213 bp in size. The genome for strain CR3 encoded 4,502 putative protein-coding genes, 59 tRNA genes, and many other non-coding genes. Many genes were associated with xenobiotic biodegradation and metal resistance functions. Pathway prediction for degradation of cyclohexanecarboxylic acid, a representative naphthenic acid, suggested that naphthenic acid undergoes initial ring-cleavage, after which the ring fission products can be degraded via several plausible degradation pathways including a mechanism similar to that used for fatty acid oxidation. The final metabolic products of these pathways are unstable or volatile compounds that were not toxic to CR3. Strain CR3 was also shown to have tolerance to at least 10 heavy metals, which was mainly achieved by self-detoxification through ion efflux, metal-complexation and metal-reduction, and a powerful DNA self-repair mechanism. Our genomic analysis suggests that CR3 is well adapted to survive the harsh environment in natural asphalts containing naphthenic acids and high concentrations of heavy metals. PMID:26301592

  8. Genome Sequence Analysis of the Naphthenic Acid Degrading and Metal Resistant Bacterium Cupriavidus gilardii CR3.

    PubMed

    Wang, Xiaoyu; Chen, Meili; Xiao, Jingfa; Hao, Lirui; Crowley, David E; Zhang, Zhewen; Yu, Jun; Huang, Ning; Huo, Mingxin; Wu, Jiayan

    2015-01-01

    Cupriavidus sp. are generally heavy metal tolerant bacteria with the ability to degrade a variety of aromatic hydrocarbon compounds, although the degradation pathways and substrate versatilities remain largely unknown. Here we studied the bacterium Cupriavidus gilardii strain CR3, which was isolated from a natural asphalt deposit, and which was shown to utilize naphthenic acids as a sole carbon source. Genome sequencing of C. gilardii CR3 was carried out to elucidate possible mechanisms for the naphthenic acid biodegradation. The genome of C. gilardii CR3 was composed of two circular chromosomes chr1 and chr2 of respectively 3,539,530 bp and 2,039,213 bp in size. The genome for strain CR3 encoded 4,502 putative protein-coding genes, 59 tRNA genes, and many other non-coding genes. Many genes were associated with xenobiotic biodegradation and metal resistance functions. Pathway prediction for degradation of cyclohexanecarboxylic acid, a representative naphthenic acid, suggested that naphthenic acid undergoes initial ring-cleavage, after which the ring fission products can be degraded via several plausible degradation pathways including a mechanism similar to that used for fatty acid oxidation. The final metabolic products of these pathways are unstable or volatile compounds that were not toxic to CR3. Strain CR3 was also shown to have tolerance to at least 10 heavy metals, which was mainly achieved by self-detoxification through ion efflux, metal-complexation and metal-reduction, and a powerful DNA self-repair mechanism. Our genomic analysis suggests that CR3 is well adapted to survive the harsh environment in natural asphalts containing naphthenic acids and high concentrations of heavy metals. PMID:26301592

  9. Repeat sequence chromosome specific nucleic acid probes and methods of preparing and using

    DOEpatents

    Weier, Heinz-Ulrich G.; Gray, Joe W.

    1995-01-01

    A primer directed DNA amplification method to isolate efficiently chromosome-specific repeated DNA wherein degenerate oligonucleotide primers are used is disclosed. The probes produced are a heterogeneous mixture that can be used with blocking DNA as a chromosome-specific staining reagent, and/or the elements of the mixture can be screened for high specificity, size and/or high degree of repetition among other parameters. The degenerate primers are sets of primers that vary in sequence but are substantially complementary to highly repeated nucleic acid sequences, preferably clustered within the template DNA, for example, pericentromeric alpha satellite repeat sequences. The template DNA is preferably chromosome-specific. Exemplary primers and probes are disclosed. The probes of this invention can be used to determine the number of chromosomes of a specific type in metaphase spreads, in germ line and/or somatic cell interphase nuclei, micronuclei and/or in tissue sections. Also provided is a method to select arbitrarily repeat sequence probes that can be screened for chromosome-specificity.

  10. Repeat sequence chromosome specific nucleic acid probes and methods of preparing and using

    DOEpatents

    Weier, H.U.G.; Gray, J.W.

    1995-06-27

    A primer directed DNA amplification method to isolate efficiently chromosome-specific repeated DNA wherein degenerate oligonucleotide primers are used is disclosed. The probes produced are a heterogeneous mixture that can be used with blocking DNA as a chromosome-specific staining reagent, and/or the elements of the mixture can be screened for high specificity, size and/or high degree of repetition among other parameters. The degenerate primers are sets of primers that vary in sequence but are substantially complementary to highly repeated nucleic acid sequences, preferably clustered within the template DNA, for example, pericentromeric alpha satellite repeat sequences. The template DNA is preferably chromosome-specific. Exemplary primers and probes are disclosed. The probes of this invention can be used to determine the number of chromosomes of a specific type in metaphase spreads, in germ line and/or somatic cell interphase nuclei, micronuclei and/or in tissue sections. Also provided is a method to select arbitrarily repeat sequence probes that can be screened for chromosome-specificity. 18 figs.

  11. Unconventional amino acid sequence of the sun anemone (Stoichactis helianthus) polypeptide neurotoxin

    SciTech Connect

    Kem, W.; Dunn, B.; Parten, B.; Pennington, M.; Price, D.

    1986-05-01

    A 5000 dalton polypeptide neurotoxin (Sh-NI) purified by G50 Sephadex, P-cellulose, and SP-Sephadex chromatography was homogeneous by isoelectric focusing. Sh-NI was highly toxic to crayfish (LD50 0.6 μg/kg) but without effect upon mice at 15,000 μg/kg (i.p. injection). The reduced, 3H-carboxymethylated toxin and its fragments were subjected to automatic Edman degradation and the resulting PTH-amino acids were identified by HPLC, back hydrolysis, and scintillation counting. Peptides resulting from proteolytic (clostripain, staphylococcal protease) and chemical (tryptophan) cleavage were sequenced. The sequence is: AACKCDDEGPDIRTAPLTGTVDLGSCNAGWEKCASYYTIIADCCRKKK. This sequence differs considerably from the homologous Anemonia and Anthopleura toxins; many of the identical residues (6 half-cystines, G9, P10, R13, G19, G29, W30) are probably critical for folding rather than receptor recognition. However, the Sh-NI sequence closely resembles Radioanthus macrodactylus neurotoxin III and R. paumotensis II. The authors propose that Sh-NI and related Radioanthus toxins act upon a different site on the sodium channel.
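    The residue labels quoted in this abstract (for example G9 or W30) can be checked directly against the printed sequence. The following is a minimal sketch for that purpose only; the helper names residue_positions and check_labels are illustrative and do not come from the record.

```python
# Sequence copied verbatim from the abstract above.
SH_NI = "AACKCDDEGPDIRTAPLTGTVDLGSCNAGWEKCASYYTIIADCCRKKK"


def residue_positions(seq, residue):
    """Return the 1-based positions of a one-letter residue code in seq."""
    return [i for i, aa in enumerate(seq, start=1) if aa == residue]


def check_labels(seq, labels):
    """Check labels such as 'G9' (residue letter followed by 1-based position)."""
    return {lab: seq[int(lab[1:]) - 1] == lab[0] for lab in labels}


cys_positions = residue_positions(SH_NI, "C")
label_check = check_labels(SH_NI, ["G9", "P10", "R13", "G19", "G29", "W30"])

print(len(SH_NI))      # number of residues in the printed sequence
print(cys_positions)   # positions of the half-cystines
print(label_check)     # True where the label matches the sequence
```

    Run against the sequence as printed, the check finds six cysteines and confirms each of the six labelled positions.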

  12. Sequence-defined bioactive macrocycles via an acid-catalysed cascade reaction

    NASA Astrophysics Data System (ADS)

    Porel, Mintu; Thornlow, Dana N.; Phan, Ngoc N.; Alabi, Christopher A.

    2016-06-01

    Synthetic macrocycles derived from sequence-defined oligomers are a unique structural class whose ring size, sequence and structure can be tuned via precise organization of the primary sequence. Similar to peptides and other peptidomimetics, these well-defined synthetic macromolecules become pharmacologically relevant when bioactive side chains are incorporated into their primary sequence. In this article, we report the synthesis of oligothioetheramide (oligoTEA) macrocycles via a one-pot acid-catalysed cascade reaction. The versatility of the cyclization chemistry and modularity of the assembly process was demonstrated via the synthesis of >20 diverse oligoTEA macrocycles. Structural characterization via NMR spectroscopy revealed the presence of conformational isomers, which enabled the determination of local chain dynamics within the macromolecular structure. Finally, we demonstrate the biological activity of oligoTEA macrocycles designed to mimic facially amphiphilic antimicrobial peptides. The preliminary results indicate that macrocyclic oligoTEAs with just two-to-three cationic charge centres can elicit potent antibacterial activity against Gram-positive and Gram-negative bacteria.

  13. A new antifungal peptide from the seeds of Phytolacca americana: characterization, amino acid sequence and cDNA cloning.

    PubMed

    Shao, F; Hu, Z; Xiong, Y M; Huang, Q Z; Wang, C G; Zhu, R H; Wang, D C

    1999-03-19

    An antifungal peptide from seeds of Phytolacca americana, designated PAFP-s, has been isolated. The peptide is highly basic and consists of 38 residues with three disulfide bridges. Its molecular mass of 3929.0 was determined by mass spectrometry. The complete amino acid sequence was obtained from automated Edman degradation, and cDNA cloning was successfully performed by 3'-RACE. The deduced amino acid sequence of a partial cDNA corresponded to the amino acid sequence from chemical sequencing. PAFP-s exhibited a broad spectrum of antifungal activity, and its activities differed among various fungi. PAFP-s displayed no inhibitory activity towards Escherichia coli. PAFP-s shows significant sequence similarities and the same cysteine motif with Mj-AMPs, antimicrobial peptides from seeds of Mirabilis jalapa belonging to the knottin-type antimicrobial peptide.

  14. Amino acid sequence and variant forms of favin, a lectin from Vicia faba.

    PubMed

    Hopp, T P; Hemperly, J J; Cunningham, B A

    1982-04-25

    We have determined the complete amino acid sequence (182 residues) of the beta chain of favin, the glucose-binding lectin from fava beans (Vicia faba), and have established that the carbohydrate moiety is attached to Asn 168. Together with the sequence of the alpha chain previously reported (Hemperly, J. J., Hopp, T. P., Becker, J. W., and Cunningham, B. A. (1979) J. Biol. Chem. 254, 6803-6810), these data complete the analysis of the primary structure of the lectin. We have also examined minor polypeptides that appear in all preparations of favin. Two lower molecular weight species (Mr = 9,500-11,600) appear to be fragments of the beta chain resulting from cleavage following Asn 76, whereas six high molecular weight forms (Mr = 25,000 or greater) appear to include aggregates of the beta chain and possibly some alternative products of chain processing. PMID:7068646

  15. Pyrosequencing on templates generated by asymmetric nucleic acid sequence-based amplification (asymmetric-NASBA).

    PubMed

    Jia, Huning; Chen, Zhiyao; Wu, Haiping; Ye, Hui; Yan, Zhengyu; Zhou, Guohua

    2011-12-21

    Pyrosequencing is an ideal tool for verifying the sequence of amplicons. To enable pyrosequencing on amplicons from nucleic acid sequence-based amplification (NASBA), asymmetric NASBA with unequal concentrations of T7 promoter primer and reverse transcription primer was proposed. After optimizing the ratio of the two primers and the concentrations of dNTPs and NTPs, the amount of single-stranded cDNA in the amplicons from asymmetric NASBA was found to be 12 times higher than in conventional NASBA, as measured by real-time detection with a molecular beacon specific to the cDNA of interest. More than 20 bases have been successfully detected by pyrosequencing on amplicons from asymmetric NASBA using Human parainfluenza virus (HPIV) as an amplification template. The primary results indicate that the combination of NASBA with a pyrosequencing system is practical, and should open a new field in clinical diagnosis.

  16. Efficient Query-by-Content Audio Retrieval by Locality Sensitive Hashing and Partial Sequence Comparison

    NASA Astrophysics Data System (ADS)

    Yu, Yi; Joe, Kazuki; Downie, J. Stephen

    This paper investigates suitable indexing techniques to enable efficient content-based audio retrieval in large acoustic databases. To make an index-based retrieval mechanism applicable to audio content, we investigate the design of Locality Sensitive Hashing (LSH) and the partial sequence comparison. We propose a fast and efficient audio retrieval framework of query-by-content and develop an audio retrieval system. Based on this framework, four different audio retrieval schemes, LSH-Dynamic Programming (DP), LSH-Sparse DP (SDP), Exact Euclidean LSH (E2LSH)-DP, and E2LSH-SDP, are introduced and evaluated in order to better understand the performance of audio retrieval algorithms. The experimental results indicate that, compared with the traditional DP and the other three competitive schemes, E2LSH-SDP exhibits the best tradeoff in terms of response time, retrieval accuracy and computation cost.
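    For readers unfamiliar with Euclidean LSH, the sketch below shows the generic p-stable hash family h(v) = floor((a·v + b) / w) on which such schemes are commonly built. It is not the authors' retrieval system: the feature dimension, bucket width w, number of hashes k and the function names are all assumptions chosen for illustration.

```python
import math
import random


def make_hash(dim, w, rng):
    """One p-stable hash: project onto a random Gaussian direction, shift, bucket."""
    a = [rng.gauss(0.0, 1.0) for _ in range(dim)]  # random projection vector
    b = rng.uniform(0.0, w)                        # random offset in [0, w)

    def h(v):
        dot = sum(ai * vi for ai, vi in zip(a, v))
        return math.floor((dot + b) / w)

    return h


def make_signature(dim, k, w, seed=0):
    """Concatenate k hashes into one bucket key (hypothetical helper)."""
    rng = random.Random(seed)
    hashes = [make_hash(dim, w, rng) for _ in range(k)]
    return lambda v: tuple(h(v) for h in hashes)


# Hypothetical 4-dimensional audio feature frames.
signature = make_signature(dim=4, k=3, w=4.0)
frame = [0.10, 0.20, 0.30, 0.40]
key = signature(frame)
print(key)
```

    Frames that are close in Euclidean distance tend to share a bucket key, so only the sequences in matching buckets need to be passed on to a dynamic-programming comparison.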

  17. Nonprotein Amino Acids from Spark Discharges and Their Comparison with the Murchison Meteorite Amino Acids

    PubMed Central

    Wolman, Yecheskel; Haverland, William J.; Miller, Stanley L.

    1972-01-01

    All the nonprotein amino acids found in the Murchison meteorite are products of the action of electric discharge on a mixture of methane, nitrogen, and water with traces of ammonia. These amino acids include α-amino-n-butyric acid, α-aminoisobutyric acid, norvaline, isovaline, pipecolic acid, β-alanine, β-amino-n-butyric acid, β-aminoisobutyric acid, γ-aminobutyric acid, sarcosine, N-ethylglycine, and N-methylalanine. In addition, norleucine, alloisoleucine, N-propylglycine, N-isopropylglycine, N-methyl-β-alanine, N-ethyl-β-alanine, α,β-diaminopropionic acid, isoserine, α,γ-diaminobutyric acid, and α-hydroxy-γ-aminobutyric acid are produced by the electric discharge, but have not been found in the meteorite. PMID:16591973

  18. Morphological transformation of calcite crystal growth by prismatic "acidic" polypeptide sequences.

    SciTech Connect

    Kim, I; Giocondi, J L; Orme, C A; Collino, J; Evans, J S

    2007-02-13

    Many of the interesting mechanical and materials properties of the mollusk shell are thought to stem from the prismatic calcite crystal assemblies within this composite structure. It is now evident that proteins play a major role in the formation of these assemblies. Recently, a superfamily of 7 conserved prismatic layer-specific mollusk shell proteins, Asprich, were sequenced, and the 42 AA C-terminal sequence region of this protein superfamily was found to introduce surface voids or porosities on calcite crystals in vitro. Using AFM imaging techniques, we further investigate the effect that this 42 AA domain (Fragment-2) and its constituent subdomains, DEAD-17 and Acidic-2, have on the morphology and growth kinetics of calcite dislocation hillocks. We find that Fragment-2 adsorbs on terrace surfaces and pins acute steps, accelerates then decelerates the growth of obtuse steps, forms clusters and voids on terrace surfaces, and transforms calcite hillock morphology from a rhombohedral form to a rounded one. These results mirror yet are distinct from some of the earlier findings obtained for nacreous polypeptides. The subdomains Acidic-2 and DEAD-17 were found to accelerate then decelerate obtuse steps and induce oval rather than rounded hillock morphologies. Unlike DEAD-17, Acidic-2 does form clusters on terrace surfaces and exhibits stronger obtuse velocity inhibition effects than either DEAD-17 or Fragment-2. Interestingly, a 1:1 mixture of both subdomains induces an irregular polygonal morphology to hillocks, and exhibits the highest degree of acute step pinning and obtuse step velocity inhibition. This suggests that there is some interplay between subdomains within an intra (Fragment-2) or intermolecular (1:1 mixture) context, and sequence interplay phenomena may be employed by biomineralization proteins to exert net effects on crystal growth and morphology.

  19. Human immunoglobulin subclasses. Partial amino acid sequence of the constant region of a γ4 chain

    PubMed Central

    Pink, J. R. L.; Buttery, S. H.; De Vries, G. M.; Milstein, C.

    1970-01-01

    The heavy chain of a human myeloma protein (Vin) belonging to the γ4 subclass was subjected to tryptic digestion after reduction and carboxymethylation. Cyanogen bromide fragments were also prepared and all 19 tryptic peptides that account for one of them (the Fc-like fragment) were studied. Selected peptic peptides were isolated and provided evidence for the order of 15 of the tryptic peptides. In addition the sequence of two large peptic peptides derived from two sections of the molecule including all the interchain bridges is presented. Comparison with published data on other chains allows us to propose a sequence of γ4 chains that extends from just before the presumed starting point of the invariable region (at about residue 113) to the C-terminal end of the chain (approx. residue 446), except for a section of about 50 residues. The results of the comparison suggest that the immunoglobulin subclasses have a recent independent evolutionary origin in different species. Implications for complement fixation and for the evolutionary origin of antibody diversity are also discussed. PMID:4192699

  20. The amino-acid sequences of sculpin islet somatostatin-28 and peptide YY.

    PubMed

    Cutfield, S M; Carne, A; Cutfield, J F

    1987-04-01

    Two pancreatic peptides, somatostatin-28 and peptide YY, have been isolated from the Brockmann bodies of the teleost fish Cottus scorpius (daddy sculpin). Following purification by reverse-phase HPLC, each peptide was sequenced completely through to the carboxyl-terminus by gas-phase Edman degradation. Somatostatin-28 was the major form of somatostatin detected and is similar to the gene II product from anglerfish. Peptide YY (36 amino acids) more closely resembles porcine neuropeptide YY and intestinal peptide YY than it does the pancreatic polypeptides. PMID:2883025

  1. Sequence selective recognition of double-stranded RNA using triple helix-forming peptide nucleic acids.

    PubMed

    Zengeya, Thomas; Gupta, Pankaj; Rozners, Eriks

    2014-01-01

    Noncoding RNAs are attractive targets for molecular recognition because of the central role they play in gene expression. Since most noncoding RNAs are in a double-helical conformation, recognition of such structures is a formidable problem. Herein, we describe a method for sequence-selective recognition of biologically relevant double-helical RNA (illustrated on ribosomal A-site RNA) using peptide nucleic acids (PNA) that form a triple helix in the major groove of RNA under physiologically relevant conditions. Protocols for PNA preparation and binding studies using isothermal titration calorimetry are described in detail.

  2. Sequence selective double strand DNA cleavage by peptide nucleic acid (PNA) targeting using nuclease S1.

    PubMed Central

    Demidov, V; Frank-Kamenetskii, M D; Egholm, M; Buchardt, O; Nielsen, P E

    1993-01-01

    A novel method for sequence specific double strand DNA cleavage using PNA (peptide nucleic acid) targeting is described. Nuclease S1 digestion of double stranded DNA gives rise to double strand cleavage at an occupied PNA strand displacement binding site, and under optimized conditions complete cleavage can be obtained. The efficiency of this cleavage is more than 10 fold enhanced when a tandem PNA site is targeted, and additionally enhanced if this site is in trans rather than in cis orientation. Thus in effect, the PNA targeting makes the single strand specific nuclease S1 behave like a pseudo restriction endonuclease. Images PMID:8502550

  3. Fast computational methods for predicting protein structure from primary amino acid sequence

    DOEpatents

    Agarwal, Pratul Kumar

    2011-07-19

    The present invention provides a method utilizing primary amino acid sequence of a protein, energy minimization, molecular dynamics and protein vibrational modes to predict three-dimensional structure of a protein. The present invention also determines possible intermediates in the protein folding pathway. The present invention has important applications to the design of novel drugs as well as protein engineering. The present invention predicts the three-dimensional structure of a protein independent of size of the protein, overcoming a significant limitation in the prior art.

  4. WinGene/WinPep: user-friendly software for the analysis of amino acid sequences.

    PubMed

    Hennig, L

    1999-06-01

    WinGene1.0/WinPep1.2 is a pair of Microsoft Windows programs designed to read nucleotide or amino acid sequence data. These versatile programs have the following capabilities: (i) searches for open reading frames and their translation, (ii) assisting the design of primers for PCR and (iii) calculation of molecular weight, isoelectric point and molar absorption coefficients of polypeptides. Furthermore, hydropathic plots and helical wheel displays are easily produced. The programs run with an intuitive Windows interface, contain a comprehensive help file and enable data exchange with other applications by means of the Copy&Paste command. The software is free for academic and noncommercial users.
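    Capability (i), searching for open reading frames and translating them, can be illustrated with a short sketch. This is not WinGene's implementation: it assumes the standard genetic code, scans only the forward strand, expects a sequence made up of A, C, G and T, and the function names are illustrative.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate a DNA string codon by codon ('*' marks a stop codon)."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


def find_orfs(dna, min_codons=2):
    """Return forward-strand ORFs as (start, end, protein), 0-based, end exclusive.

    An ORF here runs from an ATG to the first in-frame stop codon.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        i = frame
        while i + 3 <= len(dna):
            if dna[i:i + 3] == "ATG":
                for j in range(i, len(dna) - 2, 3):
                    if CODON_TABLE[dna[j:j + 3]] == "*":
                        protein = translate(dna[i:j])
                        if len(protein) >= min_codons:
                            orfs.append((i, j + 3, protein))
                        i = j  # resume scanning after this stop codon
                        break
            i += 3
    return orfs


example = "CCATGGCTAAATGACC"  # made-up sequence for illustration
print(find_orfs(example))
```

    The molecular weight and isoelectric point calculations listed under (iii) would additionally require residue mass and pKa tables, which are left out of this sketch.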

  5. Complete genome sequence of Lactococcus lactis IO-1, a lactic acid bacterium that utilizes xylose and produces high levels of L-lactic acid.

    PubMed

    Kato, Hiroaki; Shiwa, Yuh; Oshima, Kenshiro; Machii, Miki; Araya-Kojima, Tomoko; Zendo, Takeshi; Shimizu-Kadota, Mariko; Hattori, Masahira; Sonomoto, Kenji; Yoshikawa, Hirofumi

    2012-04-01

    We report the complete genome sequence of Lactococcus lactis IO-1 (= JCM7638). It is a nondairy lactic acid bacterium, produces nisin Z, ferments xylose, and produces predominantly L-lactic acid at high xylose concentrations. From ortholog analysis with five other L. lactis strains, IO-1 was identified as L. lactis subsp. lactis.

  6. Purification and amino acid sequence of aminopeptidase P from pig kidney.

    PubMed

    Vergas Romero, C; Neudorfer, I; Mann, K; Schäfer, W

    1995-04-01

    Aminopeptidase P from kidney cortex was purified in high yield (recovery greater than or equal to 20%) by a series of column chromatographic steps after solubilization of the membrane-bound glycoprotein with n-butanol. A coupled enzymic assay, using Gly-Pro-Pro-NH-Nap as substrate and dipeptidyl-peptidase IV as auxiliary enzyme, was used to monitor the purification. The purification procedure yielded two forms of aminopeptidase P differing in their carbohydrate composition (glycoforms). Both enzyme preparations were homogeneous as assessed by SDS/PAGE silver staining, and isoelectric focusing. Both forms possessed the same substrate specificity, catalysed the same reaction, and consisted of identical protein chains. The amino acid sequence determined by Edman degradation and mass spectrometry consisted of 623 amino acids. Six N-glycosylation sites, all contained in the N-terminal half of the protein, were characterized. PMID:7744038

  7. Mass spectrometric detection of the amino acid sequence polymorphism of the hepatitis C virus antigen.

    PubMed

    Kaysheva, A L; Ivanov, Yu D; Frantsuzov, P A; Krohin, N V; Pavlova, T I; Uchaikin, V F; Konev, V A; Kovalev, O B; Ziborov, V S; Archakov, A I

    2016-03-01

    A method for detection and identification of the hepatitis C virus antigen (HCVcoreAg) in human serum with consideration for possible amino acid substitutions is proposed. The method is based on a combination of biospecific capturing and concentrating of the target protein on the surface of the chip for atomic force microscope (AFM chip) with subsequent protein identification by tandem mass spectrometric (MS/MS) analysis. Biospecific AFM-capturing of viral particles containing HCVcoreAg from serum samples was performed using AFM chips with monoclonal antibodies (anti-HCVcore) covalently immobilized on the surface. Biospecific complexes were registered and counted by AFM. Further MS/MS analysis allowed reliable identification of the HCVcoreAg in the complexes formed on the AFM chip surface. Analysis of MS/MS spectra, taking into account the possible polymorphisms in the amino acid sequence of the HCVcoreAg, enabled us to increase the number of identified peptides.

  8. Comparison of three next-generation sequencing platforms for metagenomic sequencing and identification of pathogens in blood

    PubMed Central

    2014-01-01

    Background: The introduction of benchtop sequencers has made adoption of whole genome sequencing possible for a broader community of researchers than ever before. Concurrently, metagenomic sequencing (MGS) is rapidly emerging as a tool for interrogating complex samples that defy conventional analyses. In addition, next-generation sequencers are increasingly being used in clinical or related settings, for instance to track outbreaks. However, information regarding the analytical sensitivity or limit of detection (LoD) of benchtop sequencers is currently lacking. Furthermore, the specificity of sequence information at or near the LoD is unknown. Results: In the present study, we assess the ability of three next-generation sequencing platforms to identify a pathogen (viral or bacterial) present in low titers in a clinically relevant sample (blood). Our results indicate that the Roche-454 Titanium platform is capable of detecting Dengue virus at titers as low as 1 × 10^2.5 pfu/mL, corresponding to an estimated maximum of 5.4 × 10^4 genome copies/mL. The increased throughput of the benchtop sequencers, the Ion Torrent PGM and Illumina MiSeq platforms, enabled detection of viral genomes at concentrations as low as 1 × 10^4 genome copies/mL. Platform-specific biases were evident in sequence read distributions as well as viral genome coverage. For bacterial samples, only the MiSeq platform was able to provide sequencing reads that could be unambiguously classified as originating from Bacillus anthracis. Conclusion: The analytical sensitivity of all three platforms approaches that of standard qPCR assays. Although all platforms were able to detect pathogens at the levels tested, there were several noteworthy differences. The Roche-454 Titanium platform produced consistently longer reads, even when compared with the latest chemistry updates for the PGM platform.
The MiSeq platform produced consistently greater depth and breadth of coverage, while the Ion Torrent was unequaled for speed of

  9. Genetic mapping of expressed sequences in onion and in silico comparisons with rice show scant colinearity.

    PubMed

    Martin, William J; McCallum, John; Shigyo, Masayoshi; Jakse, Jernej; Kuhl, Joseph C; Yamane, Naoko; Pither-Joyce, Meeghan; Gokce, Ali Fuat; Sink, Kenneth C; Town, Christopher D; Havey, Michael J

    2005-10-01

    The Poales (which include the grasses) and Asparagales [which include onion (Allium cepa L.) and other Allium species] are the two most economically important monocot orders. Enormous genomic resources have been developed for the grasses; however, their applicability to other major monocot groups, such as the Asparagales, is unclear. Expressed sequence tags (ESTs) from onion that showed significant similarities (80% similarity over at least 70% of the sequence) to single positions in the rice genome were selected. One hundred new genetic markers developed from these ESTs were added to the intraspecific map derived from the BYG15-23xAC43 segregating family, producing 14 linkage groups encompassing 1,907 cM at LOD 4. Onion linkage groups were assigned to chromosomes using alien addition lines of Allium fistulosum L. carrying single onion chromosomes. Visual comparisons of genetic linkage in onion with physical linkage in rice revealed scant colinearity; however, short regions of colinearity could be identified. Our results demonstrate that the grasses may not be appropriate genomic models for other major monocot groups such as the Asparagales; this will make it necessary to develop genomic resources for these important plants. PMID:16025250

  10. Comparison of amino acids interaction with gold nanoparticle.

    PubMed

    Ramezani, Fatemeh; Amanlou, Massoud; Rafii-Tabar, Hashem

    2014-04-01

    The study of the nanomaterial/biomolecule interface is an important emerging field in bionanoscience, and is also relevant to many biological processes such as hard-tissue growth and cell-surface adhesion. To gain a deeper understanding of amino acid/gold nanoparticle assemblies, the adsorption of amino acids on gold nanoparticles (GNPs) has been investigated via molecular dynamics simulation. In these simulations, all the constituent atoms of the nanoparticles were considered to be dynamic. The geometries of amino acids, when adsorbed on the nanoparticle, were studied and their flexibilities were compared with one another. The interaction of each of the 20 amino acids was considered with 3 and 8 nm GNPs.

  11. Whole-genome sequence comparison as a method for improving bacterial species definition.

    PubMed

    Zhang, Wen; Du, Pengcheng; Zheng, Han; Yu, Weiwen; Wan, Li; Chen, Chen

    2014-01-01

    We compared pairs of 1,226 bacterial strains with whole genome sequences and calculated their average nucleotide identity (ANI) between genomes to determine whether whole genome comparison can be directly used for bacterial species definition. We found that genome comparisons of two bacterial strains from the same species (SGC) have a significantly higher ANI than those of two strains from different species (DGC), and that the ANI between the query and the reference genomes can be used to determine whether two genomes come from the same species. Bacterial species definition based on ANI with a cut-off value of 0.92 matched well (81.5%) with the current bacterial species definition. The ANI value was shown to be consistent with the standard for traditional bacterial species definition, and it could be used in bacterial taxonomy for species definition. A new bioinformatics program (ANItools) was also provided in this study for users to obtain the ANI value of any two bacterial genome pairs (http://genome.bioinfo-icdc.org/). This program can match a query strain to all bacterial genomes, and identify the highest ANI value of the strain at the species, genus and family levels respectively, providing valuable insights for species definition.
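    The species call described in this abstract reduces to averaging per-fragment identities and comparing the result with the 0.92 cut-off. The sketch below assumes the per-fragment identities have already been obtained from some alignment step and that the comparison is inclusive (>=); it is not the ANItools code, and the function names are illustrative.

```python
def average_nucleotide_identity(fragment_identities):
    """Mean identity (0..1) over aligned genome fragments."""
    if not fragment_identities:
        raise ValueError("at least one aligned fragment is required")
    return sum(fragment_identities) / len(fragment_identities)


def same_species(ani, cutoff=0.92):
    """Apply the ANI cut-off quoted in the abstract (inclusive by assumption)."""
    return ani >= cutoff


# Made-up identities for two hypothetical genome pairs.
pair_a = average_nucleotide_identity([0.95, 0.93, 0.97])
pair_b = average_nucleotide_identity([0.80, 0.85, 0.78])
print(pair_a, same_species(pair_a))
print(pair_b, same_species(pair_b))
```

    In practice the alignment step and the filtering of poorly aligned fragments dominate the work; only the final averaging and thresholding are shown here.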

  12. Draft Genome Sequence of Bacillus subtilis subsp. natto Strain CGMCC 2108, a High Producer of Poly-γ-Glutamic Acid

    PubMed Central

    Tan, Siyuan; Su, Anping; Zhang, Chen; Ren, Yuanyuan

    2016-01-01

    Here, we report the 4.1-Mb draft genome sequence of Bacillus subtilis subsp. natto strain CGMCC 2108, a high producer of poly-γ-glutamic acid (γ-PGA). This sequence will provide further help for the biosynthesis of γ-PGA and will greatly facilitate research efforts in metabolic engineering of B. subtilis subsp. natto strain CGMCC 2108. PMID:27231363

  13. WAViS server for handling, visualization and presentation of multiple alignments of nucleotide or amino acids sequences.

    PubMed

    Zika, Radek; Paces, Jan; Pavlícek, Adam; Paces, Václav

    2004-07-01

    Web Alignment Visualization Server contains a set of web-tools designed for quick generation of publication-quality color figures of multiple alignments of nucleotide or amino acids sequences. It can be used for identification of conserved regions and gaps within many sequences using only common web browsers. The server is accessible at http://wavis.img.cas.cz.

  14. ANTICALIgN: visualizing, editing and analyzing combined nucleotide and amino acid sequence alignments for combinatorial protein engineering.

    PubMed

    Jarasch, Alexander; Kopp, Melanie; Eggenstein, Evelyn; Richter, Antonia; Gebauer, Michaela; Skerra, Arne

    2016-07-01

    ANTICALIgN is interactive software developed to simultaneously visualize, analyze and modify alignments of DNA and/or protein sequences that arise during combinatorial protein engineering, design and selection. ANTICALIgN combines powerful functions known from currently available sequence analysis tools with unique features for protein engineering, in particular the possibility to display and manipulate nucleotide sequences and their translated amino acid sequences at the same time. ANTICALIgN offers both template-based multiple sequence alignment (MSA), using the unmutated protein as reference, and conventional global alignment, to compare sequences that share an evolutionary relationship. The application of similarity-based clustering algorithms facilitates the identification of duplicates or of conserved sequence features among a set of selected clones. Imported nucleotide sequences from DNA sequence analysis are automatically translated into the corresponding amino acid sequences and displayed, offering numerous options for selecting reading frames, highlighting of sequence features and graphical layout of the MSA. The MSA complexity can be reduced by hiding the conserved nucleotide and/or amino acid residues, thus putting emphasis on the relevant mutated positions. ANTICALIgN is also able to handle suppressed stop codons or even to incorporate non-natural amino acids into a coding sequence. We demonstrate crucial functions of ANTICALIgN in an example of Anticalins selected from a lipocalin random library against the fibronectin extradomain B (ED-B), an established marker of tumor vasculature. Apart from engineered protein scaffolds, ANTICALIgN provides a powerful tool in the area of antibody engineering and for directed enzyme evolution.

  15. Cloning, DNA sequencing and heterologous expression of the gene for thermostable N-acylamino acid racemase from Amycolatopsis sp. TS-1-60 in Escherichia coli.

    PubMed

    Tokuyama, S; Hatano, K

    1995-03-01

    The gene encoding the novel enzyme N-acylamino acid racemase (AAR) was cloned in recombinant phage lambda-4 from the DNA library of Amycolatopsis sp. TS-1-60, a rare actinomycete, using antiserum against the enzyme. The cloned gene was subcloned and transformed in Escherichia coli JM105 using pUC118 as a vector. The AAR gene consists of an open-reading frame of 1104 nucleotides, which specifies a 368-amino-acid protein with a molecular mass of 39,411 Da. The molecular mass deduced from the AAR gene is in good agreement with the subunit molecular mass (40 kDa) of AAR from Amycolatopsis sp. TS-1-60. The guanosine plus cytosine content of the AAR gene was about 70%. Although the AAR gene uses the unusual initiation codon GTG, the gene was expressed in Escherichia coli using the lac promoter of pUC118. The amount of the enzyme produced by the transformant was 16 times that produced by Amycolatopsis sp. TS-1-60. When the unusual initiation codon GTG was changed to ATG, the enzyme productivity of the transformant increased to more than 37 times that of Amycolatopsis sp. TS-1-60. In the comparison of the DNA sequence and the deduced amino acid sequence of AAR with those of known racemases and epimerases in databases, no significant sequence homology was found. However, AAR resembles mandelate racemase in that it requires metal ions for enzyme activity. (ABSTRACT TRUNCATED AT 250 WORDS)

  16. Sequencing and Transcriptional Analysis of the Biosynthesis Gene Cluster of Abscisic Acid-Producing Botrytis cinerea

    PubMed Central

    Gong, Tao; Shu, Dan; Yang, Jie; Ding, Zhong-Tao; Tan, Hong

    2014-01-01

    Botrytis cinerea is a model species with great importance as a pathogen of plants and is now used for the biotechnological production of abscisic acid (ABA). The ABA cluster of B. cinerea is composed of an open reading frame without significant similarities (bcaba3), followed by the genes (bcaba1 and bcaba2) encoding P450 monooxygenases and a gene probably coding for a short-chain dehydrogenase/reductase (bcaba4). In B. cinerea ATCC58025, targeted inactivation of the genes in the cluster suggested that at least three genes are responsible for the hydroxylation at carbon atoms C-1' and C-4' or oxidation at C-4' of ABA. Our group has identified an ABA-overproducing strain, B. cinerea TB-3-H8. To differentiate TB-3-H8 from other B. cinerea strains with the functional ABA cluster, the DNA sequence of the 12.11-kb region containing the cluster of B. cinerea TB-3-H8 was determined. Full-length cDNAs were also isolated for bcaba1, bcaba2, bcaba3 and bcaba4 from B. cinerea TB-3-H8. Sequence comparison of the four genes and their flanking regions derived from B. cinerea TB-3-H8, B05.10 and T4, respectively, revealed that major variations were located in intergenic sequences. In B. cinerea TB-3-H8, the expression profiles of the four functional genes under ABA high-yield conditions were also analyzed by real-time PCR. PMID:25268614

  17. Multiple Amino Acid Sequence Alignment Nitrogenase Component 1: Insights into Phylogenetics and Structure-Function Relationships

    PubMed Central

    Howard, James B.; Kechris, Katerina J.; Rees, Douglas C.; Glazer, Alexander N.

    2013-01-01

    Amino acid residues critical for a protein's structure-function are retained by natural selection and these residues are identified by the level of variance in co-aligned homologous protein sequences. The relevant residues in the nitrogen fixation Component 1 α- and β-subunits were identified by the alignment of 95 protein sequences. Proteins were included from species encompassing multiple microbial phyla and diverse ecological niches as well as the nitrogen fixation genotypes, anf, nif, and vnf, which encode proteins associated with cofactors differing at one metal site. After adjusting for differences in sequence length, insertions, and deletions, the remaining >85% of the sequence co-aligned the subunits from the three genotypes. Six Groups, designated Anf, Vnf, and Nif I-IV, were assigned based upon genetic origin, sequence adjustments, and conserved residues. Both subunits subdivided into the same groups. Invariant and single variant residues were identified and were defined as “core” for nitrogenase function. Three species in Group Nif-III, Candidatus Desulforudis audaxviator, Desulfotomaculum kuznetsovii, and Thermodesulfatator indicus, were found to have a seleno-cysteine that replaces one cysteinyl ligand of the 8Fe:7S, P-cluster. Subsets of invariant residues, limited to individual groups, were identified; these unique residues help identify the gene of origin (anf, nif, or vnf) yet should not be considered diagnostic of the metal content of associated cofactors. Fourteen of the 19 residues that compose the cofactor pocket are invariant or single variant; the other five residues are highly variable but do not correlate with the putative metal content of the cofactor. The variable residues are clustered on one side of the cofactor, away from other functional centers in the three dimensional structure. Many of the invariant and single variant residues were not previously recognized as potentially critical and their identification provides the bases

  18. Correlations Between Amino Acids at Different Sites in Local Sequences of Protein Fragments with Given Structural Patterns

    NASA Astrophysics Data System (ADS)

    Lu, Wen; Liu, Hai-yan

    2007-02-01

    Ample evidence suggests that the local structures of peptide fragments in native proteins are to some extent encoded by their local sequences. Detecting such local correlations is important, but it is still an open question what the most appropriate method would be. This is partly because conventional sequence analyses treat amino acid preferences at each site of a protein sequence independently, while it is often the inter-site interactions that bring about local sequence-structure correlations. Here a new scheme is introduced to capture the correlation between amino acid preferences at different sites for different local structure types. A library of nine-residue fragments is constructed, and the fragments are divided into clusters based on their local structures. For each local structure cluster or type, chi-square tests are used to identify correlated preferences of amino acid combinations at pairs of sites. A score function is constructed including both the single-site amino acid preferences and the dual-site amino acid combination preferences, which can be used to identify whether a sequence fragment would have a strong tendency to form a particular local structure in native proteins. The results show that, given a local structure pattern, dual-site amino acid combinations contain different information from single-site amino acid preferences. Representative examples show that many of the statistically identified correlations agree with previously proposed heuristic rules about local sequence-structure correlations, or are consistent with physical-chemical interactions required to stabilize particular local structures. Results also show that such dual-site correlations in the score function significantly improve the Z-score matching a sequence fragment to its native local structure relative to non-native local structures, and certain local structure types are highly predictable from the local sequence alone if inter-site correlations are considered.
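
The chi-square step described in this abstract, testing whether residue choices at two sites are correlated within one structural cluster, can be illustrated with a plain Pearson chi-square on a contingency table of residue pairs. This is a generic sketch under stated assumptions, not the authors' procedure: the function name `chi_square_site_pair`, the two-residue toy fragments, and the decision to build the table only from residues actually observed are all choices made for the example.

```python
from collections import Counter


def chi_square_site_pair(fragments, site_i, site_j):
    """Pearson chi-square statistic and degrees of freedom for residue pairs at two sites."""
    pairs = Counter((f[site_i], f[site_j]) for f in fragments)
    row_totals = Counter()
    col_totals = Counter()
    for (a, b), n in pairs.items():
        row_totals[a] += n
        col_totals[b] += n
    total = sum(pairs.values())

    chi2 = 0.0
    for a, row_n in row_totals.items():
        for b, col_n in col_totals.items():
            # Expected count if the two sites chose residues independently.
            expected = row_n * col_n / total
            observed = pairs.get((a, b), 0)
            chi2 += (observed - expected) ** 2 / expected
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, dof


# Toy cluster: residue at site 0 fully determines the residue at site 1.
correlated = ["AK", "AK", "GE", "GE"]
# Toy cluster: every combination occurs equally often.
independent = ["AK", "AE", "GK", "GE"]
print(chi_square_site_pair(correlated, 0, 1))
print(chi_square_site_pair(independent, 0, 1))
```

With real fragment libraries many cells have small expected counts, so in practice the statistic would need a minimum-count filter or an exact test before being converted to a p-value.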

  19. Draft Genome Sequences of Gluconobacter cerinus CECT 9110 and Gluconobacter japonicus CECT 8443, Acetic Acid Bacteria Isolated from Grape Must

    PubMed Central

    Sainz, Florencia

    2016-01-01

    We report here the draft genome sequences of Gluconobacter cerinus strain CECT 9110 and Gluconobacter japonicus CECT 8443, acetic acid bacteria isolated from grape must. Gluconobacter species are well known for their ability to oxidize sugar alcohols into the corresponding acids. Our objective was to select strains that effectively oxidize d-glucose. PMID:27365351

  20. Molecular cloning, encoding sequence, and expression of vaccinia virus nucleic acid-dependent nucleoside triphosphatase gene.

    PubMed Central

    Rodriguez, J F; Kahn, J S; Esteban, M

    1986-01-01

    A rabbit poxvirus genomic library contained within the expression vector lambda gt11 was screened with polyclonal antiserum prepared against vaccinia virus nucleic acid-dependent nucleoside triphosphatase (NTPase)-I enzyme. Five positive phage clones containing from 0.72- to 2.5-kilobase-pair (kbp) inserts expressed a beta-galactosidase fusion protein that was reactive by immunoblotting with the NTPase-I antibody. Hybridization analysis allowed the location of this gene within the vaccinia HindIIID restriction fragment. From the known nucleotide sequence of the 16-kbp vaccinia HindIIID fragment, we identified a region that contains a 1896-base open reading frame coding for a 631-amino acid protein. Analysis of the complete sequence revealed a highly basic protein, with hydrophilic COOH and NH2 termini, various hydrophobic domains, and no significant homology to other known proteins. Translational studies demonstrate that NTPase-I belongs to a late class of viral genes. This protein is highly conserved among Orthopoxviruses. PMID:3025846

  1. Partial amino acid sequences around sulfhydryl groups of soybean beta-amylase.

    PubMed

    Nomura, K; Mikami, B; Morita, Y

    1987-08-01

    Sulfhydryl (SH) groups of soybean beta-amylase were modified with 5-(iodoacetamidoethyl)aminonaphthalene-1-sulfonate (IAEDANS) and the SH-containing peptides exhibiting fluorescence were purified after chymotryptic digestion of the modified enzyme. The sequence analysis of the peptides derived from the modification of all SH groups in the denatured enzyme revealed the existence of six SH groups, in contrast to the five reported previously. One of them was found to have extremely low reactivity toward SH-reagents without reduction. In the native state, IAEDANS reacted with 2 mol of SH groups per mol of the enzyme (SH1 and SH2), accompanied by inactivation of the enzyme owing to the modification of SH2, which is located near the active site of this enzyme. The selective modification of SH2 with IAEDANS was attained after the blocking of SH1 with 5,5'-dithiobis-(2-nitrobenzoic acid). The amino acid sequences of the peptides containing SH1 and SH2 were determined to be Cys-Ala-Asn-Pro-Gln and His-Gln-Cys-Gly-Gly-Asn-Val-Gly-Asp-Ile-Val-Asn-Ile-Pro-Ile-Pro-Gln-Trp, respectively.

  2. From amino acid sequence to bioactivity: The biomedical potential of antitumor peptides.

    PubMed

    Blanco-Míguez, Aitor; Gutiérrez-Jácome, Alberto; Pérez-Pérez, Martín; Pérez-Rodríguez, Gael; Catalán-García, Sandra; Fdez-Riverola, Florentino; Lourenço, Anália; Sánchez, Borja

    2016-06-01

    Chemoprevention is the use of natural and/or synthetic substances to block, reverse, or retard the process of carcinogenesis. In this field, the use of antitumor peptides is of interest because (i) these molecules are small in size, (ii) they show good cell diffusion and permeability, (iii) they affect one or more specific molecular pathways involved in carcinogenesis, and (iv) they are not usually genotoxic. We searched the Web of Science database (23/11/2015) to collect papers reporting on bioactive peptides (1691 records), which were further filtered using search terms such as "antiproliferative," "antitumoral," or "apoptosis," among others. Works reporting the amino acid sequence of an antiproliferative peptide were kept (60 records), and this set was complemented with the peptides included in CancerPPD, an extensive resource for antiproliferative peptides and proteins. Peptides were grouped according to one of the following mechanisms of action: inhibition of cell migration, inhibition of tumor angiogenesis, antioxidative mechanisms, inhibition of gene transcription/cell proliferation, induction of apoptosis, disorganization of tubulin structure, cytotoxicity, or unknown mechanisms. The main mechanisms of action of those antiproliferative peptides with known amino acid sequences are presented and, finally, their potential clinical usefulness and future challenges in their application are discussed.

  3. Complete amino acid sequence of a Lolium perenne (perennial rye grass) pollen allergen, Lol p II.

    PubMed

    Ansari, A A; Shenbagamurthi, P; Marsh, D G

    1989-07-01

    The complete amino acid sequence of a Lolium perenne (rye grass) pollen allergen, Lol p II was determined by automated Edman degradation of the protein and selected fragments. Cleavage of the protein by enzymatic and chemical techniques established an unambiguous sequence for the protein. Lol p II contains 97 amino acid residues, with a calculated molecular weight of 10,882. The protein lacks cysteine and glutamine and shows no evidence of glycosylation. Theoretical predictions by Fraga's (Fraga, S. (1982) Can. J. Chem. 60, 2606-2610) and Hopp and Woods' (Hopp, T. P., and Woods, K. R. (1981) Proc. Natl. Acad. Sci. U.S.A. 78, 3824-3828) methods indicate the presence of four hydrophilic regions, which may contribute to sequential or parts of conformational B-cell epitopes. Analysis of amphipathic regions by Berzofsky's method indicates the presence of a highly amphipathic region, which may contain, or contribute to, an Ia/T-cell epitope. This latter segment of Lol p II was found to be highly homologous with an antibody-binding segment of the major rye allergen Lol p I and may explain why immune responsiveness to both the allergens is associated with HLA-DR3.

  4. Molecular cloning, encoding sequence, and expression of vaccinia virus nucleic acid-dependent nucleoside triphosphatase gene.

    PubMed

    Rodriguez, J F; Kahn, J S; Esteban, M

    1986-12-01

    A rabbit poxvirus genomic library contained within the expression vector lambda gt11 was screened with polyclonal antiserum prepared against vaccinia virus nucleic acid-dependent nucleoside triphosphatase (NTPase)-I enzyme. Five positive phage clones containing from 0.72- to 2.5-kilobase-pair (kbp) inserts expressed a beta-galactosidase fusion protein that was reactive by immunoblotting with the NTPase-I antibody. Hybridization analysis allowed the location of this gene within the vaccinia HindIIID restriction fragment. From the known nucleotide sequence of the 16-kbp vaccinia HindIIID fragment, we identified a region that contains a 1896-base open reading frame coding for a 631-amino acid protein. Analysis of the complete sequence revealed a highly basic protein, with hydrophilic COOH and NH2 termini, various hydrophobic domains, and no significant homology to other known proteins. Translational studies demonstrate that NTPase-I belongs to a late class of viral genes. This protein is highly conserved among Orthopoxviruses.

  5. Isolation and amino acid sequences of squirrel monkey (Saimiri sciurea) insulin and glucagon.

    PubMed Central

    Yu, J H; Eng, J; Yalow, R S

    1990-01-01

    It was reported two decades ago that insulin was not detectable in the glucose-stimulated state in Saimiri sciurea, the New World squirrel monkey, by a radioimmunoassay system developed with guinea pig anti-pork insulin antibody and labeled pork insulin. With the same system, reasonable levels were observed in rhesus monkeys and chimpanzees. This suggested that New World monkeys, like the New World hystricomorph rodents such as the guinea pig and the coypu, might have insulins whose sequences differ markedly from those of Old World mammals. In this report we describe the purification and amino acid sequences of squirrel monkey insulin and glucagon. We demonstrate that the substitutions at B29, B27, A2, A4, and A17 of squirrel monkey insulin are identical with those previously found in another New World primate, the owl monkey (Aotus trivirgatus). The immunologic cross-reactivity of this insulin in our immunoassay system is only a few percent of that of human insulin. Squirrel monkey glucagon is identical with the usual glucagon found in Old World mammals, which predicts that the glucagons of other New World monkeys would not differ from the usual Old World mammalian glucagon. It appears that the peptides of the New World monkeys have diverged less from those of the Old World mammals than have those of the New World hystricomorph rodents. The striking improvements in peptide purification and sequencing have the potential for adding new information concerning the evolutionary divergence of species. PMID:2263627

  6. Complete amino acid sequence of the myoglobin from the Pacific spotted dolphin, Stenella attenuata graffmani.

    PubMed

    Jones, B N; Wang, C C; Dwulet, F E; Lehman, L D; Meuth, J L; Bogardt, R A; Gurd, F R

    1979-04-25

    The complete amino acid sequence of the major component myoglobin from the Pacific spotted dolphin, Stenella attenuata graffmani, was determined by the automated Edman degradation of several large peptides obtained by specific cleavage of the protein. The acetimidated apomyoglobin was selectively cleaved at its two methionyl residues with cyanogen bromide and at its three arginyl residues by trypsin. By subjecting four of these peptides and the apomyoglobin to automated Edman degradation, over 80% of the primary structure of the protein was obtained. The remainder of the covalent structure was determined by the sequence analysis of peptides that resulted from further digestion of the central cyanogen bromide fragment. This fragment was cleaved at its glutamyl residues with staphylococcal protease and its lysyl residues with trypsin. The action of trypsin was restricted to the lysyl residues by chemical modification of the single arginyl residue of the fragment with 1,2-cyclohexanedione. The primary structure of this myoglobin proved to be identical with that from the Atlantic bottlenosed dolphin and Pacific common dolphin but differs from the myoglobins of the killer whale and pilot whale at two positions. The above sequence identities and differences reflect the close taxonomic relationship of these five species of Cetacea. PMID:454657

  7. Isolation and amino acid sequences of squirrel monkey (Saimiri sciurea) insulin and glucagon

    SciTech Connect

    Yu, Jinghua; Eng, J.; Yalow, R.S. (City Univ. of New York, NY)

    1990-12-01

    It was reported two decades ago that insulin was not detectable in the glucose-stimulated state in Saimiri sciurea, the New World squirrel monkey, by a radioimmunoassay system developed with guinea pig anti-pork insulin antibody and labeled pork insulin. With the same system, reasonable levels were observed in rhesus monkeys and chimpanzees. This suggested that New World monkeys, like the New World hystricomorph rodents such as the guinea pig and the coypu, might have insulins whose sequences differ markedly from those of Old World mammals. In this report the authors describe the purification and amino acid sequences of squirrel monkey insulin and glucagon. They demonstrate that the substitutions at B29, B27, A2, A4, and A17 of squirrel monkey insulin are identical with those previously found in another New World primate, the owl monkey (Aotus trivirgatus). The immunologic cross-reactivity of this insulin in their immunoassay system is only a few percent of that of human insulin. It appears that the peptides of the New World monkeys have diverged less from those of the Old World mammals than have those of the New World hystricomorph rodents. The striking improvements in peptide purification and sequencing have the potential for adding new information concerning the evolutionary divergence of species.

  8. Purification, amino acid sequence and characterisation of kangaroo IGF-I.

    PubMed

    Yandell, C A; Francis, G L; Wheldrake, J F; Upton, Z

    1998-01-01

    Insulin-like growth factor-I (IGF-I) and IGF-II have been purified to homogeneity from kangaroo (Macropus fuliginosus) serum, thus this represents the first report of the purification, sequencing and characterisation of marsupial IGFs. N-Terminal protein sequencing reveals that there are six amino acid differences between kangaroo and human IGF-I. Kangaroo IGF-II has been partially sequenced and no differences were found between human and kangaroo IGF-II in the 53 residues identified. Thus the IGFs appear to be remarkably structurally conserved during mammalian radiation. In addition, in vitro characterisation of kangaroo IGF-I demonstrated that the functional properties of human, kangaroo and chicken IGF-I are very similar. In an assay measuring the ability of the proteins to stimulate protein synthesis in rat L6 myoblasts, all IGF-I proteins were found to be equally potent. The ability of all three proteins to compete for binding with radiolabelled human IGF-I to type-1 IGF receptors in L6 myoblasts and in Sminthopsis crassicaudata transformed lung fibroblasts, a marsupial cell line, was comparable. Furthermore, kangaroo and human IGF-I react equally in a human IGF-I RIA using a human reference standard, radiolabelled human IGF-I and a polyclonal antibody raised against recombinant human IGF-I. This study indicates that not only is the primary structure of eutherian and metatherian IGF-I conserved, but also the proteins appear to be functionally similar.

  9. The evolution of proteins from random amino acid sequences: II. Evidence from the statistical distributions of the lengths of modern protein sequences.

    PubMed

    White, S H

    1994-04-01

    This paper continues an examination of the hypothesis that modern proteins evolved from random heteropeptide sequences. In support of the hypothesis, White and Jacobs (1993, J Mol Evol 36:79-95) have shown that any sequence chosen randomly from a large collection of nonhomologous proteins has a 90% or better chance of having a lengthwise distribution of amino acids that is indistinguishable from the random expectation regardless of amino acid type. The goal of the present study was to investigate the possibility that the random-origin hypothesis could explain the lengths of modern protein sequences without invoking specific mechanisms such as gene duplication or exon splicing. The sets of sequences examined were taken from the 1989 PIR database and consisted of 1,792 "super-family" proteins selected to have little sequence identity, 623 E. coli sequences, and 398 human sequences. The length distributions of the proteins could be described with high significance by either of two closely related probability density functions: The gamma distribution with parameter 2 or the distribution for the sum of two exponential random independent variables. A simple theory for the distributions was developed which assumes that (1) protoprotein sequences had exponentially distributed random independent lengths, (2) the length dependence of protein stability determined which of these protoproteins could fold into compact primitive proteins and thereby attain the potential for biochemical activity, (3) the useful protein sequences were preserved by the primitive genome, and (4) the resulting distribution of sequence lengths is reflected by modern proteins. The theory successfully predicts the two observed distributions which can be distinguished by the functional form of the dependence of protein stability on length. The theory leads to three interesting conclusions. First, it predicts that a tetra-nucleotide was the signal for primitive translation termination. 
This prediction is
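
The two "closely related" densities named in this abstract can be written down explicitly. The sketch below reads "gamma distribution with parameter 2" as a gamma density with shape 2, which is the equal-rate special case of the sum of two independent exponential variables; that reading, the function names, and the rate values are assumptions for illustration, since the abstract gives no fitted parameters.

```python
import math


def gamma2_pdf(x: float, rate: float) -> float:
    """Gamma density with shape 2: the sum of two exponentials sharing one rate."""
    return rate ** 2 * x * math.exp(-rate * x)


def two_exponential_sum_pdf(x: float, rate_a: float, rate_b: float) -> float:
    """Density of the sum of two independent exponentials with rates rate_a and rate_b."""
    if rate_a == rate_b:
        return gamma2_pdf(x, rate_a)
    scale = rate_a * rate_b / (rate_b - rate_a)
    return scale * (math.exp(-rate_a * x) - math.exp(-rate_b * x))


# As the two rates approach each other, the general form approaches the shape-2 gamma.
x = 2.0
print(gamma2_pdf(x, 1.0))
print(two_exponential_sum_pdf(x, 1.0, 1.0 + 1e-6))
print(two_exponential_sum_pdf(x, 1.0, 3.0))
```

Both densities start at zero, rise to a single peak, and decay exponentially, which is why length data alone may fit either form comparably well.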

  10. A Comparison between Transcriptome Sequencing and 16S Metagenomics for Detection of Bacterial Pathogens in Wildlife

    PubMed Central

    Razzauti, Maria; Galan, Maxime; Bernard, Maria; Maman, Sarah; Klopp, Christophe; Charbonnel, Nathalie; Vayssier-Taussat, Muriel; Eloit, Marc; Cosson, Jean-François

    2015-01-01

    Background: Rodents are major reservoirs of pathogens responsible for numerous zoonotic diseases in humans and livestock. Assessing their microbial diversity at both the individual and population level is crucial for monitoring endemic infections and revealing microbial association patterns within reservoirs. Recently, NGS approaches have been employed to characterize microbial communities of different ecosystems. Yet, their relative efficacy has not been assessed. Here, we compared two NGS approaches, RNA-Sequencing (RNA-Seq) and 16S-metagenomics, assessing their ability to survey neglected zoonotic bacteria in rodent populations. Methodology/Principal Findings: We first extracted nucleic acids from the spleens of 190 voles collected in France. RNA extracts were pooled, randomly retro-transcribed, then RNA-Seq was performed using HiSeq. Assembled bacterial sequences were assigned to the closest taxon registered in GenBank. DNA extracts were analyzed via a 16S-metagenomics approach using two sequencers: the 454 GS-FLX and the MiSeq. The V4 region of the gene coding for 16S rRNA was amplified for each sample using barcoded universal primers. Amplicons were multiplexed and processed on the distinct sequencers. The resulting datasets were de-multiplexed, and each read was processed through a pipeline to be taxonomically classified using the Ribosomal Database Project. Altogether, 45 pathogenic bacterial genera were detected. The bacteria identified by RNA-Seq were comparable to those detected by 16S-metagenomics approach processed with MiSeq (16S-MiSeq). In contrast, 21 of these pathogens went unnoticed when the 16S-metagenomics approach was processed via 454-pyrosequencing (16S-454). In addition, the 16S-metagenomics approaches revealed a high level of coinfection in bank voles. Conclusions/Significance: We concluded that RNA-Seq and 16S-MiSeq are equally sensitive in detecting bacteria. Although only the 16S-MiSeq method enabled identification of bacteria in each

  11. Complexation of NpO2+ with N-methyl-iminodiacetic Acid: in Comparison with Iminodiacetic and Dipicolinic Acids

    SciTech Connect

    Tian, Guoxin; Rao, Linfeng

    2010-10-01

    Complexation of Np(V) with N-methyl-iminodiacetic acid (MIDA) in 1 M NaClO₄ solution was studied with multiple techniques including potentiometry, spectrophotometry, and microcalorimetry. The 1:2 complex, NpO₂(MIDA)₂³⁻, was identified for the first time in aqueous solution. The correlation between its optical absorption properties and symmetry was discussed, in comparison with Np(V) complexes with two structurally related nitrilo-dicarboxylic acids, iminodiacetic acid (IDA) and dipicolinic acid (DPA). The order of the binding strength (DPA > MIDA > IDA) is explained by the difference in the structural and electronic properties of the ligands. In general, the nitrilo-dicarboxylates form stronger complexes with Np(V) than oxy-dicarboxylates due to a much more favorable enthalpy of complexation.

  12. Sequence Design for a Test Tube of Interacting Nucleic Acid Strands.

    PubMed

    Wolfe, Brian R; Pierce, Niles A

    2015-10-16

    We describe an algorithm for designing the equilibrium base-pairing properties of a test tube of interacting nucleic acid strands. A target test tube is specified as a set of desired "on-target" complexes, each with a target secondary structure and target concentration, and a set of undesired "off-target" complexes, each with vanishing target concentration. Sequence design is performed by optimizing the test tube ensemble defect, corresponding to the concentration of incorrectly paired nucleotides at equilibrium evaluated over the ensemble of the test tube. To reduce the computational cost of accepting or rejecting mutations to a random initial sequence, the structural ensemble of each on-target complex is hierarchically decomposed into a tree of conditional subensembles, yielding a forest of decomposition trees. Candidate sequences are evaluated efficiently at the leaf level of the decomposition forest by estimating the test tube ensemble defect from conditional physical properties calculated over the leaf subensembles. As optimized subsequences are merged toward the root level of the forest, any emergent defects are eliminated via ensemble redecomposition and sequence reoptimization. After successfully merging subsequences to the root level, the exact test tube ensemble defect is calculated for the first time, explicitly checking for the effect of the previously neglected off-target complexes. Any off-target complexes that form at appreciable concentration are hierarchically decomposed, added to the decomposition forest, and actively destabilized during subsequent forest reoptimization. For target test tubes representative of design challenges in the molecular programming and synthetic biology communities, our test tube design algorithm typically succeeds in achieving a normalized test tube ensemble defect ≤1% at a design cost within an order of magnitude of the cost of test tube analysis.

  13. Sequence-Specific Electrical Purification of Nucleic Acids with Nanoporous Gold Electrodes.

    PubMed

    Daggumati, Pallavi; Appelt, Sandra; Matharu, Zimple; Marco, Maria L; Seker, Erkin

    2016-06-22

    Nucleic-acid-based biosensors have enabled rapid and sensitive detection of pathogenic targets; however, these devices often require purified nucleic acids for analysis since the constituents of complex biological fluids adversely affect sensor performance. This purification step is typically performed outside the device, thereby increasing sample-to-answer time and introducing contaminants. We report a novel approach using a multifunctional matrix, nanoporous gold (np-Au), which enables both detection of specific target sequences in a complex biological sample and their subsequent purification. The np-Au electrodes modified with 26-mer DNA probes (via thiol-gold chemistry) enabled sensitive detection and capture of complementary DNA targets in the presence of complex media (fetal bovine serum) and other interfering DNA fragments in the range of 50-1500 base pairs. Upon capture, the noncomplementary DNA fragments and serum constituents of varying sizes were washed away. Finally, the surface-bound DNA-DNA hybrids were released by electrochemically cleaving the thiol-gold linkage, and the hybrids were iontophoretically eluted from the nanoporous matrix. The optical and electrophoretic characterization of the analytes before and after the detection-purification process revealed that low target DNA concentrations (80 pg/μL) can be successfully detected in complex biological fluids and subsequently released to yield pure hybrids free of polydisperse digested DNA fragments and serum biomolecules. Taken together, this multifunctional platform is expected to enable seamless integration of detection and purification of nucleic acid biomarkers of pathogens and diseases in miniaturized diagnostic devices.

  14. Amino acid sequence analysis and characterization of a ribonuclease from starfish Asterias amurensis.

    PubMed

    Motoyoshi, Naomi; Kobayashi, Hiroko; Itagaki, Tadashi; Inokuchi, Norio

    2016-09-01

    The aim of this study was to phylogenetically characterize the location of the RNase T2 enzyme in the starfish (Asterias amurensis). We isolated an RNase T2 ribonuclease (RNase Aa) from the ovaries of starfish and determined its amino acid sequence by protein chemistry and cloning cDNA encoding RNase Aa. The isolated protein had 231 amino acid residues, a predicted molecular mass of 25,906 Da, and an optimal pH of 5.0. RNase Aa preferentially released guanylic acid from the RNA. The catalytic sites of the RNase T2 family are conserved in RNase Aa; furthermore, the distribution of the cysteine residues in RNase Aa is similar to that in other animal and plant T2 RNases. RNase Aa is cleaved at two points: 21 residues from the N-terminus and 29 residues from the C-terminus; however, both fragments may remain attached to the protein via disulfide bridges, leading to the maintenance of its conformation, as suggested by circular dichroism spectrum analysis. The phylogenetic analysis revealed that starfish RNase Aa is evolutionarily an intermediate between protozoan and oyster RNases. PMID:26920046

  15. Genomic DNA sequence comparison between two inbred soybean cyst nematode biotypes facilitated by massively parallel 454 microbead sequencing

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Heterodera glycines, the soybean cyst nematode (SCN), is the most important pathogen of soybean in the Midwestern United States. Genomic DNA sequence information for this nematode is limited and thus progress in devising genomic approaches to control this pathogen has been slow. To remedy this pro...

  16. Comparison of the Legionella pneumophila population structure as determined by sequence-based typing and whole genome sequencing

    PubMed Central

    2013-01-01

    Background: Legionella pneumophila is an opportunistic pathogen of humans where the source of infection is usually from contaminated man-made water systems. When an outbreak of Legionnaires’ disease caused by L. pneumophila occurs, it is necessary to discover the source of infection. A seven allele sequence-based typing scheme (SBT) has been very successful in providing the means to attribute outbreaks of L. pneumophila to a particular source or sources. Particular sequence types described by this scheme are known to exhibit specific phenotypes. For instance, some types are seen often in clinical cases but are rarely isolated from the environment and vice versa. Of those causing human disease, some types are thought to be more likely to cause more severe disease. It is possible that the genetic basis for these differences is vertically inherited and associated with particular genetic lineages within the population. In order to provide a framework within which to test this hypothesis and others relating to the population biology of L. pneumophila, a set of genomes covering the known diversity of the organism is required. Results: Firstly, this study describes a means to group L. pneumophila strains into pragmatic clusters, using a methodology that takes into consideration the genetic forces operating on the population. These clusters can be used as a standardised nomenclature, so those wishing to describe a group of strains can do so. Secondly, the clusters generated from the first part of the study were used to select strains rationally for whole genome sequencing (WGS). The data generated was used to compare phylogenies derived from SBT and WGS. In general the SBT sequence type (ST) accurately reflects the whole genome-based genotype. Where there are exceptions and recombination has resulted in the ST no longer reflecting the genetic lineage described by the whole genome sequence, the clustering technique employed detects these sequence types as being admixed

  17. Microbial Analysis of Bite Marks by Sequence Comparison of Streptococcal DNA

    PubMed Central

    Kennedy, Darnell M.; Stanton, Jo-Ann L.; García, José A.; Mason, Chris; Rand, Christy J.; Kieser, Jules A.; Tompkins, Geoffrey R.

    2012-01-01

    Bite mark injuries often feature in violent crimes. Conventional morphometric methods for the forensic analysis of bite marks involve elements of subjective interpretation that threaten the credibility of this field. Human DNA recovered from bite marks has the highest evidentiary value; however, recovery can be compromised by salivary components. This study assessed the feasibility of matching bacterial DNA sequences amplified from experimental bite marks to those obtained from the teeth responsible, with the aim of evaluating the capability of three genomic regions of streptococcal DNA to discriminate between participant samples. Bite mark and teeth swabs were collected from 16 participants. Bacterial DNA was extracted to provide the template for PCR primers specific for streptococcal 16S ribosomal RNA (16S rRNA) gene, 16S–23S intergenic spacer (ITS) and RNA polymerase beta subunit (rpoB). High throughput sequencing (GS FLX 454), followed by stringent quality filtering, generated reads from bite marks for comparison to those generated from teeth samples. For all three regions, the greatest overlaps of identical reads were between bite mark samples and the corresponding teeth samples. The average proportions of reads identical between bite mark and corresponding teeth samples were 0.31, 0.41 and 0.31, and for non-corresponding samples were 0.11, 0.20 and 0.016, for 16S rRNA, ITS and rpoB, respectively. The probabilities of correctly distinguishing matching and non-matching teeth samples were 0.92 for ITS, 0.99 for 16S rRNA and 1.0 for rpoB. These findings strongly support the tenet that bacterial DNA amplified from bite marks and teeth can provide corroborating information in the identification of assailants. PMID:23284761
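
    The "proportion of identical reads" comparison in the record above can be illustrated with a small sketch. The abstract does not define the denominator, so this version assumes the proportion is taken over the bite-mark reads; the function name and the toy reads are hypothetical and are not data from the study.

```python
def proportion_identical(bite_reads, teeth_reads):
    """Fraction of bite-mark reads whose exact sequence also occurs among the teeth reads.

    Assumption of this sketch: the denominator is the number of bite-mark reads.
    """
    if not bite_reads:
        return 0.0
    teeth_set = set(teeth_reads)  # exact-match lookup
    shared = sum(1 for read in bite_reads if read in teeth_set)
    return shared / len(bite_reads)


# Toy reads, invented for illustration only.
bite = ["ACGT", "ACGT", "TTGA", "GGCC"]
teeth_same_person = ["ACGT", "TTGA", "CCAA"]
teeth_other_person = ["GGCC", "AAAA"]

match_score = proportion_identical(bite, teeth_same_person)
nonmatch_score = proportion_identical(bite, teeth_other_person)
print(match_score, nonmatch_score)
```

    A higher score for the corresponding teeth sample than for any non-corresponding sample is the pattern the abstract reports for all three regions.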

  19. Fatty acid mobilization and comparison to milk fatty acid content in northern elephant seals.

    PubMed

    Fowler, Melinda A; Debier, Cathy; Mignolet, Eric; Linard, Clementine; Crocker, Daniel E; Costa, Daniel P

    2014-01-01

    A fundamental feature of the life history of true seals, bears and baleen whales is lactation while fasting. This study examined the mobilization of fatty acids from blubber and their subsequent partitioning into maternal metabolism and milk production in northern elephant seals (Mirounga angustirostris). The fatty acid composition of blubber and milk was measured in both early and late lactation. Proportions of fatty acids in milk and blubber were found to display a high degree of similarity both early and late in lactation. Seals mobilized an enormous amount of lipid (~66 kg in 17 days), but thermoregulatory fatty acids, those that remain fluid at low temperatures, were relatively conserved in the outer blubber layer. Despite the stratification, the pattern of mobilization of specific fatty acids conforms to biochemical predictions. Long chain (>20C) monounsaturated fatty acids (MUFAs) were the least mobilized from blubber and the only class of fatty acids that showed a proportional increase in milk in late lactation. Polyunsaturated fatty acids (PUFAs) and saturated fatty acids (SFAs) were more mobilized from the blubber, but neither proportion increased in milk at late lactation. These data suggest that of the long chain MUFA mobilized, the majority is directed to milk synthesis. The mother may preferentially use PUFA and SFA for her own metabolism, decreasing the availability for deposition into milk. The potential impacts of milk fatty acid delivery on pup diving development and thermoregulation are exciting avenues for exploration. PMID:24126964

  20. The complete nucleotide sequence of bluetongue virus serotype 1 RNA3 and a comparison with other geographic serotypes from Australia, South Africa and the United States of America, and with other orbivirus isolates.

    PubMed

    Gould, A R

    1987-04-01

    The sequence of the RNA segment 3 of bluetongue virus (BTV) serotype 1 from Australia is presented along with its deduced amino acid sequence. DNA copies of this genome segment were inserted either into the E. coli plasmid pBR322 by homopolymeric tailing or by direct insertion of double-stranded DNA fragments generated by restriction endonuclease cleavage into the appropriate M13 bacteriophage vectors (Vieira, J. and Messing, J., 1982, Gene 19, 259-268). Direct comparisons were made to the nucleotide sequence data of Purdy, M. et al., 1984 (J. Virol. 51, 754-759) and Ghiasi, H. et al., 1985 (Virus Res. 3, 181-190) for the United States of America (US) isolates of BTV, serotypes 10 and 17, respectively. A method for the rapid cloning, sequencing and alignment of orbivirus RNA 3 segments was utilised to compare other geographical isolates of BTV, as well as those of other orbivirus serotypes, in particular, epizootic haemorrhagic disease of deer virus (EHDV) and Warrego. The comparison of this sequence data reveals that BTV isolates can be separated into distinct geographical types which in turn are distinct from the other orbivirus isolates studied. The sequence conservation at the amino acid level for the gene product of RNA3 (VP3) does not enable distinctions to be made amongst the BTV isolates at a geographical level, but does afford easy distinction into the different orbivirus groups. A possible evolutionary schematic is presented for the orbiviruses studied.

  1. A structural and functional comparison of nematode and crustacean PDH-like sequences.

    PubMed

    Meelkop, E; Marco, H G; Janssen, T; Temmerman, L; Vanhove, M P M; Schoofs, L

    2012-03-01

    The elucidation of the whole genome of the nematode Caenorhabditis elegans allowed for the identification of ortholog genes belonging to the pigment dispersing hormone/factor (PDH/PDF) peptide family. Members of this peptide family are known from crustaceans, insects and nematodes and seem to exist exclusively in ecdysozoans where they play a role in different processes, ranging from the dispersion of integumental and eye (retinal) pigments in decapod crustaceans to circadian rhythms in insects and locomotion in C. elegans. Two pdf genes (pdf-1 and pdf-2) encoding three different peptides: PDF-1a, PDF-1b and PDF-2 have been identified in C. elegans. These three C. elegans PDH-like peptides are similar but not identical in primary structure to PDHs from decapod crustaceans. We investigate whether this divergence has an influence on the pigment dispersing function of the peptides in a decapod crustacean, namely the shrimp Palaemon pacificus. We show that C. elegans PDF-1a and b peptides display cross-functional activity by dispersing pigments in the epithelium of P. pacificus at physiological doses. Moreover, by means of a comparative amino acid sequence analysis of nematode and crustacean PDH-like peptides, we can pinpoint several potentially important residues for eliciting pigment dispersing activity in decapod crustaceans. Although there is no sequence information on a receptor for PDH in decapod crustaceans, we postulate that there is general conservation of the PDH/PDF signaling system based on structural similarities of precursor proteins and receptors (including those from a branchiopod crustacean and from C. elegans).

  2. Amino acid sequences of neuropeptides in the sinus gland of the land crab Cardisoma carnifex: a novel neuropeptide proteolysis site.

    PubMed

    Newcomb, R W

    1987-08-01

    The sinus gland is a major neurosecretory structure in Crustacea. Five peptides, labeled C, D, E, F, and I, isolated from the sinus gland of the land crab have been hypothesized to arise from the incomplete proteolysis at two internal sites on a single biosynthetic intermediate peptide "H", based on amino acid composition additivities and pulse-chase radiolabeling studies. The presence of only a single major precursor for the sinus gland peptides implies that peptide H may be synthesized on a common precursor with crustacean hyperglycemic hormone forms, "J" and "L," and a peptide, "K," similar to peptides with molt inhibiting activity. Here I report amino acid sequences of these peptides. The amino terminal sequence of the parent peptide, H, (and the homologous fragments) proved refractory to Edman degradation. Data from amino acid analysis and carboxypeptidase digestion of the naturally occurring fragments and of fragments produced by endopeptidase digestion were used together with Edman degradation to obtain the sequences. Amino acid analysis of fragments of the naturally occurring "overlap" peptides (those produced by internal cleavage at one site on H) was used to obtain the sequences across the cleavage sites. The amino acid sequence of the land crab peptide H is Arg-Ser-Ala-Asp-Gly-Phe-Gly-Arg-Met-Glu-Ser-Leu-Leu-Thr-Ser-Leu-Arg-Gly-Ser-Ala-Glu-Ser-Pro-Ala-Ala-Leu-Gly-Glu-Ala-Ser-Ala-Ala-His-Pro-Leu-Glu. In vivo cleavage at one site involves excision of arginine from the sequence Leu-Arg-Gly, whereas cleavage at the other site involves excision of serine from the sequence Glu-Ser-Leu. Proteolysis at the latter sequence has not been previously reported in intact secretory granules. The aspartate at position 4 is possibly covalently modified.
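
    The two excision sites described in the record above can be located directly in the reported sequence of peptide H. The sketch below is illustrative only: the helper names are hypothetical, and it makes no attempt to assign the resulting fragments to the peptide labels C, D, E, F and I, which the abstract does not spell out.

```python
# Sequence of peptide H as given in the abstract (three-letter codes).
PEPTIDE_H = (
    "Arg-Ser-Ala-Asp-Gly-Phe-Gly-Arg-Met-Glu-Ser-Leu-Leu-Thr-Ser-Leu-Arg-Gly-"
    "Ser-Ala-Glu-Ser-Pro-Ala-Ala-Leu-Gly-Glu-Ala-Ser-Ala-Ala-His-Pro-Leu-Glu"
).split("-")


def excised_position(residues, motif):
    """Return the 1-based position of the middle residue of a three-residue motif."""
    for i in range(len(residues) - 2):
        if residues[i:i + 3] == motif:
            return i + 2  # i is the 0-based start, so the middle residue is i + 2 (1-based)
    raise ValueError("motif not found")


def fragments_after_excision(residues, positions):
    """Split the chain into (start, end) ranges, dropping the residues at the given 1-based positions."""
    pieces, start = [], 1
    for pos in sorted(positions):
        pieces.append((start, pos - 1))
        start = pos + 1
    pieces.append((start, len(residues)))
    return pieces


arg_site = excised_position(PEPTIDE_H, ["Leu", "Arg", "Gly"])  # arginine excised
ser_site = excised_position(PEPTIDE_H, ["Glu", "Ser", "Leu"])  # serine excised

only_ser_cut = fragments_after_excision(PEPTIDE_H, [ser_site])
only_arg_cut = fragments_after_excision(PEPTIDE_H, [arg_site])
both_cut = fragments_after_excision(PEPTIDE_H, [ser_site, arg_site])
print(len(PEPTIDE_H), ser_site, arg_site, both_cut)
```

    Cutting at one site at a time gives the two-piece "overlap" patterns, and cutting at both gives three pieces, so incomplete proteolysis at two sites yields five distinct fragments in total, matching the number of peptides mentioned in the abstract.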

  3. Complete genome sequence of the hyperthermophilic archaeon Thermococcus kodakaraensis KOD1 and comparison with Pyrococcus genomes

    PubMed Central

    Fukui, Toshiaki; Atomi, Haruyuki; Kanai, Tamotsu; Matsumi, Rie; Fujiwara, Shinsuke; Imanaka, Tadayuki

    2005-01-01

    The genus Thermococcus, comprised of sulfur-reducing hyperthermophilic archaea, belongs to the order Thermococcales in Euryarchaeota along with the closely related genus Pyrococcus. The members of Thermococcus are ubiquitously present in natural high-temperature environments, and are therefore considered to play a major role in the ecology and metabolic activity of microbial consortia within hot-water ecosystems. To obtain insight into this important genus, we have determined and annotated the complete 2,088,737-base genome of Thermococcus kodakaraensis strain KOD1, followed by a comparison with the three complete genomes of Pyrococcus spp. A total of 2306 coding DNA sequences (CDSs) have been identified, among which half (1165 CDSs) are annotatable, whereas the functions of 41% (936 CDSs) cannot be predicted from the primary structures. The genome contains seven genes for probable transposases and four virus-related regions. Several proteins within these genetic elements show high similarities to those in Pyrococcus spp., implying the natural occurrence of horizontal gene transfer of such mobile elements among the order Thermococcales. Comparative genomics clarified that 1204 proteins, including those for information processing and basic metabolisms, are shared among T. kodakaraensis and the three Pyrococcus spp. On the other hand, among the set of 689 proteins unique to T. kodakaraensis, there are several intriguing proteins that might be responsible for the specific trait of the genus Thermococcus, such as proteins involved in additional pyruvate oxidation, nucleotide metabolisms, unique or additional metal ion transporters, improved stress response system, and a distinct restriction system. PMID:15710748
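
    The proportions quoted in the record above can be checked with a few lines of arithmetic. This is only a sanity check on the figures in the abstract; the variable names are arbitrary.

```python
TOTAL_CDS = 2306      # coding DNA sequences identified
ANNOTATABLE = 1165    # described as "half"
UNPREDICTED = 936     # described as 41%

annotatable_pct = 100 * ANNOTATABLE / TOTAL_CDS
unpredicted_pct = 100 * UNPREDICTED / TOTAL_CDS
remainder = TOTAL_CDS - ANNOTATABLE - UNPREDICTED

print(f"annotatable: {annotatable_pct:.1f}%")
print(f"function not predictable: {unpredicted_pct:.1f}%")
print(f"not covered by either figure: {remainder} CDSs")
```

    The two quoted categories do not add up to the full set of CDSs; the abstract does not say how the remaining ones are classified.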

  4. A comparison between equations describing in vivo MT: The effects of noise and sequence parameters

    NASA Astrophysics Data System (ADS)

    Cercignani, Mara; Barker, Gareth J.

    2008-04-01

    Quantitative models of magnetization transfer (MT) allow the estimation of physical properties of tissue which are thought to reflect myelination, and are therefore likely to be useful for clinical application. Although a model describing a two-pool system under continuous wave-saturation has been available for two decades, generalizing such a model to pulsed MT, and therefore to in vivo applications, is not straightforward, and only recently has a range of equations predicting the outcome of pulsed MT experiments been proposed. These solutions of the 2-pool model are based on differing assumptions and involve differing degrees of complexity, so their individual advantages and limitations are not always obvious. This paper is concerned with the comparison of three differing signal equations. After reviewing the theory behind each of them, their accuracy and precision are investigated using numerical simulations under variable experimental conditions such as degree of T1-weighting of the acquisition sequence and SNR, and the consistency of numerical results is tested using in vivo data. We show that while in conditions of minimal T1-weighting, high SNR, and large duty cycle the solutions of the three equations are consistent, they have a different tolerance to deviations from the basic assumptions behind their development, which should be taken into account when designing a quantitative MT protocol.

  5. Enzyme-free translation of DNA into sequence-defined synthetic polymers structurally unrelated to nucleic acids

    NASA Astrophysics Data System (ADS)

    Niu, Jia; Hili, Ryan; Liu, David R.

    2013-04-01

    The translation of DNA sequences into corresponding biopolymers enables the production, function and evolution of the macromolecules of life. In contrast, methods to generate sequence-defined synthetic polymers with similar levels of control have remained elusive. Here, we report the development of a DNA-templated translation system that enables the enzyme-free translation of DNA templates into sequence-defined synthetic polymers that have no necessary structural relationship with nucleic acids. We demonstrate the efficiency, sequence-specificity and generality of this translation system by oligomerizing building blocks including polyethylene glycol, α-(D)-peptides, and β-peptides in a DNA-programmed manner. Sequence-defined synthetic polymers with molecular weights of 26 kDa containing 16 consecutively coupled building blocks and 90 densely functionalized β-amino acid residues were translated from DNA templates using this strategy. We integrated the DNA-templated translation system developed here into a complete cycle of translation, coding sequence replication, template regeneration and re-translation suitable for the iterated in vitro selection of functional sequence-defined synthetic polymers unrelated in structure to nucleic acids.

  6. Boronic acid functionalized peptidyl synthetic lectins: Combinatorial library design, peptide sequencing, and selective glycoprotein recognition

    PubMed Central

    Bicker, Kevin L.; Sun, Jing; Lavigne, John J.; Thompson, Paul R.

    2011-01-01

    Aberrant glycosylation of cell membrane and secreted glycoproteins is a hallmark of various disease states, including cancer. The natural lectins currently used in the recognition of these glycoproteins are costly, difficult to produce, and unstable towards rigorous use. Herein we describe the design and synthesis of several boronic acid functionalized peptide-based synthetic lectin (SL) libraries, as well as the optimized methodology for obtaining peptide sequences of these SLs. SL libraries were subsequently used to identify SLs with as high as 5-fold selectivity for various glycoproteins. SLs will inevitably find a role in cancer diagnostics, given that they do not suffer from the drawbacks of natural lectins and that the combinatorial nature of these libraries allows for the identification of an SL for nearly any glycosylated biomolecule. PMID:21405093

  7. Kinetics of amyloid aggregation of mammal apomyoglobins and correlation with their amino acid sequences.

    PubMed

    Vilasi, Silvia; Dosi, Roberta; Iannuzzi, Clara; Malmo, Clorinda; Parente, Augusto; Irace, Gaetano; Sirangelo, Ivana

    2006-03-01

    In protein deposition disorders, a normally soluble protein is deposited as insoluble aggregates, referred to as amyloid. The intrinsic effects of specific mutations on the rates of protein aggregation and amyloid formation of unfolded polypeptide chains can be correlated with changes in hydrophobicity, propensity to convert alpha-helical to beta sheet conformation and charge. In this paper, we report the aggregation rates of buffalo, horse and bovine apomyoglobins. The experimental values were compared with the theoretical ones evaluated considering the amino acid differences among the sequences. Our results show that the mutations which play critical roles in the rate-determining step of apomyoglobin aggregation are those located within the N-terminal region of the molecule.

  8. GAWK, a novel human pituitary polypeptide: isolation, immunocytochemical localization and complete amino acid sequence.

    PubMed

    Benjannet, S; Leduc, R; Lazure, C; Seidah, N G; Marcinkiewicz, M; Chrétien, M

    1985-01-16

    During the course of reverse-phase high pressure liquid chromatography (RP-HPLC) purification of a postulated big ACTH (1) from human pituitary gland extracts, a highly purified peptide bearing no resemblance to any known polypeptide was isolated. The complete sequence of this 74 amino acid polypeptide, called GAWK, has been determined. Search on a computer data bank on the possible homology to any known protein or fragment, using a mutation data matrix, failed to reveal any homology greater than 30%. An antibody produced against a synthetic fragment allowed us to detect several immunoreactive forms. The antisera also enabled us to localize the polypeptide, by immunocytochemistry, in the anterior lobe of the pituitary gland.

  9. Evolutionary connections of biological kingdoms based on protein and nucleic acid sequence evidence

    NASA Technical Reports Server (NTRS)

    Dayhoff, M. O.

    1983-01-01

    Prokaryotic and eukaryotic evolutionary trees are developed from protein and nucleic-acid sequences by the methods of numerical taxonomy. Trees are presented for bacterial ferredoxins, 5S ribosomal RNA, c-type cytochromes, cytochromes c2 and c', and 5.8S ribosomal RNA; the implications for early evolution are discussed; and a composite tree showing the branching of the anaerobes, aerobes, archaebacteria, and eukaryotes is shown. Single lines are found for all oxygen-evolving photosynthetic forms and for the salt-loving and high-temperature forms of archaebacteria. It is argued that the eukaryote mitochondria, chloroplasts, and cytoplasmic host material are descended from free-living prokaryotes that formed symbiotic associations, with more than one symbiotic event involved in the evolution of each organelle.

  10. Identification of amino acid sequences in the polyomavirus capsid proteins that serve as nuclear localization signals

    NASA Technical Reports Server (NTRS)

    Chang, D.; Haynes, J. I. Jr; Brady, J. N.; Consigli, R. A.; Spooner, B. S. (Principal Investigator)

    1993-01-01

    The molecular mechanism participating in the transport of newly synthesized proteins from the cytoplasm to the nucleus in mammalian cells is poorly understood. Recently, the nuclear localization signal sequences (NLS) of many nuclear proteins have been identified, and most have been found to be composed of a highly basic amino acid stretch. A genetic "subtractive" and a biochemical "additive" approach were used in our studies to identify the NLS's of the polyomavirus structural capsid proteins. An NLS was identified at the N-terminus (Ala1-Pro-Lys-Arg-Lys-Ser-Gly-Val-Ser-Lys-Cys11) of the major capsid protein VP1 and at the C-terminus (Glu307-Glu-Asp-Gly-Pro-Glu-Lys-Lys-Lys-Arg-Arg-Leu318) of the VP2/VP3 minor capsid proteins.
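
    The "highly basic amino acid stretch" mentioned in the record above can be made concrete by counting lysine and arginine residues in the two reported signal sequences. This is a minimal sketch: the function names are hypothetical, and treating only Lys and Arg as basic (leaving out His) is an assumption of the sketch, not something stated in the abstract.

```python
BASIC = {"Lys", "Arg"}  # assumption: His is not counted as basic here

# Sequences as reported in the abstract, without the position numbers.
VP1_NLS = "Ala-Pro-Lys-Arg-Lys-Ser-Gly-Val-Ser-Lys-Cys".split("-")
VP2_VP3_NLS = "Glu-Glu-Asp-Gly-Pro-Glu-Lys-Lys-Lys-Arg-Arg-Leu".split("-")


def basic_count(residues):
    """Number of Lys/Arg residues in the sequence."""
    return sum(1 for r in residues if r in BASIC)


def longest_basic_run(residues):
    """Length of the longest uninterrupted run of Lys/Arg residues."""
    best = run = 0
    for r in residues:
        run = run + 1 if r in BASIC else 0
        best = max(best, run)
    return best


for name, seq in (("VP1", VP1_NLS), ("VP2/VP3", VP2_VP3_NLS)):
    print(name, len(seq), basic_count(seq), longest_basic_run(seq))
```

    Both signals contain a cluster of consecutive basic residues, which is the feature the abstract describes as typical of nuclear localization signals.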

  11. Purification, properties and complete amino acid sequence of the ferredoxin from a green alga, Chlamydomonas reinhardtii.

    PubMed

    Schmitter, J M; Jacquot, J P; de Lamotte-Guéry, F; Beauvallet, C; Dutka, S; Gadal, P; Decottignies, P

    1988-03-01

    The ferredoxin was purified from the green alga, Chlamydomonas reinhardtii. The protein showed typical absorption and circular dichroism spectra of a [2Fe-2S] ferredoxin. When compared with spinach ferredoxin, the C. reinhardtii protein was less effective in the catalysis of NADP+ photoreduction, but its activity was higher in the light activation of C. reinhardtii malate dehydrogenase (NADP). The complete amino acid sequence was determined by automated Edman degradation of the whole protein and of peptides obtained by trypsin and chymotrypsin digestions and by CNBr cleavage. The protein consists of 94 residues, with Tyr at both NH2 and COOH termini. The positions of the four cysteines binding the two iron atoms are similar to those found in other [2Fe-2S] ferredoxins. The primary structure of C. reinhardtii ferredoxin showed a great homology (about 80%) with ferredoxins from two other green algae.

  12. Real-time nucleic acid sequence-based amplification in nanoliter volumes.

    PubMed

    Gulliksen, Anja; Solli, Lars; Karlsen, Frank; Rogne, Henrik; Hovig, Eivind; Nordstrøm, Trine; Sirevåg, Reidun

    2004-01-01

    Real-time nucleic acid sequence-based amplification (NASBA) is an isothermal method specifically designed for amplification of RNA. Fluorescent molecular beacon probes enable real-time monitoring of the amplification process. Successful identification, utilizing the real-time NASBA technology, was performed on a microchip with oligonucleotides at a concentration of 1.0 and 0.1 µM, in 10- and 50-nL reaction chambers, respectively. The microchip was developed in a silicon-glass structure. An instrument providing thermal control and an optical detection system was built for amplification readout. Experimental results demonstrate distinct amplification processes. Miniaturized real-time NASBA in microchips makes high-throughput diagnostics of bacteria, viruses, and cancer markers possible, at reduced cost and without contamination.
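
    To give a sense of scale for the reaction chambers in the record above, the amount of oligonucleotide in each chamber follows from concentration times volume. The sketch below is plain unit arithmetic; the function name is hypothetical, and the pairing of concentrations with volumes follows the "respectively" in the abstract.

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def amount_in_chamber(conc_micromolar, volume_nanolitres):
    """Return (moles, molecules) for a given concentration in µM and volume in nL."""
    moles = (conc_micromolar * 1e-6) * (volume_nanolitres * 1e-9)
    return moles, moles * AVOGADRO


moles_10nl, count_10nl = amount_in_chamber(1.0, 10)   # 1.0 µM in a 10-nL chamber
moles_50nl, count_50nl = amount_in_chamber(0.1, 50)   # 0.1 µM in a 50-nL chamber

print(f"10 nL chamber: {moles_10nl * 1e15:.0f} fmol, {count_10nl:.2e} molecules")
print(f"50 nL chamber: {moles_50nl * 1e15:.0f} fmol, {count_50nl:.2e} molecules")
```

    Under these figures the smaller chamber holds about 10 fmol of oligonucleotide and the larger one about 5 fmol, so the two test conditions differ by a factor of two in absolute amount despite the tenfold difference in concentration.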

  13. Real-time nucleic acid sequence-based amplification assay for detection of hepatitis A virus.

    PubMed

    Abd el-Galil, Khaled H; el-Sokkary, M A; Kheira, S M; Salazar, Andre M; Yates, Marylynn V; Chen, Wilfred; Mulchandani, Ashok

    2005-11-01

    A nucleic acid sequence-based amplification (NASBA) assay in combination with a molecular beacon was developed for the real-time detection and quantification of hepatitis A virus (HAV). A 202-bp, highly conserved 5' noncoding region of HAV was targeted. The sensitivity of the real-time NASBA assay was tested with 10-fold dilutions of viral RNA, and a detection limit of 1 PFU was obtained. The specificity of the assay was demonstrated by testing with other environmental pathogens and indicator microorganisms, with only HAV positively identified. When combined with immunomagnetic separation, the NASBA assay successfully detected as few as 10 PFU from seeded lake water samples. Due to its isothermal nature, its speed, and its similar sensitivity compared to the real-time RT-PCR assay, this newly reported real-time NASBA method will have broad applications for the rapid detection of HAV in contaminated food or water.

  14. Detection of infectious salmon anaemia virus by real-time nucleic acid sequence based amplification.

    PubMed

    Starkey, William G; Smail, David A; Bleie, Hogne; Muir, K Fiona; Ireland, Jacqueline H; Richards, Randolph H

    2006-10-17

    We have developed a real-time nucleic acid sequence based amplification (NASBA) procedure for detection of infectious salmon anaemia virus (ISAV). Primers were designed to target a 124 nucleotide region of ISAV genome segment 8. Amplification products were detected in real-time with a molecular beacon (carboxyfluorescein [FAM]-labelled and methyl-red quenched) that recognised an internal region of the target amplicon. Amplification and detection were performed at 41 °C for 90 min in a Corbett Research Rotorgene. The real-time NASBA assay was compared to a conventional RT-PCR for ISAV detection. From a panel of 45 clinical samples, both assays detected ISAV in the same 19 samples. Based on the detection of a synthetic RNA target, the real-time NASBA procedure was approximately 100x more sensitive than conventional RT-PCR. These results suggest that real-time NASBA may represent a useful diagnostic procedure for ISAV.

  15. Sequence-defined shuttles for targeted nucleic acid and protein delivery.

    PubMed

    Röder, Ruth; Wagner, Ernst

    2014-01-01

    Molecular medicine opens into a space of novel specific therapeutic agents: intracellularly active drugs such as peptides, proteins or nucleic acids, which are not able to cross cell membranes and enter the intracellular space on their own. Through the development of cell-targeted shuttles for specific delivery, this restriction in delivery has the potential to be converted into an advantage. On the one hand, due to the multiple extra- and intracellular barriers, such carrier systems need to be multifunctional. On the other hand, they must be precisely and reproducibly manufactured for pharmaceutical reasons. Here we review the design of precise sequence-defined delivery carriers, including solid-phase synthesized peptides and nonpeptidic oligomers, or nucleotide-based carriers such as aptamers and origami nanoboxes.

  16. Parameters of proteome evolution from histograms of amino-acid sequence identities of paralogous proteins

    PubMed Central

    Axelsen, Jacob Bock; Yan, Koon-Kiu; Maslov, Sergei

    2007-01-01

    Background: The evolution of the full repertoire of proteins encoded in a given genome is mostly driven by gene duplications, deletions, and sequence modifications of existing proteins. Indirect information about relative rates and other intrinsic parameters of these three basic processes is contained in the proteome-wide distribution of sequence identities of pairs of paralogous proteins. Results: We introduce a simple mathematical framework based on a stochastic birth-and-death model that allows one to extract some of this information and apply it to the set of all pairs of paralogous proteins in H. pylori, E. coli, S. cerevisiae, C. elegans, D. melanogaster, and H. sapiens. It was found that the histogram of sequence identities p generated by an all-to-all alignment of all protein sequences encoded in a genome is well fitted with a power-law form $\sim p^{-\gamma}$ with the value of the exponent $\gamma$ around 4 for the majority of organisms used in this study. This implies that the intra-protein variability of substitution rates is best described by the Gamma-distribution with the exponent $\alpha \approx 0.33$. Different features of the shape of such histograms allow us to quantify the ratio between the genome-wide average deletion/duplication rates and the amino-acid substitution rate. Conclusion: We separately measure the short-term ("raw") duplication and deletion rates $r^{*}_{dup}$, $r^{*}_{del}$ which include gene copies that will be removed soon after the duplication event and their dramatically reduced long-term counterparts $r_{dup}$, $r_{del}$. High deletion rate among recently duplicated proteins is consistent with a scenario in which they didn't have enough time to significantly change their functional roles and thus are to a large degree disposable. Systematic trends of each of the four duplication/deletion rates with the total number of genes in the genome were analyzed. All but the deletion rate of recent duplicates $r^{*}_{del}$ were shown to systematically increase with $N_{genes}$. Abnormally flat shapes
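
    The power-law form of the identity histogram described in the record above can be illustrated by estimating the exponent from binned counts. The sketch below uses an ordinary least-squares fit on log-log axes, which is a generic technique and not necessarily the fitting procedure used in the paper; the function name and the synthetic histogram are hypothetical.

```python
import math


def fit_power_law_exponent(identities, counts):
    """Estimate gamma in count ~ p**(-gamma) by least squares on log(count) vs log(p)."""
    points = [(math.log(p), math.log(c)) for p, c in zip(identities, counts) if c > 0]
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    return -sxy / sxx  # the slope is -gamma


# Synthetic histogram built to follow p**(-4) exactly (not data from the study).
bin_centres = [0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
bin_counts = [1000.0 * p ** -4 for p in bin_centres]

gamma_estimate = fit_power_law_exponent(bin_centres, bin_counts)
print(round(gamma_estimate, 6))
```

    On real histograms the low-count bins are noisy, so a log-log regression of this kind is only a rough first look at the exponent.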

  17. Comparison of 16S rRNA gene phylogeny and functional tfdA gene distribution in thirty-one different 2,4-dichlorophenoxyacetic acid and 4-chloro-2-methylphenoxyacetic acid degraders.

    PubMed

    Baelum, Jacob; Jacobsen, Carsten S; Holben, William E

    2010-03-01

    Thirty-one different bacterial strains isolated using the herbicide 2,4-dichlorophenoxyacetic acid (2,4-D) as the sole source of carbon were investigated for their ability to mineralize 2,4-D and the related herbicide 4-chloro-2-methylphenoxyacetic acid (MCPA). Most of the strains mineralize 2,4-D considerably faster than MCPA. Three novel primer sets were developed enabling amplification of full-length coding sequences (CDS) of the three tfdA gene classes known to be involved in phenoxy acid degradation. 16S rRNA genes were also sequenced; and in order to investigate possible linkage between tfdA gene classes and bacterial species, tfdA and 16S rRNA gene phylogeny was compared. Three distinctly different classes of tfdA genes were observed, with class I tfdA sequences further partitioned into the two sub-classes I-a and I-b based on more subtle differences. Comparison of phylogenies derived from 16S rRNA gene sequences and tfdA gene sequences revealed that most class II tfdA genes were encoded by Burkholderia sp., while class I-a, I-b and III genes were found in a more diverse array of bacteria.

  18. Trypsin inhibitors from ridged gourd (Luffa acutangula Linn.) seeds: purification, properties, and amino acid sequences.

    PubMed

    Haldar, U C; Saha, S K; Beavis, R C; Sinha, N K

    1996-02-01

    Two trypsin inhibitors, LA-1 and LA-2, have been isolated from ridged gourd (Luffa acutangula Linn.) seeds and purified to homogeneity by gel filtration followed by ion-exchange chromatography. The isoelectric point is at pH 4.55 for LA-1 and at pH 5.85 for LA-2. The Stokes radius of each inhibitor is 11.4 Å. The fluorescence emission spectrum of each inhibitor is similar to that of the free tyrosine. The bimolecular rate constant of acrylamide quenching is $1.0 \times 10^{9}$ M$^{-1}$ sec$^{-1}$ for LA-1 and $0.8 \times 10^{9}$ M$^{-1}$ sec$^{-1}$ for LA-2 and that of K2HPO4 quenching is $1.6 \times 10^{11}$ M$^{-1}$ sec$^{-1}$ for LA-1 and $1.2 \times 10^{11}$ M$^{-1}$ sec$^{-1}$ for LA-2. Analysis of the circular dichroic spectra yields 40% alpha-helix and 60% beta-turn for LA-1 and 45% alpha-helix and 55% beta-turn for LA-2. Inhibitors LA-1 and LA-2 consist of 28 and 29 amino acid residues, respectively. They lack threonine, alanine, valine, and tryptophan. Both inhibitors strongly inhibit trypsin by forming enzyme-inhibitor complexes at a molar ratio of unity. A chemical modification study suggests the involvement of arginine of LA-1 and lysine of LA-2 in their reactive sites. The inhibitors are very similar in their amino acid sequences, and show sequence homology with other squash family inhibitors. PMID:8924202

  19. Microfluidic platform for isolating nucleic acid targets using sequence specific hybridization

    PubMed Central

    Wang, Jingjing; Morabito, Kenneth; Tang, Jay X.; Tripathi, Anubhav

    2013-01-01

    The separation of target nucleic acid sequences from biological samples has emerged as a significant process in today's diagnostics and detection strategies. In addition to the possible clinical applications, the fundamental understanding of target and sequence specific hybridization on surface modified magnetic beads is of high value. In this paper, we describe a novel microfluidic platform that utilizes a mobile magnetic field in static microfluidic channels, where single stranded DNA (ssDNA) molecules are isolated via nucleic acid hybridization. We first established efficient isolation of biotinylated capture probe (BP) using streptavidin-coated magnetic beads. Subsequently, we investigated the hybridization of target ssDNA with BP bound to beads and explained these hybridization kinetics using a dual-species kinetic model. The number of hybridized target ssDNA molecules was determined to be about 6.5 times less than that of BP on the bead surface, due to steric hindrance effects. The hybridization of target ssDNA with non-complementary BP bound to beads was also examined, and non-specific hybridization was found to be insignificant. Finally, we demonstrated highly efficient capture and isolation of target ssDNA in the presence of non-target ssDNA, in which as little as 1% target ssDNA can be detected in a mixture. The microfluidic method described in this paper is broadly applicable, especially to point-of-care biological diagnostic platforms that require binding and separation of known target biomolecules, such as RNA, ssDNA, or protein. PMID:24404041

  20. Detection of Vibrio cholerae by real-time nucleic acid sequence-based amplification.

    PubMed

    Fykse, Else M; Skogan, Gunnar; Davies, William; Olsen, Jaran Strand; Blatny, Janet M

    2007-03-01

    A multitarget molecular beacon-based real-time nucleic acid sequence-based amplification (NASBA) assay for the specific detection of Vibrio cholerae has been developed. The genes encoding the cholera toxin (ctxA), the toxin-coregulated pilus (tcpA; colonization factor), the ctxA toxin regulator (toxR), hemolysin (hlyA), and the 60-kDa chaperonin product (groEL) were selected as target sequences for detection. The beacons for the five different genetic targets were evaluated by serial dilution of RNA from V. cholerae cells. RNase treatment of the nucleic acids eliminated all NASBA, whereas DNase treatment had no effect, showing that RNA and not DNA was amplified. The specificity of the assay was investigated by testing several isolates of V. cholerae, other Vibrio species, and Bacillus cereus, Salmonella enterica, and Escherichia coli strains. The toxR, groEL, and hlyA beacons identified all V. cholerae isolates, whereas the ctxA and tcpA beacons identified the O1 toxigenic clinical isolates. The NASBA assay detected V. cholerae at 50 CFU/ml by using the general marker groEL and tcpA that specifically indicates toxigenic strains. A correlation between cell viability and NASBA was demonstrated for the ctxA, toxR, and hlyA targets. RNA isolated from different environmental water samples spiked with V. cholerae was specifically detected by NASBA. These results indicate that NASBA can be used in the rapid detection of V. cholerae from various environmental water samples. This method has a strong potential for detecting toxigenic strains by using the tcpA and ctxA markers. The entire assay including RNA extraction and NASBA was completed within 3 h.

  1. Phylogenetic analysis of beta-papillomaviruses as inferred from nucleotide and amino acid sequence data.

    PubMed

    Gottschling, Marc; Köhler, Anja; Stockfleth, Eggert; Nindl, Ingo

    2007-01-01

    Human papillomaviruses (HPV) of the beta-group seem to be involved in the pathogenesis of non-melanoma skin cancer. Papillomaviruses are host specific and are considered to be closely co-evolving with their hosts. Evolutionary incongruence between early genes and late genes has been reported among oncogenic genital alpha-papillomaviruses and considerably challenges phylogenetic reconstructions. We investigated the relationships of 29 beta-HPV (25 types plus four putative new types, subtypes, or variants) as inferred from codon-aligned and amino acid sequence data of the genes E1, E2, E6, E7, L1, and L2 using likelihood, distance, and parsimony approaches. An analysis of an L1 fragment included additional nucleotide and amino acid sequences from seven non-human beta-papillomaviruses. The evolution of early genes and late genes did not conflict significantly in beta-papillomaviruses based on partition homogeneity tests (p ≥ 0.001). As inferred from the complete genome analyses, beta-papillomaviruses were monophyletic and segregated into four highly supported monophyletic assemblages corresponding to the species 1, 2, 3, and fused 4/5. They basically split into the species 1 and the remainder of beta-papillomaviruses, whose species 3, 4, and 5 constituted the sistergroup of species 2. beta-Papillomaviruses have been isolated from humans, apes, and monkeys, and phylogenetic analyses of the L1 fragment showed non-human papillomaviruses to be highly polyphyletic, nesting within the HPV species. Thus, host and virus phylogenies were not congruent in beta-papillomaviruses, and multiple invasions across species borders may contribute (in addition to host-linked evolution) to their diversification.

  2. A single molecular beacon probe is sufficient for the analysis of multiple nucleic acid sequences.

    PubMed

    Gerasimova, Yulia V; Hayson, Aaron; Ballantyne, Jack; Kolpashchikov, Dmitry M

    2010-08-16

    Molecular beacon (MB) probes are dual-labeled hairpin-shaped oligodeoxyribonucleotides that are extensively used for real-time detection of specific RNA/DNA analytes. In the MB probe, the loop fragment is complementary to the analyte; therefore, a unique probe is required for the analysis of each new analyte sequence. The conjugation of an oligonucleotide with two dyes and subsequent purification procedures add to the cost of MB probes, thus reducing their application in multiplex formats. Here we demonstrate how one MB probe can be used for the analysis of an arbitrary nucleic acid. The approach takes advantage of two oligonucleotide adaptor strands, each of which contains a fragment complementary to the analyte and a fragment complementary to an MB probe. The presence of the analyte leads to association of the MB probe and the two DNA strands in a quadripartite complex. The MB probe fluorescently reports the formation of this complex. In this design, the MB does not bind the analyte directly; therefore, the MB sequence is independent of the analyte. In this study one universal MB probe was used to genotype three human polymorphic sites. This approach promises to reduce the cost of multiplex real-time assays and improve the accuracy of single-nucleotide polymorphism genotyping.

  3. Comparison of acid anhydrides with carboxylic acids in enantioselective enzymatic esterification of racemic menthol.

    PubMed

    Xu, J; Zhu, J; Kawamoto, T; Atsuo, T; Hu, Y

    1997-01-01

    Optical resolution of racemic menthol has been efficiently achieved by lipase-catalyzed enantioselective esterification in an organic solvent. The performance of the reaction using an acid anhydride as an acyl donor was compared with that using its corresponding free acid. The reactivities of acid anhydrides were found to be higher than those of their corresponding free acids, but acid anhydrides were also found to be easily hydrolyzed into free acids under the catalysis of the same enzyme. An excessively high concentration of an acid anhydride in a micro-aqueous reaction system causes dehydration and thus deactivation of the enzyme, and enhances non-selective esterification of a chiral alcohol, which reduces the optical purity of the product. All these drawbacks, however, could be effectively overcome in a semi-batch reaction system into which propionic anhydride was continuously fed. This system showed some advantages over a batch reaction system using free propionic acid: the reaction time of dl-menthol was shortened by half, the stability of the enzyme was much enhanced, and the optical purity of the product (l-menthyl ester) was kept at a similarly high level (> 98% ee). PMID:9631262

  4. Comparison of D-gluconic acid production in selected strains of acetic acid bacteria.

    PubMed

    Sainz, F; Navarro, D; Mateo, E; Torija, M J; Mas, A

    2016-04-01

    The oxidative metabolism of acetic acid bacteria (AAB) can be exploited for the production of several compounds, including D-gluconic acid. The production of D-gluconic acid in fermented beverages could be useful for the development of new products without glucose. In the present study, we analyzed nineteen strains belonging to eight different species of AAB to select those that could produce D-gluconic acid from D-glucose without consuming D-fructose. We tested their performance in three different media and analyzed the changes in the levels of D-glucose, D-fructose, D-gluconic acid and the derived gluconates. D-Glucose and D-fructose consumption and D-gluconic acid production were heavily dependent on the strain and the media. The most suitable strains for our purpose were Gluconobacter japonicus CECT 8443 and Gluconobacter oxydans Po5. The strawberry isolate Acetobacter malorum (CECT 7749) also produced D-gluconic acid; however, it further oxidized D-gluconic acid to keto-D-gluconates.

  5. [Comparison between Astragalus membranaceus var. mongholicus and Hedysarum polybotrys based on ITS sequences and metabolomics].

    PubMed

    Jiao, Mei-li; Li, Zhen-yu; Zhang, Fu-sheng; Qin, Xue-mei

    2015-12-01

    Astragalus membranaceus var. mongholicus and Hedysarum polybotrys belong to different genera but have similar drug efficacy in traditional Chinese medicine theory, and H. polybotrys was previously used legally as A. membranaceus var. mongholicus. In this study, similarities and differences between them were analyzed using information from their ITS/ITS2 fragments. The ITS (internal transcribed spacer) regions were amplified using polymerase chain reaction and then sequenced in both directions. The alignment length of the ITS regions was 616 bp, in which 508 loci were consistent and 103 loci were different, accounting for 82.47% and 16.72% of the total ITS nucleotides in length, respectively. As genotype determines phenotype, a 1H NMR-based metabolomic approach was further used to reveal the chemical similarities and differences between them. Thirty-four metabolites were identified in the 1H NMR spectra, and twenty-seven metabolites were common components. Amino acids, carbohydrates and other primary metabolites were similar, while a large difference existed in the flavonoids and astragalosides. This study suggests that A. membranaceus var. mongholicus and H. polybotrys show similarities and differences from molecular and chemical perspectives, which lays a foundation for elucidating the effective material basis of drugs with similar efficacy and for resource utilization. PMID:27169287
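
    The percentages quoted in the abstract above follow directly from the reported counts. A minimal sketch of that arithmetic is given here; the function and variable names are illustrative and not taken from the study, and the reading of the five unaccounted positions as gaps or ambiguous sites is an assumption, since the abstract does not say.

```python
def percent_of_alignment(count: int, alignment_length: int) -> float:
    """Return count as a percentage of the alignment length, rounded to 2 places."""
    return round(100.0 * count / alignment_length, 2)


# Counts reported in the abstract for the aligned ITS regions.
ALIGNMENT_LENGTH = 616  # aligned positions (bp)
CONSISTENT = 508        # positions identical in both species
DIFFERENT = 103         # positions that differ

identity = percent_of_alignment(CONSISTENT, ALIGNMENT_LENGTH)
difference = percent_of_alignment(DIFFERENT, ALIGNMENT_LENGTH)

# Positions not covered by either count. Assumption: these are gaps or
# ambiguous sites; the abstract does not state what they are.
unaccounted = ALIGNMENT_LENGTH - CONSISTENT - DIFFERENT

print(identity, difference, unaccounted)
```

    The two counts sum to 611 of 616 positions, which is why the two percentages do not add up to 100%.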

  6. Exon-intron organization and sequence comparison of human and murine T11 (CD2) genes

    SciTech Connect

    Diamond, D.J.; Clayton, L.K.; Sayre, P.H.; Reinherz, E.L.

    1988-03-01

    Genomic DNA clones containing the human and murine genes coding for the 50-kDa T11 (CD2) T-cell surface glycoprotein were characterized. The human T11 gene is approximately 12 kilobases long and comprises five exons. A leader exon (L) contains the 5'-untranslated region and most of the nucleotides defining the signal peptide (amino acids (aa) -24 to -5). Two exons encode the extracellular segment; exon Ex1 is 321 base pairs (bp) long and codes for four residues of the leader peptide and aa 1-103 of the mature protein, and exon Ex2 is 231 bp long and encodes aa 104-180. Exon TM is 123 bp long and codes for the single transmembrane region of the molecule (aa 181-221). Exon C is a large 765-bp exon encoding virtually the entire cytoplasmic domain (aa 222-327) and the 3'-untranslated region. The murine T11 gene has a similar organization, with exon-intron boundaries essentially identical to those of the human gene. Substantial conservation of nucleotide sequences between species in both 5'- and 3'-gene flanking regions, equivalent to that among homologous exons, suggests that murine and human genes may be regulated in a similar fashion. The probable relationship of the individual T11 exons to functional and structural protein domains is discussed.
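
    The exon lengths and residue ranges reported above can be cross-checked with simple codon arithmetic (three nucleotides per residue). The sketch below is illustrative only: the helper name and the dictionary layout are assumptions, and it ignores the possibility that a codon is split across an exon boundary.

```python
def codons_needed(first_aa: int, last_aa: int, extra_residues: int = 0) -> int:
    """Number of codons for residues first_aa..last_aa plus any extra residues."""
    return (last_aa - first_aa + 1) + extra_residues


# exon name -> (reported length in bp, number of codons it is said to encode)
EXONS = {
    "Ex1": (321, codons_needed(1, 103, extra_residues=4)),  # 4 leader residues + aa 1-103
    "Ex2": (231, codons_needed(104, 180)),
    "TM": (123, codons_needed(181, 221)),
}

# True where the reported exon length equals 3 nt per encoded residue.
consistent = {name: bp == 3 * codons for name, (bp, codons) in EXONS.items()}

# Exon C also carries the 3'-untranslated region, so only part of its
# 765 bp is coding. The remainder includes the stop codon and the UTR.
exon_c_coding_bp = 3 * codons_needed(222, 327)
exon_c_remainder_bp = 765 - exon_c_coding_bp

print(consistent, exon_c_coding_bp, exon_c_remainder_bp)
```

    Under these assumptions the three fully coding exons match their reported lengths exactly, and exon C has 318 coding bp with 447 bp left over.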

  7. Canine amino acid transport system Xc(-): cDNA sequence, distribution and cystine transport activity in lens epithelial cells.

    PubMed

    Maruo, Takuya; Kanemaki, Nobuyuki; Onda, Ken; Sato, Reiichiro; Ichihara, Nobuteru; Ochiai, Hideharu

    2014-04-01

    The cystine transport activity of a lens epithelial cell line originating from a canine mature cataract was investigated. Distinct cystine transport activity was observed, which was inhibited to 28% by extracellular 1 mM glutamate. The cDNA sequences of the canine cystine/glutamate exchanger (xCT) and 4F2hc were determined. The predicted amino acid sequences were polypeptides of 527 and 533 amino acids, respectively. The amino acid sequences of canine xCT and 4F2hc showed high similarities (>80%) to those of humans. The expression of xCT in the lens epithelial cell line was confirmed by western blot analysis. RT-PCR analysis revealed high-level expression only in the brain; expression was below the detectable level in other tissues.

  8. Human Retroviruses and AIDS. A compilation and analysis of nucleic acid and amino acid sequences: I--II; III--V

    SciTech Connect

    Myers, G.; Korber, B.; Wain-Hobson, S.; Smith, R.F.; Pavlakis, G.N.

    1993-12-31

    This compendium and the accompanying floppy diskettes are the result of an effort to compile and rapidly publish all relevant molecular data concerning the human immunodeficiency viruses (HIV) and related retroviruses. The scope of the compendium and database is best summarized by the five parts that it comprises: (I) HIV and SIV Nucleotide Sequences; (II) Amino Acid Sequences; (III) Analyses; (IV) Related Sequences; and (V) Database Communications. Information within all the parts is updated at least twice in each year, which accounts for the modes of binding and pagination in the compendium.

  10. Amino acid sequence surrounding the chondroitin sulfate attachment site of thrombomodulin regulates chondroitin polymerization.

    PubMed

    Izumikawa, Tomomi; Kitagawa, Hiroshi

    2015-05-01

    Thrombomodulin (TM) is a cell-surface glycoprotein and a critical mediator of endothelial anticoagulant function. TM exists as both a chondroitin sulfate (CS) proteoglycan (PG) form and a non-PG form lacking a CS chain (α-TM); therefore, TM can be described as a part-time PG. Previously, we reported that α-TM bears an immature, truncated linkage tetrasaccharide structure (GlcAβ1-3Galβ1-3Galβ1-4Xyl). However, the biosynthetic mechanism that generates part-time PGs remains unclear. In this study, we used several mutants to demonstrate that the amino acid sequence surrounding the CS attachment site influences the efficiency of chondroitin polymerization. In particular, the presence of acidic residues surrounding the CS attachment site was indispensable for the elongation of CS. In addition, mutants defective in CS elongation did not exhibit anti-coagulant activity, as is the case with α-TM. Together, these data support a model for CS chain assembly in which specific core protein determinants are recognized by a key biosynthetic enzyme involved in chondroitin polymerization.

  11. Lactic acid production from potato peel waste by anaerobic sequencing batch fermentation using undefined mixed culture.

    PubMed

    Liang, Shaobo; McDonald, Armando G; Coats, Erik R

    2015-11-01

    Lactic acid (LA) is a necessary industrial feedstock for producing the bioplastic polylactic acid (PLA) and is currently produced by pure culture fermentation of food carbohydrates. This work presents an alternative: producing LA from potato peel waste (PPW) by anaerobic fermentation in a sequencing batch reactor (SBR) inoculated with undefined mixed culture from a municipal wastewater treatment plant. A statistical design of experiments approach was employed with a set of 0.8 L SBRs fed gelatinized PPW at solids contents of 30 to 50 g L(-1) and solids retention times (SRT) of 2-4 days for yield and productivity optimization. A maximum LA production yield of 0.25 g g(-1) PPW and a highest productivity of 125 mg g(-1) d(-1) were achieved. In a scale-up SBR trial using neat gelatinized PPW (at 80 g L(-1) solids content) at the 3 L scale, the highest LA yield of 0.14 g g(-1) PPW and a productivity of 138 mg g(-1) d(-1) were achieved with a 1 d SRT. PMID:25708409

  12. A novel phytase with sequence similarity to purple acid phosphatases is expressed in cotyledons of germinating soybean seedlings.

    PubMed

    Hegeman, C E; Grabau, E A

    2001-08-01

    Phytic acid (myo-inositol hexakisphosphate) is the major storage form of phosphorus in plant seeds. During germination, stored reserves are used as a source of nutrients by the plant seedling. Phytic acid is degraded by the activity of phytases to yield inositol and free phosphate. Due to the lack of phytases in the non-ruminant digestive tract, monogastric animals cannot utilize dietary phytic acid and it is excreted into manure. High phytic acid content in manure results in elevated phosphorus levels in soil and water and accompanying environmental concerns. The use of phytases to degrade seed phytic acid has potential for reducing the negative environmental impact of livestock production. A phytase was purified to electrophoretic homogeneity from cotyledons of germinated soybeans (Glycine max L. Merr.). Peptide sequence data generated from the purified enzyme facilitated the cloning of the phytase sequence (GmPhy) employing a polymerase chain reaction strategy. The introduction of GmPhy into soybean tissue culture resulted in increased phytase activity in transformed cells, which confirmed the identity of the phytase gene. It is surprising that the soybean phytase was unrelated to previously characterized microbial or maize (Zea mays) phytases, which were classified as histidine acid phosphatases. The soybean phytase sequence exhibited a high degree of similarity to purple acid phosphatases, a class of metallophosphoesterases.

  13. Indigenous and introduced potyviruses of legumes and Passiflora spp. from Australia: biological properties and comparison of coat protein nucleotide sequences.

    PubMed

    Coutts, Brenda A; Kehoe, Monica A; Webster, Craig G; Wylie, Stephen J; Jones, Roger A C

    2011-10-01

    Five Australian potyviruses, passion fruit woodiness virus (PWV), passiflora mosaic virus (PaMV), passiflora virus Y (PaVY), clitoria chlorosis virus (ClCV) and hardenbergia mosaic virus (HarMV), and two introduced potyviruses, bean common mosaic virus (BCMV) and cowpea aphid-borne mosaic virus (CAbMV), were detected in nine wild or cultivated Passiflora and legume species growing in tropical, subtropical or Mediterranean climatic regions of Western Australia. When ClCV (1), PaMV (1), PaVY (8) and PWV (5) isolates were inoculated to 15 plant species, PWV and two PaVY P. foetida isolates infected P. edulis and P. caerulea readily but legumes only occasionally. Another PaVY P. foetida isolate resembled five PaVY legume isolates in infecting legumes readily but not infecting P. edulis. PaMV resembled PaVY legume isolates in legumes but also infected P. edulis. ClCV did not infect P. edulis or P. caerulea and behaved differently from PaVY legume isolates and PaMV when inoculated to two legume species. When complete coat protein (CP) nucleotide (nt) sequences of 33 new isolates were compared with 41 others, PWV (8), HarMV (4), PaMV (1) and ClCV (1) were within a large group of Australian isolates, while PaVY (14), CAbMV (1) and BCMV (3) isolates were in three other groups. Variation among PWV and PaVY isolates was sufficient for division into four clades each (I-IV). A variable block of 56 amino acid residues at the N-terminal region of the CPs of PaMV and ClCV distinguished them from PWV. Comparison of PWV, PaMV and ClCV CP sequences showed that nt identities were both above and below the 76-77% potyvirus species threshold level. This research gives insights into invasion of new hosts by potyviruses at the natural vegetation and cultivated area interface, and illustrates the potential of indigenous viruses to emerge to infect introduced plants. PMID:21744001

  14. Comparison of phenotypic and molecular tests to identify lactic acid bacteria.

    PubMed

    Moraes, Paula Mendonça; Perin, Luana Martins; Júnior, Abelardo Silva; Nero, Luís Augusto

    2013-01-01

    Twenty-nine lactic acid bacteria (LAB) isolates were submitted for identification using Biolog, API50CHL, 16S rDNA sequencing, and species-specific PCR reactions. The identification results were compared, and it was concluded that a polyphasic approach is necessary for proper LAB identification, with the molecular analyses being the most reliable. PMID:24159291

  15. A novel phospholipase A(2) from the venom glands of Bungarus candidus: cloning and sequence-comparison.

    PubMed

    Tsai, Inn-Ho; Hsu, Hwa-Yao; Wang, Ying-Ming

    2002-09-01

    The presence of phospholipase A(2) (PLA(2)) in the venom of the Malayan krait (Bungarus candidus) and its structure were studied. The PLA(2) cDNAs from the venom gland of B. candidus (Indonesian origin) were amplified by polymerase chain reaction (PCR) and cloned. The primers used were based on the cDNA sequences of several homologous B. multicinctus venom PLA(2)s. In addition to the A-chains of beta-bungarotoxins, a novel B. candidus PLA(2) was cloned and its full amino acid sequence deduced. With a total of 125 amino acid residues, the PLA(2) contains a pancreatic loop and is 61% identical to the acidic PLA(2) of king cobra venom. However, the enzyme was not detected in the venom sample. Its structural relationships to other elapid venom PLA(2)s were analyzed with a phylogenetic tree and discussed. PMID:12220723

  16. Microwave-assisted acid and base hydrolysis of intact proteins containing disulfide bonds for protein sequence analysis by mass spectrometry.

    PubMed

    Reiz, Bela; Li, Liang

    2010-09-01

    Controlled hydrolysis of proteins to generate peptide ladders combined with mass spectrometric analysis of the resultant peptides can be used for protein sequencing. In this paper, two methods of improving the microwave-assisted protein hydrolysis process are described to enable rapid sequencing of proteins containing disulfide bonds and increase sequence coverage, respectively. It was demonstrated that proteins containing disulfide bonds could be sequenced by MS analysis by first performing hydrolysis for less than 2 min, followed by 1 h of reduction to release the peptides originally linked by disulfide bonds. It was shown that a strong base could be used as a catalyst for microwave-assisted protein hydrolysis, producing complementary sequence information to that generated by microwave-assisted acid hydrolysis. However, using either acid or base hydrolysis, amide bond breakages in small regions of the polypeptide chains of the model proteins (e.g., cytochrome c and lysozyme) were not detected. Dynamic light scattering measurement of the proteins solubilized in an acid or base indicated that protein-protein interaction or aggregation was not the cause of the failure to hydrolyze certain amide bonds. It was speculated that there were some unknown local structures that might play a role in preventing an acid or base from reacting with the peptide bonds therein.

  17. Negative Ion In-Source Decay Matrix-Assisted Laser Desorption/Ionization Mass Spectrometry for Sequencing Acidic Peptides

    NASA Astrophysics Data System (ADS)

    McMillen, Chelsea L.; Wright, Patience M.; Cassady, Carolyn J.

    2016-05-01

    Matrix-assisted laser desorption/ionization (MALDI) in-source decay was studied in the negative ion mode on deprotonated peptides to determine its usefulness for obtaining extensive sequence information for acidic peptides. Eight biological acidic peptides, ranging in size from 11 to 33 residues, were studied by negative ion mode ISD (nISD). The matrices 2,5-dihydroxybenzoic acid, 2-aminobenzoic acid, 2-aminobenzamide, 1,5-diaminonaphthalene, 5-amino-1-naphthol, 3-aminoquinoline, and 9-aminoacridine were used with each peptide. Optimal fragmentation was produced with 1,5-diaminonaphthalene (DAN), and extensive sequence informative fragmentation was observed for every peptide except hirudin(54-65). Cleavage at the N-Cα bond of the peptide backbone, producing c' and z' ions, was dominant for all peptides. Cleavage of the N-Cα bond N-terminal to proline residues was not observed. The formation of c and z ions is also found in electron transfer dissociation (ETD), electron capture dissociation (ECD), and positive ion mode ISD, which are considered to be radical-driven techniques. Oxidized insulin chain A, which has four highly acidic oxidized cysteine residues, had less extensive fragmentation. This peptide also exhibited the only charge-localized fragmentation, with more pronounced product ion formation adjacent to the highly acidic residues. In addition, spectra were obtained by positive ion mode ISD for each protonated peptide; more sequence informative fragmentation was observed via nISD for all peptides. Three of the peptides studied had no product ion formation in ISD, but extensive sequence informative fragmentation was found in their nISD spectra. The results of this study indicate that nISD can be used to readily obtain sequence information for acidic peptides.

  18. Nucleotide sequence and genomic organization of Aleutian mink disease parvovirus (ADV): sequence comparisons between a nonpathogenic and a pathogenic strain of ADV.

    PubMed Central

    Bloom, M E; Alexandersen, S; Perryman, S; Lechner, D; Wolfinbarger, J B

    1988-01-01

    A DNA sequence of 4,592 nucleotides (nt) was derived for the nonpathogenic ADV-G strain of Aleutian mink disease parvovirus (ADV). The 3'(left) end of the virion strand contained a 117-nt palindrome that could assume a Y-shaped configuration similar to, but less stable than, that of other parvoviruses. The sequence obtained for the 5' end was incomplete and did not contain the 5' (right) hairpin structure but ended just after a 25-nt A + T-rich direct repeat. Features of ADV genomic organization are (i) major left (622 amino acids) and right (702 amino acids) open reading frames (ORFs) in different translational frames of the plus-sense strand, (ii) two short mid-ORFs, (iii) eight potential promoter motifs (TATA boxes), including ones at 3 and 36 map units, and (iv) six potential polyadenylation sites, including three clustered near the termination of the right ORF. Although the overall homology to other parvoviruses is less than 50%, there are short conserved amino acid regions in both major ORFs. However, two regions in the right ORF allegedly conserved among the parvoviruses were not present in ADV. At the DNA level, ADV-G is 97.5% related to the pathogenic ADV-Utah 1. A total of 22 amino acid changes were found in the right ORF; changes were found in both hydrophilic and hydrophobic regions and generally did not affect the theoretical hydropathy. However, there is a short heterogeneous region at 64 to 65 map units in which 8 out of 11 residues have diverged; this hypervariable segment may be analogous to short amino acid regions in other parvoviruses that determine host range and pathogenicity. These findings suggested that this region may harbor some of the determinants responsible for the differences in pathogenicity of ADV-G and ADV-Utah 1. PMID:2839709

  19. Crosslinked hyaluronic acid dermal fillers: a comparison of rheological properties.

    PubMed

    Falcone, Samuel J; Berg, Richard A

    2008-10-01

    Temporary dermal fillers composed of crosslinked hyaluronic acid (XLHA) are space filling gels that are readily available in the United States and Europe. Several families of dermal fillers based on XLHA are now available and here we compare the physical and rheological properties of these fillers to the clinical effectiveness. The XLHA fillers are prepared with different crosslinkers, using HA isolated from different sources, have different particle sizes, and differ substantially in rheological properties. For these fillers, the magnitude of the complex viscosity, |eta*|, varies by a factor of 20, the magnitude of the complex rigidity modulus, |G*|, and the magnitude of the complex compliance, |J*| vary by a factor of 10, the percent elasticity varies from 58% to 89.9%, and the tan delta varies from 0.11 to 0.70. The available clinical data cannot be correlated with either the oscillatory dynamic or steady flow rotational rheological properties of the various fillers. However, the clinical data appear to correlate strongly with the total concentration of XLHA in the products and to a lesser extent with percent elasticity. Hence, our data suggest the following correlation: dermal filler persistence = [polymer] x [% elasticity] and the clinical persistence of a dermal filler composed of XLHA is dominated by the mass and elasticity of the material implanted. This work predicts that the development of future XLHA dermal filler formulations should focus on increasing the polymer concentration and elasticity to improve the clinical persistence.
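
    The correlation proposed in the abstract above, persistence = [polymer] x [% elasticity], can be sketched numerically. The block below is an illustration under stated assumptions, not the authors' method: percent elasticity is assumed to be defined as 100 x G'/(G' + G''), with tan delta = G''/G', and the two fillers and their values are hypothetical. Under that assumed definition, the reported tan delta range of 0.11 to 0.70 maps to roughly 90% and 59% elasticity, which is close to the 89.9% and 58% quoted.

```python
def percent_elasticity_from_tan_delta(tan_delta: float) -> float:
    """Percent elasticity, ASSUMING the definition 100 * G' / (G' + G'').

    With tan_delta = G'' / G', this reduces to 100 / (1 + tan_delta).
    """
    return 100.0 / (1.0 + tan_delta)


def persistence_index(polymer_mg_per_ml: float, percent_elasticity: float) -> float:
    """Relative persistence score: [polymer] x [% elasticity], per the abstract."""
    return polymer_mg_per_ml * percent_elasticity


# Hypothetical fillers: name -> (XLHA concentration in mg/mL, tan delta).
# These values are invented for illustration and are not from the study.
fillers = {
    "filler_A": (20.0, 0.20),
    "filler_B": (24.0, 0.50),
}

scores = {
    name: persistence_index(conc, percent_elasticity_from_tan_delta(td))
    for name, (conc, td) in fillers.items()
}

# Rank from highest to lowest predicted persistence.
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked, scores)
```

    In this made-up comparison the more elastic filler ranks higher despite its lower polymer concentration, which shows how the two factors trade off in the proposed product.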

  20. Comparison of sequencing-based methods to profile DNA methylation and identification of monoallelic epigenetic modifications

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Analysis of DNA methylation patterns relies increasingly on sequencing-based profiling methods. The four most frequently used sequencing-based technologies are the bisulfite-based methods MethylC-seq and reduced representation bisulfite sequencing (RRBS), and the enrichment-based techniques methylat...

  1. Polyvinyl-alcohol-based magnetic beads for rapid and efficient separation of specific or unspecific nucleic acid sequences

    NASA Astrophysics Data System (ADS)

    Oster, Jürgen; Parker, Jeffrey; à Brassard, Lothar

    2001-01-01

    The versatile application of polyvinyl-alcohol-based magnetic M-PVA beads is demonstrated in the separation of genomic DNA, sequence specific nucleic acid purification, and binding of bacteria for subsequent DNA extraction and detection. It is shown that nucleic acids can be obtained in high yield and purity using M-PVA beads, making sample preparation efficient, fast and highly adaptable for automation processes.

  2. Pancreatic ribonucleases of mammals with ruminant-like digestion. Amino-acid sequences of hippopotamus and sloth ribonucleases.

    PubMed

    Havinga, J; Beintema, J J

    1980-09-01

    High levels of pancreatic ribonucleases are found in ruminants, species that have a ruminant-like digestion and several species with coecal digestion. Pancreatic ribonucleases from several independently evolved species with ruminant-like digestion were investigated to test a hypothesis that glycosylation of ribonucleases may have some function in species with coecal digestion and that glycosylation of the enzyme may not be advantageous for ruminants. Ribonucleases from the hippopotamus, two-toed sloth and three-toed sloth were isolated by extraction with sulfuric acid and affinity chromatography. Complete amino acid sequences were determined for the ribonucleases from the hippopotamus and two-toed sloth and a partial sequence for the enzyme from the three-toed sloth. The amino acids 75-78 of hippopotamus ribonuclease were positioned by homology with other artiodactyl ribonucleases. In hippopotamus ribonuclease a heterogeneity was found at position 37, half of the molecules containing glutamine and the other half lysine. Hippopotamus ribonuclease differs less from pig and bovine ribonuclease than these differ from each other, because more ancestral characteristics have been retained. Although hippopotamus ribonuclease contains all four Asn-X-Ser/Thr sequences previously found to be glycosylation sites in one or more pancreatic ribonucleases, only the sequence Asn-Met-Thr (34-36) is glycosylated in the variant with glutamine at position 37, while the variant with lysine at this position is carbohydrate-free. Both sloth ribonucleases are completely glycosylated at the sequence Asn-Met-Thr (34-36) with a simple type of carbohydrate chain. The amino acid sequence of two-toed sloth ribonuclease shows some interesting coupled replacements.
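
    Candidate Asn-X-Ser/Thr glycosylation sites of the kind discussed above can be located in a one-letter protein sequence with a short scan. This is a minimal sketch: excluding proline at the X position is an assumption taken from general knowledge of N-glycosylation and is not stated in the abstract, and the test peptide is made up rather than a real ribonuclease sequence.

```python
import re

# Asn, any residue except Pro (assumed), then Ser or Thr.
# The lookahead lets overlapping candidate sites be reported.
SEQUON = re.compile(r"(?=(N[^P][ST]))")


def find_sequons(protein_sequence):
    """Return (1-based position of Asn, tripeptide) for each candidate site."""
    sequence = protein_sequence.upper()
    return [(m.start() + 1, m.group(1)) for m in SEQUON.finditer(sequence)]


if __name__ == "__main__":
    # Hypothetical peptide used only to exercise the function.
    print(find_sequons("AANMTKNPSNGS"))
```

    As the abstract itself shows, a matching sequon is only a candidate: of four such sites in hippopotamus ribonuclease, only one is reported to be glycosylated.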

  3. From First Base: The Sequence of the Tip of the X Chromosome of Drosophila melanogaster, a Comparison of Two Sequencing Strategies

    PubMed Central

    Benos, Panayiotis V.; Gatt, Melanie K.; Murphy, Lee; Harris, David; Barrell, Bart; Ferraz, Concepcion; Vidal, Sophie; Brun, Christine; Demaille, Jacques; Cadieu, Edouard; Dreano, Stephane; Gloux, Stéphanie; Lelaure, Valerie; Mottier, Stephanie; Galibert, Francis; Borkova, Dana; Miñana, Belen; Kafatos, Fotis C.; Bolshakov, Slava; Sidén-Kiamos, Inga; Papagiannakis, George; Spanos, Lefteris; Louis, Christos; Madueño, Encarnación; de Pablos, Beatriz; Modolell, Juan; Peter, Annette; Schöttler, Petra; Werner, Meike; Mourkioti, Fotini; Beinert, Nicole; Dowe, Gordon; Schäfer, Ulrich; Jäckle, Herbert; Bucheton, Alain; Callister, Debbie; Campbell, Lorna; Henderson, Nadine S.; McMillan, Paul J.; Salles, Cathy; Tait, Evelyn; Valenti, Phillipe; Saunders, Robert D.C.; Billaud, Alain; Pachter, Lior; Glover, David M.; Ashburner, Michael

    2001-01-01

    We present the sequence of a contiguous 2.63 Mb of DNA extending from the tip of the X chromosome of Drosophila melanogaster. Within this sequence, we predict 277 protein coding genes, of which 94 had been sequenced already in the course of studying the biology of their gene products, and examples of 12 different transposable elements. We show that an interval between bands 3A2 and 3C2, believed in the 1970s to show a correlation between the number of bands on the polytene chromosomes and the 20 genes identified by conventional genetics, is predicted to contain 45 genes from its DNA sequence. We have determined the insertion sites of P-elements from 111 mutant lines, about half of which are in a position likely to affect the expression of novel predicted genes, thus representing a resource for subsequent functional genomic analysis. We compare the European Drosophila Genome Project sequence with the corresponding part of the independently assembled and annotated Joint Sequence determined through “shotgun” sequencing. Discounting differences in the distribution of known transposable elements between the strains sequenced in the two projects, we detected three major sequence differences, two of which are probably explained by errors in assembly; the origin of the third major difference is unclear. In addition there are eight sequence gaps within the Joint Sequence. At least six of these eight gaps are likely to be sites of transposable elements; the other two are complex. Of the 275 genes in common to both projects, 60% are identical within 1% of their predicted amino-acid sequence and 31% show minor differences such as in choice of translation initiation or termination codons; the remaining 9% show major differences in interpretation. [All of the sequences analyzed in this paper have been deposited in the EMBL-Bank database under the following accession nos.: AL009146, AL009147, AL009171, AL009188–AL009196, AL021067, AL021086, AL021106–AL021108, AL021726, AL

  4. Method for the detection of specific nucleic acid sequences by polymerase nucleotide incorporation

    DOEpatents

    Castro, Alonso

    2004-06-01

    A method for rapid and efficient detection of a target DNA or RNA sequence is provided. A primer having a 3'-hydroxyl group at one end and having a sequence of nucleotides sufficiently homologous with an identifying sequence of nucleotides in the target DNA is selected. The primer is hybridized to the identifying sequence of nucleotides on the DNA or RNA sequence and a reporter molecule is synthesized on the target sequence by progressively binding complementary nucleotides to the primer, where the complementary nucleotides include nucleotides labeled with a fluorophore. Fluorescence emitted by fluorophores on single reporter molecules is detected to identify the target DNA or RNA sequence.

  5. Identification of tropomyosins as major allergens in antarctic krill and mantis shrimp and their amino acid sequence characteristics.

    PubMed

    Motoyama, Kanna; Suma, Yota; Ishizaki, Shoichiro; Nagashima, Yuji; Lu, Ying; Ushio, Hideki; Shiomi, Kazuo

    2008-01-01

    Tropomyosin represents a major allergen of decapod crustaceans such as shrimps and crabs, and its highly conserved amino acid sequence (>90% identity) is a molecular basis of the immunoglobulin E (IgE) cross-reactivity among decapods. At present, however, little information is available about allergens in edible crustaceans other than decapods. In this study, the major allergen in two species of edible crustaceans, Antarctic krill Euphausia superba and mantis shrimp Oratosquilla oratoria that are taxonomically distinct from decapods, was demonstrated to be tropomyosin by IgE-immunoblotting using patient sera. The cross-reactivity of the tropomyosins from both species with decapod tropomyosins was also confirmed by inhibition IgE immunoblotting. Sequences of the tropomyosins from both species were determined by complementary deoxyribonucleic acid cloning. The mantis shrimp tropomyosin has high sequence identity (>90% identity) with decapod tropomyosins, especially with fast-type tropomyosins. On the other hand, the Antarctic krill tropomyosin is characterized by diverse alterations in region 13-42, the amino acid sequence of which is highly conserved for decapod tropomyosins, and hence, it shares somewhat lower sequence identity (82.4-89.8% identity) with decapod tropomyosins than the mantis shrimp tropomyosin. Quantification by enzyme-linked immunosorbent assay revealed that Antarctic krill contains tropomyosin at almost the same level as decapods, suggesting that its allergenicity is equivalent to decapods. However, mantis shrimp was assumed to be substantially not allergenic because of the extremely low content of tropomyosin. PMID:18521668
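
    Percent sequence identity figures such as those quoted above are computed from a pairwise alignment. The abstract does not say which alignment tool or denominator was used, so the sketch below makes one common assumption: identity is the number of matching positions divided by the number of aligned positions where neither sequence has a gap. The input strings are invented.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over gap-free columns of two pre-aligned sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for residue_a, residue_b in zip(aligned_a, aligned_b):
        if residue_a == gap or residue_b == gap:
            continue  # skip columns containing a gap (assumed convention)
        compared += 1
        if residue_a == residue_b:
            matches += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Hypothetical aligned fragments, not real tropomyosin sequences.
    print(percent_identity("MDAIKKKMQA", "MDAIKKKMLA"))
```

    Other conventions (for example dividing by the full alignment length or by the shorter sequence) give different percentages, so values from different studies are comparable only when the definition matches.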

  6. Draft Genome Sequence of Lactobacillus delbrueckii subsp. bulgaricus CFL1, a Lactic Acid Bacterium Isolated from French Handcrafted Fermented Milk

    PubMed Central

    Meneghel, Julie; Irlinger, Françoise; Loux, Valentin; Vidal, Marie; Passot, Stéphanie; Béal, Catherine; Layec, Séverine

    2016-01-01

    Lactobacillus delbrueckii subsp. bulgaricus (L. bulgaricus) is a lactic acid bacterium widely used for the production of yogurt and cheeses. Here, we report the genome sequence of L. bulgaricus CFL1 to improve our knowledge on its stress-induced damages following production and end-use processes. PMID:26941141

  7. Update of PROFEAT: a web server for computing structural and physicochemical features of proteins and peptides from amino acid sequence.

    PubMed

    Rao, H B; Zhu, F; Yang, G B; Li, Z R; Chen, Y Z

    2011-07-01

    Sequence-derived structural and physicochemical features have been extensively used for analyzing and predicting structural, functional, expression and interaction profiles of proteins and peptides. PROFEAT has been developed as a web server for computing commonly used features of proteins and peptides from amino acid sequence. To facilitate more extensive studies of protein and peptides, numerous improvements and updates have been made to PROFEAT. We added new functions for computing descriptors of protein-protein and protein-small molecule interactions, segment descriptors for local properties of protein sequences, topological descriptors for peptide sequences and small molecule structures. We also added new feature groups for proteins and peptides (pseudo-amino acid composition, amphiphilic pseudo-amino acid composition, total amino acid properties and atomic-level topological descriptors) as well as for small molecules (atomic-level topological descriptors). Overall, PROFEAT computes 11 feature groups of descriptors for proteins and peptides, and a feature group of more than 400 descriptors for small molecules plus the derived features for protein-protein and protein-small molecule interactions. Our computational algorithms have been extensively tested and used in a number of published works for predicting proteins of specific structural or functional classes, protein-protein interactions, peptides of specific functions and quantitative structure activity relationships of small molecules. PROFEAT is accessible free of charge at http://bidd.cz3.nus.edu.sg/cgi-bin/prof/protein/profnew.cgi.

  8. Genome Sequence of a Candidate World Health Organization Reference Strain of Zika Virus for Nucleic Acid Testing

    PubMed Central

    Trösemeier, Jan-Hendrik; Musso, Didier; Blümel, Johannes; Thézé, Julien; Pybus, Oliver G.

    2016-01-01

    We report here the sequence of a candidate reference strain of Zika virus (ZIKV) developed on behalf of the World Health Organization (WHO). The ZIKV reference strain is intended for use in nucleic acid amplification (NAT)-based assays for the detection and quantification of ZIKV RNA. PMID:27587826

  9. Draft Genome Sequence of Burkholderia stabilis LA20W, a Trehalose Producer That Uses Levulinic Acid as a Substrate

    PubMed Central

    Sato, Yuya; Koike, Hideaki; Kondo, Susumu; Hori, Tomoyuki; Kanno, Manabu; Kimura, Nobutada; Morita, Tomotake; Kirimura, Kohtaro

    2016-01-01

    Burkholderia stabilis LA20W produces trehalose using levulinic acid (LA) as a substrate. Here, we report the 7.97-Mb draft genome sequence of B. stabilis LA20W, which will be useful in investigations of the enzymes involved in LA metabolism and the mechanism of LA-induced trehalose production. PMID:27491978

  10. Draft Genome Sequence of Lactobacillus delbrueckii subsp. bulgaricus CFL1, a Lactic Acid Bacterium Isolated from French Handcrafted Fermented Milk.

    PubMed

    Meneghel, Julie; Dugat-Bony, Eric; Irlinger, Françoise; Loux, Valentin; Vidal, Marie; Passot, Stéphanie; Béal, Catherine; Layec, Séverine; Fonseca, Fernanda

    2016-01-01

    Lactobacillus delbrueckii subsp. bulgaricus (L. bulgaricus) is a lactic acid bacterium widely used for the production of yogurt and cheeses. Here, we report the genome sequence of L. bulgaricus CFL1 to improve our knowledge on its stress-induced damages following production and end-use processes. PMID:26941141

  11. Draft Genome Sequence of Acetobacter tropicalis Type Strain NBRC16470, a Producer of Optically Pure d-Glyceric Acid

    PubMed Central

    Koike, Hideaki; Sato, Shun; Morita, Tomotake; Fukuoka, Tokuma

    2014-01-01

    Here we report the 3.7-Mb draft genome sequence of Acetobacter tropicalis NBRC16470T, which can produce optically pure d-glyceric acid (d-GA; 99% enantiomeric excess) from raw glycerol feedstock derived from biodiesel fuel production processes. PMID:25523780

  12. Complete genome sequence of Lactobacillus plantarum ZS2058, a probiotic strain with high conjugated linoleic acid production ability.

    PubMed

    Yang, Bo; Chen, Haiqin; Tian, Fengwei; Zhao, Jianxin; Gu, Zhennan; Zhang, Hao; Chen, Yong Q; Chen, Wei

    2015-11-20

    Lactobacillus plantarum ZS2058 was isolated from sauerkraut and found to synthesize the beneficial metabolite conjugated linoleic acid. The genome contains a 3,197,363-bp chromosome and three plasmids. The sequence will facilitate identification and characterization of the genetic determinants for its putative biological benefits.

  13. Draft Genome Sequence of Cutaneotrichosporon curvatus DSM 101032 (Formerly Cryptococcus curvatus), an Oleaginous Yeast Producing Polyunsaturated Fatty Acids

    PubMed Central

    Hofmeyer, Thomas; Hackenschmidt, Silke; Nadler, Florian; Thürmer, Andrea; Daniel, Rolf

    2016-01-01

    Cutaneotrichosporon curvatus DSM 101032 is an oleaginous yeast that can be isolated from various habitats and is capable of producing substantial amounts of polyunsaturated fatty acids. Here, we present the first draft genome sequence of any C. curvatus species. PMID:27174275

  14. Draft Genome Sequence of Lactobacillus delbrueckii subsp. bulgaricus CFL1, a Lactic Acid Bacterium Isolated from French Handcrafted Fermented Milk.

    PubMed

    Meneghel, Julie; Dugat-Bony, Eric; Irlinger, Françoise; Loux, Valentin; Vidal, Marie; Passot, Stéphanie; Béal, Catherine; Layec, Séverine; Fonseca, Fernanda

    2016-03-03

    Lactobacillus delbrueckii subsp. bulgaricus (L. bulgaricus) is a lactic acid bacterium widely used for the production of yogurt and cheeses. Here, we report the genome sequence of L. bulgaricus CFL1 to improve our knowledge on its stress-induced damages following production and end-use processes.

  15. Genome Sequence of a Candidate World Health Organization Reference Strain of Zika Virus for Nucleic Acid Testing.

    PubMed

    Trösemeier, Jan-Hendrik; Musso, Didier; Blümel, Johannes; Thézé, Julien; Pybus, Oliver G; Baylis, Sally A

    2016-01-01

    We report here the sequence of a candidate reference strain of Zika virus (ZIKV) developed on behalf of the World Health Organization (WHO). The ZIKV reference strain is intended for use in nucleic acid amplification (NAT)-based assays for the detection and quantification of ZIKV RNA. PMID:27587826

  16. Ultra high-throughput nucleic acid sequencing as a tool for virus discovery in the turkey gut.

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Recently, the use of the next generation of nucleic acid sequencing technology (i.e., 454 pyrosequencing, as developed by Roche/454 Life Sciences) has allowed an in-depth look at the uncultivated microorganisms present in complex environmental samples, including samples with agricultural importance....

  17. Genome Sequence of a Candidate World Health Organization Reference Strain of Zika Virus for Nucleic Acid Testing.

    PubMed

    Trösemeier, Jan-Hendrik; Musso, Didier; Blümel, Johannes; Thézé, Julien; Pybus, Oliver G; Baylis, Sally A

    2016-01-01

    We report here the sequence of a candidate reference strain of Zika virus (ZIKV) developed on behalf of the World Health Organization (WHO). The ZIKV reference strain is intended for use in nucleic acid amplification (NAT)-based assays for the detection and quantification of ZIKV RNA.

  18. Sequence heterogeneity of cannabidiolic- and tetrahydrocannabinolic acid-synthase in Cannabis sativa L. and its relationship with chemical phenotype.

    PubMed

    Onofri, Chiara; de Meijer, Etienne P M; Mandolino, Giuseppe

    2015-08-01

    Sequence variants of THCA- and CBDA-synthases were isolated from different Cannabis sativa L. strains expressing various wild-type and mutant chemical phenotypes (chemotypes). Expressed and complete sequences were obtained from mature inflorescences. Each strain was shown to have a different specificity and/or ability to convert the precursor CBGA into CBDA and/or THCA type products. The comparison of the expressed sequences led to the identification of different mutations, all of them due to SNPs. These SNPs were found to relate to the cannabinoid composition of the inflorescence at maturity and are therefore proposed to have a functional significance. The amount of variation was found to be higher within the CBDAS sequence family than in the THCAS family, suggesting a more recent evolution of THCA-forming enzymes from the CBDAS group. We therefore consider CBDAS as the ancestral type of these synthases.

  19. Comparison of base composition analysis and Sanger sequencing of mitochondrial DNA for four U.S. population groups.

    PubMed

    Kiesler, Kevin M; Coble, Michael D; Hall, Thomas A; Vallone, Peter M

    2014-01-01

    A set of 711 samples from four U.S. population groups was analyzed using a novel mass spectrometry based method for mitochondrial DNA (mtDNA) base composition profiling. Comparison of the mass spectrometry results with Sanger sequencing derived data yielded a concordance rate of 99.97%. Length heteroplasmy was identified in 46% of samples and point heteroplasmy was observed in 6.6% of samples in the combined mass spectral and Sanger data set. Using discrimination capacity as a metric, Sanger sequencing of the full control region had the highest discriminatory power, followed by the mass spectrometry base composition method, which was more discriminating than Sanger sequencing of just the hypervariable regions. This trend is in agreement with the number of nucleotides covered by each of the three assays.
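
    The abstract uses discrimination capacity as its metric without giving a formula. A minimal sketch, assuming the usual forensic definition (number of distinct profiles divided by number of samples), is shown below; the profile labels are placeholders.

```python
def discrimination_capacity(profiles):
    """Distinct profiles divided by total samples (assumed definition)."""
    if not profiles:
        raise ValueError("at least one profile is required")
    return len(set(profiles)) / len(profiles)


if __name__ == "__main__":
    # Hypothetical haplotype labels for four samples.
    print(discrimination_capacity(["hapA", "hapA", "hapB", "hapC"]))
```

    Under this definition an assay that covers more nucleotides can only split existing groups further, which is consistent with the ordering of the three assays reported above.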

  20. Comparison of amino acid profiles between rats subjected to forced running and voluntary running exercises

    PubMed Central

    OKAME, Rieko; NAKAHARA, Keiko; KATO, Yumiko; BANNAI, Makoto; MURAKAMI, Noboru

    2015-01-01

    It has been suspected that in comparison with glucose or fatty acids, the levels of amino acids may readily change with different forms of exercise. In the present study, we measured the concentrations of amino acids, glucose, triglycerides, total protein and total cholesterol in the blood and/or cerebrospinal fluid (CSF) of rats subjected to forced running exercise on a treadmill, and voluntary running exercise using a wheel, with a constant running distance of 440 m. Rats that performed no running and rats subjected to immobilization stress were used as controls. We observed a few significant changes in the levels of plasma glucose, triglycerides, total protein and total cholesterol in all groups, whereas plasma amino acid levels were significantly changed by exercise and stress, especially during the light period. The plasma levels of many amino acids were specifically increased by forced running; some were decreased by immobilization stress. Few amino acids showed similar changes in their levels as a result of voluntary running. In addition, there was a significant difference in the degree of amino acid imbalance between blood and CSF. These results provide the first information on changes in levels of amino acids in plasma and CSF resulting from forced and voluntary exercises. PMID:25715957

  1. Comparison of amino acid profiles between rats subjected to forced running and voluntary running exercises.

    PubMed

    Okame, Rieko; Nakahara, Keiko; Kato, Yumiko; Bannai, Makoto; Murakami, Noboru

    2015-06-01

    It has been suspected that in comparison with glucose or fatty acids, the levels of amino acids may readily change with different forms of exercise. In the present study, we measured the concentrations of amino acids, glucose, triglycerides, total protein and total cholesterol in the blood and/or cerebrospinal fluid (CSF) of rats subjected to forced running exercise on a treadmill, and voluntary running exercise using a wheel, with a constant running distance of 440 m. Rats that performed no running and rats subjected to immobilization stress were used as controls. We observed a few significant changes in the levels of plasma glucose, triglycerides, total protein and total cholesterol in all groups, whereas plasma amino acid levels were significantly changed by exercise and stress, especially during the light period. The plasma levels of many amino acids were specifically increased by forced running; some were decreased by immobilization stress. Few amino acids showed similar changes in their levels as a result of voluntary running. In addition, there was a significant difference in the degree of amino acid imbalance between blood and CSF. These results provide the first information on changes in levels of amino acids in plasma and CSF resulting from forced and voluntary exercises.

  2. Amino acid sequence of Coprinus macrorhizus peroxidase and cDNA sequence encoding Coprinus cinereus peroxidase. A new family of fungal peroxidases.

    PubMed

    Baunsgaard, L; Dalbøge, H; Houen, G; Rasmussen, E M; Welinder, K G

    1993-04-01

    Sequence analysis and cDNA cloning of Coprinus peroxidase (CIP) were undertaken to expand the understanding of the relationships of structure, function and molecular genetics of the secretory heme peroxidases from fungi and plants. Amino acid sequencing of Coprinus macrorhizus peroxidase, and cDNA sequencing of Coprinus cinereus peroxidase showed that the mature proteins are identical in amino acid sequence, 343 residues in size and preceded by a 20-residue signal peptide. Their likely identity to peroxidase from Arthromyces ramosus is discussed. CIP has an 8-residue, glycine-rich N-terminal extension blocked with a pyroglutamate residue which is absent in other fungal peroxidases. The presence of pyroglutamate, formed by cyclization of glutamine, and the finding of a minor fraction of a variant form lacking the N-terminal residue, indicate that signal peptidase cleavage is followed by further enzymic processing. CIP is 40-45% identical in amino-acid sequence to 11 lignin peroxidases from four fungal species, and 42-43% identical to the two known Mn-peroxidases. Like these white-rot fungal peroxidases, CIP has an additional segment of approximately 40 residues at the C-terminus which is absent in plant peroxidases. Although CIP is much more similar to horseradish peroxidase (HRP C) in substrate specificity, specific activity and pH optimum than to white-rot fungal peroxidases, the sequences of CIP and HRP C showed only 18% identity. Hence, CIP qualifies as the first member of a new family of fungal peroxidases. The nine invariant residues present in all plant, fungal and bacterial heme peroxidases are also found in CIP. The present data support the hypothesis that only one chromosomal CIP gene exists. In contrast, a large number of secretory plant and fungal peroxidases are expressed from several peroxidase gene clusters. Analyses of three batches of CIP protein and of 49 CIP clones revealed the existence of only two highly similar alleles indicating less

  3. Cry1Aa binding to the cadherin receptor does not require conserved amino acid sequences in the domain II loops

    PubMed Central

    Fujii, Yuki; Tanaka, Shiho; Otsuki, Manami; Hoshino, Yasushi; Morimoto, Chinatsu; Kotani, Takuya; Harashima, Yuko; Endo, Haruka; Yoshizawa, Yasutaka; Sato, Ryoichi

    2012-01-01

    Characterizing the binding mechanism of Bt (Bacillus thuringiensis) Cry toxin to the cadherin receptor is indispensable to understanding the specific insecticidal activity of this toxin. To this end, we constructed 30 loop mutants by randomly inserting four serial amino acids covering all four receptor binding loops (loops α8, 1, 2 and 3) and analysed their binding affinities for Bombyx mori cadherin receptors via Biacore. High binding affinities were confirmed for all 30 mutants containing loop sequences that differed from those of wild-type. Insecticidal activities were confirmed in at least one mutant from loops 1, 2 and 3, suggesting that there is no critical amino acid sequence for the binding of the four loops to BtR175. When two mutations at different loops were integrated into one molecule, no reduction in binding affinity was observed compared with wild-type sequences. Based on these results, we discussed the binding mechanism of Cry toxin to cadherin protein. PMID:23145814

  4. The amino acid sequence of protein SCMK-B2C from the high-sulphur fraction of wool keratin.

    PubMed

    Elleman, T C

    1972-08-01

    1. The amino acid sequence of a protein from the reduced and carboxymethylated high-sulphur fraction of wool has been determined. 2. The sequence of this S-carboxymethylkerateine (SCMK-B2C) of 151 amino acid residues displays much internal homology and an unusual residue distribution. Thus a ten-residue sequence occurs four times near the N-terminus and five times near the C-terminus with few changes. These regions contain much of the molecule's half-cystine, whereas between them there is a region of 19 residues that are mainly small and devoid of cystine and proline. 3. Certain models of the wool fibre based on its mechanical and physical properties propose a matrix of small compact globular units linked together to form beaded chains. The unusual distribution of the component residues of protein SCMK-B2C suggests structures in the wool-fibre matrix compatible with certain features of the proposed models.

  5. Sequence-Specific Recognition of MicroRNAs and Other Short Nucleic Acids with Solid-State Nanopores.

    PubMed

    Zahid, Osama K; Wang, Fanny; Ruzicka, Jan A; Taylor, Ethan W; Hall, Adam R

    2016-03-01

    The detection and quantification of short nucleic acid sequences has many potential applications in studying biological processes, monitoring disease initiation and progression, and evaluating environmental systems, but is challenging by nature. We present here an assay based on the solid-state nanopore platform for the identification of specific sequences in solution. We demonstrate that hybridization of a target nucleic acid with a synthetic probe molecule enables discrimination between duplex and single-stranded molecules with high efficacy. Our approach requires limited preparation of samples and yields an unambiguous translocation event rate enhancement that can be used to determine the presence and abundance of a single sequence within a background of nontarget oligonucleotides. PMID:26824296

  6. Comparison of normalization methods for construction of large, multiplex amplicon pools for next-generation sequencing.

    PubMed

    Harris, J Kirk; Sahl, Jason W; Castoe, Todd A; Wagner, Brandie D; Pollock, David D; Spear, John R

    2010-06-01

    Constructing mixtures of tagged or bar-coded DNAs for sequencing is an important requirement for the efficient use of next-generation sequencers in applications where limited sequence data are required per sample. There are many applications in which next-generation sequencing can be used effectively to sequence large mixed samples; an example is the characterization of microbial communities where sequences per samples are adequate to address research questions. Thus, it is possible to examine hundreds to thousands of samples per run on massively parallel next-generation sequencers. However, the cost savings for efficient utilization of sequence capacity is realized only if the production and management costs associated with construction of multiplex pools are also scalable. One critical step in multiplex pool construction is the normalization process, whereby equimolar amounts of each amplicon are mixed. Here we compare three approaches (spectroscopy, size-restricted spectroscopy, and quantitative binding) for normalization of large, multiplex amplicon pools for performance and efficiency. We found that the quantitative binding approach was superior and represents an efficient scalable process for construction of very large, multiplex pools with hundreds and perhaps thousands of individual amplicons included. We demonstrate the increased sequence diversity identified with higher throughput. Massively parallel sequencing can dramatically accelerate microbial ecology studies by allowing appropriate replication of sequence acquisition to account for temporal and spatial variations. Further, population studies to examine genetic variation, which require even lower levels of sequencing, should be possible where thousands of individual bar-coded amplicons are examined in parallel. PMID:20418443
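
    The normalization step described above, mixing equimolar amounts of each amplicon, reduces to a small calculation once each amplicon's concentration and length are known. The sketch below is not the authors' protocol: it assumes double-stranded DNA with an average mass of 660 g/mol per base pair, and the amplicon names, concentrations and target amount are hypothetical.

```python
AVG_BP_MASS_G_PER_MOL = 660.0  # assumed average mass of one dsDNA base pair


def to_nanomolar(conc_ng_per_ul, length_bp):
    """Convert a dsDNA mass concentration (ng/uL) to molarity (nM)."""
    return conc_ng_per_ul * 1e6 / (AVG_BP_MASS_G_PER_MOL * length_bp)


def pooling_volumes(amplicons, target_fmol):
    """Volume (uL) of each amplicon that contributes `target_fmol` to the pool.

    `amplicons` maps a name to (concentration ng/uL, length bp).
    1 nM equals 1 fmol/uL, so volume = target_fmol / concentration_nM.
    """
    return {
        name: target_fmol / to_nanomolar(conc, length)
        for name, (conc, length) in amplicons.items()
    }


if __name__ == "__main__":
    # Hypothetical bar-coded amplicons.
    samples = {"sample_01": (6.6, 1000), "sample_02": (13.2, 1000)}
    for name, volume in pooling_volumes(samples, target_fmol=50.0).items():
        print(name, round(volume, 2), "uL")
```

    In practice very dilute amplicons would require impractically large volumes, which is one reason a laboratory might prefer a binding-based normalization such as the one the abstract favours.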

  7. A comparison of magnetic resonance imaging sequences in evaluating pathological changes in the canine spinal cord.

    PubMed

    Adamiak, Z; Pomianowski, A; Zhalniarovich, Y; Kwiatkowska, M; Jaskólska, M; Bocheńska, A

    2011-01-01

    This paper discusses 28 canine patients subjected to low-field magnetic resonance imaging (MRI) of the spinal cord for neurological indications. The authors describe and compare the MRI sequences used and indicate which are most effective in examinations that require a short scanning time. The most effective sequences supporting a quick diagnosis of spinal diseases in dogs were SE (spin echo), FSE (fast spin echo) and 3D HYCE (hybrid contrast enhancement). PMID:21957746

  8. Complete Genome Sequence of the Pokeweed Mosaic Virus (PkMV)-New Jersey Isolate and Its Comparison to PkMV-MD and PkMV-PA.

    PubMed

    Di, Rong

    2016-09-08

    Pokeweed mosaic virus (PkMV) causes systemic mosaic symptoms on pokeweed (Phytolacca americana L.) plants. The genome of the PkMV-NJ (New Jersey) isolate was cloned by PCR and sequenced by the Sanger sequencing method. The sequence comparison indicates that PkMV-NJ is more divergent from the other two sequenced isolates, PkMV-MD and PkMV-PA.

  9. Complete Genome Sequence of the Pokeweed Mosaic Virus (PkMV)-New Jersey Isolate and Its Comparison to PkMV-MD and PkMV-PA

    PubMed Central

    2016-01-01

    Pokeweed mosaic virus (PkMV) causes systemic mosaic symptoms on pokeweed (Phytolacca americana L.) plants. The genome of the PkMV-NJ (New Jersey) isolate was cloned by PCR and sequenced by the Sanger sequencing method. The sequence comparison indicates that PkMV-NJ is more divergent from the other two sequenced isolates, PkMV-MD and PkMV-PA. PMID:27609914

  10. Complete Genome Sequence of the Pokeweed Mosaic Virus (PkMV)-New Jersey Isolate and Its Comparison to PkMV-MD and PkMV-PA.

    PubMed

    Di, Rong

    2016-01-01

    Pokeweed mosaic virus (PkMV) causes systemic mosaic symptoms on pokeweed (Phytolacca americana L.) plants. The genome of the PkMV-NJ (New Jersey) isolate was cloned by PCR and sequenced by the Sanger sequencing method. The sequence comparison indicates that PkMV-NJ is more divergent from the other two sequenced isolates, PkMV-MD and PkMV-PA. PMID:27609914

  11. Application of MLST and pilus gene sequence comparisons to investigate the population structures of Actinomyces naeslundii and Actinomyces oris.

    PubMed

    Henssge, Uta; Do, Thuy; Gilbert, Steven C; Cox, Steven; Clark, Douglas; Wickström, Claes; Ligtenberg, A J M; Radford, David R; Beighton, David

    2011-01-01

    Actinomyces naeslundii and Actinomyces oris are members of the oral biofilm. Their identification using 16S rRNA sequencing is problematic and better achieved by comparison of metG partial sequences. A. oris is more abundant and more frequently isolated than A. naeslundii. We used a multi-locus sequence typing approach to investigate the genotypic diversity of these species and assigned A. naeslundii (n = 37) and A. oris (n = 68) isolates to 32 and 68 sequence types (ST), respectively. Neighbor-joining and ClonalFrame dendrograms derived from the concatenated partial sequences of 7 house-keeping genes identified at least 4 significant subclusters within A. oris and 3 within A. naeslundii. The strain collection we had investigated was an under-representation of the total population since at least 3 STs composed of single strains may represent discrete clusters of strains not well represented in the collection. The integrity of these sub-clusters was supported by the sequence analysis of fimP and fimA, genes coding for the type 1 and 2 fimbriae, respectively. An A. naeslundii subcluster was identified with both fimA and fimP genes and these strains were able to bind to MUC7 and statherin while all other A. naeslundii strains possessed only fimA and did not bind to statherin. An A. oris subcluster harboured a fimA gene similar to that of Actinomyces odontolyticus but had no detectable fimP and failed to bind significantly to either MUC7 or statherin. These data are evidence of extensive genotypic and phenotypic diversity within the species A. oris and A. naeslundii but the status of the subclusters identified here will require genome comparisons before their phylogenetic position can be unequivocally established.
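
    In multi-locus sequence typing, each isolate is reduced to a profile of allele numbers (one per house-keeping gene), and every distinct profile defines a sequence type. The sketch below illustrates only that bookkeeping step; the numbering scheme and the allele profiles are hypothetical and do not come from the study, which used 7 loci.

```python
def assign_sequence_types(allele_profiles):
    """Give each distinct allele profile an ST number in order of first appearance."""
    st_by_profile = {}
    assigned = []
    for profile in allele_profiles:
        key = tuple(profile)
        if key not in st_by_profile:
            st_by_profile[key] = len(st_by_profile) + 1
        assigned.append(st_by_profile[key])
    return assigned


if __name__ == "__main__":
    # Hypothetical profiles over three loci, for brevity.
    isolates = [(1, 4, 2), (1, 4, 3), (1, 4, 2), (2, 4, 2)]
    print(assign_sequence_types(isolates))
```

    When every isolate receives its own ST, as reported above for the 68 A. oris isolates, the collection contains no repeated profiles at all, which indicates high genotypic diversity.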

  12. Comparison of Sequencing Platforms for Single Nucleotide Variant Calls in a Human Sample

    PubMed Central

    Miller, Webb; Guillory, Joseph; Stinson, Jeremy; Seshagiri, Somasekar

    2013-01-01

    Next-generation sequencing platforms coupled with advanced bioinformatic tools enable re-sequencing of the human genome at high speed and with large cost savings. We compare sequencing platforms from Roche/454 (GS FLX), Illumina/HiSeq (HiSeq 2000), and Life Technologies/SOLiD (SOLiD 3 ECC) for their ability to identify single nucleotide substitutions in whole genome sequences from the same human sample. We report on significant GC-related bias observed in the data sequenced on Illumina and SOLiD platforms. The differences in the variant calls were investigated with regard to coverage and sequencing error. Some of the variants called by only one or two of the platforms were experimentally tested using mass spectrometry, a method that is independent of DNA sequencing. We establish several causes why variants remained unreported, specific to each platform. We report the indels called using the three sequencing technologies, and from the obtained results we conclude that sequencing human genomes with more than a single platform and multiple libraries is beneficial when a high level of accuracy is required. PMID:23405114

  13. Comparison of Exome and Genome Sequencing Technologies for the Complete Capture of Protein‐Coding Regions

    PubMed Central

    Lelieveld, Stefan H.; Spielmann, Malte; Mundlos, Stefan; Veltman, Joris A.

    2015-01-01

    For next‐generation sequencing technologies, sufficient base‐pair coverage is the foremost requirement for the reliable detection of genomic variants. We investigated whether whole‐genome sequencing (WGS) platforms offer improved coverage of coding regions compared with whole‐exome sequencing (WES) platforms, and compared single‐base coverage for a large set of exome and genome samples. We find that WES platforms have improved considerably in recent years, but at comparable sequencing depth, WGS outperforms WES in terms of covered coding regions. At higher sequencing depth (95x–160x), WES successfully captures 95% of the coding regions with a minimal coverage of 20x, compared with 98% for WGS at 87‐fold coverage. Three different assessments of sequence coverage bias showed consistent biases for WES but not for WGS. We found no clear differences between the technologies in their ability to achieve complete coverage of 2,759 clinically relevant genes. We show that WES performs comparably to WGS in terms of covered bases if sequenced at two to three times higher coverage. This does, however, come at the cost of substantially more sequencing biases in WES approaches. Our findings will guide laboratories to make an informed decision on which sequencing platform and coverage to choose. PMID:25973577
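
    The "fraction of coding regions covered at a minimal coverage of 20x" quoted above is a breadth-of-coverage figure. A minimal sketch of how such a figure is computed from per-base depths follows; the depth values are invented for illustration and `breadth_at_depth` is a hypothetical helper, not the authors' pipeline.

```python
from typing import Sequence


def breadth_at_depth(depths: Sequence[int], min_depth: int = 20) -> float:
    """Return the fraction of positions whose read depth is >= min_depth."""
    if not depths:
        raise ValueError("no positions supplied")
    covered = sum(1 for depth in depths if depth >= min_depth)
    return covered / len(depths)


# Toy per-base depths for ten coding positions (made-up numbers).
toy_depths = [35, 22, 20, 19, 0, 41, 60, 18, 25, 30]
fraction_covered = breadth_at_depth(toy_depths, min_depth=20)
```

    In practice the per-base depths would come from an alignment tool's depth report restricted to the coding intervals of interest; that step is outside this sketch.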

  14. Comparison of sequencing platforms for single nucleotide variant calls in a human sample.

    PubMed

    Ratan, Aakrosh; Miller, Webb; Guillory, Joseph; Stinson, Jeremy; Seshagiri, Somasekar; Schuster, Stephan C

    2013-01-01

    Next-generation sequencing platforms coupled with advanced bioinformatic tools enable re-sequencing of the human genome at high speed and with large cost savings. We compare sequencing platforms from Roche/454 (GS FLX), Illumina/HiSeq (HiSeq 2000), and Life Technologies/SOLiD (SOLiD 3 ECC) for their ability to identify single nucleotide substitutions in whole genome sequences from the same human sample. We report on significant GC-related bias observed in the data sequenced on Illumina and SOLiD platforms. The differences in the variant calls were investigated with regard to coverage and sequencing error. Some of the variants called by only one or two of the platforms were experimentally tested using mass spectrometry, a method that is independent of DNA sequencing. We establish several causes why variants remained unreported, specific to each platform. We report the indels called using the three sequencing technologies, and from the obtained results we conclude that sequencing human genomes with more than a single platform and multiple libraries is beneficial when a high level of accuracy is required.

  15. Sequence dependent N-terminal rearrangement and degradation of peptide nucleic acid (PNA) in aqueous solution

    NASA Technical Reports Server (NTRS)

    Eriksson, M.; Christensen, L.; Schmidt, J.; Haaima, G.; Orgel, L.; Nielsen, P. E.

    1998-01-01

    The stability of the PNA (peptide nucleic acid) thymine monomer {N-[2-(thymin-1-ylacetyl)]-N-(2-aminoaminoethyl)glycine} and those of various PNA oligomers (5-8-mers) have been measured at room temperature (20 °C) as a function of pH. The thymine monomer undergoes N-acyl transfer rearrangement with a half-life of 34 days at pH 11 as analyzed by 1H NMR; and two reactions, the N-acyl transfer and a sequential degradation, are found by HPLC analysis to occur at measurable rates for the oligomers at pH 9 or above. Dependent on the amino-terminal sequence, half-lives of 350 h to 163 days were found at pH 9. At pH 12 the half-lives ranged from 1.5 h to 21 days. The results are discussed in terms of PNA as a gene therapeutic drug as well as a possible prebiotic genetic material.
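
    To make the reported half-lives easier to interpret, the sketch below converts a half-life into a rate constant and a fraction remaining. It assumes simple first-order kinetics, which the abstract does not state explicitly; the function names are illustrative only.

```python
import math


def rate_constant(half_life: float) -> float:
    """First-order rate constant k = ln(2) / t_half (assumed kinetics)."""
    if half_life <= 0:
        raise ValueError("half-life must be positive")
    return math.log(2) / half_life


def fraction_remaining(elapsed: float, half_life: float) -> float:
    """Fraction of starting material left after `elapsed` time units."""
    return math.exp(-rate_constant(half_life) * elapsed)


# Monomer at pH 11: half-life of 34 days (value from the abstract).
k_monomer_per_day = rate_constant(34.0)
left_after_two_half_lives = fraction_remaining(68.0, 34.0)
```

    Under this assumption, two half-lives (68 days for the monomer at pH 11) leave one quarter of the starting material.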

  16. The cDNA-derived amino acid sequence of hemoglobin II from Lucina pectinata.

    PubMed

    Torres-Mercado, Elineth; Renta, Jessicca Y; Rodríguez, Yolanda; López-Garriga, Juan; Cadilla, Carmen L

    2003-11-01

    Hemoglobin II from the clam Lucina pectinata is an oxygen-reactive protein with a unique structural organization in the heme pocket involving residues Gln65 (E7), Tyr30 (B10), Phe44 (CD1), and Phe69 (E11). We employed the reverse transcriptase-polymerase chain reaction (RT-PCR) and methods to synthesize various cDNA(HbII). An initial 300-bp cDNA clone was amplified from total RNA by RT-PCR using degenerate oligonucleotides. Gene-specific primers derived from the HbII-partial cDNA sequence were used to obtain the 5' and 3' ends of the cDNA by RACE. The length of the HbII cDNA, estimated from overlapping clones, was approximately 2114 bases. Northern blot analysis revealed that the mRNA size of HbII agrees with the estimated size using cDNA data. The coding region of the full-length HbII cDNA codes for 151 amino acids. The calculated molecular weight of HbII, including the heme group and acetylated N-terminal residue, is 17,654.07 Da.

  17. Phylogeny of the Sphaerotilus-Leptothrix group inferred from morphological comparisons, genomic fingerprinting, and 16S ribosomal DNA sequence analyses.

    PubMed

    Siering, P L; Ghiorse, W C

    1996-01-01

    Phase-contrast light microscopy revealed that only one of eight cultivated strains belonging to the Sphaerotilus-Leptothrix group of sheathed bacteria actually produced a sheath in standard growth media. Two Sphaerotilus natans strains produced branched cells, but other morphological characteristics that were used to identify these bacteria were consistent with previously published descriptions. Genomic fingerprints, which were obtained by performing PCR amplification with primers corresponding to enterobacterial repetitive intergenic consensus sequences, were useful for distinguishing between the genera Sphaerotilus and Leptothrix, as well as among individual strains. The complete 16S ribosomal DNA (rDNA) sequences of two strains of "Leptothrix discophora" (strains SP-6 and SS-1) were determined. In addition, partial sequences (approximately 300 nucleotides) of one strain of Leptothrix cholodnii (strain LMG 7171), an unidentified Leptothrix strain (strain NC-1), and four strains of Sphaerotilus natans (strains ATCC 13338T [T = type strain], ATCC 15291, ATCC 29329, and ATCC 29330) were determined. We found that two of the S. natans strains (ATCC 15291 and ATCC 13338T), which differed in morphology and in their genomic fingerprints, had identical sequences in the 300-nucleotide region sequenced. Both parsimony and distance matrix methods were used to infer the evolutionary relationships of the eight strains in a comparison of the 16S rDNA sequences of these organisms with 16S rDNA sequences obtained from ribosomal sequence databases. All of the strains clustered in the Rubrivivax subdivision of the beta subclass of the Proteobacteria, which confirmed previously published conclusions concerning selected individual strains. Additional analyses revealed that all of the S. natans strains clustered in one closely related group, while the Leptothrix strains clustered in two separate lineages that were approximately equidistant from the S. natans cluster. This finding

  19. Comparison of peak shape in hydrophilic interaction chromatography using acidic salt buffers and simple acid solutions.

    PubMed

    Heaton, James C; Russell, Joseph J; Underwood, Tim; Boughtflower, Robert; McCalley, David V

    2014-06-20

    The retention and peak shape of neutral, basic and acidic solutes were studied on hydrophilic interaction chromatography (HILIC) stationary phases that showed both strong and weak ionic retention characteristics, using aqueous-acetonitrile mobile phases containing either formic acid (FA), ammonium formate (AF) or phosphoric acid (PA). The effect of organic solvent concentration on the results was also studied. Peak shape was good for neutrals under most mobile phase conditions. However, peak shapes for ionised solutes, particularly for basic compounds, were considerably worse in FA than in AF. Even neutral compounds showed deterioration in performance with FA when the mobile phase water concentration was reduced. The poor performance in FA cannot be entirely attributed to the negative impact of ionic retention on ionised silanols on the underlying silica base materials, as results using PA at lower pH (where their ionisation is suppressed) were inferior to those in AF. Besides the moderating influence of the salt cation on ionic retention, it is likely that salt buffers improve peak shape due to the increased ionic strength of the mobile phase and its impact on the formation of the water layer on the column surface.

  20. Genome sequence of the deep-sea gamma-proteobacterium Idiomarina loihiensis reveals amino acid fermentation as a source of carbon and energy.

    PubMed

    Hou, Shaobin; Saw, Jimmy H; Lee, Kit Shan; Freitas, Tracey A; Belisle, Claude; Kawarabayasi, Yutaka; Donachie, Stuart P; Pikina, Alla; Galperin, Michael Y; Koonin, Eugene V; Makarova, Kira S; Omelchenko, Marina V; Sorokin, Alexander; Wolf, Yuri I; Li, Qing X; Keum, Young Soo; Campbell, Sonia; Denery, Judith; Aizawa, Shin-Ichi; Shibata, Satoshi; Malahoff, Alexander; Alam, Maqsudul

    2004-12-28

    We report the complete genome sequence of the deep-sea gamma-proteobacterium, Idiomarina loihiensis, isolated recently from a hydrothermal vent at 1,300-m depth on the Loihi submarine volcano, Hawaii. The I. loihiensis genome comprises a single chromosome of 2,839,318 base pairs, encoding 2,640 proteins, four rRNA operons, and 56 tRNA genes. A comparison of I. loihiensis to the genomes of other gamma-proteobacteria reveals an abundance of amino acid transport and degradation enzymes, but a loss of sugar transport systems and certain enzymes of sugar metabolism. This finding suggests that I. loihiensis relies primarily on amino acid catabolism, rather than on sugar fermentation, for carbon and energy. Enzymes for biosynthesis of purines, pyrimidines, the majority of amino acids, and coenzymes are encoded in the genome, but biosynthetic pathways for Leu, Ile, Val, Thr, and Met are incomplete. Auxotrophy for Val and Thr was confirmed by in vivo experiments. The I. loihiensis genome contains a cluster of 32 genes encoding enzymes for exopolysaccharide and capsular polysaccharide synthesis. It also encodes diverse peptidases, a variety of peptide and amino acid uptake systems, and versatile signal transduction machinery. We propose that the source of amino acids for I. loihiensis growth is the proteinaceous particles present in the deep-sea hydrothermal vent waters. I. loihiensis would colonize these particles by using the secreted exopolysaccharide, digest these proteins, and metabolize the resulting peptides and amino acids. In summary, the I. loihiensis genome reveals an integrated mechanism of metabolic adaptation to the constantly changing deep-sea hydrothermal ecosystem. PMID:15596722

  1. Comparison of Sample Preparation Methods Used for the Next-Generation Sequencing of Mycobacterium tuberculosis.

    PubMed

    Tyler, Andrea D; Christianson, Sara; Knox, Natalie C; Mabon, Philip; Wolfe, Joyce; Van Domselaar, Gary; Graham, Morag R; Sharma, Meenu K

    2016-01-01

    The advent and widespread application of next-generation sequencing (NGS) technologies to the study of microbial genomes has led to a substantial increase in the number of studies in which whole genome sequencing (WGS) is applied to the analysis of microbial genomic epidemiology. However, microorganisms such as Mycobacterium tuberculosis (MTB) present unique problems for sequencing and downstream analysis based on their unique physiology and the composition of their genomes. In this study, we compare the quality of sequence data generated using the Nextera and TruSeq isolate preparation kits for library construction prior to Illumina sequencing-by-synthesis. Our results confirm that MTB NGS data quality is highly dependent on the purity of the DNA sample submitted for sequencing and its guanine-cytosine content (or GC-content). Our data additionally demonstrate that the choice of library preparation method plays an important role in mitigating downstream sequencing quality issues. Importantly for MTB, the Illumina TruSeq library preparation kit produces more uniform data quality than the Nextera XT method, regardless of the quality of the input DNA. Furthermore, specific genomic sequence motifs are commonly missed by the Nextera XT method, as are regions of especially high GC-content relative to the rest of the MTB genome. As coverage bias is highly undesirable, this study illustrates the importance of appropriate protocol selection when performing NGS studies in order to ensure that sound inferences can be made regarding mycobacterial genomes. PMID:26849565

  2. 5S ribosomal ribonucleic acid sequences in Bacteroides and Fusobacterium: evolutionary relationships within these genera and among eubacteria in general

    NASA Technical Reports Server (NTRS)

    Van den Eynde, H.; De Baere, R.; Shah, H. N.; Gharbia, S. E.; Fox, G. E.; Michalik, J.; Van de Peer, Y.; De Wachter, R.

    1989-01-01

    The 5S ribosomal ribonucleic acid (rRNA) sequences were determined for Bacteroides fragilis, Bacteroides thetaiotaomicron, Bacteroides capillosus, Bacteroides veroralis, Porphyromonas gingivalis, Anaerorhabdus furcosus, Fusobacterium nucleatum, Fusobacterium mortiferum, and Fusobacterium varium. A dendrogram constructed by a clustering algorithm from these sequences, which were aligned with all other hitherto known eubacterial 5S rRNA sequences, showed differences as well as similarities with respect to results derived from 16S rRNA analyses. In the 5S rRNA dendrogram, Bacteroides clustered together with Cytophaga and Fusobacterium, as in 16S rRNA analyses. Intraphylum relationships deduced from 5S rRNAs suggested that Bacteroides is specifically related to Cytophaga rather than to Fusobacterium, as was suggested by 16S rRNA analyses. Previous taxonomic considerations concerning the genus Bacteroides, based on biochemical and physiological data, were confirmed by the 5S rRNA sequence analysis.

  3. Molecular cloning, coding nucleotides and the deduced amino acid sequence of P-450BM-1 from Bacillus megaterium.

    PubMed

    He, J S; Ruettinger, R T; Liu, H M; Fulco, A J

    1989-12-22

    The gene encoding barbiturate-inducible cytochrome P-450BM-1 from Bacillus megaterium ATCC 14581 has been cloned and sequenced. An open reading frame in the 1.9 kb of cloned DNA correctly predicted the NH2-terminal sequence of P-450BM-1 previously determined by protein sequencing, and, in toto, predicted a polypeptide of 410 amino acid residues with an Mr of 47,439. The sequence is most similar to that of P-450CAM from Pseudomonas putida, but even there the similarity is less than 27%, so that P-450BM-1 clearly belongs to a new P-450 gene family, distinct especially from that of the P-450 domain of P-450BM-3, a barbiturate-inducible single polypeptide cytochrome P-450:NADPH-P-450 reductase from the same strain of B. megaterium (Ruettinger, R.T., Wen, L.-P. and Fulco, A.J. (1989) J. Biol. Chem. 264, 10987-10995). PMID:2597681

  4. Sequence analysis and comparison of ribosomal DNA from bovine Neospora to similar coccidial parasites.

    PubMed

    Marsh, A E; Barr, B C; Sverlow, K; Ho, M; Dubey, J P; Conrad, P A

    1995-08-01

    The nuclear small subunit ribosomal RNA (nss-rRNA) gene sequence of Neospora spp. isolated from cattle was analyzed and compared to the sequences from several closely related cyst-forming coccidial parasites. Double-stranded DNA sequencing of 5 bovine Neospora spp. isolates (BPA1-4), 2 Neospora caninum isolates (NC-1 and NC-3), and 3 Toxoplasma gondii isolates (RH, GT-1, CT-1) were performed and compared to each other, as well as to other sequences available in GenBank for the NC-1 isolate, Sarcocystis muris, and Cryptosporidium parvum. There were no nucleotide differences detected between the Neospora spp. isolates from cattle and dogs. Four nucleotide differences were consistently detected when sequences of Neospora spp. isolates were compared to those of the T. gondii isolates. These results indicate that Neospora spp. and T. gondii are closely related, but distinct, species.

  5. Comparison of winding-number sequences for symmetric and asymmetric oscillatory systems.

    PubMed

    Englisch, Volker; Parlitz, Ulrich; Lauterborn, Werner

    2015-08-01

    The bifurcation sets of symmetric and asymmetric periodically driven oscillators are investigated and classified by means of winding numbers. It is shown that periodic windows within chaotic regions are forming winding-number sequences on different levels. These sequences can be described by a simple formula that makes it possible to predict winding numbers at bifurcation points. Symmetric and asymmetric systems follow similar rules for the development of winding numbers within different sequences and these sequences can be combined into a single general rule. The role of the two distinct period-doubling cascades is investigated in the light of the winding-number sequences discovered. Examples are taken from the double-well Duffing oscillator, a special two-parameter Duffing oscillator, and a bubble oscillator. PMID:26382476

  6. Sample Prep, Workflow Automation and Nucleic Acid Fractionation for Next Generation Sequencing

    SciTech Connect

    Roskey, Mark

    2010-06-03

    Mark Roskey of Caliper LifeSciences discusses how the company's technologies fit into the next generation sequencing workflow on June 3, 2010 at the "Sequencing, Finishing, Analysis in the Future" meeting in Santa Fe, NM

  7. Sequence-specific DNA binding by long hairpin pyrrole-imidazole polyamides containing an 8-amino-3,6-dioxaoctanoic acid unit.

    PubMed

    Sawatani, Yoshito; Kashiwazaki, Gengo; Chandran, Anandhakumar; Asamitsu, Sefan; Guo, Chuanxin; Sato, Shinsuke; Hashiya, Kaori; Bando, Toshikazu; Sugiyama, Hiroshi

    2016-08-15

    With the aim of improving aqueous solubility, we designed and synthesized five N-methylpyrrole (Py)-N-methylimidazole (Im) polyamides capable of recognizing 9-bp sequences. Their DNA-binding affinities and sequence specificities were evaluated by SPR and Bind-n-Seq analyses. The design of polyamide 1 was based on a conventional model, with three consecutive Py or Im rings separated by a β-alanine to match the curvature and twist of long DNA helices. Polyamides 2 and 3 contained an 8-amino-3,6-dioxaoctanoic acid (AO) unit, which has previously only been used as a linker within linear Py-Im polyamides or between Py-Im hairpin motifs for tandem hairpin. It is demonstrated herein that AO also functions as a linker element that can extend to 2-bp in hairpin motifs. Notably, although the AO-containing unit can fail to bind the expected sequence, polyamide 4, which has two AO units facing each other in a hairpin form, successfully showed the expected motif and a KD value of 16 nM was recorded. Polyamide 5, containing a β-alanine-β-alanine unit instead of the AO of polyamide 2, was synthesized for comparison. The aqueous solubilities and nuclear localization of three of the polyamides were also examined. The results suggest the possibility of applying the AO unit in the core of Py-Im polyamide compounds. PMID:27301681

  8. Evolution of vertebrate IgM: complete amino acid sequence of the constant region of Ambystoma mexicanum mu chain deduced from cDNA sequence.

    PubMed

    Fellah, J S; Wiles, M V; Charlemagne, J; Schwager, J

    1992-10-01

    cDNA clones coding for the constant region of the Mexican axolotl (Ambystoma mexicanum) mu heavy immunoglobulin chain were selected from total spleen RNA, using a cDNA polymerase chain reaction technique. The specific 5'-end primer was an oligonucleotide homologous to the JH segment of Xenopus laevis mu chain. One of the clones, JHA/3, corresponded to the complete constant region of the axolotl mu chain, consisting of a 1362-nucleotide sequence coding for a polypeptide of 454 amino acids followed in the 3' direction by a 179-nucleotide untranslated region and a polyA+ tail. The axolotl C mu is divided into four typical domains (C mu 1-C mu 4) and can be aligned with the Xenopus C mu with an overall identity of 56% at the nucleotide level. Percent identities were particularly high for C mu 1 (59%) and C mu 4 (71%). The C-terminal 20-amino acid segment which constitutes the secretory part of the mu chain is strongly homologous to the equivalent sequences of chondrichthyans and of other tetrapods, including a conserved N-linked oligosaccharide, the penultimate cysteine and the C-terminal lysine. The four C mu domains of 13 vertebrate species ranging from chondrichthyans to mammals were aligned and compared at the amino acid level. The significant number of mu-specific residues which are conserved in each of the four C mu domains argues for a continuous line of evolution of the vertebrate mu chain. This notion was confirmed by the ability to reconstitute a consistent vertebrate evolution tree based on the phylogenic parsimony analysis of the C mu 4 sequences. PMID:1382992
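
    Two pieces of arithmetic in this abstract can be checked directly: 1362 nucleotides correspond to 454 codons, and "percent identity" is the share of aligned positions that match. The sketch below is illustrative only. The toy sequences are invented, and skipping gapped columns is one common convention, assumed here because the abstract does not say how gaps were treated.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of matching positions between two pre-aligned sequences.

    Columns where either sequence has a gap ('-') are skipped
    (an assumed convention; other denominators are also in use).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for base_a, base_b in zip(aligned_a, aligned_b):
        if base_a == "-" or base_b == "-":
            continue
        compared += 1
        if base_a == base_b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Coding length check: 1362 nucleotides / 3 nucleotides per codon.
codons = 1362 // 3

# Invented toy alignment: 9 ungapped columns, 8 of them identical.
toy_identity = percent_identity("ATGGCC-TAA", "ATGACCGTAA")
```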

  9. How is the serial order of a verbal sequence coded? Some comparisons between models.

    PubMed

    Hitch, Graham J; Fastame, Maria Chiara; Flude, Brenda

    2005-01-01

    Current models of verbal short-term memory (STM) propose various mechanisms for serial order. These include a gradient of activation over items, associations between items, and associations between items and their positions relative to the start or end of a sequence. We compared models using a variant of Hebb's procedure in which immediate serial recall of a sequence improves if the sequence is presented more than once. However, instead of repeating a complete sequence, we repeated different aspects of serial order information common to training lists and a subsequent test list. In Experiment 1, training lists repeated all the item-item pairings in the test list, with or without the position-item pairings in the test list. Substantial learning relative to a control condition was observed only when training lists repeated item-item pairs with position-item pairs, and position was defined relative to the start rather than end of a sequence. Experiment 2 attempted to analyse the basis of this learning effect further by repeating fragments of the test list during training, where fragments consisted of either isolated position-item pairings or clusters of both position-item and item-item pairings. Repetition of sequence fragments led to only weak learning effects. However, where learning was observed it was for specific position-item pairings. We conclude that positional cues play an important role in the coding of serial order in memory but that the information required to learn a sequence goes beyond position-item associations. We suggest that whereas STM for a novel sequence is based on positional cues, learning a sequence involves the development of some additional representation of the sequence as a whole.

  10. Comparison of phenolic acids profile and antioxidant potential of six varieties of spelt (Triticum spelta L.).

    PubMed

    Gawlik-Dziki, Urszula; Świeca, Michał; Dziki, Dariusz

    2012-05-01

    The phenolic acid profile and antioxidant activity of six diverse varieties of spelt are reported. Antioxidant activity was assessed using eight methods based on different mechanisms of action. The phenolic acid composition of spelt differed significantly between varieties and ranged from 506.6 to 1257.4 μg/g DW. Ferulic and sinapinic acids were the predominant phenolic acids found in spelt. Total ferulic acid content ranged from 144.2 to 691.5 μg/g DW. All analyzed spelt varieties possessed high antioxidant potential. Although bound phenolic acids possessed higher antioxidant activities, analysis of antioxidant potential and its relationship with phenolic acid content showed that free phenolics were more effective. Eight antioxidant methods were integrated to obtain a total antioxidant capacity index that may be used for comparison of total antioxidant capacity of spelt varieties. The total antioxidant potential of the spelt cultivars was ordered as follows: Ceralio > Spelt INZ ≈ Ostro > Oberkulmer Rotkorn > Schwabenspelz > Schwabenkorn.

  11. Balbiani ring DNA: sequence comparisons and evolutionary history of a family of hierarchically repetitive protein-coding genes.

    PubMed

    Pustell, J; Kafatos, F C; Wobus, U; Bäumlein, H

    1984-01-01

    All known types of Balbiani ring (BR) genes consist of multiple, tandemly arranged, ca. 180 to 300-bp repeat units that can be divided into a constant region and a subrepeat region. The latter region includes short tandem subrepeats (SRs). Comparison of all available BR sequences using computer methods has enabled us (a) to define more precisely the constant and subrepeat regions, (b) to infer the evolutionary relationships among the various types of BR repeats, (c) to derive a consensus approximation of an ancestral sequence from a small segment of which the highly diverse present-day SRs may have originated, and (d) to detect an underlying substructure in the constant region, evident in the consensus but not in the present-day sequences and possibly corresponding to an original 39-bp DNA segment from which the extant, giant BR sequences may have evolved. We discuss the processes of reduplication, diversification, and homogenization within the hierarchically repetitive BR sequences as examples of how a simple DNA element may evolve into a diverse family of large, protein-coding genes.

  12. Interlaboratory comparison of measurements of acid-volatile sulfide and simultaneously extracted nickel in spiked sediments

    USGS Publications Warehouse

    Brumbaugh, W.G.; Hammerschmidt, C.R.; Zanella, L.; Rogevich, E.; Salata, G.; Bolek, R.

    2011-01-01

    An interlaboratory comparison of acid-volatile sulfide (AVS) and simultaneously extracted nickel (SEM-Ni) measurements of sediments was conducted among five independent laboratories. Relative standard deviations for the seven test samples ranged from 5.6 to 71% (mean = 25%) for AVS and from 5.5 to 15% (mean = 10%) for SEM-Ni. These results are in stark contrast to a recently published study that indicated AVS and SEM analyses were highly variable among laboratories. © 2011 SETAC.
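
    The relative standard deviation (RSD) reported for each test sample is the standard deviation of the laboratories' results expressed as a percentage of their mean. The following sketch uses invented measurements for five laboratories and assumes the sample (n - 1) standard deviation; the abstract does not say which form was used.

```python
import statistics
from typing import Sequence


def relative_standard_deviation(values: Sequence[float]) -> float:
    """RSD in percent: 100 * sample standard deviation / mean."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("RSD is undefined when the mean is zero")
    return 100.0 * statistics.stdev(values) / mean


# Hypothetical results from five laboratories for one test sample.
toy_lab_results = [9.0, 10.0, 11.0, 10.0, 10.0]
toy_rsd = relative_standard_deviation(toy_lab_results)
```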

  13. Interlaboratory comparison of measurements of acid-volatile sulfide and simultaneously extracted nickel in spiked sediments

    USGS Publications Warehouse

    Brumbaugh, William G.; Hammerschmidt, Chad R.; Zanella, Luciana; Rogevich, Emily; Salata, Gregory; Bolek, Radoslaw

    2011-01-01

    An interlaboratory comparison of acid-volatile sulfide (AVS) and simultaneously extracted nickel (SEM-Ni) measurements of sediments was conducted among five independent laboratories. Relative standard deviations for the seven test samples ranged from 5.6 to 71% (mean = 25%) for AVS and from 5.5 to 15% (mean = 10%) for SEM-Ni. These results are in stark contrast to a recently published study that indicated AVS and SEM analyses were highly variable among laboratories.

  14. Gene structure and amino acid sequence of Latimeria chalumnae (coelacanth) myelin DM20: phylogenetic relation of the fish.

    PubMed

    Tohyama, Y; Kasama-Yoshida, H; Sakuma, M; Kobayashi, Y; Cao, Y; Hasegawa, M; Kojima, H; Tamai, Y; Tanokura, M; Kurihara, T

    1999-07-01

    The structure of Latimeria chalumnae (coelacanth) proteolipid protein/DM20 gene excluding exon 1 was determined, and the amino acid sequence of Latimeria DM20 corresponding to exons 2-7 was deduced. The nucleotide sequence of exon 3 suggests that only DM20 isoform is expressed in Latimeria. The structure of proteolipid protein/DM20 gene is well preserved among human, dog, mouse, and Latimeria. Southern blot analysis indicates that Latimeria DM20 gene is a single-copy gene. When the amino acid sequences of DM20 were compared among various species, Latimeria was more similar to tetrapods than other fishes including lungfish, confirming the previous finding by immunoreactivity (Waehneldt and Malotka 1989 J. Neurochem. 52:1941-1943). However, when phylogenetic trees were constructed from the DM20 sequences, lungfish was clearly the closest to tetrapods. Latimeria was situated outside of lungfish by the maximum likelihood method. The apparent similarity of Latimeria DM20 to tetrapod proteolipid protein/DM20 is explained by the slow amino acid substitution rate of Latimeria DM20.

  15. Application of Two-Part Statistics for Comparison of Sequence Variant Counts

    PubMed Central

    Wagner, Brandie D.; Robertson, Charles E.; Harris, J. Kirk

    2011-01-01

    Investigation of microbial communities, particularly human associated communities, is significantly enhanced by the vast amounts of sequence data produced by high throughput sequencing technologies. However, these data create high-dimensional complex data sets that consist of a large proportion of zeros, non-negative skewed counts, and frequently, limited number of samples. These features distinguish sequence data from other forms of high-dimensional data, and are not adequately addressed by statistical approaches in common use. Ultimately, medical studies may identify targeted interventions or treatments, but lack of analytic tools for feature selection and identification of taxa responsible for differences between groups, is hindering advancement. The objective of this paper is to examine the application of a two-part statistic to identify taxa that differ between two groups. The advantages of the two-part statistic over common statistical tests applied to sequence count datasets are discussed. Results from the t-test, the Wilcoxon test, and the two-part test are compared using sequence counts from microbial ecology studies in cystic fibrosis and from cenote samples. We show superior performance of the two-part statistic for analysis of sequence data. The improved performance in microbial ecology studies was independent of study type and sequence technology used. PMID:21629788
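
    The idea of a two-part statistic can be sketched as follows, under stated assumptions: one part compares the proportion of non-zero counts between the two groups, the other compares the non-zero counts themselves with a rank-sum test, and the two squared Z values are summed and referred to a chi-square distribution with 2 degrees of freedom. This is one common formulation and may not match the paper's exact implementation; the function names and the example counts are hypothetical.

```python
import math


def _proportion_z(k1, n1, k2, n2):
    """Z statistic comparing two proportions of non-zero samples (pooled variance)."""
    pooled = (k1 + k2) / (n1 + n2)
    var = pooled * (1 - pooled) * (1 / n1 + 1 / n2)
    if var == 0:
        return 0.0  # all zero or all non-zero: this part carries no information
    return (k1 / n1 - k2 / n2) / math.sqrt(var)


def _rank_sum_z(x, y):
    """Wilcoxon rank-sum Z (normal approximation, midranks, no tie correction)."""
    if not x or not y:
        return 0.0
    pooled = sorted(x + y)
    ranks = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        ranks[pooled[i]] = (i + 1 + j) / 2  # mean of the 1-based ranks i+1..j
        i = j
    n1, n2 = len(x), len(y)
    w = sum(ranks[v] for v in x)
    mean = n1 * (n1 + n2 + 1) / 2
    var = n1 * n2 * (n1 + n2 + 1) / 12
    return (w - mean) / math.sqrt(var)


def two_part_test(group_a, group_b):
    """Return (statistic, p_value) for a two-part comparison of count data."""
    nz_a = [v for v in group_a if v > 0]
    nz_b = [v for v in group_b if v > 0]
    z_binary = _proportion_z(len(nz_a), len(group_a), len(nz_b), len(group_b))
    z_continuous = _rank_sum_z(nz_a, nz_b)
    stat = z_binary ** 2 + z_continuous ** 2
    # Survival function of a chi-square with 2 df is exp(-x/2).
    # Simplification: 2 df is used even when one part is uninformative.
    return stat, math.exp(-stat / 2)


# Hypothetical sequence counts for one taxon in two groups of five samples.
stat, p = two_part_test([0, 0, 0, 1, 2], [0, 5, 6, 7, 8])
print(f"X2 = {stat:.3f}, p = {p:.4f}")
```

    Splitting the data this way lets a taxon register as different either because it is present in more samples or because it is more abundant where present, which a single t-test or Wilcoxon test on zero-heavy counts tends to blur.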

  16. A comparison of ARMS and DNA sequencing for mutation analysis in clinical biopsy samples

    PubMed Central

    2010-01-01

    Background We have compared mutation analysis by DNA sequencing and Amplification Refractory Mutation System™ (ARMS™) for their ability to detect mutations in clinical biopsy specimens. Methods We have evaluated five real-time ARMS assays: BRAF 1799T>A, [this includes V600E and V600K] and NRAS 182A>G [Q61R] and 181C>A [Q61K] in melanoma, EGFR 2573T>G [L858R], 2235-2249del15 [E746-A750del] in non-small-cell lung cancer, and compared the results to DNA sequencing of the mutation 'hot-spots' in these genes in formalin-fixed paraffin-embedded tumour (FF-PET) DNA. Results The ARMS assays maximised the number of samples that could be analysed when both the quality and quantity of DNA was low, and improved both the sensitivity and speed of analysis compared with sequencing. ARMS was more robust with fewer reaction failures compared with sequencing and was more sensitive as it was able to detect functional mutations that were not detected by DNA sequencing. DNA sequencing was able to detect a small number of lower frequency recurrent mutations across the exons screened that were not interrogated using the specific ARMS assays in these studies. Conclusions ARMS was more sensitive and robust at detecting defined somatic mutations than DNA sequencing on clinical samples where the predominant sample type was FF-PET. PMID:20925915

  17. Comparison with Magnetic Resonance Three-Dimensional Sequence for Lumbar Nerve Root with Intervertebral Foramen

    PubMed Central

    Takashima, Hiroyuki; Shishido, Hiroki; Yoshimoto, Mitsunori; Imamura, Rui; Akatsuka, Yoshihiro; Terashima, Yoshinori; Fujiwara, Hiroyoshi; Nagae, Masateru; Kubo, Toshikazu; Yamashita, Toshihiko

    2016-01-01

    Study Design Prospective study based on magnetic resonance (MR) imaging of the lumbar spinal root of the intervertebral foramen. Purpose This study aimed to compare MR three-dimensional (3D) sequences for the evaluation of the lumbar spinal root of the intervertebral foramen. Overview of Literature The diagnosis of spinal disorders by MR imaging is commonly performed using two-dimensional T1- and T2-weighted images, whereas 3D MR images can be used for acquiring further detailed data using thin slices with multi-planar reconstruction. Methods In twenty healthy volunteers, we investigated the contrast-to-noise ratio (CNR) of the lumbar spinal root of the intervertebral foramen with a 3D balanced sequence. The sequences used were the fast imaging employing steady state acquisition and the coherent oscillatory state acquisition for the manipulation of image contrast (COSMIC). COSMIC can be used with or without fat suppression (FS). We compared these sequences to determine the optimized visualization sequence for the lumbar spinal root of the intervertebral foramen. Results For the CNR between the nerve root and the peripheral tissue, there were no significant differences between the sequences at the entry of the foramen. There was a significant difference, and the highest CNR was seen with COSMIC-FS, for the intra- and extra-foramen. Conclusions In this study, the findings suggest that the COSMIC-FS sequences should be used for the internal or external foramen for spinal root disorders. PMID:26949459

  18. Amino acid sequence and structural properties of protein p12, an African swine fever virus attachment protein.

    PubMed Central

    Alcamí, A; Angulo, A; López-Otín, C; Muñoz, M; Freije, J M; Carrascosa, A L; Viñuela, E

    1992-01-01

    The gene encoding the African swine fever virus protein p12, which is involved in virus attachment to the host cell, has been mapped and sequenced in the genome of the Vero-adapted virus strain BA71V. The determination of the N-terminal amino acid sequence and the hybridization of oligonucleotide probes derived from this sequence to cloned restriction fragments allowed the mapping of the gene in fragment EcoRI-O, located in the central region of the viral genome. The DNA sequence of an EcoRI-XbaI fragment showed an open reading frame which is predicted to encode a polypeptide of 61 amino acids. The expression of this open reading frame in rabbit reticulocyte lysates and in Escherichia coli gave rise to a 12-kDa polypeptide that was immunoprecipitated with a monoclonal antibody specific for protein p12. The hydrophilicity profile indicated the existence of a stretch of 22 hydrophobic residues in the central part that may anchor the protein in the virus envelope. Three forms of the protein with apparent molecular masses of 17, 12, and 10 kDa in sodium dodecyl sulfate-polyacrylamide gel electrophoresis have been observed, depending on the presence of 2-mercaptoethanol and alkylation with 4-vinylpyridine, indicating that disulfide bonds are responsible for the multimerization of the protein. This result was in agreement with the existence of a cysteine-rich domain in the C-terminal region of the predicted amino acid sequence. The protein was synthesized at late times of infection, and no posttranslational modifications such as glycosylation, phosphorylation, or fatty acid acylation were detected. PMID:1583732

  19. Cloning and sequencing of the Bet v 1-homologous allergen Fra a 1 in strawberry (Fragaria ananassa) shows the presence of an intron and little variability in amino acid sequence.

    PubMed

    Musidlowska-Persson, Anna; Alm, Rikard; Emanuelsson, Cecilia

    2007-02-01

    The Fra a 1 allergen in strawberry (Fragaria ananassa) is homologous to the major birch pollen allergen Bet v 1, which has numerous isoforms differing in terms of amino acid sequence and immunological impact. To map the extent of sequence differences in the Fra a 1 allergen, PCR cloning and sequencing was applied. Several genomic sequences of Fra a 1, with a length of either 584, 591 or 594 nucleotides, were obtained from three different strawberry varieties. All contained one intron, with the length of either 101 or 110 nucleotides. By sequencing 30 different clones, eight different DNA sequences were obtained, giving in total five potential Fra a 1 protein isoforms, with high sequence similarity (>97% sequence identity) and only seven positions of amino acid variability, which were largely confirmed by mass spectrometry of expressed proteins. We conclude that the sequence variability in the strawberry allergen Fra a 1 is small, within and between strawberry varieties, and that multiple spots, previously detected in 2DE, are presumably due to differences in post-translational modification rather than differences in amino acid sequence. The most abundant Fra a 1 isoform sequence, recombinantly expressed in Escherichia coli after removal of the intron, was recognized by IgE from strawberry allergic patients. It cross-reacted with antibodies to Bet v 1 and the homologous apple allergen Mal d 1 (61 and 78% sequence identity, respectively), and will be used in further analyses of variation in Fra a 1-expression.

  20. Amino acid sequence of an intracellular, phosphate-starvation-induced ribonuclease from cultured tomato (Lycopersicon esculentum) cells.

    PubMed

    Löffler, A; Glund, K; Irie, M

    1993-06-15

    The primary structure of an intracellular ribonuclease (RNase LX) from cultured tomato (Lycopersicon esculentum) cells has been determined. Previous studies have shown that the protein is located inside the tomato cells but outside the vacuoles and that its synthesis is induced after depleting the cells for phosphate [Löffler, A., Abel, S., Jost, W., Beintema, J. J., Glund, K. (1992) Plant Physiol. 98, 1472-1478]. Sequence analysis was carried out by analysis of peptides isolated after enzymatic and chemical cleavage of the protein. RNase LX consists of 213 amino acids and has a molecular mass of 24300 Da and an isoelectric point of 5.33. The enzyme contains 10 half-cystines and there are no potential N-glycosylation sites detectable in the sequence. RNase LX, as compared to an extracellular tomato RNase (RNase LE), which is also phosphate regulated and the amino acid sequence of which was recently established [Jost, W., Bak, H., Glund, K., Terpstra, P. & Beintema, J. J. (1991) Eur. J. Biochem. 198, 1-6] has 60% of all amino acids identical and in identical positions, revealing a high degree of similarity between both proteins. In contrast to RNase LE, RNase LX has a C-terminal extension of nine amino acids. The C-terminal tetrapeptide HDEF may be a retention signal of the protein in the endoplasmic reticulum. PMID:8319673
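
    A figure such as "60% of all amino acids identical and in identical positions" corresponds to a percent-identity calculation over an alignment. The sketch below assumes the two sequences are already aligned to equal length with "-" as the gap character and counts every column except double-gap columns in the denominator; other denominator conventions exist. The example strings are hypothetical and are not the RNase LX or RNase LE sequences.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of alignment columns in which both sequences carry the same residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap and res_b == gap:
            continue  # column carries no information
        compared += 1
        if res_a == res_b:
            identical += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * identical / compared


# Hypothetical aligned fragments (one gap in the first sequence).
print(f"{percent_identity('HDEF-KL', 'HDQFAKL'):.1f}% identity")
```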

  1. Complete genome sequence of Enterococcus mundtii QU 25, an efficient L-(+)-lactic acid-producing bacterium.

    PubMed

    Shiwa, Yuh; Yanase, Hiroaki; Hirose, Yuu; Satomi, Shohei; Araya-Kojima, Tomoko; Watanabe, Satoru; Zendo, Takeshi; Chibazakura, Taku; Shimizu-Kadota, Mariko; Yoshikawa, Hirofumi; Sonomoto, Kenji

    2014-08-01

    Enterococcus mundtii QU 25, a non-dairy bacterial strain of ovine faecal origin, can ferment both cellobiose and xylose to produce l-lactic acid. The use of this strain is highly desirable for economical l-lactate production from renewable biomass substrates. Genome sequence determination is necessary for the genetic improvement of this strain. We report the complete genome sequence of strain QU 25, primarily determined using Pacific Biosciences sequencing technology. The E. mundtii QU 25 genome comprises a 3 022 186-bp single circular chromosome (GC content, 38.6%) and five circular plasmids: pQY182, pQY082, pQY039, pQY024, and pQY003. In all, 2900 protein-coding sequences, 63 tRNA genes, and 6 rRNA operons were predicted in the QU 25 chromosome. Plasmid pQY024 harbours genes for mundticin production. We found that strain QU 25 produces a bacteriocin, suggesting that the mundticin-encoding genes on plasmid pQY024 were functional. For lactic acid fermentation, two gene clusters were identified: one involved in the initial metabolism of xylose and uptake of pentose and the second containing genes for the pentose phosphate pathway and uptake of related sugars. This is the first complete genome sequence of an E. mundtii strain. The data provide insights into lactate production in this bacterium and its evolution among enterococci.

  2. Rational design of translational pausing without altering the amino acid sequence dramatically promotes soluble protein expression: a strategic demonstration.

    PubMed

    Chen, Wei; Jin, Jingjie; Gu, Wei; Wei, Bo; Lei, Yun; Xiong, Sheng; Zhang, Gong

    2014-11-10

    The production of many pharmaceutical and industrial proteins in prokaryotic hosts is hindered by the insolubility of industrial expression products resulting from misfolding. Even with a correct primary sequence, an improper translation elongation rate in a heterologous expression system is an important cause of misfolding. In silico analysis revealed that most of the endogenous Escherichia coli genes display translational pausing sites that promote correct folding, and almost one-fifth of the genes have pausing sites at the 3'-termini of their coding sequence. Therefore, we established a novel strategy to efficiently promote the expression of soluble and active proteins without altering the amino acid sequence or expression conditions. This strategy uses the rational design of translational pausing based on structural information solely through synonymous substitutions, i.e. with no change to the amino acid sequence. We demonstrated this strategy on a promising antiviral candidate, Cyanovirin-N (CVN), which could not be efficiently expressed in any previously reported system. By introducing silent mutations, we increased the soluble expression level in E. coli by 2000-fold without altering the CVN protein sequence, and the specific activity was slightly higher for the optimized CVN than for the wild-type variant. This strategy introduces new possibilities for the production of bioactive recombinant proteins.
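
    The notion of a synonymous (silent) substitution can be illustrated with a short sketch: two coding sequences that differ at the nucleotide level but translate to the same protein under the standard genetic code. The sequences below are hypothetical and unrelated to CVN, and the sketch says nothing about which synonymous codons actually induce translational pausing; it only checks that the protein is unchanged.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (third base varies fastest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate a coding DNA sequence codon by codon ('*' marks a stop codon)."""
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def synonymous_codons(codon):
    """List the other codons that encode the same amino acid as `codon`."""
    aa = CODON_TABLE[codon]
    return sorted(c for c, a in CODON_TABLE.items() if a == aa and c != codon)


original = "ATGCTGAAAGGC"  # hypothetical wild-type fragment
recoded = "ATGTTAAAGGGT"   # same fragment with three synonymous substitutions

print(translate(original), translate(recoded))
print(synonymous_codons("CTG"))
```

    A recoding tool built on this idea would choose among the synonymous codons for each position according to a pausing model, then confirm with `translate` that the protein sequence is identical.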

  3. Gastropod arginine kinases from Cellana grata and Aplysia kurodai. Isolation and cDNA-derived amino acid sequences.

    PubMed

    Suzuki, T; Inoue, N; Higashi, T; Mizobuchi, R; Sugimura, N; Yokouchi, K; Furukohri, T

    2000-12-01

    Arginine kinase (AK) was isolated from the radular muscle of the gastropod molluscs Cellana grata (subclass Prosobranchia) and Aplysia kurodai (subclass Opisthobranchia), respectively, by ammonium sulfate fractionation, Sephadex G-75 gel filtration and DEAE-ion exchange chromatography. The denatured relative molecular mass values were estimated to be 40 kDa by sodium dodecyl sulfate-polyacrylamide gel electrophoresis. The isolated enzyme from Aplysia gave a Km value of 0.6 mM for arginine and a Vmax value of 13 micromole Pi min(-1) mg protein(-1) for the forward reaction. These values are comparable to other molluscan AKs. The cDNAs encoding Cellana and Aplysia AKs were amplified by polymerase chain reaction, and the nucleotide sequences of 1,608 and 1,239 bp, respectively, were determined. The open reading frame for Cellana AK is 1044 nucleotides in length and encodes a protein with 347 amino acid residues, and that for A. kurodai is 1077 nucleotides and 354 residues. The cDNA-derived amino acid sequences were validated by chemical sequencing of internal lysyl endopeptidase peptides. The amino acid sequences of Cellana and Aplysia AKs showed the highest percent identity (66-73%) with those of the abalone Nordotis and turbanshell Battilus belonging to the same class Gastropoda. These AK sequences still have a strong homology (63-71%) with that of the chiton Liolophura (class Polyplacophora), which is believed to be one of the most primitive molluscs. On the other hand, these AK sequences are less homologous (55-57%) with that of the clam Pseudocardium (class Bivalvia), suggesting that the biological position of the class Polyplacophora should be reconsidered.

  4. Isolation and a partial amino acid sequence of insulin from the islet tissue of cod (Gadus callarias)

    PubMed Central

    Grant, P. T.; Reid, K. B. M.

    1968-01-01

    1. Insulin has been isolated by gel filtration and ion-exchange chromatography from extracts of the discrete islet tissue of cod. The final preparation yielded a single band on electrophoresis at two pH values. The biological potency was 11·5 international units/mg. in mouse-convulsion and other assay procedures. 2. Glycine and methionine were shown to be the N-terminal amino acids of the A and B chains respectively. An estimate of the molecular weight together with amino acid analyses indicated that cod insulin, like the bovine hormone, consists of 51 amino acid residues. In contrast, the amino acid composition differs markedly from bovine insulin. 3. Oxidation of insulin with performic acid yielded the A and B peptide chains, which were separated by ion-exchange chromatography. Sequence studies on smaller peptides isolated from enzymic digests or from dilute acetic acid hydrolysates of the two chains have established the sequential order of 14 of the 21 amino acid residues of the A chain and 25 of the 30 amino acid residues of the B chain. PMID:4866431

  5. Massively parallel rRNA gene sequencing exacerbates the potential for biased community diversity comparisons due to variable library sizes

    SciTech Connect

    Gihring, Thomas; Green, Stefan; Schadt, Christopher Warren

    2011-01-01

    Technologies for massively parallel sequencing are revolutionizing microbial ecology and are vastly increasing the scale of ribosomal RNA (rRNA) gene studies. Although pyrosequencing has increased the breadth and depth of possible rRNA gene sampling, one drawback is that the number of reads obtained per sample is difficult to control. Pyrosequencing libraries typically vary widely in the number of sequences per sample, even within individual studies, and there is a need to revisit the behaviour of richness estimators and diversity indices with variable gene sequence library sizes. Multiple reports and review papers have demonstrated the bias in non-parametric richness estimators (e.g. Chao1 and ACE) and diversity indices when using clone libraries. However, we found that biased community comparisons are accumulating in the literature. Here we demonstrate the effects of sample size on Chao1, ACE, CatchAll, Shannon, Chao-Shen and Simpson's estimations specifically using pyrosequencing libraries. The need to equalize the number of reads being compared across libraries is reiterated, and investigators are directed towards available tools for making unbiased diversity comparisons.
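
    Equalizing read numbers before comparing richness can be sketched as random subsampling (rarefaction) to a common depth followed by a richness estimate. The sketch below uses the bias-corrected form of Chao1, S_obs + F1(F1 - 1) / (2(F2 + 1)), where F1 and F2 are the numbers of taxa seen exactly once and twice. The function names, the fixed random seed, and the toy read lists are assumptions for illustration and are not taken from the tools the abstract refers to.

```python
import random
from collections import Counter


def rarefy(reads, depth, seed=0):
    """Randomly subsample `depth` reads without replacement (seeded for repeatability)."""
    if depth > len(reads):
        raise ValueError("depth exceeds library size")
    return random.Random(seed).sample(reads, depth)


def chao1(reads):
    """Bias-corrected Chao1 richness estimate from a list of per-read taxon labels."""
    taxon_counts = Counter(reads)
    abundance_classes = Counter(taxon_counts.values())
    singletons = abundance_classes[1]
    doubletons = abundance_classes[2]
    return len(taxon_counts) + singletons * (singletons - 1) / (2 * (doubletons + 1))


# Hypothetical libraries of very different sizes (one taxon label per read).
small_library = ["a"] * 5 + ["b"] * 2 + ["c"] * 2 + ["d", "e", "f"]
large_library = small_library * 10 + ["g", "h", "i", "j"]

depth = min(len(small_library), len(large_library))
for name, library in (("small", small_library), ("large", large_library)):
    print(name, chao1(rarefy(library, depth)))
```

    Comparing the estimates at the common depth, rather than on the full libraries, avoids attributing a difference in sequencing effort to a difference in community richness.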

  6. Heat*seq: an interactive web tool for high-throughput sequencing experiment comparison with public data

    PubMed Central

    Devailly, Guillaume; Mantsoki, Anna; Joshi, Anagha

    2016-01-01

    Summary: Better protocols and decreasing costs have made high-throughput sequencing experiments now accessible even to small experimental laboratories. However, comparing one or a few experiments generated by an individual lab to the vast amount of relevant data freely available in the public domain might be limited due to lack of bioinformatics expertise. Though several tools, including genome browsers, allow such comparison at a single gene level, they do not provide a genome-wide view. We developed Heat*seq, a web-tool that allows genome scale comparison of high throughput experiments (chromatin immuno-precipitation followed by sequencing, RNA-sequencing and Cap Analysis of Gene Expression) provided by a user, to the data in the public domain. Heat*seq currently contains over 12 000 experiments across diverse tissues and cell types in human, mouse and drosophila. Heat*seq displays interactive correlation heatmaps, with an ability to dynamically subset datasets to contextualize user experiments. High quality figures and tables are produced and can be downloaded in multiple formats. Availability and Implementation: Web application: http://www.heatstarseq.roslin.ed.ac.uk/. Source code: https://github.com/gdevailly. Contact: Guillaume.Devailly@roslin.ed.ac.uk or Anagha.Joshi@roslin.ed.ac.uk Supplementary information: Supplementary data are available at Bioinformatics online. PMID:27378302

  7. Identification, characterization, and complete amino acid sequence of the conjugation-inducing glycoprotein (blepharmone) in the ciliate Blepharisma japonicum

    PubMed Central

    Sugiura, Mayumi; Harumoto, Terue

    2001-01-01

    Conjugation in Blepharisma japonicum is induced by interaction between complementary mating-types I and II, which excrete blepharmone (gamone 1) and blepharismone (gamone 2), respectively. Gamone 1 transforms type II cells such that they can unite, and gamone 2 similarly transforms type I cells. Moreover, each gamone promotes the production of the other gamone. Gamone 2 has been identified as calcium-3-(2′-formylamino-5′-hydroxy-benzoyl) lactate and has been synthesized chemically. Gamone 1 was isolated and characterized as a glycoprotein of 20–30 kDa containing 175 amino acids and 6 sugars. However, the amino acid sequence and arrangement of sugars in this gamone are still unknown. To determine partial amino acid sequences of gamone 1, we established a method of isolation based on the finding that this glycoprotein can be concentrated by a Con A affinity column. Gamone 1 is extremely unstable and loses its biological activity once adsorbed to any of the columns that we tested. By using a Con A affinity column and native PAGE, we detected a 30-kDa protein corresponding to gamone 1 activity and determined the partial amino acid sequences of the four peptides. To isolate gamone 1 cDNA, we isolated mRNA from mating-type I cells stimulated by synthetic gamone 2 and then performed rapid amplification of cDNA ends procedures by using gene-specific primers and cloned cDNA of gamone 1. The cDNA sequence contains an ORF of 305 amino acids and codes a possibly novel protein. We also estimated the arrangement of sugars by comparing the affinity to various lectin columns. PMID:11724922

  8. Identification, characterization, and complete amino acid sequence of the conjugation-inducing glycoprotein (blepharmone) in the ciliate Blepharisma japonicum.

    PubMed

    Sugiura, M; Harumoto, T

    2001-12-01

    Conjugation in Blepharisma japonicum is induced by interaction between complementary mating-types I and II, which excrete blepharmone (gamone 1) and blepharismone (gamone 2), respectively. Gamone 1 transforms type II cells such that they can unite, and gamone 2 similarly transforms type I cells. Moreover, each gamone promotes the production of the other gamone. Gamone 2 has been identified as calcium-3-(2'-formylamino-5'-hydroxy-benzoyl) lactate and has been synthesized chemically. Gamone 1 was isolated and characterized as a glycoprotein of 20-30 kDa containing 175 amino acids and 6 sugars. However, the amino acid sequence and arrangement of sugars in this gamone are still unknown. To determine partial amino acid sequences of gamone 1, we established a method of isolation based on the finding that this glycoprotein can be concentrated by a Con A affinity column. Gamone 1 is extremely unstable and loses its biological activity once adsorbed to any of the columns that we tested. By using a Con A affinity column and native PAGE, we detected a 30-kDa protein corresponding to gamone 1 activity and determined the partial amino acid sequences of the four peptides. To isolate gamone 1 cDNA, we isolated mRNA from mating-type I cells stimulated by synthetic gamone 2 and then performed rapid amplification of cDNA ends procedures by using gene-specific primers and cloned cDNA of gamone 1. The cDNA sequence contains an ORF of 305 amino acids and codes a possibly novel protein. We also estimated the arrangement of sugars by comparing the affinity to various lectin columns.

  9. Nucleotide and Predicted Amino Acid Sequence-Based Analysis of the Avian Metapneumovirus Type C Cell Attachment Glycoprotein Gene: Phylogenetic Analysis and Molecular Epidemiology of U.S. Pneumoviruses

    PubMed Central

    Alvarez, Rene; Lwamba, Humphrey M.; Kapczynski, Darrell R.; Njenga, M. Kariuki; Seal, Bruce S.

    2003-01-01

    A serologically distinct avian metapneumovirus (aMPV) was isolated in the United States after an outbreak of turkey rhinotracheitis (TRT) in February 1997. The newly recognized U.S. virus was subsequently demonstrated to be genetically distinct from European subtypes and was designated aMPV serotype C (aMPV/C). We have determined the nucleotide sequence of the gene encoding the cell attachment glycoprotein (G) of aMPV/C (Colorado strain and three Minnesota isolates) and predicted amino acid sequence by sequencing cloned cDNAs synthesized from intracellular RNA of aMPV/C-infected cells. The nucleotide sequence comprised 1,321 nucleotides with only one predicted open reading frame encoding a protein of 435 amino acids, with a predicted Mr of 48,840. The structural characteristics of the predicted G protein of aMPV/C were similar to those of the human respiratory syncytial virus (hRSV) attachment G protein, including two mucin-like regions (heparin-binding domains) flanking both sides of a CX3C chemokine motif present in a conserved hydrophobic pocket. Comparison of the deduced G-protein amino acid sequence of aMPV/C with those of aMPV serotypes A, B, and D, as well as hRSV revealed overall predicted amino acid sequence identities ranging from 4 to 16.5%, suggesting a distant relationship. However, G-protein sequence identities ranged from 72 to 97% when aMPV/C was compared to other members within the aMPV/C subtype or 21% for the recently identified human MPV (hMPV) G protein. Ratios of nonsynonymous to synonymous nucleotide changes were greater than one in the G gene when comparing the more recent Minnesota isolates to the original Colorado isolate. Epidemiologically, this indicates positive selection among U.S. isolates since the first outbreak of TRT in the United States. PMID:12682171

  10. A test of mink microsatellite markers in the ferret: amplification and sequence comparisons.

    PubMed

    Anistoroaei, R; Christensen, K

    2006-12-01

    Short tandem repeats are a source of highly polymorphic markers in mammalian genomes. Genetic variation at these hypervariable loci is extensively used for linkage analysis and to identify individuals, and is very useful for interpopulation and interspecies studies. Fifty-nine microsatellite markers from American mink were tested in the ferret, under the same conditions as for the mink. Of the 59, 43 (73.5%) amplified a ferret sequence; 5 amplification products differed in size from the respective mink sequences. Ten amplified fragments from ferret were sequenced. The sequences that were identical in size to those from mink displayed a high degree of conservation, with some differences at the repeat motif sites. These results could aid cross-utilization of markers between these two species. PMID:17362355

  11. Comparison of Two Serologically Distinct Ribonucleic Acid Bacteriophages II. Properties of the Nucleic Acids and Coat Proteins

    PubMed Central

    Overby, L. R.; Barlow, G. H.; Doi, R. H.; Jacob, Monique; Spiegelman, S.

    1966-01-01

    Overby, L. R. (University of Illinois, Urbana), G. H. Barlow, R. H. Doi, Monique Jacob, and S. Spiegelman. Comparison of two serologically distinct ribonucleic acid bacteriophages. II. Properties of the nucleic acids and coat proteins. J. Bacteriol. 92:739–745. 1966.—The ribonucleic acid (RNA) molecules and coat proteins of two RNA coliphages, MS-2 and Qβ, have been characterized. MS-2 RNA shows an S20,w of 25.8 and a molecular weight by light scattering of 10^6. The corresponding parameters for Qβ-RNA were 28.9 and 0.9 × 10^6. A difference in base composition was reflected in the adenine-uracil ratio, which was 0.95 for MS-2 and 0.75 for Qβ. The two RNA preparations are readily separated by chromatography on columns of methylated albumin. Both gave identical buoyant densities in cesium sulfate of 1.64 g/ml. The coat protein subunits were of similar molecular weights: 15,500 (Qβ) and 14,000 (MS-2). They differed, however, in that the Qβ-protein lacked tryptophan and histidine, whereas the MS-2 protein lacked only histidine. PMID:5922545

  12. Comparison of sterols and fatty acids in two species of Ganoderma

    PubMed Central

    2012-01-01

    Background Two species of Ganoderma, G. sinense and G. lucidum, are used as Lingzhi in China. However, the contents of triterpenoids and polysaccharides, the main active compounds, are significantly different, though the extracts of both G. lucidum and G. sinense have an antitumoral proliferation effect. It is suspected that other compounds contribute to their antitumoral activity. Sterols and fatty acids have obvious bioactivity. Therefore, determination and comparison of sterols and fatty acids is helpful to elucidate the active components of Lingzhi. Results Ergosterol, a specific component of the fungal cell membrane, was rich in G. lucidum and G. sinense. But its content in G. lucidum (median content 705.0 μg·g-1, range 189.1-1453.3 μg·g-1, n = 19) was much higher than that in G. sinense (median content 80.1 μg·g-1, range 16.0-409.8 μg·g-1, n = 13). Hierarchical clustering analysis based on the content of ergosterol showed that 32 tested samples of Ganoderma were grouped into two main clusters, G. lucidum and G. sinense. Hierarchical clustering analysis based on the contents of ten fatty acids showed that the two species of Ganoderma had no significant difference, though two groups were also obtained. The similarity of the two species of Ganoderma in fatty acids may be related to their antitumoral proliferation effect. Conclusions The content of ergosterol is much higher in G. lucidum than in G. sinense. Palmitic acid, linoleic acid, oleic acid, and stearic acid are the main fatty acids in Ganoderma, and their content had no significant difference between G. lucidum and G. sinense, which may contribute to their antitumoral proliferation effect. PMID:22293530

  13. Evaluating the genomic and sequence integrity of human ES cell lines; comparison to normal genomes.

    PubMed

    Funk, Walter D; Labat, Ivan; Sampathkumar, Janani; Gourraud, Pierre-Antoine; Oksenberg, Jorge R; Rosler, Elen; Steiger, Daniel; Sheibani, Nadia; Caillier, Stacy; Stache-Crain, Birgit; Johnson, Julie A; Meisner, Lorraine; Lacher, Markus D; Chapman, Karen B; Park, Myung Jin; Shin, Kyoung-Jin; Drmanac, Rade; West, Michael D

    2012-03-01

    Copy number variation (CNV) is a common chromosomal alteration that can occur during in vitro cultivation of human cells and can be accompanied by the accumulation of mutations in coding region sequences. We describe here a systematic application of current molecular technologies to provide a detailed understanding of genomic and sequence profiles of human embryonic stem cell (hESC) lines that were derived under GMP-compliant conditions. We first examined the overall chromosomal integrity using cytogenetic techniques to determine chromosome count, and to detect the presence of cytogenetically aberrant cells in the culture (mosaicism). Assays of copy number variation, using both microarray and sequence-based analyses, provide a detailed view of genomic variation in these lines and show that in early passage cultures of these lines, the size range and distribution of CNVs are entirely consistent with those seen in the genomes of normal individuals. Similarly, genome sequencing shows variation within these lines that is completely within the range seen in normal genomes. Important gene classes, such as tumor suppressors and genetic disease genes, do not display overtly disruptive mutations that could affect the overall safety of cell-based therapeutics. Complete sequence also allows the analysis of important transplantation antigens, such as ABO and HLA types. The combined application of cytogenetic and molecular technologies provides a detailed understanding of genomic and sequence profiles of GMP produced ES lines for potential use as therapeutic agents. PMID:22265736

  14. AGenDA: gene prediction by cross-species sequence comparison.

    PubMed

    Taher, Leila; Rinner, Oliver; Garg, Saurabh; Sczyrba, Alexander; Morgenstern, Burkhard

    2004-07-01

    Automatic gene prediction is one of the major challenges in computational sequence analysis. Traditional approaches to gene finding rely on statistical models derived from previously known genes. By contrast, a new class of comparative methods relies on comparing genomic sequences from evolutionarily related organisms to each other. These methods are based on the concept of phylogenetic footprinting: they exploit the fact that functionally important regions in genomic sequences are usually more conserved than non-functional regions. We created a WWW-based software program for homology-based gene prediction at BiBiServ (Bielefeld Bioinformatics Server). Our tool takes pairs of evolutionarily related genomic sequences as input data, e.g. from human and mouse. The server runs CHAOS and DIALIGN to create an alignment of the input sequences and subsequently searches for conserved splicing signals and start/stop codons near regions of local sequence conservation. Genes are predicted based on local homology information and splice signals. The server returns predicted genes together with a graphical representation of the underlying alignment. The program is available at http://bibiserv.TechFak.Uni-Bielefeld.DE/agenda/.
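
    The signal search mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the textbook consensus signals (ATG start codon, TAA/TAG/TGA stop codons, GT donor and AG acceptor dinucleotides) and a hypothetical helper named candidate_signals; it is not the AGenDA implementation, and it ignores the alignment and conservation steps that CHAOS and DIALIGN provide.

```python
import re

# Textbook consensus signals; assumed for illustration, not taken from AGenDA itself.
START_CODON = "ATG"
STOP_CODONS = ("TAA", "TAG", "TGA")
DONOR = "GT"      # first two bases of a canonical intron
ACCEPTOR = "AG"   # last two bases of a canonical intron


def find_all(seq, motif):
    """Return 0-based start positions of every (possibly overlapping) motif occurrence."""
    return [m.start() for m in re.finditer(f"(?={re.escape(motif)})", seq)]


def candidate_signals(seq):
    """Collect positions of candidate start/stop codons and splice signals (hypothetical helper)."""
    seq = seq.upper()
    stops = sorted(p for codon in STOP_CODONS for p in find_all(seq, codon))
    return {
        "start": find_all(seq, START_CODON),
        "stop": stops,
        "donor": find_all(seq, DONOR),
        "acceptor": find_all(seq, ACCEPTOR),
    }


# Toy sequence, invented for this example.
signals = candidate_signals("ccATGgcGTaagtttcAGgtTAAcc")
print(signals)
```

    Most hits from such a scan are spurious, which is presumably why a comparative tool restricts the search to the neighbourhood of locally conserved regions.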

  15. Sequence of the cDNA and 5'-flanking region for human acid alpha-glucosidase, detection of an intron in the 5' untranslated leader sequence, definition of 18-bp polymorphisms, and differences with previous cDNA and amino acid sequences.

    PubMed

    Martiniuk, F; Mehler, M; Tzall, S; Meredith, G; Hirschhorn, R

    1990-03-01

    Acid maltase or acid alpha-glucosidase (GAA) is a lysosomal enzyme that hydrolyzes glycogen to glucose and is deficient in glycogen storage disease type II. Previously, we isolated a partial cDNA (1.9 kb) for human GAA; we have now used this cDNA to isolate and determine sequence in longer cDNAs from four additional independent cDNA libraries. Primer extension studies indicated that the mRNA extended approximately 200 bp 5' of the cDNA sequence obtained. Therefore, we isolated a genomic fragment containing 5' cDNA sequences that overlapped the previous cDNA sequence and extended an additional 24 bp to an initiation codon within a Kozak consensus sequence. The sequence of the genomic clone revealed an intron-exon junction 32 bp 5' to the ATG, indicating that the 5' leader sequence was interrupted by an intron. The remaining 186 bp of 5' untranslated sequence was identified approximately 3 kb upstream. The promoter region upstream from the start site of transcription was GC rich and contained areas of homology to Sp1 binding sites but no identifiable CAAT or TATA box. The combined data gave a nucleotide sequence of 2,856 bp for the coding region from the ATG to a stop codon, predicting a protein of 952 amino acids. The 3' untranslated region contained 555 bp with a polyadenylation signal at 3,385 bp followed by 16 bp prior to a poly(A) tail. This sequence of the GAA coding region differs from that reported by Hoefsloot et al. (1988) in three areas that change a total of 42 amino acids. Direct determination of the amino acid sequence in one of these areas confirmed the nucleotide sequence reported here but also disagreed with the directly determined amino acid sequence reported by Hoefsloot et al. (1988). At two other areas, changes in base pairs predicted new restriction sites that were identified in cDNAs from several independent libraries. The amino acid changes in all three areas increased the homology to rabbit-human isomaltase. Therefore, we believe that our

  16. Comparison of sodium acid sulfate to citric acid to inhibit browning of fresh-cut potatoes.

    PubMed

    Calder, Beth L; Kash, Emily A; Davis-Dentici, Katherine; Bushway, Alfred A

    2011-04-01

    Sodium acid sulfate (SAS) dip treatments were evaluated against a distilled water control and citric acid (CA) to compare their effectiveness in reducing enzymatic browning of raw, French-fry cut potatoes. Two separate studies were conducted to determine optimal dip concentrations, with concentrations of 0%, 1%, and 3% in experiment 1 and 0%, 2%, and 2.5% in experiment 2. Russet Burbank potatoes were peeled, sliced, and dipped for 1 min and stored at 3 °C. Color, texture, fry surface pH, and microbiological analyses were conducted on days 0, 7, and 14. The 3% SAS- and CA-treated samples had significantly (p<0.0001) lower pH levels on fry surfaces than all other treatments. Both acidulants had significantly (p≤0.05) lower aerobic plate counts compared to controls in both studies by day 7. However, SAS appeared to be the most effective at the 3% level in maintaining a light fry color up to day 14 and had higher L-values than all other treatments. The 3% SAS-treated fry slices appeared to have the least change in textural properties over storage time, having a significantly (p=0.0002) higher force value (kg force [kgf]) than the other treatments during experiment 1, without any signs of case-hardening that appeared in the control and CA-treated samples. SAS was comparable to CA in reducing surface fry pH and also lowering microbial counts over storage time. According to the results, SAS may be another viable acidulant to be utilized in the fresh-cut fruit and vegetable industry.

  17. Cross-comparison of Protein Recognition of Sialic Acid Diversity on Two Novel Sialoglycan Microarrays

    PubMed Central

    Padler-Karavani, Vered; Song, Xuezheng; Yu, Hai; Hurtado-Ziola, Nancy; Huang, Shengshu; Muthana, Saddam; Chokhawala, Harshal A.; Cheng, Jiansong; Verhagen, Andrea; Langereis, Martijn A.; Kleene, Ralf; Schachner, Melitta; de Groot, Raoul J.; Lasanajak, Yi; Matsuda, Haruo; Schwab, Richard; Chen, Xi; Smith, David F.; Cummings, Richard D.; Varki, Ajit

    2012-01-01

    DNA and protein arrays are commonly accepted as powerful exploratory tools in research. This has mainly been achieved by the establishment of proper guidelines for quality control, allowing cross-comparison between different array platforms. As a natural extension, glycan microarrays were subsequently developed, and recent advances using such arrays have greatly enhanced our understanding of protein-glycan recognition in nature. However, although it is assumed that biologically significant protein-glycan binding is robustly detected by glycan microarrays, there are wide variations in the methods used to produce, present, couple, and detect glycans, and systematic cross-comparisons are lacking. We address these issues by comparing two arrays that together represent the marked diversity of sialic acid modifications, linkages, and underlying glycans in nature, including some identical motifs. We compare and contrast binding interactions with various known and novel plant, vertebrate, and viral sialic acid-recognizing proteins and present a technical advance for assessing specificity using mild periodate oxidation of the sialic acid chain. These data demonstrate both the diversity of sialic acids and the analytical power of glycan arrays, showing that different presentations in different formats provide useful and complementary interpretations of glycan-binding protein specificity. They also highlight important challenges and questions for the future of glycan array technology and suggest that glycan arrays with similar glycan structures cannot be simply assumed to give similar results. PMID:22549775

  18. A Complex Prime Numerical Representation of Amino Acids for Protein Function Comparison.

    PubMed

    Chen, Duo; Wang, Jiasong; Yan, Ming; Bao, Forrest Sheng

    2016-08-01

    Computationally assessing the functional similarity between proteins is an important task of bioinformatics research. It can help molecular biologists transfer knowledge on certain proteins to others and hence reduce the amount of tedious and costly benchwork. Representation of amino acids, the building blocks of proteins, plays an important role in achieving this goal. Compared with symbolic representation, representing amino acids numerically can expand our ability to analyze proteins, including comparing their functional similarity. Among the state-of-the-art methods, electro-ion interaction pseudopotential (EIIP) is widely adopted for the numerical representation of amino acids. However, because of its design, EIIP can suffer from degeneracy, in which two different amino acid sequences have the same numerical representation. In light of this challenge, we propose a complex prime numerical representation (CPNR) of amino acids, inspired by the similarity between a pattern among prime numbers and the number of codons of amino acids. To empirically assess the effectiveness of the proposed method, we compare CPNR against EIIP. Experimental results demonstrate that the proposed method CPNR always achieves better performance than EIIP. We also develop a framework to combine the advantages of CPNR and EIIP, which enables us to improve the performance and study the unique characteristics of different representations. PMID:27249328

  19. A comparison of the gas phase acidities of phospholipid headgroups: experimental and computational studies.

    PubMed

    Thomas, Michael C; Mitchell, Todd W; Blanksby, Stephen J

    2005-06-01

    Proton-bound dimers consisting of two glycerophospholipids with different headgroups were prepared using negative ion electrospray ionization and dissociated in a triple quadrupole mass spectrometer. Analysis of the tandem mass spectra of the dimers using the kinetic method provides, for the first time, an order of acidity for the phospholipid classes in the gas phase of PE < PA < PG < PS < PI. Hybrid density functional calculations on model phospholipids were used to predict the absolute deprotonation enthalpies of the phospholipid classes from isodesmic proton transfer reactions with phosphoric acid. The computational data largely support the experimental acidity trend, with the exception of the relative acidity ranking of the two most acidic phospholipid species. Possible causes of the discrepancy between experiment and theory are discussed and the experimental trend is recommended. The sequence of gas phase acidities for the phospholipid headgroups is found to (1) have little correlation with the relative ionization efficiencies of the phospholipid classes observed in the negative ion electrospray process, and (2) correlate well with fragmentation trends observed upon collisional activation of phospholipid [M − H]⁻ anions. PMID:15907707

  20. Environmental comparison of biobased chemicals from glutamic acid with their petrochemical equivalents.

    PubMed

    Lammens, Tijs M; Potting, José; Sanders, Johan P M; De Boer, Imke J M

    2011-10-01

    Glutamic acid is an important constituent of waste streams from biofuels production. It is an interesting starting material for the synthesis of biobased chemicals, thereby decreasing the dependency on fossil fuels. The objective of this paper was to compare the environmental impact of four biobased chemicals from glutamic acid with their petrochemical equivalents, that is, N-methylpyrrolidone (NMP), N-vinylpyrrolidone (NVP), acrylonitrile (ACN), and succinonitrile (SCN). A consequential life cycle assessment was performed, wherein glutamic acid was obtained from sugar beet vinasse. The removed glutamic acid was substituted with cane molasses and ureum. The comparison between the four biobased and petrochemical products showed that for NMP and NVP the biobased version had less impact on the environment, while for ACN and SCN the petrochemical version had less impact on the environment. For the latter two an optimized scenario was computed, which showed that the process for SCN can be improved to a level at which it can compete with the petrochemical process. For biobased ACN large improvements are required to make it competitive with its petrochemical equivalent. The results of this LCA and the research preceding it also show that glutamic acid can be a building block for a variety of molecules that are currently produced from petrochemical resources. Currently, most methods to produce biobased products are biotechnological processes based on sugar, but this paper demonstrates that the use of amino acids from low-value byproducts can certainly be a method as well. PMID:21870885

  1. DNA Sequence and Expression Variation of Hop (Humulus lupulus) Valerophenone Synthase (VPS), a Key Gene in Bitter Acid Biosynthesis

    PubMed Central

    Castro, Consuelo B.; Whittock, Lucy D.; Whittock, Simon P.; Leggett, Grey; Koutoulis, Anthony

    2008-01-01

    Background: The hop plant (Humulus lupulus) is a source of many secondary metabolites, with bitter acids essential in the beer brewing industry and others having potential applications for human health. This study investigated variation in DNA sequence and gene expression of valerophenone synthase (VPS), a key gene in the bitter acid biosynthesis pathway of hop. Methods: Sequence variation was studied in 12 varieties, and expression was analysed in four of the 12 varieties in a series across the development of the hop cone. Results: Nine single nucleotide polymorphisms (SNPs) were detected in VPS, seven of which were synonymous. The two non-synonymous polymorphisms did not appear to be related to typical bitter acid profiles of the varieties studied. However, real-time quantitative reverse-transcription polymerase chain reaction (qRT-PCR) analysis of VPS expression during hop cone development showed a clear link with the bitter acid content. The highest levels of VPS expression were observed in two triploid varieties, ‘Symphony’ and ‘Ember’, which typically have high bitter acid levels. Conclusions: In all hop varieties studied, VPS expression was lowest in the leaves and an increase in expression was consistently observed during the early stages of cone development. PMID:18519445

  2. A knowledge engineering approach to recognizing and extracting sequences of nucleic acids from scientific literature.

    PubMed

    García-Remesal, Miguel; Maojo, Victor; Crespo, José

    2010-01-01

    In this paper we present a knowledge engineering approach to automatically recognize and extract genetic sequences from scientific articles. To carry out this task, we use a preliminary recognizer based on a finite state machine to extract all candidate DNA/RNA sequences. The latter are then fed into a knowledge-based system that automatically discards false positives and refines noisy and incorrectly merged sequences. We created the knowledge base by manually analyzing different manuscripts containing genetic sequences. Our approach was evaluated using a test set of 211 full-text articles in PDF format containing 3134 genetic sequences. For such set, we achieved 87.76% precision and 97.70% recall respectively. This method can facilitate different research tasks. These include text mining, information extraction, and information retrieval research dealing with large collections of documents containing genetic sequences. PMID:21096556
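
    As a rough illustration of the two ideas in this abstract (a finite-state candidate recognizer, and evaluation by precision and recall), the sketch below uses a regular expression, which is itself matched by a finite automaton. The minimum length of 10 letters, the ACGTU alphabet, the sample text and the gold list are all assumptions made for the example; the authors' recognizer and knowledge base are not described in enough detail here to reproduce.

```python
import re

# Hypothetical rule: a candidate is a run of at least MIN_LEN uppercase nucleotide letters.
MIN_LEN = 10
CANDIDATE = re.compile(rf"\b[ACGTU]{{{MIN_LEN},}}\b")


def extract_candidates(text):
    """Return every candidate DNA/RNA string found in the text, in order."""
    return CANDIDATE.findall(text)


def precision_recall(predicted, gold):
    """Precision = TP / predicted, recall = TP / gold, computed on sets of strings."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Invented sample text; the last token is a label, not a real sequence.
text = (
    "Primer F: ATGGCCATTGTAATGGGCC and probe UUAGGCAUCGA; "
    "the CAT gene was a control. Marker label: GATTACAGATTACA."
)
gold = ["ATGGCCATTGTAATGGGCC", "UUAGGCAUCGA"]

found = extract_candidates(text)
precision, recall = precision_recall(found, gold)
print(found, precision, recall)
```

    The short token "CAT" is rejected by the length rule, while the marker label is a false positive of the kind a second filtering stage would need to discard.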

  4. Direct Comparisons of Illumina vs. Roche 454 Sequencing Technologies on the Same Microbial Community DNA Sample

    PubMed Central

    Luo, Chengwei; Tsementzi, Despina; Kyrpides, Nikos; Read, Timothy; Konstantinidis, Konstantinos T.

    2012-01-01

    Next-generation sequencing (NGS) is commonly used in metagenomic studies of complex microbial communities but whether or not different NGS platforms recover the same diversity from a sample and their assembled sequences are of comparable quality remains unclear. We compared the two most frequently used platforms, the Roche 454 FLX Titanium and the Illumina Genome Analyzer (GA) II, on the same DNA sample obtained from a complex freshwater planktonic community. Despite the substantial differences in read length and sequencing protocols, the platforms provided a comparable view of the community sampled. For instance, derived assemblies overlapped in ∼90% of their total sequences and in situ abundances of genes and genotypes (estimated based on sequence coverage) correlated highly between the two platforms (R² > 0.9). Evaluation of base-call error, frameshift frequency, and contig length suggested that Illumina offered equivalent, if not better, assemblies than Roche 454. The results from metagenomic samples were further validated against DNA samples of eighteen isolate genomes, which showed a range of genome sizes and G+C% content. We also provide quantitative estimates of the errors in gene and contig sequences assembled from datasets characterized by different levels of complexity and G+C% content. For instance, we noted that homopolymer-associated, single-base errors affected ∼1% of the protein sequences recovered in Illumina contigs of 10× coverage and 50% G+C; this frequency increased to ∼3% when non-homopolymer errors were also considered. Collectively, our results should serve as a useful practical guide for choosing proper sampling strategies and data processing protocols for future metagenomic studies. PMID:22347999

  5. HBOC multi-gene panel testing: comparison of two sequencing centers.

    PubMed

    Schroeder, Christopher; Faust, Ulrike; Sturm, Marc; Hackmann, Karl; Grundmann, Kathrin; Harmuth, Florian; Bosse, Kristin; Kehrer, Martin; Benkert, Tanja; Klink, Barbara; Mackenroth, Luisa; Betcheva-Krajcir, Elitza; Wimberger, Pauline; Kast, Karin; Heilig, Mechthilde; Nguyen, Huu Phuc; Riess, Olaf; Schröck, Evelin; Bauer, Peter; Rump, Andreas

    2015-07-01

    Multi-gene panels are used to identify genetic causes of hereditary breast and ovarian cancer (HBOC) in large patient cohorts. This study compares the diagnostic workflow in two centers and gives valuable insights into different next-generation sequencing (NGS) strategies. Moreover, we present data from 620 patients sequenced at both centers. Both sequencing centers are part of the German consortium for hereditary breast and ovarian cancer (GC-HBOC). All 620 patients included in this study were selected following standard BRCA1/2 testing guidelines. A set of 10 sequenced genes was analyzed per patient. Twelve samples were exchanged and sequenced at both centers. NGS results were highly concordant in 12 exchanged samples (205/206 variants = 99.51 %). One non-pathogenic variant was missed at center B due to a sequencing gap (no technical coverage). The custom enrichment at center B was optimized during this study; for example, the average number of missing bases was reduced by a factor of four (vers. 1: 1939.41, vers. 4: 506.01 bp). There were no sequencing gaps at center A, but four CCDS exons were not included in the enrichment. Pathogenic mutations were found in 12.10 % (75/620) of all patients: 4.84 % (30/620) in BRCA1, 4.35 % in BRCA2 (27/620), 0.97 % in CHEK2 (6/620), 0.65 % in ATM (4/620), 0.48 % in CDH1 (3/620), 0.32 % in PALB2 (2/620), 0.32 % in NBN (2/620), and 0.16 % in TP53 (1/620). NGS diagnostics for HBOC-related genes is robust, cost effective, and the method of choice for genetic testing in large cohorts. Adding 8 genes to standard BRCA1- and BRCA2-testing increased the mutation detection rate by one-third. PMID:26022348
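
    The percentages quoted in this abstract can be re-derived from the counts it reports. The short sketch below does only that arithmetic; the variable names are illustrative, and the "one-third" figure is interpreted here as non-BRCA carriers relative to BRCA1/2 carriers, which is an assumption about how the authors computed it.

```python
TOTAL_PATIENTS = 620

# Carrier counts per gene, as listed in the abstract.
carriers = {
    "BRCA1": 30, "BRCA2": 27, "CHEK2": 6, "ATM": 4,
    "CDH1": 3, "PALB2": 2, "NBN": 2, "TP53": 1,
}


def percent(count, total):
    """Percentage rounded to two decimals, matching the abstract's precision."""
    return round(100 * count / total, 2)


per_gene = {gene: percent(n, TOTAL_PATIENTS) for gene, n in carriers.items()}
overall = percent(sum(carriers.values()), TOTAL_PATIENTS)

# Relative gain from the additional genes (assumed reading of "one-third").
brca = carriers["BRCA1"] + carriers["BRCA2"]
other = sum(carriers.values()) - brca
relative_gain = other / brca

# Concordance between the two centers on the exchanged samples.
concordance = percent(205, 206)

print(per_gene, overall, round(relative_gain, 3), concordance)
```

    Under that reading, 18 additional carriers on top of 57 BRCA1/2 carriers is a gain of roughly 32%, consistent with the stated one-third.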

  6. Comparison of CAGE and RNA-seq transcriptome profiling using clonally amplified and single-molecule next-generation sequencing.

    PubMed

    Kawaji, Hideya; Lizio, Marina; Itoh, Masayoshi; Kanamori-Katayama, Mutsumi; Kaiho, Ai; Nishiyori-Sueki, Hiromi; Shin, Jay W; Kojima-Ishiyama, Miki; Kawano, Mitsuoki; Murata, Mitsuyoshi; Ninomiya-Fukuda, Noriko; Ishikawa-Kato, Sachi; Nagao-Sato, Sayaka; Noma, Shohei; Hayashizaki, Yoshihide; Forrest, Alistair R R; Carninci, Piero

    2014-04-01

    CAGE (cap analysis gene expression) and RNA-seq are two major technologies used to identify transcript abundances as well as structures. They measure expression by sequencing from either the 5' end of capped molecules (CAGE) or tags randomly distributed along the length of a transcript (RNA-seq). Library protocols for clonally amplified (Illumina, SOLiD, 454 Life Sciences [Roche], Ion Torrent), second-generation sequencing platforms typically employ PCR preamplification prior to clonal amplification, while third-generation, single-molecule sequencers can sequence unamplified libraries. Although these transcriptome profiling platforms have been demonstrated to be individually reproducible, no systematic comparison has been carried out between them. Here we compare CAGE, using both second- and third-generation sequencers, and RNA-seq, using a second-generation sequencer based on a panel of RNA mixtures from two human cell lines to examine power in the discrimination of biological states, detection of differentially expressed genes, linearity of measurements, and quantification reproducibility. We found that the quantified levels of gene expression are largely comparable across platforms and conclude that CAGE and RNA-seq are complementary technologies that can be used to improve incomplete gene models. We also found systematic bias in the second- and third-generation platforms, which is likely due to steps such as linker ligation, cleavage by restriction enzymes, and PCR amplification. This study provides a perspective on the performance of these platforms, which will be a baseline in the design of further experiments to tackle complex transcriptomes uncovered in a wide range of cell types.

  7. Comparison of hepatocellular carcinoma miRNA expression profiling as evaluated by next generation sequencing and microarray.

    PubMed

    Murakami, Yoshiki; Tanahashi, Toshihito; Okada, Rina; Toyoda, Hidenori; Kumada, Takashi; Enomoto, Masaru; Tamori, Akihiro; Kawada, Norifumi; Taguchi, Y-h; Azuma, Takeshi

    2014-01-01

    MicroRNA (miRNA) expression profiling has proven useful in diagnosing and understanding the development and progression of several diseases. Microarray is the standard method for analyzing miRNA expression profiles; however, it has several disadvantages, including its limited detection of miRNAs. In recent years, advances in genome sequencing have led to the development of next-generation sequencing (NGS) technologies, which significantly advance genome sequencing speed and discovery. In this study, we compared the expression profiles obtained by NGS with the profiles created using microarray to assess if NGS could produce a more accurate and complete miRNA profile. Total RNA from 14 hepatocellular carcinoma tumors (HCC) and 6 matched non-tumor control tissues were sequenced with Illumina MiSeq 50-bp single-end reads. MicroRNA expression profiles were estimated using miRDeep2 software. As a comparison, miRNA expression profiles for 11 out of 14 HCCs were also established by microarray (Agilent human microRNA microarray). The average total sequencing exceeded 2.2 million reads per sample and of those reads, approximately 57% mapped to the human genome. The average correlation for miRNA expression between microarray and NGS and subtraction were 0.613 and 0.587, respectively, while miRNA expression between technical replicates was 0.976. The diagnostic accuracy of HCC, p-value, and AUC were 90.0%, 7.22×10⁻⁴, and 0.92, respectively. In summary, NGS created an miRNA expression profile that was reproducible and comparable to that produced by microarray. Moreover, NGS discovered novel miRNAs that were otherwise undetectable by microarray. We believe that miRNA expression profiling by NGS can be a useful diagnostic tool applicable to multiple fields of medicine.

  8. Identification of novel rice low phytic acid mutations via TILLING by sequencing

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Phytic acid (myo-inositol-1,2,3,4,5,6-hexakisphosphate or InsP6) accounts for 75-85% of the total phosphorus in seeds. Low phytic acid (lpa) mutants exhibit decreases in seed InsP6 with corresponding increases in inorganic P which, unlike phytic acid P, is readily utilized by humans and monogastric ...

  9. Genomic-scale comparison of sequence- and structure-based methods of function prediction: Does structure provide additional insight?

    PubMed Central

    Fetrow, Jacquelyn S.; Siew, Naomi; Di Gennaro, Jeannine A.; Martinez-Yamout, Maria; Dyson, H. Jane; Skolnick, Jeffrey

    2001-01-01

    A function annotation method using the sequence-to-structure-to-function paradigm is applied to the identification of all disulfide oxidoreductases in the Saccharomyces cerevisiae genome. The method identifies 27 sequences as potential disulfide oxidoreductases. All previously known thioredoxins, glutaredoxins, and disulfide isomerases are correctly identified. Three of the 27 predictions are probable false-positives. Three novel predictions, which subsequently have been experimentally validated, are presented. Two additional novel predictions suggest a disulfide oxidoreductase regulatory mechanism for two subunits (OST3 and OST6) of the yeast oligosaccharyltransferase complex. Based on homology, this prediction can be extended to a potential tumor suppressor gene, N33, in humans, whose biochemical function was not previously known. Attempts to obtain a folded, active N33 construct to test the prediction were unsuccessful. The results show that structure prediction coupled with biochemically relevant structural motifs is a powerful method for the function annotation of genome sequences and can provide more detailed, robust predictions than function prediction methods that rely on sequence comparison alone. PMID:11316881

  10. Zucchini yellow mosaic virus: biological properties, detection procedures and comparison of coat protein gene sequences.

    PubMed

    Coutts, B A; Kehoe, M A; Webster, C G; Wylie, S J; Jones, R A C

    2011-12-01

    Between 2006 and 2010, 5324 samples from at least 34 weed, two cultivated legume and 11 native species were collected from three cucurbit-growing areas in tropical or subtropical Western Australia. Two new alternative hosts of zucchini yellow mosaic virus (ZYMV) were identified, the Australian native cucurbit Cucumis maderaspatanus, and the naturalised legume species Rhynchosia minima. Low-level (0.7%) seed transmission of ZYMV was found in seedlings grown from seed collected from zucchini (Cucurbita pepo) fruit infected with isolate Cvn-1. Seed transmission was absent in >9500 pumpkin (C. maxima and C. moschata) seedlings from fruit infected with isolate Knx-1. Leaf samples from symptomatic cucurbit plants collected from fields in five cucurbit-growing areas in four Australian states were tested for the presence of ZYMV. When 42 complete coat protein (CP) nucleotide (nt) sequences from the new ZYMV isolates obtained were compared to those of 101 complete CP nt sequences from five other continents, phylogenetic analysis of the 143 ZYMV sequences revealed three distinct groups (A, B and C), with four subgroups in A (I-IV) and two in B (I-II). The new Australian sequences grouped according to collection location, fitting within A-I, A-II and B-II. The 16 new sequences from one isolated location in tropical northern Western Australia all grouped into subgroup B-II, which contained no other isolates. In contrast, the three sequences from the Northern Territory fitted into A-II with 94.6-99.0% nt identities with isolates from the United States, Iran, China and Japan. The 23 new sequences from the central west coast and two east coast locations all fitted into A-I, with 95.9-98.9% nt identities to sequences from Europe and Japan. These findings suggest that (i) there have been at least three separate ZYMV introductions into Australia and (ii) there are few changes to local isolate CP sequences following their establishment in remote growing areas. Isolates from A-I and B

  11. Drosophila melanogaster mitochondrial DNA: completion of the nucleotide sequence and evolutionary comparisons.

    PubMed

    Lewis, D L; Farr, C L; Kaguni, L S

    1995-11-01

    The nucleotide sequence of the regions flanking the A+T region of Drosophila melanogaster mitochondrial DNA (mtDNA) has been determined. Included are the genes encoding the transfer RNAs for valine, isoleucine, glutamine and methionine, the small ribosomal RNA and the 5'-coding sequences of the large ribosomal RNA and NADH dehydrogenase subunit II. This completes the nucleotide sequence of the D. melanogaster mitochondrial genome. The circular mtDNA of D. melanogaster varies in size among different populations largely due to length differences in the control region (Fauron & Wolstenholme, 1976; Fauron & Wolstenholme, 1980a, b); the mtDNA region we have sequenced, combined with those sequenced by others, yields a composite genome that is 19,517 bp in length as compared to 16,019 bp for the mtDNA of D. yakuba. D. melanogaster mtDNA exhibits an extreme bias in base composition; it comprises 82.2% deoxyadenylate and thymidylate residues as compared to 78.6% in D. yakuba mtDNA. All genes encoded in the mtDNA of both species are in identical locations and orientations. Nucleotide substitution analysis reveals that tRNA and rRNA genes evolve at less than half the rate of protein coding genes.
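
    The base-composition figures in this abstract (82.2% versus 78.6% A+T) are simple proportions. The sketch below shows how such a value is computed for any sequence, using a hypothetical helper at_content and an invented toy string rather than the actual mtDNA; the genome lengths are the ones quoted above.

```python
def at_content(seq):
    """Percentage of A and T among the unambiguous bases (A, C, G, T) of a sequence."""
    counted = [base for base in seq.upper() if base in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no unambiguous bases")
    return 100 * sum(base in "AT" for base in counted) / len(counted)


# Toy A+T-rich fragment, invented for illustration (not real mtDNA).
toy = "ATTAATGCATTATAAT"
toy_at = at_content(toy)

# Genome lengths quoted in the abstract, in base pairs.
D_MELANOGASTER_BP = 19_517
D_YAKUBA_BP = 16_019
length_difference = D_MELANOGASTER_BP - D_YAKUBA_BP

print(toy_at, length_difference)
```

    Skipping ambiguous symbols such as N is a design choice of this sketch; counting them in the denominator would lower the reported percentage.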

  12. Ossification sequence of the avian order anseriformes, with comparison to other precocial birds.

    PubMed

    Maxwell, Erin E

    2008-09-01

    Ossification sequences are poorly known for most amniotes, and yet they represent an important source of morphogenetic, phylogenetic, and life history information. Here, the author describes the ossification sequences of three ducks, the Common Eider Somateria mollissima dresseri, the Pekin Duck Anas platyrhynchos, and the Muscovy Duck Cairina moschata. Sequence differences exist both within and among these species, but are generally minor. The Common Eider has the most ossified skeleton prior to hatching, contrary to what is expected in a subarctic migrant species. This may be attributed to a tradeoff between growth rate and locomotory performance. Growth rate is higher in hatchlings with more cartilaginous skeletons, but this may compromise locomotion. No major ossification sequence differences were observed in the craniofacial skeleton when compared with Galliformes, which suggests that the influence of adult morphology on ossification sequence might be relatively minor in many taxa. Galliformes and Anseriformes, while both highly ossified at hatching, differ in the location of their late-stage ossification centers. In Anseriformes, these are most often located in the appendicular skeleton, whereas in Galliformes they are in the thoracic region and form the ventilatory apparatus.

  13. Comparison of sequences formed in marine sabkha (subaerial) and salina (subaqueous) settings: modern and ancient

    SciTech Connect

    Warren, J.K.; St. Kendall, C.G.

    1985-06-01

    Marine evaporites occurring in modern subaqueous (salina) settings and subaerial (sabkha) settings are different. Subaqueous Holocene evaporites occur as shoaling-upward lacustrine sequences up to 10 m thick. They are evaporite dominated and are composed primarily of bottom-nucleated crystals that may be deposited as massive, laminated, or rippled units. Each coastal lake is dominated by laminated evaporites with subordinate carbonate sediments. In plan view, they show a well-developed bull's-eye pattern with a sulfate center and a carbonate rim. In contrast, subaerial (sabkha) evaporites occur as part of a laterally prograding, shoaling-upward, peritidal sequence in which the supratidal unit is usually no more than 1 m thick. Sabkha sequences are matrix dominated, not evaporite dominated, with the bulk of the sulfate phase occurring as diagenetic nodules, enteroliths, or diapirlike structures. These sulfates were formed during syndepositional diagenesis by replacement and displacement processes. The various facies of the sequence tend to accumulate in belts parallel with the shoreline. Relative to the sea level or the brine level, sabkhas tend to form over paleotopographic highs whereas salinas tend to occur in paleotopographic lows. Some of the characteristics that distinguish Holocene subaerial and subaqueous evaporite sequences can be used to do the same for similar ancient facies, even when gypsum has been converted to nodular anhydrite. The distinction is important for it can be used by explorationists in the oil industry to define the paleotopography of the associated underlying porous and nonporous carbonates.

  14. Complete amino acid sequence of an acidic, cardiotoxic phospholipase A2 from the venom of Ophiophagus hannah (King Cobra): a novel cobra venom enzyme with "pancreatic loop".

    PubMed

    Huang, M Z; Gopalakrishnakone, P; Chung, M C; Kini, R M

    1997-02-15

    A phospholipase A2 (OHV A-PLA2) from the venom of Ophiophagus hannah (King cobra) is an acidic protein exhibiting cardiotoxicity, myotoxicity, and antiplatelet activity. The complete amino acid sequence of OHV A-PLA2 has been determined using a combination of Edman degradation and mass spectrometric techniques. OHV A-PLA2 is composed of a single chain of 124 amino acid residues with 14 cysteines and a calculated molecular weight of 13719 Da. It contains the loop of residues (62-66) found in pancreatic PLA2s and hence belongs to class IB enzymes. This pancreatic loop is between two proline residues (Pro 59 and Pro 68) and contains several hydrophilic amino acids (Ser and Asp). This region has a high degree of conformational flexibility and is on the surface of the molecule, and hence it may be a potential protein-protein interaction site. A relatively low sequence homology is found between OHV A-PLA2 and other known cardiotoxic PLA2s, and hence a contiguous segment could not be identified as a site responsible for the cardiotoxic activity.

  15. Evaluation of a novel food composition database that includes glutamine and other amino acids derived from gene sequencing data

    PubMed Central

    Lenders, CM; Liu, S; Wilmore, DW; Sampson, L; Dougherty, LW; Spiegelman, D; Willett, WC

    2011-01-01

    Objectives To determine the content of glutamine in major food proteins. Subjects/Methods We used a validated 131-food item food frequency questionnaire (FFQ) to identify the foods that contributed the most to protein intake among 70 356 women in the Nurses’ Health Study (NHS, 1984). The content of glutamine and other amino acids in foods was calculated based on protein fractions generated from gene sequencing methods (Swiss Institute of Bioinformatics) and compared with data from conventional (USDA) and modified biochemical (Kuhn) methods. Pearson correlation coefficients were used to compare the participants’ dietary intakes of amino acids by sequencing and USDA methods. Results The glutamine content varied from 0.01 to 9.49 g/100 g of food and contributed from 1 to 33% of total protein for all FFQ foods with protein. When comparing the sequencing and Kuhn’s methods, the proportion of glutamine in meat was 4.8 vs 4.4%. Among NHS participants, mean glutamine intake was 6.84 (s.d.=2.19) g/day and correlation coefficients for amino acid between intakes assessed by sequencing and USDA methods ranged from 0.94 to 0.99 for absolute intake, −0.08 to 0.90 after adjusting for 100 g of protein, and 0.88 to 0.99 after adjusting for 1000 kcal. The between-person coefficient of variation of energy-adjusted intake of glutamine was 16%. Conclusions These data suggest that (1) glutamine content can be estimated from gene sequencing methods and (2) there is a reasonably wide variation in energy-adjusted glutamine intake, allowing for exploration of glutamine consumption and disease. PMID:19756030
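
    The two statistics named in this abstract, the Pearson correlation coefficient and the coefficient of variation (standard deviation divided by mean), can be sketched as follows. The intake values in the example are made up for illustration and are not study data; only the mean (6.84) and s.d. (2.19) are taken from the abstract. Their ratio, about 0.32, describes absolute intake and is therefore a different quantity from the 16% reported for energy-adjusted intake.

```python
from math import sqrt
from statistics import mean, stdev


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least two values")
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return sxy / sqrt(sxx * syy)


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean."""
    return stdev(values) / mean(values)


# Hypothetical per-person intakes (g/day) estimated by two methods.
by_sequencing = [5.1, 6.8, 7.4, 9.0, 4.2]
by_usda = [5.0, 7.0, 7.1, 9.3, 4.5]
print(round(pearson(by_sequencing, by_usda), 3))

# Ratio of the s.d. and mean reported in the abstract for absolute intake.
print(round(2.19 / 6.84, 2))
```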

  16. The amino acid sequences of the cytochromes c553 from Porphyridium cruentum and Aphanizomenon flos-aquae.

    PubMed

    Sprinkle, J R; Hermodson, M; Krogmann, D W

    1986-01-01

    The amino acid sequences of cytochrome c553 from the eukaryotic red alga Porphyridium cruentum and from the prokaryotic cyanobacterium Aphanizomenon flos-aquae have been determined from the tryptic and cyanogen bromide peptides. The results indicate that a charged region of these proteins has evolved with special rapidity to accommodate a rapid evolution of a binding site in the P700 electron acceptor complex.

  17. A comparison of virus genome sequences with their host silkworm, Bombyx mori.

    PubMed

    Tang, Xu-Dong; Yue, Ya-Jie; Wang, Wei; Li, Nan; Shen, Zhong-Yuan

    2016-01-15

    With the recent availability of the genomes of many viruses and the silkworm, Bombyx mori, as well as a variety of Basic Local Alignment Search Tool (BLAST) programs, a new opportunity to gain insight into the interaction of viruses with the silkworm is possible. This study aims to determine the possible existence of sequence identities between the genomes of viruses and the silkworm and attempts to explain this phenomenon. BLAST searches of the genomes of viruses against the silkworm genome were performed using the resources of the National Center for Biotechnology Information. All studied viruses contained variable numbers of short regions with sequence identity to the genome of the silkworm. The short regions of sequence identity in the genome of the silkworm may be derived from the genomes of viruses in the long history of silkworm-virus interaction. This study is the first to compare these genomes, and may contribute to research on the interaction between viruses and the silkworm.

  18. Genotypic comparison of five isolates of Rickettsia prowazekii by multilocus sequence typing.

    PubMed

    Ge, Hong; Tong, Min; Jiang, Ju; Dasch, Gregory A; Richards, Allen L

    2007-06-01

    Genetic traits of five Rickettsia prowazekii isolates, including the first from Africa and North America, and representatives from human and flying squirrels were compared using multilocus sequence typing. Four rickettsial genes encoding 17 kDa genus-common antigen (17 kDa gene), citrate synthase (gltA), OmpB immunodominant antigen (ompB) and 120 kDa cytoplasmic antigen (sca4) were examined. Sequence identities of 17 kDa gene and gltA were 100% among the isolates. Limited sequence diversity of ompB (0.02-0.11%) and sca4 (0.03-0.20%) was enough to distinguish the isolates, and evaluation of the combined four genes provided a method to easily differentiate R. prowazekii from other rickettsiae. PMID:17419766
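
    Diversity values such as those quoted for ompB and sca4 are percentages of differing positions between aligned sequences. The following is a minimal sketch of that calculation, assuming the two sequences are already aligned to the same length and that gap columns are skipped; the abstract does not say how gaps were treated, so that choice is an assumption.

```python
def percent_difference(a: str, b: str) -> float:
    """Percentage of mismatched positions between two pre-aligned sequences.

    Columns where either sequence has a gap ('-') are ignored.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    mismatches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x != y:
            mismatches += 1
    if compared == 0:
        raise ValueError("no comparable (gap-free) columns")
    return 100.0 * mismatches / compared


# Toy aligned fragments for illustration only.
print(percent_difference("ACGTACGTAC", "ACGTACGTAT"))
```

    Percent identity is simply 100 minus this value under the same assumptions.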

  19. Comparison of pulse sequences for R1-based electron paramagnetic resonance oxygen imaging

    NASA Astrophysics Data System (ADS)

    Epel, Boris; Halpern, Howard J.

    2015-05-01

    Electron paramagnetic resonance (EPR) spin-lattice relaxation (SLR) oxygen imaging has proven to be an indispensable tool for assessing oxygen partial pressure in live animals. EPR oxygen images show remarkable oxygen accuracy when combined with high precision and spatial resolution. Developing more effective means for obtaining SLR rates is of great practical, biological and medical importance. In this work we compared different pulse EPR imaging protocols and pulse sequences to establish advantages and areas of applicability for each method. Tests were performed using phantoms containing spin probes with oxygen concentrations relevant to in vivo oxymetry. We have found that for small animal size objects the inversion recovery sequence combined with the filtered backprojection reconstruction method delivers the best accuracy and precision. For large animals, in which large radio frequency energy deposition might be critical, free induction decay and three pulse stimulated echo sequences might find better practical usage.

  20. First draft genome sequencing of indole acetic acid producing and plant growth promoting fungus Preussia sp. BSL10.

    PubMed

    Khan, Abdul Latif; Asaf, Sajjad; Khan, Abdur Rahim; Al-Harrasi, Ahmed; Al-Rawahi, Ahmed; Lee, In-Jung

    2016-05-10

    Preussia sp. BSL10, family Sporormiaceae, was actively producing phytohormone (indole-3-acetic acid) and extra-cellular enzymes (phosphatases and glucosidases). The fungus was also promoting the growth of the arid-land tree Boswellia sacra. Looking at such prospects of this fungus, we sequenced its draft genome for the first time. The Illumina based sequence analysis reveals an approximate genome size of 31.4Mbp for Preussia sp. BSL10. Based on ab initio gene prediction, a total of 32,312 coding sequences were annotated consisting of 11,967 coding genes, pseudogenes, and 221 tRNA genes. Furthermore, 321 carbohydrate-active enzymes were predicted and classified into many functional families. PMID:26995610

  1. The amino acid sequence of GTP:AMP phosphotransferase from beef-heart mitochondria. Extensive homology with cytosolic adenylate kinase.

    PubMed

    Wieland, B; Tomasselli, A G; Noda, L H; Frank, R; Schulz, G E

    1984-09-01

    The amino acid sequence of GTP:AMP phosphotransferase (AK3) from beef-heart mitochondria has been determined, except for one segment of about 33 residues in the middle of the polypeptide chain. The established sequence has been unambiguously aligned to the sequence of cytosolic ATP:AMP phosphotransferase (AK1) from pig muscle, allowing for six insertions and deletions. With 30% of all aligned residues being identical, the homology between AK3 and AK1 is well established. As derived from the known three-dimensional structure of AK1, the missing segment is localized at a small surface area of the molecule, far apart from the active center. The pattern of conserved residues demonstrates that earlier views on substrate binding have to be modified. The observation of three different consecutive N-termini indicates enzyme processing.

  2. A phylogenetic analysis of the brassicales clade based on an alignment-free sequence comparison method.

    PubMed

    Hatje, Klas; Kollmar, Martin

    2012-01-01

    Phylogenetic analyses reveal the evolutionary derivation of species. A phylogenetic tree can be inferred from multiple sequence alignments of proteins or genes. The alignment of whole genome sequences of higher eukaryotes is a computationally intensive and ambitious task as is the computation of phylogenetic trees based on these alignments. To overcome these limitations, we here used an alignment-free method to compare genomes of the Brassicales clade. For each nucleotide sequence a Chaos Game Representation (CGR) can be computed, which represents each nucleotide of the sequence as a point in a square defined by the four nucleotides as vertices. Each CGR is therefore a unique fingerprint of the underlying sequence. If the CGRs are divided by grid lines each grid square denotes the occurrence of oligonucleotides of a specific length in the sequence (Frequency Chaos Game Representation, FCGR). Here, we used distance measures between FCGRs to infer phylogenetic trees of Brassicales species. Three types of data were analyzed because of their different characteristics: (A) Whole genome assemblies as far as available for species belonging to the Malvidae taxon. (B) EST data of species of the Brassicales clade. (C) Mitochondrial genomes of the Rosids branch, a supergroup of the Malvidae. The trees reconstructed based on the Euclidean distance method are in general agreement with single gene trees. The Fitch-Margoliash and Neighbor joining algorithms resulted in similar to identical trees. Here, for the first time we have applied the bootstrap re-sampling concept to trees based on FCGRs to determine the support of the branchings. FCGRs have the advantage that they are fast to calculate, and can be used as additional information to alignment based data and morphological characteristics to improve the phylogenetic classification of species in ambiguous cases.
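
    The FCGR idea described in this abstract can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' implementation: the corner assignment (A bottom-left, C top-left, G top-right, T bottom-right), the normalisation of counts to frequencies, and the function names are all choices made here. Each k-mer is mapped directly to the grid cell that its chaos-game point would fall into on a 2^k by 2^k grid, which avoids floating-point iteration.

```python
from math import sqrt

# Assumed corner coordinates (x, y) for the four nucleotides.
CORNERS = {"A": (0, 0), "C": (0, 1), "G": (1, 1), "T": (1, 0)}


def fcgr(seq: str, k: int):
    """Return a 2**k x 2**k grid of k-mer frequencies (rows indexed by y, columns by x)."""
    size = 2 ** k
    grid = [[0] * size for _ in range(size)]
    seq = seq.upper()
    total = 0
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if any(base not in CORNERS for base in kmer):
            continue  # skip k-mers containing ambiguous bases
        col = row = 0
        # The last base of the k-mer decides the coarsest subdivision,
        # so it contributes the most significant bit of the cell index.
        for m, base in enumerate(kmer):
            cx, cy = CORNERS[base]
            col += cx << m
            row += cy << m
        grid[row][col] += 1
        total += 1
    if total == 0:
        raise ValueError("no valid k-mers found")
    return [[count / total for count in r] for r in grid]


def euclidean(f1, f2) -> float:
    """Euclidean distance between two FCGR grids of the same size."""
    return sqrt(sum((a - b) ** 2
                    for r1, r2 in zip(f1, f2)
                    for a, b in zip(r1, r2)))


# Toy sequences for illustration only.
d = euclidean(fcgr("ACGTACGTTTGA", 2), fcgr("ACGTACGAAAGA", 2))
print(round(d, 4))
```

    A matrix of such pairwise distances is the kind of input that distance-based tree methods such as Neighbor joining or Fitch-Margoliash take.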

  3. Comparison of Commercially Available Target Enrichment Methods for Next-Generation Sequencing

    PubMed Central

    Bodi, K.; Perera, A. G.; Adams, P. S.; Bintzler, D.; Dewar, K.; Grove, D. S.; Kieleczawa, J.; Lyons, R. H.; Neubert, T. A.; Noll, A. C.; Singh, S.; Steen, R.; Zianni, M.

    2013-01-01

    Isolating high-priority segments of genomes greatly enhances the efficiency of next-generation sequencing (NGS) by allowing researchers to focus on their regions of interest. For the 2010–11 DNA Sequencing Research Group (DSRG) study, we compared outcomes from two leading companies, Agilent Technologies (Santa Clara, CA, USA) and Roche NimbleGen (Madison, WI, USA), which offer custom-targeted genomic enrichment methods. Both companies were provided with the same genomic sample and challenged to capture identical genomic locations for DNA NGS. The target region totaled 3.5 Mb and included 31 individual genes and a 2-Mb contiguous interval. Each company was asked to design its best assay, perform the capture in replicates, and return the captured material to the DSRG-participating laboratories. Sequencing was performed in two different laboratories on Genome Analyzer IIx systems (Illumina, San Diego, CA, USA). Sequencing data were analyzed for sensitivity, specificity, and coverage of the desired regions. The success of the enrichment was highly dependent on the design of the capture probes. Overall, coverage variability was higher for the Agilent samples. As variant discovery is the ultimate goal for a typical targeted sequencing project, we compared samples for their ability to sequence single-nucleotide polymorphisms (SNPs) as a test of the ability to capture both chromosomes from the sample. In the targeted regions, we detected 2546 SNPs with the NimbleGen samples and 2071 with Agilent's. When limited to the regions that both companies included as baits, the number of SNPs was ∼1000 for each, with Agilent and NimbleGen finding a small number of unique SNPs not found by the other. PMID:23814499

  4. Fusion protein predicted amino acid sequence of the first US avian pneumovirus isolate and lack of heterogeneity among other US isolates.

    PubMed

    Seal, B S; Sellers, H S; Meinersmann, R J

    2000-02-01

    Avian pneumovirus (APV) was first isolated from turkeys in the west-central US following emergence of turkey rhinotracheitis (TRT) during 1996. Subsequently, several APV isolates were obtained from the north-central US. Matrix (M) and fusion (F) protein genes of these isolates were examined for sequence heterogeneity and compared with European APV subtypes A and B. Among US isolates the M gene shared greater than 98% nucleotide sequence identity with only one nonsynonymous change occurring in a single US isolate. Although the F gene among US APV isolates shared 98% nucleotide sequence identity, nine conserved substitutions were detected in the predicted amino acid sequence. The predicted amino acid sequence of the US APV isolate's F protein had 72% sequence identity to the F protein of APV subtype A and 71% sequence identity to the F protein of APV subtype B. This compares with 83% sequence identity between the APV subtype A and B predicted amino acid sequences of the F protein. The US isolates were phylogenetically distinguishable from their European counterparts based on F gene nucleotide or predicted amino acid sequences. Lack of sequence heterogeneity among US APV subtypes indicates these viruses have maintained a relatively stable population since the first outbreak of TRT. Phylogenetic analysis of the F protein among APV isolates supports classification of US isolates as a new APV subtype C.

  5. rRNA sequence comparison of Beauveria bassiana, Tolypocladium cylindrosporum, and Tolypocladium extinguens.

    PubMed

    Rakotonirainy, M S; Dutertre, M; Brygoo, Y; Riba, G

    1991-01-01

    Five strains of Tolypocladium cylindrosporum, one strain of Tolypocladium extinguens, and nine strains of Beauveria bassiana were analyzed using a rapid rRNA sequencing technique. The sequences of two highly variable domains (D1 and D2) located at the 5' end of the 28S-like rRNA molecule were determined. The phylogenetic tree computed from the absolute number of nucleotide differences shows the separation between the genus Beauveria and the genus Tolypocladium and points out that T. cylindrosporum and T. extinguens probably do not belong to the same genus.

  6. Amino acid sequence and glycosylation of functional unit RtH2-e from Rapana thomasiana (gastropod) hemocyanin.

    PubMed

    Stoeva, Stanka; Idakieva, Krasimira; Betzel, Christian; Genov, Nicolay; Voelter, Wolfgang

    2002-03-15

    The complete amino acid sequence of Rapana thomasiana hemocyanin functional unit RtH2-e was determined by direct sequencing and matrix-assisted laser desorption ionization mass spectrometry of peptides obtained by cleavage with EndoLysC proteinase, chymotrypsin, and trypsin. The single-polypeptide chain of RtH2-e consists of 413 amino acid residues and contains two consensus sequences NXS/T (positions 11-19 and 127-129), potential sites for N-glycosylation. Monosaccharide analysis of RtH2-e revealed a carbohydrate content of about 1.1% and the presence of xylose, fucose, mannose, and N-acetylglucosamine, demonstrating that only N-linked carbohydrate chains of high-mannose type seem to be present. On the basis of the monosaccharide composition and MALDI-MS analysis of native and PNGase-F-treated chymotryptic glycopeptide fragment of RtH2-e the oligosaccharide Man(5)GlcNAc(2), attached to Asn(127), is suggested. Multiple sequence alignments with other molluscan hemocyanin e functional units revealed an identity of 63% to the cephalopod Octopus dofleini and of 69% to the gastropod Haliotis tuberculata. The present results are discussed in view of the recently determined X-ray structure of the functional unit g of the O. dofleini hemocyanin. PMID:11888200
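
    Scanning a protein for the NXS/T consensus mentioned above is a simple pattern search. The sketch below assumes the common convention that X may be any residue except proline, which the abstract itself does not state; the example peptide is invented and is not the RtH2-e sequence.

```python
import re

# Lookahead so that overlapping sequons (e.g. "NNTS") are all reported.
# Assumption: X is any residue other than proline.
SEQUON = re.compile(r"(?=(N[^P][ST]))")


def find_sequons(protein: str):
    """Return (1-based position of Asn, matched tripeptide) for each N-X-S/T site."""
    return [(m.start() + 1, m.group(1)) for m in SEQUON.finditer(protein.upper())]


# Invented peptide for illustration only.
print(find_sequons("MNGSANPTNNTS"))
```

    A match marks only a potential site; whether a given Asn is actually glycosylated has to be established experimentally, as was done for Asn(127) in the record above.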

  7. Amino acid sequence and domain structure of entactin. Homology with epidermal growth factor precursor and low density lipoprotein receptor

    PubMed Central

    1988-01-01

    Entactin (nidogen), a 150-kD sulfated glycoprotein, is a major component of basement membranes and forms a highly stable noncovalent complex with laminin. The complete amino acid sequence of mouse entactin has been derived from sequencing of cDNA clones. The 5.9-kb cDNA contains a 3,735-bp open reading frame followed by a 3'-untranslated region of 2.2 kb. The open reading frame encodes a 1,245-residue polypeptide with an unglycosylated Mr of 136,500, a 28-residue signal peptide, two Asn-linked glycosylation sites, and two potential Ca2+-binding sites. Analysis of the deduced amino acid sequence predicts that the molecule consists of two globular domains of 70 and 36 kD separated by a cysteine-rich domain of 28 kD. The COOH-terminal globular domain shows homology to the EGF precursor and the low density lipoprotein receptor. Entactin contains six EGF-type cysteine-rich repeat units and one copy of a cysteine-repeat motif found in thyroglobulin. The Arg-Gly-Asp cell recognition sequence is present in one of the EGF-type repeats, and a synthetic peptide from the putative cell-binding site of entactin was found to promote the attachment of mouse mammary tumor cells. PMID:3264556
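
    The open reading frame length and the residue count reported in this abstract can be cross-checked with codon arithmetic, since three nucleotides encode one residue. The helper below is a hypothetical name introduced for illustration; only the two figures come from the abstract.

```python
def codons_in_frame(length_bp: int) -> int:
    """Number of complete codons in a reading frame of the given length."""
    if length_bp % 3 != 0:
        raise ValueError("length is not a multiple of three")
    return length_bp // 3


orf_bp = 3735             # open reading frame length from the abstract
reported_residues = 1245  # polypeptide length from the abstract

codons = codons_in_frame(orf_bp)
print(codons, codons == reported_residues)
```

    The two figures agree exactly, which would be consistent with the quoted frame length not counting a stop codon; the abstract does not say this explicitly.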

  8. Amino acid sequences of lysozymes newly purified from invertebrates imply wide distribution of a novel class in the lysozyme family.

    PubMed

    Ito, Y; Yoshikawa, A; Hotani, T; Fukuda, S; Sugimura, K; Imoto, T

    1999-01-01

    Lysozymes were purified from three invertebrates: a marine bivalve, a marine conch, and an earthworm. The purified lysozymes all showed a similar molecular weight of 13 kDa on SDS/PAGE. Their N-terminal sequences up to the 33rd residue determined here were apparently homologous among them; in addition, they had a homology with a partial sequence of a starfish lysozyme which had been reported before. The complete sequence of the bivalve lysozyme was determined by peptide mapping and subsequent sequence analysis. This was composed of 123 amino acids including as many as 14 cysteine residues and did not show a clear homology with the known types of lysozymes. However, the homology search of this protein on the protein or nucleic acid database revealed two homologous proteins. One of them was a gene product, CELF22 A3.6 of C. elegans, which was a functionally unknown protein. The other was an isopeptidase of a medicinal leech, named destabilase. Thus, a new type of lysozyme found in at least four species across the three classes of the invertebrates demonstrates a novel class of protein/lysozyme family in invertebrates. The bivalve lysozyme, first characterized here, showed extremely high protein stability and hen lysozyme-like enzymatic features.

  9. Amino acid sequences of lysozymes newly purified from invertebrates imply wide distribution of a novel class in the lysozyme family.

    PubMed

    Ito, Y; Yoshikawa, A; Hotani, T; Fukuda, S; Sugimura, K; Imoto, T

    1999-01-01

    Lysozymes were purified from three invertebrates: a marine bivalve, a marine conch, and an earthworm. The purified lysozymes all showed a similar molecular weight of 13 kDa on SDS/PAGE. Their N-terminal sequences up to the 33rd residue determined here were apparently homologous among them; in addition, they had a homology with a partial sequence of a starfish lysozyme which had been reported before. The complete sequence of the bivalve lysozyme was determined by peptide mapping and subsequent sequence analysis. This was composed of 123 amino acids including as many as 14 cysteine residues and did not show a clear homology with the known types of lysozymes. However, the homology search of this protein on the protein or nucleic acid database revealed two homologous proteins. One of them was a gene product, CELF22 A3.6 of C. elegans, which was a functionally unknown protein. The other was an isopeptidase of a medicinal leech, named destabilase. Thus, a new type of lysozyme found in at least four species across the three classes of the invertebrates demonstrates a novel class of protein/lysozyme family in invertebrates. The bivalve lysozyme, first characterized here, showed extremely high protein stability and hen lysozyme-like enzymatic features. PMID:9914527

  10. Conserved Amino Acid Sequence Features in the α Subunits of MoFe, VFe, and FeFe Nitrogenases

    PubMed Central

    Glazer, Alexander N.; Kechris, Katerina J.

    2009-01-01

    Background This study examines the structural features and phylogeny of the α subunits of 69 full-length NifD (MoFe subunit), VnfD (VFe subunit), and AnfD (FeFe subunit) sequences. Methodology/Principal Findings The analyses of this set of sequences included BLAST scores, multiple sequence alignment, examination of patterns of covariant residues, phylogenetic analysis and comparison of the sequences flanking the conserved Cys and His residues that attach the FeMo cofactor to NifD and that are also conserved in the alternative nitrogenases. The results show that NifD nitrogenases fall into two distinct groups. Group I includes NifD sequences from many genera within Bacteria, including all nitrogen-fixing aerobes examined, as well as strict anaerobes and some facultative anaerobes, but no archaeal sequences. In contrast, Group II NifD sequences were limited to a small number of archaeal and bacterial sequences from strict anaerobes. The VnfD and AnfD sequences fall into two separate groups, more closely related to Group II NifD than to Group I NifD. The pattern of perfectly conserved residues, distributed along the full length of the Group I and II NifD, VnfD, and AnfD, confirms unambiguously that these polypeptides are derived from a common ancestral sequence. Conclusions/Significance There is no indication of a relationship between the patterns of covariant residues specific to each of the four groups discussed above that would give indications of an evolutionary pathway leading from one type of nitrogenase to another. Rather the totality of the data, along with the phylogenetic analysis, is consistent with a radiation of Group I and II NifDs, VnfD and AnfD from a common ancestral sequence. All the data presented here strongly support the suggestion made by some earlier investigators that the nitrogenase family had already evolved in the last common ancestor of the Archaea and Bacteria. PMID:19578539

  11. Complete Genome Sequences of Escherichia coli O157:H7 Strains SRCC 1675 and 28RC, Which Vary in Acid Resistance

    PubMed Central

    Baranzoni, Gian Marco; Reichenberger, Erin R.; Kim, Gwang-Hee; Breidt, Frederick; Kay, Kathryn; Oh, Deog-Hwan

    2016-01-01

    The level of acid resistance among Escherichia coli O157:H7 strains varies, and strains with higher resistance to acid may have a lower infectious dose. The complete genome sequences belonging to two strains of Escherichia coli O157:H7 with different levels of acid resistance are presented here. PMID:27469964

  12. Complete genome sequences of Escherichia coli O157:H7 strains SRCC 1675 and 28RC that vary in acid resistance

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The level of acid resistance among Escherichia coli O157:H7 strains varies, and strains with higher resistance to acid may have a lower infectious dose. The complete genome sequences belonging to two strains of Escherichia coli O157:H7 with different levels of acid resistance are presented....

  13. Complete Genome Sequences of Escherichia coli O157:H7 Strains SRCC 1675 and 28RC, Which Vary in Acid Resistance.

    PubMed

    Baranzoni, Gian Marco; Fratamico, Pina M; Reichenberger, Erin R; Kim, Gwang-Hee; Breidt, Frederick; Kay, Kathryn; Oh, Deog-Hwan

    2016-01-01

    The level of acid resistance among Escherichia coli O157:H7 strains varies, and strains with higher resistance to acid may have a lower infectious dose. The complete genome sequences belonging to two strains of Escherichia coli O157:H7 with different levels of acid resistance are presented here. PMID:27469964

  14. Molecular characterization of the body site-specific human epidermal cytokeratin 9: cDNA cloning, amino acid sequence, and tissue specificity of gene expression.

    PubMed

    Langbein, L; Heid, H W; Moll, I; Franke, W W

    1993-12-01

    Differentiation of human plantar and palmar epidermis is characterized by the suprabasal synthesis of a major special intermediate-sized filament (IF) protein, the type I (acidic) cytokeratin 9 (CK 9). Using partial amino acid (aa) sequence information obtained by direct Edman sequencing of peptides resulting from proteolytic digestion of purified CK 9, we synthesized several redundant primers by 'back-translation'. Amplification by polymerase chain reaction (PCR) of cDNAs obtained by reverse transcription of mRNAs from human foot sole epidermis, including 5'-primer extension, resulted in multiple overlapping cDNA clones, from which the complete cDNA (2353 bp) could be constructed. This cDNA encoded the CK 9 polypeptide with a calculated molecular weight of 61,987 and an isoelectric point at about pH 5.0. The aa sequence deduced from cDNA was verified in several parts by comparison with the peptide sequences and showed the typical structure of type I CKs, with a head (153 aa), an alpha-helical coiled-coil-forming rod (306 aa), and a tail (163 aa) domain. The protein displayed the highest homology to human CK 10, not only in the highly conserved rod domain but also in large parts of the head and the tail domains. On the other hand, the aa sequence revealed some remarkable differences from CK 10 and other CKs, even in the most conserved segments of the rod domain. The nuclease digestion pattern seen on Southern blot analysis of human genomic DNA indicated the existence of a unique CK 9 gene. Using CK 9-specific riboprobes for hybridization on Northern blots of RNAs from various epithelia, a mRNA of about 2.4 kb in length could be identified only in foot sole epidermis, and a weaker cross-hybridization signal was seen in RNA from bovine heel pad epidermis at about 2.0 kb. A large number of tissues and cell cultures were examined by PCR of mRNA-derived cDNAs, using CK 9-specific primers. But even with this very sensitive signal amplification, only palmar

  15. Next-Generation Sequencing of Aquatic Oligochaetes: Comparison of Experimental Communities

    PubMed Central

    Vivien, Régis; Lejzerowicz, Franck; Pawlowski, Jan

    2016-01-01

    Aquatic oligochaetes are a common group of freshwater benthic invertebrates known to be very sensitive to environmental changes and currently used as bioindicators in some countries. However, more extensive application of oligochaetes for assessing the ecological quality of sediments in watercourses and lakes would require overcoming the difficulties related to morphology-based identification of oligochaete species. This study tested the Next-Generation Sequencing (NGS) of a standard cytochrome c oxidase I (COI) barcode as a tool for the rapid assessment of oligochaete diversity in environmental samples, based on mixed specimen samples. To know the composition of each sample we Sanger sequenced every specimen present in these samples. Our study showed that a large majority of OTUs (Operational Taxonomic Unit) could be detected by NGS analyses. We also observed congruence between the NGS and specimen abundance data for several but not all OTUs. Because the differences in sequence abundance data were consistent across samples, we exploited these variations to empirically design correction factors. We showed that such factors increased the congruence between the values of oligochaetes-based indices inferred from the NGS and the Sanger-sequenced specimen data. The validation of these correction factors by further experimental studies will be needed for the adaptation and use of NGS technology in biomonitoring studies based on oligochaete communities. PMID:26866802

  16. Comparison of Ribotyping and sequence-based typing for discriminating among isolates of Bordetella bronchiseptica

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Aims: Our goal was to compare the discriminatory power of PvuII ribotyping and MLST using a single set of diverse Bordetella bronchiseptica isolates and to determine whether subtyping based on repeat region sequences of the pertactin gene (prn) provides additional resolution. Methods and Results: ...

  17. Comparison of Computer Vision and Photogrammetric Approaches for Epipolar Resampling of Image Sequence

    PubMed Central

    Kim, Jae-In; Kim, Taejung

    2016-01-01

    Epipolar resampling is the procedure of eliminating vertical disparity between stereo images. Due to its importance, many methods have been developed in the computer vision and photogrammetry field. However, we argue that epipolar resampling of image sequences, instead of a single pair, has not been studied thoroughly. In this paper, we compare epipolar resampling methods developed in both fields for handling image sequences. Firstly we briefly review the uncalibrated and calibrated epipolar resampling methods developed in computer vision and photogrammetric epipolar resampling methods. While it is well known that epipolar resampling methods developed in computer vision and in photogrammetry are mathematically identical, we also point out differences in parameter estimation between them. Secondly, we tested representative resampling methods in both fields and performed an analysis. We showed that for epipolar resampling of a single image pair all uncalibrated and photogrammetric methods tested could be used. More importantly, we also showed that, for image sequences, all methods tested, except the photogrammetric Bayesian method, showed significant variations in epipolar resampling performance. Our results indicate that the Bayesian method is favorable for epipolar resampling of image sequences. PMID:27011186

  18. The 5'-flanking regions of three pea legumin genes: comparison of the DNA sequences.

    PubMed Central

    Lycett, G W; Croy, R R; Shirsat, A H; Richards, D M; Boulter, D

    1985-01-01

    Approximately 1200 nucleotides of sequence data from the promoter and 5'-flanking regions of each of three pea (Pisum sativum L.) legumin genes (legA, legB and legC) are presented. The promoter regions of all three genes were found to be identical, including the "TATA box", the "CAAT box", and sequences showing homology to the SV40 enhancers. The legA sequence begins to diverge from the others about 300bp from the start codon, whereas the other two genes remain identical for another 550bp. The regions of partial homology exhibit deletions or insertions and some short, comparatively well conserved sequences. The significance of these features is discussed in terms of evolutionary mechanisms and their possible functional roles. The legC gene contains a region that may potentially form either of two mutually exclusive stem-loop structures, one of which has a stem 42bp long, which suggests that it could be fairly stable. We suggest that a mechanism of switching between such alternative structures may play some role in gene control or may represent the insertion of a transposable element. PMID:2997721
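
    The divergence distances quoted in this abstract (about 300bp for legA, a further 550bp for the other two genes) are measured upstream from the start codon. A minimal sketch of that kind of measurement follows, assuming each 5'-flanking sequence is given as a string that ends immediately before the start codon. The function name and the toy sequences are hypothetical and are not the published legumin sequences.

```python
def identical_upstream_length(seq_a: str, seq_b: str) -> int:
    """Count identical bases walking upstream from the start-codon end.

    Both strings are assumed to be 5'-flanking sequences written 5'->3'
    and ending immediately before the start codon, so the comparison
    runs from the last character backwards and stops at the first mismatch.
    """
    count = 0
    for base_a, base_b in zip(reversed(seq_a), reversed(seq_b)):
        if base_a != base_b:
            break
        count += 1
    return count


# Toy flanking sequences (hypothetical, for illustration only).
toy_a = "CAGTCCTATAAATAGC"
toy_b = "TTGACCTATAAATAGC"

print(identical_upstream_length(toy_a, toy_b))
```

    This ungapped comparison only locates the first point of divergence. The regions of partial homology described in the abstract contain insertions and deletions, which would need a proper alignment method instead.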

  19. Bovine herpesvirus-1: comparison and differentiation of vaccine and field strains based on genomic sequence variation.

    PubMed

    Fulton, R W; d'Offay, J M; Eberle, R

    2013-03-01

    Bovine herpesvirus-1 (BoHV-1) causes significant disease in cattle, including respiratory disease, fetal disease, and reproductive tract infections. Control programs usually include vaccination with a modified live viral (MLV) vaccine. On occasion BoHV-1 strains are isolated from diseased animals or fetuses postvaccination. Currently there are no markers for differentiating MLV strains from field strains of BoHV-1. In this study several BoHV-1 strains were sequenced using whole-genome sequencing technologies and the data analyzed to identify single nucleotide polymorphisms (SNPs). Strains sequenced included the reference BoHV-1 Cooper strain (GenBank Accession JX898220), eight commercial MLV vaccine strains, and 14 field strains from cases presented for diagnosis. Based on SNP analyses, the viruses could be classified into groups having similar SNP patterns. The eight MLV strains could be differentiated from one another although some were closely related to each other. A number of field strains isolated from animals with a history of prior vaccination had SNP patterns similar to specific MLV viruses, while other field isolates were very distinct from all vaccine strains. The results indicate that some BoHV-1 isolates from clinically ill cattle/fetuses can be associated with a prior MLV vaccination history, but more information is needed on the rate of BoHV-1 genome sequence change before irrefutable associations can be drawn. PMID:23333211
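
    The grouping of strains by SNP pattern can be illustrated with a small sketch. It assumes every strain sequence has already been aligned to the reference without gaps, and it groups only strains whose SNP patterns are exactly identical, which is a simplification of the "similar SNP patterns" reported in the abstract. The function names, strain labels, and sequences are hypothetical, and this is not the authors' pipeline.

```python
from typing import Dict, List, Tuple

Snp = Tuple[int, str, str]  # (1-based position, reference base, sample base)


def call_snps(reference: str, sample: str) -> List[Snp]:
    """List single-base differences between two pre-aligned, equal-length sequences."""
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (pos, ref_base, alt_base)
        for pos, (ref_base, alt_base) in enumerate(zip(reference, sample), start=1)
        if ref_base != alt_base and ref_base in "ACGT" and alt_base in "ACGT"
    ]


def group_by_snp_pattern(reference: str, samples: Dict[str, str]) -> Dict[Tuple[Snp, ...], List[str]]:
    """Group strain names that share exactly the same set of SNPs."""
    groups: Dict[Tuple[Snp, ...], List[str]] = {}
    for name, sequence in samples.items():
        pattern = tuple(call_snps(reference, sequence))
        groups.setdefault(pattern, []).append(name)
    return groups


# Toy data (hypothetical): one vaccine-like strain and two field-like strains.
reference = "ACGTACGT"
samples = {
    "vaccine_A": "ACGAACGT",
    "field_1": "ACGAACGT",
    "field_2": "ACGTACCT",
}

for pattern, names in group_by_snp_pattern(reference, samples).items():
    print(pattern, names)
```

    In this toy data, field_1 shares its pattern with vaccine_A while field_2 does not. As the abstract cautions, a shared pattern alone does not establish a vaccine origin without knowing the rate of genome sequence change.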

  20. Comparison of two approaches for the classification of 16S rRNA gene sequences.

    PubMed

    Chatellier, Sonia; Mugnier, Nathalie; Allard, Françoise; Bonnaud, Bertrand; Collin, Valérie; van Belkum, Alex; Veyrieras, Jean-Baptiste; Emler, Stefan

    2014-10-01

    The use of 16S rRNA gene sequences for microbial identification in clinical microbiology is accepted widely, and requires databases and algorithms. We compared a new research database containing curated 16S rRNA gene sequences in combination with the lca (lowest common ancestor) algorithm (RDB-LCA) to a commercially available 16S rDNA Centroid approach. We used 1025 bacterial isolates characterized by biochemistry, matrix-assisted laser desorption/ionization time-of-flight MS and 16S rDNA sequencing. Nearly 80 % of isolates were identified unambiguously at the species level by both classification platforms used. The remaining isolates were mostly identified correctly at the genus level due to the limited resolution of 16S rDNA sequencing. Discrepancies between both 16S rDNA platforms were due to differences in database content and the algorithm used, and could amount to up to 10.5 %. Up to 1.4 % of the analyses were found to be inconclusive. It is important to realize that despite the overall good performance of the pipelines for analysis, some inconclusive results remain that require additional in-depth analysis performed using supplementary methods.
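
    The lowest common ancestor idea named in this abstract can be sketched as follows. The sketch assumes each database hit for a query sequence comes with a root-to-species lineage, and it reports the deepest rank on which all hits agree. The function name and the example lineages are illustrative assumptions and do not reproduce the RDB-LCA implementation.

```python
from typing import List, Sequence


def lowest_common_ancestor(lineages: Sequence[Sequence[str]]) -> List[str]:
    """Return the longest root-to-leaf prefix shared by every lineage."""
    shared: List[str] = []
    if not lineages:
        return shared
    # zip(*lineages) walks the lineages rank by rank, from the root downwards.
    for names_at_rank in zip(*lineages):
        if any(name != names_at_rank[0] for name in names_at_rank):
            break
        shared.append(names_at_rank[0])
    return shared


# Hypothetical lineages of two equally good hits for one query sequence.
hits = [
    ["Bacteria", "Firmicutes", "Bacilli", "Lactobacillales",
     "Streptococcaceae", "Streptococcus", "Streptococcus mitis"],
    ["Bacteria", "Firmicutes", "Bacilli", "Lactobacillales",
     "Streptococcaceae", "Streptococcus", "Streptococcus oralis"],
]

assignment = lowest_common_ancestor(hits)
print(assignment[-1])
```

    When the hits disagree at the species rank, the assignment falls back to the genus, which corresponds to the genus-level identifications the abstract attributes to the limited resolution of 16S rDNA sequencing.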