Science.gov

Sample records for acids sequence comparison

  1. Detecting frame shifts by amino acid sequence comparison.

    PubMed

    Claverie, J M

    1993-12-20

    Various amino acid substitution scoring matrices are used in conjunction with local alignment programs to detect regions of similarity and infer potential common ancestry between proteins. The usual scoring schemes derive from the implicit hypothesis that related proteins evolve from a common ancestor by the accumulation of point mutations and that amino acids tend to be progressively substituted by others with similar properties. However, other frequent single mutation events, like nucleotide insertion or deletion and gene inversion, change the translation reading frame and cause previously encoded amino acid sequences to become unrecognizable at once. Here, I derive five new types of scoring matrix, each capable of detecting a specific frame shift (deletion, insertion and inversion in 3 frames) and use them with a regular local alignment program to detect amino acid sequences that may have derived from alternative reading frames of the same nucleotide sequence. Frame shifts are inferred from the sole comparison of the protein sequences. The five scoring matrices were used with the BLASTP program to compare all the protein sequences in the Swissprot database. Surprisingly, the searches revealed hundreds of highly significant frame shift matches, of which many are likely to represent sequencing errors. Others provide some evidence that frame shift mutations might be used in protein evolution as a way to create new amino acid sequences from pre-existing coding regions. PMID:7903399
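
The effect the abstract describes, a single-nucleotide deletion making the downstream protein sequence look like a different reading frame of the original gene, can be illustrated with a small translation sketch. This is not the paper's scoring-matrix method; it only shows the underlying frame-shift effect. The toy DNA string and the function name `translate` are assumptions made for illustration.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str, frame: int = 0) -> str:
    """Translate dna starting at offset `frame` (0, 1 or 2); '*' marks a stop."""
    dna = dna.upper()
    codons = (dna[i:i + 3] for i in range(frame, len(dna) - 2, 3))
    # Codons containing anything other than A, C, G, T become 'X'.
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


# Hypothetical toy gene (not taken from the paper).
original = "ATGGCCAAGCTGACC"
# Delete the nucleotide at index 3 to simulate a frame-shifting deletion.
mutated = original[:3] + original[4:]

print(translate(original, 0))  # MAKLT
print(translate(original, 1))  # WPS*
print(translate(mutated, 0))   # MPS*
```

After the deletion, the residues following the first codon (`PS*`) match frame 1 of the original sequence rather than frame 0, which is the kind of relationship the frame-shift scoring matrices are designed to detect directly at the protein level.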

  2. Nucleotide and derived amino acid sequences of the major porin of Comamonas acidovorans and comparison of porin primary structures.

    PubMed Central

    Gerbl-Rieger, S; Peters, J; Kellermann, J; Lottspeich, F; Baumeister, W

    1991-01-01

    The DNA sequence of the gene which codes for the major outer membrane porin (Omp32) of Comamonas acidovorans has been determined. The structural gene encodes a precursor consisting of 351 amino acid residues with a signal peptide of 19 amino acid residues. Comparisons with amino acid sequences of outer membrane proteins and porins from several other members of the class Proteobacteria and of the Chlamydia trachomatis porin and the Neurospora crassa mitochondrial porin revealed a motif of eight regions of local homology. The results of this analysis are discussed with regard to common structural features of porins. PMID:1848840

  3. Nucleotide sequence of the phosphoglycerate kinase gene from the extreme thermophile Thermus thermophilus. Comparison of the deduced amino acid sequence with that of the mesophilic yeast phosphoglycerate kinase.

    PubMed Central

    Bowen, D; Littlechild, J A; Fothergill, J E; Watson, H C; Hall, L

    1988-01-01

    Using oligonucleotide probes derived from amino acid sequencing information, the structural gene for phosphoglycerate kinase from the extreme thermophile, Thermus thermophilus, was cloned in Escherichia coli and its complete nucleotide sequence determined. The gene consists of an open reading frame corresponding to a protein of 390 amino acid residues (calculated Mr 41,791) with an extreme bias for G or C (93.1%) in the codon third base position. Comparison of the deduced amino acid sequence with that of the corresponding mesophilic yeast enzyme indicated a number of significant differences. These are discussed in terms of the unusual codon bias and their possible role in enhanced protein thermal stability. PMID:3052437
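
The third-position G or C bias quoted above (93.1%) is a simple statistic over the codons of an open reading frame. A minimal sketch of how such a figure could be computed follows; the function name `gc3_fraction` and the toy sequence are assumptions, and the gene's actual sequence is not reproduced here.

```python
def gc3_fraction(orf: str) -> float:
    """Fraction of codons whose third base is G or C.

    Assumes `orf` starts in frame; trailing bases that do not
    complete a codon are ignored.
    """
    orf = orf.upper()
    usable = len(orf) - len(orf) % 3
    third_bases = orf[2:usable:3]
    if not third_bases:
        raise ValueError("sequence is shorter than one codon")
    return sum(base in "GC" for base in third_bases) / len(third_bases)


# Hypothetical toy ORF: codons ATG GCA AAG CTT ACC
print(gc3_fraction("ATGGCAAAGCTTACC"))  # 0.6
```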

  4. Analysis of the functional domains of biosynthetic threonine deaminase by comparison of the amino acid sequences of three wild-type alleles to the amino acid sequence of biodegradative threonine deaminase.

    PubMed

    Taillon, B E; Little, R; Lawther, R P

    1988-03-31

    The nucleotide sequence of the gene, ilvA, for biosynthetic threonine deaminase (Tda) from Salmonella typhimurium was determined. The deduced amino acid sequence was compared with the deduced amino acid sequences of the biosynthetic Tda from Escherichia coli K-12 (ilvA) and Saccharomyces cerevisiae (ILV1) and the biodegradative Tda from E. coli K-12 (tdc). The comparison indicated the presence of two types of blocks of homologous amino acids. The first type of homology is in the N-terminal portion of all four isozymes of Tda and probably indicates amino acids involved in catalysis. The second type of homology is found in the C-terminal portion of the three biosynthetic isozymes and presumably is involved in either (i) the binding or interaction of the allosteric effector isoleucine with the enzyme, or (ii) subunit interactions. The sites of amino acid changes of two E. coli K-12 ilvA alleles with altered response to isoleucine are consistent with the conclusion that the C-terminal portion of biosynthetic Tda is involved in allosteric regulation. PMID:3290055

  5. Comparison of the nucleotide and amino acid sequences of the RsrI and EcoRI restriction endonucleases.

    PubMed

    Stephenson, F H; Ballard, B T; Boyer, H W; Rosenberg, J M; Greene, P J

    1989-12-21

    The RsrI endonuclease, a type-II restriction endonuclease (ENase) found in Rhodobacter sphaeroides, is an isoschizomer of the EcoRI ENase. A clone containing an 11-kb BamHI fragment was isolated from an R. sphaeroides genomic DNA library by hybridization with synthetic oligodeoxyribonucleotide probes based on the N-terminal amino acid (aa) sequence of RsrI. Extracts of E. coli containing a subclone of the 11-kb fragment display RsrI activity. Nucleotide sequence analysis reveals an 831-bp open reading frame encoding a polypeptide of 277 aa. A 50% identity exists within a 266-aa overlap between the deduced aa sequences of RsrI and EcoRI. Regions of 75-100% aa sequence identity correspond to key structural and functional regions of EcoRI. The type-II ENases have many common properties, and a common origin might have been expected. Nevertheless, this is the first demonstration of aa sequence similarity between ENases produced by different organisms. PMID:2695392
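
The figures in this abstract are internally consistent: an 831-bp open reading frame corresponds to 831 / 3 = 277 codons. The "50% identity within a 266-aa overlap" is a percent-identity statistic over aligned positions. A minimal sketch of one common way to compute it is given below; the choice to exclude gapped columns from the denominator is an assumption (definitions vary between tools), and the aligned strings are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of identical residues over columns where neither row has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [(x, y) for x, y in zip(aligned_a, aligned_b)
               if x != gap and y != gap]
    if not columns:
        raise ValueError("no ungapped columns to compare")
    matches = sum(x == y for x, y in columns)
    return 100.0 * matches / len(columns)


# ORF length check from the abstract: 831 bp encodes 277 codons.
assert 831 // 3 == 277 and 831 % 3 == 0

# Hypothetical aligned fragments (not RsrI or EcoRI sequence).
print(round(percent_identity("MKT-LLVA", "MRTALL-A"), 1))  # 83.3
```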

  6. Composition for nucleic acid sequencing

    DOEpatents

    Korlach, Jonas; Webb, Watt W.; Levene, Michael; Turner, Stephen; Craighead, Harold G.; Foquet, Mathieu

    2008-08-26

    The present invention is directed to a method of sequencing a target nucleic acid molecule having a plurality of bases. In its principle, the temporal order of base additions during the polymerization reaction is measured on a molecule of nucleic acid, i.e. the activity of a nucleic acid polymerizing enzyme on the template nucleic acid molecule to be sequenced is followed in real time. The sequence is deduced by identifying which base is being incorporated into the growing complementary strand of the target nucleic acid by the catalytic activity of the nucleic acid polymerizing enzyme at each step in the sequence of base additions. A polymerase on the target nucleic acid molecule complex is provided in a position suitable to move along the target nucleic acid molecule and extend the oligonucleotide primer at an active site. A plurality of labelled types of nucleotide analogs are provided proximate to the active site, with each distinguishable type of nucleotide analog being complementary to a different nucleotide in the target nucleic acid sequence. The growing nucleic acid strand is extended by using the polymerase to add a nucleotide analog to the nucleic acid strand at the active site, where the nucleotide analog being added is complementary to the nucleotide of the target nucleic acid at the active site. The nucleotide analog added to the oligonucleotide primer as a result of the polymerizing step is identified. The steps of providing labelled nucleotide analogs, polymerizing the growing nucleic acid strand, and identifying the added nucleotide analog are repeated so that the nucleic acid strand is further extended and the sequence of the target nucleic acid is determined.

  7. Sequence of cDNA for rat cystathionine gamma-lyase and comparison of deduced amino acid sequence with related Escherichia coli enzymes.

    PubMed Central

    Erickson, P F; Maxwell, I H; Su, L J; Baumann, M; Glode, L M

    1990-01-01

    A cDNA clone for cystathionine gamma-lyase was isolated from a rat cDNA library in lambda gt11 by screening with a monospecific antiserum. The identity of this clone, containing 600 bp proximal to the 3'-end of the gene, was confirmed by positive hybridization selection. Northern-blot hybridization showed the expected higher abundance of the corresponding mRNA in liver than in brain. Two further cDNA clones from a plasmid pcD library were isolated by colony hybridization with the first clone and were found to contain inserts of 1600 and 1850 bp. One of these was confirmed as encoding cystathionine gamma-lyase by hybridization with two independent pools of oligodeoxynucleotides corresponding to partial amino acid sequence information for cystathionine gamma-lyase. The other clone (estimated to represent all but 8% of the 5'-end of the mRNA) was sequenced and its deduced amino acid sequence showed similarity to those of the Escherichia coli enzymes cystathionine beta-lyase and cystathionine gamma-synthase throughout its length, especially to that of the latter. PMID:2201285

  8. Sequence Comparison and Phylogeny of Nucleotide Sequence of Coat Protein and Nucleic Acid Binding Protein of a Distinct Isolate of Shallot virus X from India.

    PubMed

    Majumder, S; Baranwal, V K

    2011-06-01

    Shallot virus X (ShVX), a type species in the genus Allexivirus of the family Alphaflexiviridae, has been associated with shallot plants in India and other shallot-growing countries such as Russia, Germany, the Netherlands, and New Zealand. The coat protein (CP) and nucleic acid binding protein (NB) regions of the virus were obtained by reverse transcriptase polymerase chain reaction from scale leaves of shallot bulbs. The partial cDNA contained two open reading frames encoding proteins of molecular weights of 28.66 and 14.18 kDa belonging to the Flexi_CP super-family and the viral NB super-family, respectively. The percent identity and phylogenetic analysis of amino acid sequences of the CP and NB regions of the virus associated with shallot indicated that it was a distinct isolate of ShVX. PMID:23637504

  9. CSTX-9, a toxic peptide from the spider Cupiennius salei: amino acid sequence, disulphide bridge pattern and comparison with other spider toxins containing the cystine knot structure.

    PubMed

    Schalle, J; Kämpfer, U; Schürch, S; Kuhn-Nentwig, L; Haeberli, S; Nentwig, W

    2001-09-01

    CSTX-9 (68 residues, 7530.9 Da) is one of the most abundant toxic polypeptides in the venom of the wandering spider Cupiennius salei. The amino acid sequence was determined by Edman degradation using reduced and alkylated CSTX-9 and peptides generated by cleavages with endoproteinase Asp-N and trypsin, respectively. Sequence comparison with CSTX-1, the most abundant and the most toxic polypeptide in the crude spider venom, revealed a high degree of similarity (53% identity). By means of limited proteolysis with immobilised trypsin and RP-HPLC, the cystine-containing peptides of CSTX-9 were isolated and the disulphide bridges were assigned by amino acid analysis, Edman degradation and nanospray tandem mass spectrometry. The four disulphide bonds present in CSTX-9 are arranged in the following pattern: 1-4, 2-5, 3-8 and 6-7 (Cys6-Cys21, Cys13-Cys30, Cys20-Cys48, Cys32-Cys46). Sequence comparison of CSTX-1 with CSTX-9 clearly indicates the same disulphide bridge pattern, which is also found in other spider polypeptide toxins, e.g. agatoxins (omega-AGA-IVA, omega-AGA-IVB, mu-AGA-I and mu-AGA-VI) from Agelenopsis aperta, SNX-325 from Segestria florentina and curtatoxins (CT-I, CT-II and CT-III) from Hololena curta. CSTX-1/CSTX-9 belong to the family of ion channel toxins containing the inhibitor cystine knot structural motif. CSTX-9, lacking the lysine-rich C-terminal tail of CSTX-1, exhibits a ninefold lower toxicity to Drosophila melanogaster than CSTX-1. This is in accordance with previous observations of CSTX-2a and CSTX-2b, two truncated forms of CSTX-1 which, like CSTX-9, also lack the C-terminal lysine-rich tail. PMID:11693532
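
The disulphide pattern above is given twice, once as ordinal cysteine numbers (1-4, 2-5, 3-8, 6-7) and once as residue positions. The two notations can be checked against each other with a short sketch; the list of cysteine positions is taken from the residue pairs quoted in the abstract, and the variable names are illustrative.

```python
# Cysteine residue positions named in the abstract, in sequence order.
cys_positions = [6, 13, 20, 21, 30, 32, 46, 48]

# Disulphide pattern expressed as ordinal cysteine numbers (1-based).
pattern = [(1, 4), (2, 5), (3, 8), (6, 7)]

# Convert each ordinal pair to a pair of residue positions.
bridges = [(cys_positions[i - 1], cys_positions[j - 1]) for i, j in pattern]

for first, second in bridges:
    print(f"Cys{first}-Cys{second}")
# Cys6-Cys21
# Cys13-Cys30
# Cys20-Cys48
# Cys32-Cys46
```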

  10. High speed nucleic acid sequencing

    DOEpatents

    Korlach, Jonas; Webb, Watt W.; Levene, Michael; Turner, Stephen; Craighead, Harold G.; Foquet, Mathieu

    2011-05-17

    The present invention is directed to a method of sequencing a target nucleic acid molecule having a plurality of bases. In its principle, the temporal order of base additions during the polymerization reaction is measured on a molecule of nucleic acid. Each type of labeled nucleotide comprises an acceptor fluorophore attached to a phosphate portion of the nucleotide such that the fluorophore is removed upon incorporation into a growing strand. Fluorescent signal is emitted via fluorescent resonance energy transfer between the donor fluorophore and the acceptor fluorophore as each nucleotide is incorporated into the growing strand. The sequence is deduced by identifying which base is being incorporated into the growing strand.

  11. COMPARISON OF PHYLOGENETIC RELATIONSHIPS BASED ON PHOSPHOLIPID FATTY ACID PROFILES AND RIBOSOMAL RNA SEQUENCE SIMILARITIES AMONG DISSIMILATORY SULFATE-REDUCING BACTERIA

    EPA Science Inventory

    Twenty-five isolates of dissimilatory sulfate-reducing bacteria were clustered based on similarity analysis of their phospholipid ester-linked fatty acids (PLFA). Of these, twenty-three showed the phylogenetic relationships based on the sequence similarity of their 16S rRNA direct...

  12. Chip-based sequencing nucleic acids

    DOEpatents

    Beer, Neil Reginald

    2014-08-26

    A system for fast DNA sequencing by amplification of genetic material within microreactors, denaturing, demulsifying, and then sequencing the material, while retaining it in a PCR/sequencing zone by a magnetic field. One embodiment includes sequencing nucleic acids on a microchip that includes a microchannel flow channel in the microchip. The nucleic acids are isolated and hybridized to magnetic nanoparticles or to magnetic polystyrene-coated beads. Microreactor droplets are formed in the microchannel flow channel. The microreactor droplets containing the nucleic acids and the magnetic nanoparticles are retained in a magnetic trap in the microchannel flow channel and sequenced.

  13. A simple method for global sequence comparison.

    PubMed Central

    Pizzi, E; Attimonelli, M; Liuni, S; Frontali, C; Saccone, C

    1992-01-01

    A simple method of sequence comparison, based on a correlation analysis of oligonucleotide frequency distributions, is here shown to be a reliable test of overall sequence similarity. The method does not involve sequence alignment procedures and permits the rapid screening of large amounts of sequence data. It identifies those sequences which deserve more careful analysis of sequence similarity at the level of resolution of the single nucleotide. It uses observed quantities only and does not involve the adoption of any theoretical model. PMID:1738591
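
An alignment-free comparison of this kind can be sketched by counting oligonucleotides (k-mers) in each sequence and correlating the two count vectors. The abstract does not state which correlation statistic or oligonucleotide length the authors used, so the Pearson coefficient, the parameter `k`, and the function names below are assumptions.

```python
from itertools import product
from math import sqrt


def kmer_counts(seq: str, k: int) -> list[int]:
    """Counts of every possible DNA k-mer, in lexicographic k-mer order."""
    seq = seq.upper()
    counts = {"".join(p): 0 for p in product("ACGT", repeat=k)}
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        if word in counts:  # skip words containing ambiguous bases
            counts[word] += 1
    return [counts[word] for word in sorted(counts)]


def pearson(x: list[int], y: list[int]) -> float:
    """Pearson correlation coefficient of two equal-length vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation undefined for a constant vector")
    return cov / sqrt(var_x * var_y)


def kmer_similarity(seq_a: str, seq_b: str, k: int = 2) -> float:
    """Correlation between the k-mer count profiles of two sequences."""
    return pearson(kmer_counts(seq_a, k), kmer_counts(seq_b, k))
```

A call such as `kmer_similarity(seq_a, seq_b, k=3)` returns a value between -1 and 1. No alignment is performed, so the cost grows linearly with sequence length, which is what makes this style of screening fast.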

  14. Distinguishing Proteins From Arbitrary Amino Acid Sequences

    PubMed Central

    Yau, Stephen S.-T.; Mao, Wei-Guang; Benson, Max; He, Rong Lucy

    2015-01-01

    What kinds of amino acid sequences could possibly be protein sequences? From all existing databases that we can find, known proteins are only a small fraction of all possible combinations of amino acids. Beginning with Sanger's first detailed determination of a protein sequence in 1952, previous studies have focused on describing the structure of existing protein sequences in order to construct the protein universe. No one, however, has developed a criterion for determining whether an arbitrary amino acid sequence can be a protein. Here we show that when the collection of arbitrary amino acid sequences is viewed in an appropriate geometric context, the protein sequences cluster together. This leads to a new computational test, described here, that has proved to be remarkably accurate at determining whether an arbitrary amino acid sequence can be a protein. Even more, if the results of this test indicate that the sequence can be a protein, and it is indeed a protein sequence, then its identity as a protein sequence is uniquely defined. We anticipate our computational test will be useful for those who are attempting to complete the job of discovering all proteins, or constructing the protein universe. PMID:25609314

  15. Method for sequencing nucleic acid molecules

    DOEpatents

    Korlach, Jonas; Webb, Watt W.; Levene, Michael; Turner, Stephen; Craighead, Harold G.; Foquet, Mathieu

    2006-05-30

    The present invention is directed to a method of sequencing a target nucleic acid molecule having a plurality of bases. In its principle, the temporal order of base additions during the polymerization reaction is measured on a molecule of nucleic acid, i.e. the activity of a nucleic acid polymerizing enzyme on the template nucleic acid molecule to be sequenced is followed in real time. The sequence is deduced by identifying which base is being incorporated into the growing complementary strand of the target nucleic acid by the catalytic activity of the nucleic acid polymerizing enzyme at each step in the sequence of base additions. A polymerase on the target nucleic acid molecule complex is provided in a position suitable to move along the target nucleic acid molecule and extend the oligonucleotide primer at an active site. A plurality of labelled types of nucleotide analogs are provided proximate to the active site, with each distinguishable type of nucleotide analog being complementary to a different nucleotide in the target nucleic acid sequence. The growing nucleic acid strand is extended by using the polymerase to add a nucleotide analog to the nucleic acid strand at the active site, where the nucleotide analog being added is complementary to the nucleotide of the target nucleic acid at the active site. The nucleotide analog added to the oligonucleotide primer as a result of the polymerizing step is identified. The steps of providing labelled nucleotide analogs, polymerizing the growing nucleic acid strand, and identifying the added nucleotide analog are repeated so that the nucleic acid strand is further extended and the sequence of the target nucleic acid is determined.

  16. Method for sequencing nucleic acid molecules

    DOEpatents

    Korlach, Jonas; Webb, Watt W.; Levene, Michael; Turner, Stephen; Craighead, Harold G.; Foquet, Mathieu

    2006-06-06

    The present invention is directed to a method of sequencing a target nucleic acid molecule having a plurality of bases. In its principle, the temporal order of base additions during the polymerization reaction is measured on a molecule of nucleic acid, i.e. the activity of a nucleic acid polymerizing enzyme on the template nucleic acid molecule to be sequenced is followed in real time. The sequence is deduced by identifying which base is being incorporated into the growing complementary strand of the target nucleic acid by the catalytic activity of the nucleic acid polymerizing enzyme at each step in the sequence of base additions. A polymerase on the target nucleic acid molecule complex is provided in a position suitable to move along the target nucleic acid molecule and extend the oligonucleotide primer at an active site. A plurality of labelled types of nucleotide analogs are provided proximate to the active site, with each distinguishable type of nucleotide analog being complementary to a different nucleotide in the target nucleic acid sequence. The growing nucleic acid strand is extended by using the polymerase to add a nucleotide analog to the nucleic acid strand at the active site, where the nucleotide analog being added is complementary to the nucleotide of the target nucleic acid at the active site. The nucleotide analog added to the oligonucleotide primer as a result of the polymerizing step is identified. The steps of providing labelled nucleotide analogs, polymerizing the growing nucleic acid strand, and identifying the added nucleotide analog are repeated so that the nucleic acid strand is further extended and the sequence of the target nucleic acid is determined.

  17. Method and apparatus for biological sequence comparison

    DOEpatents

    Marr, T.G.; Chang, W.I.

    1997-12-23

    A method and apparatus are disclosed for comparing biological sequences from a known source of sequences, with a subject (query) sequence. The apparatus takes as input a set of target similarity levels (such as evolutionary distances in units of PAM), and finds all fragments of known sequences that are similar to the subject sequence at each target similarity level, and are long enough to be statistically significant. The invention device filters out fragments from the known sequences that are too short, or have a lower average similarity to the subject sequence than is required by each target similarity level. The subject sequence is then compared only to the remaining known sequences to find the best matches. The filtering member divides the subject sequence into overlapping blocks, each block being sufficiently large to contain a minimum-length alignment from a known sequence. For each block, the filter member compares the block with every possible short fragment in the known sequences and determines a best match for each comparison. The determined set of short fragment best matches for the block provide an upper threshold on alignment values. Regions of a certain length from the known sequences that have a mean alignment value upper threshold greater than a target unit score are concatenated to form a union. The current block is compared to the union and provides an indication of best local alignment with the subject sequence. 5 figs.
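
The blocking step described above, dividing the query into overlapping blocks each large enough to contain a minimum-length alignment, can be sketched as follows. The patent abstract does not give the block size or overlap, so the choice here (blocks of twice the minimum length, stepping by the minimum length) is an assumption. It is chosen because it guarantees that every window of `min_len` residues falls entirely inside at least one block.

```python
def overlapping_blocks(query: str, min_len: int) -> list[tuple[int, str]]:
    """Split `query` into blocks of length 2 * min_len that overlap by min_len.

    Returns (start_offset, block) pairs. Blocks near the end of the
    query may be shorter than 2 * min_len.
    """
    if min_len <= 0:
        raise ValueError("min_len must be positive")
    last_start = max(len(query) - min_len, 0)
    return [(start, query[start:start + 2 * min_len])
            for start in range(0, last_start + 1, min_len)]


# Hypothetical 10-residue query with a minimum alignment length of 3.
for start, block in overlapping_blocks("ABCDEFGHIJ", 3):
    print(start, block)
# 0 ABCDEF
# 3 DEFGHI
# 6 GHIJ
```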

  18. Method and apparatus for biological sequence comparison

    DOEpatents

    Marr, Thomas G.; Chang, William I-Wei

    1997-01-01

    A method and apparatus for comparing biological sequences from a known source of sequences, with a subject (query) sequence. The apparatus takes as input a set of target similarity levels (such as evolutionary distances in units of PAM), and finds all fragments of known sequences that are similar to the subject sequence at each target similarity level, and are long enough to be statistically significant. The invention device filters out fragments from the known sequences that are too short, or have a lower average similarity to the subject sequence than is required by each target similarity level. The subject sequence is then compared only to the remaining known sequences to find the best matches. The filtering member divides the subject sequence into overlapping blocks, each block being sufficiently large to contain a minimum-length alignment from a known sequence. For each block, the filter member compares the block with every possible short fragment in the known sequences and determines a best match for each comparison. The determined set of short fragment best matches for the block provide an upper threshold on alignment values. Regions of a certain length from the known sequences that have a mean alignment value upper threshold greater than a target unit score are concatenated to form a union. The current block is compared to the union and provides an indication of best local alignment with the subject sequence.

  19. Classification and identification of geminiviruses using sequence comparisons.

    PubMed

    Padidam, M; Beachy, R N; Fauquet, C M

    1995-02-01

    The genomes and ORFs of 36 geminiviruses were compared to obtain phylogenetic trees and frequency distributions of all possible pairwise comparisons with an objective to classify geminiviruses. Such comparisons show that geminiviruses form two distinct clusters of leafhopper-transmitted viruses that infect monocots (subgroup I) and whitefly-transmitted viruses that infect dicots (subgroup III), irrespective of the part of the genome considered. Of the two leafhopper-transmitted viruses that infect dicots, tobacco yellow dwarf virus has a sequence most similar to subgroup I viruses, and that of beet curly top virus differed depending upon the ORF considered. The distributions of identities within subgroups are significantly different suggesting that the taxonomic status of a particular isolate within a subgroup can be quantified. All the recognized strains of any one virus have greater than 90% sequence identity. It was observed that the 200 nucleotide intercistronic regions of geminiviruses are more variable than the remainder of the genome. The amino acid sequences of the coat protein (CP) of subgroup III viruses are more conserved than the remainder of the genome. However, a short N-terminal region (60-70 amino acids) of the CP is more variable than the rest of the CP sequence and is a close representation of the genome. PCR primers based on conserved sequences can be used to clone and sequence the N-terminal sequences of the CP of the geminiviruses; this sequence is sufficient to classify a virus isolate. A possible taxonomic structure for geminiviruses is proposed after considering the sequence comparisons and biological properties. PMID:7844548

  20. Amino acid sequence of Salmonella typhimurium branched-chain amino acid aminotransferase.

    PubMed

    Feild, M J; Nguyen, D C; Armstrong, F B

    1989-06-13

    The complete amino acid sequence of the subunit of branched-chain amino acid aminotransferase (transaminase B, EC 2.6.1.42) of Salmonella typhimurium was determined. An Escherichia coli recombinant containing the ilvGEDAY gene cluster of Salmonella was used as the source of the hexameric enzyme. The peptide fragments used for sequencing were generated by treatment with trypsin, Staphylococcus aureus V8 protease, endoproteinase Lys-C, and cyanogen bromide. The enzyme subunit contains 308 residues and has a molecular weight of 33,920. To determine the coenzyme-binding site, the pyridoxal 5-phosphate containing enzyme was treated with tritiated sodium borohydride prior to trypsin digestion. Peptide map comparisons with an apoenzyme tryptic digest and monitoring radioactivity incorporation allowed identification of the pyridoxylated peptide, which was then isolated and sequenced. The coenzyme-binding site is the lysyl residue at position 159. The amino acid sequence of Salmonella transaminase B is 97.4% identical with that of Escherichia coli, differing in only eight amino acid positions. Sequence comparisons of transaminase B to other known aminotransferase sequences revealed limited sequence similarity (24-33%) when conserved amino acid substitutions are allowed and alignments were forced to occur on the coenzyme-binding site. PMID:2669973

  1. Molecular dynamic simulations of environment and sequence dependent DNA conformations: the development of the BMS nucleic acid force field and comparison with experimental results.

    PubMed

    Langley, D R

    1998-12-01

    Molecular dynamic (MD) simulations using the BMS nucleic acid force field produce environment and sequence dependent DNA conformations that closely mimic experimentally derived structures. The parameters were initially developed to reproduce the potential energy surface, as defined by quantum mechanics, for a set of small molecules that can be used as the building blocks for nucleic acid macromolecules (dimethyl phosphate, cyclopentane, tetrahydrofuran, etc.). Then the dihedral parameters were fine tuned using a series of condensed phase MD simulations of DNA and RNA (in zero added salt, 4M NaCl, and 75% ethanol solutions). In the tuning process the free energy surface for each dihedral was derived from the MD ensemble and fitted to the conformational distributions and populations observed in 87 A- and B-DNA x-ray and 17 B-DNA NMR structures. Over 41 nanoseconds of MD simulations are presented which demonstrate that the force field is capable of producing stable trajectories, in the correct environments, of A-DNA, double stranded A-form RNA, B-DNA, Z-DNA, and a netropsin-DNA complex that closely reproduce the experimentally determined and/or canonical DNA conformations. Frequently the MD averaged structure is closer to the experimentally determined structure than to the canonical DNA conformation. MD simulations of A- to B- and B- to A-DNA transitions are also shown. A-DNA simulations in a low salt environment cleanly convert into the B-DNA conformation and converge into the RMS space sampled by a low salt simulation of the same sequence starting from B-DNA. In MD simulations using the BMS force field the B-form of d(GGGCCC)2 in a 75% ethanol solution converts into the A-form. Using the same methodology, parameters, and conditions the A-form of d(AAATTT)2 correctly converts into the B-DNA conformation. These studies demonstrate that the force field is capable of reproducing both environment and sequence dependent DNA structures. The 41 nanoseconds (nsec) of MD

  2. Adaptive seeds tame genomic sequence comparison.

    PubMed

    Kiełbasa, Szymon M; Wan, Raymond; Sato, Kengo; Horton, Paul; Frith, Martin C

    2011-03-01

    The main way of analyzing biological sequences is by comparing and aligning them to each other. It remains difficult, however, to compare modern multi-billionbase DNA data sets. The difficulty is caused by the nonuniform (oligo)nucleotide composition of these sequences, rather than their size per se. To solve this problem, we modified the standard seed-and-extend approach (e.g., BLAST) to use adaptive seeds. Adaptive seeds are matches that are chosen based on their rareness, instead of using fixed-length matches. This method guarantees that the number of matches, and thus the running time, increases linearly, instead of quadratically, with sequence length. LAST, our open source implementation of adaptive seeds, enables fast and sensitive comparison of large sequences with arbitrarily nonuniform composition. PMID:21209072
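
The idea of choosing seeds by rareness rather than by fixed length can be illustrated with a deliberately naive sketch: starting at each query position, the match is lengthened until it occurs in the target no more than a chosen number of times. This is not LAST's implementation, which relies on an index rather than repeated scanning. The function names, the `max_hits` parameter, and the toy sequences are assumptions.

```python
def count_occurrences(text: str, pattern: str) -> int:
    """Number of (possibly overlapping) occurrences of pattern in text."""
    count = 0
    start = text.find(pattern)
    while start != -1:
        count += 1
        start = text.find(pattern, start + 1)
    return count


def adaptive_seeds(query: str, target: str, max_hits: int) -> list[tuple[int, str, int]]:
    """For each query position, the shortest match occurring at most
    `max_hits` times in `target`, as (query_offset, seed, hit_count)."""
    seeds = []
    for i in range(len(query)):
        for end in range(i + 1, len(query) + 1):
            hits = count_occurrences(target, query[i:end])
            if hits == 0:
                break  # longer extensions cannot match either
            if hits <= max_hits:
                seeds.append((i, query[i:end], hits))
                break  # keep the shortest sufficiently rare match
    return seeds


# Hypothetical toy sequences.
print(adaptive_seeds("CGTT", "ACGTACGTTTGA", max_hits=1))
# [(0, 'CGTT', 1), (1, 'GTT', 1)]
```

Common words such as `C` or `CGT` are extended until they become rare, so repetitive regions yield fewer, longer seeds instead of a flood of short matches.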

  3. Quantitation of HIV-1 RNA viral load using nucleic acid sequence based amplification methodology and comparison with other surrogate markers for disease progression.

    PubMed

    Sitnik, R; Pinho, J R

    1998-01-01

    In this study, HIV-1 viral blood quantitation determined by Nucleic Acid Sequence Based Amplification (NASBA) was compared with other surrogate disease progression markers (antigen p24, CD4/CD8 cell counts and beta-2 microglobulin) in 540 patients followed up at São Paulo, SP, Brazil. HIV-1 RNA detection was statistically associated with the presence of antigen p24, but the viral RNA was also detected in 68% of the antigen p24 negative samples, confirming that NASBA is much more sensitive than the determination of antigen p24. Regarding other surrogate markers, no statistically significant association with the detection of viral RNA was found. The reproducibility of this viral load assay was assessed by 14 runs of the same sample, using different reagent batches. Viral load values in this sample ranged from 5.83 to 6.27 log (CV = 36%), less than the range (0.5 log) established for determining significant viral load changes. PMID:9698880

  4. Phenolic acid esterases, coding sequences and methods

    DOEpatents

    Blum, David L.; Kataeva, Irina; Li, Xin-Liang; Ljungdahl, Lars G.

    2002-01-01

    Described herein are four phenolic acid esterases, three of which correspond to domains of previously unknown function within bacterial xylanases, from XynY and XynZ of Clostridium thermocellum and from a xylanase of Ruminococcus. The fourth specifically exemplified xylanase is a protein encoded within the genome of Orpinomyces PC-2. The amino acids of these polypeptides and nucleotide sequences encoding them are provided. Recombinant host cells, expression vectors and methods for the recombinant production of phenolic acid esterases are also provided.

  5. Amino-Acid Sequence of Porcine Pepsin

    PubMed Central

    Tang, J.; Sepulveda, P.; Marciniszyn, J.; Chen, K. C. S.; Huang, W-Y.; Tao, N.; Liu, D.; Lanier, J. P.

    1973-01-01

    As the culmination of several years of experiments, we propose a complete amino-acid sequence for porcine pepsin, an enzyme containing 327 amino-acid residues in a single polypeptide chain. In the sequence determination, the enzyme was treated with cyanogen bromide. Five resulting fragments were purified. The amino-acid sequence of four of the fragments accounted for 290 residues. Because the structure of a 37-residue carboxyl-terminal fragment was already known, it was not studied. The alignment of these fragments was determined from the sequence of methionyl-peptides we had previously reported. We also discovered the locations of active-site aspartyl residues, as well as the pairing of the three disulfide bridges. A minor component of commercial crystalline pepsin was found to contain two extra amino-acid residues, Ala-Leu-, at the amino-terminus of the molecule. This minor component was apparently derived from a different site of cleavage during the activation of porcine pepsinogen. PMID:4587252

  6. Method for identifying and quantifying nucleic acid sequence aberrations

    DOEpatents

    Lucas, Joe N.; Straume, Tore; Bogen, Kenneth T.

    1998-01-01

    A method for detecting nucleic acid sequence aberrations by detecting nucleic acid sequences having both a first and a second nucleic acid sequence type, the presence of the first and second sequence type on the same nucleic acid sequence indicating the presence of a nucleic acid sequence aberration. The method uses a first hybridization probe which includes a nucleic acid sequence that is complementary to a first sequence type and a first complexing agent capable of attaching to a second complexing agent and a second hybridization probe which includes a nucleic acid sequence that selectively hybridizes to the second nucleic acid sequence type over the first sequence type and includes a detectable marker for detecting the second hybridization probe.

  7. Method for identifying and quantifying nucleic acid sequence aberrations

    DOEpatents

    Lucas, J.N.; Straume, T.; Bogen, K.T.

    1998-07-21

    A method is disclosed for detecting nucleic acid sequence aberrations by detecting nucleic acid sequences having both a first and a second nucleic acid sequence type, the presence of the first and second sequence type on the same nucleic acid sequence indicating the presence of a nucleic acid sequence aberration. The method uses a first hybridization probe which includes a nucleic acid sequence that is complementary to a first sequence type and a first complexing agent capable of attaching to a second complexing agent and a second hybridization probe which includes a nucleic acid sequence that selectively hybridizes to the second nucleic acid sequence type over the first sequence type and includes a detectable marker for detecting the second hybridization probe. 11 figs.

  8. Methods for analyzing nucleic acid sequences

    DOEpatents

    Korlach, Jonas; Webb, Watt W.; Levene, Michael; Turner, Stephen; Craighead, Harold G.; Foquet, Mathieu

    2011-05-17

    The present invention is directed to a method of sequencing a target nucleic acid. The method provides a complex comprising a polymerase enzyme, a target nucleic acid molecule, and a primer, wherein the complex is immobilized on a support. A fluorescent label is attached to a terminal phosphate group of the nucleotide or nucleotide analog. The growing nucleic acid strand is extended by using the polymerase to add a nucleotide analog to the nucleic acid strand. The nucleotide analog added to the oligonucleotide primer as a result of the polymerizing step is identified. The time duration of the signal from labeled nucleotides or nucleotide analogs that become incorporated is distinguished from freely diffusing labels by a longer retention in the observation volume for the nucleotides or nucleotide analogs that become incorporated than for the freely diffusing labels.

  9. The nucleotide sequence of the mouse immunoglobulin epsilon gene: comparison with the human epsilon gene sequence.

    PubMed Central

    Ishida, N; Ueda, S; Hayashida, H; Miyata, T; Honjo, T

    1982-01-01

    We have determined the nucleotide sequence of the immunoglobulin epsilon gene cloned from newborn mouse DNA. The epsilon gene sequence allows prediction of the amino acid sequence of the constant region of the epsilon chain and comparison of it with sequences of the human epsilon and other mouse immunoglobulin genes. The epsilon gene was shown to be under the weakest selection pressure at the protein level among the immunoglobulin genes although the divergence at the synonymous position is similar. Our results suggest that the epsilon gene may be dispensable, which is in accord with the fact that IgE has only obscure roles in the immune defense system but has an undesirable role as a mediator of hypersensitivity. The sequence data suggest that the human and murine epsilon genes were derived from different ancestors duplicated a long time ago. The amino acid sequence of the epsilon chain is more homologous to those of the gamma chains than the other mouse heavy chains. Two membrane exons, separated by an 80-base intron, were identified 1.7 kb 3' to the CH4 domain of the epsilon gene and shown to conserve a hydrophobic portion similar to those of other heavy chain genes. RNA blot hybridization showed that the epsilon membrane exons are transcribed into two species of mRNA in an IgE hybridoma. PMID:6329728

  10. Amino acid sequence of bovine heart coupling factor 6.

    PubMed Central

    Fang, J K; Jacobs, J W; Kanner, B I; Racker, E; Bradshaw, R A

    1984-01-01

    The amino acid sequence of bovine heart mitochondrial coupling factor 6 (F6) has been determined by automated Edman degradation of the whole protein and derived peptides. Preparations based on heat precipitation and ethanol extraction showed allotypic variation at three positions while material further purified by HPLC yielded only one sequence that also differed by a Phe-Thr replacement at residue 62. The mature protein contains 76 amino acids with a calculated molecular weight of 9006 and a pI of approximately 5, in good agreement with experimentally measured values. The charged amino acids are mainly clustered at the termini and in one section in the middle; these three polar segments are separated by two segments relatively rich in nonpolar residues. Chou-Fasman analysis suggests three stretches of alpha-helix coinciding with (or within) the high-charge-density sequences, with a single beta-turn at the first polar-nonpolar junction. Comparison of the F6 sequence with those of other proteins did not reveal any homologous structures. PMID:6149548

  11. Protein sequence comparison and protein evolution

    SciTech Connect

    Pearson, W.R.

    1995-12-31

    This tutorial was one of eight tutorials selected to be presented at the Third International Conference on Intelligent Systems for Molecular Biology which was held in the United Kingdom from July 16 to 19, 1995. This tutorial examines how the information conserved during the evolution of a protein molecule can be used to reliably infer homology, and thus a shared protein fold and possibly a shared active site or function. The authors start by reviewing a geological/evolutionary time scale. Next they look at the evolution of several protein families. During the tutorial, these families will be used to demonstrate that homologous protein ancestry can be inferred with confidence. They also examine different modes of protein evolution and consider some hypotheses that have been presented to explain the very earliest events in protein evolution. The next part of the tutorial will examine the technical aspects of protein sequence comparison. Both optimal and heuristic algorithms and their associated parameters that are used to characterize protein sequence similarities are discussed. Perhaps more importantly, they survey the statistics of local similarity scores, and how these statistics can be used both to improve the selectivity of a search and to evaluate the significance of a match. They then examine distantly related members of three protein families, the serine proteases, the glutathione transferases, and the G-protein-coupled receptors (GCRs). Finally, they discuss how sequence similarity can be used to examine internal repeated or mosaic structures in proteins.

  12. The amino acid sequence of rabbit muscle triose phosphate isomerase.

    PubMed Central

    Corran, P H; Waley, S G

    1975-01-01

    The amino acid sequence of rabbit muscle triose phosphate isomerase was deduced by characterizing peptides that overlap the tryptic peptides. Thiol groups were modified by oxidation, carboxymethylation or aminoethylation. About 50 peptides that provided information about overlaps were isolated; the peptides were mostly characterized by their compositions and N-terminal residues. The peptide chains contain 248 amino acid residues, and no evidence for dissimilarity of the two subunits that comprise the native enzyme was found. The sequence of the rabbit muscle enzyme may be compared with that of the coelacanth enzyme (Kolb et al., 1974): 84% of the residues are in identical positions. Similarly, comparison of the sequence with that inferred for the chicken enzyme (Furth et al., 1974) shows that 87% of the residues are in identical positions. Limited though these comparisons are, they suggest that triose phosphate isomerase has one of the lowest rates of evolutionary change. An extended version of the present paper has been deposited as Supplementary Publication SUP 50040 (42 pages) at the British Library (Lending Division) (formerly the National Lending Library for Science and Technology), Boston Spa, Yorks. LS23 7BQ, U.K., from whom copies can be obtained on the terms given in Biochem. J. (1975) 145, 5. PMID:1171682
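
    Figures such as "84% of the residues are in identical positions" come from counting matching residues across two aligned sequences. A minimal sketch follows, assuming the two sequences are already aligned to equal length and that positions containing a gap character are excluded from the denominator. The abstract does not state which convention the authors used, and the example sequences in the usage note are invented.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percentage of identical residues over gap-free aligned positions.

    Assumes both strings come from the same pairwise alignment, so they
    have equal length. Excluding gapped columns is one common convention,
    chosen here for illustration.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip columns where either sequence has a gap
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no gap-free positions to compare")
    return 100.0 * identical / compared


if __name__ == "__main__":
    # Invented toy sequences, not the real isomerase sequences.
    print(percent_identity("ACDEFGHIKL", "ACDEFGHIKV"))
```

    Other denominators (alignment length, or the length of the shorter sequence) give different percentages for the same alignment, so the convention should be stated whenever such a number is reported.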

  13. Single-channel studies on linear gramicidins with altered amino acid sequences. A comparison of phenylalanine, tryptophane, and tyrosine substitutions at positions 1 and 11.

    PubMed Central

    Mazet, J L; Andersen, O S; Koeppe, R E

    1984-01-01

    The relation between chemical structure and permeability characteristics of transmembrane channels has been investigated with the linear gramicidins (A, B, and C), where the amino acid at position 1 was chemically replaced by phenylalanine, tryptophane or tyrosine. The purity of most of the compounds was estimated to be greater than 99.99%. The modifications resulted in a wide range of conductance changes in NaCl solutions: sixfold from tryptophane gramicidin A to tyrosine gramicidin B. The conductance changes induced by a given amino acid substitution at position 1 are not the same as at position 11. The only important change in the Na+ affinity was observed when the first amino acid was tyrosine. No major conformational changes of the polypeptide backbone structure could be detected on the basis of experiments with mixtures of different analogues and valine gramicidin A (except possibly with tyrosine at position 1), as all the compounds investigated could form hybrid channels with valine gramicidin A. The side chains are not in direct contact with the permeating ions. The results were therefore interpreted in terms of modifications of the energy profile for ion movement through the channel, possibly due to an electrostatic interaction between the dipoles of the side chains and ions in the channel. PMID:6201199

  14. Detection of nucleic acid sequences by invader-directed cleavage

    DOEpatents

    Brow, Mary Ann D.; Hall, Jeff Steven Grotelueschen; Lyamichev, Victor; Olive, David Michael; Prudent, James Robert

    1999-01-01

    The present invention relates to means for the detection and characterization of nucleic acid sequences, as well as variations in nucleic acid sequences. The present invention also relates to methods for forming a nucleic acid cleavage structure on a target sequence and cleaving the nucleic acid cleavage structure in a site-specific manner. The 5' nuclease activity of a variety of enzymes is used to cleave the target-dependent cleavage structure, thereby indicating the presence of specific nucleic acid sequences or specific variations thereof. The present invention further relates to methods and devices for the separation of nucleic acid molecules based on charge.

  15. Hybridization and sequencing of nucleic acids using base pair mismatches

    DOEpatents

    Fodor, Stephen P. A.; Lipshutz, Robert J.; Huang, Xiaohua

    2001-01-01

    Devices and techniques for hybridization of nucleic acids and for determining the sequence of nucleic acids. Arrays of nucleic acids are formed by techniques, preferably high resolution, light-directed techniques. Positions of hybridization of a target nucleic acid are determined by, e.g., epifluorescence microscopy. Devices and techniques are proposed to determine the sequence of a target nucleic acid more efficiently and more quickly through such synthesis and detection techniques.

  16. 77 FR 65537 - Requirements for Patent Applications Containing Nucleotide Sequence and/or Amino Acid Sequence...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-10-29

    ... Amino Acid Sequence Disclosures ACTION: Proposed collection; comment request. SUMMARY: The United States....'' SUPPLEMENTARY INFORMATION: I. Abstract Patent applications that contain nucleotide and/or amino acid sequence disclosures must include a copy of the sequence listing in accordance with the requirements in 37 CFR...

  17. Partial amino acid sequence of human factor D: homology with serine proteases.

    PubMed Central

    Volanakis, J E; Bhown, A; Bennett, J C; Mole, J E

    1980-01-01

    Human factor D purified to homogeneity by a modified procedure was subjected to NH2-terminal amino acid sequence analysis by using a modified automated Beckman sequencer. We identified 48 of the first 57 NH2-terminal amino acids in a single sequencer run, using microgram quantities of factor D. The deduced amino acid sequence represents approximately 25% of the primary structure of factor D. This extended NH2-terminal amino acid sequence of factor D was compared to that of other trypsin-related serine proteases. By visual inspection, strong homologies (33-50% identity) were observed with all the serine proteases included in the comparison. Interestingly, factor D showed a higher degree of homology to serine proteases of pancreatic origin than to those of serum origin. PMID:6987665

  18. Predicting intrinsic disorder from amino acid sequence.

    PubMed

    Obradovic, Zoran; Peng, Kang; Vucetic, Slobodan; Radivojac, Predrag; Brown, Celeste J; Dunker, A Keith

    2003-01-01

    Blind predictions of intrinsic order and disorder were made on 42 proteins subsequently revealed to contain 9,044 ordered residues, 284 disordered residues in 26 segments of length 30 residues or less, and 281 disordered residues in 2 disordered segments of length greater than 30 residues. The accuracies of the six predictors used in this experiment ranged from 77% to 91% for the ordered regions and from 56% to 78% for the disordered segments. The average of the order and disorder predictions ranged from 73% to 77%. The prediction of disorder in the shorter segments was poor, from 25% to 66% correct, while the prediction of disorder in the longer segments was better, from 75% to 95% correct. Four of the predictors were composed of ensembles of neural networks. This enabled them to deal more efficiently with the large asymmetry in the training data through diversified sampling from the significantly larger ordered set and achieve better accuracy on ordered and long disordered regions. The exclusive use of long disordered regions for predictor training likely contributed to the disparity of the predictions on long versus short disordered regions, while averaging the output values over 61-residue windows to eliminate short predictions of order or disorder probably contributed to the even greater disparity for three of the predictors. This experiment supports the predictability of intrinsic disorder from amino acid sequence. PMID:14579347
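
    The abstract attributes part of the short-segment disparity to averaging predictor outputs over 61-residue windows. The effect of such smoothing can be sketched as follows. The function name, the centred window, and the truncation of the window at the sequence ends are assumptions, since the abstract does not describe how the predictors handled these details.

```python
def smooth_scores(scores, window=61):
    """Average per-residue scores over a centred sliding window.

    The window is truncated at the sequence ends (an assumption made
    for this sketch). The window size must be a positive odd number
    so that it can be centred on a residue.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    n = len(scores)
    # Prefix sums make each window average an O(1) lookup.
    prefix = [0.0]
    for s in scores:
        prefix.append(prefix[-1] + s)
    smoothed = []
    for i in range(n):
        lo = max(0, i - half)
        hi = min(n, i + half + 1)
        smoothed.append((prefix[hi] - prefix[lo]) / (hi - lo))
    return smoothed


if __name__ == "__main__":
    # A 5-residue stretch of high disorder score inside 200 ordered residues.
    raw = [0.0] * 100 + [1.0] * 5 + [0.0] * 95
    print(max(smooth_scores(raw)))  # well below a 0.5 decision threshold
```

    In this toy input, a five-residue run of maximal scores is averaged with 56 low-scoring neighbours, so it never approaches a 0.5 cutoff after smoothing. This is consistent with the abstract's remark that window averaging eliminates short predictions.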

  19. Identification of the bacteriophage T5 dUTPase by protein sequence comparisons.

    PubMed

    Kaliman, A V

    1996-01-01

    It is shown by protein sequence comparisons that a 148 amino acid open reading frame (ORF 148) located at 67% of the bacteriophage T5 genome encodes a protein with strong similarity to known dUTPases. This protein contains five characteristic amino acid sequence motifs that are common to the dUTPase gene family. A similarity in size and high degree of sequence identity strongly suggest that the protein encoded by the ORF 148 of bacteriophage T5 is dUTPase. PMID:8988373

  20. Comparison of Next-Generation Sequencing Systems

    PubMed Central

    Liu, Lin; Li, Yinhu; Li, Siliang; Hu, Ni; He, Yimin; Pong, Ray; Lin, Danni; Lu, Lihua; Law, Maggie

    2012-01-01

    With fast development and wide applications of next-generation sequencing (NGS) technologies, genomic sequence information is within reach to aid the achievement of goals to decode life mysteries, make better crops, detect pathogens, and improve life qualities. NGS systems are typically represented by SOLiD/Ion Torrent PGM from Life Sciences, Genome Analyzer/HiSeq 2000/MiSeq from Illumina, and GS FLX Titanium/GS Junior from Roche. Beijing Genomics Institute (BGI), which possesses the world's biggest sequencing capacity, has multiple NGS systems including 137 HiSeq 2000, 27 SOLiD, one Ion Torrent PGM, one MiSeq, and one 454 sequencer. We have accumulated extensive experience in sample handling, sequencing, and bioinformatics analysis. In this paper, technologies of these systems are reviewed, and first-hand data from extensive experience is summarized and analyzed to discuss the advantages and specifics associated with each sequencing system. Finally, applications of NGS are summarized. PMID:22829749

  1. Methods and compositions for efficient nucleic acid sequencing

    DOEpatents

    Drmanac, Radoje

    2002-01-01

    Disclosed are novel methods and compositions for rapid and highly efficient nucleic acid sequencing based upon hybridization with two sets of small oligonucleotide probes of known sequences. Extremely large nucleic acid molecules, including chromosomes and non-amplified RNA, may be sequenced without prior cloning or subcloning steps. The methods of the invention also solve various current problems associated with sequencing technology such as, for example, high noise to signal ratios and difficult discrimination, attaching many nucleic acid fragments to a surface, preparing many, longer or more complex probes and labelling more species.

  2. Methods and compositions for efficient nucleic acid sequencing

    DOEpatents

    Drmanac, Radoje

    2006-07-04

    Disclosed are novel methods and compositions for rapid and highly efficient nucleic acid sequencing based upon hybridization with two sets of small oligonucleotide probes of known sequences. Extremely large nucleic acid molecules, including chromosomes and non-amplified RNA, may be sequenced without prior cloning or subcloning steps. The methods of the invention also solve various current problems associated with sequencing technology such as, for example, high noise to signal ratios and difficult discrimination, attaching many nucleic acid fragments to a surface, preparing many, longer or more complex probes and labelling more species.

  3. Kit for detecting nucleic acid sequences using competitive hybridization probes

    DOEpatents

    Lucas, Joe N.; Straume, Tore; Bogen, Kenneth T.

    2001-01-01

    A kit is provided for detecting a target nucleic acid sequence in a sample, the kit comprising: a first hybridization probe which includes a nucleic acid sequence that is sufficiently complementary to selectively hybridize to a first portion of the target sequence, the first hybridization probe including a first complexing agent for forming a binding pair with a second complexing agent; and a second hybridization probe which includes a nucleic acid sequence that is sufficiently complementary to selectively hybridize to a second portion of the target sequence to which the first hybridization probe does not selectively hybridize, the second hybridization probe including a detectable marker; a third hybridization probe which includes a nucleic acid sequence that is sufficiently complementary to selectively hybridize to a first portion of the target sequence, the third hybridization probe including the same detectable marker as the second hybridization probe; and a fourth hybridization probe which includes a nucleic acid sequence that is sufficiently complementary to selectively hybridize to a second portion of the target sequence to which the third hybridization probe does not selectively hybridize, the fourth hybridization probe including the first complexing agent for forming a binding pair with the second complexing agent; wherein the first and second hybridization probes are capable of simultaneously hybridizing to the target sequence and the third and fourth hybridization probes are capable of simultaneously hybridizing to the target sequence, the detectable marker is not present on the first or fourth hybridization probes and the first, second, third, and fourth hybridization probes each include a competitive nucleic acid sequence which is sufficiently complementary to a third portion of the target sequence that the competitive sequences of the first, second, third, and fourth hybridization probes compete with each other to hybridize to the third portion of 
the

  4. Matrix genes of measles virus and canine distemper virus: cloning, nucleotide sequences, and deduced amino acid sequences.

    PubMed Central

    Bellini, W J; Englund, G; Richardson, C D; Rozenblatt, S; Lazzarini, R A

    1986-01-01

    The nucleotide sequences encoding the matrix (M) proteins of measles virus (MV) and canine distemper virus (CDV) were determined from cDNA clones containing these genes in their entirety. In both cases, single open reading frames specifying basic proteins of 335 amino acid residues were predicted from the nucleotide sequences. Both viral messages were composed of approximately 1,450 nucleotides and contained 400 nucleotides of presumptive noncoding sequences at their respective 3' ends. MV and CDV M-protein-coding regions were 67% homologous at the nucleotide level and 76% homologous at the amino acid level. Only chance homology was observed in the 400-nucleotide trailer sequences. Comparisons of the M protein sequences of MV and CDV with the sequence reported for Sendai virus (B. M. Blumberg, K. Rose, M. G. Simona, L. Roux, C. Giorgi, and D. Kolakofsky, J. Virol. 52:656-663; Y. Hidaka, T. Kanda, K. Iwasaki, A. Nomoto, T. Shioda, and H. Shibuta, Nucleic Acids Res. 12:7965-7973) indicated the greatest homology among these M proteins in the carboxyterminal third of the molecule. Secondary-structure analyses of this shared region indicated a structurally conserved, hydrophobic sequence which possibly interacted with the lipid bilayer. PMID:3754588

  5. Solid phase sequencing of double-stranded nucleic acids

    DOEpatents

    Fu, Dong-Jing; Cantor, Charles R.; Koster, Hubert; Smith, Cassandra L.

    2002-01-01

    This invention relates to methods for detecting and sequencing of target double-stranded nucleic acid sequences, to nucleic acid probes and arrays of probes useful in these methods, and to kits and systems which contain these probes. Useful methods involve hybridizing the nucleic acids or nucleic acids which represent complementary or homologous sequences of the target to an array of nucleic acid probes. These probes comprise a single-stranded portion, an optional double-stranded portion and a variable sequence within the single-stranded portion. The molecular weights of the hybridized nucleic acids of the set can be determined by mass spectroscopy, and the sequence of the target determined from the molecular weights of the fragments. Nucleic acids whose sequences can be determined include nucleic acids in biological samples such as patient biopsies and environmental samples. Probes may be fixed to a solid support such as a hybridization chip to facilitate automated determination of molecular weights and identification of the target sequence.

  6. Analysis and Annotation of Nucleic Acid Sequence

    SciTech Connect

    States, David J.

    2004-07-28

    The aims of this project were to develop improved methods for computational genome annotation and to apply these methods to improve the annotation of genomic sequence data with a specific focus on human genome sequencing. The project resulted in a substantial body of published work. Notable contributions of this project were the identification of basecalling and lane tracking as error processes in genome sequencing and contributions to improved methods for these steps in genome sequencing. This technology improved the accuracy and throughput of genome sequence analysis. Probabilistic methods for physical map construction were developed. Improved methods for sequence alignment, alternative splicing analysis, promoter identification and NF kappa B response gene prediction were also developed.

  7. Analysis and Annotation of Nucleic Acid Sequence

    SciTech Connect

    David J. States

    1998-08-01

    The aims of this project were to develop improved methods for computational genome annotation and to apply these methods to improve the annotation of genomic sequence data with a specific focus on human genome sequencing. The project resulted in a substantial body of published work. Notable contributions of this project were the identification of basecalling and lane tracking as error processes in genome sequencing and contributions to improved methods for these steps in genome sequencing. This technology improved the accuracy and throughput of genome sequence analysis. Probabilistic methods for physical map construction were developed. Improved methods for sequence alignment, alternative splicing analysis, promoter identification and NF kappa B response gene prediction were also developed.

  8. Sequence comparison via polar coordinates representation and curve tree.

    PubMed

    Dai, Qi; Guo, Xiaodong; Li, Lihua

    2012-01-01

    Sequence comparison has become one of the essential tools in bioinformatics research, which could serve as evidence of structural and functional conservation, as well as of evolutionary relations among the sequences. Existing graphical representation methods have achieved promising results in sequence comparison, but there are some design challenges with the graphical representations and feature-based measures. We report here a new method for sequence comparison. It considers the whole distribution of dual bases and employs a polar coordinates method to map a biological sequence into a closed curve. The curve tree was then constructed to numerically characterize the closed curve of biological sequences, and biological sequences were further compared by evaluating the distance of the curve tree of the query sequence matching against a corresponding curve tree of the template sequence. The proposed method was tested by phylogenetic analysis, and its performance was further compared with alignment-based methods. The results demonstrate that using polar coordinates representation and curve tree to compare sequences is more efficient. PMID:22001081

  9. Recognition of Yeast Species from Gene Sequence Comparisons

    Technology Transfer Automated Retrieval System (TEKTRAN)

    This review discusses recognition of yeast species from gene sequence comparisons, which have been responsible for doubling the number of known yeasts over the past decade. The resolution provided by various single gene sequences is examined for both ascomycetous and basidiomycetous species, and th...

  10. From Artificial Amino Acids to Sequence-Defined Targeted Oligoaminoamides.

    PubMed

    Morys, Stephan; Wagner, Ernst; Lächelt, Ulrich

    2016-01-01

    Artificial oligoamino acids with appropriate protecting groups can be used for the sequential assembly of oligoaminoamides on solid-phase. With the help of these oligoamino acids multifunctional nucleic acid (NA) carriers can be designed and produced in highly defined topologies. Here we describe the synthesis of the artificial oligoamino acid Fmoc-Stp(Boc3)-OH, the subsequent assembly into sequence-defined oligomers and the formulation of tumor-targeted plasmid DNA (pDNA) polyplexes. PMID:27436323

  11. Amino Acid Sequence of Anionic Peroxidase from the Windmill Palm Tree Trachycarpus fortunei

    PubMed Central

    2015-01-01

    Palm peroxidases are extremely stable and have uncommon substrate specificity. This study was designed to fill in the knowledge gap about the structures of a peroxidase from the windmill palm tree Trachycarpus fortunei. The complete amino acid sequence and partial glycosylation were determined by MALDI-top-down sequencing of native windmill palm tree peroxidase (WPTP), MALDI-TOF/TOF MS/MS of WPTP tryptic peptides, and cDNA sequencing. The propeptide of WPTP contained N- and C-terminal signal sequences which contained 21 and 17 amino acid residues, respectively. Mature WPTP was 306 amino acids in length, and its carbohydrate content ranged from 21% to 29%. Comparison to closely related royal palm tree peroxidase revealed structural features that may explain differences in their substrate specificity. The results can be used to guide engineering of WPTP and its novel applications. PMID:25383699

  12. The amino acid sequence of the peptide containing the thiol group of creatine kinase from normal and dystrophic chicken breast muscle. Comparison of some of the immunological properties of the antibodies developed in rabbits against these enzymes

    PubMed Central

    Roy, Buddha P.

    1974-01-01

    The major 14C-labelled peptides from creatine kinase from normal and dystrophic chicken muscle obtained by carboxymethylating the reactive thiol groups with iodo[2-14C]acetic acid and digestion with trypsin were purified by ion-exchange chromatography on Dowex-50 (X2) and by paper electrophoresis. The chromatographic characteristics of the 14C-labelled peptides, their electrophoretic mobilities at pH 6.5, and their amino acid compositions were identical for the two enzymes. The sequence of amino acids around the essential thiol groups of creatine kinase from normal and dystrophic chicken muscle was shown to be Ile-Leu-Thr-CmCys-Pro-Ser-Asn-Leu-Gly-Thr-Gly-Leu-Arg (CmCys, carboxymethylcysteine). This sequence is almost identical with that for the creatine kinases in human and ox muscle and bovine brain and is very similar to that of arginine kinase from lobster muscle. Antibodies to the enzymes were raised in rabbits and their reaction with the creatine kinase from normal and dystrophic muscles in interfacial, immunodiffusion and immunoelectrophoretic experiments was studied. The cross-reaction between normal muscle creatine kinase and antisera against the dystrophic muscle enzyme (or vice versa) observed by immunodiffusion and by immunoelectrophoretic experiments further suggests that the enzymes from normal and dystrophic chicken muscle are similar in structure. The results of the present study, the identical amino acid sequence of the peptides containing the reactive thiol group from both the normal and dystrophic chicken muscle enzymes and the immunological similarities of the two enzymes are in accord with the similarity of the two enzymes observed by Roy et al. (1970). PMID:4219281

  13. Segments of amino acid sequence similarity in beta-amylases.

    PubMed

    Friedberg, F; Rhodes, C

    1988-01-01

    In alpha-amylases from animals, plants and bacteria and in beta-amylases from plants and bacteria a number of segments exhibit amino acid sequence similarity specific to the alpha or to the beta type, respectively. In the case of the beta-amylases the similar sequence regions are extensive and they are disrupted only by short interspersed dissimilar regions. Close to the C terminus, however, no such sequence similarity exists. PMID:2464171

  14. Intra-species sequence comparisons for annotating genomes

    SciTech Connect

    Boffelli, Dario; Weer, Claire V.; Weng, Li; Lewis, Keith D.; Shoukry, Malak I.; Pachter, Lior; Keys, David N.; Rubin, Edward M.

    2004-07-15

    Analysis of sequence variation among members of a single species offers a potential approach to identify functional DNA elements responsible for biological features unique to that species. Due to its high rate of allelic polymorphism and ease of genetic manipulability, we chose the sea squirt, Ciona intestinalis, to explore intra-species sequence comparisons for genome annotation. A large number of C. intestinalis specimens were collected from four continents and a set of genomic intervals amplified, resequenced and analyzed to determine the mutation rates at each nucleotide in the sequence. We found that regions with low mutation rates efficiently demarcated functionally constrained sequences: these include a set of noncoding elements, which we showed in C. intestinalis transgenic assays to act as tissue-specific enhancers, as well as the location of coding sequences. This illustrates that comparisons of multiple members of a species can be used for genome annotation, suggesting a path for the annotation of the sequenced genomes of organisms occupying uncharacterized phylogenetic branches of the animal kingdom, and raises the possibility that the resequencing of a large number of Homo sapiens individuals might be used to annotate the human genome and identify sequences defining traits unique to our species. The sequence data from this study have been submitted to GenBank under accession nos. AY667278-AY667407.

  15. 37 CFR 1.821 - Nucleotide and/or amino acid sequence disclosures in patent applications.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... acids are not intended to be embraced by this definition. Any amino acid sequence that contains post-translationally modified amino acids may be described as the amino acid sequence that is initially translated... sequence of four or more amino acids or an unbranched sequence of ten or more nucleotides....

  16. 37 CFR 1.821 - Nucleotide and/or amino acid sequence disclosures in patent applications.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... acids are not intended to be embraced by this definition. Any amino acid sequence that contains post-translationally modified amino acids may be described as the amino acid sequence that is initially translated... sequence of four or more amino acids or an unbranched sequence of ten or more nucleotides....

  17. 37 CFR 1.821 - Nucleotide and/or amino acid sequence disclosures in patent applications.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... acids are not intended to be embraced by this definition. Any amino acid sequence that contains post-translationally modified amino acids may be described as the amino acid sequence that is initially translated... sequence of four or more amino acids or an unbranched sequence of ten or more nucleotides....

  18. Complete cDNA and derived amino acid sequence of human factor V

    SciTech Connect

    Jenny, R.J.; Pittman, D.D.; Toole, J.J.; Kriz, R.W.; Aldape, R.A.; Hewick, R.M.; Kaufman, R.J.; Mann, K.G.

    1987-07-01

    cDNA clones encoding human factor V have been isolated from an oligo(dT)-primed human fetal liver cDNA library prepared with vector Charon 21A. The cDNA sequence of factor V from three overlapping clones includes a 6672-base-pair (bp) coding region, a 90-bp 5' untranslated region, and a 163-bp 3' untranslated region within which is a poly(A) tail. The deduced amino acid sequence consists of 2224 amino acids inclusive of a 28-amino acid leader peptide. Direct comparison with human factor VIII reveals considerable homology between proteins in amino acid sequence and domain structure: a triplicated A domain and duplicated C domain show approx. 40% identity with the corresponding domains in factor VIII. As in factor VIII, the A domains of factor V share approx. 40% amino acid-sequence homology with the three highly conserved domains in ceruloplasmin. The B domain of factor V contains 35 tandem and approx. 9 additional semiconserved repeats of nine amino acids of the form Asp-Leu-Ser-Gln-Thr-Thr/Asn-Leu-Ser-Pro and 2 additional semiconserved repeats of 17 amino acids. Factor V contains 37 potential N-linked glycosylation sites, 25 of which are in the B domain, and a total of 19 cysteine residues.
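
    The lengths quoted in this abstract can be cross-checked with simple arithmetic: a 6672-bp coding region corresponds to 6672 / 3 = 2224 codons, which matches the 2224 deduced amino acids. The sketch below only restates the figures from the abstract; the variable and function names are illustrative, and whether the quoted coding length counts a stop codon is not stated in the source.

```python
# Figures quoted in the abstract; nothing here is derived from the sequence itself.
CODING_BP = 6672   # coding region, base pairs
UTR_5_BP = 90      # 5' untranslated region
UTR_3_BP = 163     # 3' untranslated region (contains the poly(A) tail)
LEADER_AA = 28     # leader peptide, amino acids


def codon_count(coding_bp: int) -> int:
    """Number of codons in a coding region; rejects lengths not divisible by 3."""
    if coding_bp % 3 != 0:
        raise ValueError("coding region length is not a multiple of 3")
    return coding_bp // 3


precursor_aa = codon_count(CODING_BP)        # residues including the leader
mature_aa = precursor_aa - LEADER_AA         # residues left after leader removal
cdna_bp = UTR_5_BP + CODING_BP + UTR_3_BP    # sum of the three reported regions

print(precursor_aa, mature_aa, cdna_bp)
```

    Under these figures the mature protein would have 2196 residues and the three reported regions sum to 6925 bp.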

  19. In silico comparative analysis of DNA and amino acid sequences for prion protein gene.

    PubMed

    Kim, Y; Lee, J; Lee, C

    2008-01-01

    Genetic variability might contribute to species specificity of prion diseases in various organisms. In this study, structures of the prion protein gene (PRNP) and its amino acids were compared among species for which sequence data were available. Comparisons of PRNP DNA sequences among 12 species including human, chimpanzee, monkey, bovine, ovine, dog, mouse, rat, wallaby, opossum, chicken and zebrafish allowed us to identify candidate regulatory regions in intron 1 and 3'-untranslated region (UTR) in addition to the coding region. Highly conserved putative binding sites for transcription factors, such as heat shock factor 2 (HSF2) and myocyte enhancer factor 2 (MEF2), were discovered in intron 1. In the 3'-UTR, the functional sequence (ATTAAA) for nucleus-specific polyadenylation was found in all the analysed species. The functional sequence (TTTTTAT) for maturation-specific polyadenylation was observed without mismatch only in ovine, and with one or two nucleotide mismatches in the other species. A comparison of the amino acid sequences in 53 species revealed a large sequence identity. In particular, the octapeptide repeat region was observed in all the species except frog and zebrafish. Functional changes and susceptibility to prion diseases with various isoforms of prion protein could be caused by numeric variability and conformational changes discovered in the repeat sequences. PMID:18397498

  20. A method to find palindromes in nucleic acid sequences.

    PubMed

    Anjana, Ramnath; Shankar, Mani; Vaishnavi, Marthandan Kirti; Sekar, Kanagaraj

    2013-01-01

    Various types of sequences in the human genome are known to play important roles in different aspects of genomic functioning. Among these sequences, palindromic nucleic acid sequences are one such type that has been studied in detail and found to influence a wide variety of genomic characteristics. For a nucleotide sequence to be considered a palindrome, its complementary strand must read the same in the opposite direction. For example, both strands, i.e., the strand going from 5' to 3' and its complementary strand from 3' to 5', must be complementary. A typical palindromic nucleotide sequence would be TATA (5' to 3'), and its complementary sequence from 3' to 5' would be ATAT. Thus, a new method has been developed using dynamic programming to fetch the palindromic nucleic acid sequences. The new method uses less memory and thereby increases the overall speed and efficiency. The proposed method has been tested using the bacterial (3891 KB bases) and human chromosomal sequences (Chr-18: 74366 kb and Chr-Y: 25554 kb) and the computation time for finding the palindromic sequences is in milliseconds. PMID:23515654
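
    The palindrome definition used in this abstract (a sequence that equals its own reverse complement, as TATA does) can be illustrated with a short sketch. This is a naive window scan written for clarity, not the dynamic-programming method the authors describe; the function names and the default length limits are assumptions made for the example, and only the bases A, C, G and T are handled.

```python
# Complement table for DNA bases (assumption: input uses only A, C, G, T).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the complementary strand read in the 5'-to-3' direction."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_palindrome(seq: str) -> bool:
    """A nucleotide palindrome equals its own reverse complement."""
    seq = seq.upper()
    return bool(seq) and seq == reverse_complement(seq)


def find_palindromes(seq: str, min_len: int = 4, max_len: int = 12):
    """Naive scan returning (start, palindrome) pairs, 0-based start.

    Only even lengths are tried: in an odd-length window the central base
    would have to be its own complement, which no DNA base is.
    """
    seq = seq.upper()
    first_even = min_len + (min_len % 2)
    hits = []
    for start in range(len(seq)):
        for length in range(first_even, max_len + 1, 2):
            if start + length > len(seq):
                break
            window = seq[start:start + length]
            if is_palindrome(window):
                hits.append((start, window))
    return hits


print(is_palindrome("TATA"))           # the example from the abstract
print(find_palindromes("CCGAATTCGG"))  # nested palindromes around GAATTC
```

    The scan costs on the order of sequence length times the number of window sizes tried, so it is only suitable for short inputs; it is meant to make the definition concrete rather than to compete with the published method.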

  1. Comparison of mitochondrial genome sequences of pangolins (Mammalia, Pholidota).

    PubMed

    Hassanin, Alexandre; Hugot, Jean-Pierre; van Vuuren, Bettine Jansen

    2015-04-01

    The complete mitochondrial genome was sequenced for three species of pangolins, Manis javanica, Phataginus tricuspis, and Smutsia temminckii, and comparisons were made with two other species, Manis pentadactyla and Phataginus tetradactyla. The genome of Manidae contains the 37 genes found in a typical mammalian genome, and the structure of the control region is highly conserved among species. In Manis, the overall base composition differs from that found in African genera. Phylogenetic analyses support the monophyly of the genera Manis, Phataginus, and Smutsia, as well as the basal division between Maninae and Smutsiinae. Comparisons with GenBank sequences reveal that the reference genomes of M. pentadactyla and P. tetradactyla (accession numbers NC_016008 and NC_004027) were sequenced from misidentified taxa, and that a new species of tree pangolin should be described in Gabon. PMID:25746396

  2. Amino acid sequence repertoire of the bacterial proteome and the occurrence of untranslatable sequences.

    PubMed

    Navon, Sharon Penias; Kornberg, Guy; Chen, Jin; Schwartzman, Tali; Tsai, Albert; Puglisi, Elisabetta Viani; Puglisi, Joseph D; Adir, Noam

    2016-06-28

    Bioinformatic analysis of Escherichia coli proteomes revealed that all possible amino acid triplet sequences occur at their expected frequencies, with four exceptions. Two of the four underrepresented sequences (URSs) were shown to interfere with translation in vivo and in vitro. Enlarging the URS by a single amino acid resulted in increased translational inhibition. Single-molecule methods revealed stalling of translation at the entrance of the peptide exit tunnel of the ribosome, adjacent to ribosomal nucleotides A2062 and U2585. Interaction with these same ribosomal residues is involved in regulation of translation by longer, naturally occurring protein sequences. The E. coli exit tunnel has evidently evolved to minimize interaction with the exit tunnel and maximize the sequence diversity of the proteome, although allowing some interactions for regulatory purposes. Bioinformatic analysis of the human proteome revealed no underrepresented triplet sequences, possibly reflecting an absence of regulation by interaction with the exit tunnel. PMID:27307442
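
    The comparison of observed and expected triplet frequencies can be sketched as follows. The abstract does not state which null model was used, so the example assumes the simplest one: residues are independent, and the expected count of a triplet is the number of triplet windows multiplied by the product of the three single-residue frequencies. The function name and the toy input are hypothetical.

```python
from collections import Counter
from itertools import product


def triplet_observed_expected(proteins):
    """Map each triplet to (observed count, expected count).

    Assumption: expected counts come from an independence model built on
    the single-residue frequencies of the same protein set.
    """
    residue_counts = Counter()
    triplet_counts = Counter()
    for protein in proteins:
        residue_counts.update(protein)
        # Overlapping windows of three consecutive residues.
        triplet_counts.update(protein[i:i + 3] for i in range(len(protein) - 2))

    total_residues = sum(residue_counts.values())
    total_triplets = sum(triplet_counts.values())
    freq = {aa: n / total_residues for aa, n in residue_counts.items()}

    table = {}
    for a, b, c in product(sorted(freq), repeat=3):
        expected = total_triplets * freq[a] * freq[b] * freq[c]
        table[a + b + c] = (triplet_counts[a + b + c], expected)
    return table


# Toy input, not real proteome data.
stats = triplet_observed_expected(["ABAB"])
print(stats["ABA"], stats["AAA"])
```

    Triplets whose observed count falls far below the expected count would be the candidates for underrepresented sequences; a real analysis would also need a significance test, which is omitted here.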

  3. New families in the classification of glycosyl hydrolases based on amino acid sequence similarities.

    PubMed Central

    Henrissat, B; Bairoch, A

    1993-01-01

    301 glycosyl hydrolases and related enzymes corresponding to 39 EC entries of the I.U.B. classification system have been classified into 35 families on the basis of amino-acid-sequence similarities [Henrissat (1991) Biochem. J. 280, 309-316]. Approximately half of the families were found to be monospecific (containing only one EC number), whereas the other half were found to be polyspecific (containing at least two EC numbers). A > 60% increase in sequence data for glycosyl hydrolases (181 additional enzymes or enzyme domains sequences have since become available) allowed us to update the classification not only by the addition of more members to already identified families, but also by the finding of ten new families. On the basis of a comparison of 482 sequences corresponding to 52 EC entries, 45 families, out of which 22 are polyspecific, can now be defined. This classification has been implemented in the SWISS-PROT protein sequence data bank. PMID:8352747

  4. Beyond Linear Sequence Comparisons: The use of genome-level characters for phylogenetic reconstruction

    SciTech Connect

    Boore, Jeffrey L.

    2004-11-27

    Although the phylogenetic relationships of many organisms have been convincingly resolved by the comparisons of nucleotide or amino acid sequences, others have remained equivocal despite great effort. Now that large-scale genome sequencing projects are sampling many lineages, it is becoming feasible to compare large data sets of genome-level features and to develop this as a tool for phylogenetic reconstruction that has advantages over conventional sequence comparisons. Although it is unlikely that these will address a large number of evolutionary branch points across the broad tree of life due to the infeasibility of such sampling, they have great potential for convincingly resolving many critical, contested relationships for which no other data seems promising. However, it is important that we recognize potential pitfalls, establish reasonable standards for acceptance, and employ rigorous methodology to guard against a return to earlier days of scenario-driven evolutionary reconstructions.

  5. 3D representations of amino acids—applications to protein sequence comparison and classification

    PubMed Central

    Li, Jie; Koehl, Patrice

    2014-01-01

    The amino acid sequence of a protein is the key to understanding its structure and ultimately its function in the cell. This paper addresses the fundamental issue of encoding amino acids in ways that the representation of such a protein sequence facilitates the decoding of its information content. We show that a feature-based representation in a three-dimensional (3D) space derived from amino acid substitution matrices provides an adequate representation that can be used for direct comparison of protein sequences based on geometry. We measure the performance of such a representation in the context of the protein structural fold prediction problem. We compare the results of classifying different sets of proteins belonging to distinct structural folds against classifications of the same proteins obtained from sequence alone or directly from structural information. We find that sequence alone performs poorly as a structure classifier. We show in contrast that the use of the three dimensional representation of the sequences significantly improves the classification accuracy. We conclude with a discussion of the current limitations of such a representation and with a description of potential improvements. PMID:25379143

  6. On Quantum Algorithm for Multiple Alignment of Amino Acid Sequences

    NASA Astrophysics Data System (ADS)

    Iriyama, Satoshi; Ohya, Masanori

    2009-02-01

    The alignment of genome sequences or amino acid sequences is one of the fundamental operations in the study of life. The usual computational complexity of the multiple alignment of N sequences with common length L by dynamic programming is O(L^N). This alignment is considered one of the NP problems, so it is desirable to find an efficient algorithm for multiple alignment. In this paper we therefore propose a quantum algorithm for multiple alignment based on the works [12,1,2] in which an NP-complete problem was shown to be a P problem by means of a quantum algorithm and chaos information dynamics.
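
    The exponential cost of classical dynamic programming comes from the size of the score table: aligning N sequences of length L requires a table with (L + 1)^N cells. The sketch below shows the pairwise case (N = 2) as a standard global alignment score, together with a helper that counts table cells for larger N. It illustrates the classical baseline only, not the quantum algorithm proposed in the paper; the scoring values and function names are assumptions.

```python
def global_alignment_score(a: str, b: str, match: int = 1,
                           mismatch: int = -1, gap: int = -1) -> int:
    """Global alignment score of two sequences with a linear gap penalty.

    Fills an (len(a) + 1) x (len(b) + 1) table row by row, keeping only
    the previous row in memory.
    """
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


def table_cells(length: int, n_sequences: int) -> int:
    """Number of cells in the DP table for n sequences of a common length."""
    return (length + 1) ** n_sequences


print(global_alignment_score("AC", "AGC"))
print(table_cells(100, 2), table_cells(100, 3))
```

    Going from two to three sequences of length 100 raises the table from about ten thousand cells to about one million, which is why exact multiple alignment quickly becomes impractical.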

  7. Prebiotically plausible mechanisms increase compositional diversity of nucleic acid sequences

    PubMed Central

    Derr, Julien; Manapat, Michael L.; Rajamani, Sudha; Leu, Kevin; Xulvi-Brunet, Ramon; Joseph, Isaac; Nowak, Martin A.; Chen, Irene A.

    2012-01-01

    During the origin of life, the biological information of nucleic acid polymers must have increased to encode functional molecules (the RNA world). Ribozymes tend to be compositionally unbiased, as is the vast majority of possible sequence space. However, ribonucleotides vary greatly in synthetic yield, reactivity and degradation rate, and their non-enzymatic polymerization results in compositionally biased sequences. While natural selection could lead to complex sequences, molecules with some activity are required to begin this process. Was the emergence of compositionally diverse sequences a matter of chance, or could prebiotically plausible reactions counter chemical biases to increase the probability of finding a ribozyme? Our in silico simulations using a two-letter alphabet show that template-directed ligation and high concatenation rates counter compositional bias and shift the pool toward longer sequences, permitting greater exploration of sequence space and stable folding. We verified experimentally that unbiased DNA sequences are more efficient templates for ligation, thus increasing the compositional diversity of the pool. Our work suggests that prebiotically plausible chemical mechanisms of nucleic acid polymerization and ligation could predispose toward a diverse pool of longer, potentially structured molecules. Such mechanisms could have set the stage for the appearance of functional activity very early in the emergence of life. PMID:22319215

  8. The amino-acid sequence of kangaroo pancreatic ribonuclease.

    PubMed

    Gaastra, W; Welling, G W; Beintema, J J

    1978-05-01

    Red kangaroo (Macropus rufus) ribonuclease was isolated from pancreatic tissue by affinity chromatography. The amino acid sequence was determined by automatic sequencing of overlapping large fragments and by analysis of shorter peptides obtained by digestion with a number of proteolytic enzymes. The polypeptide chain consists of 122 amino acid residues. Compared to other ribonucleases, the N-terminal residue and residue 114 are deleted. In other pancreatic ribonucleases position 114 is occupied by a cis proline residue in an external loop at the surface of the molecule. Other remarkable substitutions are the presence of a tyrosine residue at position 123 instead of a serine which forms a hydrogen bond with the pyrimidine ring of a nucleotide substrate, and a number of hydrophobic-hydrophilic interchanges in the sequence 51-55, which forms part of an alpha-helix in bovine ribonuclease and exhibits few substitutions in the placental mammals. Kangaroo ribonuclease contains no carbohydrate, although the enzyme possesses a recognition site for carbohydrate attachment in the sequence Asn-Val-Thr (62-64). The enzyme differs at about 35-40% of the positions from all other mammalian pancreatic ribonucleases sequenced to date, which is in agreement with the early divergence between the marsupials and the placental mammals. From fragmentary data a tentative sequence of red-necked wallaby (Macropus rufogriseus) pancreatic ribonuclease has been derived. Eight differences with the kangaroo sequence were found. PMID:658039

  9. Sequence information signal processor for local and global string comparisons

    DOEpatents

    Peterson, John C.; Chow, Edward T.; Waterman, Michael S.; Hunkapillar, Timothy J.

    1997-01-01

    A sequence information signal processing integrated circuit chip designed to perform high speed calculation of a dynamic programming algorithm based upon the algorithm defined by Waterman and Smith. The signal processing chip of the present invention is designed to be a building block of a linear systolic array, the performance of which can be increased by connecting additional sequence information signal processing chips to the array. The chip provides a high speed, low cost linear array processor that can locate highly similar global sequences or segments thereof such as contiguous subsequences from two different DNA or protein sequences. The chip is implemented in a preferred embodiment using CMOS VLSI technology to provide the equivalent of about 400,000 transistors or 100,000 gates. Each chip provides 16 processing elements, and is designed to provide 16 bit, two's complement operation for maximum score precision of between -32,768 and +32,767. It is designed to provide a comparison between sequences as long as 4,194,304 elements without external software and between sequences of unlimited numbers of elements with the aid of external software. Each sequence can be assigned different deletion and insertion weight functions. Each processor is provided with a similarity measure device which is independently variable. Thus, each processor can contribute to maximum value score calculation using a different similarity measure.
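
    The kind of computation such a chip accelerates can be sketched in software as a local-alignment score in the style of the Smith-Waterman recurrence. The example below is a plain sequential version, not a model of the systolic hardware; the scoring values are assumptions, and clamping each cell to the 16-bit two's complement range quoted in the abstract is included only to illustrate that score limit.

```python
INT16_MIN, INT16_MAX = -32768, 32767  # 16-bit two's complement range


def saturate_int16(value: int) -> int:
    """Clamp a score to the 16-bit two's complement range."""
    return max(INT16_MIN, min(INT16_MAX, value))


def local_alignment_score(a: str, b: str, match: int = 2,
                          mismatch: int = -1, gap: int = -1) -> int:
    """Best local alignment score with a linear gap penalty.

    Cells never drop below zero, so an alignment may start anywhere;
    the result is the largest cell value seen in the table.
    """
    best = 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            cell = saturate_int16(max(0, diag, prev[j] + gap, curr[j - 1] + gap))
            curr.append(cell)
            best = max(best, cell)
        prev = curr
    return best


print(local_alignment_score("ACACACTA", "AGCACACA"))
```

    In a linear systolic array the cells along each anti-diagonal of this table can be computed in parallel, one processing element per column, which is where the hardware speed-up comes from.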

  10. The amino acid sequence of Lady Amherst's pheasant (Chrysolophus amherstiae) and golden pheasant (Chrysolophus pictus) egg-white lysozymes.

    PubMed

    Araki, T; Kuramoto, M; Torikata, T

    1990-09-01

    The amino acids of Lady Amherst's pheasant and golden pheasant egg-white lysozymes have been sequenced. The carboxymethylated lysozymes were digested with trypsin followed by sequencing of the tryptic peptides. Lady Amherst's pheasant lysozyme proved to consist of 129 amino acid residues, and a relative molecular mass of 14,423 Da was calculated. This lysozyme had 6 amino acid substitutions when compared with hen egg-white lysozyme: Phe3 to Tyr, His15 to Leu, Gln41 to His, Asn77 to His, Gln121 to Asn, and a newly found substitution of Ile124 to Thr. The amino acid sequence of golden pheasant lysozyme was identical to that of Lady Amherst's pheasant lysozyme. The phylogenetic tree constructed by the comparison of amino acid sequences of phasianoid bird lysozymes revealed a minimum genetic distance between these pheasants and the turkey-peafowl group. PMID:1368578

  11. Evaluation of integrated anaerobic/aerobic fixed-bed sequencing batch biofilm reactor for decolorization and biodegradation of azo dye acid red 18: comparison of using two types of packing media.

    PubMed

    Hosseini Koupaie, E; Alavi Moghaddam, M R; Hashemi, S H

    2013-01-01

    Two integrated anaerobic/aerobic fixed-bed sequencing batch biofilm reactors (FB-SBBRs) were operated to evaluate decolorization and biodegradation of azo dye Acid Red 18 (AR18). Volcanic pumice stones and a type of plastic media made of polyethylene were used as packing media in FB-SBBR1 and FB-SBBR2, respectively. Decolorization of AR18 in both reactors followed first-order kinetics with respect to dye concentration. More than 63.7% and 71.3% of anaerobically formed 1-naphthylamine-4-sulfonate (1N-4S), one of the main sulfonated aromatic constituents of AR18, was removed during the aerobic reaction phase in FB-SBBR1 and FB-SBBR2, respectively. Based on statistical analysis, performance of FB-SBBR2 in terms of COD removal as well as biodegradation of 1N-4S was significantly higher than that of FB-SBBR1. Spherical and rod-shaped bacteria were the dominant species of bacteria in the biofilm grown on the pumice stone surfaces, while the biofilm grown on surfaces of the polyethylene media had a fluffy structure. PMID:23138064
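
    First-order kinetics with respect to dye concentration means the concentration decays as C(t) = C0 * exp(-k * t), so the rate constant can be estimated from two concentration readings. The sketch below shows that relation; the numbers in it are made up for illustration and are not taken from the study, and the function names are hypothetical.

```python
import math


def first_order_concentration(c0: float, k: float, t: float) -> float:
    """Concentration after time t under first-order decay: C0 * exp(-k * t)."""
    return c0 * math.exp(-k * t)


def rate_constant(c0: float, c_t: float, t: float) -> float:
    """Estimate k from an initial reading c0 and a later reading c_t at time t."""
    if c0 <= 0 or c_t <= 0 or t <= 0:
        raise ValueError("concentrations and elapsed time must be positive")
    return math.log(c0 / c_t) / t


def half_life(k: float) -> float:
    """Time for the concentration to fall by half: ln(2) / k."""
    return math.log(2) / k


# Illustrative values only: 100 mg/L falling to 50 mg/L in 2 h.
k = rate_constant(100.0, 50.0, 2.0)
print(k, half_life(k), first_order_concentration(100.0, k, 4.0))
```

    In practice k would be fitted to many readings, for example by linear regression of ln(C) against time, rather than taken from two points.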

  12. Sequences Of Amino Acids For Human Serum Albumin

    NASA Technical Reports Server (NTRS)

    Carter, Daniel C.

    1992-01-01

    Sequences of amino acids defined for use in making polypeptides one-third to one-sixth as large as parent human serum albumin molecule. Smaller, chemically stable peptides have diverse applications including service as artificial human serum and as active components of biosensors and chromatographic matrices. In applications involving production of artificial sera from new sequences, little or no concern about viral contaminants. Smaller genetically engineered polypeptides more easily expressed and produced in large quantities, making commercial isolation and production more feasible and profitable.

  13. A statistical physics perspective on alignment-independent protein sequence comparison

    PubMed Central

    Chattopadhyay, Amit K.; Nasiev, Diar; Flower, Darren R.

    2015-01-01

    Motivation: Within bioinformatics, the textual alignment of amino acid sequences has long dominated the determination of similarity between proteins, with all that implies for shared structure, function and evolutionary descent. Despite the relative success of modern-day sequence alignment algorithms, so-called alignment-free approaches offer a complementary means of determining and expressing similarity, with potential benefits in certain key applications, such as regression analysis of protein structure-function studies, where alignment-based similarity has performed poorly. Results: Here, we offer a fresh, statistical physics-based perspective focusing on the question of alignment-free comparison, in the process adapting results from ‘first passage probability distribution’ to summarize statistics of ensemble averaged amino acid propensity values. In this article, we introduce and elaborate this approach. Contact: d.r.flower@aston.ac.uk PMID:25810434

  14. Nanopores and nucleic acids: prospects for ultrarapid sequencing

    NASA Technical Reports Server (NTRS)

    Deamer, D. W.; Akeson, M.

    2000-01-01

    DNA and RNA molecules can be detected as they are driven through a nanopore by an applied electric field at rates ranging from several hundred microseconds to a few milliseconds per molecule. The nanopore can rapidly discriminate between pyrimidine and purine segments along a single-stranded nucleic acid molecule. Nanopore detection and characterization of single molecules represents a new method for directly reading information encoded in linear polymers. If single-nucleotide resolution can be achieved, it is possible that nucleic acid sequences can be determined at rates exceeding a thousand bases per second.

  15. Comparison of Complete Genome Sequences of Usutu Virus Strains Detected in Spain, Central Europe, and Africa

    PubMed Central

    Busquets, Núria; Nowotny, Norbert

    2014-01-01

    The complete genomic sequence of Usutu virus (USUV, genus Flavivirus, family Flaviviridae) strain MB119/06, detected in a pool of Culex pipiens mosquitoes in northeastern Spain (Viladecans, Catalonia) in 2006, was determined and analyzed. The phylogenetic relationship with all other available complete USUV genome sequences was established. The Spanish sequence investigated showed the closest relationship to the USUV prototype strain SA AR 1776 isolated in South Africa in 1959 (96.9% nucleotide and 98.8% amino acid identities). Conserved structural elements and enzyme motifs of the putative polyprotein precursor were identified. Unique amino acid substitutions were recognized; however, their potential roles as virulence markers could not be verified. Comparisons of the polyprotein precursor sequences of USUV strains detected in mosquitoes, birds, and humans could not confirm the predicted role of unique amino acid substitutions in relation to virulence in humans. Phylogenetic analysis of a partial coding section of the NS5 protein gene region indicated that USUV strains circulating in Europe form three different genetic clusters. Broad and targeted surveys for USUV in mosquitoes could reveal further details of the geographic distribution and genetic diversity of the virus in Europe and in Africa. PMID:24746182
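
    Identity values such as the nucleotide and amino acid percentages quoted here are computed from an alignment of the two sequences. A minimal sketch follows, assuming the sequences are already aligned to equal length and that columns containing a gap are left out of the denominator; other conventions for the denominator exist, and the abstract does not say which one was used. The function name is hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of identical positions between two pre-aligned sequences.

    Assumption: columns where either sequence has a gap are skipped.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap or y == gap:
            continue
        compared += 1
        if x.upper() == y.upper():
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


print(percent_identity("ACGT", "ACGA"))
```

    Because synonymous codon changes alter the nucleotide sequence without changing the encoded residue, amino acid identity between two strains is commonly higher than nucleotide identity, as in the figures reported above.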

  16. Amino acid sequence of the Amur tiger prion protein.

    PubMed

    Wu, Changde; Pang, Wanyong; Zhao, Deming

    2006-10-01

    Prion diseases are fatal neurodegenerative disorders in humans and animals associated with conformational conversion of a cellular prion protein (PrP(C)) into the pathologic isoform (PrP(Sc)). Various data indicate that the polymorphisms within the open reading frame (ORF) of PrP are associated with the susceptibility and control the species barrier in prion diseases. In the present study, partial Prnp from 25 Amur tigers (tPrnp) were cloned and screened for polymorphisms. Four single nucleotide polymorphisms (T423C, A501G, C511A, A610G) were found; the C511A and A610G nucleotide substitutions resulted in the amino acid changes Lysine171Glutamine and Alanine204Threonine, respectively. The tPrnp amino acid sequence is similar to house cat (Felis catus) and sheep, but differs significantly from the other two cat Prnp sequences that were previously deposited in GenBank. PMID:16780982

  17. Amino-terminal amino acid sequence of the major structural polypeptides of avian retroviruses: sequence homology between reticuloendotheliosis virus p30 and p30s of mammalian retroviruses.

    PubMed Central

    Hunter, E; Bhown, A S; Bennett, J C

    1978-01-01

    The major structural polypeptides, p30 of reticuloendotheliosis virus (REV) (strain T) and p27 of avian sarcoma virus B77, have been compared with regard to amino acid composition, NH2-terminal amino acid sequence, and immunological crossreactions. The amino acid composition of the two polypeptides is distinct, and a comparison of the first 30 NH2-terminal amino acids of REV p30 with that for the first 25 of B77 p27 yields only three homologous residues. In competition radioimmunoassays the polypeptides show no crossreactivity. A comparison of the amino acid composition and NH2-terminal amino acid sequence of REV p30 with those reported for several mammalian retrovirus p30s shows remarkable similarities. Both REV and mammalian p30s contain a large number of polar residues in their amino acid composition and show approximately 40% homology in the first 30 NH2-terminal amino acids. No crossreactivity could be observed, however, in competition radioimmunoassays between Rauscher murine leukemia virus p30 and that of REV. The observations reported here suggest a close evolutionary relationship between REV and the mammalian retroviruses. PMID:208072

  18. Bioinformatics comparison of sulfate-reducing metabolism nucleotide sequences

    NASA Astrophysics Data System (ADS)

    Tremberger, G.; Dehipawala, Sunil; Nguyen, A.; Cheung, E.; Sullivan, R.; Holden, T.; Lieberman, D.; Cheung, T.

    2015-09-01

    The sulfate-reducing bacteria can be traced back to 3.5 billion years ago. The thermodynamic details of the sulfur cycle have been well documented. A recent sulfate-reducing bacteria report (Robator, Jungbluth, et al., 2015 Jan, Front. Microbiol) with GenBank nucleotide data has been analyzed in terms of the sulfite reductase (dsrAB) via fractal dimension and entropy values. Comparison to oil field sulfate-reducing sequences was included. The AUCG translational mass fractal dimension versus ATCG transcriptional mass fractal dimension for the low temperature dsrB and dsrA sequences reported in Reference Thirteen shows correlation R-sq ~ 0.79, with a probability of about 3% in simulation. A recent report of using Cystathionine gamma-lyase sequence to produce CdS quantum dot in a biological method, where the sulfur is reduced just like in the H2S production process, was included for comparison. The AUCG mass fractal dimension versus ATCG mass fractal dimension for the Cystathionine gamma-lyase sequences was found to have R-sq of 0.72, similar to the low temperature dissimilatory sulfite reductase dsr group with 3% probability, in contrast to the oil field group having R-sq ~ 0.94, a highly probable outcome in the simulation. The other two simulation histograms, namely, fractal dimension versus entropy R-sq outcome values, and di-nucleotide entropy versus mono-nucleotide entropy R-sq outcome values, are also discussed in the data analysis focusing on low probability outcomes.

  19. Nucleotide and predicted amino acid sequences of cloned human and mouse preprocathepsin B cDNAs.

    PubMed Central

    Chan, S J; San Segundo, B; McCormick, M B; Steiner, D F

    1986-01-01

    Cathepsin B is a lysosomal thiol proteinase that may have additional extralysosomal functions. To further our investigations on the structure, mode of biosynthesis, and intracellular sorting of this enzyme, we have determined the complete coding sequences for human and mouse preprocathepsin B by using cDNA clones isolated from human hepatoma and kidney phage libraries. The nucleotide sequences predict that the primary structure of preprocathepsin B contains 339 amino acids organized as follows: a 17-residue NH2-terminal prepeptide sequence followed by a 62-residue propeptide region, 254 residues in mature (single chain) cathepsin B, and a 6-residue extension at the COOH terminus. A comparison of procathepsin B sequences from three species (human, mouse, and rat) reveals that the homology between the propeptides is relatively conserved with a minimum of 68% sequence identity. In particular, two conserved sequences in the propeptide that may be functionally significant include a potential glycosylation site and the presence of a single cysteine at position 59. Comparative analysis of the three sequences also suggests that processing of procathepsin B is a multistep process, during which enzymatically active intermediate forms may be generated. The availability of the cDNA clones will facilitate the identification of possible active or inactive intermediate processive forms as well as studies on the transcriptional regulation of the cathepsin B gene. PMID:3463996
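
    The segment lengths given in this abstract add up to the stated precursor length: 17 + 62 + 254 + 6 = 339 residues. The sketch below turns those lengths into residue ranges. Numbering from the first residue of the precursor is an assumption made for the example; the position numbers used elsewhere in the abstract may follow a different convention.

```python
# Segment lengths in order from NH2 to COOH terminus, as quoted in the abstract.
SEGMENTS = [
    ("prepeptide", 17),
    ("propeptide", 62),
    ("mature cathepsin B", 254),
    ("COOH-terminal extension", 6),
]


def segment_ranges(segments):
    """Return {name: (first, last)} with 1-based inclusive residue numbers."""
    ranges = {}
    start = 1
    for name, length in segments:
        ranges[name] = (start, start + length - 1)
        start += length
    return ranges


total_residues = sum(length for _, length in SEGMENTS)
ranges = segment_ranges(SEGMENTS)

print(total_residues)
for name, (first, last) in ranges.items():
    print(f"{name}: {first}-{last}")
```

    Under this numbering the mature single-chain enzyme would span residues 80 to 333 of the 339-residue precursor.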

  20. Quantum-Sequencing: Biophysics of quantum tunneling through nucleic acids

    NASA Astrophysics Data System (ADS)

    Casamada Ribot, Josep; Chatterjee, Anushree; Nagpal, Prashant

    2014-03-01

    Tunneling microscopy and spectroscopy have been used extensively in physical surface sciences to study quantum tunneling, to measure the electronic local density of states of nanomaterials, and to characterize adsorbed species. Quantum-Sequencing (Q-Seq) is a new method based on tunneling microscopy for electronic sequencing of single molecules of nucleic acids. A major goal of third-generation sequencing technologies is to develop a fast, reliable, enzyme-free single-molecule sequencing method. Here, we present the unique "electronic fingerprints" for all nucleotides on DNA and RNA using Q-Seq along with their intrinsic biophysical parameters. We have analyzed tunneling spectra for the nucleotides at different pH conditions and analyzed the HOMO, LUMO and energy gap for all of them. In addition we show a number of biophysical parameters to further characterize all nucleobases (electron and hole transition voltage and energy barriers). These results highlight the robustness of Q-Seq as a technique for next-generation sequencing.

  1. Correlation between fibroin amino acid sequence and physical silk properties.

    PubMed

    Fedic, Robert; Zurovec, Michal; Sehnal, Frantisek

    2003-09-12

    The fiber properties of lepidopteran silk depend on the amino acid repeats that interact during H-fibroin polymerization. The aim of our research was to relate repeat composition to insect biology and fiber strength. Representative regions of the H-fibroin genes were sequenced and analyzed in three pyralid species: wax moth (Galleria mellonella), European flour moth (Ephestia kuehniella), and Indian meal moth (Plodia interpunctella). The amino acid repeats are species-specific, evidently a diversification of an ancestral region of 43 residues, and include three types of regularly dispersed motifs: modifications of GSSAASAA sequence, stretches of tripeptides GXZ where X and Z represent bulky residues, and sequences similar to PVIVIEE. No concatenations of GX dipeptide or alanine, which are typical for Bombyx silkworms and Antheraea silk moths, respectively, were found. Despite different repeat structure, the silks of G. mellonella and E. kuehniella exhibit similar tensile strength as the Bombyx and Antheraea silks. We suggest that in these latter two species, variations in the repeat length obstruct repeat alignment, but sufficiently long stretches of iterated residues get superposed to interact. In the pyralid H-fibroins, interactions of the widely separated and diverse motifs depend on the precision of repeat matching; silk is strong in G. mellonella and E. kuehniella, with 2-3 types of long homogeneous repeats, and nearly 10 times weaker in P. interpunctella, with seven types of shorter erratic repeats. The high proportion of large amino acids in the H-fibroin of pyralids has probably evolved in connection with the spinning habit of caterpillars that live in protective silk tubes and spin continuously, enlarging the tubes on one end and partly devouring the other one. The silk serves as a depot of energetically rich and essential amino acids that may be scarce in the diet. PMID:12816957

  2. Amino acid sequence of the nonsecretory ribonuclease of human urine.

    PubMed

    Beintema, J J; Hofsteenge, J; Iwama, M; Morita, T; Ohgi, K; Irie, M; Sugiyama, R H; Schieven, G L; Dekker, C A; Glitz, D G

    1988-06-14

    The amino acid sequence of a nonsecretory ribonuclease isolated from human urine was determined except for the identity of the residue at position 7. Sequence information indicates that the ribonucleases of human liver and spleen and an eosinophil-derived neurotoxin are identical or very closely related gene products. The sequence is identical at about 30% of the amino acid positions with those of all of the secreted mammalian ribonucleases for which information is available. Identical residues include active-site residues histidine-12, histidine-119, and lysine-41, other residues known to be important for substrate binding and catalytic activity, and all eight half-cystine residues common to these enzymes. Major differences include a deletion of six residues in the (so-called) S-peptide loop, insertions of two, and nine residues, respectively, in three other external loops of the molecule, and an addition of three residues at the amino terminus. The sequence shows the human nonsecretory ribonuclease to belong to the same ribonuclease superfamily as the mammalian secretory ribonucleases, turtle pancreatic ribonuclease, and human angiogenin. Sequence data suggest that a gene duplication occurred in an ancient vertebrate ancestor; one branch led to the nonsecretory ribonuclease, while the other branch led to a second duplication, with one line leading to the secretory ribonucleases (in mammals) and the second line leading to pancreatic ribonuclease in turtle and an angiogenic factor in mammals (human angiogenin). The nonsecretory ribonuclease has five short carbohydrate chains attached via asparagine residues at the surface of the molecule; these chains may have been shortened by exoglycosidase action.(ABSTRACT TRUNCATED AT 250 WORDS) PMID:3166997

  3. Sequence comparisons in the aminoacyl-tRNA synthetases with emphasis on regions of likely homology with sequences in the Rossmann fold in the methionyl and tyrosyl enzymes.

    PubMed

    Walker, E J; Jeffrey, P D

    1988-02-01

    Amino acid sequences of aminoacyl-tRNA synthetases specific for 12 different amino acids have now been published. Differences in origin at the species and organelle level result in 20 distinct sequences being available for comparison. Some of these were compared in small groups as they were determined and, although some homologies were detected, it was generally concluded that there was surprisingly little sequence homology in this functionally related group of enzymes. We have made comparisons of all of the available sequences by using a combination of computer and manual alignment methods and knowledge of the sequences in the Rossmann fold region of methionyl-tRNA synthetase from E. coli and tyrosyl-tRNA synthetase from B. stearothermophilus, enzymes whose three-dimensional structures have been described. It emerges that all of the aminoacyl-tRNA synthetase sequences thus examined show considerable homology with each other over at least parts of this region, some over virtually all of it. We conclude that a great deal more similarity than had previously been suspected exists in these proteins. In particular, the alignments we have made strongly imply the existence of a mononucleotide binding site of the Rossmann fold configuration in all of the synthetases compared. PMID:3283733

  4. Characterization and amino acid sequence of a fatty acid-binding protein from human heart.

    PubMed

    Offner, G D; Brecher, P; Sawlivich, W B; Costello, C E; Troxler, R F

    1988-05-15

    The complete amino acid sequence of a fatty acid-binding protein from human heart was determined by automated Edman degradation of CNBr, BNPS-skatole [3'-bromo-3-methyl-2-(2-nitrobenzenesulphenyl)indolenine], hydroxylamine, Staphylococcus aureus V8 proteinase, tryptic and chymotryptic peptides, and by digestion of the protein with carboxypeptidase A. The sequence of the blocked N-terminal tryptic peptide from citraconylated protein was determined by collisionally induced decomposition mass spectrometry. The protein contains 132 amino acid residues, is enriched with respect to threonine and lysine, lacks cysteine, has an acetylated valine residue at the N-terminus, and has an Mr of 14768 and an isoelectric point of 5.25. This protein contains two short internal repeated sequences from residues 48-54 and from residues 114-119 located within regions of predicted beta-structure and decreasing hydrophobicity. These short repeats are contained within two longer repeated regions from residues 48-60 and residues 114-125, which display 62% sequence similarity. These regions could accommodate the charged and uncharged moieties of long-chain fatty acids and may represent fatty acid-binding domains consistent with the finding that human heart fatty acid-binding protein binds 2 mol of oleate or palmitate/mol of protein. Detailed evidence for the amino acid sequences of the peptides has been deposited as Supplementary Publication SUP 50143 (23 pages) at the British Library Lending Division, Boston Spa, Yorkshire LS23 7BQ, U.K., from whom copies may be obtained as indicated in Biochem. J. (1988) 249, 5. PMID:3421901

  5. Molecular cloning and amino acid sequence of human 5-lipoxygenase

    SciTech Connect

    Matsumoto, T.; Funk, C.D.; Radmark, O.; Hoeoeg, J.O.; Joernvall, H.; Samuelsson, B.

    1988-01-01

    5-Lipoxygenase (EC 1.13.11.34), a Ca2+- and ATP-requiring enzyme, catalyzes the first two steps in the biosynthesis of the peptidoleukotrienes and the chemotactic factor leukotriene B4. A cDNA clone corresponding to 5-lipoxygenase was isolated from a human lung lambda gt11 expression library by immunoscreening with a polyclonal antibody. Additional clones from a human placenta lambda gt11 cDNA library were obtained by plaque hybridization with the 32P-labeled lung cDNA clone. Sequence data obtained from several overlapping clones indicate that the composite DNAs contain the complete coding region for the enzyme. From the deduced primary structure, 5-lipoxygenase encodes a 673 amino acid protein with a calculated molecular weight of 77,839. Direct analysis of the native protein and its proteolytic fragments confirmed the deduced composition, the amino-terminal amino acid sequence, and the structure of many internal segments. 5-Lipoxygenase has no apparent sequence homology with leukotriene A4 hydrolase or Ca2+-binding proteins. RNA blot analysis indicated substantial amounts of an mRNA species of approximately 2700 nucleotides in leukocytes, lung, and placenta.

  6. Nucleic acid sequence detection using multiplexed oligonucleotide PCR

    DOEpatents

    Nolan, John P.; White, P. Scott

    2006-12-26

    Methods for rapidly detecting single or multiple sequence alleles in a sample nucleic acid are described. Provided are all of the oligonucleotide pairs capable of annealing specifically to a target allele and discriminating among possible sequences thereof, and ligating to each other to form an oligonucleotide complex when a particular sequence feature is present (or, alternatively, absent) in the sample nucleic acid. The design of each oligonucleotide pair permits the subsequent high-level PCR amplification of a specific amplicon when the oligonucleotide complex is formed, but not when the oligonucleotide complex is not formed. The presence or absence of the specific amplicon is used to detect the allele. Detection of the specific amplicon may be achieved using a variety of methods well known in the art, including without limitation, oligonucleotide capture onto DNA chips or microarrays, oligonucleotide capture onto beads or microspheres, electrophoresis, and mass spectrometry. Various labels and address-capture tags may be employed in the amplicon detection step of multiplexed assays, as further described herein.

  7. The amino acid sequence of chymopapain from Carica papaya.

    PubMed Central

    Watson, D C; Yaguchi, M; Lynn, K R

    1990-01-01

    Chymopapain is a polypeptide of 218 amino acid residues. It has considerable structural similarity with papain and papaya proteinase omega, including conservation of the catalytic site and of the disulphide bonding. Chymopapain is like papaya proteinase omega in carrying four extra residues between papain positions 168 and 169, but differs from both papaya proteinases in the composition of its S2 subsite, as well as in having a second thiol group, Cys-117. Some evidence for the amino acid sequence of chymopapain has been deposited as Supplementary Publication SUP 50153 (12 pages) at the British Library Document Supply Centre, Boston Spa., Wetherby, West Yorkshire LS23 7BQ, U.K., from whom copies may be obtained on the terms indicated in Biochem. J. (1990) 265, 5. The information comprises Supplement Tables 1-4, which contain, in order, amino acid compositions of peptides from tryptic, peptic, CNBr and mild acid cleavages, Supplement Fig. 1, showing re-fractionation of selected peaks from Fig. 2 of the main paper. Supplement Fig. 2, showing cation-exchange chromatography of the earliest-eluted peak of Fig. 3 of the main paper, Supplement Fig. 3, showing reverse-phase h.p.l.c. of the later-eluted peak from Fig. 3 of the main paper, and Supplement Fig. 4, showing the separation of peptides after mild acid hydrolysis of CNBr-cleavage fragment CB3. PMID:2106878

  8. The amino acid sequence of chymopapain from Carica papaya.

    PubMed

    Watson, D C; Yaguchi, M; Lynn, K R

    1990-02-15

    Chymopapain is a polypeptide of 218 amino acid residues. It has considerable structural similarity with papain and papaya proteinase omega, including conservation of the catalytic site and of the disulphide bonding. Chymopapain is like papaya proteinase omega in carrying four extra residues between papain positions 168 and 169, but differs from both papaya proteinases in the composition of its S2 subsite, as well as in having a second thiol group, Cys-117. Some evidence for the amino acid sequence of chymopapain has been deposited as Supplementary Publication SUP 50153 (12 pages) at the British Library Document Supply Centre, Boston Spa., Wetherby, West Yorkshire LS23 7BQ, U.K., from whom copies may be obtained on the terms indicated in Biochem. J. (1990) 265, 5. The information comprises Supplement Tables 1-4, which contain, in order, amino acid compositions of peptides from tryptic, peptic, CNBr and mild acid cleavages, Supplement Fig. 1, showing re-fractionation of selected peaks from Fig. 2 of the main paper. Supplement Fig. 2, showing cation-exchange chromatography of the earliest-eluted peak of Fig. 3 of the main paper, Supplement Fig. 3, showing reverse-phase h.p.l.c. of the later-eluted peak from Fig. 3 of the main paper, and Supplement Fig. 4, showing the separation of peptides after mild acid hydrolysis of CNBr-cleavage fragment CB3. PMID:2106878

  9. Comparison of Buffer Effect of Different Acids During Sandstone Acidizing

    NASA Astrophysics Data System (ADS)

    Umer Shafiq, Mian; Khaled Ben Mahmud, Hisham; Hamid, Mohamed Ali

    2015-04-01

    The most important concern of sandstone matrix acidizing is to increase the formation permeability by removing the silica particles. To accomplish this, mud acid (HF: HCl) has been used successfully for many years to stimulate sandstone formations, but it still has many complexities. This paper presents the results of laboratory investigations of different acid combinations (HF: HCl, HF: H3PO4 and HF: HCOOH). Hydrofluoric acid and fluoboric acid are used to dissolve clays and feldspar. Phosphoric and formic acids are added as a buffer to maintain the pH of the solution; they also allow the maximum penetration of acid into the core sample. Different tests have been performed on the core samples before and after acidizing to compare the buffer effect of these acids. The analysis consists of permeability, porosity, color change and pH value tests. The increase in permeability and porosity was greater, and the change in pH smaller, when phosphoric and formic acids were used than with mud acid. From these results it has been found that the buffer effect of phosphoric acid and formic acid is better than that of hydrochloric acid.

  10. RCARE: RNA Sequence Comparison and Annotation for RNA Editing

    PubMed Central

    2015-01-01

    The post-transcriptional sequence modification of transcripts through RNA editing is an important mechanism for regulating protein function and is associated with human disease phenotypes. The identification of RNA editing or RNA-DNA difference (RDD) sites is a fundamental step in the study of RNA editing. However, a substantial number of false-positive RDD sites have been identified recently. A major challenge in identifying RDD sites is to distinguish between the true RNA editing sites and the false positives. Furthermore, determining the location of condition-specific RDD sites and elucidating their functional roles will help toward understanding various biological phenomena that are mediated by RNA editing. The present study developed RNA-sequence comparison and annotation for RNA editing (RCARE) for searching, annotating, and visualizing RDD sites using thousands of previously known editing sites, which can be used for comparative analyses between multiple samples. RCARE also provides evidence for improving the reliability of identified RDD sites. RCARE is a web-based comparison, annotation, and visualization tool, which provides rich biological annotations and useful summary plots. The developers of previous tools that identify or annotate RNA-editing sites seldom mention the reliability of their respective tools. To address this issue, RCARE uses a number of scientific publications and databases to find documentation specific to a particular RNA-editing site, and from it generates evidence levels that convey the reliability of that site. Sequence-based alignment files can be converted into VCF files using a Python script and uploaded to the RCARE server for further analysis. RCARE is available for free at http://www.snubi.org/software/rcare/. PMID:26043858

  11. Comparison of DNA Quantification Methods for Next Generation Sequencing

    PubMed Central

    Robin, Jérôme D.; Ludlow, Andrew T.; LaRanger, Ryan; Wright, Woodring E.; Shay, Jerry W.

    2016-01-01

    Next Generation Sequencing (NGS) is a powerful tool that depends on loading a precise amount of DNA onto a flowcell. NGS strategies have expanded our ability to investigate genomic phenomena by referencing mutations in cancer and diseases through large-scale genotyping, developing methods to map rare chromatin interactions (4C; 5C and Hi-C) and identifying chromatin features associated with regulatory elements (ChIP-seq, Bis-Seq, ChiA-PET). While many methods are available for DNA library quantification, there is no unambiguous gold standard. Most techniques use PCR to amplify DNA libraries to obtain sufficient quantities for optical density measurement. However, increased PCR cycles can distort the library’s heterogeneity and prevent the detection of rare variants. In this analysis, we compared new digital PCR technologies (droplet digital PCR; ddPCR, ddPCR-Tail) with standard methods for the titration of NGS libraries. DdPCR-Tail is comparable to qPCR and fluorometry (QuBit) and allows sensitive quantification by analysis of barcode repartition after sequencing of multiplexed samples. This study provides a direct comparison between quantification methods throughout a complete sequencing experiment and provides the impetus to use ddPCR-based quantification for improvement of NGS quality. PMID:27048884

  12. Comparison of DNA Quantification Methods for Next Generation Sequencing.

    PubMed

    Robin, Jérôme D; Ludlow, Andrew T; LaRanger, Ryan; Wright, Woodring E; Shay, Jerry W

    2016-01-01

    Next Generation Sequencing (NGS) is a powerful tool that depends on loading a precise amount of DNA onto a flowcell. NGS strategies have expanded our ability to investigate genomic phenomena by referencing mutations in cancer and diseases through large-scale genotyping, developing methods to map rare chromatin interactions (4C; 5C and Hi-C) and identifying chromatin features associated with regulatory elements (ChIP-seq, Bis-Seq, ChiA-PET). While many methods are available for DNA library quantification, there is no unambiguous gold standard. Most techniques use PCR to amplify DNA libraries to obtain sufficient quantities for optical density measurement. However, increased PCR cycles can distort the library's heterogeneity and prevent the detection of rare variants. In this analysis, we compared new digital PCR technologies (droplet digital PCR; ddPCR, ddPCR-Tail) with standard methods for the titration of NGS libraries. DdPCR-Tail is comparable to qPCR and fluorometry (QuBit) and allows sensitive quantification by analysis of barcode repartition after sequencing of multiplexed samples. This study provides a direct comparison between quantification methods throughout a complete sequencing experiment and provides the impetus to use ddPCR-based quantification for improvement of NGS quality. PMID:27048884

  13. Thermal adaptation analyzed by comparison of protein sequences from mesophilic and extremely thermophilic Methanococcus species

    NASA Technical Reports Server (NTRS)

    Haney, P. J.; Badger, J. H.; Buldak, G. L.; Reich, C. I.; Woese, C. R.; Olsen, G. J.

    1999-01-01

    The genome sequence of the extremely thermophilic archaeon Methanococcus jannaschii provides a wealth of data on proteins from a thermophile. In this paper, sequences of 115 proteins from M. jannaschii are compared with their homologs from mesophilic Methanococcus species. Although the growth temperatures of the mesophiles are about 50 degrees C below that of M. jannaschii, their genomic G+C contents are nearly identical. The properties most correlated with the proteins of the thermophile include higher residue volume, higher residue hydrophobicity, more charged amino acids (especially Glu, Arg, and Lys), and fewer uncharged polar residues (Ser, Thr, Asn, and Gln). These are recurring themes, with all trends applying to 83-92% of the proteins for which complete sequences were available. Nearly all of the amino acid replacements most significantly correlated with the temperature change are the same relatively conservative changes observed in all proteins, but in the case of the mesophile/thermophile comparison there is a directional bias. We identify 26 specific pairs of amino acids with a statistically significant (P < 0.01) preferred direction of replacement.
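
    The compositional trends listed in this abstract (more Glu, Arg and Lys; fewer Ser, Thr, Asn and Gln in the thermophile) can be checked on a pair of homologous sequences with a few lines of code. The sketch below is only an illustration under stated assumptions: the two residue classes are taken from the abstract, the function names are hypothetical, and it does not reproduce the authors' statistical test for the preferred direction of replacement.

```python
# Residue classes named in the abstract (one-letter codes).
CHARGED_ERK = set("ERK")       # Glu, Arg, Lys
POLAR_STNQ = set("STNQ")       # Ser, Thr, Asn, Gln


def class_fraction(seq, residues):
    """Fraction of positions in seq occupied by any residue in `residues`."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return sum(1 for aa in seq if aa in residues) / len(seq)


def composition_shift(thermophile_seq, mesophile_seq):
    """Thermophile-minus-mesophile difference in each class fraction."""
    return {
        "charged_ERK": class_fraction(thermophile_seq, CHARGED_ERK)
        - class_fraction(mesophile_seq, CHARGED_ERK),
        "polar_STNQ": class_fraction(thermophile_seq, POLAR_STNQ)
        - class_fraction(mesophile_seq, POLAR_STNQ),
    }
```

    A positive `charged_ERK` value together with a negative `polar_STNQ` value for a homolog pair would be consistent with the direction of the trends reported above.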

  14. Sequence of subunit c of the Na(+)-translocating F1F0 ATPase of Acetobacterium woodii: proposal for determinants of Na+ specificity as revealed by sequence comparisons.

    PubMed

    Rahlfs, S; Müller, V

    1997-03-10

    A 3.2 kb EcoRI fragment carrying genes for Na(+)-F1F0 ATPase was cloned from chromosomal DNA of Acetobacterium woodii. DNA sequence analysis revealed the presence of an open reading frame which was identified by data base searches and comparison with the experimentally derived N-terminal amino acid sequence to code for subunit c of Na(+)-F1F0 ATPase. A comparison of the primary sequences of the two well established Na(+)-translocating F1F0 ATPases from Acetobacterium woodii and Propionigenium modestum with H(+)-translocating enzymes indicates the length of the C-terminus as well as specific residues located in the cytoplasmic membrane to be important for Na+ transport. PMID:9119076

  15. Amino acid sequence prerequisites for the formation of cn ions.

    PubMed

    Downard, K M; Biemann, K

    1993-11-01

    Amino acid sequence prerequisites are described for the formation of cn ions observed in high-energy collision-induced decomposition spectra of peptides. It is shown that the formation of cn ions is promoted by the nature of the amino acid C-terminal to the cleavage site. A propensity for cn cleavage preceding threonine, and to a lesser extent tryptophan, lysine, and serine, is demonstrated where fragmentation is directed N-terminally at these residues. In addition, the nature of the residue N-terminal to the cleavage site is shown to have little effect on cn ion formation. A mechanism for cn ion formation is proposed and its applicability to the results observed is discussed. PMID:24227531

  16. Ultrasensitive nucleic acid sequence detection by single-molecule electrophoresis

    SciTech Connect

    Castro, A; Shera, E.B.

    1996-09-01

    This is the final report of a one-year laboratory-directed research and development project at Los Alamos National Laboratory. There has been considerable interest in the development of very sensitive clinical diagnostic techniques over the last few years. Many pathogenic agents are often present in extremely small concentrations in clinical samples, especially at the initial stages of infection, making their detection very difficult. This project sought to develop a new technique for the detection and accurate quantification of specific bacterial and viral nucleic acid sequences in clinical samples. The scheme involved the use of novel hybridization probes for the detection of nucleic acids combined with our recently developed technique of single-molecule electrophoresis. This project is directly relevant to the DOE's Defense Programs strategic directions in the area of biological warfare counter-proliferation.

  17. Genome sequence comparison of two United States live attenuated vaccines of infectious laryngotracheitis virus (ILTV).

    PubMed

    Chandra, Yohanna Gita; Lee, Jeongyoon; Kong, Byung-Whi

    2012-06-01

    This study was conducted to identify unique nucleotide differences in two U.S. chicken embryo origin (CEO) vaccines [LT Blen (GenBank accession: JQ083493) designated as vaccine 1; Laryngo-Vac(®) (GenBank accession: JQ083494) designated as vaccine 2] of infectious laryngotracheitis virus (ILTV) genomes compared to an Australian Serva vaccine reference ILTV genome sequence [Gallid herpesvirus 1 (GaHV-1); GenBank accession number: HQ630064]. Genomes of the two vaccine ILTV strains were sequenced using Illumina Genome Analyzer 2X of 36 cycles of single-end reads. Results revealed that few nucleotide differences (23 in vaccine 1; 31 in vaccine 2) were found and indicate that the US CEO strains are practically identical to the Australian Serva CEO strain, which is a European-origin vaccine. The sequence differences demonstrated the spectrum of variability among vaccine strains. Only eight amino acid differences were found in ILTV proteins including UL54, UL27, UL28, UL20, UL1, ICP4, and US8 in vaccine 1. Similarly, in vaccine 2, eight amino acid differences were found in UL54, UL27, UL28, UL36, UL1, ICP4, US10, and US8. Further comparison of US CEO vaccines to several ILTV genome sequences revealed that US CEO vaccines are genetically close to both the Serva vaccine and 63140/C/08/BR (GenBank accession: HM188407) and are distinct from the two Australian-origin CEO vaccines, SA2 (GenBank accession: JN596962) and A20 (GenBank accession: JN596963), which showed close similarity to each other. These data demonstrate the potential of high-throughput sequencing technology to yield insight into the sequence variation of different ILTV strains. This information can be used to discriminate between vaccine ILTV strains and further, to identify newly emerging mutant strains of field isolates. PMID:22382591

  18. Characterization of the microbial acid mine drainage microbial community using culturing and direct sequencing techniques.

    PubMed

    Auld, Ryan R; Myre, Maxine; Mykytczuk, Nadia C S; Leduc, Leo G; Merritt, Thomas J S

    2013-05-01

    We characterized the bacterial community from an AMD tailings pond using both classical culturing and modern direct sequencing techniques and compared the two methods. Acid mine drainage (AMD) is produced by the environmental and microbial oxidation of minerals dissolved from mining waste. Surprisingly, we know little about the microbial communities associated with AMD, despite the fundamental ecological roles of these organisms and large-scale economic impact of these waste sites. AMD microbial communities have classically been characterized by laboratory culturing-based techniques and more recently by direct sequencing of marker gene sequences, primarily the 16S rRNA gene. In our comparison of the techniques, we find that their results are complementary, overall indicating very similar community structure with similar dominant species, but with each method identifying some species that were missed by the other. We were able to culture the majority of species that our direct sequencing results indicated were present, primarily species within the Acidithiobacillus and Acidiphilium genera, although estimates of relative species abundance were only obtained from direct sequencing. Interestingly, our culture-based methods recovered four species that had been overlooked from our sequencing results because of the rarity of the marker gene sequences, likely members of the rare biosphere. Further, direct sequencing indicated that a single genus, completely missed in our culture-based study, Legionella, was a dominant member of the microbial community. Our results suggest that while either method does a reasonable job of identifying the dominant members of the AMD microbial community, together the methods combine to give a more complete picture of the true diversity of this environment. PMID:23485423

  19. A comparison of chromic acid and sulfuric acid anodizing

    NASA Technical Reports Server (NTRS)

    Danford, M. D.

    1992-01-01

    Because of federal and state mandates restricting the use of hexavalent chromium, it was deemed worthwhile to compare the corrosion protection afforded 2219-T87 aluminum alloy by both Type I chromic acid and Type II sulfuric acid anodizing per MIL-A-8625. Corrosion measurements were made on large, flat 2219-T87 aluminum alloy sheet material with an area of 1 cm^2 exposed to a corrosive medium of 3.5-percent sodium chloride at pH 5.5. Both ac electrochemical impedance spectroscopy and the dc polarization resistance techniques were employed. The results clearly indicate that the corrosion protection obtained by Type II sulfuric acid anodizing is superior, and no problems should result by substituting Type II sulfuric acid anodizing for Type I chromic acid anodizing.

  20. Whole Chloroplast Genome Sequencing in Fragaria Using Deep Sequencing: A Comparison of Three Methods

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Chloroplast sequences previously investigated in Fragaria revealed low amounts of variation. Deep sequencing technologies enable economical sequencing of complete chloroplast genomes. These sequences can potentially provide robust phylogenetic resolution, even at low taxonomic levels within plant gr...

  1. Evaluation of intra- and interspecific divergence of satellite DNA sequences by nucleotide frequency calculation and pairwise sequence comparison

    PubMed Central

    2003-01-01

    Satellite DNA sequences are known to be highly variable and to have been subjected to concerted evolution that homogenizes member sequences within species. We have analyzed the mode of evolution of satellite DNA sequences in four fishes from the genus Diplodus by calculating the nucleotide frequency of the sequence array and the phylogenetic distances between member sequences. Calculation of nucleotide frequency and pairwise sequence comparison enabled us to characterize the divergence among member sequences in this satellite DNA family. The results suggest that the evolutionary rate of satellite DNA in D. bellottii is about two-fold greater than the average of the other three fishes, and that the sequence homogenization event occurred in D. puntazzo more recently than in the others. The procedures described here are effective to characterize mode of evolution of satellite DNA. PMID:12734555

  2. Evaluation of intra- and interspecific divergence of satellite DNA sequences by nucleotide frequency calculation and pairwise sequence comparison.

    PubMed

    Kato, Mikio

    2003-01-01

    Satellite DNA sequences are known to be highly variable and to have been subjected to concerted evolution that homogenizes member sequences within species. We have analyzed the mode of evolution of satellite DNA sequences in four fishes from the genus Diplodus by calculating the nucleotide frequency of the sequence array and the phylogenetic distances between member sequences. Calculation of nucleotide frequency and pairwise sequence comparison enabled us to characterize the divergence among member sequences in this satellite DNA family. The results suggest that the evolutionary rate of satellite DNA in D. bellottii is about two-fold greater than the average of the other three fishes, and that the sequence homogenization event occurred in D. puntazzo more recently than in the others. The procedures described here are effective to characterize mode of evolution of satellite DNA. PMID:12734555
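
    The two calculations named in this record, nucleotide frequency along the repeat and pairwise sequence comparison, can be sketched as follows. This is a minimal illustration that assumes the member sequences are already aligned to equal length and uses the simple proportion of differing sites (p-distance) as a stand-in for the phylogenetic distances mentioned in the abstract; the function names are hypothetical and this is not the author's procedure.

```python
from collections import Counter
from itertools import combinations


def column_frequencies(aligned):
    """Per-position frequency of A, C, G, T across aligned member sequences."""
    length = len(aligned[0])
    if any(len(s) != length for s in aligned):
        raise ValueError("sequences must be aligned to equal length")
    freqs = []
    for column in zip(*aligned):
        counts = Counter(column)
        freqs.append({base: counts.get(base, 0) / len(column) for base in "ACGT"})
    return freqs


def p_distance(a, b):
    """Proportion of aligned positions at which two sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def mean_pairwise_distance(aligned):
    """Average p-distance over all pairs of member sequences."""
    pairs = list(combinations(aligned, 2))
    return sum(p_distance(a, b) for a, b in pairs) / len(pairs)
```

    Columns whose frequencies are close to 1.0 for a single base indicate homogenized positions, while a low mean pairwise distance within a species is what recent homogenization of the array would be expected to produce.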

  3. 37 CFR 1.822 - Symbols and format to be used for nucleotide and/or amino acid sequence data.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... in the sequence. (4) The enumeration of amino acids may start at the first amino acid of the first..., counting backwards starting with the amino acid next to number 1. Otherwise, the enumeration of amino acids... sequence every 5 amino acids. The enumeration method for amino acid sequences that is set forth......

  4. 37 CFR 1.822 - Symbols and format to be used for nucleotide and/or amino acid sequence data.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... in the sequence. (4) The enumeration of amino acids may start at the first amino acid of the first..., counting backwards starting with the amino acid next to number 1. Otherwise, the enumeration of amino acids... sequence every 5 amino acids. The enumeration method for amino acid sequences that is set forth......

  5. 37 CFR 1.822 - Symbols and format to be used for nucleotide and/or amino acid sequence data.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... in the sequence. (4) The enumeration of amino acids may start at the first amino acid of the first..., counting backwards starting with the amino acid next to number 1. Otherwise, the enumeration of amino acids... sequence every 5 amino acids. The enumeration method for amino acid sequences that is set forth......

  6. Predicting protein disorder by analyzing amino acid sequence

    PubMed Central

    Yang, Jack Y; Yang, Mary Qu

    2008-01-01

    Background: Many protein regions and some entire proteins have no definite tertiary structure, presenting instead as dynamic, disordered ensembles under different physicochemical circumstances. These proteins and regions are known as Intrinsically Unstructured Proteins (IUP). IUP have been associated with a wide range of protein functions, along with roles in diseases characterized by protein misfolding and aggregation. Results: Identifying IUP is an important task in structural and functional genomics. We extract useful features from sequences and develop machine learning algorithms for this task. We compare our IUP predictor with PONDRs (mainly neural-network-based predictors), disEMBL (also based on neural networks) and Globplot (based on disorder propensity). Conclusion: We find that augmenting features derived from physicochemical properties of amino acids (such as hydrophobicity, complexity, etc.) and using an ensemble method proved beneficial. The IUP predictor is a viable alternative software tool for identifying IUP protein regions and proteins. PMID:18831799

  7. Alignment-free comparison of genome sequences by a new numerical characterization.

    PubMed

    Huang, Guohua; Zhou, Houqing; Li, Yongfan; Xu, Lixin

    2011-07-21

    In order to compare different genome sequences, an alignment-free method has been proposed. First, we presented a new graphical representation of DNA sequences without degeneracy, which is conducive to intuitive comparison of sequences. Then, a new numerical characterization based on the representation was introduced to quantitatively depict the intrinsic nature of genome sequences, and considered as a 10-dimensional vector in the mathematical space. Alignment-free comparison of sequences was performed by computing the distances between vectors of the corresponding numerical characterizations, which define the evolutionary relationship. Two data sets of DNA sequences were constructed to assess the performance on sequence comparison. The results illustrate the validity of the method well. The new numerical characterization provides a powerful tool for genome comparison. PMID:21536050

  8. Sequence comparison on a cluster of workstations using the PVM system

    SciTech Connect

    Guan, X.; Mural, R.J.; Uberbacher, E.C.

    1995-02-01

    We have implemented a distributed sequence comparison algorithm on a cluster of workstations using the PVM paradigm. This implementation has achieved performance similar to that of the Intel iPSC/860 Hypercube, a massively parallel computer. The distributed sequence comparison algorithm serves as a search tool for two Internet servers, GRAIL and GENQUEST. This paper describes the implementation and the performance of the algorithm.

  9. Alignment-Free Sequence Comparison Based on Next-Generation Sequencing Reads

    PubMed Central

    Song, Kai; Ren, Jie; Zhai, Zhiyuan; Liu, Xuemei

    2013-01-01

    Next-generation sequencing (NGS) technologies have generated enormous amounts of shotgun read data, and assembly of the reads can be challenging, especially for organisms without template sequences. We study the power of genome comparison based on shotgun read data without assembly using three alignment-free sequence comparison statistics, $D_2$, $D_2^{*}$, and $D_2^{S}$, both theoretically and by simulations. Theoretical formulas for the power of detecting the relationship between two sequences related through a common motif model are derived. It is shown that both $D_2^{*}$ and ...
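
    As an illustration of the simplest of the three statistics named in this abstract, the sketch below computes D2 in its commonly used form: the sum, over all k-mers w, of the product of the number of occurrences of w in each of the two data sets. Counting k-mers across unassembled reads, the function names, and the toy reads are assumptions made for illustration; the centered and normalized variants are not implemented here, and reverse-complement handling is omitted.

```python
from collections import Counter


def kmer_counts(reads, k):
    """Count every overlapping k-mer across a collection of reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def d2(reads_a, reads_b, k):
    """D2 statistic: sum over k-mers of count_in_A * count_in_B."""
    counts_a = kmer_counts(reads_a, k)
    counts_b = kmer_counts(reads_b, k)
    # Counter lookups return 0 for k-mers absent from the second sample.
    return sum(n * counts_b[w] for w, n in counts_a.items())


# Hypothetical toy read sets standing in for two shotgun samples.
sample_a = ["ACG", "CGT"]
sample_b = ["ACGT"]
score = d2(sample_a, sample_b, k=2)
```

    A larger value means the two samples share more k-mer occurrences; in practice the raw value depends strongly on sequence length and composition, which is the usual motivation for normalized variants.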

  10. Comparison of antioxidant effectiveness of lipoic acid and dihydrolipoic acid.

    PubMed

    Zhao, Feng; Liu, Zai-Qun

    2011-01-01

    The abilities of dihydrolipoic acid (DHLA) to scavenge peroxynitrite (ONOO⁻), galvinoxyl radical, 2,2'-azinobis(3-ethylbenzothiazoline-6-sulfonate) cation radical (ABTS⁺•), and 2,2'-diphenyl-1-picrylhydrazyl radical (DPPH) were higher than those of lipoic acid (LA). The effectiveness of DHLA to protect methyl linoleate against 2,2'-azobis(2-amidinopropane hydrochloride) (AAPH)-induced oxidation was about 2.2-fold higher than that of LA, and DHLA can retard the autoxidation of linoleic acid (LH) in the β-carotene-bleaching test. DHLA can also trap ∼0.6 radicals in AAPH-induced oxidation of LH. Moreover, DHLA can scavenge ∼2.0 radicals in AAPH-induced oxidation of DNA and AAPH-induced hemolysis of erythrocytes, whereas LA can scavenge ∼1.5 radicals under the same experimental conditions. DHLA can protect erythrocytes against hemin-induced hemolysis, but accelerates the degradation of DNA in the presence of Cu²⁺. Therefore, the antioxidant capacity of -SH in DHLA is higher than that of S-S in LA. PMID:21812071

  11. Structural gene and complete amino acid sequence of Pseudomonas aeruginosa IFO 3455 elastase.

    PubMed Central

    Fukushima, J; Yamamoto, S; Morihara, K; Atsumi, Y; Takeuchi, H; Kawamoto, S; Okuda, K

    1989-01-01

    The DNA encoding the elastase of Pseudomonas aeruginosa IFO 3455 was cloned, and its complete nucleotide sequence was determined. When the cloned gene was ligated to pUC18, the Escherichia coli expression vector, bacteria carrying the gene exhibited high levels of both elastase activity and elastase antigens. The amino acid sequence, deduced from the nucleotide sequence, revealed that the mature elastase consisted of 301 amino acids with a relative molecular mass of 32,926 daltons. The amino acid composition predicted from the DNA sequence was quite similar to the chemically determined composition of purified elastase reported previously. We also observed nucleotide sequence encoding a signal peptide and "pro" sequence consisting of 197 amino acids upstream from the mature elastase protein gene. The amino acid sequence analysis revealed that both the N-terminal sequence of the purified elastase and the N-terminal side sequences of the C-terminal tryptic peptide as well as the internal lysyl peptide fragment were completely identical to the deduced amino acid sequences. The pattern of identity of amino acid sequences was quite evident in the regions that include structurally and functionally important residues of Bacillus subtilis thermolysin. PMID:2493453

  12. Homology analyses of the protein sequences of fatty acid synthases from chicken liver, rat mammary gland, and yeast

    SciTech Connect

    Chang, Soo-Ik; Hammes, G.G.

    1989-11-01

    Homology analyses of the protein sequences of chicken liver and rat mammary gland fatty acid synthases were carried out. The amino acid sequences of the chicken and rat enzymes are 67% identical. If conservative substitutions are allowed, 78% of the amino acids are matched. A region of low homologies exists between the functional domains, in particular around amino acid residues 1059-1264 of the chicken enzyme. Homologies between the active sites of chicken and rat and of chicken and yeast enzymes have been analyzed by an alignment method. A high degree of homology exists between the active sites of the chicken and rat enzymes. However, the chicken and yeast enzymes show a lower degree of homology. The NADPH-binding dinucleotide folds of the β-ketoacyl reductase and the enoyl reductase sites were identified by comparison with a known consensus sequence for the NADP- and FAD-binding dinucleotide folds. The active sites of all of the enzymes are primarily in hydrophobic regions of the protein. This study suggests that the genes for the functional domains of fatty acid synthase were originally separated, and these genes were connected to each other by using different connecting nucleotide sequences in different species. An alternative explanation for the differences in rat and chicken is a common ancestry and mutations in the joining regions during evolution.

  13. Human retroviruses and AIDS 1996. A compilation and analysis of nucleic acid and amino acid sequences

    SciTech Connect

    Myers, G.; Foley, B.; Korber, B.; Mellors, J.W.; Jeang, K.T.; Wain-Hobson, S.

    1997-04-01

    This compendium and the accompanying floppy diskettes are the result of an effort to compile and rapidly publish all relevant molecular data concerning the human immunodeficiency viruses (HIV) and related retroviruses. The scope of the compendium and database is best summarized by the five parts that it comprises: (1) Nucleic Acid Alignments and Sequences; (2) Amino Acid Alignments; (3) Analysis; (4) Related Sequences; and (5) Database Communications. Information within all the parts is updated throughout the year on the Web site, http://hiv-web.lanl.gov. While this publication could take the form of a review or sequence monograph, it is not so conceived. Instead, the literature from which the database is derived has simply been summarized and some elementary computational analyses have been performed upon the data. Interpretation and commentary have been avoided insofar as possible so that the reader can form his or her own judgments concerning the complex information. In addition to the general descriptions of the parts of the compendium, the user should read the individual introductions for each part.

  14. A Parallel Non-Alignment Based Approach to Efficient Sequence Comparison using Longest Common Subsequences

    NASA Astrophysics Data System (ADS)

    Bhowmick, S.; Shafiullah, M.; Rai, H.; Bastola, D.

    2010-11-01

    Biological sequence comparison programs have revolutionized the practice of biochemistry, and molecular and evolutionary biology. Pairwise comparison of genomic sequences is a popular method of choice for analyzing genetic sequence data. However, the quality of results from most sequence comparison methods is significantly affected by small perturbations in the data and, furthermore, there is a dearth of computational tools to compare sequences beyond a certain length. In this paper, we describe a parallel algorithm for comparing genetic sequences using an alignment-free method based on computing the Longest Common Subsequence (LCS) between genetic sequences. We validate the quality of our results by comparing the phylogenetic trees obtained from ClustalW and LCS. We also show through complexity analysis of the isoefficiency and by empirical measurement of the running time that our algorithm is very scalable.
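
    The quantity at the core of this approach, the length of the Longest Common Subsequence of two sequences, can be computed with the textbook dynamic program sketched below. This is a serial sketch only and does not reproduce the parallel algorithm the abstract describes; the normalization used to turn the LCS length into a distance is an assumption, and the function names are hypothetical.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of a and b.

    Textbook dynamic program keeping only the previous row,
    so memory use is O(len(b)) and time is O(len(a) * len(b)).
    """
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def lcs_distance(a, b):
    """Assumed normalization: 1 - LCS length / length of the longer sequence."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 0.0
    return 1.0 - lcs_length(a, b) / longest


# Toy example: the LCS of these two strings is "GTAB".
example = lcs_length("AGGTAB", "GXTXAYB")
```

    A matrix of such pairwise distances could then be passed to any standard tree-building method for comparison against an alignment-based tree.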

  15. nWayComp: A Tool for Universal Comparison of DNA and Protein Sequences

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The increasing number of whole genomic sequences of microorganisms has increased the complexity of genome-wide annotation and gene sequence comparison among multiple microorganisms. To address this problem, we developed nWayComp software that compares DNA and protein sequences of phylogenetically-r...

  16. Comparison of Whole-Genome Sequences from Two Colony Morphovars of Burkholderia pseudomallei

    PubMed Central

    Hsueh, Pei-Tan; Chen, Yao-Shen; Lin, Hsi-Hsu; Liu, Pei-Ju; Ni, Wen-Fan; Liu, Mei-Chun

    2015-01-01

    The entire genomes of two isogenic morphovars (vgh16W and vgh16R) of Burkholderia pseudomallei were sequenced. A comparison of the sequences from both strains indicates that they show 99.99% identity, are composed of 22 tandem repeated sequences with <100 bp of indels, and have 199 single-base variants. PMID:26472836

  17. Genomic Sequence Comparisons, 1987-2003 Final Report

    SciTech Connect

    George M. Church

    2004-07-29

    This project was to develop new DNA sequencing and RNA and protein quantitation methods and related genome annotation tools. The project began in 1987 with the development of multiplex sequencing (published in Science in 1988), and one of the first automated sequencing methods. This led to the first commercial genome sequence in 1994 and to the establishment of the main commercial participants (GTC then Agencourt) in the public DOE/NIH genome project. In collaboration with GTC we contributed to one of the first complete DOE genome sequences, in 1997, that of Methanobacterium thermoautotrophicum, a species of great relevance to energy-rich gas production.

  18. Natural vs. random protein sequences: Discovering combinatorics properties on amino acid words.

    PubMed

    Santoni, Daniele; Felici, Giovanni; Vergni, Davide

    2016-02-21

    Random mutations and natural selection have driven the evolution of the protein amino acid sequences that we observe at present in nature. The question of which is the dominant force in protein evolution still lacks an unambiguous answer. Random mutations tend to randomize protein sequences while, in order to have the correct functionality, one expects that selection mechanisms impose rigid constraints on amino acid sequences. Moreover, one also has to consider that the space of all possible amino acid sequences is so astonishingly large that it could be reasonable to have a well-tuned amino acid sequence indistinguishable from a random one. In order to study the possibility of discriminating between random and natural amino acid sequences, we introduce different measures of association between pairs of amino acids in a sequence, and apply them to a dataset of 1047 natural protein sequences and 10,470 random sequences, carefully generated in order to preserve the relative length and amino acid distribution of the natural proteins. We analyze the multidimensional measures with machine learning techniques and show that, to a reasonable extent, natural protein sequences can be differentiated from random ones. PMID:26656109

  19. Transcriptome Sequencing in Response to Salicylic Acid in Salvia miltiorrhiza

    PubMed Central

    Zhang, Xiaoru; Dong, Juane; Liu, Hailong; Wang, Jiao; Qi, Yuexin; Liang, Zongsuo

    2016-01-01

    Salvia miltiorrhiza is a traditional Chinese herbal medicine, whose quality and yield are often affected by diseases and environmental stresses during its growing season. Salicylic acid (SA) plays a significant role in plants responding to biotic and abiotic stresses, but the involved regulatory factors and their signaling mechanisms are largely unknown. In order to identify the genes involved in SA signaling, the RNA sequencing (RNA-seq) strategy was employed to evaluate the transcriptional profiles in S. miltiorrhiza cell cultures. A total of 50,778 unigenes were assembled, in which 5,316 unigenes were differentially expressed among 0-, 2-, and 8-h SA induction. The up-regulated genes were mainly involved in stimulus response and multi-organism process. A core set of candidate novel genes coding SA signaling component proteins was identified. Many transcription factors (e.g., WRKY, bHLH and GRAS) and genes involved in hormone signal transduction were differentially expressed in response to SA induction. Detailed analysis revealed that genes associated with defense signaling, such as antioxidant system genes, cytochrome P450s and ATP-binding cassette transporters, were significantly overexpressed, which can be used as genetic tools to investigate disease resistance. Our transcriptome analysis will help understand SA signaling and its mechanism of defense systems in S. miltiorrhiza. PMID:26808150

  20. Transcriptome Sequencing in Response to Salicylic Acid in Salvia miltiorrhiza.

    PubMed

    Zhang, Xiaoru; Dong, Juane; Liu, Hailong; Wang, Jiao; Qi, Yuexin; Liang, Zongsuo

    2016-01-01

    Salvia miltiorrhiza is a traditional Chinese herbal medicine, whose quality and yield are often affected by diseases and environmental stresses during its growing season. Salicylic acid (SA) plays a significant role in plants responding to biotic and abiotic stresses, but the involved regulatory factors and their signaling mechanisms are largely unknown. In order to identify the genes involved in SA signaling, the RNA sequencing (RNA-seq) strategy was employed to evaluate the transcriptional profiles in S. miltiorrhiza cell cultures. A total of 50,778 unigenes were assembled, in which 5,316 unigenes were differentially expressed among 0-, 2-, and 8-h SA induction. The up-regulated genes were mainly involved in stimulus response and multi-organism process. A core set of candidate novel genes coding SA signaling component proteins was identified. Many transcription factors (e.g., WRKY, bHLH and GRAS) and genes involved in hormone signal transduction were differentially expressed in response to SA induction. Detailed analysis revealed that genes associated with defense signaling, such as antioxidant system genes, cytochrome P450s and ATP-binding cassette transporters, were significantly overexpressed, which can be used as genetic tools to investigate disease resistance. Our transcriptome analysis will help understand SA signaling and its mechanism of defense systems in S. miltiorrhiza. PMID:26808150

  1. Binding site discovery from nucleic acid sequences by discriminative learning of hidden Markov models

    PubMed Central

    Maaskola, Jonas; Rajewsky, Nikolaus

    2014-01-01

    We present a discriminative learning method for pattern discovery of binding sites in nucleic acid sequences based on hidden Markov models. Sets of positive and negative example sequences are mined for sequence motifs whose occurrence frequency varies between the sets. The method offers several objective functions, but we concentrate on mutual information of condition and motif occurrence. We perform a systematic comparison of our method and numerous published motif-finding tools. Our method achieves the highest motif discovery performance, while being faster than most published methods. We present case studies of data from various technologies, including ChIP-Seq, RIP-Chip and PAR-CLIP, of embryonic stem cell transcription factors and of RNA-binding proteins, demonstrating practicality and utility of the method. For the alternative splicing factor RBM10, our analysis finds motifs known to be splicing-relevant. The motif discovery method is implemented in the free software package Discrover. It is applicable to genome- and transcriptome-scale data, makes use of available repeat experiments, and can handle more complex data configurations in addition to binary contrasts. PMID:25389269

  2. Direct Chloroplast Sequencing: Comparison of Sequencing Platforms and Analysis Tools for Whole Chloroplast Barcoding

    PubMed Central

    Brozynska, Marta; Furtado, Agnelo; Henry, Robert James

    2014-01-01

    Direct sequencing of total plant DNA using next generation sequencing technologies generates a whole chloroplast genome sequence that has the potential to provide a barcode for use in plant and food identification. Advances in DNA sequencing platforms may make this an attractive approach for routine plant identification. The HiSeq (Illumina) and Ion Torrent (Life Technology) sequencing platforms were used to sequence total DNA from rice to identify polymorphisms in the whole chloroplast genome sequence of a wild rice plant relative to cultivated rice (cv. Nipponbare). Consensus chloroplast sequences were produced by mapping sequence reads to the reference rice chloroplast genome or by de novo assembly and mapping of the resulting contigs to the reference sequence. A total of 122 polymorphisms (SNPs and indels) between the wild and cultivated rice chloroplasts were predicted by these different sequencing and analysis methods. Of these, a total of 102 polymorphisms including 90 SNPs were predicted by both platforms. Indels were more variable with different sequencing methods, with almost all discrepancies found in homopolymers. The Ion Torrent platform gave no apparent false SNP but was less reliable for indels. The methods should be suitable for routine barcoding using appropriate combinations of sequencing platform and data analysis. PMID:25329378

  3. Evolutionary sequence comparisons using high-density oligonucleotide arrays.

    PubMed

    Hacia, J G; Makalowski, W; Edgemon, K; Erdos, M R; Robbins, C M; Fodor, S P; Brody, L C; Collins, F S

    1998-02-01

    We explored the utility of high-density oligonucleotide arrays (DNA chips) for obtaining sequence information from homologous genes in closely related species. Orthologues of the human BRCA1 exon 11, all approximately 3.4 kb in length and ranging from 98.2% to 83.5% nucleotide identity, were subjected to hybridization-based and conventional dideoxysequencing analysis. Retrospective guidelines for identifying high-fidelity hybridization-based sequence calls were formulated based upon dideoxysequencing results. Prospective application of these rules yielded base-calling with at least 98.8% accuracy over orthologous sequence tracts shown to have approximately 99% identity. For higher primate sequences with greater than 97% nucleotide identity, base-calling was made with at least 99.91% accuracy covering a minimum of 97% of the sequence. Using a second-tier confirmatory hybridization chip strategy, shown in several cases to confirm the identity of predicted sequence changes, the complete sequence of the chimpanzee, gorilla and orangutan orthologues should be deducible solely through hybridization-based methodologies. Analysis of less highly conserved orthologues can still identify conserved nucleotide tracts of at least 15 nucleotides and can provide useful information for designing primers. DNA-chip based assays can be a valuable new technology for obtaining high-throughput cost-effective sequence information from related genomes. PMID:9462745

  4. The complete amino acid sequence of the A-chain of human plasma alpha 2HS-glycoprotein.

    PubMed

    Yoshioka, Y; Gejyo, F; Marti, T; Rickli, E E; Bürgi, W; Offner, G D; Troxler, R F; Schmid, K

    1986-02-01

    Normal human plasma alpha 2HS-glycoprotein has earlier been shown to be comprised of two polypeptide chains. Recently, the amino acid and carbohydrate sequences of the short chain were elucidated (Gejyo, F., Chang, J.-L., Bürgi, W., Schmid, K., Offner, G. D., Troxler, R.F., van Halbeck, H., Dorland, L., Gerwig, G. J., and Vliegenthart, J.F.G. (1983) J. Biol. Chem. 258, 4966-4971). In the present study, the amino acid sequence of the long chain of this protein, designated A-chain, was determined and found to consist of 282 amino acid residues. Twenty-four amino acid doublets were found; the most abundant of these are Pro-Pro and Ala-Ala which each occur five times. Of particular interest is the presence of three Gly-X-Pro and one Gly-Pro-X sequences that are characteristic of the repeating sequences of collagens. Chou-Fasman evaluation of the secondary structure suggested that the A-chain contains 29% alpha-helix, 24% beta-pleated sheet, and 26% reverse turns and, thus, approximately 80% of the polypeptide chain may display ordered structure. Four glycosylation sites were identified. The two N-glycosidic oligosaccharides were found in the center region (residues 138 and 158), whereas the two O-glycosidic heterosaccharides, both linked to threonine (residues 238 and 252), occur within the carboxyl-terminal region. The N-glycans are linked to Asn residues in beta-turns, while the O-glycans are located in short random segments. Comparison of the sequence of the amino- and carboxyl-terminal 30 residues with protein sequences in a data bank demonstrated that the A-chain is not significantly related to any known proteins. However, the proline-rich carboxyl-terminal region of the A-chain displays some sequence similarity to collagens and the collagen-like domains of complement subcomponent C1q. PMID:3944104

  5. Detection and isolation of nucleic acid sequences using a bifunctional hybridization probe

    DOEpatents

    Lucas, Joe N.; Straume, Tore; Bogen, Kenneth T.

    2000-01-01

    A method for detecting and isolating a target sequence in a sample of nucleic acids is provided using a bifunctional hybridization probe capable of hybridizing to the target sequence that includes a detectable marker and a first complexing agent capable of forming a binding pair with a second complexing agent. A kit is also provided for detecting a target sequence in a sample of nucleic acids using a bifunctional hybridization probe according to this method.

  6. Amino acid sequence of rabbit kidney neutral endopeptidase 24.11 (enkephalinase) deduced from a complementary DNA.

    PubMed Central

    Devault, A; Lazure, C; Nault, C; Le Moual, H; Seidah, N G; Chrétien, M; Kahn, P; Powell, J; Mallet, J; Beaumont, A

    1987-01-01

    Neutral endopeptidase (EC 3.4.24.11) is a major constituent of kidney brush border membranes. It is also present in the brain where it has been shown to be involved in the inactivation of opioid peptides, methionine- and leucine-enkephalins. For this reason this enzyme is often called 'enkephalinase'. In order to characterize the primary structure of the enzyme, oligonucleotide probes were designed from partial amino acid sequences and used to isolate clones from kidney cDNA libraries. Sequencing of the cDNA inserts revealed the complete primary structure of the enzyme. Neutral endopeptidase consists of 750 amino acids. It contains a short N-terminal cytoplasmic domain (27 amino acids), a single membrane-spanning segment (23 amino acids) and an extracellular domain that comprises most of the protein mass. The comparison of the primary structure of neutral endopeptidase with that of thermolysin, a bacterial Zn-metallopeptidase, indicates that most of the amino acid residues involved in Zn coordination and catalytic activity in thermolysin are found within highly homologous sequences in neutral endopeptidase. PMID:2440677

  7. Multiple Comparison Analysis of Two New Genomic Sequences of ILTV Strains from China with Other Strains from Different Geographic Regions

    PubMed Central

    Zhao, Yan; Kong, Congcong; Wang, Yunfeng

    2015-01-01

    To date, twenty complete genome sequences of ILTV strains have been published in GenBank, including one strain from China, and nineteen strains from Australia and the United States. To investigate the genomic information on ILTVs from different geographic regions, two additional individual complete genome sequences of WG and K317 strains from China were determined. The genomes of WG and K317 strains were 153,505 and 153,639 bp in length, respectively. Alignments performed on the amino acid sequences of the twelve glycoproteins showed that 13 out of 116 mutational sites were present only among the Chinese strain WG and the Australian strains SA2 and A20. The phylogenetic tree analysis suggested that the WG strain is closely related to the Australian strain SA2. Recombination events were detected and confirmed in different subregions of the WG strain, with the sequences of the SA2 and K317 strains as parental. In this study, two new complete genome sequences of Chinese ILTV strains were used in comparative analysis with other complete genome sequences of ILTV strains from China, the United States, and Australia. The analysis of genome comparison, phylogenetic trees, and recombination events showed a close relationship between the Chinese strain WG and the Australian strain SA2. The information from the two new complete genome sequences from China will help to facilitate the analysis of phylogenetic relationships and the molecular differences among ILTV strains from different geographic regions. PMID:26186451

  8. Multiple Comparison Analysis of Two New Genomic Sequences of ILTV Strains from China with Other Strains from Different Geographic Regions.

    PubMed

    Zhao, Yan; Kong, Congcong; Wang, Yunfeng

    2015-01-01

    To date, twenty complete genome sequences of ILTV strains have been published in GenBank, including one strain from China, and nineteen strains from Australia and the United States. To investigate the genomic information on ILTVs from different geographic regions, two additional individual complete genome sequences of WG and K317 strains from China were determined. The genomes of WG and K317 strains were 153,505 and 153,639 bp in length, respectively. Alignments performed on the amino acid sequences of the twelve glycoproteins showed that 13 out of 116 mutational sites were present only among the Chinese strain WG and the Australian strains SA2 and A20. The phylogenetic tree analysis suggested that the WG strain is closely related to the Australian strain SA2. Recombination events were detected and confirmed in different subregions of the WG strain, with the sequences of the SA2 and K317 strains as parental. In this study, two new complete genome sequences of Chinese ILTV strains were used in comparative analysis with other complete genome sequences of ILTV strains from China, the United States, and Australia. The analysis of genome comparison, phylogenetic trees, and recombination events showed a close relationship between the Chinese strain WG and the Australian strain SA2. The information from the two new complete genome sequences from China will help to facilitate the analysis of phylogenetic relationships and the molecular differences among ILTV strains from different geographic regions. PMID:26186451

  9. Quantitative Comparison of Large-Scale DNA Enrichment Sequencing Data.

    PubMed

    Lienhard, Matthias; Chavez, Lukas

    2016-01-01

    DNA enrichment followed by sequencing (DNA-IP seq) is a versatile tool in molecular biology with a wide variety of applications. Computational analysis of differential DNA enrichment between conditions is important for identifying epigenetic alterations in disease compared to healthy controls and for revealing dynamic epigenetic modifications throughout normal and distorted cell differentiation and development. We present a protocol for genome-wide comparative analysis of DNA-IP sequencing data to identify statistically significant differential sequencing coverage between two conditions by considering variation across replicates. The protocol provides a detailed description for the comparative analysis of DNA-IP sequencing data including basic data processing, quality controls, and identification of differential enrichment using the Bioconductor package "MEDIPS". PMID:27008016

  10. Comparison of the rotavirus nonstructural protein NSP1 (NS53) from different species by sequence analysis and northern blot hybridization.

    PubMed

    Dunn, S J; Cross, T L; Greenberg, H B

    1994-08-15

    The nucleotide sequence of gene 5 encoding the rotavirus nonstructural protein NSP1 (NS53) of 6 strains (EW, EHP, RRV, I321, OSU, and Gottfried) was determined and compared to 6 previously reported strains (SA11, UK, RF, Hu803, DS-1, and Wa). The 12 rotavirus strains were derived from a total of five separate species (murine, bovine, simian, porcine, and human). Gene sizes ranged from 1564 to 1611 nucleotides in length and the deduced protein sequences were found to be 486 to 495 amino acids in length. Comparisons of NSP1 amino acid sequences showed identities ranging from 36 to 92%. This diversity was most evident between strains from different species. Phylogenetic analysis revealed a clustering of NSP1 sequences according to species origin with the exception that the human and porcine strains were included in a single grouping. Northern blot hybridizations using additional rotavirus strains from the five species confirmed the grouping found by sequence analysis. The species specificity of NSP1 is consistent with the hypothesis that NSP1 plays a role in host range restriction. PMID:8030275
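
The identity range quoted above (36 to 92%) is the kind of figure a pairwise comparison of aligned sequences produces. The abstract does not say which denominator the authors used, so the following is only a minimal sketch under one common convention: count matches over alignment columns where neither sequence has a gap. The function name and the toy sequences are hypothetical.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent identity over aligned columns where neither sequence has a gap.

    Assumes `a` and `b` are rows of the same alignment (equal length, '-' for gaps).
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(a, b):
        if x == "-" or y == "-":
            continue  # skip gapped columns (one convention among several)
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy aligned fragments, not real NSP1 sequences.
print(percent_identity("MKT-LV", "MRTALV"))
```

Other conventions divide by the full alignment length or by the shorter sequence length, which gives lower values for gapped alignments.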

  11. Close sequence comparisons are sufficient to identify human cis-regulatory elements.

    PubMed

    Prabhakar, Shyam; Poulin, Francis; Shoukry, Malak; Afzal, Veena; Rubin, Edward M; Couronne, Olivier; Pennacchio, Len A

    2006-07-01

    Cross-species DNA sequence comparison is the primary method used to identify functional noncoding elements in human and other large genomes. However, little is known about the relative merits of evolutionarily close and distant sequence comparisons. To address this problem, we identified evolutionarily conserved noncoding regions in primate, mammalian, and more distant comparisons using a uniform approach (Gumby) that facilitates unbiased assessment of the impact of evolutionary distance on predictive power. We benchmarked computational predictions against previously identified cis-regulatory elements at diverse genomic loci and also tested numerous extremely conserved human-rodent sequences for transcriptional enhancer activity using an in vivo enhancer assay in transgenic mice. Human regulatory elements were identified with acceptable sensitivity (53%-80%) and true-positive rate (27%-67%) by comparison with one to five other eutherian mammals or six other simian primates. More distant comparisons (marsupial, avian, amphibian, and fish) failed to identify many of the empirically defined functional noncoding elements. Our results highlight the practical utility of close sequence comparisons, and the loss of sensitivity entailed by more distant comparisons. We derived an intuitive relationship between ancient and recent noncoding sequence conservation from whole-genome comparative analysis that explains most of the observations from empirical benchmarking. Lastly, we determined that, in addition to strength of conservation, genomic location and/or density of surrounding conserved elements must also be considered in selecting candidate enhancers for in vivo testing at embryonic time points. PMID:16769978

  12. Amino acid sequence of Japanese quail (Coturnix japonica) and northern bobwhite (Colinus virginianus) myoglobin.

    PubMed

    Goodson, John; Beckstead, Robert B; Payne, Jason; Singh, Rakesh K; Mohan, Anand

    2015-08-15

    Myoglobin has an important physiological role in vertebrates, and as the primary sarcoplasmic pigment in meat, influences quality perception and consumer acceptability. In this study, the amino acid sequences of Japanese quail and northern bobwhite myoglobin were deduced by cDNA cloning of the coding sequence from mRNA. Japanese quail myoglobin was isolated from quail cardiac muscles, purified using ammonium sulphate precipitation and gel-filtration, and subjected to multiple enzymatic digestions. Mass spectrometry corroborated the deduced protein amino acid sequence at the protein level. Sequence analysis revealed both species' myoglobin structures consist of 153 amino acids, differing at only three positions. When compared with chicken myoglobin, Japanese quail showed 98% sequence identity, and northern bobwhite 97% sequence identity. The myoglobin in both quail species contained eight histidine residues instead of the nine present in chicken and turkey. PMID:25794748

  13. Comparison of simple sequence repeats in 19 Archaea.

    PubMed

    Trivedi, S

    2006-01-01

    All organisms that have been studied until now have been found to have differential distribution of simple sequence repeats (SSRs), with more SSRs in intergenic than in coding sequences. SSR distribution was investigated in Archaea genomes where complete chromosome sequences of 19 Archaea were analyzed with the program SPUTNIK to find di- to penta-nucleotide repeats. The number of repeats was determined for the complete chromosome sequences and for the coding and non-coding sequences. Different from what has been found for other groups of organisms, there is an abundance of SSRs in coding regions of the genome of some Archaea. Dinucleotide repeats were rare and CG repeats were found in only two Archaea. In general, trinucleotide repeats are the most abundant SSR motifs; however, pentanucleotide repeats are abundant in some Archaea. Some of the tetranucleotide and pentanucleotide repeat motifs are organism specific. In general, repeats are short and CG-rich repeats are present in Archaea having a CG-rich genome. Among the 19 Archaea, SSR density was not correlated with genome size or with optimum growth temperature. Pentanucleotide density had an inverse correlation with the CG content of the genome. PMID:17183484
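
To make the di- to penta-nucleotide scan described above concrete, here is a minimal sketch that finds perfect tandem repeats with a regular expression. It is not the SPUTNIK algorithm used in the study; the function name, the `min_repeats` threshold, and the test sequence are assumptions chosen for illustration.

```python
import re


def find_ssrs(seq, min_unit=2, max_unit=5, min_repeats=3):
    """Return sorted (start, unit, copies) tuples for perfect tandem repeats.

    Only exact repeats are reported; mismatches and indels are not tolerated.
    """
    seq = seq.upper()
    hits = []
    for k in range(min_unit, max_unit + 1):
        # A k-mer followed by at least (min_repeats - 1) further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_repeats - 1))
        for m in pattern.finditer(seq):
            unit = m.group(1)
            # Skip units that are themselves repeats of a shorter motif (e.g. ATAT).
            if any(unit == unit[:d] * (k // d) for d in range(1, k) if k % d == 0):
                continue
            hits.append((m.start(), unit, len(m.group(0)) // k))
    return sorted(hits)


# Hypothetical test sequence with one CG repeat and one GCT repeat.
print(find_ssrs("TTCGCGCGCGAAGCTGCTGCTGCTTA"))
```

Counting hits separately for coding and non-coding coordinates would then give the kind of per-region comparison the abstract reports.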

  14. Use of gene sequence analyses and genome comparisons for yeast systematics

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Detection, identification, and classification of yeasts has undergone a major transformation in the past decade and a half following application of gene sequence analyses and genome comparisons. Development of a database (barcode) of easily determined gene sequences from domains 1 and 2 of large sub...

  15. Identification of random nucleic acid sequence aberrations using dual capture probes which hybridize to different chromosome regions

    DOEpatents

    Lucas, Joe N.; Straume, Tore; Bogen, Kenneth T.

    1998-01-01

    A method is provided for detecting nucleic acid sequence aberrations using two immobilization steps. According to the method, a nucleic acid sequence aberration is detected by detecting nucleic acid sequences having both a first nucleic acid sequence type (e.g., from a first chromosome) and a second nucleic acid sequence type (e.g., from a second chromosome), the presence of the first and the second nucleic acid sequence type on the same nucleic acid sequence indicating the presence of a nucleic acid sequence aberration. In the method, immobilization of a first hybridization probe is used to isolate a first set of nucleic acids in the sample which contain the first nucleic acid sequence type. Immobilization of a second hybridization probe is then used to isolate a second set of nucleic acids from within the first set of nucleic acids which contain the second nucleic acid sequence type. The second set of nucleic acids are then detected, their presence indicating the presence of a nucleic acid sequence aberration.

  16. Identification of random nucleic acid sequence aberrations using dual capture probes which hybridize to different chromosome regions

    DOEpatents

    Lucas, J.N.; Straume, T.; Bogen, K.T.

    1998-03-24

    A method is provided for detecting nucleic acid sequence aberrations using two immobilization steps. According to the method, a nucleic acid sequence aberration is detected by detecting nucleic acid sequences having both a first nucleic acid sequence type (e.g., from a first chromosome) and a second nucleic acid sequence type (e.g., from a second chromosome), the presence of the first and the second nucleic acid sequence type on the same nucleic acid sequence indicating the presence of a nucleic acid sequence aberration. In the method, immobilization of a first hybridization probe is used to isolate a first set of nucleic acids in the sample which contain the first nucleic acid sequence type. Immobilization of a second hybridization probe is then used to isolate a second set of nucleic acids from within the first set of nucleic acids which contain the second nucleic acid sequence type. The second set of nucleic acids are then detected, their presence indicating the presence of a nucleic acid sequence aberration. 14 figs.

  17. Phylogenetic relationships of Cryptosporidium determined by ribosomal RNA sequence comparison.

    PubMed

    Johnson, A M; Fielke, R; Lumb, R; Baverstock, P R

    1990-04-01

    Reverse transcription of total cellular RNA was used to obtain a partial sequence of the small subunit ribosomal RNA of Cryptosporidium, a protist currently placed in the phylum Apicomplexa. The semi-conserved regions were aligned with homologous sequences in a range of other eukaryotes, and the evolutionary relationships of Cryptosporidium were determined by two different methods of phylogenetic analysis. The prokaryotes Escherichia coli and Halobacterium cuti were included as outgroups. The results do not show an especially close relationship of Cryptosporidium to other members of the phylum Apicomplexa. PMID:2332273

  18. The complete genome sequences of poxviruses isolated from a penguin and a pigeon in South Africa and comparison to other sequenced avipoxviruses

    PubMed Central

    2014-01-01

    Background Two novel avipoxviruses from South Africa have been sequenced, one from a Feral Pigeon (Columba livia) (FeP2) and the other from an African penguin (Spheniscus demersus) (PEPV). We present a purpose-designed bioinformatics pipeline for analysis of next generation sequence data of avian poxviruses and compare the different avipoxviruses sequenced to date with specific emphasis on their evolution and gene content. Results The FeP2 (282 kbp) and PEPV (306 kbp) genomes encode 271 and 284 open reading frames respectively and are more closely related to one another (94.4%) than to either fowlpox virus (FWPV) (85.3% and 84.0% respectively) or Canarypox virus (CNPV) (62.0% and 63.4% respectively). Overall, FeP2, PEPV and FWPV have syntenic gene arrangements; however, major differences exist throughout their genomes. The most striking difference between FeP2 and the FWPV-like avipoxviruses is a large deletion of ~16 kbp from the central region of the genome of FeP2 deleting a cc-chemokine-like gene, two Variola virus B22R orthologues, an N1R/p28-like gene and a V-type Ig domain family gene. FeP2 and PEPV both encode orthologues of vaccinia virus C7L and Interleukin 10. PEPV contains a 77 amino acid long orthologue of Ubiquitin sharing 97% amino acid identity to human ubiquitin. Conclusions The genome sequences of FeP2 and PEPV have greatly added to the limited repository of genomic information available for the Avipoxvirus genus. In the comparison of FeP2 and PEPV to existing sequences, FWPV and CNPV, we have established insights into African avipoxvirus evolution. Our data supports the independent evolution of these South African avipoxviruses from a common ancestral virus to FWPV and CNPV. PMID:24919868

  19. tax and rex Sequences of bovine leukaemia virus from globally diverse isolates: rex amino acid sequence more variable than tax.

    PubMed

    McGirr, K M; Buehring, G C

    2005-02-01

    Bovine leukaemia virus (BLV) is an important agricultural problem with high costs to the dairy industry. Here, we examine the variation of the tax and rex genes of BLV. The tax and rex genes share 420 bases and have overlapping reading frames. The tax gene encodes a protein that functions as a transactivator of the BLV promoter, is required for viral replication, acts on cellular promoters, and is responsible for oncogenesis. The rex gene product facilitates the export of viral mRNAs from the nucleus and regulates transcription. We have sequenced five new isolates of the tax/rex gene. We examined the five new and three previously published tax/rex DNA and predicted amino acid sequences of BLV isolates from cattle in representative regions worldwide. The highest variation among nucleic acid sequences for tax and rex was 7% and 5%, respectively; among predicted amino acid sequences for Tax and Rex, 9% and 11%, respectively. Significantly more nucleotide changes resulted in predicted amino acid changes in the rex gene than in the tax gene (P ≤ 0.0006). This variability is higher than previously reported for any region of the viral genome. This research may also have implications for the development of Tax-based vaccines. PMID:15702995

  20. The amino acid sequence of protein CM-3 from Dendroaspis polylepis polylepis (black mamba) venom.

    PubMed

    Joubert, F J

    1985-01-01

    Protein CM-3 from Dendroaspis polylepis polylepis venom was purified by gel filtration and ion exchange chromatography. It comprises 65 amino acids including eight half-cystines. The complete amino acid sequence of protein CM-3 has been elucidated. The sequence (residues 1-50) resembles that of the N-terminal sequence of the subunits of a synergistic type protein and residues 51-65 that of the C-terminal sequence of an angusticeps type protein. Mixtures of protein CM-3 and angusticeps type proteins showed no apparent synergistic effect, in that their toxicity in combination was no greater than the sum of their individual toxicities. PMID:4029488

  1. A COMPARISON OF FIXED SEQUENCE AND OPTIONAL BRANCHING AUTOINSTRUCTIONAL METHODS.

    ERIC Educational Resources Information Center

    MELARAGNO, RALPH J.; AND OTHERS

    HYPOTHESES RELATED TO PROCEDURES PERMITTING STUDENTS TO BRANCH AT THEIR OWN OPTION WERE TESTED. THE FIRST HYPOTHESIS WAS THAT A FIXED-SEQUENCE PROGRAM WOULD BE LESS EFFECTIVE THAN THE SAME ITEMS CAST AS STATEMENTS IN TEXTBOOK FORMAT THROUGH WHICH THE STUDENT COULD SKIP AT HIS OWN OPTION. THE SECOND HYPOTHESIS WAS THAT PERFORMANCE ON A PROGRAM…

  2. Protein sequence comparisons show that the 'pseudoproteases' encoded by poxviruses and certain retroviruses belong to the deoxyuridine triphosphatase family.

    PubMed Central

    McGeoch, D J

    1990-01-01

    Amino acid sequence comparisons show extensive similarities among the deoxyuridine triphosphatases (dUTPases) of Escherichia coli and of herpesviruses, and the 'protease-like' or 'pseudoprotease' sequences encoded by certain retroviruses in the oncovirus and lentivirus families and by poxviruses. These relationships suggest strongly that the 'pseudoproteases' actually are dUTPases, and have not arisen by duplication of an oncovirus protease gene as had been suggested. The herpesvirus dUTPase sequences differ from the others in that they are longer (about 370 residues, against around 140) and one conserved element ('Motif 3') is displaced relative to its position in the other sequences; a model involving internal duplication of the herpesvirus gene can account effectively for these observations. Sequences closely similar to Motif 3 are also found in phosphofructokinases, where they form part of the active site and fructose phosphate binding structure; thus these sequences may represent a class of structural element generally involved in phosphate transfer to and from glycosides. PMID:2165588

  3. The Chinese hamster Alu-equivalent sequence: a conserved highly repetitious, interspersed deoxyribonucleic acid sequence in mammals has a structure suggestive of a transposable element.

    PubMed Central

    Haynes, S R; Toomey, T P; Leinwand, L; Jelinek, W R

    1981-01-01

    A consensus sequence has been determined for a major interspersed deoxyribonucleic acid repeat in the genome of Chinese hamster ovary cells (CHO cells). This sequence is extensively homologous to (i) the human Alu sequence (P. L. Deininger et al., J. Mol. Biol., in press), (ii) the mouse B1 interspersed repetitious sequence (Krayev et al., Nucleic Acids Res. 8:1201-1215, 1980), (iii) an interspersed repetitious sequence from African green monkey deoxyribonucleic acid (Dhruva et al., Proc. Natl. Acad. Sci. U.S.A. 77:4514-4518, 1980) and (iv) the CHO and mouse 4.5S ribonucleic acid (this report; F. Harada and N. Kato, Nucleic Acids Res. 8:1273-1285, 1980). Because the CHO consensus sequence shows significant homology to the human Alu sequence it is termed the CHO Alu-equivalent sequence. A conserved structure surrounding CHO Alu-equivalent family members can be recognized. It is similar to that surrounding the human Alu and the mouse B1 sequences, and is represented as follows: direct repeat-CHO-Alu-A-rich sequence-direct repeat. A composite interspersed repetitious sequence has been identified. Its structure is represented as follows: direct repeat-residue 47 to 107 of CHO-Alu-non-Alu repetitious sequence-A-rich sequence-direct repeat. Because the Alu flanking sequences resemble those that flank known transposable elements, we think it likely that the Alu sequence dispersed throughout the mammalian genome by transposition. PMID:9279371

  4. Structuring temporal sequences: comparison of models and factors of complexity.

    PubMed

    Essens, P

    1995-05-01

    Two stages for structuring tone sequences have been distinguished by Povel and Essens (1985). In the first, a mental clock segments a sequence into equal time units (clock model); in the second, intervals are specified in terms of subdivisions of these units. The present findings support the clock model in that it predicts human performance better than three other algorithmic models. Two further experiments in which clock and subdivision characteristics were varied did not support the hypothesized effect of the nature of the subdivisions on complexity. A model focusing on the variations in the beat-anchored envelopes of the tone clusters was proposed. Errors in reproduction suggest a dual-code representation comprising temporal and figural characteristics. The temporal part of the representation is based on the clock model but specifies, in addition, the metric of the level below the clock. The beat-tone-cluster envelope concept was proposed to specify the figural part. PMID:7596749

  5. Computer Simulation of the Determination of Amino Acid Sequences in Polypeptides

    ERIC Educational Resources Information Center

    Daubert, Stephen D.; Sontum, Stephen F.

    1977-01-01

    Describes a computer program that generates a random string of amino acids and guides the student in determining the correct sequence of a given protein by using experimental analytic data for that protein. (MLH)

  6. 3D reconstruction software comparison for short sequences

    NASA Astrophysics Data System (ADS)

    Strupczewski, Adam; Czupryński, Błażej

    2014-11-01

    Large-scale multiview reconstruction has recently become a very popular area of research. There are many open source tools that can be downloaded and run on a personal computer. However, there are few, if any, comparisons between all the available software in terms of accuracy on small datasets that a single user can create. The typical datasets for testing of the software are archeological sites or cities, comprising thousands of images. This paper presents a comparison of currently available open source multiview reconstruction software for small datasets. It also compares the open source solutions with a simple structure from motion pipeline developed by the authors from scratch with the use of OpenCV and Eigen libraries.

  7. Accuracy of sequence alignment and fold assessment using reduced amino acid alphabets.

    PubMed

    Melo, Francisco; Marti-Renom, Marc A

    2006-06-01

    Reduced or simplified amino acid alphabets group the 20 naturally occurring amino acids into a smaller number of representative protein residues. To date, several reduced amino acid alphabets have been proposed, which have been derived and optimized by a variety of methods. The resulting reduced amino acid alphabets have been applied to pattern recognition, generation of consensus sequences from multiple alignments, protein folding, and protein structure prediction. In this work, amino acid substitution matrices and statistical potentials were derived based on several reduced amino acid alphabets and their performance assessed in a large benchmark for the tasks of sequence alignment and fold assessment of protein structure models, using as a reference frame the standard alphabet of 20 amino acids. The results showed that a large reduction in the total number of residue types does not necessarily translate into a significant loss of discriminative power for sequence alignment and fold assessment. Therefore, some definitions of a few residue types are able to encode most of the relevant sequence/structure information that is present in the 20 standard amino acids. Based on these results, we suggest that the use of reduced amino acid alphabets may make it possible to increase the accuracy of current substitution matrices and statistical potentials for the prediction of protein structure of remote homologs. PMID:16506243
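
As an illustration of what grouping the 20 amino acids into fewer residue types means in practice, the sketch below recodes a protein sequence with a six-letter alphabet. The grouping is a hypothetical physicochemical partition chosen for this example; it is not one of the alphabets evaluated in the paper, and the names `GROUPS` and `reduce_sequence` are assumptions.

```python
# Hypothetical six-group alphabet, for illustration only (not taken from the paper).
GROUPS = {
    "A": "AVLIMC",  # aliphatic / hydrophobic
    "F": "FWYH",    # aromatic
    "S": "STNQ",    # polar, uncharged
    "K": "KR",      # positively charged
    "D": "DE",      # negatively charged
    "G": "GP",      # glycine and proline
}

# Invert the grouping: each standard residue maps to its group representative.
REDUCE = {aa: rep for rep, members in GROUPS.items() for aa in members}


def reduce_sequence(seq: str) -> str:
    """Recode a protein sequence; unknown symbols become 'X'."""
    return "".join(REDUCE.get(aa, "X") for aa in seq.upper())


print(reduce_sequence("MKTLLDE"))
```

A substitution matrix over such an alphabet has 6 x 6 entries instead of 20 x 20, which is the reduction in residue types the abstract refers to.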

  8. Characterization of mouse cellular deoxyribonucleic acid homologous to Abelson murine leukemia virus-specific sequences.

    PubMed Central

    Dale, B; Ozanne, B

    1981-01-01

    The genome of Abelson murine leukemia virus (A-MuLV) consists of sequences derived from both BALB/c mouse deoxyribonucleic acid and the genome of Moloney murine leukemia virus. Using deoxyribonucleic acid linear intermediates as a source of retroviral deoxyribonucleic acid, we isolated a recombinant plasmid which contained 1.9 kilobases of the 3.5-kilobase mouse-derived sequences found in A-MuLV (A-MuLV-specific sequences). We used this clone, designated pSA-17, as a probe in restriction enzyme and Southern blot analyses to examine the arrangement of homologous sequences in BALB/c deoxyribonucleic acid (endogenous Abelson sequences). The endogenous Abelson sequences within the mouse genome were interrupted by noncoding regions, suggesting that a rearrangement of the cell sequences was required to produce the sequence found in the virus. Endogenous Abelson sequences were arranged similarly in mice that were susceptible to A-MuLV tumors and in mice that were resistant to A-MuLV tumors. An examination of three BALB/c plasmacytomas and a BALB/c early B-cell tumor likewise revealed no alteration in the arrangement of the endogenous Abelson sequences. Homology to pSA-17 was also observed in deoxyribonucleic acids prepared from rat, hamster, chicken, and human cells. An isolate of A-MuLV which encoded a 160,000-dalton transforming protein (P160) contained 700 more base pairs of mouse sequences than the standard A-MuLV isolate, which encoded a 120,000-dalton transforming protein (P120). PMID:9279386

  9. The amino acid sequence of monal pheasant lysozyme and its activity.

    PubMed

    Araki, T; Matsumoto, T; Torikata, T

    1998-10-01

    The amino acid sequence of monal pheasant lysozyme and its activity were analyzed. Carboxymethylated lysozyme was digested with trypsin and the resulting peptides were sequenced. The established amino acid sequence had one amino acid substitution at position 102 (Arg to Gly) comparing with Indian peafowl lysozyme and four amino acid substitutions at positions 3 (Phe to Tyr), 15 (His to Leu), 41 (Gln to His), and 121 (Gln to His) with chicken lysozyme. Analysis of the time-courses of reaction using N-acetylglucosamine pentamer as a substrate showed a difference of binding free energy change (-0.4 kcal/mol) at subsites A between monal pheasant and Indian peafowl lysozyme. This was assumed to be caused by the amino acid substitution at subsite A with loss of a positive charge at position 102 (Arg102 to Gly). PMID:9836434

  10. Studies on monotreme proteins. VII. Amino acid sequence of myoglobin from the platypus, Ornithorhynchus anatinus.

    PubMed

    Fisher, W K; Thompson, E O

    1976-03-01

    Myoglobin isolated from skeletal muscle of the platypus contains 153 amino acid residues. The complete amino acid sequence has been determined following cleavage with cyanogen bromide and further digestion of the four fragments with trypsin, chymotrypsin, pepsin and thermolysin. Sequences of the purified peptides were determined by the dansyl-Edman procedure. The amino acid sequence showed 25 differences from human myoglobin and 24 from kangaroo myoglobin. Amino acid sequences in myoglobins are more conserved than sequences in the alpha- and beta-globin chains, and platypus myoglobin shows a similar number of variations in sequence to kangaroo myoglobin when compared with myoglobin of other species. The date of divergence of the platypus from other mammals was estimated at 102 ± 31 million years, based on the number of amino acid differences between species and allowing for mutations during the evolutionary period. This estimate differs widely from the estimate given by similar treatment of the alpha- and beta-chain sequences and a constant rate of mutation of globin chains is not supported. PMID:962722

  11. cDNA-derived amino acid sequences of myoglobins from nine species of whales and dolphins.

    PubMed

    Iwanami, Kentaro; Mita, Hajime; Yamamoto, Yasuhiko; Fujise, Yoshihiro; Yamada, Tadasu; Suzuki, Tomohiko

    2006-10-01

    We determined the myoglobin (Mb) cDNA sequences of nine cetaceans, of which six are the first reports of Mb sequences: sei whale (Balaenoptera borealis), Bryde's whale (Balaenoptera edeni), pygmy sperm whale (Kogia breviceps), Stejneger's beaked whale (Mesoplodon stejnegeri), Longman's beaked whale (Indopacetus pacificus), and melon-headed whale (Peponocephala electra), and three confirm the previously determined chemical amino acid sequences: sperm whale (Physeter macrocephalus), common minke whale (Balaenoptera acutorostrata) and pantropical spotted dolphin (Stenella attenuata). We found two types of Mb in the skeletal muscle of pantropical spotted dolphin: Mb I with the same amino acid sequence as that deposited in the protein database, and Mb II, which differs at two amino acid residues compared with Mb I. Using an alignment of the amino acid or cDNA sequences of cetacean Mb, we constructed a phylogenetic tree by the NJ method. Clustering of cetacean Mb amino acid and cDNA sequences essentially follows the classical taxonomy of cetaceans, suggesting that Mb sequence data is valid for classification of cetaceans at least to the family level. PMID:16962803

  12. Close Sequence Comparisons are Sufficient to Identify Human cis-Regulatory Elements

    SciTech Connect

    Prabhakar, Shyam; Poulin, Francis; Shoukry, Malak; Afzal, Veena; Rubin, Edward M.; Couronne, Olivier; Pennacchio, Len A.

    2005-12-01

    Cross-species DNA sequence comparison is the primary method used to identify functional noncoding elements in human and other large genomes. However, little is known about the relative merits of evolutionarily close and distant sequence comparisons, due to the lack of a universal metric for sequence conservation, and also the paucity of empirically defined benchmark sets of cis-regulatory elements. To address this problem, we developed a general-purpose algorithm (Gumby) that detects slowly-evolving regions in primate, mammalian and more distant comparisons without requiring adjustment of parameters, and ranks conserved elements by P-value using Karlin-Altschul statistics. We benchmarked Gumby predictions against previously identified cis-regulatory elements at diverse genomic loci, and also tested numerous extremely conserved human-rodent sequences for transcriptional enhancer activity using reporter-gene assays in transgenic mice. Human regulatory elements were identified with acceptable sensitivity and specificity by comparison with 1-5 other eutherian mammals or 6 other simian primates. More distant comparisons (marsupial, avian, amphibian and fish) failed to identify many of the empirically defined functional noncoding elements. We derived an intuitive relationship between ancient and recent noncoding sequence conservation from whole genome comparative analysis, which explains some of these findings. Lastly, we determined that, in addition to strength of conservation, genomic location and/or density of surrounding conserved elements must also be considered in selecting candidate enhancers for testing at embryonic time points.

  13. Reconstruction of an ancestral Yersinia pestis genome and comparison with an ancient sequence

    PubMed Central

    2015-01-01

    Background We propose the computational reconstruction of a whole bacterial ancestral genome at the nucleotide scale, and its validation by a sequence of ancient DNA. This rare possibility is offered by an ancient sequence of the late middle ages plague agent. It has been hypothesized to be ancestral to extant Yersinia pestis strains based on the pattern of nucleotide substitutions. But the dynamics of indels, duplications, insertion sequences and rearrangements has impacted all genomes much more than the substitution process, which makes the ancestral reconstruction task challenging. Results We use a set of gene families from 13 Yersinia species, construct reconciled phylogenies for all of them, and determine gene orders in ancestral species. Gene trees integrate information from the sequence, the species tree and gene order. We reconstruct ancestral sequences for ancestral genic and intergenic regions, providing nearly a complete genome sequence for the ancestor, containing a chromosome and three plasmids. Conclusion The comparison of the ancestral and ancient sequences provides a unique opportunity to assess the quality of ancestral genome reconstruction methods. But the quality of the sequencing and assembly of the ancient sequence can also be questioned by this comparison. PMID:26450112

  14. Basal Murphy belt and Chilhowee Group -- Sequence stratigraphic comparison

    SciTech Connect

    Aylor, J.G. Jr. (Dept. of Geology)

    1994-03-01

    The lower Murphy belt in the central western Blue Ridge is interpreted to be correlative to the Early Cambrian Chilhowee Group of the westernmost Blue Ridge and Appalachian fold and thrust belt. Basal Murphy belt depositional sequence stratigraphy represents a second-order, type-2 transgressive systems tract initiated with deposition of lowstand turbidites of the Dean Formation. These transgressive deposits of the Nantahala and Brasstown Formations are interpreted as middle to outer continental shelf deposits. Cyclic and stacked third-order regressive, coarsening-upward sequences of the Nantahala Formation display an overall increase in feldspar content stratigraphically upsection. These transgressive siliciclastic deposits are interpreted to be conformably overlain by a carbonate highstand systems tract of the Murphy Marble. Palinspastic reconstruction indicates that the Nantahala and Brasstown Formations possibly represent a basinward extension of a siliciclastic wedge up to 3 km thick. The wedge tapers to the southwest along the strike of the Murphy belt at 10° and thins northwestward to 2 km in the Tennessee depocenter where it is represented by the Chilhowee Group. The Murphy belt basin is believed to represent a transitional rift-to-drift facies deposited on the lower plate of the southern Blue Ridge rift zone.

  15. Molecular evolution of the Escherichia coli chromosome. IV. Sequence comparisons.

    PubMed

    Milkman, R; Bridges, M M

    1993-03-01

    DNA sequences have been compared in a 4,400-bp region for Escherichia coli K12 and 36 ECOR strains. Discontinuities in degree of similarity, previously inferred, are confirmed in detail. Three clonal frames are described on the basis of the present local high-resolution data, as well as previous analyses of restriction fragment length polymorphism (RFLP) and of multilocus enzyme electrophoresis (MLEE) covering small regions more widely dispersed on the chromosome. These three approaches show important consistency. The data illustrate the fact that, in the limited context of intraspecific genomic sequence variation, clonality and homology are synonymous. Two estimable quantitative properties are defined: recency of common ancestry (the reciprocal of the log10 of the number of generations since the most recent common ancestor), and the number of nucleotide pairs over which a given recency of common ancestry applies. In principle, these parameters are measures of the degree and physical extent of homology. The small size of apparent recombinational replacements, together with the observation that they occasionally occur in discontinuous series, raises the question of whether they result from the superimposition of replacements of much larger size (as expected from an elementary interpretation of conjugation and transduction in experimental E. coli systems) or via an alternative mechanism. Length polymorphisms of several sorts are described. PMID:8095913
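
    The definition of recency of common ancestry given above (the reciprocal of the log10 of the number of generations since the most recent common ancestor) can be illustrated with a minimal sketch. The function name and the example generation counts are illustrative assumptions, not values from the paper.

```python
import math


def recency_of_common_ancestry(generations: float) -> float:
    """Reciprocal of log10 of the number of generations since the
    most recent common ancestor, as defined in the abstract.

    The function name and the input check are assumptions made for
    this sketch; the abstract only gives the verbal definition.
    """
    if generations <= 1:
        # log10 is zero or negative here, so the reciprocal is undefined
        # or meaningless as a "recency" value.
        raise ValueError("generations must be greater than 1")
    return 1.0 / math.log10(generations)


# Hypothetical generation counts: a more recent common ancestor
# (fewer generations) gives a larger recency value.
for generations in (1e3, 1e6, 1e9):
    print(f"{generations:.0e} generations -> recency {recency_of_common_ancestry(generations):.3f}")
```

    Because the measure is logarithmic, a thousandfold change in the number of generations only changes the recency value by a modest factor.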

  16. α-Amylase of Clostridium thermosulfurogenes EM1: Nucleotide sequence of the gene, processing of the enzyme, and comparison to other α-amylases

    SciTech Connect

    Bahl, H.; Burchhardt, G.; Spreinat, A.; Haeckel, K.; Wienecke, A.; Antranikian, G.; Schmidt, B.

    1991-05-01

    The nucleotide sequence of the α-amylase gene (amyA) from Clostridium thermosulfurogenes EM1 cloned in Escherichia coli was determined. The reading frame of the gene consisted of 2,121 bp. Comparison of the DNA sequence data with the amino acid sequence of the N terminus of the purified secreted protein of C. thermosulfurogenes EM1 suggested that the α-amylase is translated from mRNA as a secretory precursor with a signal peptide of 27 amino acid residues. The deduced amino acid sequence of the mature α-amylase contained 679 residues, resulting in a protein with a molecular mass of 75,112 Da. In E. coli the enzyme was transported to the periplasmic space and the signal peptide was cleaved at exactly the same site between two alanine residues. Comparison of the amino acid sequence of the C. thermosulfurogenes EM1 α-amylase with those from other bacterial and eukaryotic α-amylases showed several homologous regions, probably in the enzymatically functioning regions. The tentative Ca²⁺-binding site (consensus region I) of this Ca²⁺-independent enzyme showed only limited homology. The deduced amino acid sequence of a second obviously truncated open reading frame showed significant homology to the malG gene product of E. coli. Comparison of the α-amylase gene region of C. thermosulfurogenes EM1 (DSM3896) with the β-amylase gene region of C. thermosulfurogenes (ATCC 33743) indicated that both genes have been exchanged with each other at identical sites in the chromosomes of these strains.
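
    The lengths quoted in this abstract can be cross-checked with simple arithmetic. The sketch below assumes that the reported 2,121 bp reading frame includes the stop codon; the abstract does not say so explicitly, and the helper name is hypothetical.

```python
ORF_LENGTH_BP = 2121      # reading frame length reported in the abstract
SIGNAL_PEPTIDE_AA = 27    # signal peptide length reported in the abstract
MATURE_AA = 679           # mature enzyme length reported in the abstract


def residues_encoded(orf_bp: int, includes_stop: bool = True) -> int:
    """Number of amino acid residues encoded by a reading frame.

    Assumption for this sketch: when includes_stop is True, the last
    codon is a stop codon and does not contribute a residue.
    """
    if orf_bp % 3 != 0:
        raise ValueError("reading frame length must be a multiple of 3")
    codons = orf_bp // 3
    return codons - 1 if includes_stop else codons


precursor_aa = residues_encoded(ORF_LENGTH_BP)
print("precursor residues:", precursor_aa)
print("signal peptide + mature enzyme:", SIGNAL_PEPTIDE_AA + MATURE_AA)
```

    Under that assumption the two totals agree, which is consistent with a precursor made up of the signal peptide followed by the mature enzyme.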

  17. Draft Genome Sequences of Two Novel Acidimicrobiaceae Members from an Acid Mine Drainage Biofilm Metagenome.

    PubMed

    Pinto, Ameet J; Sharp, Jonathan O; Yoder, Michael J; Almstrand, Robert

    2016-01-01

    Bacteria belonging to the family Acidimicrobiaceae are frequently encountered in heavy metal-contaminated acidic environments. However, their phylogenetic and metabolic diversity is poorly resolved. We present draft genome sequences of two novel and phylogenetically distinct Acidimicrobiaceae members assembled from an acid mine drainage biofilm metagenome. PMID:26769942

  18. Draft Genome Sequences of Two Novel Acidimicrobiaceae Members from an Acid Mine Drainage Biofilm Metagenome

    PubMed Central

    Pinto, Ameet J.; Sharp, Jonathan O.; Yoder, Michael J.

    2016-01-01

    Bacteria belonging to the family Acidimicrobiaceae are frequently encountered in heavy metal-contaminated acidic environments. However, their phylogenetic and metabolic diversity is poorly resolved. We present draft genome sequences of two novel and phylogenetically distinct Acidimicrobiaceae members assembled from an acid mine drainage biofilm metagenome. PMID:26769942

  19. Method comparison study for weak acid dissociation cyanide analysis.

    PubMed

    Evans, Joseph D; Thompson, Leslie; Clark, Patrick J; Beckman, Scott W

    2003-02-01

    Method comparison studies of two different methods for the analysis of weak acid dissociable (WAD) cyanide revealed analytical flaws and/or matrix interference problems with both procedures. EPA "draft" method 1677 using a Perstorp 3202 CN analyzer was compared to Standard Method 4500 CN I. It was discovered that the Perstorp analyzer produced more precise and more accurate results once appropriate and necessary procedural steps from the EPA draft method were modified. Comparison of these two methods was based on "real world" samples collected from a mine-tailing solution. The mine-tailing solution contained high concentrations of cyanide and metals. Inconsistencies in method procedures were traced to sulfide interferences and high concentrations of WAD metals. Conclusions were based upon a large sample base collected from a mine site over a 90-day period. PMID:12630477

  20. Two distinct ferredoxins from Rhodobacter capsulatus: complete amino acid sequences and molecular evolution.

    PubMed

    Saeki, K; Suetsugu, Y; Yao, Y; Horio, T; Marrs, B L; Matsubara, H

    1990-09-01

    Two distinct ferredoxins were purified from Rhodobacter capsulatus SB1003. Their complete amino acid sequences were determined by a combination of protease digestion, BrCN cleavage and Edman degradation. Ferredoxins I and II were composed of 64 and 111 amino acids, respectively, with molecular weights of 6,728 and 12,549 excluding iron and sulfur atoms. Both contained two Cys clusters in their amino acid sequences. The first cluster of ferredoxin I and the second cluster of ferredoxin II had a sequence, CxxCxxCxxxCP, in common with the ferredoxins found in Clostridia. The second cluster of ferredoxin I had a sequence, CxxCxxxxxxxxCxxxCM, with extra amino acids between the second and third Cys, which has been reported for other photosynthetic bacterial ferredoxins and putative ferredoxins (nif-gene products) from nitrogen-fixing bacteria, and with a unique occurrence of Met. The first cluster of ferredoxin II had a CxxCxxxxCxxxCP sequence, with two additional amino acids between the second and third Cys, a characteristic feature of Azotobacter-[3Fe-4S] [4Fe-4S]-ferredoxin. Ferredoxin II was also similar to Azotobacter-type ferredoxins with an extended carboxyl (C-) terminal sequence compared to the common Clostridium-type. The evolutionary relationship of the two together with a putative one recently found to be encoded in nifENXQ region in this bacterium [Moreno-Vivian et al. (1989) J. Bacteriol. 171, 2591-2598] is discussed. PMID:2277040
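
    The Cys cluster patterns quoted above can be searched for in a protein sequence with regular expressions. This is a minimal sketch, assuming that each "x" stands for any single residue; the labels, function names and the test sequence are made up for illustration and are not taken from the paper.

```python
import re

# Motifs exactly as written in the abstract; the labels are
# informal names chosen for this sketch.
MOTIFS = {
    "clostridial-type": "CxxCxxCxxxCP",
    "ferredoxin I, second cluster": "CxxCxxxxxxxxCxxxCM",
    "ferredoxin II, first cluster": "CxxCxxxxCxxxCP",
}


def motif_to_regex(motif: str) -> re.Pattern:
    """Translate a motif such as 'CxxCxxCxxxCP' into a compiled regex.

    Assumption: 'x' matches any one uppercase residue letter; every
    other character must match literally.
    """
    parts = ("[A-Z]" if ch == "x" else re.escape(ch) for ch in motif)
    return re.compile("".join(parts))


def find_motifs(sequence: str) -> list:
    """Return (label, start index, matched text) for each motif hit.

    Hits are non-overlapping per motif, which is how finditer scans.
    """
    hits = []
    for label, motif in MOTIFS.items():
        for match in motif_to_regex(motif).finditer(sequence):
            hits.append((label, match.start(), match.group()))
    return hits


# Made-up sequence containing one clostridial-type cluster.
print(find_motifs("MACAACGGCTTTCPGG"))
```

    Start indices are 0-based, so a hit at index 2 begins at the third residue of the sequence.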

  1. Comparison of Sequencing (Barcode Region) and Sequence-Tagged-Site PCR for Blastocystis Subtyping

    PubMed Central

    2013-01-01

    Blastocystis is the most common nonfungal microeukaryote of the human intestinal tract and comprises numerous subtypes (STs), nine of which have been found in humans (ST1 to ST9). While efforts continue to explore the relationship between human health status and subtypes, no consensus regarding subtyping methodology exists. It has been speculated that differences detected in subtype distribution in various cohorts may to some extent reflect different approaches. Blastocystis subtypes have been determined primarily in one of two ways: (i) sequencing of small subunit rRNA gene (SSU-rDNA) PCR products and (ii) PCR with subtype-specific sequence-tagged-site (STS) diagnostic primers. Here, STS primers were evaluated against a panel of samples (n = 58) already subtyped by SSU-rDNA sequencing (barcode region), including subtypes for which STS primers are not available, and a small panel of DNAs from four other eukaryotes often present in feces (n = 18). Although the STS primers appeared to be highly specific, their sensitivity was only moderate, and the results indicated that some infections may go undetected when this method is used. False-negative STS results were not linked exclusively to certain subtypes or alleles, and evidence of substantial genetic variation in STS loci was obtained. Since the majority of DNAs included here were extracted from feces, it is possible that STS primers may generally work better with DNAs extracted from Blastocystis cultures. In conclusion, due to its higher applicability and sensitivity, and since sequence information is useful for other forms of research, SSU-rDNA barcoding is recommended as the method of choice for Blastocystis subtyping. PMID:23115257

  2. Nucleotide sequence of a cloned woodchuck hepatitis virus genome: comparison with the hepatitis B virus sequence.

    PubMed Central

    Galibert, F; Chen, T N; Mandart, E

    1982-01-01

    The complete nucleotide sequence of a woodchuck hepatitis virus genome cloned in Escherichia coli was determined by the method of Maxam and Gilbert. This sequence was found to be 3,308 nucleotides long. Potential ATG initiator triplets and nonsense codons were identified and used to locate regions with a substantial coding capacity. A striking similarity was observed between the organization of human hepatitis B virus and woodchuck hepatitis virus. Nucleotide sequences of these open regions in the woodchuck virus were compared with corresponding regions present in hepatitis B virus. This allowed the location of four viral genes on the L strand and indicated the absence of protein coded by the S strand. Evolution rates of the various parts of the genome as well as of the four different proteins coded by hepatitis B virus and woodchuck hepatitis virus were compared. These results indicated that: (i) the core protein has evolved slightly less rapidly than the other proteins; and (ii) when a region of DNA codes for two different proteins, there is less freedom for the DNA to evolve and, moreover, one of the proteins can evolve more rapidly than the other. A hairpin structure, very well conserved in the two genomes, was located in the only region devoid of coding function, suggesting the location of the origin of replication of the viral DNA. PMID:7086958

  3. Protein chemotaxonomy. XIII. Amino acid sequence of ferredoxin from Panax ginseng.

    PubMed

    Mino, Yoshiki

    2006-08-01

    The complete amino acid sequence of [2Fe-2S] ferredoxin from Panax ginseng (Araliaceae) has been determined by automated Edman degradation of the entire S-carboxymethylcysteinyl protein and of the peptides obtained by enzymatic digestion. This ferredoxin has a unique amino acid sequence, which includes an insertion of Tyr at the 3rd position from the amino-terminus and a deletion of two amino acid residues at the carboxyl terminus. This ferredoxin had 18 differences in its amino acid sequence compared to that of Petroselinum sativum (Umbelliferae). In contrast, 23-33 differences were observed compared to other dicotyledonous plants. This suggests that Panax ginseng is related taxonomically to umbelliferous plants. PMID:16880642

  4. Complete amino acid sequence and structure characterization of the taste-modifying protein, miraculin.

    PubMed

    Theerasilp, S; Hitotsuya, H; Nakajo, S; Nakaya, K; Nakamura, Y; Kurihara, Y

    1989-04-25

    The taste-modifying protein, miraculin, has the unusual property of modifying sour taste into sweet taste. The complete amino acid sequence of miraculin purified from miracle fruits by a newly developed method (Theerasilp, S., and Kurihara, Y. (1988) J. Biol. Chem. 263, 11536-11539) was determined by an automatic Edman degradation method. Miraculin was a single polypeptide with 191 amino acid residues. The calculated molecular weight based on the amino acid sequence and the carbohydrate content (13.9%) was 24,600. Asn-42 and Asn-186 were linked N-glycosidically to carbohydrate chains. High homology was found between the amino acid sequences of miraculin and soybean trypsin inhibitor. PMID:2708331

  5. Secure distributed genome analysis for GWAS and sequence comparison computation

    PubMed Central

    2015-01-01

    Background The rapid increase in the availability and volume of genomic data makes significant advances in biomedical research possible, but sharing of genomic data poses challenges due to the highly sensitive nature of such data. To address the challenges, a competition for secure distributed processing of genomic data was organized by the iDASH research center. Methods In this work we propose techniques for securing computation with real-life genomic data for minor allele frequency and chi-squared statistics computation, as well as distance computation between two genomic sequences, as specified by the iDASH competition tasks. We put forward novel optimizations, including a generalization of a version of mergesort, which might be of independent interest. Results We provide implementation results of our techniques based on secret sharing that demonstrate the practicality of the suggested protocols and also report on performance improvements due to our optimization techniques. Conclusions This work describes our techniques, findings, and experimental results developed and obtained as part of the iDASH 2015 research competition to secure real-life genomic computations and shows the feasibility of securely computing with genomic data in practice. PMID:26733307
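
    To give a feel for how secret sharing lets several sites pool allele counts without revealing them, here is a minimal sketch of additive secret sharing followed by a minor allele frequency calculation. It is not the authors' protocol: the modulus, the function names and the per-site counts are all assumptions chosen for illustration, and the sketch ignores network communication and adversarial behaviour.

```python
import secrets

# Assumed modulus for this sketch: a Mersenne prime far larger
# than any allele count that will be summed.
PRIME = 2**61 - 1


def share(value: int, n_parties: int) -> list:
    """Split value into n_parties additive shares modulo PRIME.

    Any n_parties - 1 shares look uniformly random; only the sum of
    all shares reveals the value.
    """
    shares = [secrets.randbelow(PRIME) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % PRIME)
    return shares


def reconstruct(shares: list) -> int:
    """Recombine additive shares into the original value."""
    return sum(shares) % PRIME


def secure_sum(private_values: list) -> int:
    """Sum one private value per party without pooling raw values."""
    n = len(private_values)
    # Party i deals one share of its value to every party j.
    dealt = [share(value, n) for value in private_values]
    # Party j adds up the shares it received and publishes only that.
    partial_sums = [sum(dealt[i][j] for i in range(n)) % PRIME for j in range(n)]
    return reconstruct(partial_sums)


def minor_allele_frequency(count_a: int, count_b: int) -> float:
    """Frequency of the less common of two alleles."""
    return min(count_a, count_b) / (count_a + count_b)


# Hypothetical allele counts held privately by three sites.
total_a = secure_sum([120, 45, 300])
total_b = secure_sum([80, 155, 100])
print(total_a, total_b, minor_allele_frequency(total_a, total_b))
```

    Only the aggregated counts are reconstructed, so each site's individual contribution stays hidden from the others in this simplified setting.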

  6. Comparison of methods for acid quantification: impact of resist components on acid-generating efficiency

    NASA Astrophysics Data System (ADS)

    Cameron, James F.; Fradkin, Leslie; Moore, Kathryn; Pohlers, Gerd

    2000-06-01

    Chemically amplified deep UV (CA-DUV) positive resists are the enabling materials for manufacture of devices at and below 0.18 micrometer design rules in the semiconductor industry. CA-DUV resists are typically based on a combination of an acid labile polymer and a photoacid generator (PAG). Upon UV exposure, a catalytic amount of a strong Brønsted acid is released and is subsequently used in a post-exposure bake step to deprotect the acid labile polymer. Deprotection transforms the acid labile polymer into a base soluble polymer and ultimately enables positive tone image development in dilute aqueous base. As CA-DUV resist systems continue to mature and are used in increasingly demanding situations, it is critical to develop a fundamental understanding of how robust these materials are. One of the most important factors to quantify is how much acid is photogenerated in these systems at key exposure doses. For the purpose of quantifying photoacid generation several methods have been devised. These include spectrophotometric methods, ion conductivity methods and most recently an acid-base type titration similar to the standard addition method. This paper compares many of these techniques. First, comparisons between the most commonly used acid sensitive dye, tetrabromophenol blue sodium salt (TBPB), and a less common acid sensitive dye, Rhodamine B base (RB), are made in several resist systems. Second, the novel acid-base type titration based on the standard addition method is compared to the spectrophotometric titration method. During these studies, the makeup of the resist system is probed as follows: the photoacid generator and resist additives are varied to understand the impact of each of these resist components on the acid generation process.

  7. N-terminal sequence of amino acids and some properties of an acid-stable alpha-amylase from citric acid-koji (Aspergillus usamii var.).

    PubMed

    Suganuma, T; Tahara, N; Kitahara, K; Nagahama, T; Inuzuka, K

    1996-01-01

    An acid-stable alpha-amylase (AA) was purified from an acidic extract of citric acid-koji (A. usamii var.). The N-terminal sequence of the first 20 amino acids of the enzyme was identical with that of AA from A. niger, but the two enzymes differed in molecular weight. HPLC analysis for identifying the anomers of products indicated that the AA hydrolyzed maltopentaose (G5) at the third glycoside bond predominantly, which differed from Taka-amylase A and the neutral alpha-amylase (NA) from the citric acid-koji. PMID:8824843

  8. Genome-Wide SNP Calling from Genotyping by Sequencing (GBS) Data: A Comparison of Seven Pipelines and Two Sequencing Technologies

    PubMed Central

    Torkamaneh, Davoud; Laroche, Jérôme; Belzile, François

    2016-01-01

    Next-generation sequencing (NGS) has revolutionized plant and animal research in many ways including new methods of high throughput genotyping. Genotyping-by-sequencing (GBS) has been demonstrated to be a robust and cost-effective genotyping method capable of producing thousands to millions of SNPs across a wide range of species. Undoubtedly, the greatest barrier to its broader use is the challenge of data analysis. Herein we describe a comprehensive comparison of seven GBS bioinformatics pipelines developed to process raw GBS sequence data into SNP genotypes. We compared five pipelines requiring a reference genome (TASSEL-GBS v1 & v2, Stacks, IGST, and Fast-GBS) and two de novo pipelines that do not require a reference genome (UNEAK and Stacks). Using Illumina sequence data from a set of 24 re-sequenced soybean lines, we performed SNP calling with these pipelines and compared the GBS SNP calls with the re-sequencing data to assess their accuracy. The number of SNPs called without a reference genome was lower (13k to 24k) than with a reference genome (25k to 54k SNPs) while accuracy was high (92.3 to 98.7%) for all but one pipeline (TASSEL-GBSv1, 76.1%). Among pipelines offering a high accuracy (>95%), Fast-GBS called the greatest number of polymorphisms (close to 35,000 SNPs + Indels) and yielded the highest accuracy (98.7%). Using Ion Torrent sequence data for the same 24 lines, we compared the performance of Fast-GBS with that of TASSEL-GBSv2. It again called more polymorphisms (25.8K vs 22.9K) and these proved more accurate (95.2 vs 91.1%). Typically, SNP catalogues called from the same sequencing data using different pipelines resulted in highly overlapping SNP catalogues (79–92% overlap). In contrast, overlap between SNP catalogues obtained using the same pipeline but different sequencing technologies was less extensive (~50–70%). PMID:27547936
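
    The overlap between two SNP catalogues can be computed with set operations. The sketch below is one possible definition (shared SNPs as a percentage of the union); the abstract does not state which definition the authors used, and the SNP records shown are invented for illustration.

```python
def catalogue_overlap(catalogue_a: set, catalogue_b: set) -> float:
    """Percentage of SNPs shared by two catalogues.

    Assumption for this sketch: overlap is the size of the
    intersection divided by the size of the union. Other definitions
    (for example, relative to the smaller catalogue) give different
    numbers.
    """
    union = catalogue_a | catalogue_b
    if not union:
        return 0.0
    return 100.0 * len(catalogue_a & catalogue_b) / len(union)


# Hypothetical SNPs keyed by (chromosome, position, reference, alternate).
pipeline_a = {
    ("Gm01", 1052, "A", "G"),
    ("Gm01", 8840, "C", "T"),
    ("Gm02", 377, "G", "A"),
    ("Gm03", 15210, "T", "C"),
}
pipeline_b = {
    ("Gm01", 1052, "A", "G"),
    ("Gm01", 8840, "C", "T"),
    ("Gm02", 377, "G", "A"),
    ("Gm04", 990, "A", "T"),
}

print(f"overlap: {catalogue_overlap(pipeline_a, pipeline_b):.1f}%")
```

    Keying each SNP by position and alleles means that two pipelines calling different alternate alleles at the same site are counted as disagreeing.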

  9. Genome-Wide SNP Calling from Genotyping by Sequencing (GBS) Data: A Comparison of Seven Pipelines and Two Sequencing Technologies.

    PubMed

    Torkamaneh, Davoud; Laroche, Jérôme; Belzile, François

    2016-01-01

    Next-generation sequencing (NGS) has revolutionized plant and animal research in many ways including new methods of high throughput genotyping. Genotyping-by-sequencing (GBS) has been demonstrated to be a robust and cost-effective genotyping method capable of producing thousands to millions of SNPs across a wide range of species. Undoubtedly, the greatest barrier to its broader use is the challenge of data analysis. Herein we describe a comprehensive comparison of seven GBS bioinformatics pipelines developed to process raw GBS sequence data into SNP genotypes. We compared five pipelines requiring a reference genome (TASSEL-GBS v1 & v2, Stacks, IGST, and Fast-GBS) and two de novo pipelines that do not require a reference genome (UNEAK and Stacks). Using Illumina sequence data from a set of 24 re-sequenced soybean lines, we performed SNP calling with these pipelines and compared the GBS SNP calls with the re-sequencing data to assess their accuracy. The number of SNPs called without a reference genome was lower (13k to 24k) than with a reference genome (25k to 54k SNPs) while accuracy was high (92.3 to 98.7%) for all but one pipeline (TASSEL-GBSv1, 76.1%). Among pipelines offering a high accuracy (>95%), Fast-GBS called the greatest number of polymorphisms (close to 35,000 SNPs + Indels) and yielded the highest accuracy (98.7%). Using Ion Torrent sequence data for the same 24 lines, we compared the performance of Fast-GBS with that of TASSEL-GBSv2. It again called more polymorphisms (25.8K vs 22.9K) and these proved more accurate (95.2 vs 91.1%). Typically, SNP catalogues called from the same sequencing data using different pipelines resulted in highly overlapping SNP catalogues (79-92% overlap). In contrast, overlap between SNP catalogues obtained using the same pipeline but different sequencing technologies was less extensive (~50-70%). PMID:27547936

  10. Circular Helix-Like Curve: An Effective Tool of Biological Sequence Analysis and Comparison.

    PubMed

    Li, Yushuang; Xiao, Wenli

    2016-01-01

    This paper constructed a novel injection from a DNA sequence to a 3D graph, named circular helix-like curve (CHC). The presented graphical representation is available for visualizing characterizations of a single DNA sequence and identifying similarities and differences among several DNAs. A 12-dimensional vector extracted from CHC, as a numerical characterization of CHC, was applied to analyze phylogenetic relationships of 11 species, 74 ribosomal RNAs, 48 Hepatitis E viruses, and 18 eutherian mammals, respectively. Successful experiments illustrated that CHC is an effective tool of biological sequence analysis and comparison. PMID:27403205

  11. Circular Helix-Like Curve: An Effective Tool of Biological Sequence Analysis and Comparison

    PubMed Central

    Li, Yushuang

    2016-01-01

    This paper constructed a novel injection from a DNA sequence to a 3D graph, named circular helix-like curve (CHC). The presented graphical representation is available for visualizing characterizations of a single DNA sequence and identifying similarities and differences among several DNAs. A 12-dimensional vector extracted from CHC, as a numerical characterization of CHC, was applied to analyze phylogenetic relationships of 11 species, 74 ribosomal RNAs, 48 Hepatitis E viruses, and 18 eutherian mammals, respectively. Successful experiments illustrated that CHC is an effective tool of biological sequence analysis and comparison. PMID:27403205

  12. Ribosomal DNA ITS-1 and ITS-2 sequence comparisons as a tool for predicting genetic relatedness.

    PubMed

    Coleman, A W; Mai, J C

    1997-08-01

    The determination of the secondary structure of the internal transcribed spacer (ITS) regions separating nuclear ribosomal RNA genes of Chlorophytes has improved the fidelity of alignment of nuclear ribosomal ITS sequences from related organisms. Application of this information to sequences from green algae and plants suggested that a subset of the ITS-2 positions is relatively conserved. Organisms that can mate are identical at all of these 116 positions, or differ by, at most, one nucleotide change. Here we sequenced and compared the ITS-1 and ITS-2 of 40 green flagellates in search of the nearest relative to Chlamydomonas reinhardtii. The analysis clearly revealed one unique candidate, C. incerta. Several ancillary benefits of the analysis included the identification of mislabelled cultures, the resolution of confusion concerning C. smithii, the discovery of misidentified sequences in GenBank derived from a green algal contaminant, and an overview of evolutionary relationships among the Volvocales, which is congruent with that derived from rDNA gene sequence comparisons but improves upon its resolution. The study further delineates the taxonomic level at which ITS sequences, in comparison to ribosomal gene sequences, are most useful in systematic and other studies. PMID:9236277

  13. Comparison of alignment software for genome-wide bisulphite sequence data

    PubMed Central

    Chatterjee, Aniruddha; Stockwell, Peter A.; Rodger, Euan J.; Morison, Ian M.

    2012-01-01

    Recent advances in next generation sequencing (NGS) technology now provide the opportunity to rapidly interrogate the methylation status of the genome. However, there are challenges in handling and interpretation of the methylation sequence data because of its large volume and the consequences of bisulphite modification. We sequenced reduced representation human genomes on the Illumina platform and efficiently mapped and visualized the data with different pipelines and software packages. We examined three pipelines for aligning bisulphite converted sequencing reads and compared their performance. We also comment on pre-processing and quality control of Illumina data. This comparison highlights differences in methods for NGS data processing and provides guidance to advance sequence-based methylation data analysis for molecular biologists. PMID:22344695
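
    One consequence of bisulphite modification is that unmethylated cytosines read as thymines, so converted reads no longer match the reference directly. A strategy used by a number of bisulphite aligners is to compare reads and reference after converting every C to T in both. The sketch below illustrates that idea only; it is not a description of the pipelines compared in the paper, and all names and sequences are hypothetical.

```python
def bisulfite_convert(read: str, methylated_positions=frozenset()) -> str:
    """Simulate bisulphite treatment of a top-strand read.

    Unmethylated C becomes T; C at a methylated position stays C.
    Positions are 0-based indices into the read.
    """
    return "".join(
        "T" if base == "C" and index not in methylated_positions else base
        for index, base in enumerate(read.upper())
    )


def c_to_t(sequence: str) -> str:
    """Fully convert a sequence in silico (every C becomes T)."""
    return sequence.upper().replace("C", "T")


def matches_converted_reference(read: str, reference: str, offset: int) -> bool:
    """Check whether a read matches the reference at offset once both
    have been fully C-to-T converted, which hides methylation state."""
    window = reference[offset:offset + len(read)]
    return len(window) == len(read) and c_to_t(read) == c_to_t(window)


reference = "ACGTCCGA"
# Hypothetical read covering the whole reference, methylated at index 1.
read = bisulfite_convert(reference, methylated_positions={1})
print(read)
print("direct match:", read == reference)
print("match after in silico conversion:", matches_converted_reference(read, reference, 0))
```

    This handles reads from one strand only; reads from the complementary strand would need the analogous G-to-A conversion, and methylation calls are made afterwards by comparing the original read with the original reference.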

  14. Detection and isolation of nucleic acid sequences using competitive hybridization probes

    DOEpatents

    Lucas, Joe N.; Straume, Tore; Bogen, Kenneth T.

    1997-01-01

    A method for detecting a target nucleic acid sequence in a sample is provided using hybridization probes which competitively hybridize to a target nucleic acid. According to the method, a target nucleic acid sequence is hybridized to first and second hybridization probes which are complementary to overlapping portions of the target nucleic acid sequence, the first hybridization probe including a first complexing agent capable of forming a binding pair with a second complexing agent and the second hybridization probe including a detectable marker. The first complexing agent attached to the first hybridization probe is contacted with a second complexing agent, the second complexing agent being attached to a solid support such that when the first and second complexing agents are attached, target nucleic acid sequences hybridized to the first hybridization probe become immobilized on to the solid support. The immobilized target nucleic acids are then separated and detected by detecting the detectable marker attached to the second hybridization probe. A kit for performing the method is also provided.

  15. Detection and isolation of nucleic acid sequences using competitive hybridization probes

    DOEpatents

    Lucas, J.N.; Straume, T.; Bogen, K.T.

    1997-04-01

    A method for detecting a target nucleic acid sequence in a sample is provided using hybridization probes which competitively hybridize to a target nucleic acid. According to the method, a target nucleic acid sequence is hybridized to first and second hybridization probes which are complementary to overlapping portions of the target nucleic acid sequence, the first hybridization probe including a first complexing agent capable of forming a binding pair with a second complexing agent and the second hybridization probe including a detectable marker. The first complexing agent attached to the first hybridization probe is contacted with a second complexing agent, the second complexing agent being attached to a solid support such that when the first and second complexing agents are attached, target nucleic acid sequences hybridized to the first hybridization probe become immobilized on to the solid support. The immobilized target nucleic acids are then separated and detected by detecting the detectable marker attached to the second hybridization probe. A kit for performing the method is also provided. 7 figs.

  16. Comparison of solution-based exome capture methods for next generation sequencing

    PubMed Central

    2011-01-01

    Background Techniques enabling targeted re-sequencing of the protein coding sequences of the human genome on next generation sequencing instruments are of great interest. We conducted a systematic comparison of the solution-based exome capture kits provided by Agilent and Roche NimbleGen. A control DNA sample was captured with all four capture methods and prepared for Illumina GAII sequencing. Sequence data from additional samples prepared with the same protocols were also used in the comparison. Results We developed a bioinformatics pipeline for quality control, short read alignment, variant identification and annotation of the sequence data. In our analysis, a larger percentage of the high quality reads from the NimbleGen captures than from the Agilent captures aligned to the capture target regions. High GC content of the target sequence was associated with poor capture success in all exome enrichment methods. Comparison of mean allele balances for heterozygous variants indicated a tendency to have more reference bases than variant bases in the heterozygous variant positions within the target regions in all methods. There was virtually no difference in the genotype concordance compared to genotypes derived from SNP arrays. A minimum of 11× coverage was required to make a heterozygote genotype call with 99% accuracy when compared to common SNPs on genome-wide association arrays. Conclusions Libraries captured with NimbleGen kits aligned more accurately to the target regions. The updated NimbleGen kit most efficiently covered the exome with a minimum coverage of 20×, yet none of the kits captured all the Consensus Coding Sequence annotated exons. PMID:21955854

  17. Mining and comparison of haplotype-based expressed sequence tag single nucleotide polymorphisms among citrus cultivars

    Technology Transfer Automated Retrieval System (TEKTRAN)

    In this paper, haplotype-based SNPs were mined out of publicly available citrus expressed sequence tags (ESTs) from different citrus cultivars (genotypes) individually and collectively for comparison. There were a total of 567,297 ESTs belonging to 27 cultivars in varying numbers and consequentially...

  18. Genomic sequence comparison of eif(iso)4E between Arabidopsis and melon

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Eukaryotic initiation factors (eifs) bind to mRNA and initiate translation in plants. Mutations in eifs condition recessively inherited virus resistances. While coding regions among eifs have been compared both within and among species, comparisons among flanking genomic sequences are lacking. We ...

  19. Comparison and quantitative verification of mapping algorithms for whole genome bisulfite sequencing

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Coupling bisulfite conversion with next-generation sequencing (Bisulfite-seq) enables genome-wide measurement of DNA methylation, but poses unique challenges for mapping. However, despite a proliferation of Bisulfite-seq mapping tools, no systematic comparison of their genomic coverage and quantitat...

  20. Analyses of mitochondrial amino acid sequence datasets support the proposal that specimens of Hypodontus macropi from three species of macropodid hosts represent distinct species

    PubMed Central

    2013-01-01

    Background Hypodontus macropi is a common intestinal nematode of a range of kangaroos and wallabies (macropodid marsupials). Based on previous multilocus enzyme electrophoresis (MEE) and nuclear ribosomal DNA sequence data sets, H. macropi has been proposed to be a complex of species. To test this proposal using independent molecular data, we sequenced the whole mitochondrial (mt) genomes of individuals of H. macropi from three different species of hosts (Macropus robustus robustus, Thylogale billardierii and Macropus [Wallabia] bicolor) as well as that of Macropicola ocydromi (a related nematode), and undertook a comparative analysis of the amino acid sequence datasets derived from these genomes. Results The mt genomes sequenced by next-generation (454) technology from H. macropi from the three host species varied from 13,634 bp to 13,699 bp in size. Pairwise comparisons of the amino acid sequences predicted from these three mt genomes revealed differences of 5.8% to 18%. Phylogenetic analysis of the amino acid sequence data sets using Bayesian Inference (BI) showed that H. macropi from the three different host species formed distinct, well-supported clades. In addition, sliding window analysis of the mt genomes defined variable regions for future population genetic studies of H. macropi in different macropodid hosts and geographical regions around Australia. Conclusions The present analyses of inferred mt protein sequence datasets clearly supported the hypothesis that H. macropi from M. robustus robustus, M. bicolor and T. billardierii represent distinct species. PMID:24261823

  1. Computational approach towards promoter sequence comparison via TF mapping using a new distance measure.

    PubMed

    Meera, A; Rangarajan, Lalitha; Bhat, Savithri

    2011-03-01

    We propose a method for identifying transcription factor binding sites (TFBS) in a given promoter sequence and mapping the transcription factors (TFs). The proposed algorithm searches the +1 transcription start site (TSS) for eukaryotic and prokaryotic sequences individually. The algorithm was tested with sequences from both eukaryotes and prokaryotes for at least 9 experimentally verified and validated functional TFs in promoter sequences. The order and type of TFs binding to the promoters of genes encoding central metabolic pathway (CMP) enzymes were tabulated. A new similarity measure was devised for scoring the similarity between a pair of promoter sequences based on the number and order of motifs. Further, these were grouped in clusters considering the scores between them. The distance between each of the clusters in each individual pathway was calculated and a phylogenetic tree was developed. This method is further applied to other pathways such as lipid and amino acid biosynthesis to retrieve and compare experimentally verified and conserved TFBS. PMID:21369887

  2. Progressive structure-based alignment of homologous proteins: Adopting sequence comparison strategies.

    PubMed

    Joseph, Agnel Praveen; Srinivasan, Narayanaswamy; de Brevern, Alexandre G

    2012-09-01

    Comparison of multiple protein structures has a broad range of applications in the analysis of protein structure, function and evolution. Multiple structure alignment tools (MSTAs) are necessary to obtain a simultaneous comparison of a family of related folds. In this study, we have developed a method for multiple structure comparison largely based on sequence alignment techniques. A widely used Structural Alphabet named Protein Blocks (PBs) was used to transform the information on 3D protein backbone conformation into a 1D sequence string. A progressive alignment strategy similar to CLUSTALW was adopted for multiple PB sequence alignment (mulPBA). Highly similar stretches identified by the pairwise alignments are given higher weights during the alignment. The residue equivalences from PB based alignments are used to obtain a three dimensional fit of the structures followed by an iterative refinement of the structural superposition. Systematic comparisons using benchmark datasets of MSTAs underline that the alignment quality is better than MULTIPROT, MUSTANG and the alignments in HOMSTRAD, in more than 85% of the cases. Comparison with other rigid-body and flexible MSTAs also indicates that mulPBA alignments are superior to most of the rigid-body MSTAs and highly comparable to the flexible alignment methods. PMID:22676903

  3. Conservation of Shannon's redundancy for proteins. [information theory applied to amino acid sequences]

    NASA Technical Reports Server (NTRS)

    Gatlin, L. L.

    1974-01-01

    Concepts of information theory are applied to examine various proteins in terms of their redundancy in natural originators such as animals and plants. The Monte Carlo method is used to derive information parameters for random protein sequences. Real protein sequence parameters are compared with the standard parameters of protein sequences having a specific length. The tendency of a chain to contain some amino acids more frequently than others and the tendency of a chain to contain certain amino acid pairs more frequently than other pairs are used as randomness measures of individual protein sequences. Non-periodic proteins are generally found to have random Shannon redundancies except in cases of constraints due to short chain length and genetic codes. Redundant characteristics of highly periodic proteins are discussed. A degree of periodicity parameter is derived.
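
    The composition-based randomness measure mentioned in this abstract can be illustrated with a minimal sketch. It assumes the common definition of redundancy, R = 1 - H/H_max, with H the Shannon entropy of the single-residue frequencies and H_max = log2(20). The function names are hypothetical, and the record's pair-frequency measure and Monte Carlo baselines are not reproduced here.

```python
import math
from collections import Counter


def shannon_entropy(seq):
    """Shannon entropy (bits per residue) of the residue frequencies in seq."""
    if not seq:
        raise ValueError("sequence must not be empty")
    n = len(seq)
    counts = Counter(seq)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def redundancy(seq, alphabet_size=20):
    """Redundancy R = 1 - H / H_max, with H_max = log2(alphabet_size).

    Assumption: a 20-letter amino acid alphabet; R is 0 for a perfectly
    uniform composition and 1 for a chain made of a single residue type.
    """
    h_max = math.log2(alphabet_size)
    return 1.0 - shannon_entropy(seq) / h_max
```

    Short chains cannot sample all 20 residues evenly, so their redundancy is inflated by length alone, which is consistent with the chain-length constraint noted in the abstract.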

  4. Conversion of amino-acid sequence in proteins to classical music: search for auditory patterns

    PubMed Central

    2007-01-01

    We have converted genome-encoded protein sequences into musical notes to reveal auditory patterns without compromising musicality. We derived a reduced range of 13 base notes by pairing similar amino acids and distinguishing them using variations of three-note chords and codon distribution to dictate rhythm. The conversion will help make genomic coding sequences more approachable for the general public, young children, and vision-impaired scientists. PMID:17477882

  5. Detection of Weakly Conserved Ancestral Mammalian Regulatory Sequences by Primate Comparisons

    SciTech Connect

    Wang, Qian-fei; Prabhakar, Shyam; Chanan, Sumita; Cheng, Jan-Fang; Rubin, Edward M.; Boffelli, Dario

    2006-06-01

    Genomic comparisons between human and distant, non-primate mammals are commonly used to identify cis-regulatory elements based on constrained sequence evolution. However, these methods fail to detect cryptic functional elements, which are too weakly conserved among mammals to distinguish from nonfunctional DNA. To address this problem, we explored the potential of deep intra-primate sequence comparisons. We sequenced the orthologs of 558 kb of human genomic sequence, covering multiple loci involved in cholesterol homeostasis, in 6 nonhuman primates. Our analysis identified 6 noncoding DNA elements displaying significant conservation among primates, but undetectable in more distant comparisons. In vitro and in vivo tests revealed that at least three of these 6 elements have regulatory function. Notably, the mouse orthologs of these three functional human sequences had regulatory activity despite their lack of significant sequence conservation, indicating that they are cryptic ancestral cis-regulatory elements. These regulatory elements could still be detected in a smaller set of three primate species including human, rhesus and marmoset. Since the human and rhesus genome sequences are already available, and the marmoset genome is actively being sequenced, the primate-specific conservation analysis described here can be applied in the near future on a whole-genome scale, to complement the annotation provided by more distant species comparisons.

  6. Protein location prediction using atomic composition and global features of the amino acid sequence

    SciTech Connect

    Cherian, Betsy Sheena; Nair, Achuthsankar S.

    2010-01-22

    Subcellular location of protein is constructive information in determining its function, screening for drug candidates, vaccine design, annotation of gene products and in selecting relevant proteins for further studies. Computational prediction of subcellular localization deals with predicting the location of a protein from its amino acid sequence. For a computational localization prediction method to be more accurate, it should exploit all possible relevant biological features that contribute to the subcellular localization. In this work, we extracted the biological features from the full length protein sequence to incorporate more biological information. A new biological feature, distribution of atomic composition, is effectively used with multiple physicochemical properties, amino acid composition, three part amino acid composition, and sequence similarity for predicting the subcellular location of the protein. Support Vector Machines are designed for four modules and prediction is made by a weighted voting system. Our system makes prediction with an accuracy of 100, 82.47, 88.81 for self-consistency test, jackknife test and independent data test respectively. Our results provide evidence that the prediction based on the biological features derived from the full length amino acid sequence gives better accuracy than those derived from N-terminal alone. Considering the features as a distribution within the entire sequence will bring out underlying property distribution to a greater detail to enhance the prediction accuracy.
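
    The weighted voting step described in this abstract can be sketched as follows. The abstract does not say how the four module weights are chosen or how ties are resolved, so the weights, the location labels and the function name below are illustrative assumptions and not the authors' implementation.

```python
from collections import defaultdict


def weighted_vote(predictions, weights):
    """Return the label with the largest total weight.

    predictions: one predicted label per module (e.g. per SVM module).
    weights: one non-negative weight per module (hypothetical values).
    Ties are resolved in favour of the label seen first.
    """
    if not predictions or len(predictions) != len(weights):
        raise ValueError("need one weight per prediction")
    totals = defaultdict(float)
    for label, weight in zip(predictions, weights):
        totals[label] += weight
    return max(totals, key=totals.get)


# Four hypothetical module outputs and weights for a single protein.
modules = ["nucleus", "cytoplasm", "cytoplasm", "nucleus"]
module_weights = [0.4, 0.2, 0.1, 0.2]
predicted_location = weighted_vote(modules, module_weights)
```

    With these made-up weights the two "nucleus" votes total 0.6 against 0.3 for "cytoplasm", so the combined prediction follows the more heavily weighted modules even though the raw vote count is tied.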

  7. Ab initio detection of fuzzy amino acid tandem repeats in protein sequences

    PubMed Central

    2012-01-01

    Background Tandem repetitions within protein amino acid sequences often correspond to regular secondary structures and form multi-repeat 3D assemblies of varied size and function. Developing internal repetitions is one of the evolutionary mechanisms that proteins employ to adapt their structure and function under evolutionary pressure. While there is keen interest in understanding such phenomena, detection of repeating structures based only on sequence analysis is considered an arduous task, since structure and function are often preserved even under considerable sequence divergence (fuzzy tandem repeats). Results In this paper we present PTRStalker, a new algorithm for ab-initio detection of fuzzy tandem repeats in protein amino acid sequences. In the reported results we show that by feeding PTRStalker with amino acid sequences from the UniProtKB/Swiss-Prot database we detect novel tandemly repeated structures not captured by other state-of-the-art tools. Experiments with membrane proteins indicate that PTRStalker can detect global symmetries in the primary structure which are then reflected in the tertiary structure. Conclusions PTRStalker is able to detect fuzzy tandem repeating structures in protein sequences, with performance beyond the current state of the art. Such a tool may be a valuable support to investigating protein structural properties when tertiary X-ray data is not available. PMID:22536906

  8. Multimodal phylogeny for taxonomy: integrating information from nucleotide and amino acid sequences.

    PubMed

    Bicego, Manuele; Dellaglio, Franco; Felis, Giovanna E

    2007-10-01

    The crucial role played by the analysis of microbial diversity in biotechnology-based innovations has increased the interest in the microbial taxonomy research area. Phylogenetic sequence analyses have contributed significantly to the advances in this field, also in the view of the large amount of sequence data collected in recent years. Phylogenetic analyses could be realized on the basis of protein-encoding nucleotide sequences or encoded amino acid molecules: these two mechanisms present different peculiarities, still starting from two alternative representations of the same information. This complementarity could be exploited to achieve a multimodal phylogenetic scheme that is able to integrate gene and protein information in order to realize a single final tree. This aspect has been poorly addressed in the literature. In this paper, we propose to integrate the two phylogenetic analyses using basic schemes derived from the multimodality fusion theory (or multiclassifier systems theory), a well-founded and rigorous branch for which its powerfulness has already been demonstrated in other pattern recognition contexts. The proposed approach could be applied to distance matrix-based phylogenetic techniques (like neighbor joining), resulting in a smart and fast method. The proposed methodology has been tested in a real case involving sequences of some species of lactic acid bacteria. With this dataset, both nucleotide sequence- and amino acid sequence-based phylogenetic analyses present some drawbacks, which are overcome with the multimodal analysis. PMID:17933011

  9. The amino-acid sequence of leghemoglobin component a from Phaseolus vulgaris (kidney bean).

    PubMed

    Lehtovaara, P; Ellfolk, N

    1975-06-01

    1. Leghemoglobin component a from Phaseolus vulgaris (kidney bean) was digested with trypsin; 15 tryptic peptides and free lysine were purified and the amino acid sequences of the peptides determined. 2. The internal order of the tryptic peptides was determined by the bridge peptides obtained from the thermolytic digest and the dilute acid hydrolyzate of kidney bean leghemoglobin a; 12 thermolytic peptides and two acid hydrolysis peptides were purified and the sequences were partially or completely determined. 3. The complete amino acid sequence of kidney bean leghemoglobin a is compared to that of leghemoglobin a from soybean (Glycine max) and to some animal globins. As regards sequence, the kidney bean globin has 79% identity with the soybean globin and 21% identity with human hemoglobin gamma-chain. Seven of the 14 amino acid residues common to most globins are found in the kidney bean globin. Trp-15 and Tyr-145 are evolutionarily conserved in this globin, which confirms the concept of a common origin of animal and plant globins. PMID:809270
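
    Identity figures such as the 79% and 21% quoted in this abstract are usually computed from an alignment. The sketch below shows one common convention, which is an assumption here because the abstract does not state which denominator was used: identical residues divided by the number of aligned columns in which neither sequence has a gap. The function name and the "-" gap symbol are likewise illustrative.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over the gap-free columns of a pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # skip columns where either sequence has a gap
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * matches / compared
```

    Other conventions divide by the length of the shorter sequence or by the full alignment length, which can shift the reported percentage noticeably for distantly related sequences.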

  10. Draft Genome Sequence of Ustilago trichophora RK089, a Promising Malic Acid Producer

    PubMed Central

    Zambanini, Thiemo; Buescher, Joerg M.; Meurer, Guido; Blank, Lars M.

    2016-01-01

    The basidiomycetous smut fungus Ustilago trichophora RK089 produces malate from glycerol. De novo genome sequencing revealed a 20.7-Mbp genome (301 gap-closed contigs, 246 scaffolds). A comparison to the genome of Ustilago maydis 521 revealed all essential genes for malate production from glycerol contributing to metabolic engineering for improving malate production. PMID:27469969

  11. Draft Genome Sequence of Ustilago trichophora RK089, a Promising Malic Acid Producer.

    PubMed

    Zambanini, Thiemo; Buescher, Joerg M; Meurer, Guido; Wierckx, Nick; Blank, Lars M

    2016-01-01

    The basidiomycetous smut fungus Ustilago trichophora RK089 produces malate from glycerol. De novo genome sequencing revealed a 20.7-Mbp genome (301 gap-closed contigs, 246 scaffolds). A comparison to the genome of Ustilago maydis 521 revealed all essential genes for malate production from glycerol contributing to metabolic engineering for improving malate production. PMID:27469969

  12. Draft genome sequence of the docosahexaenoic acid producing thraustochytrid Aurantiochytrium sp. T66.

    PubMed

    Liu, Bin; Ertesvåg, Helga; Aasen, Inga Marie; Vadstein, Olav; Brautaset, Trygve; Heggeset, Tonje Marita Bjerkan

    2016-06-01

    Thraustochytrids are unicellular, marine protists, and there is a growing industrial interest in these organisms, particularly because some species, including strains belonging to the genus Aurantiochytrium, accumulate high levels of docosahexaenoic acid (DHA). Here, we report the draft genome sequence of Aurantiochytrium sp. T66 (ATCC PRA-276), with a size of 43 Mbp, and 11,683 predicted protein-coding sequences. The data has been deposited at DDBJ/EMBL/Genbank under the accession LNGJ00000000. The genome sequence will contribute new insight into DHA biosynthesis and regulation, providing a basis for metabolic engineering of thraustochytrids. PMID:27222814

  13. A classification of glycosyl hydrolases based on amino acid sequence similarities.

    PubMed Central

    Henrissat, B

    1991-01-01

    The amino acid sequences of 301 glycosyl hydrolases and related enzymes have been compared. A total of 291 sequences corresponding to 39 EC entries could be classified into 35 families. Only ten sequences (less than 5% of the sample) could not be assigned to any family. With the sequences available for this analysis, 18 families were found to be monospecific (containing only one EC number) and 17 were found to be polyspecific (containing at least two EC numbers). Implications on the folding characteristics and mechanism of action of these enzymes and on the evolution of carbohydrate metabolism are discussed. With the steady increase in sequence and structural data, it is suggested that the enzyme classification system should perhaps be revised. PMID:1747104

  14. Sequence-specific purification of nucleic acids by PNA-controlled hybrid selection.

    PubMed

    Orum, H; Nielsen, P E; Jørgensen, M; Larsson, C; Stanley, C; Koch, T

    1995-09-01

    Using an oligohistidine peptide nucleic acids (oligohistidine-PNA) chimera, we have developed a rapid hybrid selection method that allows efficient, sequence-specific purification of a target nucleic acid. The method exploits two fundamental features of PNA. First, that PNA binds with high affinity and specificity to its complementary nucleic acid. Second, that amino acids are easily attached to the PNA oligomer during synthesis. We show that a (His)6-PNA chimera exhibits strong binding to chelated Ni2+ ions without compromising its native PNA hybridization properties. We further show that these characteristics allow the (His)6-PNA/DNA complex to be purified by the well-established method of metal ion affinity chromatography using a Ni(2+)-NTA (nitrilotriacetic acid) resin. Specificity and efficiency are the touchstones of any nucleic acid purification scheme. We show that the specificity of the (His)6-PNA selection approach is such that oligonucleotides differing by only a single nucleotide can be selectively purified. We also show that large RNAs (2224 nucleotides) can be captured with high efficiency by using multiple (His)6-PNA probes. PNA can hybridize to nucleic acids in low-salt concentrations that destabilize native nucleic acid structures. We demonstrate that this property of PNA can be utilized to purify an oligonucleotide in which the target sequence forms part of an intramolecular stem/loop structure. PMID:7495562

  15. Identification of Clinical Coryneform Bacterial Isolates: Comparison of Biochemical Methods and Sequence Analysis of 16S rRNA and rpoB Genes

    PubMed Central

    Adderson, Elisabeth E.; Boudreaux, Jan W.; Cummings, Jessica R.; Pounds, Stanley; Wilson, Deborah A.; Procop, Gary W.; Hayden, Randall T.

    2008-01-01

    We compared the relative levels of effectiveness of three commercial identification kits and three nucleic acid amplification tests for the identification of coryneform bacteria by testing 50 diverse isolates, including 12 well-characterized control strains and 38 organisms obtained from pediatric oncology patients at our institution. Between 33.3 and 75.0% of control strains were correctly identified to the species level by phenotypic systems or nucleic acid amplification assays. The most sensitive tests were the API Coryne system and amplification and sequencing of the 16S rRNA gene using primers optimized for coryneform bacteria, which correctly identified 9 of 12 control isolates to the species level, and all strains with a high-confidence call were correctly identified. Organisms not correctly identified were species not included in the test kit databases or not producing a pattern of reactions included in kit databases or which could not be differentiated among several genospecies based on reaction patterns. Nucleic acid amplification assays had limited abilities to identify some bacteria to the species level, and comparison of sequence homologies was complicated by the inclusion of allele sequences obtained from uncultivated and uncharacterized strains in databases. The utility of rpoB genotyping was limited by the small number of representative gene sequences that are currently available for comparison. The correlation between identifications produced by different classification systems was poor, particularly for clinical isolates. PMID:18160450

  16. Antibody-specific model of amino acid substitution for immunological inferences from alignments of antibody sequences.

    PubMed

    Mirsky, Alexander; Kazandjian, Linda; Anisimova, Maria

    2015-03-01

    Antibodies are glycoproteins produced by the immune system as a dynamically adaptive line of defense against invading pathogens. Very elegant and specific mutational mechanisms allow B lymphocytes to produce a large and diversified repertoire of antibodies, which is modified and enhanced throughout all adulthood. One of these mechanisms is somatic hypermutation, which stochastically mutates nucleotides in the antibody genes, forming new sequences with different properties and, eventually, higher affinity and selectivity to the pathogenic target. As somatic hypermutation involves fast mutation of antibody sequences, this process can be described using a Markov substitution model of molecular evolution. Here, using large sets of antibody sequences from mice and humans, we infer an empirical amino acid substitution model AB, which is specific to antibody sequences. Compared with existing general amino acid models, we show that the AB model provides a significantly better description of the somatic evolution of mouse and human antibody sequences, as demonstrated on large next generation sequencing (NGS) antibody data. General amino acid models are reflective of conservation at the protein level due to functional constraints, with the most frequent amino acid exchanges taking place between residues with the same or similar physicochemical properties. In contrast, within the variable part of antibody sequences we observed an elevated frequency of exchanges between amino acids with distinct physicochemical properties. This is indicative of a sui generis mutational mechanism, specific to antibody somatic hypermutation. We illustrate this property of antibody sequences by a comparative analysis of the network modularity implied by the AB model and general amino acid substitution models. We recommend using the new model for computational studies of antibody sequence maturation, including inference of alignments and phylogenetic trees describing antibody somatic hypermutation in

  17. Comparison of Dixon Sequences for Estimation of Percent Breast Fibroglandular Tissue

    PubMed Central

    Ledger, Araminta E. W.; Scurr, Erica D.; Hughes, Julie; Macdonald, Alison; Wallace, Toni; Thomas, Karen; Wilson, Robin; Leach, Martin O.; Schmidt, Maria A.

    2016-01-01

    Objectives To evaluate sources of error in the Magnetic Resonance Imaging (MRI) measurement of percent fibroglandular tissue (%FGT) using two-point Dixon sequences for fat-water separation. Methods Ten female volunteers (median age: 31 yrs, range: 23–50 yrs) gave informed consent following Research Ethics Committee approval. Each volunteer was scanned twice following repositioning to enable an estimation of measurement repeatability from high-resolution gradient-echo (GRE) proton-density (PD)-weighted Dixon sequences. Differences in measures of %FGT attributable to resolution, T1 weighting and sequence type were assessed by comparison of this Dixon sequence with low-resolution GRE PD-weighted Dixon data, and against gradient-echo (GRE) or spin-echo (SE) based T1-weighted Dixon datasets, respectively. Results %FGT measurement from high-resolution PD-weighted Dixon sequences had a coefficient of repeatability of ±4.3%. There was no significant difference in %FGT between high-resolution and low-resolution PD-weighted data. Values of %FGT from GRE and SE T1-weighted data were strongly correlated with that derived from PD-weighted data (r = 0.995 and 0.96, respectively). However, both sequences exhibited higher mean %FGT by 2.9% (p < 0.0001) and 12.6% (p < 0.0001), respectively, in comparison with PD-weighted data; the increase in %FGT from the SE T1-weighted sequence was significantly larger at lower breast densities. Conclusion Although measurement of %FGT at low resolution is feasible, T1 weighting and sequence type impact on the accuracy of Dixon-based %FGT measurements; Dixon MRI protocols for %FGT measurement should be carefully considered, particularly for longitudinal or multi-centre studies. PMID:27011312

  18. Extraction of high quality k-words for alignment-free sequence comparison.

    PubMed

    Gunasinghe, Upuli; Alahakoon, Damminda; Bedingfield, Susan

    2014-10-01

    The weighted Euclidean distance (D(2)) is one of the earliest dissimilarity measures used for alignment free comparison of biological sequences. This distance measure and its variants have been used in numerous applications due to its fast computation, and many variants of it have been subsequently introduced. The D(2) distance measure is based on the count of k-words in the two sequences that are compared. Traditionally, all k-words are compared when computing the distance. In this paper we show that similar accuracy in sequence comparison can be achieved by using a selected subset of k-words. We introduce a term variance based quality measure for identifying the important k-words. We demonstrate the application of the proposed technique in phylogeny reconstruction and show that up to 99% of the k-words can be filtered out for certain datasets, resulting in faster sequence comparison. The paper also presents an exploratory analysis based evaluation of optimal k-word values and discusses the impact of using subsets of k-words in such optimal instances. PMID:24846728
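
    A k-word count comparison of the kind this abstract describes can be sketched as below. The sketch assumes the plain (uniformly weighted) squared Euclidean distance between k-word count vectors. The optional `words` argument only mimics restricting the comparison to a chosen subset of k-words; the paper's term-variance quality measure for choosing that subset is not implemented, and the function names are hypothetical.

```python
from collections import Counter


def kword_counts(seq, k):
    """Count every overlapping word of length k in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def d2_distance(seq_a, seq_b, k, words=None):
    """Squared Euclidean distance between the k-word count vectors.

    words: optional iterable of k-words to restrict the comparison to;
    by default every k-word seen in either sequence is used.
    """
    counts_a = kword_counts(seq_a, k)
    counts_b = kword_counts(seq_b, k)
    vocabulary = set(counts_a) | set(counts_b) if words is None else set(words)
    # Counter returns 0 for words absent from a sequence.
    return sum((counts_a[w] - counts_b[w]) ** 2 for w in vocabulary)
```

    Because the sum runs only over the selected words, dropping uninformative k-words shrinks the loop directly, which is where the speed-up reported in the abstract would come from.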

  19. Sequence comparison of JSRV with endogenous proviruses: envelope genotypes and a novel ORF with similarity to a G-protein-coupled receptor.

    PubMed

    Bai, J; Bishop, J V; Carlson, J O; DeMartini, J C

    1999-06-01

    Ovine pulmonary carcinoma, a contagious lung cancer of sheep, is caused by the oncogenic jaagsiekte sheep retrovirus (JSRV) that is closely related to a family of endogenous sheep retroviral sequences (ESRVs). By using exogenous virus-specific U3 oligonucleotide primers, the entire JSRV proviral genome or its 3' part was amplified from tumor DNA. Analysis of these proviral sequences revealed a novel open reading frame (ORF) within the pol coding region, designated ORF X, which was well conserved in ESRV and JSRV sequences. Deduced amino acids of ORF X showed similarity to a portion of the mammalian adenosine receptor subtype 3, a member of the G-protein-coupled receptor family. Comparison of deduced env amino acids of six JSRV strains from three continents identified 15 residues that defined two distinct genotypes of JSRVs. Sequence analysis identified two highly variable regions between JSRV and ESRV in the transmembrane domain of env (TM) and the 3' unique sequence (U3) of the long terminal repeat, from which JSRV-specific DNA probes were derived. By using these DNA probes in Southern hybridization, for the first time we successfully identified JSRV proviral sequences in tumor genomic DNA in the presence of multiple ESRV loci, validating the use of exogenous virus-specific DNA probes in the analysis of oncogenic proviral integration sites and identification of integrated exogenous proviral sequences. PMID:10366570

  20. [Comparison of demineralization of different organic acid to enamel].

    PubMed

    Liu, L; Yue, S; Jiang, H; Lu, T

    1998-05-01

    The rates of demineralization of bovine enamel by 5 organic acids (methanoic acid, formic acid, propionic acid, lactic acid, acetic acid, mixed acid) were tested and analysed with self-made calcium ion-selective microelectrodes (Ca(2+)-ISME) based on the neutral carrier ETH1001. The results showed: 1. The differences between the rates of demineralization of formic acid and lactic acid, formic acid and propionic acid, formic acid and acetic acid, acetic acid and mixed acid, acetic acid and lactic acid, propionic acid and mixed acid, propionic acid and lactic acid, and lactic acid and mixed acid were highly significant (P < 0.01); 2. The rates of demineralization of acetic and mixed acid decreased with time, due to saturation of the solution during demineralization; 3. Ca(2+)-ISME had the advantages of simplicity, rapidity, sensitivity and accuracy. The results suggest that the cariogenic potential is related to the different acid products of different cariogenic bacteria, and that the degree of mineral saturation within the solution affects the rate of demineralization. PMID:12214404

  1. A Multiple-Sequence Variant of the Multiple-Baseline Design: A Strategy for Analysis of Sequence Effects and Treatment Comparison.

    ERIC Educational Resources Information Center

    Noell, George H.; Gresham, Frank M.

    2001-01-01

    Describes design logic and potential uses of a variant of the multiple-baseline design. The multiple-baseline multiple-sequence (MBL-MS) consists of multiple-baseline designs that are interlaced with one another and include all possible sequences of treatments. The MBL-MS design appears to be primarily useful for comparison of treatments taking…

  2. Sequence comparison of new prokaryotic and mitochondrial members of the polypeptide chain release factor family predicts a five-domain model for release factor structure.

    PubMed Central

    Pel, H J; Rep, M; Grivell, L A

    1992-01-01

    We have recently reported the cloning and sequencing of the gene for the mitochondrial release factor mRF-1. mRF-1 displays high sequence similarity to the bacterial release factors RF-1 and RF-2. A database search for proteins resembling these three factors revealed high similarities to two amino acid sequences deduced from unassigned genomic reading frames in Escherichia coli and Bacillus subtilis. The amino acid sequence derived from the Bacillus reading frame is 47% identical to E.coli and Salmonella typhimurium RF-2, strongly suggesting that it represents B.subtilis RF-2. Our comparison suggests that the expression of the B.subtilis gene is, like that of the E.coli and S. typhimurium RF-2 genes, autoregulated by a stop codon dependent +1 frameshift. A comparison of prokaryotic and mitochondrial release factor sequences, including the putative B.subtilis RF-2, leads us to propose a five-domain model for release factor structure. Possible functions of the various domains are discussed. PMID:1408743

  3. Comparison between optimized GRE and RARE sequences for 19F MRI studies

    NASA Astrophysics Data System (ADS)

    Soffientini, Chiara D.; Mastropietro, Alfonso; Caffini, Matteo; Cocco, Sara; Zucca, Ileana; Scotti, Alessandro; Baselli, Giuseppe; Bruzzone, Maria Grazia

    2014-03-01

    In 19F-MRI studies limiting factors are the presence of a low signal due to the low concentration of 19F-nuclei, necessary for biological applications, and the inherent low sensitivity of MRI. Hence, acquiring images using the pulse sequence with the best signal to noise ratio (SNR) by optimizing the acquisition parameters specifically to a 19F compound is a core issue. In 19F-MRI, multiple-spin-echo (RARE) and gradient-echo (GRE) are the two most frequently used pulse sequence families; therefore we performed an optimization study of GRE pulse sequences based on numerical simulations and experimental acquisitions on fluorinated compounds. We compared GRE performance to an optimized RARE sequence. Images were acquired on a 7T MRI preclinical scanner on phantoms containing different fluorinated compounds. Actual relaxation times (T1, T2, T2*) were evaluated in order to predict SNR dependence on sequence parameters. Experimental comparisons between spoiled GRE and RARE, obtained at a fixed acquisition time and in steady state condition, showed RARE sequence outperforming the spoiled GRE (up to 406% higher). Conversely, the use of the unbalanced-SSFP showed a significant increase in SNR compared to RARE (up to 28% higher). Moreover, this sequence (as GRE in general) was confirmed to be virtually insensitive to T1 and T2 relaxation times, after proper optimization, thus improving marker independence from the biological environment. These results confirm the efficacy of the proposed optimization tool and foster further investigation addressing in-vivo applicability.

  4. Amino acid sequence of a vitamin K-dependent Ca2+-binding peptide from bovine prothrombin.

    PubMed

    Howard, J B; Fausch, M D

    1975-08-10

    The amino acid sequence of a 31-residue peptide from bovine prothrombin has been determined. This peptide has been shown to contain the vitamin K-dependent modification required for Ca2+ binding (Nelsestuen, G. L., and Suttie, J. W. (1973) Proc. Natl. Acad. Sci. U. S. A. 70, 3366-3370) and the modified amino acid, gamma-carboxyglutamic acid (Nelsestuen, G. L., Zytkovicz, T., and Howard, J. B. (1974) J. Biol. Chem. 249, 6347-6350). The peptide was shown to correspond to residues 12 to 42 of prothrombin. PMID:807581

  5. Amino acid sequences around the cysteine residues of rabbit muscle triose phosphate isomerase

    PubMed Central

    Miller, Janet C.; Waley, S. G.

    1971-01-01

    1. The nature of the subunits in rabbit muscle triose phosphate isomerase has been investigated. 2. Amino acid analyses show that there are five cysteine residues and two methionine residues/subunit. 3. The amino acid sequences around the cysteine residues have been determined; these account for about 75 residues. 4. Cleavage at the methionine residues with cyanogen bromide gave three fragments. 5. These results show that the subunits correspond to polypeptide chains, containing about 230 amino acid residues. The chains in triose phosphate isomerase seem to be shorter than those of other glycolytic enzymes. PMID:5165707

  6. Definition and Analysis of a System for the Automated Comparison of Curriculum Sequencing Algorithms in Adaptive Distance Learning

    ERIC Educational Resources Information Center

    Limongelli, Carla; Sciarrone, Filippo; Temperini, Marco; Vaste, Giulia

    2011-01-01

    LS-Lab provides automatic support to comparison/evaluation of the Learning Object Sequences produced by different Curriculum Sequencing Algorithms. Through this framework a teacher can verify the correspondence between the behaviour of different sequencing algorithms and her pedagogical preferences. In fact the teacher can compare algorithms…

  7. Enzyme sequence similarity improves the reaction alignment method for cross-species pathway comparison

    SciTech Connect

    Ovacik, Meric A.; Androulakis, Ioannis P.

    2013-09-15

    Pathway-based information has become an important source of information for both establishing evolutionary relationships and understanding the mode of action of a chemical or pharmaceutical among species. Cross-species comparison of pathways can address two broad questions: comparison in order to inform evolutionary relationships and to extrapolate species differences used in a number of different applications including drug and toxicity testing. Cross-species comparison of metabolic pathways is complex as there are multiple features of a pathway that can be modeled and compared. Among the various methods that have been proposed, reaction alignment has emerged as the most successful at predicting phylogenetic relationships based on NCBI taxonomy. We propose an improvement of the reaction alignment method by accounting for sequence similarity in addition to reaction alignment method. Using nine species, including human and some model organisms and test species, we evaluate the standard and improved comparison methods by analyzing glycolysis and citrate cycle pathways conservation. In addition, we demonstrate how organism comparison can be conducted by accounting for the cumulative information retrieved from nine pathways in central metabolism as well as a more complete study involving 36 pathways common in all nine species. Our results indicate that reaction alignment with enzyme sequence similarity results in a more accurate representation of pathway specific cross-species similarities and differences based on NCBI taxonomy.

  8. Comparison of gene expression methods to identify genes responsive to perfluorooctane sulfonic acid.

    PubMed

    Hu, Wenyue; Jones, Paul D; Decoen, Wim; Newsted, John L; Giesy, John P

    2005-01-01

    Genome-wide expression techniques are being increasingly used to assess the effects of environmental contaminants. Oligonucleotide or cDNA microarray methods make possible the screening of large numbers of known sequences for a given model species, while differential display analysis makes possible analysis of the expression of all the genes from any species. We report a comparison of two currently popular methods for genome-wide expression analysis in rat hepatoma cells treated with perfluorooctane sulfonic acid. The two analyses provided complementary information. Approximately 5% of the 8000 genes analyzed by the GeneChip array were altered by a factor of three or greater. Differential display results were more difficult to interpret, since multiple gene products were present in most gel bands, so a probabilistic approach was used to determine which pathways were affected. The mechanistic interpretation derived from these two methods was in agreement, both showing similar alterations in a specific set of genes. PMID:21783471

  9. Complete amino acid sequence of the Mu heavy chain of a human IgM immunoglobulin.

    PubMed

    Putnam, F W; Florent, G; Paul, C; Shinoda, T; Shimizu, A

    1973-10-19

    The amino acid sequence of the mu chain of a human IgM immunoglobulin, including the location of all disulfide bridges and oligosaccharides, has been determined. The homology of the constant regions of immunoglobulin mu, gamma, alpha, and epsilon heavy chains reveals evolutionary relationships and suggests that two genes code for each heavy chain. PMID:4742735

  10. Draft Genome Sequence of the Butyric Acid Producer Clostridium tyrobutyricum Strain CIP I-776 (IFP923)

    PubMed Central

    Clément, Benjamin; Lopes Ferreira, Nicolas

    2016-01-01

    Here, we report the draft genome sequence of Clostridium tyrobutyricum CIP I-776 (IFP923), an efficient producer of butyric acid. The genome consists of a single chromosome of 3.19 Mb and provides useful data concerning the metabolic capacities of the strain. PMID:26941139

  11. Draft Genome Sequence of Perfluorooctane Acid-Degrading Bacterium Pseudomonas parafulva YAB-1

    PubMed Central

    Tang, Chongjian; Peng, Qingjing; Peng, Qingzhong

    2015-01-01

    Pseudomonas parafulva YAB-1, isolated from perfluorinated compound-contaminated soil, has the ability to degrade perfluorooctane acid (PFOA) compound. Here, we report the draft genome sequence and annotation of the PFOA-degrading bacterium P. parafulva YAB-1. The data provide the basis to investigate the molecular mechanism of PFOA metabolism. PMID:26337877

  12. Shotgun Sequencing Analysis of Trypanosoma cruzi I Sylvio X10/1 and Comparison with T. cruzi VI CL Brener

    PubMed Central

    Franzén, Oscar; Ochaya, Stephen; Sherwood, Ellen; Lewis, Michael D.; Llewellyn, Martin S.; Miles, Michael A.; Andersson, Björn

    2011-01-01

    Trypanosoma cruzi is the causative agent of Chagas disease, which affects more than 9 million people in Latin America. We have generated a draft genome sequence of the TcI strain Sylvio X10/1 and compared it to the TcVI reference strain CL Brener to identify lineage-specific features. We found virtually no differences in the core gene content of CL Brener and Sylvio X10/1 by presence/absence analysis, but 6 open reading frames from CL Brener were missing in Sylvio X10/1. Several multicopy gene families, including DGF, mucin, MASP and GP63 were found to contain substantially fewer genes in Sylvio X10/1, based on sequence read estimations. 1,861 small insertion-deletion events and 77,349 nucleotide differences, 23% of which were non-synonymous and associated with radical amino acid changes, further distinguish these two genomes. There were 336 genes indicated as under positive selection, 145 unique to T. cruzi in comparison to T. brucei and Leishmania. This study provides a framework for further comparative analyses of two major T. cruzi lineages and also highlights the need for sequencing more strains to understand fully the genomic composition of this parasite. PMID:21408126

  13. The amino acid sequence of cytochrome c-555 from the methane-oxidizing bacterium Methylococcus capsulatus.

    PubMed Central

    Ambler, R P; Dalton, H; Meyer, T E; Bartsch, R G; Kamen, M D

    1986-01-01

    The amino acid sequence of the cytochrome c-555 from the obligate methanotroph Methylococcus capsulatus strain Bath (N.C.I.B. 11132) was determined. It is a single polypeptide chain of 96 residues, binding a haem group through the cysteine residues at positions 19 and 22, and the only methionine residue is at position 59. The sequence does not closely resemble that of any other cytochrome c that has yet been characterized. Detailed evidence for the amino acid sequence of the protein has been deposited as Supplementary Publication SUP 50131 (12 pages) at the British Library Lending Division, Boston Spa, West Yorkshire LS23 7BQ, U.K., from whom copies are available on prepayment. PMID:3006666

  14. Allelic polymorphism in arabian camel ribonuclease and the amino acid sequence of bactrian camel ribonuclease.

    PubMed

    Welling, G W; Mulder, H; Beintema, J J

    1976-04-01

    Pancreatic ribonucleases from several species (whitetail deer, roe deer, guinea pig, and arabian camel) exhibit more than one amino acid at particular positions in their amino acid sequences. Since these enzymes were isolated from pooled pancreas, the origin of this heterogeneity is not clear. The pancreatic ribonucleases from 11 individual arabian camels (Camelus dromedarius) have been investigated with respect to the lysine-glutamine heterogeneity at position 103 (Welling et al., 1975). Six ribonucleases showed only one basic band and five showed two bands after polyacrylamide gel electrophoresis, suggesting a gene frequency of about 0.75 for the Lys gene and about 0.25 for the Gln gene. The amino acid sequence of bactrian camel (Camelus bactrianus) ribonuclease isolated from individual pancreatic tissue was determined and compared with that of arabian camel ribonuclease. The only difference was observed at position 103. In the ribonucleases from two unrelated bactrian camels, only glutamine was observed at that position. PMID:962846
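
    The gene frequencies quoted above follow from simple allele counting. The sketch below assumes, as the abstract's figures imply but do not state outright, that the six one-band animals are Lys/Lys homozygotes and the five two-band animals are Lys/Gln heterozygotes; the function name is hypothetical.

```python
def allele_frequencies(n_one_band, n_two_band):
    """Return (Lys frequency, Gln frequency) by counting alleles.

    Assumption: one-band animals carry two Lys alleles and two-band
    animals carry one Lys and one Gln allele.
    """
    total_alleles = 2 * (n_one_band + n_two_band)
    lys_alleles = 2 * n_one_band + n_two_band
    gln_alleles = n_two_band
    return lys_alleles / total_alleles, gln_alleles / total_alleles


lys_freq, gln_freq = allele_frequencies(n_one_band=6, n_two_band=5)
print(f"Lys: {lys_freq:.3f}  Gln: {gln_freq:.3f}")  # Lys: 0.773  Gln: 0.227
```

    With 11 animals (22 alleles) this gives 17/22 and 5/22, in line with the "about 0.75" and "about 0.25" reported.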

  15. Use of a structural alphabet to find compatible folds for amino acid sequences

    PubMed Central

    Mahajan, Swapnil; de Brevern, Alexandre G; Sanejouand, Yves-Henri; Srinivasan, Narayanaswamy; Offmann, Bernard

    2015-01-01

    The structural annotation of proteins with no detectable homologs of known 3D structure identified using sequence-search methods is a major challenge today. We propose an original method that computes the conditional probabilities for the amino-acid sequence of a protein to fit to known protein 3D structures using a structural alphabet, known as “Protein Blocks” (PBs). PBs constitute a library of 16 local structural prototypes that approximate every part of protein backbone structures. It is used to encode 3D protein structures into 1D PB sequences and to capture sequence to structure relationships. Our method relies on amino acid occurrence matrices, one for each PB, to score global and local threading of query amino acid sequences to protein folds encoded into PB sequences. It does not use any information from residue contacts or sequence-search methods or explicit incorporation of hydrophobic effect. The performance of the method was assessed with independent test datasets derived from SCOP 1.75A. With a Z-score cutoff that achieved 95% specificity (i.e., less than 5% false positives), global and local threading showed sensitivity of 64.1% and 34.2%, respectively. We further tested its performance on 57 difficult CASP10 targets that had no known homologs in PDB: 38 compatible templates were identified by our approach and 66% of these hits yielded correctly predicted structures. This method scales-up well and offers promising perspectives for structural annotations at genomic level. It has been implemented in the form of a web-server that is freely available at http://www.bo-protscience.fr/forsa. PMID:25297700

  16. Use of a structural alphabet to find compatible folds for amino acid sequences.

    PubMed

    Mahajan, Swapnil; de Brevern, Alexandre G; Sanejouand, Yves-Henri; Srinivasan, Narayanaswamy; Offmann, Bernard

    2015-01-01

    The structural annotation of proteins with no detectable homologs of known 3D structure identified using sequence-search methods is a major challenge today. We propose an original method that computes the conditional probabilities for the amino-acid sequence of a protein to fit to known protein 3D structures using a structural alphabet, known as "Protein Blocks" (PBs). PBs constitute a library of 16 local structural prototypes that approximate every part of protein backbone structures. It is used to encode 3D protein structures into 1D PB sequences and to capture sequence to structure relationships. Our method relies on amino acid occurrence matrices, one for each PB, to score global and local threading of query amino acid sequences to protein folds encoded into PB sequences. It does not use any information from residue contacts or sequence-search methods or explicit incorporation of hydrophobic effect. The performance of the method was assessed with independent test datasets derived from SCOP 1.75A. With a Z-score cutoff that achieved 95% specificity (i.e., less than 5% false positives), global and local threading showed sensitivity of 64.1% and 34.2%, respectively. We further tested its performance on 57 difficult CASP10 targets that had no known homologs in PDB: 38 compatible templates were identified by our approach and 66% of these hits yielded correctly predicted structures. This method scales-up well and offers promising perspectives for structural annotations at genomic level. It has been implemented in the form of a web-server that is freely available at http://www.bo-protscience.fr/forsa. PMID:25297700

  17. Software scripts for quality checking of high-throughput nucleic acid sequencers.

    PubMed

    Lazo, G R; Tong, J; Miller, R; Hsia, C; Rausch, C; Kang, Y; Anderson, O D

    2001-06-01

    We have developed a graphical interface to allow the researcher to view and assess the quality of sequencing results using a series of program scripts developed to process data generated by automated sequencers. The scripts are written in the Perl programming language and are executable under the cgi-bin directory of a Web server environment. The scripts direct nucleic acid sequencing trace file data output from automated sequencers to be analyzed by the phred molecular biology program and are displayed as graphical hypertext mark-up language (HTML) pages. The scripts are mainly designed to handle 96-well microtiter dish samples, but the scripts are also able to read data from 384-well microtiter dishes 96 samples at a time. The scripts may be customized for different laboratory environments and computer configurations. Web links to the sources and discussion page are provided. PMID:11414222

  18. A Comparison of the First Two Sequenced Chloroplast Genomes in Asteraceae: Lettuce and Sunflower

    SciTech Connect

    Timme, Ruth E.; Kuehl, Jennifer V.; Boore, Jeffrey L.; Jansen, Robert K.

    2006-01-20

    Asteraceae is the second largest family of plants, with over 20,000 species. For the past few decades, numerous phylogenetic studies have contributed to our understanding of the evolutionary relationships within this family, including comparisons of the fast evolving chloroplast gene, ndhF, rbcL, as well as non-coding DNA from the trnL intron plus the trnL-trnF intergenic spacer, matK, and, with lesser resolution, psbA-trnH. This culminated in a study by Panero and Funk in 2002 that used over 13,000 bp per taxon for the largest taxonomic revision of Asteraceae in over a hundred years. Still, some uncertainties remain, and it would be very useful to have more information on the relative rates of sequence evolution among various genes and on genome structure as a potential set of phylogenetic characters to help guide future phylogenetic structures. By way of contributing to this, we report the first two complete chloroplast genome sequences from members of the Asteraceae, those of Helianthus annuus and Lactuca sativa. These plants belong to two distantly related subfamilies, Asteroideae and Cichorioideae, respectively. In addition to these, there is only one other published chloroplast genome sequence for any plant within the larger group called Euasterids II, that of Panax ginseng (Araliaceae, 156,318 bps, AY582139). Early chloroplast genome mapping studies demonstrated that H. annuus and L. sativa share a 22 kb inversion relative to members of the subfamily Barnadesioideae. By comparison to outgroups, this inversion was shown to be derived, indicating that the Asteroideae and Cichorioideae are more closely related than either is to the Barnadesioideae. A later sequencing study found that taxa that share this 22 kb inversion also contain within this region a second, smaller, 3.3 kb inversion. These sequences also enable an analysis of patterns of shared repeats in the genomes at a fine level and of RNA editing by comparison to available EST sequences.
In addition, since

  19. Amino acid racemization dating of fossil bones, I. inter-laboratory comparison of racemization measurements

    USGS Publications Warehouse

    Bada, J.L.; Hoopes, E.; Darling, D.; Dungworth, G.; Kessels, H.J.; Kvenvolden, K.A.; Blunt, D.J.

    1979-01-01

    Enantiomeric measurements for aspartic acid, glutamic acid, and alanine in twenty-one different fossil bone samples have been carried out by three different laboratories using different analytical methods. These inter-laboratory comparisons demonstrate that D/L aspartic acid measurements are highly reproducible, whereas the enantiomeric measurements for the other amino acids show a wide variation between the three laboratories. At present, aspartic acid measurements are the most suitable for racemization dating of bone because of their superior analytical precision.

  20. Efficient Nucleic Acid Extraction and 16S rRNA Gene Sequencing for Bacterial Community Characterization.

    PubMed

    Anahtar, Melis N; Bowman, Brittany A; Kwon, Douglas S

    2016-01-01

    There is a growing appreciation for the role of microbial communities as critical modulators of human health and disease. High throughput sequencing technologies have allowed for the rapid and efficient characterization of bacterial communities using 16S rRNA gene sequencing from a variety of sources. Although readily available tools for 16S rRNA sequence analysis have standardized computational workflows, sample processing for DNA extraction remains a continued source of variability across studies. Here we describe an efficient, robust, and cost effective method for extracting nucleic acid from swabs. We also delineate downstream methods for 16S rRNA gene sequencing, including generation of sequencing libraries, data quality control, and sequence analysis. The workflow can accommodate multiple samples types, including stool and swabs collected from a variety of anatomical locations and host species. Additionally, recovered DNA and RNA can be separated and used for other applications, including whole genome sequencing or RNA-seq. The method described allows for a common processing approach for multiple sample types and accommodates downstream analysis of genomic, metagenomic and transcriptional information. PMID:27168460

  1. Efficient Nucleic Acid Extraction and 16S rRNA Gene Sequencing for Bacterial Community Characterization

    PubMed Central

    Anahtar, Melis N.; Bowman, Brittany A.; Kwon, Douglas S.

    2016-01-01

    There is a growing appreciation for the role of microbial communities as critical modulators of human health and disease. High throughput sequencing technologies have allowed for the rapid and efficient characterization of bacterial communities using 16S rRNA gene sequencing from a variety of sources. Although readily available tools for 16S rRNA sequence analysis have standardized computational workflows, sample processing for DNA extraction remains a continued source of variability across studies. Here we describe an efficient, robust, and cost effective method for extracting nucleic acid from swabs. We also delineate downstream methods for 16S rRNA gene sequencing, including generation of sequencing libraries, data quality control, and sequence analysis. The workflow can accommodate multiple samples types, including stool and swabs collected from a variety of anatomical locations and host species. Additionally, recovered DNA and RNA can be separated and used for other applications, including whole genome sequencing or RNA-seq. The method described allows for a common processing approach for multiple sample types and accommodates downstream analysis of genomic, metagenomic and transcriptional information. PMID:27168460

  2. Preparation of Nucleic Acid Libraries for Personalized Sequencing Systems Using an Integrated Microfluidic Hub Technology (Seventh Annual Sequencing, Finishing, Analysis in the Future (SFAF) Meeting 2012)

    ScienceCinema

    Patel, Kamlesh D [Ken]; SNL,

    2013-01-25

    Kamlesh (Ken) Patel from Sandia National Laboratories (Livermore, California) presents "Preparation of Nucleic Acid Libraries for Personalized Sequencing Systems Using an Integrated Microfluidic Hub Technology" at the 7th Annual Sequencing, Finishing, Analysis in the Future (SFAF) Meeting held in June, 2012 in Santa Fe, NM.

  3. The amino acid sequence of ribonuclease U2 from Ustilago sphaerogena.

    PubMed Central

    Sato, S; Uchida, T

    1975-01-01

    1. RNAase (ribonuclease) U2, a purine-specific RNAase, was reduced, aminoethylated and hydrolysed with trypsin, chymotrypsin and thermolysin. On the basis of the analyses of the resulting peptides, the complete amino acid sequence of RNAase U2 was determined. 2. When the sequence was compared with the amino acid sequence of RNAase T1 (EC 3.1.4.8), the following regions were found to be similar in the two enzymes: Tyr-Pro-His-Gln-Tyr (38-42) in RNAase U2 and Tyr-Pro-His-Lys-Tyr (38-42) in RNAase T1, Glu-Phe-Pro-Leu-Val (61-65) in RNAase U2 and Glu-Trp-Pro-Ile-Leu (58-62) in RNAase T1, Asp-Arg-Val-Ile-Tyr-Gln (83-88) in RNAase U2 and Asp-Arg-Val-Phe-Asn (76-81) in RNAase T1 and Val-Thr-His-Thr-Gly-Ala (98-103) in RNAase U2 and Ile-Thr-His-Thr-Gly-Ala (90-95) in RNAase T1. All of the amino acid residues, histidine-40, glutamate-58, arginine-77 and histidine-92, which were found to play a crucial role in the biological activity of RNAase T1, were included in the regions cited here. 3. Detailed evidence for the amino acid sequence of the proteins has been deposited as Supplementary Publication SUP 50041 (33 pages) at the British Library (Lending Division) (formerly the National Lending Library for Science and Technology), Boston Spa, Yorks. LS23 7BQ, U.K., from whom copies can be obtained on the terms indicated in Biochem. J. (1975), 145, 5. PMID:1156364
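
    The similarity of the regions listed above can be quantified as position-by-position identity. The sketch below is only an illustration, not the comparison method used in the paper: it assumes the paired segments are aligned without gaps, uses a hypothetical helper name, and leaves out the Asp-Arg-Val pair because its two segments differ in length as printed in the abstract.

```python
# Segment pairs copied from the abstract: (RNAase U2, RNAase T1).
REGIONS = [
    ("Tyr-Pro-His-Gln-Tyr", "Tyr-Pro-His-Lys-Tyr"),
    ("Glu-Phe-Pro-Leu-Val", "Glu-Trp-Pro-Ile-Leu"),
    ("Val-Thr-His-Thr-Gly-Ala", "Ile-Thr-His-Thr-Gly-Ala"),
]


def identity(seg_a, seg_b):
    """Count identical residues between two equal-length, ungapped segments."""
    res_a, res_b = seg_a.split("-"), seg_b.split("-")
    if len(res_a) != len(res_b):
        raise ValueError("segments must have the same number of residues")
    matches = sum(a == b for a, b in zip(res_a, res_b))
    return matches, len(res_a)


for u2, t1 in REGIONS:
    matches, length = identity(u2, t1)
    print(f"{u2} vs {t1}: {matches}/{length} identical")
```

    Run as written, the three pairs give 4/5, 2/5 and 5/6 identical residues.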

  4. Deduced amino acid sequence of human pulmonary surfactant proteolipid: SPL(pVal)

    SciTech Connect

    Whitsett, J.A.; Glasser, S.W.; Korfhagen, T.R.; Weaver, T.E.; Clark, J.; Pilot-Matias, T.; Meuth, J.; Fox, J.L.

    1987-05-01

    Hydrophobic, proteolipid-like protein of Mr 6500 was isolated from ether/ethanol extracts of human, canine and bovine pulmonary surfactant. Amino acid composition of the protein demonstrated a remarkable abundance of hydrophobic residues, particularly valine and leucine. The N-terminal amino acid sequence of the human protein was determined: N-Leu-Ile-Pro-Cys-Cys-Pro-Val-Asn-Leu-Lys-Arg-Leu-Leu-Ile-Val4... An oligonucleotide probe was used to screen an adult human lung cDNA library and resulted in detection of cDNA clones with predicted amino acid sequence with close identity to the N-terminal amino acid sequence of the human peptide. SPL(pVal) was found within the reading frame of a larger peptide. SPL(pVal) results from proteolytic processing of a larger preprotein. Northern blot analysis detected a single 1.0 kilobase SPL(pVal) RNA which was less abundant in fetal than in adult lung. Mixtures of purified canine and bovine SPL(pVal) and synthetic phospholipids display properties of rapid adsorption and surface tension lowering activity characteristic of surfactant. Human SPL(pVal) is a pulmonary surfactant proteolipid which may therefore be useful in combination with phospholipids and/or other surfactant proteins for the treatment of surfactant deficiency such as hyaline membrane disease in newborn infants.

  5. Complete nucleic acid sequence of Penaeus stylirostris densovirus (PstDNV) from India.

    PubMed

    Rai, Praveen; Safeena, Muhammed P; Karunasagar, Iddya; Karunasagar, Indrani

    2011-06-01

    Infectious hypodermal and hematopoietic necrosis virus (IHHNV) of shrimp has recently been classified as Penaeus stylirostris densovirus (PstDNV). The complete nucleic acid sequence of PstDNV from India was obtained by cloning and sequencing of different DNA fragments of the virus. The genome organisation of PstDNV revealed that there were three major coding domains: a left ORF (NS1) of 2001 bp, a mid ORF (NS2) of 1092 bp and a right ORF (VP) of 990 bp. The complete genome and amino acid sequences of three proteins viz., NS1, NS2 and VP were compared with the genomes of the virus reported from Hawaii, China and Mexico and with partial sequence available from isolates from different regions. The phylogenetic analysis of shrimp, insect and vertebrate parvovirus sequences showed that the Indian PstDNV isolate is phylogenetically more closely related to one of the three isolates from Taiwan (AY355307), and two isolates (AY362547 and AY102034) from Thailand. PMID:21402111

  6. Human liver type pyruvate kinase: complete amino acid sequence and the expression in mammalian cells.

    PubMed Central

    Tani, K; Fujii, H; Nagata, S; Miwa, S

    1988-01-01

    Pyruvate kinase (PK) has four isozymes (L, R, M1, M2) that are encoded by two different genes. Among these isozymes, abnormalities of liver (L)-type PK are considered to be associated with hereditary nonspherocytic hemolytic anemia in humans. We isolated and determined the full-length sequence of human L-type PK cDNA. The cDNA contains 1629 base pairs encoding 543 amino acids, 68 base pairs of 5'-noncoding sequence, and 734 base pairs of 3'-noncoding sequence. The similarity between human and rat L-type PK was 86.9% at the nucleotide sequence level and 92.4% at the amino acid sequence level. The full-length L-type PK cDNA was placed under the promoter of simian virus 40 and introduced into monkey COS cells. Human L-type PK activity was detected in the extract of COS cells by the classical PK electrophoresis method. PMID:3126495
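
    The figures in this abstract can be checked with codon arithmetic: at three bases per residue, 1629 coding base pairs correspond to 543 amino acids. The short sketch below only restates that arithmetic; the variable and function names are hypothetical, and the summed length is simply the total of the three figures quoted, not a cDNA length reported by the authors.

```python
CODING_BP = 1629          # coding region, from the abstract
FIVE_PRIME_BP = 68        # 5'-noncoding sequence, from the abstract
THREE_PRIME_BP = 734      # 3'-noncoding sequence, from the abstract


def codons(coding_bp):
    """Number of whole codons in a coding region of the given length."""
    if coding_bp % 3 != 0:
        raise ValueError("coding length is not a multiple of three")
    return coding_bp // 3


residues = codons(CODING_BP)
total_bp = FIVE_PRIME_BP + CODING_BP + THREE_PRIME_BP
print(f"{CODING_BP} bp / 3 = {residues} residues")  # 1629 bp / 3 = 543 residues
print(f"sum of quoted segments: {total_bp} bp")     # sum of quoted segments: 2431 bp
```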

  7. Human liver type pyruvate kinase: Complete amino acid sequence and the expression in mammalian cells

    SciTech Connect

    Tani, Kenzaburo; Nagata, Shigekazu; Fujii, Hisaichi; Miwa, Shiro

    1988-03-01

    Pyruvate kinase (PK) has four isozymes (L, R, M1, M2) that are encoded by two different genes. Among these isozymes, abnormalities of liver (L)-type PK are considered to be associated with hereditary nonspherocytic hemolytic anemia in humans. The authors isolated and determined the full-length sequence of human L-type PK cDNA. The cDNA contains 1,629 base pairs encoding 543 amino acids, 68 base pairs of 5'-noncoding sequence, and 734 base pairs of 3'-noncoding sequence. The similarity between human and rat L-type PK was 86.9% at the nucleotide sequence level and 92.4% at the amino acid sequence level. The full-length L-type PK cDNA was placed under the promoter of simian virus 40 and introduced into monkey COS cells. Human L-type PK activity was detected in the extract of COS cells by the classical PK electrophoresis method.

  8. Molecular cytogenetics by polymerase catalyzed amplification or in situ labelling of specific nucleic acid sequences

    SciTech Connect

    Bolund, L.; Brandt, C.; Hindkjaer, J.; Koch, J.; Koelvraa, S.; Pedersen, S.

    1993-01-01

    The Polymerase Chain Reaction (PCR) can be performed on isolated cells or chromosomes and the product can be analyzed by DNA technology or by FISH to test metaphases. The authors have good experiences analyzing aberrant chromosomes by FACS sorting, PCR with degenerated primers and painting of test metaphases with the PCR product. They also utilize polymerases for PRimed IN Situ labelling (PRINS) of specific nucleic acid sequences. In PRINS oligonucleotides are hybridized to their target sequences and labeled nucleotides are incorporated at the site of hybridization with the oligonucleotide as primer. PRINS may eventually allow the study of individual genes, gene expression and even somatic mutations (in mRNA) in single cells.

  9. DNA Cloning of Plasmodium falciparum Circumsporozoite Gene: Amino Acid Sequence of Repetitive Epitope

    NASA Astrophysics Data System (ADS)

    Enea, Vincenzo; Ellis, Joan; Zavala, Fidel; Arnot, David E.; Asavanich, Achara; Masuda, Aoi; Quakyi, Isabella; Nussenzweig, Ruth S.

    1984-08-01

    A clone of complementary DNA encoding the circumsporozoite (CS) protein of the human malaria parasite Plasmodium falciparum has been isolated by screening an Escherichia coli complementary DNA library with a monoclonal antibody to the CS protein. The DNA sequence of the complementary DNA insert encodes a four-amino acid sequence: proline-asparagine-alanine-asparagine, tandemly repeated 23 times. The CS β-lactamase fusion protein specifically binds monoclonal antibodies to the CS protein and inhibits the binding of these antibodies to native Plasmodium falciparum CS protein. These findings provide a basis for the development of a vaccine against Plasmodium falciparum malaria.

  10. Method for high-volume sequencing of nucleic acids: random and directed priming with libraries of oligonucleotides

    DOEpatents

    Studier, F.W.

    1995-04-18

    Random and directed priming methods for determining nucleotide sequences by enzymatic sequencing techniques, using libraries of primers of lengths 8, 9 or 10 bases, are disclosed. These methods permit direct sequencing of nucleic acids as large as 45,000 base pairs or larger without the necessity for subcloning. Individual primers are used repeatedly to prime sequence reactions in many different nucleic acid molecules. Libraries containing as few as 10,000 octamers, 14,200 nonamers, or 44,000 decamers would have the capacity to determine the sequence of almost any cosmid DNA. Random priming with a fixed set of primers from a smaller library can also be used to initiate the sequencing of individual nucleic acid molecules, with the sequence being completed by directed priming with primers from the library. In contrast to random cloning techniques, a combined random and directed priming strategy is far more efficient. 2 figs.
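
    To put the library sizes quoted above in context, they can be compared with the number of possible primers of each length, which is 4 to the power of the length for a four-base alphabet. This is a minimal sketch of that arithmetic only; it does not model how the primers were selected, and the names are hypothetical.

```python
# Library sizes quoted in the abstract, keyed by primer length in bases.
LIBRARY_SIZES = {8: 10_000, 9: 14_200, 10: 44_000}


def possible_primers(length):
    """Number of distinct sequences of the given length over A, C, G, T."""
    return 4 ** length


def library_fraction(length, library_size):
    """Fraction of all possible primers of this length held in the library."""
    return library_size / possible_primers(length)


for length, size in LIBRARY_SIZES.items():
    total = possible_primers(length)
    print(f"{length}-mers: {size:,} of {total:,} "
          f"({library_fraction(length, size):.1%})")
```

    The quoted libraries therefore cover roughly 15% of all octamers, 5% of all nonamers and 4% of all decamers.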

  11. Method for high-volume sequencing of nucleic acids: random and directed priming with libraries of oligonucleotides

    DOEpatents

    Studier, F. William

    1995-04-18

    Random and directed priming methods for determining nucleotide sequences by enzymatic sequencing techniques, using libraries of primers of lengths 8, 9 or 10 bases, are disclosed. These methods permit direct sequencing of nucleic acids as large as 45,000 base pairs or larger without the necessity for subcloning. Individual primers are used repeatedly to prime sequence reactions in many different nucleic acid molecules. Libraries containing as few as 10,000 octamers, 14,200 nonamers, or 44,000 decamers would have the capacity to determine the sequence of almost any cosmid DNA. Random priming with a fixed set of primers from a smaller library can also be used to initiate the sequencing of individual nucleic acid molecules, with the sequence being completed by directed priming with primers from the library. In contrast to random cloning techniques, a combined random and directed priming strategy is far more efficient.

  12. Partial amino acid sequence of apolipoprotein(a) shows that it is homologous to plasminogen

    SciTech Connect

    Eaton, D.L.; Fless, G.M.; Kohr, W.J.; McLean, J.W.; Xu, Q.T.; Miller, C.G.; Lawn, R.M.; Scanu, A.M.

    1987-05-01

    Apolipoprotein(a) (apo(a)) is a glycoprotein with Mr approx. 280,000 that is disulfide linked to apolipoprotein B in lipoprotein(a) particles. Elevated plasma levels of lipoprotein(a) are correlated with atherosclerosis. Partial amino acid sequence of apo(a) shows that it has striking homology to plasminogen. Plasminogen is a plasma serine protease zymogen that consists of five homologous and tandemly repeated domains called kringles and a trypsin-like protease domain. The amino-terminal sequence obtained for apo(a) is homologous to the beginning of kringle 4 but not the amino terminus of plasminogen. Apo(a) was subjected to limited proteolysis by trypsin or V8 protease, and fragments generated were isolated and sequenced. Sequences obtained from several of these fragments are highly (77-100%) homologous to plasminogen residues 391-421, which reside within kringle 4. Analysis of these internal apo(a) sequences revealed that apo(a) may contain at least two kringle 4-like domains. A sequence obtained from another tryptic fragment also shows homology to the end of kringle 4 and the beginning of kringle 5. Sequence data obtained from the two tryptic fragments shows homology with the protease domain of plasminogen. One of these sequences is homologous to the sequences surrounding the activation site of plasminogen. Plasminogen is activated by the cleavage of a specific arginine residue by urokinase and tissue plasminogen activator; however, the corresponding site in apo(a) is a serine that would not be cleaved by tissue plasminogen activator or urokinase. Using a plasmin-specific assay, no proteolytic activity could be demonstrated for lipoprotein(a) particles. These results suggest that apo(a) contains kringle-like domains and an inactive protease domain.

  13. The Evidence for α-Linolenic Acid and Cardiovascular Disease Benefits: Comparisons with Eicosapentaenoic Acid and Docosahexaenoic Acid

    PubMed Central

    Fleming, Jennifer A.; Kris-Etherton, Penny M.

    2014-01-01

    Our understanding of the cardiovascular disease (CVD) benefits of α-linolenic acid (ALA, 18:3n–3) has advanced markedly during the past decade. It is now evident that ALA benefits CVD risk. The expansion of the ALA evidence base has occurred in parallel with ongoing research on eicosapentaenoic acid (EPA, 20:5n–3) and docosahexaenoic acid (DHA, 22:6n–3) and CVD. The available evidence enables comparisons to be made for ALA vs. EPA + DHA for CVD risk reduction. The epidemiologic evidence suggests comparable benefits of plant-based and marine-derived n–3 (omega-3) PUFAs. The clinical trial evidence for ALA is not as extensive; however, there have been CVD event benefits reported. Those that have been reported for EPA + DHA are stronger because only EPA + DHA differed between the treatment and control groups, whereas in the ALA studies there were diet differences beyond ALA between the treatment and control groups. Despite this, the evidence suggests many comparable CVD benefits of ALA vs. EPA + DHA. Thus, we believe that it is time to revisit what the contemporary dietary recommendation should be for ALA to decrease the risk of CVD. Our perspective is that increasing dietary ALA will decrease CVD risk; however, randomized controlled clinical trials are necessary to confirm this and to determine what the recommendation should be. With a stronger evidence base, the nutrition community will be better positioned to revise the dietary recommendation for ALA for CVD risk reduction. PMID:25398754

  14. The evidence for α-linolenic acid and cardiovascular disease benefits: Comparisons with eicosapentaenoic acid and docosahexaenoic acid.

    PubMed

    Fleming, Jennifer A; Kris-Etherton, Penny M

    2014-11-01

    Our understanding of the cardiovascular disease (CVD) benefits of α-linolenic acid (ALA, 18:3n-3) has advanced markedly during the past decade. It is now evident that ALA benefits CVD risk. The expansion of the ALA evidence base has occurred in parallel with ongoing research on eicosapentaenoic acid (EPA, 20:5n-3) and docosahexaenoic acid (DHA, 22:6n-3) and CVD. The available evidence enables comparisons to be made for ALA vs. EPA + DHA for CVD risk reduction. The epidemiologic evidence suggests comparable benefits of plant-based and marine-derived n-3 (omega-3) PUFAs. The clinical trial evidence for ALA is not as extensive; however, there have been CVD event benefits reported. Those that have been reported for EPA + DHA are stronger because only EPA + DHA differed between the treatment and control groups, whereas in the ALA studies there were diet differences beyond ALA between the treatment and control groups. Despite this, the evidence suggests many comparable CVD benefits of ALA vs. EPA + DHA. Thus, we believe that it is time to revisit what the contemporary dietary recommendation should be for ALA to decrease the risk of CVD. Our perspective is that increasing dietary ALA will decrease CVD risk; however, randomized controlled clinical trials are necessary to confirm this and to determine what the recommendation should be. With a stronger evidence base, the nutrition community will be better positioned to revise the dietary recommendation for ALA for CVD risk reduction. PMID:25398754

  15. Effect of k-tuple length on sample-comparison with high-throughput sequencing data.

    PubMed

    Wang, Ying; Lei, Xiaoye; Wang, Shun; Wang, Zicheng; Song, Nianfeng; Zeng, Feng; Chen, Ting

    2016-01-22

    High-throughput metagenomic sequencing offers a powerful technique for comparing microbial communities. Without requiring extra reference sequences, alignment-free models with short k-tuples (k = 2-10 bp) have yielded promising results. Short k-tuples describe the overall statistical distribution but struggle to capture the specific characteristics within one microbial community. Longer k-tuples contain more information. However, because the frequency vector of long k-tuples (k ≥ 30 bp) is sparse, the statistical measures designed for short k-tuples are not applicable. In our study, we considered each tuple as a meaningful word and each sequencing dataset as a document composed of these words. The comparison between two sequencing datasets is therefore treated as "topic analysis of documents" in text mining. We designed a pipeline with long k-tuple features to compare metagenomic samples using algorithms from text mining and pattern recognition. The pipeline is available at http://culotuple.codeplex.com/. Experiments show that our pipeline with long k-tuple features: ① separates genomes with high similarity; ② outperforms short k-tuple models in all experiments (when k ≥ 12, the short k-tuple measures are no longer applicable; when k is between 20 and 40, the long k-tuple pipeline obtains much better grouping results); ③ is free from the effect of sequencing platforms/protocols; ④ obtains meaningful and supported biological results on the 40-tuples selected for comparison. PMID:26721429
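
    The sparsity argument in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline; the function names (`ktuple_counts`, `occupancy`) and the toy read are assumptions made for illustration only. The sketch counts overlapping k-tuples and reports what fraction of the 4^k possible tuples is actually observed.

```python
from collections import Counter


def ktuple_counts(seq: str, k: int) -> Counter:
    """Count every overlapping k-tuple (k-mer) in a DNA sequence."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def occupancy(seq: str, k: int) -> float:
    """Fraction of the 4**k possible k-tuples seen at least once."""
    return len(ktuple_counts(seq, k)) / 4 ** k


# Toy read, invented for illustration (not data from the study).
read = "ACGTACGTTAGCACGTACGT"

for k in (2, 12):
    print(k, len(ktuple_counts(read, k)), occupancy(read, k))
```

    For small k the frequency vector is dense enough to compare directly; for large k almost every entry is zero, which is the situation in which the abstract says short-tuple statistics stop being applicable.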

  16. The Complete Genome Sequence of the Lactic Acid Bacterium Lactococcus lactis ssp. lactis IL1403

    PubMed Central

    Bolotin, Alexander; Wincker, Patrick; Mauger, Stéphane; Jaillon, Olivier; Malarme, Karine; Weissenbach, Jean; Ehrlich, S. Dusko; Sorokin, Alexei

    2001-01-01

    Lactococcus lactis is a nonpathogenic AT-rich gram-positive bacterium closely related to the genus Streptococcus and is the most commonly used cheese starter. It is also the best-characterized lactic acid bacterium. We sequenced the genome of the laboratory strain IL1403, using a novel two-step strategy that comprises diagnostic sequencing of the entire genome and a shotgun polishing step. The genome contains 2,365,589 base pairs and encodes 2310 proteins, including 293 protein-coding genes belonging to six prophages and 43 insertion sequence (IS) elements. Nonrandom distribution of IS elements indicates that the chromosome of the sequenced strain may be a product of recent recombination between two closely related genomes. A complete set of late competence genes is present, indicating the ability of L. lactis to undergo DNA transformation. Genomic sequence revealed new possibilities for fermentation pathways and for aerobic respiration. It also indicated a horizontal transfer of genetic information from Lactococcus to gram-negative enteric bacteria of Salmonella-Escherichia group. [The sequence data described in this paper has been submitted to the GenBank data library under accession no. AE005176.] PMID:11337471

  17. On human disease-causing amino acid variants: statistical study of sequence and structural patterns

    PubMed Central

    Alexov, Emil

    2015-01-01

    Statistical analysis was carried out on a large set of naturally occurring human amino acid variations and it was demonstrated that there is a preference for some amino acid substitutions to be associated with diseases. At an amino acid sequence level, it was shown that the disease-causing variants frequently involve drastic changes of amino acid physico-chemical properties of proteins such as charge, hydrophobicity and geometry. Structural analysis of variants involved in diseases and being frequently observed in the human population showed similar trends: disease-causing variants tend to cause more changes of hydrogen bond network and salt bridges as compared with harmless amino acid mutations. Analysis of thermodynamics data reported in literature, both experimental and computational, indicated that disease-causing variants tend to destabilize proteins and their interactions, which prompted us to investigate the effects of amino acid mutations on large databases of experimentally measured energy changes in unrelated proteins. Although the experimental datasets were linked neither to diseases nor exclusive to human proteins, the observed trends were the same: amino acid mutations tend to destabilize proteins and their interactions. Having in mind that structural and thermodynamics properties are interrelated, it is pointed out that any large change of any of them is anticipated to cause a disease. PMID:25689729

  18. Self-sequencing of amino acids and origins of polyfunctional protocells.

    PubMed

    Fox, S W

    1984-01-01

    The primal role of the origins of proteins in molecular evolution is discussed. On the basis of this premise, the significance of the experimentally established self-sequencing of amino acids under simulated geological conditions is explained as due to the fact that the products are highly nonrandom and accordingly contain many kinds of information. When such thermal proteins are aggregated into laboratory protocells, an action that occurs readily, the resultant protocells also contain many kinds of information. Residue-by-residue order, enzymic activities, and lipid quality accordingly occur within each preparation of proteinoid (thermal protein). In this paper are reviewed briefly the phenomenon of self-sequencing of amino acids, its relationship to evolutionary processes, other significance of such self-ordering, and the experimental evidence for original polyfunctional protocells. PMID:6462684

  19. Self-Sequencing of Amino Acids and Origins of Polyfunctional Protocells

    NASA Astrophysics Data System (ADS)

    Fox, Sidney W.

    1984-12-01

    The primal role of the origins of proteins in molecular evolution is discussed. On the basis of this premise, the significance of the experimentally established self-sequencing of amino acids under simulated geological conditions is explained as due to the fact that the products are highly nonrandom and accordingly contain many kinds of information. When such thermal proteins are aggregated into laboratory protocells, an action that occurs readily, the resultant protocells also contain many kinds of information. Residue-by-residue order, enzymic activities, and lipid quality accordingly occur within each preparation of proteinoid (thermal protein). In this paper are reviewed briefly the phenomenon of self-sequencing of amino acids, its relationship to evolutionary processes, other significance of such self-ordering, and the experimental evidence for original polyfunctional protocells.

  20. Sequence of morphological transitions in two-dimensional pattern growth from aqueous ascorbic acid solutions.

    PubMed

    Paranjpe, A S

    2002-08-12

    A sequence of morphological transitions in two-dimensional dehydration patterns of aqueous solutions of ascorbic acid is observed with humidity as a control parameter. Change in morphology occurs due to humidity induced variation in the concentration of the metastable supersaturated solution phase formed after initial solvent evaporation. As percent humidity is varied from 40 to 80, patterns change from compact circular --> radial --> density modulated radial (a new morphology) --> density modulated circular --> density modulated dendritic (a new morphology) --> dense branching. PMID:12190528

  1. Self-sequencing of amino acids and origins of polyfunctional protocells

    NASA Technical Reports Server (NTRS)

    Fox, S. W.

    1984-01-01

    The role of proteins in the origin of living things is discussed. It has been experimentally established that amino acids can sequence themselves under simulated geological conditions with highly nonrandom products which accordingly contain diverse information. Multiple copies of each type of macromolecule are formed, resulting in greater power for any protoenzymic molecule than would accrue from a single copy of each type. Thermal proteins are readily incorporated into laboratory protocells. The experimental evidence for original polyfunctional protocells is discussed.

  2. Snake venom. The amino acid sequence of protein A from Dendroaspis polylepis polylepis (black mamba) venom.

    PubMed

    Joubert, F J; Strydom, D J

    1980-12-01

    Protein A from Dendroaspis polylepis polylepis venom comprises 81 amino acids, including ten half-cystine residues. The complete primary structures of protein A and its variant A' were elucidated. The sequences of proteins A and A', which differ in a single position, show no homology with various neurotoxins and non-neurotoxic proteins and represent a new type of elapid venom protein. PMID:7461607

  3. Proteus mirabilis urease: nucleotide sequence determination and comparison with jack bean urease.

    PubMed

    Jones, B D; Mobley, H L

    1989-12-01

    Proteus mirabilis, a common cause of urinary tract infection, produces a potent urease that hydrolyzes urea to NH3 and CO2, initiating kidney stone formation. Urease genes, which were localized to a 7.6-kilobase-pair region of DNA, were sequenced by using the dideoxy method. Six open reading frames were found within a region of 4,952 base pairs which were predicted to encode polypeptides of 31.0 (ureD), 11.0 (ureA), 12.2 (ureB), 61.0 (ureC), 17.9 (ureE), and 23.0 (ureF) kilodaltons (kDa). Each open reading frame was preceded by a ribosome-binding site, with the exception of ureE. Putative promoterlike sequences were identified upstream of ureD, ureA, and ureF. Possible termination sites were found downstream of ureD, ureC, and ureF. Structural subunits of the enzyme were encoded by ureA, ureB, and ureC and were translated from a single transcript in the order of 11.0, 12.2, and 61.0 kDa. When the deduced amino acid sequences of the P. mirabilis urease subunits were compared with the amino acid sequence of the jack bean urease, significant amino acid similarity was observed (58% exact matches; 73% exact plus conservative replacements). The 11.0-kDa polypeptide aligned with the N-terminal residues of the plant enzyme, the 12.2-kDa polypeptide lined up with internal residues, and the 61.0-kDa polypeptide matched with the C-terminal residues, suggesting an evolutionary relationship of the urease genes of jack bean and P. mirabilis. PMID:2687233

  4. 37 CFR 1.822 - Symbols and format to be used for nucleotide and/or amino acid sequence data.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... approved by the Director of the Federal Register in accordance with 5 U.S.C. 552(a) and 1 CFR part 51... base or modified or unusual amino acid may be presented in a given sequence as the corresponding unmodified base or amino acid if the modified base or modified or unusual amino acid is one of those...

  5. 37 CFR 1.822 - Symbols and format to be used for nucleotide and/or amino acid sequence data.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... approved by the Director of the Federal Register in accordance with 5 U.S.C. 552(a) and 1 CFR part 51... base or modified or unusual amino acid may be presented in a given sequence as the corresponding unmodified base or amino acid if the modified base or modified or unusual amino acid is one of those...

  6. Nanopore Analysis of Nucleic Acids: Single-Molecule Studies of Molecular Dynamics, Structure, and Base Sequence

    NASA Astrophysics Data System (ADS)

    Olasagasti, Felix; Deamer, David W.

    Nucleic acids are linear polynucleotides in which each base is covalently linked to a pentose sugar and a phosphate group carrying a negative charge. If a pore having roughly the cross-sectional diameter of a single-stranded nucleic acid is embedded in a thin membrane and a voltage of 100 mV or more is applied, individual nucleic acids in solution can be captured by the electrical field in the pore and translocated through by single-molecule electrophoresis. The dimensions of the pore cannot accommodate anything larger than a single strand, so each base in the molecule passes through the pore in strict linear sequence. The nucleic acid strand occupies a large fraction of the pore's volume during translocation and therefore produces a transient blockade of the ionic current created by the applied voltage. If it could be demonstrated that each nucleotide in the polymer produced a characteristic modulation of the ionic current during its passage through the nanopore, the sequence of current modulations would reflect the sequence of bases in the polymer. According to this basic concept, nanopores are analogous to a Coulter counter that detects nanoscopic molecules rather than microscopic [1,2]. However, the advantage of nanopores is that individual macromolecules can be characterized because different chemical and physical properties affect their passage through the pore. Because macromolecules can be captured in the pore as well as translocated, the nanopore can be used to detect individual functional complexes that form between a nucleic acid and an enzyme. No other technique has this capability.

  7. Complete amino acid sequence of a histidine-rich proteolytic fragment of human ceruloplasmin.

    PubMed

    Kingston, I B; Kingston, B L; Putnam, F W

    1979-04-01

    The complete amino acid sequence has been determined for a fragment of human ceruloplasmin [ferroxidase; iron(II):oxygen oxidoreductase, EC 1.16.3.1]. The fragment (designated Cp F5) contains 159 amino acid residues and has a molecular weight of 18,650; it lacks carbohydrate, is rich in histidine, and contains one free cysteine that may be part of a copper-binding site. This fragment is present in most commercial preparations of ceruloplasmin, probably owing to proteolytic degradation, but can also be obtained by limited cleavage of single-chain ceruloplasmin with plasmin. Cp F5 probably is an intact domain attached to the COOH-terminal end of single-chain ceruloplasmin via a labile interdomain peptide bond. A model of the secondary structure predicted by empirical methods suggests that almost one-third of the amino acid residues are distributed in alpha helices, about a third in beta-sheet structure, and the remainder in beta turns and unidentified structures. Computer analysis of the amino acid sequence has not demonstrated a statistically significant relationship between this ceruloplasmin fragment and any other protein, but there is some evidence for an internal duplication. PMID:287005

  8. Comparison of Predicted Scaffold-Compatible Sequence Variation in the Triple-Hairpin Structure of Human Immunodeficiency Virus Type 1 gp41 with Patient Data

    PubMed Central

    Boutonnet, Nathalie; Janssens, Wouter; Boutton, Carlo; Verschelde, Jean-Luc; Heyndrickx, Leo; Beirnaert, Els; van der Groen, Guido; Lasters, Ignace

    2002-01-01

    It has been proposed that the ectodomain of human immunodeficiency virus type 1 (HIV-1) gp41 (e-gp41), involved in HIV entry into the target cell, exists in at least two conformations, a pre-hairpin intermediate and a fusion-active hairpin structure. To obtain more information on the structure-sequence relationship in e-gp41, we performed in silico a full single-amino-acid substitution analysis, resulting in a Fold Compatible Database (FCD) for each conformation. The FCD contains for each residue position in a given protein a list of values assessing the energetic compatibility (ECO) of each of the 20 natural amino acids at that position. Our results suggest that FCD predictions are in good agreement with the sequence variation observed for well-validated e-gp41 sequences. The data show that at a minECO threshold value of 5 kcal/mol, about 90% of the observed patient sequence variation is encompassed by the FCD predictions. Some inconsistent FCD predictions at N-helix positions packing against residues of the C helix suggest that packing of both peptides may involve some flexibility and may be attributed to an altered orientation of the C-helical domain versus the N-helical region. The permissiveness of sequence variation in the C helices is in agreement with FCD predictions. Comparison of N-core and triple-hairpin FCDs suggests that the N helices may impose more constraints on sequence variation than the C helices. Although the observed sequences of e-gp41 contain many multiple mutations, our method, which is based on single-point mutations, can predict the natural sequence variability of e-gp41 very well. PMID:12097573

  9. PipTools: a computational toolkit to annotate and analyze pairwise comparisons of genomic sequences.

    PubMed

    Elnitski, Laura; Riemer, Cathy; Petrykowska, Hanna; Florea, Liliana; Schwartz, Scott; Miller, Webb; Hardison, Ross

    2002-12-01

    Sequence conservation between species is useful both for locating coding regions of genes and for identifying functional noncoding segments. Hence interspecies alignment of genomic sequences is an important computational technique. However, its utility is limited without extensive annotation. We describe a suite of software tools, PipTools, and related programs that facilitate the annotation of genes and putative regulatory elements in pairwise alignments. The alignment server PipMaker uses the output of these tools to display detailed information needed to interpret alignments. These programs are provided in a portable format for use on common desktop computers and both the toolkit and the PipMaker server can be found at our Web site (http://bio.cse.psu.edu/). We illustrate the utility of the toolkit using annotation of a pairwise comparison of the mouse MHC class II and class III regions with orthologous human sequences and subsequently identify conserved, noncoding sequences that are DNase I hypersensitive sites in chromatin of mouse cells. PMID:12504859

  10. In Silico Genome Comparison and Distribution Analysis of Simple Sequences Repeats in Cassava

    PubMed Central

    Vásquez, Andrea; López, Camilo

    2014-01-01

    We conducted an SSR density analysis in different cassava genomic regions. The information obtained was useful to establish comparisons between cassava's SSR genomic distribution and those of poplar, flax, and Jatropha. In general, cassava has a low SSR density (~50 SSRs/Mbp) and a high proportion of pentanucleotides (24.2 SSRs/Mbp). It was found that coding sequences have 15.5 SSRs/Mbp, introns have 82.3 SSRs/Mbp, 5′ UTRs have 196.1 SSRs/Mbp, and 3′ UTRs have 50.5 SSRs/Mbp. Through motif analysis of cassava's genome SSRs, the most abundant motif was AT/AT, while in intron sequences and UTR regions it was AG/CT. In addition, in coding sequences the motif AAG/CTT was found to occur most frequently; in fact, it is the third most used codon in cassava. Sequences containing SSRs were classified according to their functional annotation in Gene Ontology categories. The SSRs identified here may be a valuable addition for genetic mapping and future studies in phylogenetic analyses and genomic evolution. PMID:25374887
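
    As a rough illustration of how an SSR density in SSRs/Mbp might be computed, the sketch below finds perfect tandem repeats with a regular expression and normalizes the count by region length. The abstract does not state the detection criteria, so the motif lengths, the minimum repeat count, the function names and the toy sequence are all assumptions for illustration, not the authors' method.

```python
import re


def find_ssrs(seq, min_len=2, max_len=6, min_repeats=3):
    """Return (start, motif, repeat_count) for perfect tandem repeats.

    Thresholds are illustrative assumptions, not values from the study.
    """
    seq = seq.upper()
    hits = []
    for m in range(min_len, max_len + 1):
        # A motif of length m followed by at least (min_repeats - 1) copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (m, min_repeats - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. "ATAT").
            if any(motif == motif[:d] * (m // d)
                   for d in range(1, m) if m % d == 0):
                continue
            hits.append((match.start(), motif, len(match.group(0)) // m))
    return hits


def ssr_density_per_mbp(n_ssrs, region_length_bp):
    """Number of SSRs per million base pairs of the region."""
    return n_ssrs * 1_000_000 / region_length_bp


# Toy sequence, invented for illustration.
toy = "GGCATATATATCCAAGAAGAAGTT"
ssrs = find_ssrs(toy)
print(ssrs)
print(ssr_density_per_mbp(len(ssrs), len(toy)))
```

    Applying the same counting separately to coding sequences, introns and UTRs, each normalized by its own total length, would give per-region densities comparable in form to those quoted in the abstract.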

  11. Complete Genome Sequence of a thermotolerant sporogenic lactic acid bacterium, Bacillus coagulans strain 36D1

    PubMed Central

    Rhee, Mun Su; Moritz, Brélan E.; Xie, Gary; Glavina del Rio, T.; Dalin, E.; Tice, H.; Bruce, D.; Goodwin, L.; Chertkov, O.; Brettin, T.; Han, C.; Detter, C.; Pitluck, S.; Land, Miriam L.; Patel, Milind; Ou, Mark; Harbrucker, Roberta; Ingram, Lonnie O.; Shanmugam, K. T.

    2011-01-01

    Bacillus coagulans is a ubiquitous soil bacterium that grows at 50-55 °C and pH 5.0 and ferments various sugars that constitute plant biomass to L (+)-lactic acid. The ability of this sporogenic lactic acid bacterium to grow at 50-55 °C and pH 5.0 makes this organism an attractive microbial biocatalyst for production of optically pure lactic acid at industrial scale not only from glucose derived from cellulose but also from xylose, a major constituent of hemicellulose. This bacterium is also considered as a potential probiotic. Complete genome sequence of a representative strain, B. coagulans strain 36D1, is presented and discussed. PMID:22675583

  12. Complete amino acid sequence of globin chains and biological activity of fragmented crocodile hemoglobin (Crocodylus siamensis).

    PubMed

    Srihongthong, Saowaluck; Pakdeesuwan, Anawat; Daduang, Sakda; Araki, Tomohiro; Dhiravisit, Apisak; Thammasirirak, Sompong

    2012-08-01

    Hemoglobin, α-chain, β-chain and fragmented hemoglobin of Crocodylus siamensis demonstrated both antibacterial and antioxidant activities. Antibacterial and antioxidant properties of the hemoglobin did not depend on the heme structure but could result from the compositions of amino acid residues and structures present in their primary structure. Furthermore, thirteen purified active peptides were obtained by RP-HPLC analyses, corresponding to fragments in the α-globin chain and the β-globin chain which are mostly located at the N-terminal and C-terminal parts. These active peptides operate on the bacterial cell membrane. The globin chains of Crocodylus siamensis showed similar amino acids to the sequences of Crocodylus niloticus. The novel amino acid substitutions of α-chain and β-chain are not associated with the heme binding site or the bicarbonate ion binding site, but could be important through their interactions with membranes of bacteria. PMID:22648692

  13. [Partial sequence homology of FtsZ in phylogenetics analysis of lactic acid bacteria].

    PubMed

    Zhang, Bin; Dong, Xiu-zhu

    2005-10-01

    FtsZ is a structurally conserved protein, which is universal among the prokaryotes. It plays a key role in prokaryote cell division. A partial fragment of the ftsZ gene about 800 bp in length was amplified and sequenced and a partial FtsZ protein phylogenetic tree for the lactic acid bacteria was constructed. By comparing the FtsZ phylogenetic tree with the 16S rDNA tree, it was shown that the two trees were similar in topology. Both trees revealed that Pediococcus spp. were closely related with L. casei group of Lactobacillus spp., but less related with other lactic acid cocci such as Enterococcus and Streptococcus. The results also showed that the discriminative power of FtsZ was higher than that of 16S rDNA for either inter-species or inter-genus and could be a very useful tool in species identification of lactic acid bacteria. PMID:16342751

  14. Sequence Comparisons of Odorant Receptors among Tortricid Moths Reveal Different Rates of Molecular Evolution among Family Members

    PubMed Central

    Carraher, Colm; Authier, Astrid; Steinwender, Bernd; Newcomb, Richard D.

    2012-01-01

    In insects, odorant receptors detect volatile cues involved in behaviours such as mate recognition, food location and oviposition. We have investigated the evolution of three odorant receptors from five species within the moth genera Ctenopseustis and Planotortrix, family Tortricidae, which fall into distinct clades within the odorant receptor multigene family. One receptor is the orthologue of the co-receptor Or83b, now known as Orco (OR2), and encodes the obligate ion channel subunit of the receptor complex. In comparison, the other two receptors, OR1 and OR3, are ligand-binding receptor subunits, activated by volatile compounds produced by plants - methyl salicylate and citral, respectively. Rates of sequence evolution at non-synonymous sites were significantly higher in OR1 compared with OR2 and OR3. Within the dataset OR1 contains 109 variable amino acid positions that are distributed evenly across the entire protein including transmembrane helices, loop regions and termini, while OR2 and OR3 contain 18 and 16 variable sites, respectively. OR2 shows a high level of amino acid conservation as expected due to its essential role in odour detection; however we found unexpected differences in the rate of evolution between two ligand-binding odorant receptors, OR1 and OR3. OR3 shows high sequence conservation suggestive of a conserved role in odour reception, whereas the higher rate of evolution observed in OR1, particularly at non-synonymous sites, may be suggestive of relaxed constraint, perhaps associated with the loss of an ancestral role in sex pheromone reception. PMID:22701634

  15. Comparative characterization of random-sequence proteins consisting of 5, 12, and 20 kinds of amino acids.

    PubMed

    Tanaka, Junko; Doi, Nobuhide; Takashima, Hideaki; Yanagawa, Hiroshi

    2010-04-01

    Screening of functional proteins from a random-sequence library has been used to evolve novel proteins in the field of evolutionary protein engineering. However, random-sequence proteins consisting of the 20 natural amino acids tend to aggregate, and the occurrence rate of functional proteins in a random-sequence library is low. From the viewpoint of the origin of life, it has been proposed that primordial proteins consisted of a limited set of amino acids that could have been abundantly formed early during chemical evolution. We have previously found that members of a random-sequence protein library constructed with five primitive amino acids show high solubility (Doi et al., Protein Eng Des Sel 2005;18:279-284). Although such a library is expected to be appropriate for finding functional proteins, the functionality may be limited, because they have no positively charged amino acid. Here, we constructed three libraries of 120-amino acid, random-sequence proteins using alphabets of 5, 12, and 20 amino acids by preselection using mRNA display (to eliminate sequences containing stop codons and frameshifts) and characterized and compared the structural properties of random-sequence proteins arbitrarily chosen from these libraries. We found that random-sequence proteins constructed with the 12-member alphabet (including five primitive amino acids and positively charged amino acids) have higher solubility than those constructed with the 20-member alphabet, though other biophysical properties are very similar in the two libraries. Thus, a library of moderate complexity constructed from 12 amino acids may be a more appropriate resource for functional screening than one constructed from 20 amino acids. PMID:20162614

  16. Amino acid sequence alignment of bacterial and mammalian pancreatic serine proteases based on topological equivalences.

    PubMed

    James, M N; Delbaere, L T; Brayer, G D

    1978-06-01

    The three-dimensional structures of the bacterial serine proteases SGPA, SGPB, and alpha-lytic protease have been compared with those of the pancreatic enzymes alpha-chymotrypsin and elastase. This comparison shows that approximately 60% (55-64%) of the alpha-carbon atom positions of the bacterial serine proteases are topologically equivalent to the alpha-carbon atom positions of the pancreatic enzymes. The corresponding value for a comparison of the bacterial enzymes among themselves is approximately 84%. The results of these topological comparisons have been used to deduce an experimentally sound sequence alignment for these several enzymes. This alignment shows that there is extensive tertiary structural homology among the bacterial and pancreatic enzymes without significant primary sequence identity (less than 21%). The acquisition of a zymogen function by the pancreatic enzymes is accompanied by two major changes to the bacterial enzymes' architecture: an insertion of 9 residues to increase the length of the N-terminal loop, and one of 12 residues to a loop near the activation salt bridge. In addition, in these two enzyme families, the methionine loop (residues 164-182) adopts very different conformations which are associated with their altered substrate specificities. PMID:96920
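
    A sequence-identity figure such as the "less than 21%" quoted above is computed from an alignment once it is in hand. The following is a minimal sketch of one common convention (identical residues divided by aligned, gap-free columns); the abstract does not say which denominator the authors used, and the function name and toy alignment are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identical residues over gap-free alignment columns.

    Both inputs are rows of one pairwise alignment, with '-' for gaps.
    Other conventions divide by the shorter sequence length or by the
    full alignment length; this sketch assumes gap-free columns only.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b)
             if a != "-" and b != "-"]
    if not pairs:
        return 0.0
    matches = sum(a == b for a, b in pairs)
    return 100.0 * matches / len(pairs)


# Toy alignment, invented for illustration.
print(percent_identity("MKV-LA", "MRVQLG"))
```

    Because the alignment in this record was derived from structural superposition rather than from the sequences themselves, the identity value depends on which residue pairs are judged topologically equivalent.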

  17. N-Terminal Amino Acid Sequence Determination of Proteins by N-Terminal Dimethyl Labeling: Pitfalls and Advantages When Compared with Edman Degradation Sequence Analysis.

    PubMed

    Chang, Elizabeth; Pourmal, Sergei; Zhou, Chun; Kumar, Rupesh; Teplova, Marianna; Pavletich, Nikola P; Marians, Kenneth J; Erdjument-Bromage, Hediye

    2016-07-01

    In recent history, alternative approaches to Edman sequencing have been investigated, and to this end, the Association of Biomolecular Resource Facilities (ABRF) Protein Sequencing Research Group (PSRG) initiated studies in 2014 and 2015, looking into bottom-up and top-down N-terminal (Nt) dimethyl derivatization of standard quantities of intact proteins with the aim to determine Nt sequence information. We have expanded this initiative and used low picomole amounts of myoglobin to determine the efficiency of Nt-dimethylation. Application of this approach on protein domains, generated by limited proteolysis of overexpressed proteins, confirms that it is a universal labeling technique and is very sensitive when compared with Edman sequencing. Finally, we compared Edman sequencing and Nt-dimethylation of the same polypeptide fragments; results confirm that there is agreement in the identity of the Nt amino acid sequence between these 2 methods. PMID:27006647

  18. N-Terminal Amino Acid Sequence Determination of Proteins by N-Terminal Dimethyl Labeling: Pitfalls and Advantages When Compared with Edman Degradation Sequence Analysis

    PubMed Central

    Chang, Elizabeth; Pourmal, Sergei; Zhou, Chun; Kumar, Rupesh; Teplova, Marianna; Pavletich, Nikola P.; Marians, Kenneth J.

    2016-01-01

    In recent history, alternative approaches to Edman sequencing have been investigated, and to this end, the Association of Biomolecular Resource Facilities (ABRF) Protein Sequencing Research Group (PSRG) initiated studies in 2014 and 2015, looking into bottom-up and top-down N-terminal (Nt) dimethyl derivatization of standard quantities of intact proteins with the aim to determine Nt sequence information. We have expanded this initiative and used low picomole amounts of myoglobin to determine the efficiency of Nt-dimethylation. Application of this approach on protein domains, generated by limited proteolysis of overexpressed proteins, confirms that it is a universal labeling technique and is very sensitive when compared with Edman sequencing. Finally, we compared Edman sequencing and Nt-dimethylation of the same polypeptide fragments; results confirm that there is agreement in the identity of the Nt amino acid sequence between these 2 methods. PMID:27006647

  19. Partial amino acid sequence of fructose-1,6-bisphosphatase from the blue-green algae Synechococcus leopoliensis.

    PubMed

    Marcus, F; Latshaw, S P; Steup, M; Gerbling, K P

    1989-08-01

    Purified fructose-1,6-bisphosphatase from the cyanobacterium Synechococcus leopoliensis was S-carboxymethylated and cleaved with trypsin. The resulting peptides were purified by reversed-phase high performance liquid chromatography and the amino acid sequence of six of the purified peptides was determined by gas-phase microsequencing. The results revealed sequence homology with other fructose-1,6-bisphosphatases. The obtained sequence data provides information required for the design of oligonucleotide hybridization probes to screen existing libraries of cyanobacterial DNA. The determination of the amino acid sequence of cyanobacterial proteins may yield important information with respect to the endosymbiotic theory of evolution. PMID:2550924

  20. Protein sequence analysis by incorporating modified chaos game and physicochemical properties into Chou's general pseudo amino acid composition.

    PubMed

    Xu, Chunrui; Sun, Dandan; Liu, Shenghui; Zhang, Yusen

    2016-10-01

    In this contribution we introduced a novel graphical method to compare protein sequences. By mapping a protein sequence into 3D space based on codons and physicochemical properties of 20 amino acids, we are able to get a unique P-vector from the 3D curve. This approach is consistent with wobble theory of amino acids. We compute the distance between sequences by their P-vectors to measure similarities/dissimilarities among protein sequences. Finally, we use our method to analyze four datasets and get better results compared with previous approaches. PMID:27375218

  1. Comparison of the complete genome sequences of Pseudomonas syringae pv. syringae B728a and pv. tomato DC3000

    SciTech Connect

    Feil, Helene; Feil, William; Chain, Patrick S. G.; Larimer, Frank W; DiBartolo, Genevieve; Copeland, A; Lykidis, A; Trong, Stephen; Nolan, Matt; Goltsman, Eugene; Thiel, James; Malfatti, Stephanie; Loper, Joyce E.; Detter, J C; Lapidus, Alla L.; Land, Miriam L; Richardson, P M; Kyrpides, Nikos C; Ivanova, N; Lindow, Steven E.

    2005-01-01

    The complete genomic sequence of Pseudomonas syringae pv. syringae B728a (Pss B728a) has been determined and is compared with that of P. syringae pv. tomato DC3000 (Pst DC3000). The two pathovars of this economically important species of plant pathogenic bacteria differ in host range and other interactions with plants, with Pss having a more pronounced epiphytic stage of growth and higher abiotic stress tolerance and Pst DC3000 having a more pronounced apoplastic growth habitat. The Pss B728a genome (6.1 Mb) contains a circular chromosome and no plasmid, whereas the Pst DC3000 genome is 6.5 Mbp in size, composed of a circular chromosome and two plasmids. Although a high degree of similarity exists between the two sequenced Pseudomonads, 976 protein-encoding genes are unique to Pss B728a when compared with Pst DC3000, including large genomic islands likely to contribute to virulence and host specificity. Over 375 repetitive extragenic palindromic sequences unique to Pss B728a when compared with Pst DC3000 are widely distributed throughout the chromosome except in 14 genomic islands, which generally had lower GC content than the genome as a whole. Content of the genomic islands varies, with one containing a prophage and another the plasmid pKLC102 of Pseudomonas aeruginosa PAO1. Among the 976 genes of Pss B728a with no counterpart in Pst DC3000 are those encoding for syringopeptin, syringomycin, indole acetic acid biosynthesis, arginine degradation, and production of ice nuclei. The genomic comparison suggests that several unique genes for Pss B728a such as ectoine synthase, DNA repair, and antibiotic production may contribute to the epiphytic fitness and stress tolerance of this organism.

  2. Comparison of the complete genome sequences of Pseudomonas syringae pv. syringae B728a and pv. tomato DC3000

    SciTech Connect

    Feil, H; Feil, W S; Chain, P; Larimer, F; DiBartolo, G; Copeland, A; Lykidis, A; Trong, S; Nolan, M; Goltsman, E; Thiel, J; Malfatti, S; Loper, J E; Lapidus, A; Detter, J C; Land, M; Richardson, P M; Kyrpides, N C; Ivanova, N; Lindow, S E

    2005-07-14

    The complete genomic sequence of Pseudomonas syringae pathovar syringae B728a (Pss B728a) has been determined and is compared with that of Pseudomonas syringae pv. tomato DC3000 (Pst DC3000). The two pathovars of this economically important species of plant pathogenic bacteria differ in host range and other interactions with plants, with Pss having a more pronounced epiphytic stage of growth and higher abiotic stress tolerance and Pst DC3000 having a more pronounced apoplastic growth habitat. The Pss B728a genome (6.1 megabases) contains a circular chromosome and no plasmid, whereas the Pst DC3000 genome is 6.5 Mbp in size, composed of a circular chromosome and two plasmids. While a high degree of similarity exists between the two sequenced Pseudomonads, 976 protein-encoding genes are unique to Pss B728a when compared to Pst DC3000, including large genomic islands likely to contribute to virulence and host specificity. Over 375 repetitive extragenic palindromic sequences (REPs) unique to Pss B728a when compared to Pst DC3000 are widely distributed throughout the chromosome except in 14 genomic islands, which generally had lower GC content than the genome as a whole. Content of the genomic islands varies, with one containing a prophage and another the plasmid pKLC102 of P. aeruginosa PAO1. Among the 976 genes of Pss B728a with no counterpart in Pst DC3000 are those encoding for syringopeptin (SP), syringomycin (SR), indole acetic acid biosynthesis, arginine degradation, and production of ice nuclei. The genomic comparison suggests that several unique genes for Pss B728a such as ectoine synthase, DNA repair, and antibiotic production may contribute to epiphytic fitness and stress tolerance of this organism.

  3. nWayComp: a genome-wide sequence comparison tool for multiple strains/species of phylogenetically related microorganisms.

    PubMed

    Yao, Jiqiang; Lin, Hong; Doddapaneni, Harshavardhan; Civerolo, Edwin L

    2007-01-01

    The increasing number of whole genomic sequences of microorganisms has increased the complexity of genome-wide annotation and gene sequence comparison among multiple microorganisms. To address this problem, we have developed nWayComp software that compares DNA and protein sequences of phylogenetically-related microorganisms. This package integrates a series of bioinformatics tools such as BLAST, ClustalW, ALIGN, PHYLIP and PRIMER3 for sequence comparison. It searches for homologous sequences among multiple organisms and identifies genes that are unique to a particular organism. The homologous gene sets are then ranked in descending order of sequence similarity. For each set of homologous sequences, a table of sequence identity among homologous genes along with sequence variations such as SNPs and INDELS is developed, and a phylogenetic tree is constructed. In addition, a common set of primers that can amplify all the homologous sequences is generated. The nWayComp package provides users with a quick and convenient tool to compare genomic sequences among multiple organisms at the whole-genome level. PMID:17688445

  4. Implicit Sequence Learning in Dyslexia: A Within-Sequence Comparison of First- and Higher-Order Information

    ERIC Educational Resources Information Center

    Du, Wenchong; Kelly, Steve W.

    2013-01-01

    The present study examines implicit sequence learning in adult dyslexics with a focus on comparing sequence transitions with different statistical complexities. Learning of a 12-item deterministic sequence was assessed in 12 dyslexic and 12 non-dyslexic university students. Both groups showed equivalent standard reaction time increments when the…

  5. Bacteria obtained from a sequencing batch reactor that are capable of growth on dehydroabietic acid.

    PubMed Central

    Mohn, W W

    1995-01-01

    Eleven isolates capable of growth on the resin acid dehydroabietic acid (DhA) were obtained from a sequencing batch reactor designed to treat a high-strength process stream from a paper mill. The isolates belonged to two groups, represented by strains DhA-33 and DhA-35, which were characterized. In the bioreactor, bacteria like DhA-35 were more abundant than those like DhA-33. The population in the bioreactor of organisms capable of growth on DhA was estimated to be 1.1 × 10^6 propagules per ml, based on a most-probable-number determination. Analysis of small-subunit rRNA partial sequences indicated that DhA-33 was most closely related to Sphingomonas yanoikuyae (Sab = 0.875) and that DhA-35 was most closely related to Zoogloea ramigera (Sab = 0.849). Both isolates additionally grew on other abietanes, i.e., abietic and palustric acids, but not on the pimaranes, pimaric and isopimaric acids. For DhA-33 and DhA-35 with DhA as the sole organic substrate, doubling times were 2.7 and 2.2 h, respectively, and growth yields were 0.30 and 0.25 g of protein per g of DhA, respectively. Glucose as a cosubstrate stimulated growth of DhA-33 on DhA and stimulated DhA degradation by the culture. Pyruvate as a cosubstrate did not stimulate growth of DhA-35 on DhA and reduced the specific rate of DhA degradation of the culture. DhA induced DhA and abietic acid degradation activities in both strains, and these activities were heat labile. Cell suspensions of both strains consumed DhA at a rate of 6 µmol mg of protein⁻¹ h⁻¹. (ABSTRACT TRUNCATED AT 250 WORDS) PMID:7793937

  6. Nucleic and amino acid sequences relating to a novel transketolase, and methods for the expression thereof

    DOEpatents

    Croteau, Rodney Bruce; Wildung, Mark Raymond; Lange, Bernd Markus; McCaskill, David G.

    2001-01-01

    cDNAs encoding 1-deoxyxylulose-5-phosphate synthase from peppermint (Mentha piperita) have been isolated and sequenced, and the corresponding amino acid sequences have been determined. Accordingly, isolated DNA sequences (SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7) are provided which code for the expression of 1-deoxyxylulose-5-phosphate synthase from plants. In another aspect the present invention provides for isolated, recombinant DXPS proteins, such as the proteins having the sequences set forth in SEQ ID NO:4, SEQ ID NO:6 and SEQ ID NO:8. In other aspects, replicable recombinant cloning vehicles are provided which code for plant 1-deoxyxylulose-5-phosphate synthases, or for a base sequence sufficiently complementary to at least a portion of 1-deoxyxylulose-5-phosphate synthase DNA or RNA to enable hybridization therewith. In yet other aspects, modified host cells are provided that have been transformed, transfected, infected and/or injected with a recombinant cloning vehicle and/or DNA sequence encoding a plant 1-deoxyxylulose-5-phosphate synthase. Thus, systems and methods are provided for the recombinant expression of the aforementioned recombinant 1-deoxyxylulose-5-phosphate synthase that may be used to facilitate its production, isolation and purification in significant amounts. Recombinant 1-deoxyxylulose-5-phosphate synthase may be used to obtain expression or enhanced expression of 1-deoxyxylulose-5-phosphate synthase in plants in order to enhance the production of 1-deoxyxylulose-5-phosphate, or its derivatives such as isopentenyl diphosphate (IPP), or may be otherwise employed for the regulation or expression of 1-deoxyxylulose-5-phosphate synthase, or the production of its products.

  7. Novel method for PIK3CA mutation analysis: locked nucleic acid--PCR sequencing.

    PubMed

    Ang, Daphne; O'Gara, Rebecca; Schilling, Amy; Beadling, Carol; Warrick, Andrea; Troxell, Megan L; Corless, Christopher L

    2013-05-01

    Somatic mutations in PIK3CA are commonly seen in invasive breast cancer and several other carcinomas, occurring in three hotspots: codons 542 and 545 of exon 9 and in codon 1047 of exon 20. We designed a locked nucleic acid (LNA)-PCR sequencing assay to detect low levels of mutant PIK3CA DNA with attention to avoiding amplification of a pseudogene on chromosome 22 that has >95% homology to exon 9 of PIK3CA. We tested 60 FFPE breast DNA samples with known PIK3CA mutation status (48 cases had one or more PIK3CA mutations, and 12 were wild type) as identified by PCR-mass spectrometry. PIK3CA exons 9 and 20 were amplified in the presence or absence of LNA-oligonucleotides designed to bind to the wild-type sequences for codons 542, 545, and 1047, and partially suppress their amplification. LNA-PCR sequencing confirmed all 51 PIK3CA mutations; however, the mutation detection rate by standard Sanger sequencing was only 69% (35 of 51). Of the 12 PIK3CA wild-type cases, LNA-PCR sequencing detected three additional H1047R mutations in "normal" breast tissue and one E545K in usual ductal hyperplasia. Histopathological review of these three normal breast specimens showed columnar cell change in two (both with known H1047R mutations) and apocrine metaplasia in one. The novel LNA-PCR shows higher sensitivity than standard Sanger sequencing and did not amplify the known pseudogene. PMID:23541593

  8. Guanine nucleotide-binding proteins that enhance choleragen ADP-ribosyltransferase activity: nucleotide and deduced amino acid sequence of an ADP-ribosylation factor cDNA.

    PubMed Central

    Price, S R; Nightingale, M; Tsai, S C; Williamson, K C; Adamik, R; Chen, H C; Moss, J; Vaughan, M

    1988-01-01

    Three (two soluble and one membrane) guanine nucleotide-binding proteins (G proteins) that enhance ADP-ribosylation of the Gs alpha stimulatory subunit of the adenylyl cyclase (EC 4.6.1.1) complex by choleragen have recently been purified from bovine brain. To further define the structure and function of these ADP-ribosylation factors (ARFs), we isolated a cDNA clone (lambda ARF2B) from a bovine retinal library by screening with a mixed heptadecanucleotide probe whose sequence was based on the partial amino acid sequence of one of the soluble ARFs from bovine brain. Comparison of the deduced amino acid sequence of lambda ARF2B with sequences of peptides from the ARF protein (total of 60 amino acids) revealed only two differences. Whether these are cloning artifacts or reflect the existence of more than one ARF protein remains to be determined. Deduced amino acid sequences of ARF, Go alpha (the alpha subunit of a G protein that may be involved in regulation of ion fluxes), and c-Ha-ras gene product p21 show similarities in regions believed to be involved in guanine nucleotide binding and GTP hydrolysis. ARF apparently lacks a site analogous to that ADP-ribosylated by choleragen in G-protein alpha subunits. Although both the ARF proteins and the alpha subunits bind guanine nucleotides and serve as choleragen substrates, they must interact with the toxin A1 peptide in different ways. In addition to serving as an ADP-ribose acceptor, ARF interacts with the toxin in a manner that modifies its catalytic properties. PMID:3135549

  9. Genome Sequence Analysis of the Naphthenic Acid Degrading and Metal Resistant Bacterium Cupriavidus gilardii CR3.

    PubMed

    Wang, Xiaoyu; Chen, Meili; Xiao, Jingfa; Hao, Lirui; Crowley, David E; Zhang, Zhewen; Yu, Jun; Huang, Ning; Huo, Mingxin; Wu, Jiayan

    2015-01-01

    Cupriavidus sp. are generally heavy metal tolerant bacteria with the ability to degrade a variety of aromatic hydrocarbon compounds, although the degradation pathways and substrate versatilities remain largely unknown. Here we studied the bacterium Cupriavidus gilardii strain CR3, which was isolated from a natural asphalt deposit, and which was shown to utilize naphthenic acids as a sole carbon source. Genome sequencing of C. gilardii CR3 was carried out to elucidate possible mechanisms for the naphthenic acid biodegradation. The genome of C. gilardii CR3 was composed of two circular chromosomes chr1 and chr2 of respectively 3,539,530 bp and 2,039,213 bp in size. The genome for strain CR3 encoded 4,502 putative protein-coding genes, 59 tRNA genes, and many other non-coding genes. Many genes were associated with xenobiotic biodegradation and metal resistance functions. Pathway prediction for degradation of cyclohexanecarboxylic acid, a representative naphthenic acid, suggested that naphthenic acid undergoes initial ring-cleavage, after which the ring fission products can be degraded via several plausible degradation pathways including a mechanism similar to that used for fatty acid oxidation. The final metabolic products of these pathways are unstable or volatile compounds that were not toxic to CR3. Strain CR3 was also shown to have tolerance to at least 10 heavy metals, which was mainly achieved by self-detoxification through ion efflux, metal-complexation and metal-reduction, and a powerful DNA self-repair mechanism. Our genomic analysis suggests that CR3 is well adapted to survive the harsh environment in natural asphalts containing naphthenic acids and high concentrations of heavy metals. PMID:26301592

  10. Genome Sequence Analysis of the Naphthenic Acid Degrading and Metal Resistant Bacterium Cupriavidus gilardii CR3

    PubMed Central

    Xiao, Jingfa; Hao, Lirui; Crowley, David E.; Zhang, Zhewen; Yu, Jun; Huang, Ning; Huo, Mingxin; Wu, Jiayan

    2015-01-01

    Cupriavidus sp. are generally heavy metal tolerant bacteria with the ability to degrade a variety of aromatic hydrocarbon compounds, although the degradation pathways and substrate versatilities remain largely unknown. Here we studied the bacterium Cupriavidus gilardii strain CR3, which was isolated from a natural asphalt deposit, and which was shown to utilize naphthenic acids as a sole carbon source. Genome sequencing of C. gilardii CR3 was carried out to elucidate possible mechanisms for the naphthenic acid biodegradation. The genome of C. gilardii CR3 was composed of two circular chromosomes chr1 and chr2 of respectively 3,539,530 bp and 2,039,213 bp in size. The genome for strain CR3 encoded 4,502 putative protein-coding genes, 59 tRNA genes, and many other non-coding genes. Many genes were associated with xenobiotic biodegradation and metal resistance functions. Pathway prediction for degradation of cyclohexanecarboxylic acid, a representative naphthenic acid, suggested that naphthenic acid undergoes initial ring-cleavage, after which the ring fission products can be degraded via several plausible degradation pathways including a mechanism similar to that used for fatty acid oxidation. The final metabolic products of these pathways are unstable or volatile compounds that were not toxic to CR3. Strain CR3 was also shown to have tolerance to at least 10 heavy metals, which was mainly achieved by self-detoxification through ion efflux, metal-complexation and metal-reduction, and a powerful DNA self-repair mechanism. Our genomic analysis suggests that CR3 is well adapted to survive the harsh environment in natural asphalts containing naphthenic acids and high concentrations of heavy metals. PMID:26301592

  11. Substrate-Driven Mapping of the Degradome by Comparison of Sequence Logos

    PubMed Central

    Fuchs, Julian E.; von Grafenstein, Susanne; Huber, Roland G.; Kramer, Christian; Liedl, Klaus R.

    2013-01-01

    Sequence logos are frequently used to illustrate substrate preferences and specificity of proteases. Here, we employed the compiled substrates of the MEROPS database to introduce a novel metric for comparison of protease substrate preferences. The constructed similarity matrix of 62 proteases can be used to intuitively visualize similarities in protease substrate readout via principal component analysis and construction of protease specificity trees. Since our new metric is solely based on substrate data, we can engraft the protease tree including proteolytic enzymes of different evolutionary origin. Thereby, our analyses confirm pronounced overlaps in substrate recognition not only between proteases closely related on sequence basis but also between proteolytic enzymes of different evolutionary origin and catalytic type. To illustrate the applicability of our approach we analyze the distribution of targets of small molecules from the ChEMBL database in our substrate-based protease specificity trees. We observe a striking clustering of annotated targets in tree branches even though these grouped targets do not necessarily share similarity on protein sequence level. This highlights the value and applicability of knowledge acquired from peptide substrates in drug design of small molecules, e.g., for the prediction of off-target effects or drug repurposing. Consequently, our similarity metric makes it possible to map the degradome and its associated drug target network via comparison of known substrate peptides. The substrate-driven view of protein-protein interfaces is not limited to the field of proteases but can be applied to any target class where a sufficient amount of known substrate data is available. PMID:24244149

  12. Comparison of ribotyping and sequence-based typing for discriminating among isolates of Bordetella bronchiseptica.

    PubMed

    Register, Karen B; Nicholson, Tracy L; Brunelle, Brian W

    2016-10-01

    PvuII ribotyping and MLST are each highly discriminatory methods for genotyping Bordetella bronchiseptica, but a direct comparison between these approaches has not been undertaken. The goal of this study was to directly compare the discriminatory power of PvuII ribotyping and MLST, using a single set of geographically and genetically diverse strains, and to determine whether subtyping based on repeat region sequences of the pertactin gene (prn) provides additional resolution. One hundred twenty-two isolates were analyzed, representing 11 mammalian or avian hosts, sourced from the United States, Europe, Israel and Australia. Thirty-two ribotype patterns were identified; one isolate could not be typed. In comparison, all isolates were typeable by MLST and a total of 30 sequence types was identified. An analysis based on Simpson's Index of Diversity (SID) revealed that ribotyping and MLST are nearly equally discriminatory, with SIDs of 0.920 for ribotyping and 0.919 for MLST. Nonetheless, for ten ribotypes and eight MLST sequence types, the alternative method discriminates among isolates that otherwise type identically. Pairing prn repeat region typing with ribotyping yielded 54 genotypes and increased the SID to 0.954. Repeat region typing combined with MLST resulted in 47 genotypes and an SID of 0.944. Given the technical and practical advantages of MLST over ribotyping, and the nominal difference in their SIDs, we conclude MLST is the preferred primary typing tool. We recommend the combination of MLST and prn repeat region typing as a high-resolution, objective and standardized approach valuable for investigating the population structure and epidemiology of B. bronchiseptica. PMID:27542997
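
    The SID values quoted above can be illustrated with a minimal sketch. It assumes the Hunter-Gaston form of Simpson's Index of Diversity, which is common in typing studies; the abstract does not state which form was used. The function name and the toy labels are hypothetical and are not data from the study.

```python
from collections import Counter


def simpson_index_of_diversity(type_labels):
    """Return 1 - sum(n_j * (n_j - 1)) / (N * (N - 1)).

    Assumed Hunter-Gaston form: the probability that two isolates drawn
    without replacement are assigned to different types.
    """
    counts = Counter(type_labels)
    n_total = sum(counts.values())
    if n_total < 2:
        raise ValueError("need at least two isolates")
    # Number of ordered pairs of isolates that share a type.
    same_type_pairs = sum(n * (n - 1) for n in counts.values())
    return 1.0 - same_type_pairs / (n_total * (n_total - 1))


# Hypothetical toy labels (10 isolates, 4 types); not data from the study.
toy_labels = ["ST1"] * 4 + ["ST2"] * 3 + ["ST3"] * 2 + ["ST4"]
sid = simpson_index_of_diversity(toy_labels)
print(round(sid, 3))
```

    An index near 1 means two randomly chosen isolates almost always receive different types, which is why a combined scheme that splits isolates into more genotypes tends to raise the value.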

  13. Bile acid sulfotransferase I from rat liver sulfates bile acids and 3-hydroxy steroids: purification, N-terminal amino acid sequence, and kinetic properties.

    PubMed

    Barnes, S; Buchina, E S; King, R J; McBurnett, T; Taylor, K B

    1989-04-01

    A bile acid:3'phosphoadenosine-5'phosphosulfate:sulfotransferase (BAST I) from adult female rat liver cytosol has been purified 157-fold by a two-step isolation procedure. The N-terminal amino acid sequence of the 30,000 subunit has been determined for the first 35 residues. The Vmax of purified BAST I is 18.7 nmol/min per mg protein with N-(3-hydroxy-5 beta-cholanoyl)glycine (glycolithocholic acid) as substrate, comparable to that of the corresponding purified human BAST (Chen, L-J., and I. H. Segel, 1985. Arch. Biochem. Biophys. 241: 371-379). BAST I activity has a broad pH optimum from 5.5-7.5. Although maximum activity occurs with 5 mM MgCl2, Mg2+ is not essential for BAST I activity. The greatest sulfotransferase activity and the highest substrate affinity is observed with bile acids or steroids that have a steroid nucleus containing a 3 beta-hydroxy group and a 5-6 double bond or a trans A-B ring junction. These substrates have normal hyperbolic initial velocity curves with substrate inhibition occurring above 5 microM. Of the saturated 5 beta-bile acids, those with a single 3-hydroxy group are the most active. The addition of a second hydroxy group at the 6- or 7-position eliminates more than 99% of the activity. In contrast, 3 alpha,12 alpha-dihydroxy-5 beta-cholan-24-oic acid (deoxycholic acid) is an excellent substrate. The initial velocity curves for glycolithocholic and deoxycholic acid conjugates are sigmoidal rather than hyperbolic, suggestive of an allosteric effect. Maximum activity is observed at 80 microM for glycolithocholic acid. All substrates, bile acids and steroids, are inhibited by the 5 beta-bile acid, 3-keto-5 beta-cholanoic acid. The data suggest that BAST I is the same protein as hydrosteroid sulfotransferase 2 (Marcus, C. J., et al. 1980. Anal. Biochem. 107: 296-304). PMID:2754334

  14. A Practical Comparison of De Novo Genome Assembly Software Tools for Next-Generation Sequencing Technologies

    PubMed Central

    Zhang, Wenyu; Chen, Jiajia; Yang, Yang; Tang, Yifei; Shang, Jing; Shen, Bairong

    2011-01-01

    The advent of next-generation sequencing technologies has been accompanied by the development of many whole-genome sequence assembly methods and software, especially for de novo fragment assembly. Because little is known about the applicability and performance of these software tools, choosing a suitable assembler is a difficult task. Here, we describe the applicability of each program and, above all, compare the performance of eight distinct tools against eight groups of simulated datasets from the Solexa sequencing platform. Considering computational time, maximum random access memory (RAM) occupancy, assembly accuracy and integrity, our study indicates that string-based assemblers and overlap-layout-consensus (OLC) assemblers are well suited to very short reads and to longer reads of small genomes, respectively. For large datasets of more than a hundred million short reads, De Bruijn graph-based assemblers would be more appropriate. In terms of software implementation, string-based assemblers are superior to graph-based ones, of which SOAPdenovo requires a complex configuration file. Our comparison study will assist researchers in selecting a well-suited assembler and offer essential information for the improvement of existing assemblers or the development of novel assemblers. PMID:21423806

  15. Alignment-free genetic sequence comparisons: a review of recent approaches by word analysis.

    PubMed

    Bonham-Carter, Oliver; Steele, Joe; Bastola, Dhundy

    2014-11-01

    Modern sequencing and genome assembly technologies have provided a wealth of data, which will soon require an analysis by comparison for discovery. Sequence alignment, a fundamental task in bioinformatics research, may be used but with some caveats. Seminal techniques and methods from dynamic programming are proving ineffective for this work owing to their inherent computational expense when processing large amounts of sequence data. These methods are prone to giving misleading information because of genetic recombination, genetic shuffling and other inherent biological events. New approaches from information theory, frequency analysis and data compression are available and provide powerful alternatives to dynamic programming. These new methods are often preferred, as their algorithms are simpler and are not affected by synteny-related problems. In this review, we provide a detailed discussion of computational tools, which stem from alignment-free methods based on statistical analysis from word frequencies. We provide several clear examples to demonstrate applications and the interpretations over several different areas of alignment-free analysis such as base-base correlations, feature frequency profiles, compositional vectors, an improved string composition and the D2 statistic metric. Additionally, we provide detailed discussion and an example of analysis by Lempel-Ziv techniques from data compression. PMID:23904502
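
    As an illustration of the word-frequency idea, the sketch below computes the basic D2 statistic, taken here to be the sum over all words of length k of the product of their counts in the two sequences. This is a minimal sketch under that assumed definition, not the review's own code; the function names and toy sequences are hypothetical.

```python
from collections import Counter


def kmer_counts(seq, k):
    """Count every overlapping word of length k in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def d2_statistic(seq_a, seq_b, k):
    """Sum, over shared k-mers, of count in seq_a times count in seq_b."""
    counts_a = kmer_counts(seq_a, k)
    counts_b = kmer_counts(seq_b, k)
    # A Counter returns 0 for words absent from seq_b, so unshared
    # words contribute nothing to the sum.
    return sum(n * counts_b[word] for word, n in counts_a.items())


# Hypothetical toy sequences, chosen only to keep the arithmetic visible.
score = d2_statistic("ACGTACGT", "ACGTTTTT", k=2)
print(score)
```

    No alignment is computed: the score depends only on word counts, so it is unaffected by the order in which shared segments occur, which is the property the review highlights for rearranged sequences.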

  16. Sequence-defined bioactive macrocycles via an acid-catalysed cascade reaction

    NASA Astrophysics Data System (ADS)

    Porel, Mintu; Thornlow, Dana N.; Phan, Ngoc N.; Alabi, Christopher A.

    2016-06-01

    Synthetic macrocycles derived from sequence-defined oligomers are a unique structural class whose ring size, sequence and structure can be tuned via precise organization of the primary sequence. Similar to peptides and other peptidomimetics, these well-defined synthetic macromolecules become pharmacologically relevant when bioactive side chains are incorporated into their primary sequence. In this article, we report the synthesis of oligothioetheramide (oligoTEA) macrocycles via a one-pot acid-catalysed cascade reaction. The versatility of the cyclization chemistry and modularity of the assembly process was demonstrated via the synthesis of >20 diverse oligoTEA macrocycles. Structural characterization via NMR spectroscopy revealed the presence of conformational isomers, which enabled the determination of local chain dynamics within the macromolecular structure. Finally, we demonstrate the biological activity of oligoTEA macrocycles designed to mimic facially amphiphilic antimicrobial peptides. The preliminary results indicate that macrocyclic oligoTEAs with just two-to-three cationic charge centres can elicit potent antibacterial activity against Gram-positive and Gram-negative bacteria.

  17. Unconventional amino acid sequence of the sun anemone (Stoichactis helianthus) polypeptide neurotoxin

    SciTech Connect

    Kem, W.; Dunn, B.; Parten, B.; Pennington, M.; Price, D.

    1986-05-01

    A 5000 dalton polypeptide neurotoxin (Sh-NI) purified by G50 Sephadex, P-cellulose, and SP-Sephadex chromatography was homogeneous by isoelectric focusing. Sh-NI was highly toxic to crayfish (LD50 0.6 µg/kg) but without effect upon mice at 15,000 µg/kg (i.p. injection). The reduced, ³H-carboxymethylated toxin and its fragments were subjected to automatic Edman degradation and the resulting PTH-amino acids were identified by HPLC, back hydrolysis, and scintillation counting. Peptides resulting from proteolytic (clostripain, staphylococcal protease) and chemical (tryptophan) cleavage were sequenced. The sequence is: AACKCDDEGPDIRTAPLTGTVDLGSCNAGWEKCASYYTIIADCCRKKK. This sequence differs considerably from the homologous Anemonia and Anthopleura toxins; many of the identical residues (6 half-cystines, G9, P10, R13, G19, G29, W30) are probably critical for folding rather than receptor recognition. However, the Sh-NI sequence closely resembles Radioanthus macrodactylus neurotoxin III and R. paumotensis II. The authors propose that Sh-NI and related Radioanthus toxins act upon a different site on the sodium channel.
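
    The residue positions named in this abstract can be checked directly against the quoted sequence. The sketch below assumes 1-based numbering from the N-terminus, which the abstract does not state explicitly; the variable and function names are hypothetical.

```python
# Sequence copied from the abstract (one-letter amino acid codes).
SH_NI = "AACKCDDEGPDIRTAPLTGTVDLGSCNAGWEKCASYYTIIADCCRKKK"

# Residues the abstract lists as shared with the homologous toxins,
# keyed by position (assumed 1-based from the N-terminus).
named_residues = {9: "G", 10: "P", 13: "R", 19: "G", 29: "G", 30: "W"}


def residue_at(seq, position):
    """Return the residue at a 1-based position."""
    return seq[position - 1]


# Positions of every cysteine (the half-cystines) in the sequence.
cysteine_positions = [i + 1 for i, aa in enumerate(SH_NI) if aa == "C"]

# True only if every named residue sits at its stated position.
all_match = all(residue_at(SH_NI, pos) == aa for pos, aa in named_residues.items())

print(len(SH_NI), cysteine_positions, all_match)
```

    Under that numbering assumption the quoted sequence has 48 residues and six cysteines, and all six named residues sit at the stated positions.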

  18. Repeat sequence chromosome specific nucleic acid probes and methods of preparing and using

    DOEpatents

    Weier, H.U.G.; Gray, J.W.

    1995-06-27

    A primer directed DNA amplification method to isolate efficiently chromosome-specific repeated DNA wherein degenerate oligonucleotide primers are used is disclosed. The probes produced are a heterogeneous mixture that can be used with blocking DNA as a chromosome-specific staining reagent, and/or the elements of the mixture can be screened for high specificity, size and/or high degree of repetition among other parameters. The degenerate primers are sets of primers that vary in sequence but are substantially complementary to highly repeated nucleic acid sequences, preferably clustered within the template DNA, for example, pericentromeric alpha satellite repeat sequences. The template DNA is preferably chromosome-specific. Exemplary primers and probes are disclosed. The probes of this invention can be used to determine the number of chromosomes of a specific type in metaphase spreads, in germ line and/or somatic cell interphase nuclei, micronuclei and/or in tissue sections. Also provided is a method to select arbitrarily repeat sequence probes that can be screened for chromosome-specificity. 18 figs.

  19. Repeat sequence chromosome specific nucleic acid probes and methods of preparing and using

    DOEpatents

    Weier, Heinz-Ulrich G.; Gray, Joe W.

    1995-01-01

    A primer directed DNA amplification method to isolate efficiently chromosome-specific repeated DNA wherein degenerate oligonucleotide primers are used is disclosed. The probes produced are a heterogeneous mixture that can be used with blocking DNA as a chromosome-specific staining reagent, and/or the elements of the mixture can be screened for high specificity, size and/or high degree of repetition among other parameters. The degenerate primers are sets of primers that vary in sequence but are substantially complementary to highly repeated nucleic acid sequences, preferably clustered within the template DNA, for example, pericentromeric alpha satellite repeat sequences. The template DNA is preferably chromosome-specific. Exemplary primers and probes are disclosed. The probes of this invention can be used to determine the number of chromosomes of a specific type in metaphase spreads, in germ line and/or somatic cell interphase nuclei, micronuclei and/or in tissue sections. Also provided is a method to select arbitrarily repeat sequence probes that can be screened for chromosome-specificity.

  20. International interlaboratory study comparing single organism 16S rRNA gene sequencing data: Beyond consensus sequence comparisons.

    PubMed

    Olson, Nathan D; Lund, Steven P; Zook, Justin M; Rojas-Cornejo, Fabiola; Beck, Brian; Foy, Carole; Huggett, Jim; Whale, Alexandra S; Sui, Zhiwei; Baoutina, Anna; Dobeson, Michael; Partis, Lina; Morrow, Jayne B

    2015-03-01

    This study presents the results from an interlaboratory sequencing study for which we developed a novel high-resolution method for comparing data from different sequencing platforms for a multi-copy, paralogous gene. The combination of PCR amplification and 16S ribosomal RNA gene (16S rRNA) sequencing has revolutionized bacteriology by enabling rapid identification, frequently without the need for culture. To assess variability between laboratories in sequencing 16S rRNA, six laboratories sequenced the gene encoding the 16S rRNA from Escherichia coli O157:H7 strain EDL933 and Listeria monocytogenes serovar 4b strain NCTC11994. Participants performed sequencing methods and protocols available in their laboratories: Sanger sequencing, Roche 454 pyrosequencing®, or Ion Torrent PGM®. The sequencing data were evaluated on three levels: (1) identity of biologically conserved positions, (2) ratio of 16S rRNA gene copies featuring identified variants, and (3) the collection of variant combinations in a set of 16S rRNA gene copies. The same set of biologically conserved positions was identified for each sequencing method. Analytical methods using Bayesian and maximum likelihood statistics were developed to estimate variant copy ratios, which describe the ratio of nucleotides at each identified biologically variable position, as well as the likely set of variant combinations present in 16S rRNA gene copies. Our results indicate that estimated variant copy ratios at biologically variable positions were only reproducible for high throughput sequencing methods. Furthermore, the likely variant combination set was only reproducible with increased sequencing depth and longer read lengths. We also demonstrate novel methods for evaluating variable positions when comparing multi-copy gene sequence data from multiple laboratories generated using multiple sequencing technologies. PMID:27077030
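
    To make the idea of a variant copy ratio concrete, the following Python sketch estimates how many copies of a multi-copy gene carry a variant base, given read counts at one position. It is a simplified illustration and not the statistical method developed in the study: it assumes a plain binomial model with no sequencing error, and the function names and the example numbers are hypothetical.

```python
import math


def _log_likelihood(variant_reads: int, total_reads: int, p: float) -> float:
    """Binomial log-likelihood up to a constant (the binomial coefficient is omitted)."""
    other_reads = total_reads - variant_reads
    if p == 0.0:
        return 0.0 if variant_reads == 0 else -math.inf
    if p == 1.0:
        return 0.0 if other_reads == 0 else -math.inf
    return variant_reads * math.log(p) + other_reads * math.log(1.0 - p)


def ml_variant_copies(variant_reads: int, total_reads: int, gene_copies: int) -> int:
    """Return the number of gene copies (0..gene_copies) that best explains the reads.

    Assumes every copy is sequenced equally often and reads are error free.
    """
    best_k, best_ll = 0, -math.inf
    for k in range(gene_copies + 1):
        ll = _log_likelihood(variant_reads, total_reads, k / gene_copies)
        if ll > best_ll:
            best_k, best_ll = k, ll
    return best_k


if __name__ == "__main__":
    # Hypothetical counts: 290 of 1000 reads show the variant, 7 gene copies assumed.
    print(ml_variant_copies(290, 1000, 7))
```

    Restricting the estimate to whole numbers of copies is a design choice of this sketch; it reflects that a copy ratio in a single genome can only take the values k/n for n gene copies.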

  1. Detection of Nucleic Acids with Graphene Nanopores: Ab Initio Characterization of a Novel Sequencing Device

    NASA Astrophysics Data System (ADS)

    Nelson, Tammie; Zhang, Bo; Prezhdo, Oleg

    2010-03-01

    We report an ab initio study of the interaction of two nucleobases, cytosine and adenine, with a novel graphene nanopore device for detecting the base sequence of a single-stranded nucleic acid (ssDNA or RNA). The nucleobases were inserted into a pore in a graphene nanoribbon, and the electrical current and conductance spectra were calculated as functions of voltage applied across the nanoribbon. The conductance spectra and charge densities were analyzed in the presence of each nucleobase in the graphene nanopore. The results indicate that, due to significant differences in the conductance spectra, the proposed device has adequate sensitivity to discriminate between different nucleotides. Moreover, we show that the conductance spectrum of a nucleotide is not affected by its orientation inside the graphene nanopore. The proposed technique may be extremely useful for real applications in developing ultrafast, low cost DNA sequencing methods.

  2. Nonprotein Amino Acids from Spark Discharges and Their Comparison with the Murchison Meteorite Amino Acids

    PubMed Central

    Wolman, Yecheskel; Haverland, William J.; Miller, Stanley L.

    1972-01-01

    All the nonprotein amino acids found in the Murchison meteorite are products of the action of electric discharge on a mixture of methane, nitrogen, and water with traces of ammonia. These amino acids include α-amino-n-butyric acid, α-aminoisobutyric acid, norvaline, isovaline, pipecolic acid, β-alanine, β-amino-n-butyric acid, β-aminoisobutyric acid, γ-aminobutyric acid, sarcosine, N-ethylglycine, and N-methylalanine. In addition, norleucine, alloisoleucine, N-propylglycine, N-isopropylglycine, N-methyl-β-alanine, N-ethyl-β-alanine, α,β-diaminopropionic acid, isoserine, α,γ-diaminobutyric acid, and α-hydroxy-γ-aminobutyric acid are produced by the electric discharge, but have not been found in the meteorite. PMID:16591973

  3. Nonprotein amino acids from spark discharges and their comparison with the Murchison meteorite amino acids.

    PubMed

    Wolman, Y; Haverland, W J; Miller, S L

    1972-04-01

    All the nonprotein amino acids found in the Murchison meteorite are products of the action of electric discharge on a mixture of methane, nitrogen, and water with traces of ammonia. These amino acids include alpha-amino-n-butyric acid, alpha-aminoisobutyric acid, norvaline, isovaline, pipecolic acid, beta-alanine, beta-amino-n-butyric acid, beta-aminoisobutyric acid, gamma-aminobutyric acid, sarcosine, N-ethylglycine, and N-methylalanine. In addition, norleucine, alloisoleucine, N-propylglycine, N-isopropylglycine, N-methyl-beta-alanine, N-ethyl-beta-alanine, alpha,beta-diaminopropionic acid, isoserine, alpha,gamma-diaminobutyric acid, and alpha-hydroxy-gamma-aminobutyric acid are produced by the electric discharge, but have not been found in the meteorite. PMID:16591973

  4. Morphological transformation of calcite crystal growth by prismatic "acidic" polypeptide sequences.

    SciTech Connect

    Kim, I; Giocondi, J L; Orme, C A; Collino, J; Evans, J S

    2007-02-13

    Many of the interesting mechanical and materials properties of the mollusk shell are thought to stem from the prismatic calcite crystal assemblies within this composite structure. It is now evident that proteins play a major role in the formation of these assemblies. Recently, a superfamily of 7 conserved prismatic layer-specific mollusk shell proteins, Asprich, were sequenced, and the 42 AA C-terminal sequence region of this protein superfamily was found to introduce surface voids or porosities on calcite crystals in vitro. Using AFM imaging techniques, we further investigate the effect that this 42 AA domain (Fragment-2) and its constituent subdomains, DEAD-17 and Acidic-2, have on the morphology and growth kinetics of calcite dislocation hillocks. We find that Fragment-2 adsorbs on terrace surfaces and pins acute steps, accelerates then decelerates the growth of obtuse steps, forms clusters and voids on terrace surfaces, and transforms calcite hillock morphology from a rhombohedral form to a rounded one. These results mirror yet are distinct from some of the earlier findings obtained for nacreous polypeptides. The subdomains Acidic-2 and DEAD-17 were found to accelerate then decelerate obtuse steps and induce oval rather than rounded hillock morphologies. Unlike DEAD-17, Acidic-2 does form clusters on terrace surfaces and exhibits stronger obtuse velocity inhibition effects than either DEAD-17 or Fragment-2. Interestingly, a 1:1 mixture of both subdomains induces an irregular polygonal morphology to hillocks, and exhibits the highest degree of acute step pinning and obtuse step velocity inhibition. This suggests that there is some interplay between subdomains within an intra (Fragment-2) or intermolecular (1:1 mixture) context, and sequence interplay phenomena may be employed by biomineralization proteins to exert net effects on crystal growth and morphology.

  5. Fast computational methods for predicting protein structure from primary amino acid sequence

    DOEpatents

    Agarwal, Pratul Kumar

    2011-07-19

    The present invention provides a method utilizing primary amino acid sequence of a protein, energy minimization, molecular dynamics and protein vibrational modes to predict three-dimensional structure of a protein. The present invention also determines possible intermediates in the protein folding pathway. The present invention has important applications to the design of novel drugs as well as protein engineering. The present invention predicts the three-dimensional structure of a protein independent of size of the protein, overcoming a significant limitation in the prior art.

  6. Purification and amino acid sequence of aminopeptidase P from pig kidney.

    PubMed

    Vergas Romero, C; Neudorfer, I; Mann, K; Schäfer, W

    1995-04-01

    Aminopeptidase P from kidney cortex was purified in high yield (recovery ≥20%) by a series of column chromatographic steps after solubilization of the membrane-bound glycoprotein with n-butanol. A coupled enzymic assay, using Gly-Pro-Pro-NH-Nap as substrate and dipeptidyl-peptidase IV as auxiliary enzyme, was used to monitor the purification. The purification procedure yielded two forms of aminopeptidase P differing in their carbohydrate composition (glycoforms). Both enzyme preparations were homogeneous as assessed by SDS/PAGE, silver staining, and isoelectric focusing. Both forms possessed the same substrate specificity, catalysed the same reaction, and consisted of identical protein chains. The amino acid sequence determined by Edman degradation and mass spectrometry consisted of 623 amino acids. Six N-glycosylation sites, all contained in the N-terminal half of the protein, were characterized. PMID:7744038

  7. Draft Genome Sequence of Cupriavidus sp. Strain SK-3, a 4-Chlorobiphenyl- and 4-Chlorobenzoic Acid-Degrading Bacterium

    PubMed Central

    Vilo, Claudia; Benedik, Michael J.; Ilori, Matthew

    2014-01-01

    We report the draft genome sequence of Cupriavidus sp. strain SK-3, which can use 4-chlorobiphenyl and 4-chlorobenzoic acid as the sole carbon source for growth. The draft genome sequence allowed the study of the polychlorinated biphenyl degradation mechanism and the recharacterization of the strain SK-3 as a Cupriavidus species. PMID:24994805

  8. Draft Genome Sequence of Bacillus subtilis subsp. natto Strain CGMCC 2108, a High Producer of Poly-γ-Glutamic Acid

    PubMed Central

    Tan, Siyuan; Su, Anping; Zhang, Chen; Ren, Yuanyuan

    2016-01-01

    Here, we report the 4.1-Mb draft genome sequence of Bacillus subtilis subsp. natto strain CGMCC 2108, a high producer of poly-γ-glutamic acid (γ-PGA). This sequence will provide further help for the biosynthesis of γ-PGA and will greatly facilitate research efforts in metabolic engineering of B. subtilis subsp. natto strain CGMCC 2108. PMID:27231363

  9. New monoclonal antibodies to the Ebola virus glycoprotein: Identification and analysis of the amino acid sequence of the variable domains.

    PubMed

    Panina, A A; Aliev, T K; Shemchukova, O B; Dement'yeva, I G; Varlamov, N E; Pozdnyakova, L P; Bokov, M N; Dolgikh, D A; Sveshnikov, P G; Kirpichnikov, M P

    2016-03-01

    We determined the nucleotide and amino acid sequences of variable domains of three new monoclonal antibodies to the glycoprotein of Ebola virus capsid. The framework and hypervariable regions of immunoglobulin heavy and light chains were identified. The primary structures were confirmed using mass spectrometry analysis. Immunoglobulin database search showed the uniqueness of the sequences obtained. PMID:27193713

  10. Genome Sequence of the Lactic Acid Bacterium Lactococcus lactis subsp. lactis TOMSC161, Isolated from a Nonscalded Curd Pressed Cheese

    PubMed Central

    Velly, H.; Abraham, A.-L.; Loux, V.; Delacroix-Buchet, A.; Fonseca, F.; Bouix, M.

    2014-01-01

    Lactococcus lactis is a lactic acid bacterium used in the production of many fermented foods, such as dairy products. Here, we report the genome sequence of L. lactis subsp. lactis TOMSC161, isolated from nonscalded curd pressed cheese. This genome sequence provides information in relation to dairy environment adaptation. PMID:25377704

  11. Draft Genome Sequence of Bacillus subtilis subsp. natto Strain CGMCC 2108, a High Producer of Poly-γ-Glutamic Acid.

    PubMed

    Tan, Siyuan; Meng, Yonghong; Su, Anping; Zhang, Chen; Ren, Yuanyuan

    2016-01-01

    Here, we report the 4.1-Mb draft genome sequence of Bacillus subtilis subsp. natto strain CGMCC 2108, a high producer of poly-γ-glutamic acid (γ-PGA). This sequence will provide further help for the biosynthesis of γ-PGA and will greatly facilitate research efforts in metabolic engineering of B. subtilis subsp. natto strain CGMCC 2108. PMID:27231363

  12. ANTICALIgN: visualizing, editing and analyzing combined nucleotide and amino acid sequence alignments for combinatorial protein engineering.

    PubMed

    Jarasch, Alexander; Kopp, Melanie; Eggenstein, Evelyn; Richter, Antonia; Gebauer, Michaela; Skerra, Arne

    2016-07-01

    ANTICALIgN is an interactive software developed to simultaneously visualize, analyze and modify alignments of DNA and/or protein sequences that arise during combinatorial protein engineering, design and selection. ANTICALIgN combines powerful functions known from currently available sequence analysis tools with unique features for protein engineering, in particular the possibility to display and manipulate nucleotide sequences and their translated amino acid sequences at the same time. ANTICALIgN offers both template-based multiple sequence alignment (MSA), using the unmutated protein as reference, and conventional global alignment, to compare sequences that share an evolutionary relationship. The application of similarity-based clustering algorithms facilitates the identification of duplicates or of conserved sequence features among a set of selected clones. Imported nucleotide sequences from DNA sequence analysis are automatically translated into the corresponding amino acid sequences and displayed, offering numerous options for selecting reading frames, highlighting of sequence features and graphical layout of the MSA. The MSA complexity can be reduced by hiding the conserved nucleotide and/or amino acid residues, thus putting emphasis on the relevant mutated positions. ANTICALIgN is also able to handle suppressed stop codons or even to incorporate non-natural amino acids into a coding sequence. We demonstrate crucial functions of ANTICALIgN in an example of Anticalins selected from a lipocalin random library against the fibronectin extradomain B (ED-B), an established marker of tumor vasculature. Apart from engineered protein scaffolds, ANTICALIgN provides a powerful tool in the area of antibody engineering and for directed enzyme evolution. PMID:27261456

  13. Formation Sequences of Iron Minerals in the Acidic Alteration Products and Variation of Hydrothermal Fluid Conditions

    NASA Astrophysics Data System (ADS)

    Isobe, H.; Yoshizawa, M.

    2008-12-01

    Iron minerals play an important role in environmental issues not only on Earth but also on other terrestrial planets. Iron mineral species formed when primary minerals are altered by surface or subsurface fluids are characterized by the temperature, acidity and redox conditions of those fluids. Various iron-bearing phases occur in the alteration products around fumaroles in geothermal and volcanic areas. In this study, zonal structures of iron minerals in alteration products of a geothermal area are observed to elucidate the temporal and spatial variation of hydrothermal fluids. The pyroxene-amphibole andesite of Garan-dake volcano, Oita, Japan, is altered by acidic hydrothermal fluid, which leaches out elements other than Si to form cristobalite. Hand specimens with an unaltered or weakly altered core and a cristobalite crust show various sequences of layers. XRD analysis revealed that the degree of alteration is represented by the abundance of cristobalite. Intermediately altered layers are characterized by the occurrence of alunite, pyrite, kaolinite, goethite and hematite. A specimen with a reddish brown core surrounded by a cristobalite-rich white crust has brown layers at the boundary between the core and the crust. The reddish core is characterized by crystalline hematite, as identified by XRD. Another hand specimen has a light gray core, which represents reduced conditions, and a white cristobalite crust, with light brown and reddish brown layers of ferric iron minerals between the core and the crust. On the other hand, hornblende crystals, a typical ferrous iron-bearing mineral of the host rock, are well preserved in some samples with strongly decolorized cristobalite-rich groundmass. Hydrothermal alteration experiments on iron-rich basaltic material show that iron mineral species depend on the acidity and temperature of the fluid. Oxidation states of the iron-bearing mineral species are strongly influenced by the acidity and redox conditions.
Variations of alteration

  14. Comparison of Bacillus monooxygenase genes for unique fatty acid production

    Technology Transfer Automated Retrieval System (TEKTRAN)

    This paper reviews Bacillus genes encoding monooxygenase enzymes producing unique fatty acid metabolites. Specifically, it examines standard monooxygenase electron transfer schemes and related domain structures of these fused domain enzymes en route to understanding the observed oxygenase activiti...

  15. Multiple Amino Acid Sequence Alignment Nitrogenase Component 1: Insights into Phylogenetics and Structure-Function Relationships

    PubMed Central

    Howard, James B.; Kechris, Katerina J.; Rees, Douglas C.; Glazer, Alexander N.

    2013-01-01

    Amino acid residues critical for a protein's structure-function are retained by natural selection and these residues are identified by the level of variance in co-aligned homologous protein sequences. The relevant residues in the nitrogen fixation Component 1 α- and β-subunits were identified by the alignment of 95 protein sequences. Proteins were included from species encompassing multiple microbial phyla and diverse ecological niches as well as the nitrogen fixation genotypes, anf, nif, and vnf, which encode proteins associated with cofactors differing at one metal site. After adjusting for differences in sequence length, insertions, and deletions, the remaining >85% of the sequence co-aligned the subunits from the three genotypes. Six Groups, designated Anf, Vnf, and Nif I-IV, were assigned based upon genetic origin, sequence adjustments, and conserved residues. Both subunits subdivided into the same groups. Invariant and single variant residues were identified and were defined as “core” for nitrogenase function. Three species in Group Nif-III, Candidatus Desulforudis audaxviator, Desulfotomaculum kuznetsovii, and Thermodesulfatator indicus, were found to have a seleno-cysteine that replaces one cysteinyl ligand of the 8Fe:7S, P-cluster. Subsets of invariant residues, limited to individual groups, were identified; these unique residues help identify the gene of origin (anf, nif, or vnf) yet should not be considered diagnostic of the metal content of associated cofactors. Fourteen of the 19 residues that compose the cofactor pocket are invariant or single variant; the other five residues are highly variable but do not correlate with the putative metal content of the cofactor. The variable residues are clustered on one side of the cofactor, away from other functional centers in the three dimensional structure. Many of the invariant and single variant residues were not previously recognized as potentially critical and their identification provides the bases
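
    The classification of aligned positions as invariant, single variant, or variable can be sketched in a few lines of Python. This is an illustration only: it assumes that "single variant" means exactly two different residues are observed in a column, it treats a gap character like any other symbol, and the toy alignment and function name are hypothetical rather than taken from the study.

```python
from collections import Counter


def classify_columns(alignment):
    """Label each alignment column as 'invariant', 'single variant', or 'variable'.

    `alignment` is a list of equal-length aligned sequences.
    """
    if len({len(seq) for seq in alignment}) != 1:
        raise ValueError("all aligned sequences must have the same length")
    labels = []
    for column in zip(*alignment):
        distinct = len(Counter(column))
        if distinct == 1:
            labels.append("invariant")
        elif distinct == 2:
            labels.append("single variant")
        else:
            labels.append("variable")
    return labels


if __name__ == "__main__":
    toy_alignment = ["MKCAV", "MKCSV", "MRCTV"]  # hypothetical sequences
    for index, label in enumerate(classify_columns(toy_alignment), start=1):
        print(index, label)
```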

  16. Eimeria maxima phosphatidylinositol 4-phosphate 5-kinase: locus sequencing, characterization, and cross-phylum comparison.

    PubMed

    Goh, Mei-Yen; Pan, Mei-Zhen; Blake, Damer P; Wan, Kiew-Lian; Song, Beng-Kah

    2011-03-01

    Phosphatidylinositol 4-phosphate 5-kinase (PIP5K) may play an important role in host-cell invasion by the Eimeria species, protozoan parasites which can cause severe intestinal disease in livestock. Here, we report the structural organization of the PIP5K gene in Eimeria maxima (Weybridge strain). Two E. maxima BAC clones carrying the E. maxima PIP5K (EmPIP5K) coding sequences were selected for shotgun sequencing, yielding a 9.1-kb genomic segment. The EmPIP5K coding region was initially identified using in silico gene-prediction approaches and subsequently confirmed by mapping rapid amplification of cDNA ends and RT-PCR-generated cDNA sequence to its genomic segment. The putative EmPIP5K gene was located at position 710-8036 nt on the complementary strand and comprised 23 exons. Alignment of the 1147 amino acid sequence with previously annotated PIP5K proteins from other Apicomplexa species detected three conserved motifs encompassing the kinase core domain, which has been shown by previous protein deletion studies to be necessary for PIP5K protein function. Phylogenetic analysis provided further evidence that the putative EmPIP5K protein is orthologous to that of other Apicomplexa. Subsequent comparative gene structure characterization revealed events of intron loss/gain throughout the evolution of the apicomplexan PIP5K gene. Further scrutiny of the genomic structure revealed a possible trend towards "intron gain" between two of the motif regions. Our findings offer preliminary insights into the structural variations that have occurred during the evolution of the PIP5K locus and may aid in understanding the functional role of this gene in the cellular biology of apicomplexan parasites. PMID:20938684

  17. Sequence comparisons of the variable VP2 region of eight infectious bursal disease virus isolates.

    PubMed

    Dormitorio, T V; Giambrone, J J; Duck, L W

    1997-01-01

    The VP2 gene is part of the genomic segment A of infectious bursal disease virus (IBDV). It has been identified as the major host-protective antigen of IBDV and is known to contain conformationally dependent protective epitopes. A 643-base pair segment covering the hypervariable region of this gene from three recent serologic variant IBDV isolates from the southeastern United States, two variants from the Delmarva Peninsula, and three serologic standard viruses were amplified and sequenced using the reverse transcription polymerase chain reaction and cycle sequencing techniques. This was done to determine the molecular similarity among isolates that differ antigenically and pathologically. Sequence analysis suggested that the Arkansas (Ark) and Mississippi (Miss) isolates evolved closely and separately from the Delmarva variants (GLS and DELE), in contrast to the other southeastern variant Georgia (Ga), which is more closely related (98.32%) to Delaware E (DELE). All variants, except for Miss, underwent a shift in amino acid number 222 from proline to threonine. The sequence of Univax BD virus, a commercially available intermediate vaccine, was markedly different, evolving from a separate lineage than the others. Restriction enzyme sites could differentiate most isolates. Except for Miss, variants do not have an EcoRII site at the larger hydrophilic domain. All variants lost their HaeIII, StuI, and StyI cutting sites with a change in base number 856. The TaqI site is in DELE, whereas the SpeI site is absent in the standard vaccine viruses. The SWASASGS heptapeptide is conserved in all virulent viruses, including APHIS, but not in the attenuated (Univax BD and Bursa Vac 3) and published (D78 and PBG98) vaccines. PMID:9087318

  18. Large-scale sequence and structural comparisons of human naive and antigen-experienced antibody repertoires.

    PubMed

    DeKosky, Brandon J; Lungu, Oana I; Park, Daechan; Johnson, Erik L; Charab, Wissam; Chrysostomou, Constantine; Kuroda, Daisuke; Ellington, Andrew D; Ippolito, Gregory C; Gray, Jeffrey J; Georgiou, George

    2016-05-10

    Elucidating how antigen exposure and selection shape the human antibody repertoire is fundamental to our understanding of B-cell immunity. We sequenced the paired heavy- and light-chain variable regions (VH and VL, respectively) from large populations of single B cells combined with computational modeling of antibody structures to evaluate sequence and structural features of human antibody repertoires at unprecedented depth. Analysis of a dataset comprising 55,000 antibody clusters from CD19(+)CD20(+)CD27(-) IgM-naive B cells, >120,000 antibody clusters from CD19(+)CD20(+)CD27(+) antigen-experienced B cells, and >2,000 RosettaAntibody-predicted structural models across three healthy donors led to a number of key findings: (i) VH and VL gene sequences pair in a combinatorial fashion without detectable pairing restrictions at the population level; (ii) certain VH:VL gene pairs were significantly enriched or depleted in the antigen-experienced repertoire relative to the naive repertoire; (iii) antigen selection increased antibody paratope net charge and solvent-accessible surface area; and (iv) public heavy-chain third complementarity-determining region (CDR-H3) antibodies in the antigen-experienced repertoire showed signs of convergent paired light-chain genetic signatures, including shared light-chain third complementarity-determining region (CDR-L3) amino acid sequences and/or Vκ,λ-Jκ,λ genes. The data reported here address several longstanding questions regarding antibody repertoire selection and development and provide a benchmark for future repertoire-scale analyses of antibody responses to vaccination and disease. PMID:27114511

  19. Analysis of expressed sequence tags from Uromyces appendiculatus hyphae and haustoria and their comparison to sequences from other rust fungi

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Two separate cDNA libraries were prepared from RNA extracted from bean rust (Uromyces appendiculatus) hyphae and haustoria isolated from infected bean leaves (Phaseolus vulgaris cv Pint 111) between 2 and 8 dpi. Approximately 13,000 clones were sequenced from both ends and the sequences assem...

  20. A comparison of 454 sequencing and clonal sequencing for the characterization of hepatitis C virus NS3 variants.

    PubMed

    Ho, Cynthia K Y; Welkers, Matthijs R A; Thomas, Xiomara V; Sullivan, James C; Kieffer, Tara L; Reesink, Henk W; Rebers, Sjoerd P H; de Jong, Menno D; Schinkel, Janke; Molenkamp, Richard

    2015-07-01

    We compared 454 amplicon sequencing with clonal sequencing for the characterization of intra-host hepatitis C virus (HCV) NS3 variants. Clonal and 454 sequences were obtained from 12 patients enrolled in a clinical phase I study for telaprevir, an NS3-4a protease inhibitor. Thirty-nine datasets were used to compare the consensus sequence, average pairwise distance, normalized Shannon entropy, phylogenetic tree topology and the number and frequency of variants derived from both sequencing techniques. In general, a good concordance was observed between both techniques for the majority of datasets. Discordant results were observed for 5 out of 39 clonal and 454 datasets, which could be attributed to primer-related selective amplification used for clonal sequencing. Both 454 and clonal datasets consisted of a few major variants and a large number of low-frequency variants. Telaprevir resistance-associated variants were observed in low frequencies and were detected more often by 454. We conclude that performance of 454 and clonal sequencing is comparable for the characterization of intra-host virus populations. Not surprisingly, 454 is superior for the detection of low frequency resistance-associated variants. However, despite the greater coverage, 454 failed to detect some low frequency variants detected by clonal sequencing. PMID:25818622
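
    Two of the diversity measures named in this abstract, normalized Shannon entropy and average pairwise distance, can be computed as in the Python sketch below. The abstract does not give the formulas used, so the definitions here are assumptions: entropy is computed over distinct sequence variants and normalized by the logarithm of the total number of sequences, and distance is the proportion of differing sites between equal-length sequences. The function names and example sequences are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def normalized_shannon_entropy(sequences):
    """Shannon entropy of variant frequencies divided by ln(number of sequences)."""
    n = len(sequences)
    if n < 2:
        return 0.0
    counts = Counter(sequences)
    entropy = -sum((c / n) * math.log(c / n) for c in counts.values())
    return abs(entropy) / math.log(n)  # abs() avoids returning -0.0


def average_pairwise_distance(sequences):
    """Mean proportion of differing sites over all pairs of equal-length sequences."""
    pairs = list(combinations(sequences, 2))
    if not pairs:
        return 0.0
    total = sum(sum(a != b for a, b in zip(s, t)) / len(s) for s, t in pairs)
    return total / len(pairs)


if __name__ == "__main__":
    clones = ["AAAA", "AAAA", "AAAT", "AATT"]  # hypothetical clonal sequences
    print(normalized_shannon_entropy(clones))
    print(average_pairwise_distance(clones))
```

    With this normalization the entropy ranges from 0, when all sequences are identical, to 1, when every sequence is a different variant.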

  1. H3 and H4 histone cDNA sequences from Xenopus: a sequence comparison of H4 genes.

    PubMed Central

    Turner, P C; Woodland, H R

    1982-01-01

    Ovarian poly(A)+ RNA from Xenopus laevis and Xenopus borealis was used to construct two cDNA libraries which were screened for histone sequences. cDNA clones to H4 mRNA were obtained from both species and an H3 cDNA clone from Xenopus laevis. The complete DNA sequences of these clones have been determined and are presented. These new sequences are compared with other H3 and H4 DNA sequences both in the coding and 3' noncoding regions. We find that there is considerable non-random codon usage in ten H4 genes. In addition there are some sequence similarities in the 3' noncoding regions of H3 and H4 genes. PMID:6896750
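
    Non-random codon usage of the kind reported here is usually assessed by counting how often each synonymous codon is used for a given amino acid. The Python sketch below shows that bookkeeping for the four glycine codons; it is an illustration added to this record, and the function names and the short coding sequence in the example are hypothetical, not taken from the H4 clones.

```python
from collections import Counter

GLYCINE_CODONS = ("GGT", "GGC", "GGA", "GGG")


def codon_counts(cds: str) -> Counter:
    """Count codons in an in-frame coding sequence (DNA or RNA letters)."""
    cds = cds.upper().replace("U", "T")
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return Counter(cds[i:i + 3] for i in range(0, len(cds), 3))


def synonymous_fractions(counts: Counter, codons=GLYCINE_CODONS) -> dict:
    """Fraction of each codon within one synonymous family."""
    total = sum(counts[c] for c in codons)
    if total == 0:
        return {c: 0.0 for c in codons}
    return {c: counts[c] / total for c in codons}


if __name__ == "__main__":
    toy_cds = "GGCGGCGGTAAG"  # hypothetical fragment: Gly-Gly-Gly-Lys
    print(synonymous_fractions(codon_counts(toy_cds)))
```

    Random usage would give each of the four glycine codons a fraction near 0.25; strong departures from that in real genes indicate codon bias.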

  2. Draft Genome Sequences of Gluconobacter cerinus CECT 9110 and Gluconobacter japonicus CECT 8443, Acetic Acid Bacteria Isolated from Grape Must

    PubMed Central

    Sainz, Florencia

    2016-01-01

    We report here the draft genome sequences of Gluconobacter cerinus strain CECT 9110 and Gluconobacter japonicus CECT 8443, acetic acid bacteria isolated from grape must. Gluconobacter species are well known for their ability to oxidize sugar alcohols into the corresponding acids. Our objective was to select strains that effectively oxidize D-glucose. PMID:27365351

  3. Swfoldrate: predicting protein folding rates from amino acid sequence with sliding window method.

    PubMed

    Cheng, Xiang; Xiao, Xuan; Wu, Zhi-cheng; Wang, Pu; Lin, Wei-zhong

    2013-01-01

    Protein folding is the process by which a protein proceeds from its denatured state to its specific biologically active conformation. Understanding the relationship between sequences and the folding rates of proteins remains an important challenge. Most previous methods of predicting protein folding rate require the tertiary structure of a protein as an input. In this study, the long-range and short-range contacts in proteins were used to derive an extended version of the pseudo amino acid composition based on a sliding window method. This method is capable of predicting protein folding rates from the amino acid sequence alone, without the aid of any structural class information. We systematically studied the contributions of individual features to folding rate prediction. The optimal feature selection procedure combines forward feature selection with sequential backward selection. The method was demonstrated on a large dataset using the jackknife cross-validation test. The predictor was built from a multitude of physicochemical and statistical protein features using a nonlinear support vector machine (SVM) regression model, and it obtained excellent agreement between predicted and experimentally observed folding rates of proteins. The correlation coefficient is 0.9313 and the standard error is 2.2692. The prediction server is freely available at http://www.jci-bioinfo.cn/swfrate/input.jsp. PMID:22933332
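
    The sliding window idea mentioned in this abstract can be illustrated with a short Python sketch that averages a per-residue property over consecutive windows of a sequence. This is a generic example, not the Swfoldrate feature set: the Kyte-Doolittle hydropathy scale is used only as a convenient property, and the function name and window size are hypothetical choices.

```python
# Kyte-Doolittle hydropathy values, used here only as an example property scale.
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def sliding_window_means(sequence, window, scale=HYDROPATHY):
    """Return the mean property value for every window of `window` residues."""
    if window < 1 or window > len(sequence):
        raise ValueError("window must be between 1 and the sequence length")
    values = [scale[residue] for residue in sequence.upper()]
    return [
        sum(values[start:start + window]) / window
        for start in range(len(values) - window + 1)
    ]


if __name__ == "__main__":
    # Hypothetical short peptide, window of 3 residues.
    print(sliding_window_means("MKVLAAGIV", 3))
```

    A sequence of length n yields n - w + 1 values for window size w, so the resulting profile can be summarized into fixed-length features regardless of protein length.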

  4. From amino acid sequence to bioactivity: The biomedical potential of antitumor peptides.

    PubMed

    Blanco-Míguez, Aitor; Gutiérrez-Jácome, Alberto; Pérez-Pérez, Martín; Pérez-Rodríguez, Gael; Catalán-García, Sandra; Fdez-Riverola, Florentino; Lourenço, Anália; Sánchez, Borja

    2016-06-01

    Chemoprevention is the use of natural and/or synthetic substances to block, reverse, or retard the process of carcinogenesis. In this field, antitumor peptides are of interest because (i) these molecules are small in size, (ii) they show good cell diffusion and permeability, (iii) they affect one or more specific molecular pathways involved in carcinogenesis, and (iv) they are not usually genotoxic. We searched the Web of Science database (23/11/2015) to collect papers reporting on bioactive peptides (1691 records), and filtered the results further using search terms such as "antiproliferative," "antitumoral," or "apoptosis" among others. Works reporting the amino acid sequence of an antiproliferative peptide were kept (60 records), and this set was complemented with the peptides included in CancerPPD, an extensive resource for antiproliferative peptides and proteins. Peptides were grouped according to one of the following mechanisms of action: inhibition of cell migration, inhibition of tumor angiogenesis, antioxidative mechanisms, inhibition of gene transcription/cell proliferation, induction of apoptosis, disorganization of tubulin structure, cytotoxicity, or unknown mechanisms. The main mechanisms of action of those antiproliferative peptides with known amino acid sequences are presented, and their potential clinical usefulness and the future challenges of their application are discussed. PMID:27010507

  5. The amino acid sequences and activities of synergistic hemolysins from Staphylococcus cohnii.

    PubMed

    Mak, Pawel; Maszewska, Agnieszka; Rozalska, Malgorzata

    2008-10-01

    Staphylococcus cohnii ssp. cohnii and S. cohnii ssp. urealyticus are coagulase-negative staphylococci that were long considered unable to cause infections. This situation changed recently, and pathogenic strains of these bacteria were isolated from hospital environments, patients and medical staff. Most of the isolated strains were resistant to many antibiotics. The present work describes the isolation and characterization of several synergistic peptide hemolysins produced by these bacteria and acting as virulence factors responsible for hemolytic and cytotoxic activities. Amino acid sequences of the respective hemolysins from S. cohnii ssp. cohnii (named H1C, H2C and H3C) and S. cohnii ssp. urealyticus (H1U, H2U and H3U) were identical. Peptides H1 and H3 possessed significant amino acid homology to three synergistic hemolysins secreted by Staphylococcus lugdunensis and to a putative antibacterial peptide produced by Staphylococcus saprophyticus ssp. saprophyticus. On the other hand, hemolysin H2 had a unique sequence. All isolated peptides lysed red cells from different mammalian species and exerted a cytotoxic effect on human fibroblasts. PMID:18752624

  6. Clostridium sticklandii, a specialist in amino acid degradation:revisiting its metabolism through its genome sequence

    PubMed Central

    2010-01-01

    Background Clostridium sticklandii belongs to a cluster of non-pathogenic proteolytic clostridia which utilize amino acids as carbon and energy sources. Isolated by T.C. Stadtman in 1954, it has been generally regarded as a "gold mine" for novel biochemical reactions and is used as a model organism for studying metabolic aspects such as the Stickland reaction, coenzyme-B12- and selenium-dependent reactions of amino acids. With the goal of revisiting its carbon, nitrogen, and energy metabolism, and comparing studies with other clostridia, its genome has been sequenced and analyzed. Results C. sticklandii is one of the best biochemically studied proteolytic clostridial species. Useful additional information has been obtained from the sequencing and annotation of its genome, which is presented in this paper. Besides, experimental procedures reveal that C. sticklandii degrades amino acids in a preferential and sequential way. The organism prefers threonine, arginine, serine, cysteine, proline, and glycine, whereas glutamate, aspartate and alanine are excreted. Energy conservation is primarily obtained by substrate-level phosphorylation in fermentative pathways. The reactions catalyzed by different ferredoxin oxidoreductases and the exergonic NADH-dependent reduction of crotonyl-CoA point to a possible chemiosmotic energy conservation via the Rnf complex. C. sticklandii possesses both the F-type and V-type ATPases. The discovery of an as yet unrecognized selenoprotein in the D-proline reductase operon suggests a more detailed mechanism for NADH-dependent D-proline reduction. A rather unusual metabolic feature is the presence of genes for all the enzymes involved in two different CO2-fixation pathways: C. sticklandii harbours both the glycine synthase/glycine reductase and the Wood-Ljungdahl pathways. This unusual pathway combination has retrospectively been observed in only four other sequenced microorganisms. Conclusions Analysis of the C. sticklandii genome and

  7. Complete amino acid sequence of the myoglobin from the Pacific spotted dolphin, Stenella attenuata graffmani.

    PubMed

    Jones, B N; Wang, C C; Dwulet, F E; Lehman, L D; Meuth, J L; Bogardt, R A; Gurd, F R

    1979-04-25

    The complete amino acid sequence of the major component myoglobin from the Pacific spotted dolphin, Stenella attenuata graffmani, was determined by the automated Edman degradation of several large peptides obtained by specific cleavage of the protein. The acetimidated apomyoglobin was selectively cleaved at its two methionyl residues with cyanogen bromide and at its three arginyl residues by trypsin. By subjecting four of these peptides and the apomyoglobin to automated Edman degradation, over 80% of the primary structure of the protein was obtained. The remainder of the covalent structure was determined by the sequence analysis of peptides that resulted from further digestion of the central cyanogen bromide fragment. This fragment was cleaved at its glutamyl residues with staphylococcal protease and its lysyl residues with trypsin. The action of trypsin was restricted to the lysyl residues by chemical modification of the single arginyl residue of the fragment with 1,2-cyclohexanedione. The primary structure of this myoglobin proved to be identical with that from the Atlantic bottlenosed dolphin and Pacific common dolphin but differs from the myoglobins of the killer whale and pilot whale at two positions. The above sequence identities and differences reflect the close taxonomic relationship of these five species of Cetacea. PMID:454657

  8. Isolation and amino acid sequences of squirrel monkey (Saimiri sciurea) insulin and glucagon.

    PubMed Central

    Yu, J H; Eng, J; Yalow, R S

    1990-01-01

    It was reported two decades ago that insulin was not detectable in the glucose-stimulated state in Saimiri sciurea, the New World squirrel monkey, by a radioimmunoassay system developed with guinea pig anti-pork insulin antibody and labeled pork insulin. With the same system, reasonable levels were observed in rhesus monkeys and chimpanzees. This suggested that New World monkeys, like the New World hystricomorph rodents such as the guinea pig and the coypu, might have insulins whose sequences differ markedly from those of Old World mammals. In this report we describe the purification and amino acid sequences of squirrel monkey insulin and glucagon. We demonstrate that the substitutions at B29, B27, A2, A4, and A17 of squirrel monkey insulin are identical with those previously found in another New World primate, the owl monkey (Aotus trivirgatus). The immunologic cross-reactivity of this insulin in our immunoassay system is only a few percent of that of human insulin. Squirrel monkey glucagon is identical with the usual glucagon found in Old World mammals, which predicts that the glucagons of other New World monkeys would not differ from the usual Old World mammalian glucagon. It appears that the peptides of the New World monkeys have diverged less from those of the Old World mammals than have those of the New World hystricomorph rodents. The striking improvements in peptide purification and sequencing have the potential for adding new information concerning the evolutionary divergence of species. PMID:2263627

  9. Isolation and amino acid sequences of squirrel monkey (Saimiri sciurea) insulin and glucagon

    SciTech Connect

    Yu, Jinghua; Eng, J.; Yalow, R.S. (City Univ. of New York, NY)

    1990-12-01

    It was reported two decades ago that insulin was not detectable in the glucose-stimulated state in Saimiri sciurea, the New World squirrel monkey, by a radioimmunoassay system developed with guinea pig anti-pork insulin antibody and labeled pork insulin. With the same system, reasonable levels were observed in rhesus monkeys and chimpanzees. This suggested that New World monkeys, like the New World hystricomorph rodents such as the guinea pig and the coypu, might have insulins whose sequences differ markedly from those of Old World mammals. In this report the authors describe the purification and amino acid sequences of squirrel monkey insulin and glucagon. They demonstrate that the substitutions at B29, B27, A2, A4, and A17 of squirrel monkey insulin are identical with those previously found in another New World primate, the owl monkey (Aotus trivirgatus). The immunologic cross-reactivity of this insulin in their immunoassay system is only a few percent of that of human insulin. It appears that the peptides of the New World monkeys have diverged less from those of the Old World mammals than have those of the New World hystricomorph rodents. The striking improvements in peptide purification and sequencing have the potential for adding new information concerning the evolutionary divergence of species.

  10. Comparison of fungi within the Gaeumannomyces-Phialophora complex by analysis of ribosomal DNA sequences.

    PubMed Central

    Bryan, G T; Daniels, M J; Osbourn, A E

    1995-01-01

    Four ascomycete species of the genus Gaeumannomyces infect roots of monocotyledons. Gaeumannomyces graminis contains four varieties, var. tritici, var. avenae, var. graminis, and var. maydis. G. graminis varieties tritici, avenae, and graminis have Phialophora-like anamorphs and, together with the other Gaeumannomyces and Phialophora species found on cereal roots, constitute the Gaeumannomyces-Phialophora complex. Relatedness of a number of Gaeumannomyces and Phialophora isolates was assessed by comparison of DNA sequences of the 18S rRNA gene, the 5.8S rRNA gene, and the internal transcribed spacers (ITS). G. graminis var. tritici, G. graminis var. avenae, and G. graminis var. graminis isolates can be distinguished from each other by nucleotide sequence differences in the ITS regions. The G. graminis var. tritici isolates can be further subdivided into R and N isolates (correlating with ability [R] or inability [N] to infect rye). Phylogenetic analysis of the ITS regions of several oat-infecting G. graminis var. tritici isolates suggests that these isolates are actually more closely related to G. graminis var. avenae. The isolates of Magnaporthe grisea included in the analysis showed a surprising degree of relatedness to members of the Gaeumannomyces-Phialophora complex. G. graminis variety-specific oligonucleotide primers were used in PCRs to amplify DNA from cereal seedlings infected with G. graminis var. tritici or G. graminis var. avenae, and these should be valuable for sensitive detection of pathogenic isolates and for diagnosis of take-all. PMID:7574606

  11. Comparison and quantitative verification of mapping algorithms for whole-genome bisulfite sequencing.

    PubMed

    Kunde-Ramamoorthy, Govindarajan; Coarfa, Cristian; Laritsky, Eleonora; Kessler, Noah J; Harris, R Alan; Xu, Mingchu; Chen, Rui; Shen, Lanlan; Milosavljevic, Aleksandar; Waterland, Robert A

    2014-04-01

    Coupling bisulfite conversion with next-generation sequencing (Bisulfite-seq) enables genome-wide measurement of DNA methylation, but poses unique challenges for mapping. However, despite a proliferation of Bisulfite-seq mapping tools, no systematic comparison of their genomic coverage and quantitative accuracy has been reported. We sequenced bisulfite-converted DNA from two tissues from each of two healthy human adults and systematically compared five widely used Bisulfite-seq mapping algorithms: Bismark, BSMAP, Pash, BatMeth and BS Seeker. We evaluated their computational speed and genomic coverage and verified their percentage methylation estimates. With the exception of BatMeth, all mappers covered >70% of CpG sites genome-wide and yielded highly concordant estimates of percentage methylation (r² ≥ 0.95). Fourfold variation in mapping time was found between BSMAP (fastest) and Pash (slowest). In each library, 8-12% of genomic regions covered by Bismark and Pash were not covered by BSMAP. An experiment using simulated reads confirmed that Pash has an exceptional ability to uniquely map reads in genomic regions of structural variation. Independent verification by bisulfite pyrosequencing generally confirmed the percentage methylation estimates by the mappers. Of these algorithms, Bismark provides an attractive combination of processing speed, genomic coverage and quantitative accuracy, whereas Pash offers considerably higher genomic coverage. PMID:24391148
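
    As background to the mapping problem described above, the following minimal sketch shows why bisulfite-converted reads are hard to align and how a percentage methylation estimate is commonly computed at a single site. It does not reproduce any of the five mappers. The function names are hypothetical, and the sketch assumes forward-strand reads with complete conversion of unmethylated cytosines.

```python
def bisulfite_convert(seq, methylated_positions=frozenset()):
    """Simulate bisulfite treatment on the forward strand.

    Unmethylated C is read as T after conversion and amplification;
    C at a position listed in methylated_positions is protected and stays C.
    """
    return "".join(
        "T" if base == "C" and index not in methylated_positions else base
        for index, base in enumerate(seq.upper())
    )


def percent_methylation(c_reads, t_reads):
    """Percentage methylation at one cytosine: C calls over C plus T calls."""
    total = c_reads + t_reads
    if total == 0:
        raise ValueError("site has no coverage")
    return 100.0 * c_reads / total
```

    Because most cytosines become thymines, a converted read no longer matches the reference directly, which is why dedicated Bisulfite-seq mappers are needed.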

  12. Comparison and quantitative verification of mapping algorithms for whole-genome bisulfite sequencing

    PubMed Central

    Kunde-Ramamoorthy, Govindarajan; Coarfa, Cristian; Laritsky, Eleonora; Kessler, Noah J.; Harris, R. Alan; Xu, Mingchu; Chen, Rui; Shen, Lanlan; Milosavljevic, Aleksandar; Waterland, Robert A.

    2014-01-01

    Coupling bisulfite conversion with next-generation sequencing (Bisulfite-seq) enables genome-wide measurement of DNA methylation, but poses unique challenges for mapping. However, despite a proliferation of Bisulfite-seq mapping tools, no systematic comparison of their genomic coverage and quantitative accuracy has been reported. We sequenced bisulfite-converted DNA from two tissues from each of two healthy human adults and systematically compared five widely used Bisulfite-seq mapping algorithms: Bismark, BSMAP, Pash, BatMeth and BS Seeker. We evaluated their computational speed and genomic coverage and verified their percentage methylation estimates. With the exception of BatMeth, all mappers covered >70% of CpG sites genome-wide and yielded highly concordant estimates of percentage methylation (r² ≥ 0.95). Fourfold variation in mapping time was found between BSMAP (fastest) and Pash (slowest). In each library, 8–12% of genomic regions covered by Bismark and Pash were not covered by BSMAP. An experiment using simulated reads confirmed that Pash has an exceptional ability to uniquely map reads in genomic regions of structural variation. Independent verification by bisulfite pyrosequencing generally confirmed the percentage methylation estimates by the mappers. Of these algorithms, Bismark provides an attractive combination of processing speed, genomic coverage and quantitative accuracy, whereas Pash offers considerably higher genomic coverage. PMID:24391148

  13. A tale of three next generation sequencing platforms: comparison of Ion Torrent, Pacific Biosciences and Illumina MiSeq sequencers

    PubMed Central

    2012-01-01

    Background Next generation sequencing (NGS) technology has revolutionized genomic and genetic research. The pace of change in this area is rapid with three major new sequencing platforms having been released in 2011: Ion Torrent’s PGM, Pacific Biosciences’ RS and the Illumina MiSeq. Here we compare the results obtained with those platforms to the performance of the Illumina HiSeq, the current market leader. In order to compare these platforms, and get sufficient coverage depth to allow meaningful analysis, we have sequenced a set of 4 microbial genomes with mean GC content ranging from 19.3 to 67.7%. Together, these represent a comprehensive range of genome content. Here we report our analysis of that sequence data in terms of coverage distribution, bias, GC distribution, variant detection and accuracy. Results Sequence generated by Ion Torrent, MiSeq and Pacific Biosciences technologies displays near perfect coverage behaviour on GC-rich, neutral and moderately AT-rich genomes, but a profound bias was observed upon sequencing the extremely AT-rich genome of Plasmodium falciparum on the PGM, resulting in no coverage for approximately 30% of the genome. We analysed the ability to call variants from each platform and found that we could call slightly more variants from Ion Torrent data compared to MiSeq data, but at the expense of a higher false positive rate. Variant calling from Pacific Biosciences data was possible but higher coverage depth was required. Context specific errors were observed in both PGM and MiSeq data, but not in that from the Pacific Biosciences platform. Conclusions All three fast turnaround sequencers evaluated here were able to generate usable sequence. However there are key differences between the quality of that data and the applications it will support. PMID:22827831
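
    The GC content figures quoted above (mean values from 19.3 to 67.7%) are simple base-composition percentages. A minimal sketch of that calculation follows. The function name is hypothetical, and the choice to skip ambiguous bases such as N is an assumption rather than something stated in the abstract.

```python
def gc_content(seq):
    """Return GC content as a percentage of the unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    if gc + at == 0:
        raise ValueError("sequence contains no A, C, G or T bases")
    return 100.0 * gc / (gc + at)
```

    An extremely AT-rich genome such as that of Plasmodium falciparum sits at the low end of this scale, which is where the abstract reports the coverage bias on the PGM.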

  14. Amino acid sequence analysis and characterization of a ribonuclease from starfish Asterias amurensis.

    PubMed

    Motoyoshi, Naomi; Kobayashi, Hiroko; Itagaki, Tadashi; Inokuchi, Norio

    2016-09-01

    The aim of this study was to phylogenetically characterize the location of the RNase T2 enzyme in the starfish (Asterias amurensis). We isolated an RNase T2 ribonuclease (RNase Aa) from the ovaries of starfish and determined its amino acid sequence by protein chemistry and cloning cDNA encoding RNase Aa. The isolated protein had 231 amino acid residues, a predicted molecular mass of 25,906 Da, and an optimal pH of 5.0. RNase Aa preferentially released guanylic acid from the RNA. The catalytic sites of the RNase T2 family are conserved in RNase Aa; furthermore, the distribution of the cysteine residues in RNase Aa is similar to that in other animal and plant T2 RNases. RNase Aa is cleaved at two points: 21 residues from the N-terminus and 29 residues from the C-terminus; however, both fragments may remain attached to the protein via disulfide bridges, leading to the maintenance of its conformation, as suggested by circular dichroism spectrum analysis. The phylogenetic analysis revealed that starfish RNase Aa is evolutionarily an intermediate between protozoan and oyster RNases. PMID:26920046

  15. Susceptibility of muridae cell lines to ecotropic murine leukemia virus and the cationic amino acid transporter 1 viral receptor sequences: implications for evolution of the viral receptor.

    PubMed

    Kakoki, Katsura; Shinohara, Akio; Izumida, Mai; Koizumi, Yosuke; Honda, Eri; Kato, Goro; Igawa, Tsukasa; Sakai, Hideki; Hayashi, Hideki; Matsuyama, Toshifumi; Morita, Tetsuo; Koshimoto, Chihiro; Kubo, Yoshinao

    2014-06-01

    Ecotropic murine leukemia viruses (Eco-MLVs) infect mouse and rat, but not other mammalian cells, and gain access for infection through binding the cationic amino acid transporter 1 (CAT1). Glycosylation of the rat and hamster CAT1s inhibits Eco-MLV infection, and treatment of rat and hamster cells with a glycosylation inhibitor, tunicamycin, enhances Eco-MLV infection. Although the mouse CAT1 is also glycosylated, it does not inhibit Eco-MLV infection. Comparison of amino acid sequences between the rat and mouse CAT1s shows amino acid insertions in the rat protein near the Eco-MLV-binding motif. In addition to the insertion present in the rat CAT1, the hamster CAT1 has additional amino acid insertions. In contrast, tunicamycin treatment of mink and human cells does not elevate the infection, because their CAT1s do not have the Eco-MLV-binding motif. To define the evolutionary pathway of the Eco-MLV receptor, we analyzed CAT1 sequences and susceptibility to Eco-MLV infection of several other Murinae animals, including the southern vole (Microtus rossiaemeridionalis), large Japanese field mouse (Apodemus speciosus), and Eurasian harvest mouse (Micromys minutus). Eco-MLV infection was enhanced by tunicamycin in these cells, and their CAT1 sequences have insertions like those of the hamster CAT1. Phylogenetic analysis of mammalian CAT1s suggested that the ancestral CAT1 does not have the Eco-MLV-binding motif, like the human CAT1, and the mouse CAT1 is thought to be generated by the amino acid deletions in the third extracellular loop of CAT1. PMID:24469466

  16. NITRIC ACID SHOOTOUT: FIELD COMPARISON OF MEASUREMENT METHODS (JOURNAL VERSION)

    EPA Science Inventory

    Eighteen instruments for measuring atmospheric concentrations of nitric acid were compared in an eight-day field study at Pomona College, situated in the eastern portion of the Los Angeles Basin, in September 1985. The study design included collocated and separated duplicate samp...

  17. Complexation of NpO2+ with N-methyl-iminodiacetic Acid: in Comparison with Iminodiacetic and Dipicolinic Acids

    SciTech Connect

    Tian, Guoxin; Rao, Linfeng

    2010-10-01

    Complexation of Np(V) with N-methyl-iminodiacetic acid (MIDA) in 1 M NaClO₄ solution was studied with multiple techniques including potentiometry, spectrophotometry, and microcalorimetry. The 1:2 complex, NpO₂(MIDA)₂³⁻, was identified for the first time in aqueous solution. The correlation between its optical absorption properties and symmetry was discussed, in comparison with Np(V) complexes with two structurally related nitrilo-dicarboxylic acids, iminodiacetic acid (IDA) and dipicolinic acid (DPA). The order of the binding strength (DPA > MIDA > IDA) is explained by the difference in the structural and electronic properties of the ligands. In general, the nitrilo-dicarboxylates form stronger complexes with Np(V) than oxy-dicarboxylates due to a much more favorable enthalpy of complexation.

  18. Circular RNA oligonucleotides. Synthesis, nucleic acid binding properties, and a comparison with circular DNAs.

    PubMed Central

    Wang, S; Kool, E T

    1994-01-01

    We report the synthesis and nucleic acid binding properties of two cyclic RNA oligonucleotides designed to bind single-stranded nucleic acids by pyr.pur.pyr-type triple helix formation. The circular RNAs are 34 nucleotides in size and were cyclized using a template-directed nonenzymatic ligation. To ensure isomeric 3'-5' purity in the ligation reaction, one nucleotide at the ligation site is a 2'-deoxyribose. One circle (1) is complementary to the sequence 5'-A12, and the second (2) is complementary to 5'-AAGAAAGAAAAG. Results of thermal denaturation experiments and mixing studies show that both circles bind complementary single-stranded DNA or RNA substrates by triple helix formation, in which two domains in a pyrimidine-rich circle sandwich a central purine-rich substrate. The affinities of these circles with their purine complements are much higher than the affinities of either the linear precursors or simple Watson-Crick DNA complements. For example, circle 1 binds rA12 (pH 7.0, 10 mM MgCl2, 100 mM NaCl) with a Tm of 48 °C and a Kd (37 °C) of 4.1 × 10⁻⁹ M, while the linear precursor of the circle binds with a Tm of 34 °C and a Kd of 1.2 × 10⁻⁶ M. The complexes of circle 2 are pH-dependent, as expected for triple helical complexes involving C⁺G.C triads, and mixing plots for both circles reveal one-to-one stoichiometry of binding either to RNA or DNA substrates. Comparison of circular RNAs with previously synthesized circular DNA oligonucleotides of the same sequence reveals similar behavior in the binding of DNA, but strikingly different behavior in the binding of RNA. The cyclic DNAs show high DNA-binding selectivity, giving relatively weaker duplex-type binding with complementary RNAs. The relative order of thermodynamic stability for the four types of triplex studied here is found to be DDD >> RRR > RDR >> DRD. The results are discussed in the context of recent reports of strong triplex dependence on RNA versus DNA backbones
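
    The two dissociation constants quoted above can be put on a common energy scale with the standard relation ΔG° = RT ln Kd. The sketch below is a worked illustration, not a calculation from the paper. It assumes a 1 M standard state and that both Kd values refer to 37 °C, which the abstract states explicitly only for the circle. The function name is hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def binding_free_energy(kd_molar, temperature_c=37.0):
    """Standard binding free energy in kcal/mol from Kd (1 M standard state)."""
    if kd_molar <= 0:
        raise ValueError("Kd must be positive")
    temperature_k = temperature_c + 273.15
    return R_KCAL * temperature_k * math.log(kd_molar)


# Kd values quoted in the abstract above.
dg_circle = binding_free_energy(4.1e-9)   # about -11.9 kcal/mol
dg_linear = binding_free_energy(1.2e-6)   # about -8.4 kcal/mol (37 °C assumed)
advantage = dg_circle - dg_linear         # about -3.5 kcal/mol in favor of the circle
```

    Under these assumptions, the roughly 290-fold difference in Kd corresponds to about 3.5 kcal/mol of additional binding free energy for the circular RNA.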

  19. Circular RNA oligonucleotides. Synthesis, nucleic acid binding properties, and a comparison with circular DNAs.

    PubMed

    Wang, S; Kool, E T

    1994-06-25

    We report the synthesis and nucleic acid binding properties of two cyclic RNA oligonucleotides designed to bind single-stranded nucleic acids by pyr.pur.pyr-type triple helix formation. The circular RNAs are 34 nucleotides in size and were cyclized using a template-directed nonenzymatic ligation. To ensure isomeric 3'-5' purity in the ligation reaction, one nucleotide at the ligation site is a 2'-deoxyribose. One circle (1) is complementary to the sequence 5'-A12, and the second (2) is complementary to 5'-AAGAAAGAAAAG. Results of thermal denaturation experiments and mixing studies show that both circles bind complementary single-stranded DNA or RNA substrates by triple helix formation, in which two domains in a pyrimidine-rich circle sandwich a central purine-rich substrate. The affinities of these circles with their purine complements are much higher than the affinities of either the linear precursors or simple Watson-Crick DNA complements. For example, circle 1 binds rA12 (pH 7.0, 10 mM MgCl2, 100 mM NaCl) with a Tm of 48 °C and a Kd (37 °C) of 4.1 × 10⁻⁹ M, while the linear precursor of the circle binds with a Tm of 34 °C and a Kd of 1.2 × 10⁻⁶ M. The complexes of circle 2 are pH-dependent, as expected for triple helical complexes involving C⁺G.C triads, and mixing plots for both circles reveal one-to-one stoichiometry of binding either to RNA or DNA substrates. Comparison of circular RNAs with previously synthesized circular DNA oligonucleotides of the same sequence reveals similar behavior in the binding of DNA, but strikingly different behavior in the binding of RNA. The cyclic DNAs show high DNA-binding selectivity, giving relatively weaker duplex-type binding with complementary RNAs. The relative order of thermodynamic stability for the four types of triplex studied here is found to be DDD > RRR > RDR > DRD. The results are discussed in the context of recent reports of strong triplex dependence on RNA versus DNA backbones.
Triplex

  20. Fatty acid mobilization and comparison to milk fatty acid content in northern elephant seals.

    PubMed

    Fowler, Melinda A; Debier, Cathy; Mignolet, Eric; Linard, Clementine; Crocker, Daniel E; Costa, Daniel P

    2014-01-01

    A fundamental feature of the life history of true seals, bears and baleen whales is lactation while fasting. This study examined the mobilization of fatty acids from blubber and their subsequent partitioning into maternal metabolism and milk production in northern elephant seals (Mirounga angustirostris). The fatty acid composition of blubber and milk was measured in both early and late lactation. Proportions of fatty acids in milk and blubber were found to display a high degree of similarity both early and late in lactation. Seals mobilized an enormous amount of lipid (~66 kg in 17 days), but thermoregulatory fatty acids, those that remain fluid at low temperatures, were relatively conserved in the outer blubber layer. Despite the stratification, the pattern of mobilization of specific fatty acids conforms to biochemical predictions. Long chain (>20C) monounsaturated fatty acids (MUFAs) were the least mobilized from blubber and the only class of fatty acids that showed a proportional increase in milk in late lactation. Polyunsaturated fatty acids (PUFAs) and saturated fatty acids (SFAs) were more mobilized from the blubber, but neither proportion increased in milk at late lactation. These data suggest that of the long chain MUFA mobilized, the majority is directed to milk synthesis. The mother may preferentially use PUFA and SFA for her own metabolism, decreasing the availability for deposition into milk. The potential impacts of milk fatty acid delivery on pup diving development and thermoregulation are exciting avenues for exploration. PMID:24126964

  1. Full Genome Virus Detection in Fecal Samples Using Sensitive Nucleic Acid Preparation, Deep Sequencing, and a Novel Iterative Sequence Classification Algorithm

    PubMed Central

    Cotten, Matthew; Oude Munnink, Bas; Canuti, Marta; Deijs, Martin; Watson, Simon J.; Kellam, Paul; van der Hoek, Lia

    2014-01-01

    We have developed a full genome virus detection process that combines sensitive nucleic acid preparation optimised for virus identification in fecal material with Illumina MiSeq sequencing and a novel post-sequencing virus identification algorithm. Enriched viral nucleic acid was converted to double-stranded DNA and subjected to Illumina MiSeq sequencing. The resulting short reads were processed with a novel iterative Python algorithm, SLIM, for the identification of sequences with homology to known viruses. De novo assembly was then used to generate full viral genomes. The sensitivity of this process was demonstrated with a set of fecal samples from HIV-1 infected patients. A quantitative assessment of the mammalian, plant, and bacterial virus content of this compartment was generated, and the deep sequencing data were sufficient to assemble 12 complete viral genomes from 6 virus families. The method detected high levels of enteropathic viruses that are normally controlled in healthy adults but may be involved in the pathogenesis of HIV-1 infection, and it will provide a powerful tool for virus detection and for analyzing changes in the fecal virome associated with HIV-1 progression and pathogenesis. PMID:24695106

  2. A Comparison between Transcriptome Sequencing and 16S Metagenomics for Detection of Bacterial Pathogens in Wildlife

    PubMed Central

    Razzauti, Maria; Galan, Maxime; Bernard, Maria; Maman, Sarah; Klopp, Christophe; Charbonnel, Nathalie; Vayssier-Taussat, Muriel; Eloit, Marc; Cosson, Jean-François

    2015-01-01

    Background Rodents are major reservoirs of pathogens responsible for numerous zoonotic diseases in humans and livestock. Assessing their microbial diversity at both the individual and population level is crucial for monitoring endemic infections and revealing microbial association patterns within reservoirs. Recently, NGS approaches have been employed to characterize microbial communities of different ecosystems. Yet, their relative efficacy has not been assessed. Here, we compared two NGS approaches, RNA-Sequencing (RNA-Seq) and 16S-metagenomics, assessing their ability to survey neglected zoonotic bacteria in rodent populations. Methodology/Principal Findings We first extracted nucleic acids from the spleens of 190 voles collected in France. RNA extracts were pooled, randomly retro-transcribed, then RNA-Seq was performed using HiSeq. Assembled bacterial sequences were assigned to the closest taxon registered in GenBank. DNA extracts were analyzed via a 16S-metagenomics approach using two sequencers: the 454 GS-FLX and the MiSeq. The V4 region of the gene coding for 16S rRNA was amplified for each sample using barcoded universal primers. Amplicons were multiplexed and processed on the distinct sequencers. The resulting datasets were de-multiplexed, and each read was processed through a pipeline to be taxonomically classified using the Ribosomal Database Project. Altogether, 45 pathogenic bacterial genera were detected. The bacteria identified by RNA-Seq were comparable to those detected by 16S-metagenomics approach processed with MiSeq (16S-MiSeq). In contrast, 21 of these pathogens went unnoticed when the 16S-metagenomics approach was processed via 454-pyrosequencing (16S-454). In addition, the 16S-metagenomics approaches revealed a high level of coinfection in bank voles. Conclusions/Significance We concluded that RNA-Seq and 16S-MiSeq are equally sensitive in detecting bacteria. Although only the 16S-MiSeq method enabled identification of bacteria in each

  3. Evolutionary connections of biological kingdoms based on protein and nucleic acid sequence evidence

    NASA Technical Reports Server (NTRS)

    Dayhoff, M. O.

    1983-01-01

    Prokaryotic and eukaryotic evolutionary trees are developed from protein and nucleic-acid sequences by the methods of numerical taxonomy. Trees are presented for bacterial ferredoxins, 5S ribosomal RNA, c-type cytochromes, cytochromes c2 and c', and 5.8S ribosomal RNA; the implications for early evolution are discussed; and a composite tree showing the branching of the anaerobes, aerobes, archaebacteria, and eukaryotes is shown. Single lines are found for all oxygen-evolving photosynthetic forms and for the salt-loving and high-temperature forms of archaebacteria. It is argued that the eukaryote mitochondria, chloroplasts, and cytoplasmic host material are descended from free-living prokaryotes that formed symbiotic associations, with more than one symbiotic event involved in the evolution of each organelle.

  4. Alginic acid decreases postprandial upright gastroesophageal reflux. Comparison with equal-strength antacid.

    PubMed

    Castell, D O; Dalton, C B; Becker, D; Sinclair, J; Castell, J A

    1992-04-01

    This study tested the hypothesis that (alginic) acid may have a preferential effect on reflux in the upright position. We evaluated the effect of a compound containing alginic acid plus antacid (extra-strength Gaviscon) versus active control antacid with equal acid-neutralizing capacity on intraesophageal acid exposure following a high-fat meal (61% fat: sausage, egg, and biscuit). In random sequence, each of the 10 volunteers received either alginic acid-antacid or control antacid immediately following and 1, 2, and 3 hr after the meal. The sequence was repeated for both test drugs in the supine and upright positions with constant pH monitoring. Alginic acid-antacid significantly decreased postprandial reflux in the upright position compared to an equal amount of antacid. This effect did not occur in the supine position. These findings support the hypothesis that alginic acid is primarily effective in the upright position and the clinical observations of the effectiveness of alginic acid on daytime reflux symptoms. PMID:1551350

  5. The amino acid alphabet and the architecture of the protein sequence-structure map. I. Binary alphabets.

    PubMed

    Ferrada, Evandro

    2014-12-01

    The correspondence between protein sequences and structures, or sequence-structure map, relates to fundamental aspects of structural, evolutionary and synthetic biology. The specifics of the mapping, such as the fraction of accessible sequences and structures, or the sequences' ability to fold fast, are dictated by the type of interactions between the monomers that compose the sequences. The set of possible interactions between monomers is encapsulated by the potential energy function. In this study, I explore the impact of the relative forces of the potential on the architecture of the sequence-structure map. My observations rely on simple exact models of proteins and random samples of the space of potential energy functions of binary alphabets. I adopt a graph perspective and study the distribution of viable sequences and the structures they produce, as networks of sequences connected by point mutations. I observe that the relative proportion of attractive, neutral and repulsive forces defines types of potentials, that induce sequence-structure maps of vastly different architectures. I characterize the properties underlying these differences and relate them to the structure of the potential. Among these properties are the expected number and relative distribution of sequences associated to specific structures and the diversity of structures as a function of sequence divergence. I study the types of binary potentials observed in natural amino acids and show that there is a strong bias towards only some types of potentials, a bias that seems to characterize the folding code of natural proteins. I discuss implications of these observations for the architecture of the sequence-structure map of natural proteins, the construction of random libraries of peptides, and the early evolution of the natural amino acid alphabet. PMID:25473967

  6. The Amino Acid Alphabet and the Architecture of the Protein Sequence-Structure Map. I. Binary Alphabets

    PubMed Central

    Ferrada, Evandro

    2014-01-01

    The correspondence between protein sequences and structures, or sequence-structure map, relates to fundamental aspects of structural, evolutionary and synthetic biology. The specifics of the mapping, such as the fraction of accessible sequences and structures, or the sequences' ability to fold fast, are dictated by the type of interactions between the monomers that compose the sequences. The set of possible interactions between monomers is encapsulated by the potential energy function. In this study, I explore the impact of the relative forces of the potential on the architecture of the sequence-structure map. My observations rely on simple exact models of proteins and random samples of the space of potential energy functions of binary alphabets. I adopt a graph perspective and study the distribution of viable sequences and the structures they produce, as networks of sequences connected by point mutations. I observe that the relative proportion of attractive, neutral and repulsive forces defines types of potentials, that induce sequence-structure maps of vastly different architectures. I characterize the properties underlying these differences and relate them to the structure of the potential. Among these properties are the expected number and relative distribution of sequences associated to specific structures and the diversity of structures as a function of sequence divergence. I study the types of binary potentials observed in natural amino acids and show that there is a strong bias towards only some types of potentials, a bias that seems to characterize the folding code of natural proteins. I discuss implications of these observations for the architecture of the sequence-structure map of natural proteins, the construction of random libraries of peptides, and the early evolution of the natural amino acid alphabet. PMID:25473967

  7. Trypsin inhibitors from ridged gourd (Luffa acutangula Linn.) seeds: purification, properties, and amino acid sequences.

    PubMed

    Haldar, U C; Saha, S K; Beavis, R C; Sinha, N K

    1996-02-01

    Two trypsin inhibitors, LA-1 and LA-2, have been isolated from ridged gourd (Luffa acutangula Linn.) seeds and purified to homogeneity by gel filtration followed by ion-exchange chromatography. The isoelectric point is at pH 4.55 for LA-1 and at pH 5.85 for LA-2. The Stokes radius of each inhibitor is 11.4 Å. The fluorescence emission spectrum of each inhibitor is similar to that of free tyrosine. The bimolecular rate constant of acrylamide quenching is 1.0 x 10(9) M-1 sec-1 for LA-1 and 0.8 x 10(9) M-1 sec-1 for LA-2 and that of K2HPO4 quenching is 1.6 x 10(11) M-1 sec-1 for LA-1 and 1.2 x 10(11) M-1 sec-1 for LA-2. Analysis of the circular dichroic spectra yields 40% alpha-helix and 60% beta-turn for LA-1 and 45% alpha-helix and 55% beta-turn for LA-2. Inhibitors LA-1 and LA-2 consist of 28 and 29 amino acid residues, respectively. They lack threonine, alanine, valine, and tryptophan. Both inhibitors strongly inhibit trypsin by forming enzyme-inhibitor complexes at a molar ratio of unity. A chemical modification study suggests the involvement of arginine of LA-1 and lysine of LA-2 in their reactive sites. The inhibitors are very similar in their amino acid sequences, and show sequence homology with other squash family inhibitors. PMID:8924202

  8. Microfluidic platform for isolating nucleic acid targets using sequence specific hybridization

    PubMed Central

    Wang, Jingjing; Morabito, Kenneth; Tang, Jay X.; Tripathi, Anubhav

    2013-01-01

    The separation of target nucleic acid sequences from biological samples has emerged as a significant process in today's diagnostics and detection strategies. In addition to the possible clinical applications, the fundamental understanding of target and sequence specific hybridization on surface modified magnetic beads is of high value. In this paper, we describe a novel microfluidic platform that utilizes a mobile magnetic field in static microfluidic channels, where single stranded DNA (ssDNA) molecules are isolated via nucleic acid hybridization. We first established efficient isolation of biotinylated capture probe (BP) using streptavidin-coated magnetic beads. Subsequently, we investigated the hybridization of target ssDNA with BP bound to beads and explained these hybridization kinetics using a dual-species kinetic model. The number of hybridized target ssDNA molecules was determined to be about 6.5 times less than that of BP on the bead surface, due to steric hindrance effects. The hybridization of target ssDNA with non-complementary BP bound to bead was also examined, and non-specific hybridization was found to be insignificant. Finally, we demonstrated highly efficient capture and isolation of target ssDNA in the presence of non-target ssDNA, where as low as 1% target ssDNA can be detected from mixture. The microfluidic method described in this paper is significantly relevant and is broadly applicable, especially towards point-of-care biological diagnostic platforms that require binding and separation of known target biomolecules, such as RNA, ssDNA, or protein. PMID:24404041

  9. The Short ITS2 Sequence Serves as an Efficient Taxonomic Sequence Tag in Comparison with the Full-Length ITS

    PubMed Central

    Han, Jianping; Zhu, Yingjie; Chen, Xiaochen; Liao, Baoshen; Yao, Hui; Song, Jingyuan; Chen, Shilin; Meng, Fanyun

    2013-01-01

    An ideal DNA barcoding region should be short enough to be amplified from degraded DNA. In this paper, we discuss the possibility of using a short nuclear DNA sequence as a barcode to identify a wide range of medicinal plant species. First, the PCR and sequencing success rates of ITS and ITS2 were evaluated based entirely on materials from dry medicinal products and herbarium voucher specimens, including some samples collected up to 90 years ago. The results showed that, using a single pair of primers, the PCR and sequencing success rate was 91% for ITS2 but only 23% for ITS. Second, 12861 ITS and ITS2 plant sequences were used to compare the identification efficiency of the two regions. Four identification criteria (BLAST, inter- and intradivergence Wilcoxon signed rank tests, and TaxonDNA) were evaluated. Our results supported the hypothesis that ITS2 can be used as a minibarcode to effectively identify species in a wide variety of specimens and medicinal materials. PMID:23484151

  10. Comparison of the Legionella pneumophila population structure as determined by sequence-based typing and whole genome sequencing

    PubMed Central

    2013-01-01

    Background Legionella pneumophila is an opportunistic pathogen of humans where the source of infection is usually from contaminated man-made water systems. When an outbreak of Legionnaires’ disease caused by L. pneumophila occurs, it is necessary to discover the source of infection. A seven allele sequence-based typing scheme (SBT) has been very successful in providing the means to attribute outbreaks of L. pneumophila to a particular source or sources. Particular sequence types described by this scheme are known to exhibit specific phenotypes. For instance some types are seen often in clinical cases but are rarely isolated from the environment and vice versa. Of those causing human disease some types are thought to be more likely to cause more severe disease. It is possible that the genetic basis for these differences is vertically inherited and associated with particular genetic lineages within the population. In order to provide a framework within which to test this hypothesis and others relating to the population biology of L. pneumophila, a set of genomes covering the known diversity of the organism is required. Results Firstly, this study describes a means to group L. pneumophila strains into pragmatic clusters, using a methodology that takes into consideration the genetic forces operating on the population. These clusters can be used as a standardised nomenclature, so those wishing to describe a group of strains can do so. Secondly, the clusters generated from the first part of the study were used to select strains rationally for whole genome sequencing (WGS). The data generated was used to compare phylogenies derived from SBT and WGS. In general the SBT sequence type (ST) accurately reflects the whole genome-based genotype. Where there are exceptions and recombination has resulted in the ST no longer reflecting the genetic lineage described by the whole genome sequence, the clustering technique employed detects these sequence types as being admixed

  11. Comparison of acid anhydrides with carboxylic acids in enantioselective enzymatic esterification of racemic menthol.

    PubMed

    Xu, J; Zhu, J; Kawamoto, T; Atsuo, T; Hu, Y

    1997-01-01

    Optical resolution of racemic menthol has been efficiently achieved by lipase-catalyzed enantioselective esterification in an organic solvent. The performance of the reaction using an acid anhydride as an acyl donor was compared with that using its corresponding free acid. The reactivities of acid anhydrides were found to be higher than their corresponding free acids, but acid anhydrides were also found to be easily hydrolyzed into free acids under the catalysis of the same enzyme. The existence of a too-high concentration of an acid anhydride in a micro-aqueous reaction system will cause dehydration and thus deactivation of the enzyme, and will enhance non-selective esterification of a chiral alcohol, which will reduce the optical purity of the product. All these drawbacks, however, could be effectively overcome in a semi-batch reaction system into which propionic anhydride was continuously fed. This system showed some advantages over a batch reaction system using free propionic acid: the reaction time of dl-menthol was shortened by half, the stability of the enzyme was much enhanced, and the optical purity of the product (l-menthyl ester) was kept at a similarly high level (> 98% ee). PMID:9631262

  12. Comparison of D-gluconic acid production in selected strains of acetic acid bacteria.

    PubMed

    Sainz, F; Navarro, D; Mateo, E; Torija, M J; Mas, A

    2016-04-01

    The oxidative metabolism of acetic acid bacteria (AAB) can be exploited for the production of several compounds, including D-gluconic acid. The production of D-gluconic acid in fermented beverages could be useful for the development of new products without glucose. In the present study, we analyzed nineteen strains belonging to eight different species of AAB to select those that could produce D-gluconic acid from D-glucose without consuming D-fructose. We tested their performance in three different media and analyzed the changes in the levels of D-glucose, D-fructose, D-gluconic acid and the derived gluconates. D-Glucose and D-fructose consumption and D-gluconic acid production were heavily dependent on the strain and the media. The most suitable strains for our purpose were Gluconobacter japonicus CECT 8443 and Gluconobacter oxydans Po5. The strawberry isolate Acetobacter malorum (CECT 7749) also produced D-gluconic acid; however, it further oxidized D-gluconic acid to keto-D-gluconates. PMID:26848948

  13. Characterization of N-glycosylation and amino acid sequence features of immunoglobulins from swine.

    PubMed

    Lopez, Paul G; Girard, Lauren; Buist, Marjorie; de Oliveira, Andrey Giovanni Gomes; Bodnar, Edward; Salama, Apolline; Soulillou, Jean-Paul; Perreault, Hélène

    2016-02-01

    The primary goal of this study was to develop a method to study the N-glycosylation of IgG from swine in order to detect epitopes containing N-glycolylneuraminic acid (Neu5Gc) and/or terminal galactose residues linked in α1-3 susceptible to cause xenograft-related problems. Samples of immunoglobulin were isolated from porcine serum using protein-A affinity chromatography. The eluate was then separated on electrophoretic gel, and bands corresponding to the N-glycosylated heavy chains were cut off the gel and subjected to tryptic digestion. Peptides and glycopeptides were separated by reversed phase liquid chromatography and fractions were collected for matrix-assisted laser desorption/ionization time-of-flight mass spectrometric (MALDI-TOF-MS) analysis. Overall no α1-3 galactose was detected, as demonstrated by complete susceptibility of terminal galactose residues to β-galactosidase digestion. Neu5Gc was detected on singly sialylated structures. Two major N-glycopeptides were found, EEQFNSTYR and EAQFNSTYR as determined by tandem MS (MS/MS), as previously reported by Butler et al. (Immunogenetics, 61, 2009, 209-230), who found 11 subclasses for porcine IgG. Out of the 11, ten include the sequence corresponding to EEQFNSTYR, and only one codes for EAQFNSTYR. In this study, glycosylation patterns associated with both chains were slightly different, in that EEQFNSTYR had a higher content of galactose. The last step of this study consisted of peptide-mapping the 11 reported porcine IgG sequences. Although there was considerable overlap, at least one unique tryptic peptide was found per IgG sequence. The workflow presented in this manuscript constitutes the first study to use MALDI-TOF-MS in the investigation of porcine IgG structural features. PMID:26586247

  14. Human Retroviruses and AIDS. A compilation and analysis of nucleic acid and amino acid sequences: I--II; III--V

    SciTech Connect

    Myers, G.; Korber, B.; Wain-Hobson, S.; Smith, R.F.; Pavlakis, G.N.

    1993-12-31

    This compendium and the accompanying floppy diskettes are the result of an effort to compile and rapidly publish all relevant molecular data concerning the human immunodeficiency viruses (HIV) and related retroviruses. The scope of the compendium and database is best summarized by the five parts that it comprises: (I) HIV and SIV Nucleotide Sequences; (II) Amino Acid Sequences; (III) Analyses; (IV) Related Sequences; and (V) Database Communications. Information within all the parts is updated at least twice in each year, which accounts for the modes of binding and pagination in the compendium.

  15. Microbial Analysis of Bite Marks by Sequence Comparison of Streptococcal DNA

    PubMed Central

    Kennedy, Darnell M.; Stanton, Jo-Ann L.; García, José A.; Mason, Chris; Rand, Christy J.; Kieser, Jules A.; Tompkins, Geoffrey R.

    2012-01-01

    Bite mark injuries often feature in violent crimes. Conventional morphometric methods for the forensic analysis of bite marks involve elements of subjective interpretation that threaten the credibility of this field. Human DNA recovered from bite marks has the highest evidentiary value, however recovery can be compromised by salivary components. This study assessed the feasibility of matching bacterial DNA sequences amplified from experimental bite marks to those obtained from the teeth responsible, with the aim of evaluating the capability of three genomic regions of streptococcal DNA to discriminate between participant samples. Bite mark and teeth swabs were collected from 16 participants. Bacterial DNA was extracted to provide the template for PCR primers specific for streptococcal 16S ribosomal RNA (16S rRNA) gene, 16S–23S intergenic spacer (ITS) and RNA polymerase beta subunit (rpoB). High throughput sequencing (GS FLX 454), followed by stringent quality filtering, generated reads from bite marks for comparison to those generated from teeth samples. For all three regions, the greatest overlaps of identical reads were between bite mark samples and the corresponding teeth samples. The average proportions of reads identical between bite mark and corresponding teeth samples were 0.31, 0.41 and 0.31, and for non-corresponding samples were 0.11, 0.20 and 0.016, for 16S rRNA, ITS and rpoB, respectively. The probabilities of correctly distinguishing matching and non-matching teeth samples were 0.92 for ITS, 0.99 for 16S rRNA and 1.0 for rpoB. These findings strongly support the tenet that bacterial DNA amplified from bite marks and teeth can provide corroborating information in the identification of assailants. PMID:23284761

  16. [Comparison study of enhanced coagulation on humic acid and fulvic acid removal].

    PubMed

    Zhou, Ling-ling; Zhang, Yong-ji; Ye, He-xiu; Zhang, Yi-qing

    2012-08-01

    The enhanced coagulation performance of four coagulants (aluminium sulfate, ferric chloride, aluminium polychloride and poly-ferric chloride) was examined, with an emphasis on pH, turbidity, Ca2+ and the relative contents of humic acid and fulvic acid. The results showed that the removal efficiency of all four coagulants was higher for humic acid than for fulvic acid. Compared with aluminium polychloride and poly-ferric chloride, aluminium sulfate and ferric chloride had a better coagulation effect. At a coagulant dosage of 40 mg x L(-1), ferric chloride, aluminium sulfate, poly-ferric chloride and aluminium polychloride reduced fulvic acid from 10 mg x L(-1) to 3.22 mg x L(-1), 4.34 mg x L(-1), 5.85 mg x L(-1) and 4.86 mg x L(-1) respectively, while the four coagulants reduced humic acid from 10 mg x L(-1) to 1.13 mg x L(-1), 2.13 mg x L(-1), 3.44 mg x L(-1) and 2.50 mg x L(-1) respectively in water. At pH between 5.5 and 6.5, aluminium sulfate and ferric chloride had the best coagulation effect. Coagulant efficiency decreased as the organic carbon content of the water increased. In particular, when the content ratio of fulvic acid and humic acid was above 0.4, the coagulation effect decreased markedly. Turbidity had little influence on the organic carbon removal rate. As the concentration of Ca2+ increased, the removal efficiency of humic acid and fulvic acid increased. PMID:23213890

  17. Lactic acid production from potato peel waste by anaerobic sequencing batch fermentation using undefined mixed culture.

    PubMed

    Liang, Shaobo; McDonald, Armando G; Coats, Erik R

    2015-11-01

    Lactic acid (LA) is a necessary industrial feedstock for producing the bioplastic, polylactic acid (PLA), which is currently produced by pure culture fermentation of food carbohydrates. This work presents an alternative to produce LA from potato peel waste (PPW) by anaerobic fermentation in a sequencing batch reactor (SBR) inoculated with undefined mixed culture from a municipal wastewater treatment plant. A statistical design of experiments approach was employed using a set of 0.8 L SBRs fed with gelatinized PPW at a solids content ranging from 30 to 50 g L(-1) and a solids retention time (SRT) of 2-4 days for yield and productivity optimization. The maximum LA production yield of 0.25 g g(-1) PPW and highest productivity of 125 mg g(-1) d(-1) were achieved. A scale-up SBR trial using neat gelatinized PPW (at 80 g L(-1) solids content) at the 3 L scale was employed and the highest LA yield of 0.14 g g(-1) PPW and a productivity of 138 mg g(-1) d(-1) were achieved with a 1 d SRT. PMID:25708409

  18. Bacterial community compositions in sediment polluted by perfluoroalkyl acids (PFAAs) using Illumina high-throughput sequencing.

    PubMed

    Sun, Yajun; Wang, Tieyu; Peng, Xiawei; Wang, Pei; Lu, Yonglong

    2016-06-01

    The characterization of bacterial community compositions and the change in perfluoroalkyl acids (PFAAs) along a natural river distribution system were explored in the present study. Illumina high-throughput sequencing was used to explore bacterial community diversity and structure in sediment polluted by PFAAs from the Xiaoqing River, the area with concentrated fluorochemical facilities in China. The concentration of PFAAs was in the range of 8.44-465.60 ng/g dry weight (dw) in sediment. Perfluorooctanoic acid (PFOA) was the dominant PFAA in all samples, which accounted for 94.2 % of total PFAAs. High-level PFOA could lead to an obvious increase in relative abundance of Proteobacteria, ε-Proteobacteria, Thiobacillus, and Sulfurimonas and the decrease in relative abundance of other bacteria. Redundancy analysis revealed that PFOA played an important role in the formation of bacterial community, and PFOA at higher concentration could reduce the diversity of bacterial community. When the concentration of PFOA was below 100 ng/g dw in sediment, no significant effect on microbial community structure was observed. Thiobacillus and Sulfurimonas were positively correlated with the concentration of PFOA, suggesting that both genera were resistant to PFOA contamination. PMID:26780047

  19. Mass spectrometric detection of the amino acid sequence polymorphism of the hepatitis C virus antigen.

    PubMed

    Kaysheva, A L; Ivanov, Yu D; Frantsuzov, P A; Krohin, N V; Pavlova, T I; Uchaikin, V F; Konev, V A; Kovalev, O B; Ziborov, V S; Archakov, A I

    2016-03-01

    A method for detection and identification of the hepatitis C virus antigen (HCVcoreAg) in human serum with consideration for possible amino acid substitutions is proposed. The method is based on a combination of biospecific capturing and concentrating of the target protein on the surface of the chip for atomic force microscope (AFM chip) with subsequent protein identification by tandem mass spectrometric (MS/MS) analysis. Biospecific AFM-capturing of viral particles containing HCVcoreAg from serum samples was performed using AFM chips with monoclonal antibodies (anti-HCVcore) covalently immobilized on the surface. Biospecific complexes were registered and counted by AFM. Further MS/MS analysis allowed reliable identification of the HCVcoreAg in the complexes formed on the AFM chip surface. Analysis of MS/MS spectra, taking into account the possible polymorphisms in the amino acid sequence of the HCVcoreAg, enabled us to increase the number of identified peptides. PMID:26773170

  20. Peptide sequencing by using a combination of partial acid hydrolysis and fast-atom-bombardment mass spectrometry.

    PubMed Central

    De Angelis, F; Botta, M; Ceccarelli, S; Nicoletti, R

    1986-01-01

    To overcome the limit of the intensity of ions carrying sequence information in structural determinations of peptides by fast-atom-bombardment m.s., we have developed a method that consists in taking spectra of the peptide acid hydrolysates at different hydrolysis times. Peaks correspond to the oligomers arising from the peptide partial hydrolysis. The sequence can then be identified from the structurally overlapping fragments. PMID:2428356

  1. Canine preprorelaxin: nucleic acid sequence and localization within the canine placenta.

    PubMed

    Klonisch, T; Hombach-Klonisch, S; Froehlich, C; Kauffold, J; Steger, K; Steinetz, B G; Fischer, B

    1999-03-01

    Employing uteroplacental tissue at Day 35 of gestation, we determined the nucleic acid sequence of canine preprorelaxin using reverse transcription- and rapid amplification of cDNA ends-polymerase chain reaction. Canine preprorelaxin cDNA consisted of 534 base pairs encoding a protein of 177 amino acids with a signal peptide of 25 amino acids (aa), a B domain of 35 aa, a C domain of 93 aa, and an A domain of 24 aa. The putative receptor binding region in the N'-terminal part of the canine relaxin B domain GRDYVR contained two substitutions from the classical motif (E-->D and L-->Y). Canine preprorelaxin shared highest homology with porcine and equine preprorelaxin. Northern analysis revealed a 1-kilobase transcript present in total RNA of canine uteroplacental tissue but not of kidney tissue. Uteroplacental tissue from two bitches each at Days 30 and 35 of gestation were studied by in situ hybridization to localize relaxin mRNA. Immunohistochemistry for relaxin, cytokeratin, vimentin, and von Willebrand factor was performed on uteroplacental tissue at Day 30 of gestation. The basal cell layer at the core of the chorionic villi was devoid of relaxin mRNA and immunoreactive relaxin or vimentin but was immunopositive for cytokeratin and identified as cytotrophoblast cells. The cell layer surrounding the chorionic villi displayed specific hybridization signals for relaxin mRNA and immunoreactivity for relaxin and cytokeratin but not for vimentin, and was identified as syncytiotrophoblast. Those areas of the chorioallantoic tissue with most intense relaxin immunoreactivity were highly vascularized as demonstrated by immunoreactive von Willebrand factor expressed on vascular endothelium. The uterine glands and nonplacental uterine areas of the canine zonary girdle placenta were devoid of relaxin mRNA and relaxin. We conclude that the syncytiotrophoblast is the source of relaxin in the canine placenta. PMID:10026098

  2. Purification and partial amino acid sequence of the chloroplast cytochrome b-559.

    PubMed

    Widger, W R; Cramer, W A; Hermodson, M; Meyer, D; Gullifor, M

    1984-03-25

    The hydrophobic cytochrome b-559, purified from unstacked, ethanol-washed spinach thylakoid membranes, using extraction with 2% Triton X-100 in 4 M urea and three chromatographic steps in the presence of protease inhibitors, has a dominant band on sodium dodecyl sulfate-urea gels corresponding to Mr = 10,000. The yield of this preparation is 30-50% (5-10 mg) starting with 600 mg of chlorophyll. The heme content yields a calculated molecular weight of no more than 17,500/heme, and perhaps somewhat smaller after correction for impurities. The Mr = 10,000 band is stained by the tetramethylbenzidine-H2O2 heme reagent on lithium dodecyl sulfate gels run at 0 degrees C. The Mr = 10,000 protein, further separated by high performance liquid chromatography, contains a unique NH2 terminus that is not blocked, and the amino acid sequence for the first 27 residues is NH2-Ser-Gly-Ser-Thr-Gly-Glu-Arg-Ser-Phe-Ala-Asp-Ile-Ile-Thr-Ser-Ile-Arg-Tyr-Trp-Val-Ile-X-Ser-Ile-Thr-Ile-Pro. . . COOH. Approximately 55% of the amino acids are hydrophobic, based on amino acid analysis of the Mr = 10,000 peptide, which also indicated the presence of at least one histidine. Only one cytochrome b-559 component could be identified, whose yield indicated that it arises from a single b-559 protein in chloroplasts corresponding to the in situ high potential cytochrome of the chloroplast photosystem II. PMID:6706983

  3. Sequence-Specific Electrical Purification of Nucleic Acids with Nanoporous Gold Electrodes.

    PubMed

    Daggumati, Pallavi; Appelt, Sandra; Matharu, Zimple; Marco, Maria L; Seker, Erkin

    2016-06-22

    Nucleic-acid-based biosensors have enabled rapid and sensitive detection of pathogenic targets; however, these devices often require purified nucleic acids for analysis since the constituents of complex biological fluids adversely affect sensor performance. This purification step is typically performed outside the device, thereby increasing sample-to-answer time and introducing contaminants. We report a novel approach using a multifunctional matrix, nanoporous gold (np-Au), which enables both detection of specific target sequences in a complex biological sample and their subsequent purification. The np-Au electrodes modified with 26-mer DNA probes (via thiol-gold chemistry) enabled sensitive detection and capture of complementary DNA targets in the presence of complex media (fetal bovine serum) and other interfering DNA fragments in the range of 50-1500 base pairs. Upon capture, the noncomplementary DNA fragments and serum constituents of varying sizes were washed away. Finally, the surface-bound DNA-DNA hybrids were released by electrochemically cleaving the thiol-gold linkage, and the hybrids were iontophoretically eluted from the nanoporous matrix. The optical and electrophoretic characterization of the analytes before and after the detection-purification process revealed that low target DNA concentrations (80 pg/μL) can be successfully detected in complex biological fluids and subsequently released to yield pure hybrids free of polydisperse digested DNA fragments and serum biomolecules. Taken together, this multifunctional platform is expected to enable seamless integration of detection and purification of nucleic acid biomarkers of pathogens and diseases in miniaturized diagnostic devices. PMID:27244455

  4. Comparison of whole mitochondrial genome sequences from two clades of the invasive ascidian, Didemnum vexillum.

    PubMed

    Smith, Kirsty F; Abbott, Cathryn L; Saito, Yasunori; Fidler, Andrew E

    2015-02-01

    The mitochondria are the main source of cellular energy production and have an important role in development, fertility, and thermal limitations. Adaptive mitochondrial DNA mutations have the potential to be of great importance in determining aspects of the life history of an organism. Phylogenetic analyses of the globally invasive marine ascidian Didemnum vexillum using the mitochondrial cytochrome c oxidase 1 (COX1) coding region, revealed two distinct clades. Representatives of one clade (denoted by 'B') are geographically restricted to D. vexillum's native region (north-west Pacific Ocean, including Japan), whereas members of the other clade (denoted by 'A') have been introduced and become invasive in temperate coastal areas around the world. Persistence of clade B's restricted distribution may reflect it being inherently less invasive than clade A. To investigate this we sought to determine if the two clades differ significantly in other mitochondrial genes of functional significance, specifically, alterations in amino acids encoded in mitochondrial enzyme subunits. Differences in functional mitochondrial genes could indicate an increased ability for clade A colonies to tolerate a wider range of environmental temperature. Full mitochondrial genomic sequences from D. vexillum clades A and B were obtained and they predict significant sequence differences in genes encoding for enzymes involved in oxidative phosphorylation. Diversity levels were relatively high and showed divergence across almost all genes, with p-distance values between the two clades indicating recent divergence. Both clades showed an excess of rare variants, which is consistent with balancing selection or a recent population expansion. Results presented here will inform future research focusing on examining the functional properties of the corresponding mitochondrial respiration enzymes, of A and B clade enzymes. 
    By comparing closely related taxa that have differing distributions it is possible

  5. A structural and functional comparison of nematode and crustacean PDH-like sequences.

    PubMed

    Meelkop, E; Marco, H G; Janssen, T; Temmerman, L; Vanhove, M P M; Schoofs, L

    2012-03-01

    The elucidation of the whole genome of the nematode Caenorhabditis elegans allowed for the identification of ortholog genes belonging to the pigment dispersing hormone/factor (PDH/PDF) peptide family. Members of this peptide family are known from crustaceans, insects and nematodes and seem to exist exclusively in ecdysozoans where they play a role in different processes, ranging from the dispersion of integumental and eye (retinal) pigments in decapod crustaceans to circadian rhythms in insects and locomotion in C. elegans. Two pdf genes (pdf-1 and pdf-2) encoding three different peptides: PDF-1a, PDF-1b and PDF-2 have been identified in C. elegans. These three C. elegans PDH-like peptides are similar but not identical in primary structure to PDHs from decapod crustaceans. We investigate whether this divergence has an influence on the pigment dispersing function of the peptides in a decapod crustacean, namely the shrimp Palaemon pacificus. We show that C. elegans PDF-1a and b peptides display cross-functional activity by dispersing pigments in the epithelium of P. pacificus at physiological doses. Moreover, by means of a comparative amino acid sequence analysis of nematode and crustacean PDH-like peptides, we can pinpoint several potentially important residues for eliciting pigment dispersing activity in decapod crustaceans. Although there is no sequence information on a receptor for PDH in decapod crustaceans, we postulate that there is general conservation of the PDH/PDF signaling system based on structural similarities of precursor proteins and receptors (including those from a branchiopod crustacean and from C. elegans). PMID:22115566

  6. Negative Ion In-Source Decay Matrix-Assisted Laser Desorption/Ionization Mass Spectrometry for Sequencing Acidic Peptides

    NASA Astrophysics Data System (ADS)

    McMillen, Chelsea L.; Wright, Patience M.; Cassady, Carolyn J.

    2016-05-01

    Matrix-assisted laser desorption/ionization (MALDI) in-source decay was studied in the negative ion mode on deprotonated peptides to determine its usefulness for obtaining extensive sequence information for acidic peptides. Eight biological acidic peptides, ranging in size from 11 to 33 residues, were studied by negative ion mode ISD (nISD). The matrices 2,5-dihydroxybenzoic acid, 2-aminobenzoic acid, 2-aminobenzamide, 1,5-diaminonaphthalene, 5-amino-1-naphthol, 3-aminoquinoline, and 9-aminoacridine were used with each peptide. Optimal fragmentation was produced with 1,5-diaminonaphthalene (DAN), and extensive sequence informative fragmentation was observed for every peptide except hirudin(54-65). Cleavage at the N-Cα bond of the peptide backbone, producing c' and z' ions, was dominant for all peptides. Cleavage of the N-Cα bond N-terminal to proline residues was not observed. The formation of c and z ions is also found in electron transfer dissociation (ETD), electron capture dissociation (ECD), and positive ion mode ISD, which are considered to be radical-driven techniques. Oxidized insulin chain A, which has four highly acidic oxidized cysteine residues, had less extensive fragmentation. This peptide also exhibited the only charged localized fragmentation, with more pronounced product ion formation adjacent to the highly acidic residues. In addition, spectra were obtained by positive ion mode ISD for each protonated peptide; more sequence informative fragmentation was observed via nISD for all peptides. Three of the peptides studied had no product ion formation in ISD, but extensive sequence informative fragmentation was found in their nISD spectra. The results of this study indicate that nISD can be used to readily obtain sequence information for acidic peptides.

  7. Comparison of phenotypic and molecular tests to identify lactic acid bacteria

    PubMed Central

    Moraes, Paula Mendonça; Perin, Luana Martins; Júnior, Abelardo Silva; Nero, Luís Augusto

    2013-01-01

    Twenty-nine lactic acid bacteria (LAB) isolates were submitted for identification using Biolog, API50CHL, 16S rDNA sequencing, and species-specific PCR reactions. The identification results were compared, and it was concluded that a polyphasic approach is necessary for proper LAB identification, with the molecular analyses being the most reliable. PMID:24159291

  8. Molecular cloning and sequence analysis of the Sta58 major antigen gene of Rickettsia tsutsugamushi: sequence homology and antigenic comparison of Sta58 to the 60-kilodalton family of stress proteins.

    PubMed Central

    Stover, C K; Marana, D P; Dasch, G A; Oaks, E V

    1990-01-01

    The scrub typhus 58-kilodalton (kDa) antigen (Sta58) of Rickettsia tsutsugamushi is a major protein antigen often recognized by humans infected with scrub typhus rickettsiae. A 2.9-kilobase HindIII fragment containing a complete sta58 gene was cloned in Escherichia coli and found to express the entire Sta58 antigen and a smaller protein with an apparent molecular mass of 11 kDa (Stp11). DNA sequence analysis of the 2.9-kilobase HindIII fragment revealed two adjacent open reading frames encoding proteins of 11 (Stp11) and 60 (Sta58) kDa. Comparisons of deduced amino acid sequences disclosed a high degree of homology between the R. tsutsugamushi proteins Stp11 and Sta58 and the E. coli proteins GroES and GroEL, respectively, and the family of primordial heat shock proteins designated Hsp10 and Hsp60. Although the sequence homology between the Sta58 antigen and the Hsp60 protein family is striking, the Sta58 protein appeared to be antigenically distinct among a sample of other bacterial Hsp60 homologs, including the typhus group of rickettsiae. The antigenic uniqueness of the Sta58 antigen indicates that this protein may be a potentially protective antigen and a useful diagnostic reagent for scrub typhus fever. PMID:2108930

  9. Complete genome sequence of the hyperthermophilic archaeon Thermococcus kodakaraensis KOD1 and comparison with Pyrococcus genomes

    PubMed Central

    Fukui, Toshiaki; Atomi, Haruyuki; Kanai, Tamotsu; Matsumi, Rie; Fujiwara, Shinsuke; Imanaka, Tadayuki

    2005-01-01

    The genus Thermococcus, comprised of sulfur-reducing hyperthermophilic archaea, belongs to the order Thermococcales in Euryarchaeota along with the closely related genus Pyrococcus. The members of Thermococcus are ubiquitously present in natural high-temperature environments, and are therefore considered to play a major role in the ecology and metabolic activity of microbial consortia within hot-water ecosystems. To obtain insight into this important genus, we have determined and annotated the complete 2,088,737-base genome of Thermococcus kodakaraensis strain KOD1, followed by a comparison with the three complete genomes of Pyrococcus spp. A total of 2306 coding DNA sequences (CDSs) have been identified, among which half (1165 CDSs) are annotatable, whereas the functions of 41% (936 CDSs) cannot be predicted from the primary structures. The genome contains seven genes for probable transposases and four virus-related regions. Several proteins within these genetic elements show high similarities to those in Pyrococcus spp., implying the natural occurrence of horizontal gene transfer of such mobile elements among the order Thermococcales. Comparative genomics clarified that 1204 proteins, including those for information processing and basic metabolisms, are shared among T. kodakaraensis and the three Pyrococcus spp. On the other hand, among the set of 689 proteins unique to T. kodakaraensis, there are several intriguing proteins that might be responsible for the specific trait of the genus Thermococcus, such as proteins involved in additional pyruvate oxidation, nucleotide metabolisms, unique or additional metal ion transporters, improved stress response system, and a distinct restriction system. PMID:15710748

  10. Removal of typical endocrine disrupting chemicals by membrane bioreactor: in comparison with sequencing batch reactor.

    PubMed

    Zhou, Yingjun; Huang, Xia; Zhou, Haidong; Chen, Jianhua; Xue, Wenchao

    2011-01-01

    The removal of endocrine disrupting chemicals (EDCs) by a laboratory-scale membrane bioreactor (MBR) fed with synthetic sewage was evaluated and compared with that by a sequencing batch reactor (SBR) operated under the same conditions in parallel. Eight kinds of typical EDCs, including 17β-estradiol (E2), estrone (E1), estriol (E3), 17α-ethinylestradiol (EE2), 4-octylphenol (4-OP), 4-nonylphenol (4-NP), bisphenol A (BPA) and nonylphenol ethoxylates (NPnEO), were spiked into the feed. Their concentrations in influent, effluent and supernatant were determined by gas chromatography-mass spectrometry. The overall estrogenicity was evaluated as 17β-estradiol equivalent quantity (EEQ), determined via yeast estrogen screen (YES) assay. E2, E3, BPA and 4-OP were well removed by both MBR and SBR, with removal rates of more than 95% and no significant differences between the two reactors. However, for the other four EDCs, whose removal rates were lower, MBR performed better. Comparison between supernatant and effluent of the two reactors indicated that membrane separation of sludge and effluent, compared with sedimentation, can improve elimination of target EDCs and total estrogenicity. By applying different solids retention times (SRTs) (5, 10, 20 and 40 d) to the MBR, 10 and 5 d were found to be the lower critical SRTs for efficient target EDCs and EEQ removal, respectively. PMID:22105134

  11. A comparison between equations describing in vivo MT: The effects of noise and sequence parameters

    NASA Astrophysics Data System (ADS)

    Cercignani, Mara; Barker, Gareth J.

    2008-04-01

    Quantitative models of magnetization transfer (MT) allow the estimation of physical properties of tissue which are thought to reflect myelination, and are therefore likely to be useful for clinical application. Although a model describing a two-pool system under continuous wave-saturation has been available for two decades, generalizing such a model to pulsed MT, and therefore to in vivo applications, is not straightforward, and only recently have a range of equations predicting the outcome of pulsed MT experiments been proposed. These solutions of the 2-pool model are based on differing assumptions and involve differing degrees of complexity, so their individual advantages and limitations are not always obvious. This paper is concerned with the comparison of three differing signal equations. After reviewing the theory behind each of them, their accuracy and precision is investigated using numerical simulations under variable experimental conditions such as degree of T1-weighting of the acquisition sequence and SNR, and the consistency of numerical results is tested using in vivo data. We show that while in conditions of minimal T1-weighting, high SNR, and large duty cycle the solutions of the three equations are consistent, they have a different tolerance to deviations from the basic assumptions behind their development, which should be taken into account when designing a quantitative MT protocol.

  12. Identification of Simple Sequence Repeat Biomarkers through Cross-Species Comparison in a Tag Cloud Representation

    PubMed Central

    2014-01-01

    Simple sequence repeats (SSRs) are not only applied as genetic markers in evolutionary studies but they also play an important role in gene regulatory activities. Efficient identification of conserved and exclusive SSRs through cross-species comparison is helpful for understanding the evolutionary mechanisms and associations between specific gene groups and SSR motifs. In this paper, we developed an online cross-species comparative system and integrated it with a tag cloud visualization technique for identifying potential SSR biomarkers within fourteen frequently used model species. Ultraconserved or exclusive SSRs among cross-species orthologous genes could be effectively retrieved and displayed through a friendly interface design. Four different types of testing cases were applied to demonstrate and verify the retrieved SSR biomarker candidates. Through statistical analysis and enhanced tag cloud representation on defined functional related genes and cross-species clusters, the proposed system can correctly represent the patterns, loci, colors, and sizes of identified SSRs in accordance with gene functions, pattern qualities, and conserved characteristics among species. PMID:24800246

  13. Trading accuracy for speed: A quantitative comparison of search algorithms in protein sequence design.

    PubMed

    Voigt, C A; Gordon, D B; Mayo, S L

    2000-06-01

    Finding the minimum energy amino acid side-chain conformation is a fundamental problem in both homology modeling and protein design. To address this issue, numerous computational algorithms have been proposed. However, there have been few quantitative comparisons between methods and there is very little general understanding of the types of problems that are appropriate for each algorithm. Here, we study four common search techniques: Monte Carlo (MC) and Monte Carlo plus quench (MCQ); genetic algorithms (GA); self-consistent mean field (SCMF); and dead-end elimination (DEE). Both SCMF and DEE are deterministic, and if DEE converges, it is guaranteed that its solution is the global minimum energy conformation (GMEC). This provides a means to compare the accuracy of SCMF and the stochastic methods. For the side-chain placement calculations, we find that DEE rapidly converges to the GMEC in all the test cases. The other algorithms converge on significantly incorrect solutions; the average fraction of incorrect rotamers for SCMF is 0.12, GA 0.09, and MCQ 0.05. For the protein design calculations, design positions are progressively added to the side-chain placement calculation until the time required for DEE diverges sharply. As the complexity of the problem increases, the accuracy of each method is determined so that the results can be extrapolated into the region where DEE is no longer tractable. We find that both SCMF and MCQ perform reasonably well on core calculations (fraction amino acids incorrect is SCMF 0.07, MCQ 0.04), but fail considerably on the boundary (SCMF 0.28, MCQ 0.32) and surface calculations (SCMF 0.37, MCQ 0.44). PMID:10835284
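    The comparison described above, a stochastic search checked against a guaranteed global minimum, can be illustrated on a toy problem. The sketch below is not the authors' implementation: the energy tables are random numbers, the function names (make_toy_problem, monte_carlo, quench, brute_force_gmec) are hypothetical, and exhaustive enumeration stands in for dead-end elimination as the source of the reference minimum, which is only feasible for very small problems.

```python
import itertools
import math
import random


def make_toy_problem(n_pos, n_rot, rng):
    """Random self and pairwise energies (illustrative values, not a force field)."""
    self_e = [[rng.uniform(-1, 1) for _ in range(n_rot)] for _ in range(n_pos)]
    pair_e = {
        (i, j): [[rng.uniform(-1, 1) for _ in range(n_rot)] for _ in range(n_rot)]
        for i in range(n_pos) for j in range(i + 1, n_pos)
    }
    return self_e, pair_e


def total_energy(conf, self_e, pair_e):
    """Sum of self energies plus all pairwise rotamer-rotamer energies."""
    e = sum(self_e[i][r] for i, r in enumerate(conf))
    for (i, j), table in pair_e.items():
        e += table[conf[i]][conf[j]]
    return e


def monte_carlo(self_e, pair_e, steps, temperature, rng):
    """Metropolis sampling: change one random position to a random rotamer."""
    conf = [rng.randrange(len(rots)) for rots in self_e]
    energy = total_energy(conf, self_e, pair_e)
    for _ in range(steps):
        i = rng.randrange(len(conf))
        trial = conf.copy()
        trial[i] = rng.randrange(len(self_e[i]))
        delta = total_energy(trial, self_e, pair_e) - energy
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            conf, energy = trial, energy + delta
    return conf


def quench(conf, self_e, pair_e):
    """Greedy descent: repeat single-position improvements until none remains."""
    conf = list(conf)
    improved = True
    while improved:
        improved = False
        for i in range(len(conf)):
            current = total_energy(conf, self_e, pair_e)
            for r in range(len(self_e[i])):
                trial = conf[:i] + [r] + conf[i + 1:]
                if total_energy(trial, self_e, pair_e) < current:
                    conf, current, improved = trial, total_energy(trial, self_e, pair_e), True
    return conf


def brute_force_gmec(self_e, pair_e):
    """Exhaustive global minimum; a stand-in for DEE on tiny problems only."""
    choices = [range(len(rots)) for rots in self_e]
    return min(itertools.product(*choices),
               key=lambda c: total_energy(c, self_e, pair_e))


def fraction_incorrect(conf, reference):
    """Fraction of positions whose rotamer differs from the reference solution."""
    return sum(a != b for a, b in zip(conf, reference)) / len(reference)
```

    Calling monte_carlo, passing its result to quench, and then comparing against brute_force_gmec with fraction_incorrect gives an accuracy figure of the same kind as the fractions quoted in the abstract, although the numbers from random toy energies carry no physical meaning.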

  14. Comparison of methods of extracting messenger Ribonucleic Acid from ejaculated Porcine (Sus Scrofa) Spermatozoa

    Technology Transfer Automated Retrieval System (TEKTRAN)

    H. D. Guthrie, G.R. Welch, and L. A. Blomberg. Comparison of Methods of Extracting Messenger Ribonucleic Acid from Ejaculated Porcine (Sus Scrofa) Spermatozoa. Biotechnology and Germplasm Laboratory, Agricultural Research Service U. S. Department of Agriculture, Beltsville, MD 20705 The purpos...

  15. Exon-intron organization and sequence comparison of human and murine T11 (CD2) genes

    SciTech Connect

    Diamond, D.J.; Clayton, L.K.; Sayre, P.H.; Reinherz, E.L.

    1988-03-01

    Genomic DNA clones containing the human and murine genes coding for the 50-kDa T11 (CD2) T-cell surface glycoprotein were characterized. The human T11 gene is approximately 12 kilobases long and comprised of five exons. A leader exon (L) contains the 5'-untranslated region and most of the nucleotides defining the signal peptide (amino acids (aa) -24 to -5). Two exons encode the extracellular segment; exon Ex1 is 321 base pairs (bp) long and codes for four residues of the leader peptide and aa 1-103 of the mature protein, and exon Ex2 is 231 bp long and encodes aa 104-180. Exon TM is 123 bp long and codes for the single transmembrane region of the molecule (aa 181-221). Exon C is a large 765-bp exon encoding virtually the entire cytoplasmic domain (aa 222-327) and the 3'-untranslated region. The murine T11 gene has a similar organization with exon-intron boundaries essentially identical to the human gene. Substantial conservation of nucleotide sequences between species in both 5'- and 3'-gene flanking regions equivalent to that among homologous exons suggests that murine and human genes may be regulated in a similar fashion. The probable relationship of the individual T11 exons to functional and structural protein domains is discussed.

  16. Complete amino acid sequence of the medium-chain S-acyl fatty acid synthetase thio ester hydrolase from rat mammary gland

    SciTech Connect

    Randhawa, Z.I.; Smith, S.

    1987-03-10

    The complete amino acid sequence of the medium-chain S-acyl fatty acid synthetase thio ester hydrolase (thioesterase II) from rat mammary gland is presented. Most of the sequence was derived by analysis of [14C]-labelled peptide fragments produced by cleavage at methionyl, glutamyl, lysyl, arginyl, and tryptophanyl residues. A small section of the sequence was deduced from a previously analyzed cDNA clone. The protein consists of 260 residues and has a blocked amino-terminal methionine and calculated Mr of 29,212. The carboxy-terminal sequence, verified by Edman degradation of the carboxy-terminal cyanogen bromide fragment and carboxypeptidase Y digestion of the intact thioesterase II, terminates with a serine residue and lacks three additional residues predicted by the cDNA sequence. The native enzyme contains three cysteine residues but no disulfide bridges. The active site serine residue is located at position 101. The rat mammary gland thioesterase II exhibits approximately 40% homology with a thioesterase from mallard uropygial gland, the sequence of which was recently determined by cDNA analysis. Thus the two enzymes may share similar structural features and a common evolutionary origin. The location of the active site in these thioesterases differs from that of other serine active site esterases; indeed, the enzymes do not exhibit any significant homology with other serine esterases, suggesting that they may constitute a separate new family of serine active site enzymes.

  17. A novel phospholipase A(2) from the venom glands of Bungarus candidus: cloning and sequence-comparison.

    PubMed

    Tsai, Inn-Ho; Hsu, Hwa-Yao; Wang, Ying-Ming

    2002-09-01

    The presence of phospholipase A(2) (PLA(2)) in the venom of Malayan krait (Bungarus candidus) and its structure were studied. The PLA(2) cDNAs from the venom gland of B. candidus (of Indonesian origin) were amplified by polymerase chain reaction (PCR) and cloned. The primers used were based on the cDNA sequences of several homologous B. multicinctus venom PLA(2)s. In addition to the A-chains of beta-bungarotoxins, a novel B. candidus PLA(2) was cloned and its full amino acid sequence deduced. With a total of 125 amino acid residues, the PLA(2) contains a pancreatic loop and is 61% identical to the acidic PLA(2) of king cobra venom. However, the enzyme was not detected in the venom sample. Its structural relationships to other elapid venom PLA(2)s were analyzed with a phylogenetic tree and discussed. PMID:12220723

  18. Indigenous and introduced potyviruses of legumes and Passiflora spp. from Australia: biological properties and comparison of coat protein nucleotide sequences.

    PubMed

    Coutts, Brenda A; Kehoe, Monica A; Webster, Craig G; Wylie, Stephen J; Jones, Roger A C

    2011-10-01

    Five Australian potyviruses, passion fruit woodiness virus (PWV), passiflora mosaic virus (PaMV), passiflora virus Y (PaVY), clitoria chlorosis virus (ClCV) and hardenbergia mosaic virus (HarMV), and two introduced potyviruses, bean common mosaic virus (BCMV) and cowpea aphid-borne mosaic virus (CAbMV), were detected in nine wild or cultivated Passiflora and legume species growing in tropical, subtropical or Mediterranean climatic regions of Western Australia. When ClCV (1), PaMV (1), PaVY (8) and PWV (5) isolates were inoculated to 15 plant species, PWV and two PaVY P. foetida isolates infected P. edulis and P. caerulea readily but legumes only occasionally. Another PaVY P. foetida isolate resembled five PaVY legume isolates in infecting legumes readily but not infecting P. edulis. PaMV resembled PaVY legume isolates in legumes but also infected P. edulis. ClCV did not infect P. edulis or P. caerulea and behaved differently from PaVY legume isolates and PaMV when inoculated to two legume species. When complete coat protein (CP) nucleotide (nt) sequences of 33 new isolates were compared with 41 others, PWV (8), HarMV (4), PaMV (1) and ClCV (1) were within a large group of Australian isolates, while PaVY (14), CAbMV (1) and BCMV (3) isolates were in three other groups. Variation among PWV and PaVY isolates was sufficient for division into four clades each (I-IV). A variable block of 56 amino acid residues at the N-terminal region of the CPs of PaMV and ClCV distinguished them from PWV. Comparison of PWV, PaMV and ClCV CP sequences showed that nt identities were both above and below the 76-77% potyvirus species threshold level. This research gives insights into invasion of new hosts by potyviruses at the natural vegetation and cultivated area interface, and illustrates the potential of indigenous viruses to emerge to infect introduced plants. PMID:21744001
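    The threshold comparison in this abstract rests on pairwise nucleotide identity between aligned coat protein sequences. A minimal sketch of that calculation follows, under stated assumptions: the inputs are already aligned to equal length, columns containing a gap are skipped (one common convention, not necessarily the one used in the study), and the function names and the 0.76 cut-off constant are illustrative choices taken from the lower end of the 76-77% range quoted above.

```python
SPECIES_THRESHOLD = 0.76  # assumption: lower bound of the 76-77% range in the abstract


def nt_identity(a: str, b: str) -> float:
    """Fraction of identical positions between two pre-aligned sequences.

    Columns where either sequence has a gap ('-') are ignored.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" or y == "-":
            continue
        compared += 1
        matches += x == y
    if compared == 0:
        raise ValueError("no comparable (gap-free) positions")
    return matches / compared


def above_species_threshold(a: str, b: str,
                            threshold: float = SPECIES_THRESHOLD) -> bool:
    """True when the identity of the pair meets or exceeds the threshold."""
    return nt_identity(a, b) >= threshold
```

    For example, nt_identity("ACGTACGT", "ACGTACGA") compares eight positions, finds seven matches and returns 0.875, which above_species_threshold would report as above the cut-off. The related p-distance is simply one minus this identity.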

  19. Molecular Characterization of Buffalo Haptoglobin: Sequence Based Structural Comparison Indicates Convergent Evolution Between Ruminants and Human.

    PubMed

    Niranjan, S K; Goyal, S; Dubey, P K; Vohra, V; Singh, S; Kathiravan, P; Kataria, R S

    2016-01-01

    Haptoglobin (Hp) protein has high affinity for hemoglobin (Hb) binding during intravascular hemolysis and scavenges the hemoglobin-induced free radicals. Earlier reports indicate that the Hp molecule is unique in human and cattle, but it has not been studied much in other animals. In this paper, we characterized the buffalo Hp molecule and determined its molecular structure, evolutionary importance, and tissue expression. Comparative analysis and predicted domain structure indicated that the buffalo Hp has an internal duplicated region in the α-chain only, similar to an alternate Hp2 allele in human. This duplicated part encoded an extra complement control protein (CCP) domain. Phylogenetic analysis revealed that buffalo and other ruminants group together, separate from all non-ruminants, including human. The key amino acid residues involved in Hp and Hb as well as Hp and macrophage scavenger receptor CD163 interactions in buffalo showed significant variation in comparison to non-ruminant species. Constitutive expression of Hp was also confirmed across all the vital tissues of buffalo for the first time. Results revealed that buffalo Hp is both structurally and functionally conserved, having an internal duplication in the α-chain similar to human Hp2 and other ruminant species, which might have evolved separately as a convergent evolutionary process. Furthermore, the presence of an extra Hp CCP domain, possibly in all ruminants, may have an effect during dimerization of the molecule in these species. PMID:26646629

  20. Nucleotide sequence and genomic organization of Aleutian mink disease parvovirus (ADV): sequence comparisons between a nonpathogenic and a pathogenic strain of ADV.

    PubMed Central

    Bloom, M E; Alexandersen, S; Perryman, S; Lechner, D; Wolfinbarger, J B

    1988-01-01

    A DNA sequence of 4,592 nucleotides (nt) was derived for the nonpathogenic ADV-G strain of Aleutian mink disease parvovirus (ADV). The 3'(left) end of the virion strand contained a 117-nt palindrome that could assume a Y-shaped configuration similar to, but less stable than, that of other parvoviruses. The sequence obtained for the 5' end was incomplete and did not contain the 5' (right) hairpin structure but ended just after a 25-nt A + T-rich direct repeat. Features of ADV genomic organization are (i) major left (622 amino acids) and right (702 amino acids) open reading frames (ORFs) in different translational frames of the plus-sense strand, (ii) two short mid-ORFs, (iii) eight potential promoter motifs (TATA boxes), including ones at 3 and 36 map units, and (iv) six potential polyadenylation sites, including three clustered near the termination of the right ORF. Although the overall homology to other parvoviruses is less than 50%, there are short conserved amino acid regions in both major ORFs. However, two regions in the right ORF allegedly conserved among the parvoviruses were not present in ADV. At the DNA level, ADV-G is 97.5% related to the pathogenic ADV-Utah 1. A total of 22 amino acid changes were found in the right ORF; changes were found in both hydrophilic and hydrophobic regions and generally did not affect the theoretical hydropathy. However, there is a short heterogeneous region at 64 to 65 map units in which 8 out of 11 residues have diverged; this hypervariable segment may be analogous to short amino acid regions in other parvoviruses that determine host range and pathogenicity. These findings suggested that this region may harbor some of the determinants responsible for the differences in pathogenicity of ADV-G and ADV-Utah 1. PMID:2839709

  1. The developmental transcriptome landscape of bovine skeletal muscle defined by Ribo-Zero ribonucleic acid sequencing.

    PubMed

    Sun, X; Li, M; Sun, Y; Cai, H; Li, R; Wei, X; Lan, X; Huang, Y; Lei, C; Chen, H

    2015-12-01

    Ribonucleic acid sequencing (RNA-Seq) libraries are normally prepared with oligo(dT) selection of poly(A)+ mRNA, but it depends on intact total RNA samples. Recent studies have described Ribo-Zero technology, a novel method that can capture both poly(A)+ and poly(A)- transcripts from intact or fragmented RNA samples. We report here the first application of Ribo-Zero RNA-Seq for the analysis of the bovine embryonic, neonatal, and adult skeletal muscle whole transcriptome at an unprecedented depth. Overall, 19,893 genes were found to be expressed, with a high correlation of expression levels between the calf and the adult. Hundreds of genes were found to be highly expressed in the embryo and decreased at least 10-fold after birth, indicating their potential roles in embryonic muscle development. In addition, we present for the first time the analysis of global transcript isoform discovery in bovine skeletal muscle and identified 36,694 transcript isoforms. Transcriptomic data were also analyzed to unravel sequence variations; 185,036 putative SNP and 12,428 putative short insertions-deletions (InDel) were detected. Specifically, many stop-gain, stop-loss, and frameshift mutations were identified that probably change the relative protein production and sequentially affect the gene function. Notably, the numbers of stage-specific transcripts, alternative splicing events, SNP, and InDel were greater in the embryo than in the calf and the adult, suggesting that gene expression is most active in the embryo. The resulting view of the transcriptome at a single-base resolution greatly enhances the comprehensive transcript catalog and uncovers the global trends in gene expression during bovine skeletal muscle development. PMID:26641174

  2. Complete genome sequence of Saccharothrix espanaensis DSM 44229T and comparison to the other completely sequenced Pseudonocardiaceae

    PubMed Central

    2012-01-01

    Background The genus Saccharothrix is a representative of the family Pseudonocardiaceae, known to include producer strains of a wide variety of potent antibiotics. Saccharothrix espanaensis produces both saccharomicins A and B of the promising new class of heptadecaglycoside antibiotics, active against both bacteria and yeast. Results To better assess its capabilities, the complete genome sequence of S. espanaensis was established. With a size of 9,360,653 bp, coding for 8,501 genes, it stands alongside other Pseudonocardiaceae with large genomes. Besides a predicted core genome of 810 genes shared in the family, S. espanaensis has a large number of accessory genes: 2,967 singletons when compared to the family, of which 1,292 have no clear orthologs in the RefSeq database. The genome analysis revealed the presence of 26 biosynthetic gene clusters potentially encoding secondary metabolites. Among them, the cluster coding for the saccharomicins could be identified. Conclusion S. espanaensis is the first completely sequenced species of the genus Saccharothrix. The genome discloses the cluster responsible for the biosynthesis of the saccharomicins, the largest oligosaccharide antibiotic currently identified. Moreover, the genome revealed 25 additional putative secondary metabolite gene clusters further suggesting the strain’s potential for natural product synthesis. PMID:22958348

  3. Method for the detection of specific nucleic acid sequences by polymerase nucleotide incorporation

    DOEpatents

    Castro, Alonso

    2004-06-01

    A method for rapid and efficient detection of a target DNA or RNA sequence is provided. A primer having a 3'-hydroxyl group at one end and having a sequence of nucleotides sufficiently homologous with an identifying sequence of nucleotides in the target DNA is selected. The primer is hybridized to the identifying sequence of nucleotides on the DNA or RNA sequence and a reporter molecule is synthesized on the target sequence by progressively binding complementary nucleotides to the primer, where the complementary nucleotides include nucleotides labeled with a fluorophore. Fluorescence emitted by fluorophores on single reporter molecules is detected to identify the target DNA or RNA sequence.

  4. Very high resolution single pass HLA genotyping using amplicon sequencing on the 454 next generation DNA sequencers: Comparison with Sanger sequencing.

    PubMed

    Yamamoto, F; Höglund, B; Fernandez-Vina, M; Tyan, D; Rastrou, M; Williams, T; Moonsamy, P; Goodridge, D; Anderson, M; Erlich, H A; Holcomb, C L

    2015-12-01

    Compared to Sanger sequencing, next-generation sequencing offers advantages for high resolution HLA genotyping including increased throughput, lower cost, and reduced genotype ambiguity. Here we describe an enhancement of the Roche 454 GS GType HLA genotyping assay to provide very high resolution (VHR) typing, by the addition of 8 primer pairs to the original 14, to genotype 11 HLA loci. These additional amplicons help resolve common and well-documented alleles and exclude commonly found null alleles in genotype ambiguity strings. Simplification of workflow to reduce the initial preparation effort using early pooling of amplicons or the Fluidigm Access Array™ is also described. Performance of the VHR assay was evaluated on 28 well characterized cell lines using Conexio Assign MPS software which uses genomic, rather than cDNA, reference sequence. Concordance was 98.4%; 1.6% had no genotype assignment. Of concordant calls, 53% were unambiguous. To further assess the assay, 59 clinical samples were genotyped and results compared to unambiguous allele assignments obtained by prior sequence-based typing supplemented with SSO and/or SSP. Concordance was 98.7% with 58.2% as unambiguous calls; 1.3% could not be assigned. Our results show that the amplicon-based VHR assay is robust and can replace current Sanger methodology. Together with software enhancements, it has the potential to provide even higher resolution HLA typing. PMID:26037172

  5. Characterization and cDNA sequence of Bothriechis schlegelii l-amino acid oxidase with antibacterial activity.

    PubMed

    Vargas Muñoz, Leidy Johana; Estrada-Gomez, Sebastian; Núñez, Vitelbina; Sanz, Libia; Calvete, Juan J

    2014-08-01

    Snake venoms are complex mixtures of proteins including l-amino acid oxidase (lAAO). An lAAO (named BslAAO) with a mass of 56 kDa and a theoretical pI of 5.79 was purified from Bothriechis schlegelii venom through size-exclusion, ion exchange and affinity chromatography. The entire protein sequence of 498 amino acids was determined from cDNA using reverse-transcribed mRNA isolated from venom gland. The enzyme showed dose-dependent inhibition of bacterial growth. BslAAO showed an inhibitory effect against S. aureus with a MIC of 4 μg/mL and an MBC of 8 μg/mL. Against Acinetobacter baumannii, it showed a MIC of 2 μg/mL and an MBC of 4 μg/mL. No effect was observed in Escherichia coli. This antibacterial activity was inhibited by catalase, indicating that antimicrobial activity was due to H2O2 production. BslAAO did not show any cytotoxic activity toward mouse myoblast cell line C2C12 or peripheral blood mononuclear cells. The enzyme oxidized l-Leu, with a Km of 16.37 μM and a Vmax of 0.39 μM/min. Snake venom lAAOs are potential frameworks for different therapeutic molecules, since these enzymes exhibit low MICs and MBCs and appear to be harmless to human cells, microorganisms being generally several-fold more sensitive to reactive oxygen species than human tissues. PMID:24875315
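
    The Km and Vmax reported for l-Leu can be read through the standard Michaelis-Menten rate law, v = Vmax·[S] / (Km + [S]). The sketch below is illustrative only: it assumes the enzyme follows simple Michaelis-Menten kinetics (the abstract reports only the two constants), and the function name and the substrate concentrations are hypothetical choices, not values from the study.

```python
# Illustrative sketch: Michaelis-Menten rate using the Km and Vmax quoted in
# the abstract. Assumes simple Michaelis-Menten behaviour; the function name
# and the example substrate concentrations are hypothetical.

def michaelis_menten_rate(substrate_uM, km_uM=16.37, vmax_uM_per_min=0.39):
    """Return the initial rate v = Vmax * [S] / (Km + [S]) in uM/min."""
    if substrate_uM < 0:
        raise ValueError("substrate concentration must be non-negative")
    return vmax_uM_per_min * substrate_uM / (km_uM + substrate_uM)


if __name__ == "__main__":
    # At [S] = Km the rate is half of Vmax by definition of Km.
    for s in (1.0, 16.37, 100.0):
        print(f"[S] = {s:6.2f} uM -> v = {michaelis_menten_rate(s):.3f} uM/min")
```

    At [S] equal to Km the computed rate is half of Vmax, which is the defining property of Km under this model.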

  6. Genome Sequence of a Candidate World Health Organization Reference Strain of Zika Virus for Nucleic Acid Testing

    PubMed Central

    Trösemeier, Jan-Hendrik; Musso, Didier; Blümel, Johannes; Thézé, Julien; Pybus, Oliver G.

    2016-01-01

    We report here the sequence of a candidate reference strain of Zika virus (ZIKV) developed on behalf of the World Health Organization (WHO). The ZIKV reference strain is intended for use in nucleic acid amplification (NAT)-based assays for the detection and quantification of ZIKV RNA. PMID:27587826

  7. Genome Sequence of Schizochytrium sp. CCTCC M209059, an Effective Producer of Docosahexaenoic Acid-Rich Lipids

    PubMed Central

    Ji, Xiao-Jun; Mo, Kai-Qiang; Ren, Lu-Jing; Li, Gan-Lu; Huang, Jian-Zhong

    2015-01-01

    Schizochytrium is an effective species for producing omega-3 docosahexaenoic acid (DHA). Here, we report a genome sequence of Schizochytrium sp. CCTCC M209059, which has a genome size of 39.09 Mb. It will provide the genomic basis for further insights into the metabolic and regulatory mechanisms underlying the DHA formation. PMID:26251485

  8. Evolutionary Distance of Amino Acid Sequence Orthologs across Macaque Subspecies: Identifying Candidate Genes for SIV Resistance in Chinese Rhesus Macaques

    PubMed Central

    Ross, Cody T.; Roodgar, Morteza; Smith, David Glenn

    2015-01-01

    We use the Reciprocal Smallest Distance (RSD) algorithm to identify amino acid sequence orthologs in the Chinese and Indian rhesus macaque draft sequences and estimate the evolutionary distance between such orthologs. We then use GOanna to map gene function annotations and human gene identifiers to the rhesus macaque amino acid sequences. We conclude methodologically by cross-tabulating a list of amino acid orthologs with large divergence scores with a list of genes known to be involved in SIV or HIV pathogenesis. We find that many of the amino acid sequences with large evolutionary divergence scores, as calculated by the RSD algorithm, have been shown to be related to HIV pathogenesis in previous laboratory studies. Four of the strongest candidate genes for SIVmac resistance in Chinese rhesus macaques identified in this study are CDK9, CXCL12, TRIM21, and TRIM32. Additionally, ANKRD30A, CTSZ, GORASP2, GTF2H1, IL13RA1, MUC16, NMDAR1, Notch1, NT5M, PDCD5, RAD50, and TM9SF2 were identified as possible candidates, among others. We failed to find many laboratory experiments contrasting the effects of Indian and Chinese orthologs at these sites on SIVmac pathogenesis, but future comparative studies might hold fertile ground for research into the biological mechanisms underlying innate resistance to SIVmac in Chinese rhesus macaques. PMID:25884674

  9. Evolutionary distance of amino acid sequence orthologs across macaque subspecies: identifying candidate genes for SIV resistance in Chinese rhesus macaques.

    PubMed

    Ross, Cody T; Roodgar, Morteza; Smith, David Glenn

    2015-01-01

    We use the Reciprocal Smallest Distance (RSD) algorithm to identify amino acid sequence orthologs in the Chinese and Indian rhesus macaque draft sequences and estimate the evolutionary distance between such orthologs. We then use GOanna to map gene function annotations and human gene identifiers to the rhesus macaque amino acid sequences. We conclude methodologically by cross-tabulating a list of amino acid orthologs with large divergence scores with a list of genes known to be involved in SIV or HIV pathogenesis. We find that many of the amino acid sequences with large evolutionary divergence scores, as calculated by the RSD algorithm, have been shown to be related to HIV pathogenesis in previous laboratory studies. Four of the strongest candidate genes for SIVmac resistance in Chinese rhesus macaques identified in this study are CDK9, CXCL12, TRIM21, and TRIM32. Additionally, ANKRD30A, CTSZ, GORASP2, GTF2H1, IL13RA1, MUC16, NMDAR1, Notch1, NT5M, PDCD5, RAD50, and TM9SF2 were identified as possible candidates, among others. We failed to find many laboratory experiments contrasting the effects of Indian and Chinese orthologs at these sites on SIVmac pathogenesis, but future comparative studies might hold fertile ground for research into the biological mechanisms underlying innate resistance to SIVmac in Chinese rhesus macaques. PMID:25884674

  10. Draft Genome Sequence of Lactobacillus delbrueckii subsp. bulgaricus CFL1, a Lactic Acid Bacterium Isolated from French Handcrafted Fermented Milk.

    PubMed

    Meneghel, Julie; Dugat-Bony, Eric; Irlinger, Françoise; Loux, Valentin; Vidal, Marie; Passot, Stéphanie; Béal, Catherine; Layec, Séverine; Fonseca, Fernanda

    2016-01-01

    Lactobacillus delbrueckii subsp. bulgaricus (L. bulgaricus) is a lactic acid bacterium widely used for the production of yogurt and cheeses. Here, we report the genome sequence of L. bulgaricus CFL1 to improve our knowledge on its stress-induced damages following production and end-use processes. PMID:26941141

  11. Draft Genome Sequence of Cutaneotrichosporon curvatus DSM 101032 (Formerly Cryptococcus curvatus), an Oleaginous Yeast Producing Polyunsaturated Fatty Acids.

    PubMed

    Hofmeyer, Thomas; Hackenschmidt, Silke; Nadler, Florian; Thürmer, Andrea; Daniel, Rolf; Kabisch, Johannes

    2016-01-01

    Cutaneotrichosporon curvatus DSM 101032 is an oleaginous yeast that can be isolated from various habitats and is capable of producing substantial amounts of polyunsaturated fatty acids. Here, we present the first draft genome sequence of any C. curvatus species. PMID:27174275

  12. Complete genome sequence of Lactobacillus plantarum ZS2058, a probiotic strain with high conjugated linoleic acid production ability.

    PubMed

    Yang, Bo; Chen, Haiqin; Tian, Fengwei; Zhao, Jianxin; Gu, Zhennan; Zhang, Hao; Chen, Yong Q; Chen, Wei

    2015-11-20

    Lactobacillus plantarum ZS2058 was isolated from sauerkraut and identified to synthesize the beneficial metabolite conjugated linoleic acid. The genome contains a 3,197,363-bp chromosome and three plasmids. The sequence will facilitate identification and characterization of the genetic determinants for its putative biological benefits. PMID:26439428

  13. Draft Genome Sequence of Burkholderia stabilis LA20W, a Trehalose Producer That Uses Levulinic Acid as a Substrate

    PubMed Central

    Sato, Yuya; Koike, Hideaki; Kondo, Susumu; Hori, Tomoyuki; Kanno, Manabu; Kimura, Nobutada; Morita, Tomotake; Kirimura, Kohtaro

    2016-01-01

    Burkholderia stabilis LA20W produces trehalose using levulinic acid (LA) as a substrate. Here, we report the 7.97-Mb draft genome sequence of B. stabilis LA20W, which will be useful in investigations of the enzymes involved in LA metabolism and the mechanism of LA-induced trehalose production. PMID:27491978

  14. Draft Genome Sequence of Acetobacter tropicalis Type Strain NBRC16470, a Producer of Optically Pure d-Glyceric Acid.

    PubMed

    Koike, Hideaki; Sato, Shun; Morita, Tomotake; Fukuoka, Tokuma; Habe, Hiroshi

    2014-01-01

    Here we report the 3.7-Mb draft genome sequence of Acetobacter tropicalis NBRC16470(T), which can produce optically pure d-glyceric acid (d-GA; 99% enantiomeric excess) from raw glycerol feedstock derived from biodiesel fuel production processes. PMID:25523780

  15. Genome Sequence of a Candidate World Health Organization Reference Strain of Zika Virus for Nucleic Acid Testing.

    PubMed

    Trösemeier, Jan-Hendrik; Musso, Didier; Blümel, Johannes; Thézé, Julien; Pybus, Oliver G; Baylis, Sally A

    2016-01-01

    We report here the sequence of a candidate reference strain of Zika virus (ZIKV) developed on behalf of the World Health Organization (WHO). The ZIKV reference strain is intended for use in nucleic acid amplification (NAT)-based assays for the detection and quantification of ZIKV RNA. PMID:27587826

  16. Draft Genome Sequence of Burkholderia stabilis LA20W, a Trehalose Producer That Uses Levulinic Acid as a Substrate.

    PubMed

    Sato, Yuya; Koike, Hideaki; Kondo, Susumu; Hori, Tomoyuki; Kanno, Manabu; Kimura, Nobutada; Morita, Tomotake; Kirimura, Kohtaro; Habe, Hiroshi

    2016-01-01

    Burkholderia stabilis LA20W produces trehalose using levulinic acid (LA) as a substrate. Here, we report the 7.97-Mb draft genome sequence of B. stabilis LA20W, which will be useful in investigations of the enzymes involved in LA metabolism and the mechanism of LA-induced trehalose production. PMID:27491978

  17. Draft Genome Sequence of Cutaneotrichosporon curvatus DSM 101032 (Formerly Cryptococcus curvatus), an Oleaginous Yeast Producing Polyunsaturated Fatty Acids

    PubMed Central

    Hofmeyer, Thomas; Hackenschmidt, Silke; Nadler, Florian; Thürmer, Andrea; Daniel, Rolf

    2016-01-01

    Cutaneotrichosporon curvatus DSM 101032 is an oleaginous yeast that can be isolated from various habitats and is capable of producing substantial amounts of polyunsaturated fatty acids. Here, we present the first draft genome sequence of any C. curvatus species. PMID:27174275

  18. Ultra high-throughput nucleic acid sequencing as a tool for virus discovery in the turkey gut.

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Recently, the use of the next generation of nucleic acid sequencing technology (i.e., 454 pyrosequencing, as developed by Roche/454 Life Sciences) has allowed an in-depth look at the uncultivated microorganisms present in complex environmental samples, including samples with agricultural importance....

  19. Draft Genome Sequence of Lactobacillus delbrueckii subsp. bulgaricus CFL1, a Lactic Acid Bacterium Isolated from French Handcrafted Fermented Milk

    PubMed Central

    Meneghel, Julie; Irlinger, Françoise; Loux, Valentin; Vidal, Marie; Passot, Stéphanie; Béal, Catherine; Layec, Séverine

    2016-01-01

    Lactobacillus delbrueckii subsp. bulgaricus (L. bulgaricus) is a lactic acid bacterium widely used for the production of yogurt and cheeses. Here, we report the genome sequence of L. bulgaricus CFL1 to improve our knowledge on its stress-induced damages following production and end-use processes. PMID:26941141

  20. Comparison of sequencing-based methods to profile DNA methylation and identification of monoallelic epigenetic modifications

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Analysis of DNA methylation patterns relies increasingly on sequencing-based profiling methods. The four most frequently used sequencing-based technologies are the bisulfite-based methods MethylC-seq and reduced representation bisulfite sequencing (RRBS), and the enrichment-based techniques methylat...

  1. Sequence-Specific Recognition of MicroRNAs and Other Short Nucleic Acids with Solid-State Nanopores.

    PubMed

    Zahid, Osama K; Wang, Fanny; Ruzicka, Jan A; Taylor, Ethan W; Hall, Adam R

    2016-03-01

    The detection and quantification of short nucleic acid sequences has many potential applications in studying biological processes, monitoring disease initiation and progression, and evaluating environmental systems, but is challenging by nature. We present here an assay based on the solid-state nanopore platform for the identification of specific sequences in solution. We demonstrate that hybridization of a target nucleic acid with a synthetic probe molecule enables discrimination between duplex and single-stranded molecules with high efficacy. Our approach requires limited preparation of samples and yields an unambiguous translocation event rate enhancement that can be used to determine the presence and abundance of a single sequence within a background of nontarget oligonucleotides. PMID:26824296

  2. Sequence heterogeneity of cannabidiolic- and tetrahydrocannabinolic acid-synthase in Cannabis sativa L. and its relationship with chemical phenotype.

    PubMed

    Onofri, Chiara; de Meijer, Etienne P M; Mandolino, Giuseppe

    2015-08-01

    Sequence variants of THCA- and CBDA-synthases were isolated from different Cannabis sativa L. strains expressing various wild-type and mutant chemical phenotypes (chemotypes). Expressed and complete sequences were obtained from mature inflorescences. Each strain was shown to have a different specificity and/or ability to convert the precursor CBGA into CBDA and/or THCA type products. The comparison of the expressed sequences led to the identification of different mutations, all of them due to SNPs. These SNPs were found to relate to the cannabinoid composition of the inflorescence at maturity and are therefore proposed to have a functional significance. The amount of variation was found to be higher within the CBDAS sequence family than in the THCAS family, suggesting a more recent evolution of THCA-forming enzymes from the CBDAS group. We therefore consider CBDAS as the ancestral type of these synthases. PMID:25865737

  3. Human parainfluenza type 3 virus hemagglutinin-neuraminidase glycoprotein: nucleotide sequence of mRNA and limited amino acid sequence of the purified protein.

    PubMed Central

    Elango, N; Coligan, J E; Jambou, R C; Venkatesan, S

    1986-01-01

    The nucleotide sequence of mRNA for the hemagglutinin-neuraminidase (HN) protein of human parainfluenza type 3 virus obtained from the corresponding cDNA clone had a single long open reading frame encoding a putative protein of 64,254 daltons consisting of 572 amino acids. The deduced protein sequence was confirmed by limited N-terminal amino acid microsequencing of CNBr cleavage fragments of native HN that was purified by immunoprecipitation. The HN protein is moderately hydrophobic and has four potential sites (Asn-X-Ser/Thr) of N-glycosylation in the C-terminal half of the molecule. It is devoid of both the N-terminal signal sequence and the C-terminal membrane anchorage domain characteristic of the hemagglutinin of influenza virus and the fusion (F0) protein of the paramyxoviruses. Instead, it has a single prominent hydrophobic region capable of membrane insertion beginning at 32 residues from the N terminus. This N-terminal membrane insertion is similar to that of influenza virus neuraminidase and the recently reported structures of HN proteins of Sendai virus and simian virus 5. PMID:3003381

  4. Sequence dependent N-terminal rearrangement and degradation of peptide nucleic acid (PNA) in aqueous solution

    NASA Technical Reports Server (NTRS)

    Eriksson, M.; Christensen, L.; Schmidt, J.; Haaima, G.; Orgel, L.; Nielsen, P. E.

    1998-01-01

    The stability of the PNA (peptide nucleic acid) thymine monomer {N-[2-(thymin-1-ylacetyl)]-N-(2-aminoaminoethyl)glycine} and those of various PNA oligomers (5-8-mers) have been measured at room temperature (20 degrees C) as a function of pH. The thymine monomer undergoes N-acyl transfer rearrangement with a half-life of 34 days at pH 11 as analyzed by 1H NMR; and two reactions, the N-acyl transfer and a sequential degradation, are found by HPLC analysis to occur at measurable rates for the oligomers at pH 9 or above. Dependent on the amino-terminal sequence, half-lives of 350 h to 163 days were found at pH 9. At pH 12 the half-lives ranged from 1.5 h to 21 days. The results are discussed in terms of PNA as a gene therapeutic drug as well as a possible prebiotic genetic material.

  5. Structural analysis of complementary DNA and amino acid sequences of human and rat androgen receptors

    SciTech Connect

    Chang, C.; Kokontis, J.; Liao, S.

    1988-10-01

    Structural analysis of cDNAs for human and rat androgen receptors (ARs) indicates that the amino-terminal regions of ARs are rich in oligo- and poly(amino acid) motifs as in some homeotic genes. The human AR has a long stretch of repeated glycines, whereas rat AR has a long stretch of glutamines. There is a considerable sequence similarity among ARs and the receptors for glucocorticoids, progestins, and mineralocorticoids within the steroid-binding domains. The cysteine-rich DNA-binding domains are well conserved. Translation of mRNA transcribed from AR cDNAs yielded 94- and 76-kDa proteins and smaller forms that bind to DNA and have high affinity toward androgens. These rat or human ARs were recognized by human autoantibodies to natural ARs. Molecular hybridization studies, using AR cDNAs as probes, indicated that the ventral prostate and other male accessory organs are rich in AR mRNA and that the production of AR mRNA in the target organs may be autoregulated by androgens.

  6. Rapid and Sensitive Isothermal Detection of Nucleic-acid Sequence by Multiple Cross Displacement Amplification

    PubMed Central

    Wang, Yi; Wang, Yan; Ma, Ai-Jing; Li, Dong-Xun; Luo, Li-Juan; Liu, Dong-Xin; Jin, Dong; Liu, Kai; Ye, Chang-Yun

    2015-01-01

    We have devised a novel amplification strategy based on isothermal strand-displacement polymerization reaction, which was termed multiple cross displacement amplification (MCDA). The approach employed a set of ten specially designed primers spanning ten distinct regions of the target sequence and proceeded at a constant temperature (61–65 °C). At the assay temperature, the double-stranded DNAs were in a dynamic reaction environment of primer-template hybrid, thus the high concentration of primers annealed to the template strands without a denaturing step to initiate the synthesis. For the subsequent isothermal amplification step, a series of primer binding and extension events yielded several single-stranded DNAs and single-stranded single stem-loop DNA structures. Then, these DNA products enabled the strand-displacement reaction to enter into the exponential amplification. Three mainstream methods, including colorimetric indicators, agarose gel electrophoresis and real-time turbidity, were selected for monitoring the MCDA reaction. Moreover, the practical application of the MCDA assay was successfully evaluated by detecting the target pathogen nucleic acid in pork samples, which offered the advantages of quick results, modest equipment requirements, ease of operation, and high specificity and sensitivity. Here we expounded the basic MCDA mechanism and also provided details on an alternative (Single-MCDA assay, S-MCDA) to the MCDA technique. PMID:26154567

  7. Snake venoms. The amino acid sequences of two proteinase inhibitor homologues from Dendroaspis angusticeps venom.

    PubMed

    Joubert, F J; Taljaard, N

    1980-05-01

    Toxins C13S1C3 and C13S2C3 from D. angusticeps venom were purified by gel filtration and ion exchange chromatography. Whereas C13S1C3 contains 57 amino acids, C13S2C3 contains 59, but each includes six half-cystine residues. The complete primary structures of the low-toxicity proteins have been elucidated. The sequences and the invariant residues of toxins C13S1C3 and C13S2C3 from D. angusticeps venom resemble, respectively, those of the proteinase inhibitor homologues K and I from D. polylepis polylepis venom and they are also homologous to the active proteinase inhibitors from various sources. In C13S1C3 and K the active site lysyl residue of active bovine pancreatic proteinase inhibitor is conserved but the site residue alanine is replaced by lysine. In C13S2C3 and I the active site residue is replaced by tyrosine. PMID:7429422

  8. From First Base: The Sequence of the Tip of the X Chromosome of Drosophila melanogaster, a Comparison of Two Sequencing Strategies

    PubMed Central

    Benos, Panayiotis V.; Gatt, Melanie K.; Murphy, Lee; Harris, David; Barrell, Bart; Ferraz, Concepcion; Vidal, Sophie; Brun, Christine; Demaille, Jacques; Cadieu, Edouard; Dreano, Stephane; Gloux, Stéphanie; Lelaure, Valerie; Mottier, Stephanie; Galibert, Francis; Borkova, Dana; Miñana, Belen; Kafatos, Fotis C.; Bolshakov, Slava; Sidén-Kiamos, Inga; Papagiannakis, George; Spanos, Lefteris; Louis, Christos; Madueño, Encarnación; de Pablos, Beatriz; Modolell, Juan; Peter, Annette; Schöttler, Petra; Werner, Meike; Mourkioti, Fotini; Beinert, Nicole; Dowe, Gordon; Schäfer, Ulrich; Jäckle, Herbert; Bucheton, Alain; Callister, Debbie; Campbell, Lorna; Henderson, Nadine S.; McMillan, Paul J.; Salles, Cathy; Tait, Evelyn; Valenti, Phillipe; Saunders, Robert D.C.; Billaud, Alain; Pachter, Lior; Glover, David M.; Ashburner, Michael

    2001-01-01

    We present the sequence of a contiguous 2.63 Mb of DNA extending from the tip of the X chromosome of Drosophila melanogaster. Within this sequence, we predict 277 protein coding genes, of which 94 had been sequenced already in the course of studying the biology of their gene products, and examples of 12 different transposable elements. We show that an interval between bands 3A2 and 3C2, believed in the 1970s to show a correlation between the number of bands on the polytene chromosomes and the 20 genes identified by conventional genetics, is predicted to contain 45 genes from its DNA sequence. We have determined the insertion sites of P-elements from 111 mutant lines, about half of which are in a position likely to affect the expression of novel predicted genes, thus representing a resource for subsequent functional genomic analysis. We compare the European Drosophila Genome Project sequence with the corresponding part of the independently assembled and annotated Joint Sequence determined through “shotgun” sequencing. Discounting differences in the distribution of known transposable elements between the strains sequenced in the two projects, we detected three major sequence differences, two of which are probably explained by errors in assembly; the origin of the third major difference is unclear. In addition there are eight sequence gaps within the Joint Sequence. At least six of these eight gaps are likely to be sites of transposable elements; the other two are complex. Of the 275 genes in common to both projects, 60% are identical within 1% of their predicted amino-acid sequence and 31% show minor differences such as in choice of translation initiation or termination codons; the remaining 9% show major differences in interpretation. [All of the sequences analyzed in this paper have been deposited in the EMBL-Bank database under the following accession nos.: AL009146, AL009147, AL009171, AL009188–AL009196, AL021067, AL021086, AL021106–AL021108, AL021726, AL

  9. Comparison of peak shape in hydrophilic interaction chromatography using acidic salt buffers and simple acid solutions.

    PubMed

    Heaton, James C; Russell, Joseph J; Underwood, Tim; Boughtflower, Robert; McCalley, David V

    2014-06-20

    The retention and peak shape of neutral, basic and acidic solutes was studied on hydrophilic interaction chromatography (HILIC) stationary phases that showed both strong and weak ionic retention characteristics, using aqueous-acetonitrile mobile phases containing either formic acid (FA), ammonium formate (AF) or phosphoric acid (PA). The effect of organic solvent concentration on the results was also studied. Peak shape was good for neutrals under most mobile phase conditions. However, peak shapes for ionised solutes, particularly for basic compounds, were considerably worse in FA than AF. Even neutral compounds showed deterioration in performance with FA when the mobile phase water concentration was reduced. The poor performance in FA cannot be entirely attributed to the negative impact of ionic retention on ionised silanols on the underlying silica base materials, as results using PA at lower pH (where their ionisation is suppressed) were inferior to those in AF. Besides the moderating influence of the salt cation on ionic retention, it is likely that salt buffers improve peak shape due to the increased ionic strength of the mobile phase and its impact on the formation of the water layer on the column surface. PMID:24813934

  10. Nucleotide and predicted amino acid sequence of a cDNA clone encoding part of human transketolase.

    PubMed

    Abedinia, M; Layfield, R; Jones, S M; Nixon, P F; Mattick, J S

    1992-03-31

    Transketolase is a key enzyme in the pentose-phosphate pathway which has been implicated in the latent human genetic disease, Wernicke-Korsakoff syndrome. Here we report the cloning and partial characterisation of the coding sequences encoding human transketolase from a human brain cDNA library. The library was screened with oligonucleotide probes based on the amino acid sequence of proteolytic fragments of the purified protein. Northern blots showed that the transketolase mRNA is approximately 2.2 kb, close to the minimum expected, of which approximately 60% was represented in the largest cDNA clone. Sequence analysis of the transketolase coding sequences reveals a number of homologies with related enzymes from other species. PMID:1567394

  11. 5S ribosomal ribonucleic acid sequences in Bacteroides and Fusobacterium: evolutionary relationships within these genera and among eubacteria in general

    NASA Technical Reports Server (NTRS)

    Van den Eynde, H.; De Baere, R.; Shah, H. N.; Gharbia, S. E.; Fox, G. E.; Michalik, J.; Van de Peer, Y.; De Wachter, R.

    1989-01-01

    The 5S ribosomal ribonucleic acid (rRNA) sequences were determined for Bacteroides fragilis, Bacteroides thetaiotaomicron, Bacteroides capillosus, Bacteroides veroralis, Porphyromonas gingivalis, Anaerorhabdus furcosus, Fusobacterium nucleatum, Fusobacterium mortiferum, and Fusobacterium varium. A dendrogram constructed by a clustering algorithm from these sequences, which were aligned with all other hitherto known eubacterial 5S rRNA sequences, showed differences as well as similarities with respect to results derived from 16S rRNA analyses. In the 5S rRNA dendrogram, Bacteroides clustered together with Cytophaga and Fusobacterium, as in 16S rRNA analyses. Intraphylum relationships deduced from 5S rRNAs suggested that Bacteroides is specifically related to Cytophaga rather than to Fusobacterium, as was suggested by 16S rRNA analyses. Previous taxonomic considerations concerning the genus Bacteroides, based on biochemical and physiological data, were confirmed by the 5S rRNA sequence analysis.

  12. Sample Prep, Workflow Automation and Nucleic Acid Fractionation for Next Generation Sequencing

    SciTech Connect

    Roskey, Mark

    2010-06-03

    Mark Roskey of Caliper LifeSciences discusses how the company's technologies fit into the next generation sequencing workflow on June 3, 2010 at the "Sequencing, Finishing, Analysis in the Future" meeting in Santa Fe, NM

  13. Sequence-specific DNA binding by long hairpin pyrrole-imidazole polyamides containing an 8-amino-3,6-dioxaoctanoic acid unit.

    PubMed

    Sawatani, Yoshito; Kashiwazaki, Gengo; Chandran, Anandhakumar; Asamitsu, Sefan; Guo, Chuanxin; Sato, Shinsuke; Hashiya, Kaori; Bando, Toshikazu; Sugiyama, Hiroshi

    2016-08-15

    With the aim of improving aqueous solubility, we designed and synthesized five N-methylpyrrole (Py)-N-methylimidazole (Im) polyamides capable of recognizing 9-bp sequences. Their DNA-binding affinities and sequence specificities were evaluated by SPR and Bind-n-Seq analyses. The design of polyamide 1 was based on a conventional model, with three consecutive Py or Im rings separated by a β-alanine to match the curvature and twist of long DNA helices. Polyamides 2 and 3 contained an 8-amino-3,6-dioxaoctanoic acid (AO) unit, which has previously only been used as a linker within linear Py-Im polyamides or between Py-Im hairpin motifs for tandem hairpin. It is demonstrated herein that AO also functions as a linker element that can extend to 2-bp in hairpin motifs. Notably, although the AO-containing unit can fail to bind the expected sequence, polyamide 4, which has two AO units facing each other in a hairpin form, successfully showed the expected motif and a KD value of 16nM was recorded. Polyamide 5, containing a β-alanine-β-alanine unit instead of the AO of polyamide 2, was synthesized for comparison. The aqueous solubilities and nuclear localization of three of the polyamides were also examined. The results suggest the possibility of applying the AO unit in the core of Py-Im polyamide compounds. PMID:27301681

  14. Evolution of vertebrate IgM: complete amino acid sequence of the constant region of Ambystoma mexicanum mu chain deduced from cDNA sequence.

    PubMed

    Fellah, J S; Wiles, M V; Charlemagne, J; Schwager, J

    1992-10-01

    cDNA clones coding for the constant region of the Mexican axolotl (Ambystoma mexicanum) mu heavy immunoglobulin chain were selected from total spleen RNA, using a cDNA polymerase chain reaction technique. The specific 5'-end primer was an oligonucleotide homologous to the JH segment of Xenopus laevis mu chain. One of the clones, JHA/3, corresponded to the complete constant region of the axolotl mu chain, consisting of a 1362-nucleotide sequence coding for a polypeptide of 454 amino acids followed in 3' direction by a 179-nucleotide untranslated region and a polyA+ tail. The axolotl C mu is divided into four typical domains (C mu 1-C mu 4) and can be aligned with the Xenopus C mu with an overall identity of 56% at the nucleotide level. Percent identities were particularly high between C mu 1 (59%) and C mu 4 (71%). The C-terminal 20-amino acid segment which constitutes the secretory part of the mu chain is strongly homologous to the equivalent sequences of chondrichthyans and of other tetrapods, including a conserved N-linked oligosaccharide, the penultimate cysteine and the C-terminal lysine. The four C mu domains of 13 vertebrate species ranging from chondrichthyans to mammals were aligned and compared at the amino acid level. The significant number of mu-specific residues which are conserved into each of the four C mu domains argues for a continuous line of evolution of the vertebrate mu chain. This notion was confirmed by the ability to reconstitute a consistent vertebrate evolution tree based on the phylogenic parsimony analysis of the C mu 4 sequences. PMID:1382992
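
    The lengths quoted for the axolotl C mu coding sequence are mutually consistent under the triplet code: 1362 nucleotides correspond to 454 codons. A minimal sketch of that check follows. The function name and the `includes_stop_codon` flag are hypothetical, and the sketch assumes the quoted 1362 nucleotides exclude the stop codon, which the abstract does not state explicitly.

```python
# Illustrative sketch: relate a coding-sequence length in nucleotides to the
# number of encoded amino acid residues (three nucleotides per codon).
# The function name and the includes_stop_codon flag are hypothetical.

def residues_from_coding_length(nucleotides, includes_stop_codon=False):
    """Return the number of residues encoded by a coding sequence."""
    if nucleotides <= 0 or nucleotides % 3 != 0:
        raise ValueError("coding length must be a positive multiple of 3")
    codons = nucleotides // 3
    # A stop codon occupies three nucleotides but encodes no residue.
    return codons - 1 if includes_stop_codon else codons


if __name__ == "__main__":
    # Lengths taken from the abstract above.
    print(residues_from_coding_length(1362))
```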

  15. Complete Genome Sequence of the Pokeweed Mosaic Virus (PkMV)-New Jersey Isolate and Its Comparison to PkMV-MD and PkMV-PA.

    PubMed

    Di, Rong

    2016-01-01

    Pokeweed mosaic virus (PkMV) causes systemic mosaic symptoms on pokeweed (Phytolacca americana L.) plants. The genome of the PkMV-NJ (New Jersey) isolate was cloned by PCR and sequenced by the Sanger sequencing method. The sequence comparison indicates that PkMV-NJ is more divergent from the other two sequenced isolates, PkMV-MD and PkMV-PA. PMID:27609914

  16. Low levels of haptoglobin and putative amino acid sequence in Taiwanese Lanyu miniature pigs.

    PubMed

    Yueh, Sunny C H; Wang, Yao Horng; Lin, Kuan Yu; Tseng, Chi Feng; Chu, Hsien Pin; Chen, Kuen Jaw; Wang, Shih Sheng; Lai, I Hsiang; Mao, Simon J T

    2008-04-01

    Porcine haptoglobin (Hp) is an acute phase protein. Its plasma level increases significantly during inflammation and infection. One of the main functions of Hp is to bind free hemoglobin (Hb) and inhibit its oxidative activity. In the present report, we studied the Hp phenotype of Taiwanese Lanyu miniature pigs (TLY minipigs; n=43) and found their Hp structure to be a homodimer (beta-alpha-alpha-beta) similar to human Hp 1-1. Interestingly, Western blot and high performance liquid chromatographic (HPLC) analysis showed that 25% of the TLY minipigs possessed low or no plasma Hp level (<0.05 mg/ml). The Hp cDNA of these TLY minipigs was then cloned, and the translated amino acid sequence was analyzed. No sequences were found to be deficient; they showed a 99.7% identity with domestic pigs (NP_999165). The mean overall Hp level of the TLY minipigs (0.21 +/- 0.25 mg/ml; n=43) determined by enzyme-linked immunosorbent assay (ELISA) was markedly lower than that of domestic pigs (0.78 +/- 0.45 mg/ml; p<0.001), while 25% of the TLY minipigs had an Hp level that was extremely low (<0.05 mg/ml). In addition, the initial recovery rate (first 40 min) in the circulation of infused fluorescein isothiocyanate (FITC)-Hb was significantly higher in the TLY minipigs with extremely low Hp levels than in those with high levels. These data suggest that the low concentration of Hp-Hb complex is responsible for the higher recovery rate of Hb in the circulation. TLY minipigs have been used as an experimental model for cardiovascular diseases; whether they can be used as a model for inflammatory diseases, with Hp as a marker, remains a topic of interest. However, since the Hp level varies significantly among individual TLY minipigs, it is necessary to prescreen the Hp levels of the animals to minimize variation in the experimental baseline. The present study may provide a reference value for future use of the TLY minipig as an animal model for inflammation-associated diseases. PMID:18460833

  17. Comparison of Whole-Genome Sequencing and Molecular-Epidemiological Techniques for Clostridium difficile Strain Typing.

    PubMed

    Dominguez, Samuel R; Anderson, Lydia J; Kotter, Cassandra V; Littlehorn, Cynthia A; Arms, Lesley E; Dowell, Elaine; Todd, James K; Frank, Daniel N

    2016-09-01

    We analyzed in parallel 27 pediatric Clostridium difficile isolates by repetitive sequence-based polymerase chain reaction (RepPCR), pulsed-field gel electrophoresis (PFGE), and whole-genome next-generation sequencing. Next-generation sequencing distinguished 3 groups of isolates that were indistinguishable by RepPCR and 1 isolate that clustered in the same PFGE group as other isolates. PMID:26407257

  18. Application of MLST and pilus gene sequence comparisons to investigate the population structures of Actinomyces naeslundii and Actinomyces oris.

    PubMed

    Henssge, Uta; Do, Thuy; Gilbert, Steven C; Cox, Steven; Clark, Douglas; Wickström, Claes; Ligtenberg, A J M; Radford, David R; Beighton, David

    2011-01-01

    Actinomyces naeslundii and Actinomyces oris are members of the oral biofilm. Their identification using 16S rRNA sequencing is problematic and better achieved by comparison of metG partial sequences. A. oris is more abundant and more frequently isolated than A. naeslundii. We used a multi-locus sequence typing approach to investigate the genotypic diversity of these species and assigned A. naeslundii (n = 37) and A. oris (n = 68) isolates to 32 and 68 sequence types (ST), respectively. Neighbor-joining and ClonalFrame dendrograms derived from the concatenated partial sequences of 7 house-keeping genes identified at least 4 significant subclusters within A. oris and 3 within A. naeslundii. The strain collection we had investigated was an under-representation of the total population since at least 3 STs composed of single strains may represent discrete clusters of strains not well represented in the collection. The integrity of these sub-clusters was supported by the sequence analysis of fimP and fimA, genes coding for the type 1 and 2 fimbriae, respectively. An A. naeslundii subcluster was identified with both fimA and fimP genes and these strains were able to bind to MUC7 and statherin while all other A. naeslundii strains possessed only fimA and did not bind to statherin. An A. oris subcluster that harboured a fimA gene similar to that of Actinomyces odontolyticus, but no detectable fimP, failed to bind significantly to either MUC7 or statherin. These data are evidence of extensive genotypic and phenotypic diversity within the species A. oris and A. naeslundii but the status of the subclusters identified here will require genome comparisons before their phylogenic position can be unequivocally established. PMID:21738661
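
    The sequence-type assignment mentioned in this abstract can be illustrated with a minimal sketch: isolates whose allele numbers agree at every typed locus receive the same ST. The isolate names, allele numbers, and the function name below are hypothetical; the authors' actual typing scheme and ST numbering are not given in the abstract.

```python
def assign_sequence_types(profiles):
    """Map each isolate to an ST number; identical allele profiles share an ST.

    `profiles` maps an isolate name to its list of allele numbers,
    one entry per house-keeping locus (seven loci in this example).
    """
    st_of_profile = {}   # allele profile (as a tuple) -> ST number
    st_of_isolate = {}   # isolate name -> ST number
    for isolate, profile in profiles.items():
        key = tuple(profile)
        if key not in st_of_profile:
            # A profile not seen before gets the next free ST number.
            st_of_profile[key] = len(st_of_profile) + 1
        st_of_isolate[isolate] = st_of_profile[key]
    return st_of_isolate


# Hypothetical allele profiles, for illustration only.
profiles = {
    "isolate_A": [1, 3, 2, 5, 1, 4, 2],
    "isolate_B": [1, 3, 2, 5, 1, 4, 3],
    "isolate_C": [1, 3, 2, 5, 1, 4, 2],
}
print(assign_sequence_types(profiles))
# {'isolate_A': 1, 'isolate_B': 2, 'isolate_C': 1}
```

    When every isolate has a distinct profile, the number of STs equals the number of isolates, which is the situation the abstract reports for A. oris (68 isolates, 68 STs).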

  19. Comparison of Exome and Genome Sequencing Technologies for the Complete Capture of Protein‐Coding Regions

    PubMed Central

    Lelieveld, Stefan H.; Spielmann, Malte; Mundlos, Stefan; Veltman, Joris A.

    2015-01-01

    ABSTRACT For next‐generation sequencing technologies, sufficient base‐pair coverage is the foremost requirement for the reliable detection of genomic variants. We investigated whether whole‐genome sequencing (WGS) platforms offer improved coverage of coding regions compared with whole‐exome sequencing (WES) platforms, and compared single‐base coverage for a large set of exome and genome samples. We find that WES platforms have improved considerably in the last years, but at comparable sequencing depth, WGS outperforms WES in terms of covered coding regions. At higher sequencing depth (95x–160x), WES successfully captures 95% of the coding regions with a minimal coverage of 20x, compared with 98% for WGS at 87‐fold coverage. Three different assessments of sequence coverage bias showed consistent biases for WES but not for WGS. We found no clear differences for the technologies concerning their ability to achieve complete coverage of 2,759 clinically relevant genes. We show that WES performs comparable to WGS in terms of covered bases if sequenced at two to three times higher coverage. This does, however, go at the cost of substantially more sequencing biases in WES approaches. Our findings will guide laboratories to make an informed decision on which sequencing platform and coverage to choose. PMID:25973577
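
    The coverage figures quoted above (for example, 95% of coding regions at a minimal coverage of 20x) are fractions of positions whose read depth meets a threshold. The sketch below shows that calculation on a made-up list of per-base depths; the function name and the data are assumptions for illustration, not the authors' pipeline.

```python
def fraction_covered(depths, min_depth=20):
    """Return the fraction of positions with read depth >= min_depth."""
    if not depths:
        raise ValueError("depths must not be empty")
    covered = sum(1 for d in depths if d >= min_depth)
    return covered / len(depths)


# Hypothetical per-base depths for ten coding positions.
depths = [35, 22, 19, 0, 48, 20, 27, 18, 31, 25]
print(fraction_covered(depths))  # 0.7
```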

  20. Phylogeny of the Sphaerotilus-Leptothrix group inferred from morphological comparisons, genomic fingerprinting, and 16S ribosomal DNA sequence analyses.

    PubMed

    Siering, P L; Ghiorse, W C

    1996-01-01

    Phase-contrast light microscopy revealed that only one of eight cultivated strains belonging to the Sphaerotilus-Leptothrix group of sheathed bacteria actually produced a sheath in standard growth media. Two Sphaerotilus natans strains produced branched cells, but other morphological characteristics that were used to identify these bacteria were consistent with previously published descriptions. Genomic fingerprints, which were obtained by performing PCR amplification with primers corresponding to enterobacterial repetitive intergenic consensus sequences, were useful for distinguishing between the genera Sphaerotilus and Leptothrix, as well as among individual strains. The complete 16S ribosomal DNA (rDNA) sequences of two strains of "Leptothrix discophora" (strains SP-6 and SS-1) were determined. In addition, partial sequences (approximately 300 nucleotides) of one strain of Leptothrix cholodnii (strain LMG 7171), an unidentified Leptothrix strain (strain NC-1), and four strains of Sphaerotilus natans (strains ATCC 13338T [T = type strain], ATCC 15291, ATCC 29329, and ATCC 29330) were determined. We found that two of the S. natans strains (ATCC 15291 and ATCC 13338T), which differed in morphology and in their genomic fingerprints, had identical sequences in the 300-nucleotide region sequenced. Both parsimony and distance matrix methods were used to infer the evolutionary relationships of the eight strains in a comparison of the 16S rDNA sequences of these organisms with 16S rDNA sequences obtained from ribosomal sequence databases. All of the strains clustered in the Rubrivivax subdivision of the beta subclass of the Proteobacteria, which confirmed previously published conclusions concerning selected individual strains. Additional analyses revealed that all of the S. natans strains clustered in one closely related group, while the Leptothrix strains clustered in two separate lineages that were approximately equidistant from the S. natans cluster. This finding

  1. Amino acid sequence of mouse nidogen, a multidomain basement membrane protein with binding activity for laminin, collagen IV and cells.

    PubMed Central

    Mann, K; Deutzmann, R; Aumailley, M; Timpl, R; Raimondi, L; Yamada, Y; Pan, T C; Conway, D; Chu, M L

    1989-01-01

    The whole amino acid sequence of nidogen was deduced from cDNA clones isolated from expression libraries and confirmed to approximately 50% by Edman degradation of peptides. The protein consists of some 1217 amino acid residues and a 28-residue signal peptide. The data support a previously proposed dumb-bell model of nidogen by demonstrating a large N-terminal globular domain (641 residues), five EGF-like repeats constituting the rod-like domain (248 residues) and a smaller C-terminal globule (328 residues). Two more EGF-like repeats interrupt the N-terminal and terminate the C-terminal sequences. Weak sequence homologies (25%) were detected between some regions of nidogen, the LDL receptor, thyroglobulin and the EGF precursor. Nidogen contains two consensus sequences for tyrosine sulfation and for asparagine beta-hydroxylation, two N-linked carbohydrate acceptor sites and, within one of the EGF-like repeats an Arg-Gly-Asp sequence. The latter was shown to be functional in cell attachment to nidogen. Binding sites for laminin and collagen IV are present on the C-terminal globule but not yet precisely localized. PMID:2496973

  2. Jack bean α-mannosidase: amino acid sequencing and N-glycosylation analysis of a valuable glycomics tool.

    PubMed

    Gnanesh Kumar, B S; Pohlentz, Gottfried; Schulte, Mona; Mormann, Michael; Siva Kumar, Nadimpalli

    2014-03-01

    Jack bean (Canavalia ensiformis) seeds contain several biologically important proteins among which α-mannosidase (EC 3.2.1.24) has been purified, its biochemical properties studied and widely used in glycan analysis. In the present study, we have used the purified enzyme and derived its amino acid sequence covering both the known subunits (molecular mass of ∼66,000 and ∼44,000 Da) hitherto not known in its entirety. Peptide de novo sequencing and structural elucidation of N-glycopeptides obtained either directly from proteolytic digestion or after zwitterionic hydrophilic interaction liquid chromatography solid phase extraction-based separation were performed by use of nanoelectrospray ionization quadrupole time-of-flight mass spectrometry and low-energy collision-induced dissociation experiments. De novo sequencing provided new insights into the disulfide linkage organization, intersection of subunits and complete N-glycan structures along with site specificities. The primary sequence suggests that the enzyme belongs to glycosyl hydrolase family 38 and the N-glycan sequence analysis revealed high-mannose oligosaccharides, which were found to be heterogeneous with varying number of hexoses viz, Man8-9GlcNAc2 and Glc1Man9GlcNAc2 in an evolutionarily conserved N-glycosylation site. This site with two proximal cysteines is present in all the acidic α-mannosidases reported so far in eukaryotes. Further, a truncated paucimannose type was identified to be lacking terminal two mannose, Man1(Xyl)GlcNAc2 (Fuc). PMID:24295789

  3. Complete Genome Sequence of Enterococcus mundtii QU 25, an Efficient l-(+)-Lactic Acid-Producing Bacterium

    PubMed Central

    Shiwa, Yuh; Yanase, Hiroaki; Hirose, Yuu; Satomi, Shohei; Araya-Kojima, Tomoko; Watanabe, Satoru; Zendo, Takeshi; Chibazakura, Taku; Shimizu-Kadota, Mariko; Yoshikawa, Hirofumi; Sonomoto, Kenji

    2014-01-01

    Enterococcus mundtii QU 25, a non-dairy bacterial strain of ovine faecal origin, can ferment both cellobiose and xylose to produce l-lactic acid. The use of this strain is highly desirable for economical l-lactate production from renewable biomass substrates. Genome sequence determination is necessary for the genetic improvement of this strain. We report the complete genome sequence of strain QU 25, primarily determined using Pacific Biosciences sequencing technology. The E. mundtii QU 25 genome comprises a 3 022 186-bp single circular chromosome (GC content, 38.6%) and five circular plasmids: pQY182, pQY082, pQY039, pQY024, and pQY003. In all, 2900 protein-coding sequences, 63 tRNA genes, and 6 rRNA operons were predicted in the QU 25 chromosome. Plasmid pQY024 harbours genes for mundticin production. We found that strain QU 25 produces a bacteriocin, suggesting that mundticin-encoded genes on plasmid pQY024 were functional. For lactic acid fermentation, two gene clusters were identified—one involved in the initial metabolism of xylose and uptake of pentose and the second containing genes for the pentose phosphate pathway and uptake of related sugars. This is the first complete genome sequence of an E. mundtii strain. The data provide insights into lactate production in this bacterium and its evolution among enterococci. PMID:24568933

  4. Gastropod arginine kinases from Cellana grata and Aplysia kurodai. Isolation and cDNA-derived amino acid sequences.

    PubMed

    Suzuki, T; Inoue, N; Higashi, T; Mizobuchi, R; Sugimura, N; Yokouchi, K; Furukohri, T

    2000-12-01

    Arginine kinase (AK) was isolated from the radular muscle of the gastropod molluscs Cellana grata (subclass Prosobranchia) and Aplysia kurodai (subclass Opisthobranchia), respectively, by ammonium sulfate fractionation, Sephadex G-75 gel filtration and DEAE-ion exchange chromatography. The denatured relative molecular mass values were estimated to be 40 kDa by sodium dodecyl sulfate-polyacrylamide gel electrophoresis. The isolated enzyme from Aplysia gave a Km value of 0.6 mM for arginine and a Vmax value of 13 micromole Pi min(-1) mg protein(-1) for the forward reaction. These values are comparable to other molluscan AKs. The cDNAs encoding Cellana and Aplysia AKs were amplified by polymerase chain reaction, and the nucleotide sequences of 1,608 and 1,239 bp, respectively, were determined. The open reading frame for Cellana AK is 1044 nucleotides in length and encodes a protein with 347 amino acid residues, and that for A. kurodai is 1077 nucleotides and 354 residues. The cDNA-derived amino acid sequences were validated by chemical sequencing of internal lysyl endopeptidase peptides. The amino acid sequences of Cellana and Aplysia AKs showed the highest percent identity (66-73%) with those of the abalone Nordotis and turbanshell Battilus belonging to the same class Gastropoda. These AK sequences still have a strong homology (63-71%) with that of the chiton Liolophura (class Polyplacophora), which is believed to be one of the most primitive molluscs. On the other hand, these AK sequences are less homologous (55-57%) with that of the clam Pseudocardium (class Bivalvia), suggesting that the biological position of the class Polyplacophora should be reconsidered. PMID:11281267
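
    Percent identity, as quoted in this and several neighbouring abstracts, is computed from a pairwise alignment. The sketch below uses one common convention, assumed here rather than taken from the paper: columns containing a gap in either sequence are skipped, and identity is the share of the remaining columns with matching residues. The two short sequences are invented for illustration.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over the gap-free columns of two aligned sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap or y == gap:
            continue  # skip columns where either sequence has a gap
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * matches / compared


# Invented six-column alignment: four matches in five gap-free columns.
print(percent_identity("MKT-LV", "MRTALV"))  # 80.0
```

    Other conventions divide by the full alignment length or by the shorter sequence length, so reported values are only comparable when the denominator is the same.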

  5. Interlaboratory comparison of measurements of acid-volatile sulfide and simultaneously extracted nickel in spiked sediments

    USGS Publications Warehouse

    Brumbaugh, William G.; Hammerschmidt, Chad R.; Zanella, Luciana; Rogevich, Emily; Salata, Gregory; Bolek, Radoslaw

    2011-01-01

    An interlaboratory comparison of acid-volatile sulfide (AVS) and simultaneously extracted nickel (SEM_Ni) measurements of sediments was conducted among five independent laboratories. Relative standard deviations for the seven test samples ranged from 5.6 to 71% (mean = 25%) for AVS and from 5.5 to 15% (mean = 10%) for SEM_Ni. These results are in stark contrast to a recently published study that indicated AVS and SEM analyses were highly variable among laboratories.

  6. Interlaboratory comparison of measurements of acid-volatile sulfide and simultaneously extracted nickel in spiked sediments

    USGS Publications Warehouse

    Brumbaugh, W.G.; Hammerschmidt, C.R.; Zanella, L.; Rogevich, E.; Salata, G.; Bolek, R.

    2011-01-01

    An interlaboratory comparison of acid-volatile sulfide (AVS) and simultaneously extracted nickel (SEM-Ni) measurements of sediments was conducted among five independent laboratories. Relative standard deviations for the seven test samples ranged from 5.6 to 71% (mean=25%) for AVS and from 5.5 to 15% (mean=10%) for SEM-Ni. These results are in stark contrast to a recently published study that indicated AVS and SEM analyses were highly variable among laboratories. © 2011 SETAC.

  7. Studies on the high-sulphur proteins of reduced Merino wool. Amino acid sequence of protein SCMKB-IIIB4

    PubMed Central

    Swart, L. S.; Haylett, T.

    1971-01-01

    The complete amino acid sequence of protein SCMKB-IIIB4 is presented. It is closely related to the sequence of protein SCMKB-IIIB3 (Haylett, Swart & Parris, 1971) differing in only four positions. The peptic and thermolysin peptides of protein SCMKB-IIIB4 were analysed by the dansyl–Edman method (Gray, 1967) and by tritium-labelling of C-terminal residues (Matsuo, Fujimoto & Tatsuno, 1966). This protein is the third member of a group of high-sulphur wool proteins with molecular weight of about 11400. It consists of 98 residues and has acetylalanine and carboxymethylcysteine as N- and C-terminal residues respectively. PMID:4942536

  8. Validation of targeted next-generation sequencing for RAS mutation detection in FFPE colorectal cancer tissues: comparison with Sanger sequencing and ARMS-Scorpion real-time PCR

    PubMed Central

    Gao, Jie; Wu, Huanwen; Wang, Li; Zhang, Hui; Duan, Huanli; Lu, Junliang; Liang, Zhiyong

    2016-01-01

    Objective To validate the targeted next-generation sequencing (NGS) platform-Ion Torrent PGM for KRAS exon 2 and expanded RAS mutations detection in formalin-fixed paraffin-embedded (FFPE) colorectal cancer (CRC) specimens, with comparison of Sanger sequencing and ARMS-Scorpion real-time PCR. Setting Beijing, China. Participants 51 archived FFPE CRC samples (36 men, 15 women) were retrospectively randomly selected and then checked by an experienced pathologist for sequencing based on histological confirmation of CRC and availability of sufficient tissue. Methods RAS mutations were detected in the 51 FFPE CRC samples by PGM analysis, Sanger sequencing and the Therascreen KRAS assay, respectively. Agreement among the 3 methods was assessed. Assay sensitivity was further determined by sequencing serially diluted DNA from FFPE cell lines with known mutation statuses. Results 13 of 51 (25.5%) cases had a mutation in KRAS exon 2, as determined by PGM analysis. PGM analysis showed 100% (51/51) concordance with Sanger sequencing (κ=1.000, 95% CI 1 to 1) and 98.04% (50/51) agreement with the Therascreen assay (κ=0.947, 95% CI 0.844 to 1) for detecting KRAS exon 2 mutations, respectively. The only discrepant case harboured a KRAS exon 2 mutation (c.37G>T) that was not covered by the Therascreen kit. The dilution series experiment results showed that PGM was able to detect KRAS mutations at a frequency of as low as 1%. Importantly, RAS mutations other than KRAS exon 2 mutations were also detected in 10 samples by PGM. Furthermore, mutations in other CRC-related genes could be simultaneously detected in a single test by PGM. Conclusions The targeted NGS platform is specific and sensitive for KRAS exon 2 mutation detection and is appropriate for use in routine clinical testing. Moreover, it is sample saving and cost-efficient and time-efficient, and has great potential for clinical application to expand testing to include mutations in RAS and other CRC-related genes. PMID
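
    The agreement statistics in this abstract can be reproduced with Cohen's kappa for two binary raters. The 2x2 table used below is an inference from the reported counts, not a table given in the abstract: 13 of 51 samples were positive by PGM, and the single discrepant case was a mutation not covered by the Therascreen kit, which implies 12 samples positive by both methods, 1 positive by PGM only, and 38 negative by both.

```python
def cohen_kappa(both_pos, a_only, b_only, both_neg):
    """Cohen's kappa for two binary methods from a 2x2 agreement table."""
    n = both_pos + a_only + b_only + both_neg
    observed = (both_pos + both_neg) / n          # raw agreement
    a_pos = (both_pos + a_only) / n               # method A positive rate
    b_pos = (both_pos + b_only) / n               # method B positive rate
    # Agreement expected by chance, given each method's positive rate.
    expected = a_pos * b_pos + (1 - a_pos) * (1 - b_pos)
    return (observed - expected) / (1 - expected)


# PGM vs Therascreen, using the inferred table described above.
print(round(cohen_kappa(12, 1, 0, 38), 3))  # 0.947
# PGM vs Sanger: complete agreement on all 51 samples.
print(cohen_kappa(13, 0, 0, 38))            # 1.0
```

    Under that inferred table, the result matches the reported κ=0.947 for 50/51 agreement.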

  9. Comparison of Sample Preparation Methods Used for the Next-Generation Sequencing of Mycobacterium tuberculosis

    PubMed Central

    Tyler, Andrea D.; Christianson, Sara; Knox, Natalie C.; Mabon, Philip; Wolfe, Joyce; Van Domselaar, Gary; Graham, Morag R.; Sharma, Meenu K.

    2016-01-01

    The advent and widespread application of next-generation sequencing (NGS) technologies to the study of microbial genomes has led to a substantial increase in the number of studies in which whole genome sequencing (WGS) is applied to the analysis of microbial genomic epidemiology. However, microorganisms such as Mycobacterium tuberculosis (MTB) present unique problems for sequencing and downstream analysis based on their unique physiology and the composition of their genomes. In this study, we compare the quality of sequence data generated using the Nextera and TruSeq isolate preparation kits for library construction prior to Illumina sequencing-by-synthesis. Our results confirm that MTB NGS data quality is highly dependent on the purity of the DNA sample submitted for sequencing and its guanine-cytosine content (or GC-content). Our data additionally demonstrate that the choice of library preparation method plays an important role in mitigating downstream sequencing quality issues. Importantly for MTB, the Illumina TruSeq library preparation kit produces more uniform data quality than the Nextera XT method, regardless of the quality of the input DNA. Furthermore, specific genomic sequence motifs are commonly missed by the Nextera XT method, as are regions of especially high GC-content relative to the rest of the MTB genome. As coverage bias is highly undesirable, this study illustrates the importance of appropriate protocol selection when performing NGS studies in order to ensure that sound inferences can be made regarding mycobacterial genomes. PMID:26849565

  10. Comparison of ribosomal RNA removal methods for transcriptome sequencing workflows in teleost fish

    Technology Transfer Automated Retrieval System (TEKTRAN)

    RNA sequencing (RNA-Seq) is becoming the standard for transcriptome analysis. Removal of contaminating ribosomal RNA (rRNA) is a priority in the preparation of libraries suitable for sequencing. rRNAs are commonly removed from total RNA via either mRNA selection or rRNA depletion. These methods have...

  11. Genomic sequence and virulence comparison of four type 2 porcine reproductive and respiratory syndrome virus strains

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Porcine reproductive and respiratory syndrome virus (PRRSV) is a ubiquitous and costly virus that exhibits substantial sequence and virulence disparity among diverse isolates. In this study, we compared the whole genomic sequence and virulence of 4 North American Type 2 PRRSV isolates. Among the 4 i...

  12. Comparison of Sample Preparation Methods Used for the Next-Generation Sequencing of Mycobacterium tuberculosis.

    PubMed

    Tyler, Andrea D; Christianson, Sara; Knox, Natalie C; Mabon, Philip; Wolfe, Joyce; Van Domselaar, Gary; Graham, Morag R; Sharma, Meenu K

    2016-01-01

    The advent and widespread application of next-generation sequencing (NGS) technologies to the study of microbial genomes has led to a substantial increase in the number of studies in which whole genome sequencing (WGS) is applied to the analysis of microbial genomic epidemiology. However, microorganisms such as Mycobacterium tuberculosis (MTB) present unique problems for sequencing and downstream analysis based on their unique physiology and the composition of their genomes. In this study, we compare the quality of sequence data generated using the Nextera and TruSeq isolate preparation kits for library construction prior to Illumina sequencing-by-synthesis. Our results confirm that MTB NGS data quality is highly dependent on the purity of the DNA sample submitted for sequencing and its guanine-cytosine content (or GC-content). Our data additionally demonstrate that the choice of library preparation method plays an important role in mitigating downstream sequencing quality issues. Importantly for MTB, the Illumina TruSeq library preparation kit produces more uniform data quality than the Nextera XT method, regardless of the quality of the input DNA. Furthermore, specific genomic sequence motifs are commonly missed by the Nextera XT method, as are regions of especially high GC-content relative to the rest of the MTB genome. As coverage bias is highly undesirable, this study illustrates the importance of appropriate protocol selection when performing NGS studies in order to ensure that sound inferences can be made regarding mycobacterial genomes. PMID:26849565

  13. A computer program for the estimation of protein and nucleic acid sequence diversity in random point mutagenesis libraries

    PubMed Central

    Volles, Michael J.; Lansbury, Peter T.

    2005-01-01

    A computer program for the generation and analysis of in silico random point mutagenesis libraries is described. The program operates by mutagenizing an input nucleic acid sequence according to mutation parameters specified by the user for each sequence position and type of point mutation. The program can mimic almost any type of random mutagenesis library, including those produced via error-prone PCR (ep-PCR), mutator Escherichia coli strains, chemical mutagenesis, and doped or random oligonucleotide synthesis. The program analyzes the generated nucleic acid sequences and/or the associated protein library to produce several estimates of library diversity (number of unique sequences, point mutations, and single point mutants) and the rate of saturation of these diversities during experimental screening or selection of clones. This information allows one to select the optimal screen size for a given mutagenesis library, necessary to efficiently obtain a certain coverage of the sequence-space. The program also reports the abundance of each specific protein mutation at each sequence position, which is useful as a measure of the level and type of mutation bias in the library. Alternatively, one can use the program to evaluate the relative merits of preexisting libraries, or to examine various hypothetical mutation schemes to determine the optimal method for creating a library that serves the screen/selection of interest. Simulated libraries of at least 10^9 sequences are accessible by the numerical algorithm with currently available personal computers; an analytical algorithm is also available which can rapidly calculate a subset of the numerical statistics in libraries of arbitrarily large size. A multi-type double-strand stochastic model of ep-PCR is developed in an appendix to demonstrate the applicability of the algorithm to amplifying mutagenesis procedures. Estimators of DNA polymerase mutation-type-specific error rates are derived using the model. 
Analyses of an
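
    The general idea of an in silico point mutagenesis library can be sketched in a few lines. This is not the program described in the abstract: it assumes a single uniform per-position mutation rate and equal probability for each substitution, whereas the described program accepts parameters per position and per mutation type. All names and the parent sequence are hypothetical.

```python
import random

BASES = "ACGT"


def mutagenize(seq, rate, rng):
    """Return a copy of seq with each base substituted with probability `rate`."""
    out = []
    for base in seq:
        if rng.random() < rate:
            # Substitute with one of the three other bases, chosen uniformly.
            out.append(rng.choice([b for b in BASES if b != base]))
        else:
            out.append(base)
    return "".join(out)


def simulate_library(parent, rate, size, seed=0):
    """Generate `size` independently mutagenized clones of `parent`."""
    rng = random.Random(seed)  # fixed seed makes the run repeatable
    return [mutagenize(parent, rate, rng) for _ in range(size)]


def diversity(library, parent):
    """Count clones, unique sequences, and unique single point mutants."""
    unique = set(library)
    single = {
        s for s in unique
        if sum(a != b for a, b in zip(s, parent)) == 1
    }
    return {
        "clones": len(library),
        "unique": len(unique),
        "single_point_mutants": len(single),
    }


parent = "ATGGCTAAA"  # hypothetical 9-nt parent sequence
print(diversity(simulate_library(parent, 0.05, 1000), parent))
```

    Repeating the count at increasing library sizes gives a rough picture of how quickly the number of unique sequences saturates, which is the kind of estimate the abstract describes.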

  14. Comparison of winding-number sequences for symmetric and asymmetric oscillatory systems.

    PubMed

    Englisch, Volker; Parlitz, Ulrich; Lauterborn, Werner

    2015-08-01

    The bifurcation sets of symmetric and asymmetric periodically driven oscillators are investigated and classified by means of winding numbers. It is shown that periodic windows within chaotic regions are forming winding-number sequences on different levels. These sequences can be described by a simple formula that makes it possible to predict winding numbers at bifurcation points. Symmetric and asymmetric systems follow similar rules for the development of winding numbers within different sequences and these sequences can be combined into a single general rule. The role of the two distinct period-doubling cascades is investigated in the light of the winding-number sequences discovered. Examples are taken from the double-well Duffing oscillator, a special two-parameter Duffing oscillator, and a bubble oscillator. PMID:26382476

  15. Innovation sequence application to aircraft sensor fault detection: comparison of checking covariance matrix algorithms

    PubMed

    Caliskan; Hajiyev

    2000-01-01

    In this paper, the algorithms verifying the covariance matrix of the Kalman filter innovation sequence are compared with respect to detected minimum fault rate and detection time. Four algorithms are dealt with; the algorithm verifying the trace of the covariance matrix of the innovation sequence, the algorithm verifying the sum of all elements of the inverse covariance matrix of the innovation sequence, the optimal algorithm verifying the ratio of two quadratic forms of which matrices are theoretic and selected covariance matrices of Kalman filter innovation sequence, and the algorithm verifying the generalized variance of the covariance matrix of the innovation sequence. The algorithms are implemented for longitudinal dynamics of an aircraft to detect sensor faults, and some suggestions are given on the use of the algorithms in flight control systems. PMID:10826285

  16. DNA Sequence and Expression Variation of Hop (Humulus lupulus) Valerophenone Synthase (VPS), a Key Gene in Bitter Acid Biosynthesis

    PubMed Central

    Castro, Consuelo B.; Whittock, Lucy D.; Whittock, Simon P.; Leggett, Grey; Koutoulis, Anthony

    2008-01-01

    Background The hop plant (Humulus lupulus) is a source of many secondary metabolites, with bitter acids essential in the beer brewing industry and others having potential applications for human health. This study investigated variation in DNA sequence and gene expression of valerophenone synthase (VPS), a key gene in the bitter acid biosynthesis pathway of hop. Methods Sequence variation was studied in 12 varieties, and expression was analysed in four of the 12 varieties in a series across the development of the hop cone. Results Nine single nucleotide polymorphisms (SNPs) were detected in VPS, seven of which were synonymous. The two non-synonymous polymorphisms did not appear to be related to typical bitter acid profiles of the varieties studied. However, real-time quantitative reverse-transcription polymerase chain reaction (qRT-PCR) analysis of VPS expression during hop cone development showed a clear link with the bitter acid content. The highest levels of VPS expression were observed in two triploid varieties, ‘Symphony’ and ‘Ember’, which typically have high bitter acid levels. Conclusions In all hop varieties studied, VPS expression was lowest in the leaves and an increase in expression was consistently observed during the early stages of cone development. PMID:18519445

  17. Comparison of sterols and fatty acids in two species of Ganoderma

    PubMed Central

    2012-01-01

    Background Two species of Ganoderma, G. sinense and G. lucidum, are used as Lingzhi in China. However, the contents of triterpenoids and polysaccharides, the main active compounds, differ significantly between the two species, although extracts of both G. lucidum and G. sinense have an antitumoral proliferation effect. It is suspected that other compounds contribute to their antitumoral activity. Sterols and fatty acids have obvious bioactivity. Therefore, determination and comparison of sterols and fatty acids is helpful to elucidate the active components of Lingzhi. Results Ergosterol, a specific component of the fungal cell membrane, was abundant in both G. lucidum and G. sinense, but its content in G. lucidum (median content 705.0 μg/g, range 189.1-1453.3 μg/g, n = 19) was much higher than that in G. sinense (median content 80.1 μg/g, range 16.0-409.8 μg/g, n = 13). Hierarchical clustering analysis based on the content of ergosterol showed that the 32 tested samples of Ganoderma were grouped into two main clusters, G. lucidum and G. sinense. Hierarchical clustering analysis based on the contents of ten fatty acids showed that the two species of Ganoderma did not differ significantly, though two groups were also obtained. The similarity of the two species of Ganoderma in fatty acids may be related to their antitumoral proliferation effect. Conclusions The content of ergosterol is much higher in G. lucidum than in G. sinense. Palmitic acid, linoleic acid, oleic acid and stearic acid are the main fatty acids in Ganoderma, and their contents did not differ significantly between G. lucidum and G. sinense, which may contribute to their antitumoral proliferation effect. PMID:22293530

  18. The amino acid sequence of protein SCMK-B2A from the high-sulphur fraction of wool keratin

    PubMed Central

    Elleman, T. C.

    1972-01-01

    1. The amino acid sequence of protein SCMK-B2A, a reduced and S-carboxymethylated protein from the high-sulphur fraction of wool, has been determined. 2. This protein of 171 amino acid residues displays both a high degree of internal homology and extensive external homology with other members of the SCMK-B2 group of proteins. 3. Evidence is presented which suggests that the SCMK-B2 group of proteins are produced by separate non-allelic genes. PMID:4679226

  19. Environmental comparison of biobased chemicals from glutamic acid with their petrochemical equivalents.

    PubMed

    Lammens, Tijs M; Potting, José; Sanders, Johan P M; De Boer, Imke J M

    2011-10-01

    Glutamic acid is an important constituent of waste streams from biofuels production. It is an interesting starting material for the synthesis of biobased chemicals, thereby decreasing the dependency on fossil fuels. The objective of this paper was to compare the environmental impact of four biobased chemicals from glutamic acid with their petrochemical equivalents, that is, N-methylpyrrolidone (NMP), N-vinylpyrrolidone (NVP), acrylonitrile (ACN), and succinonitrile (SCN). A consequential life cycle assessment was performed, wherein glutamic acid was obtained from sugar beet vinasse. The removed glutamic acid was substituted with cane molasses and ureum. The comparison between the four biobased and petrochemical products showed that for NMP and NVP the biobased version had less impact on the environment, while for ACN and SCN the petrochemical version had less impact on the environment. For the latter two an optimized scenario was computed, which showed that the process for SCN can be improved to a level at which it can compete with the petrochemical process. For biobased ACN large improvements are required to make it competitive with its petrochemical equivalent. The results of this LCA and the research preceding it also show that glutamic acid can be a building block for a variety of molecules that are currently produced from petrochemical resources. Currently, most methods to produce biobased products are biotechnological processes based on sugar, but this paper demonstrates that the use of amino acids from low-value byproducts can certainly be a method as well. PMID:21870885

  20. High-affinity homologous peptide nucleic acid probes for targeting a quadruplex-forming sequence from a MYC promoter element.

    PubMed

    Roy, Subhadeep; Tanious, Farial A; Wilson, W David; Ly, Danith H; Armitage, Bruce A

    2007-09-18

    Guanine-rich DNA and RNA sequences are known to fold into secondary structures known as G-quadruplexes. Recent biochemical evidence along with the discovery of an increasing number of sequences in functionally important regions of the genome capable of forming G-quadruplexes strongly indicates important biological roles for these structures. Thus, molecular probes that can selectively target quadruplex-forming sequences (QFSs) are envisioned as tools to delineate biological functions of quadruplexes as well as potential therapeutic agents. Guanine-rich peptide nucleic acids have been previously shown to hybridize to homologous DNA or RNA sequences forming PNA-DNA (or RNA) quadruplexes. For this paper we studied the hybridization of an eight-mer G-rich PNA to a quadruplex-forming sequence derived from the promoter region of the MYC proto-oncogene. UV melting analysis, fluorescence assays, and surface plasmon resonance experiments reveal that this PNA binds to the MYC QFS in a 2:1 stoichiometry and with an average binding constant Ka = (2.0 ± 0.2) × 10⁸ M⁻¹ or Kd = 5.0 nM. In addition, experiments carried out with short DNA targets revealed a dependence of the affinity on the sequence of bases in the loop region of the DNA. A structural model for the hybrid quadruplex is proposed, and implications for gene targeting by G-rich PNAs are discussed. PMID:17718513
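
    The two binding constants quoted in this abstract are related by a simple reciprocal, Kd = 1/Ka. The following is a minimal sketch of that arithmetic only, assuming the reported average constant is treated as a single-site association constant; the function name is illustrative and does not come from the paper.

```python
def dissociation_constant_nM(ka_per_molar: float) -> float:
    """Convert an association constant (M^-1) to a dissociation constant (nM)."""
    if ka_per_molar <= 0:
        raise ValueError("association constant must be positive")
    kd_molar = 1.0 / ka_per_molar   # Kd = 1 / Ka, in mol/L
    return kd_molar * 1e9           # mol/L -> nmol/L


# Value reported in the abstract: Ka = 2.0 x 10^8 M^-1
kd_nM = dissociation_constant_nM(2.0e8)
print(f"Kd = {kd_nM:.1f} nM")
```

    Applying the same conversion to the ends of the reported uncertainty range (1.8 × 10⁸ and 2.2 × 10⁸ M⁻¹) gives an approximate spread for Kd.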

  1. A knowledge engineering approach to recognizing and extracting sequences of nucleic acids from scientific literature.

    PubMed

    García-Remesal, Miguel; Maojo, Victor; Crespo, José

    2010-01-01

    In this paper we present a knowledge engineering approach to automatically recognize and extract genetic sequences from scientific articles. To carry out this task, we use a preliminary recognizer based on a finite state machine to extract all candidate DNA/RNA sequences. The latter are then fed into a knowledge-based system that automatically discards false positives and refines noisy and incorrectly merged sequences. We created the knowledge base by manually analyzing different manuscripts containing genetic sequences. Our approach was evaluated using a test set of 211 full-text articles in PDF format containing 3134 genetic sequences. For such set, we achieved 87.76% precision and 97.70% recall respectively. This method can facilitate different research tasks. These include text mining, information extraction, and information retrieval research dealing with large collections of documents containing genetic sequences. PMID:21096556
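
    The first stage described in this abstract, a finite-state recognizer that proposes candidate DNA/RNA sequences, can be illustrated with a regular expression (which is itself a finite-state recognizer). This is a hedged sketch, not the authors' system: the pattern, the minimum length of 10, and both function names are assumptions chosen for illustration, and the knowledge-based filtering stage is not reproduced.

```python
import re

# Assumed rule: runs of uppercase nucleotide letters, optionally split
# into blocks by single spaces or hyphens (as often printed in articles).
CANDIDATE = re.compile(r"\b[ACGTU]+(?:[ -][ACGTU]+)*\b")


def find_candidate_sequences(text: str, min_len: int = 10) -> list[str]:
    """Return candidate sequences with separators removed, length >= min_len."""
    candidates = []
    for match in CANDIDATE.finditer(text):
        cleaned = re.sub(r"[ -]", "", match.group())
        if len(cleaned) >= min_len:
            candidates.append(cleaned)
    return candidates


def precision_recall(tp: int, fp: int, fn: int) -> tuple[float, float]:
    """Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    return tp / (tp + fp), tp / (tp + fn)


sample = "The primer 5'-ATGGCCAAGT TGACCAGTGC-3' was used; GATTACA is too short."
print(find_candidate_sequences(sample))
print(precision_recall(tp=8, fp=2, fn=1))  # toy counts, not the paper's data
```

    A recognizer this permissive favours recall over precision, which is consistent with the abstract's design of following it with a stage that discards false positives.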

  2. Ferredoxin:NADP oxidoreductase of Cyanophora paradoxa: purification, partial characterization, and N-terminal amino acid sequence.

    PubMed

    Gebhart, U B; Maier, T L; Stevanović, S; Bayer, M G; Schenk, H E

    1992-06-01

    The ferredoxin:NADP+ oxidoreductase of the protist Cyanophora paradoxa, as a descendant of a former symbiotic consortium, an important model organism in view of the Endosymbiosis Theory, is the first enzyme purified from a formerly original endocytobiont (cyanelle) that is found to be encoded in the nucleus of the host. This cyanoplast enzyme was isolated by FPLC (19% yield) and characterized with respect to the uv-vis spectrum, pH optimum (pH 9), molecular mass of 34 kDa, and an N-terminal amino acid sequence (24 residues). The enzyme shows, as known from other organisms, molecular heterogeneity. The N-terminus of a further ferredoxin:NADP+ oxidoreductase polypeptide represents a shorter sequence missing the first four amino acids of the mature enzyme. PMID:1392619

  3. Purification, characterization, and amino acid sequencing of a Δ⁵-3-oxosteroid isomerase from Pseudomonas putida biotype B

    SciTech Connect

    Linden, K.G.

    1986-01-01

    Studies were performed on the Δ⁵-3-oxosteroid isomerase from Pseudomonas putida biotype B. The studies have involved three broad areas: improvement in the purification of the enzyme, further characterization of the purified enzyme, and completion of the amino acid sequence of the enzyme. For the purification of the enzyme, techniques for removing the isomerase from whole cells were studied, the effects of ionic strength on the binding of the isomerase to steroidal affinity resins were explored, and a new affinity resin was developed. Absorption spectra and the proton NMR spectra of the isomerase were obtained. Amino acid sequencing of the oxosteroid isomerase indicates that the enzyme is a dimeric protein consisting of two identical subunits, each consisting of a polypeptide chain of 131 residues with Mr = 14,536.

  4. A comparison of ARMS and DNA sequencing for mutation analysis in clinical biopsy samples

    PubMed Central

    2010-01-01

    Background We have compared mutation analysis by DNA sequencing and Amplification Refractory Mutation System™ (ARMS™) for their ability to detect mutations in clinical biopsy specimens. Methods We have evaluated five real-time ARMS assays: BRAF 1799T>A, [this includes V600E and V600K] and NRAS 182A>G [Q61R] and 181C>A [Q61K] in melanoma, EGFR 2573T>G [L858R], 2235-2249del15 [E746-A750del] in non-small-cell lung cancer, and compared the results to DNA sequencing of the mutation 'hot-spots' in these genes in formalin-fixed paraffin-embedded tumour (FF-PET) DNA. Results The ARMS assays maximised the number of samples that could be analysed when both the quality and quantity of DNA was low, and improved both the sensitivity and speed of analysis compared with sequencing. ARMS was more robust with fewer reaction failures compared with sequencing and was more sensitive as it was able to detect functional mutations that were not detected by DNA sequencing. DNA sequencing was able to detect a small number of lower frequency recurrent mutations across the exons screened that were not interrogated using the specific ARMS assays in these studies. Conclusions ARMS was more sensitive and robust at detecting defined somatic mutations than DNA sequencing on clinical samples where the predominant sample type was FF-PET. PMID:20925915

  5. Identification of novel rice low phytic acid mutations via TILLING by sequencing

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Phytic acid (myo-inositol-1,2,3,4,5,6-hexakisphosphate or InsP6) accounts for 75-85% of the total phosphorus in seeds. Low phytic acid (lpa) mutants exhibit decreases in seed InsP6 with corresponding increases in inorganic P which, unlike phytic acid P, is readily utilized by humans and monogastric ...

  6. Snake venoms. The amino-acid sequence of trypsin inhibitor E of Dendroaspis polylepis polylepis (Black Mamba) venom.

    PubMed

    Joubert, F J; Strydom, D J

    1978-06-01

    Trypsin inhibitor E from black mamba venom comprises 59 amino acid residues in a single polypeptide chain, cross-linked by three intrachain disulphide bridges. The complete primary structure of inhibitor E was elucidated. The sequence is homologous with trypsin inhibitors from different sources. Unique among this homologous series of proteinase inhibitors, inhibitor E has an affinity for transition metal ions, exemplified here by Cu2+ and Co2+. PMID:668688

  7. Draft Genome Sequence of Escherichia coli Strain VKPM B-10182, Producing the Enzyme for Synthesis of Cephalosporin Acids

    PubMed Central

    Mardanov, Andrey V.; Eldarov, Mikhail A.; Sklyarenko, Anna V.; Dumina, Maria V.; Beletsky, Alexey V.; Yarotsky, Sergey V.

    2014-01-01

    Escherichia coli strain VKPM B-10182, obtained by chemical mutagenesis from E. coli strain ATCC 9637, produces cephalosporin acid synthetase employed in the synthesis of β-lactam antibiotics, such as cefazolin. The draft genome sequence of strain VKPM B-10182 revealed 32 indels and 1,780 point mutations that might account for the improvement in antibiotic synthesis that we observed. PMID:25414512

  8. Complete sequence of the genome of avian paramyxovirus type 2 (strain Yucaipa) and comparison with other paramyxoviruses

    PubMed Central

    Subbiah, Madhuri; Xiao, Sa; Collins, Peter L.; Samal, Siba K

    2009-01-01

    The complete RNA genome sequence of avian paramyxovirus (APMV) serotype 2, strain Yucaipa isolated from chicken has been determined. With genome size of 14,904 nucleotides (nt), strain Yucaipa is consistent with the “rule of six” and is the smallest virus reported to date among the members of subfamily Paramyxovirinae. The genome contains six non-overlapping genes in the order 3′-N-P/V-M-F-HN-L-5′. The genes are flanked on either side by highly-conserved transcription start and stop signals and have intergenic sequences varying in length from 3 to 23 nt. The genome contains a 55 nt leader sequence at 3′ end and a 154 nt trailer sequence at 5′ end. Alignment and phylogenetic analysis of the predicted amino acid sequences of strain Yucaipa proteins with the cognate proteins of viruses of all of the five genera of family Paramyxoviridae showed that APMV-2 strain Yucaipa is more closely related to APMV-6 than APMV-1. PMID:18603323
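
    The "rule of six" mentioned in this abstract requires the genome length to be an exact multiple of six nucleotides. A minimal sketch of that check follows, using the genome size reported above; the function name is illustrative.

```python
def obeys_rule_of_six(genome_length_nt: int) -> bool:
    """True when the genome length is an exact multiple of six nucleotides."""
    if genome_length_nt <= 0:
        raise ValueError("genome length must be positive")
    return genome_length_nt % 6 == 0


yucaipa_length = 14_904  # nt, as reported for APMV-2 strain Yucaipa
print(obeys_rule_of_six(yucaipa_length), yucaipa_length // 6)
```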

  9. First draft genome sequencing of indole acetic acid producing and plant growth promoting fungus Preussia sp. BSL10.

    PubMed

    Khan, Abdul Latif; Asaf, Sajjad; Khan, Abdur Rahim; Al-Harrasi, Ahmed; Al-Rawahi, Ahmed; Lee, In-Jung

    2016-05-10

    Preussia sp. BSL10, family Sporormiaceae, was actively producing phytohormone (indole-3-acetic acid) and extra-cellular enzymes (phosphatases and glucosidases). The fungus was also promoting the growth of arid-land tree-Boswellia sacra. Looking at such prospects of this fungus, we sequenced its draft genome for the first time. The Illumina based sequence analysis reveals an approximate genome size of 31.4Mbp for Preussia sp. BSL10. Based on ab initio gene prediction, total 32,312 coding sequences were annotated consisting of 11,967 coding genes, pseudogenes, and 221 tRNA genes. Furthermore, 321 carbohydrate-active enzymes were predicted and classified into many functional families. PMID:26995610

  10. A simple ligation-based method to increase the information density in sequencing reactions used to deconvolute nucleic acid selections

    PubMed Central

    Childs-Disney, Jessica L.; Disney, Matthew D.

    2008-01-01

    Herein, a method is described to increase the information density of sequencing experiments used to deconvolute nucleic acid selections. The method is facile and should be applicable to any selection experiment. A critical feature of this method is the use of biotinylated primers to amplify and encode a BamHI restriction site on both ends of a PCR product. After amplification, the PCR reaction is captured onto streptavidin resin, washed, and digested directly on the resin. Resin-based digestion affords clean product that is devoid of partially digested products and unincorporated PCR primers. The product's complementary ends are annealed and ligated together with T4 DNA ligase. Analysis of ligation products shows formation of concatemers of different length and little detectable monomer. Sequencing results produced data that routinely contained three to four copies of the library. This method allows for more efficient formulation of structure-activity relationships since multiple active sequences are identified from a single clone. PMID:18065718

  11. Massively parallel rRNA gene sequencing exacerbates the potential for biased community diversity comparisons due to variable library sizes

    SciTech Connect

    Gihring, Thomas; Green, Stefan; Schadt, Christopher Warren

    2011-01-01

    Technologies for massively parallel sequencing are revolutionizing microbial ecology and are vastly increasing the scale of ribosomal RNA (rRNA) gene studies. Although pyrosequencing has increased the breadth and depth of possible rRNA gene sampling, one drawback is that the number of reads obtained per sample is difficult to control. Pyrosequencing libraries typically vary widely in the number of sequences per sample, even within individual studies, and there is a need to revisit the behaviour of richness estimators and diversity indices with variable gene sequence library sizes. Multiple reports and review papers have demonstrated the bias in non-parametric richness estimators (e.g. Chao1 and ACE) and diversity indices when using clone libraries. However, we found that biased community comparisons are accumulating in the literature. Here we demonstrate the effects of sample size on Chao1, ACE, CatchAll, Shannon, Chao-Shen and Simpson's estimations specifically using pyrosequencing libraries. The need to equalize the number of reads being compared across libraries is reiterated, and investigators are directed towards available tools for making unbiased diversity comparisons.
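
    The recommendation in this abstract, equalizing read numbers before comparing richness estimates, can be sketched as random subsampling (rarefaction) followed by an estimator such as Chao1. The example below is an illustration under stated assumptions, not the authors' tooling: it uses the bias-corrected Chao1 form S_obs + F1(F1 - 1) / (2(F2 + 1)), where F1 and F2 are the numbers of singleton and doubleton taxa, and the OTU names and counts are made up.

```python
import random
from collections import Counter


def rarefy(counts: dict[str, int], depth: int, seed: int = 0) -> dict[str, int]:
    """Subsample `depth` reads without replacement from taxon -> read counts."""
    pool = [taxon for taxon, n in counts.items() for _ in range(n)]
    if depth > len(pool):
        raise ValueError("depth exceeds library size")
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    return dict(Counter(rng.sample(pool, depth)))


def chao1(counts: dict[str, int]) -> float:
    """Bias-corrected Chao1 richness estimate."""
    abundances = [n for n in counts.values() if n > 0]
    s_obs = len(abundances)
    f1 = sum(1 for n in abundances if n == 1)  # singletons
    f2 = sum(1 for n in abundances if n == 2)  # doubletons
    return s_obs + f1 * (f1 - 1) / (2 * (f2 + 1))


# Hypothetical libraries of very different sizes
large = {"otu1": 500, "otu2": 300, "otu3": 2, "otu4": 1, "otu5": 1}
small = {"otu1": 60, "otu2": 35, "otu3": 1}

depth = min(sum(large.values()), sum(small.values()))
for name, library in (("large", large), ("small", small)):
    sub = rarefy(library, depth)
    print(name, sum(sub.values()), chao1(sub))
```

    Comparing the estimates at a common depth, rather than on the raw libraries, is the point of the exercise; in practice the subsampling would be repeated with several seeds.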

  12. A novel T-cell-defined HLA-DR polymorphism not predicted from the linear amino acid sequence.

    PubMed

    Termijtelen, A; van den Elsen, P; Koning, F; de Koster, S; Schroeijers, W; Vanderkerckhove, B

    1989-09-01

    Recent investigations have shown that alloreactive T cells are capable of responding to structures defined by specific linear amino acid sequences on class II molecules. In the present study we show that also a polymorphism can be recognized that is not defined by such linear amino acid sequences. Two human T-cell clones, sensitized to DRw13 haplotypes, are described. The description of clone c50 serves to exemplify the first model. This DRB1-specific clone responds to stimulator cells that carry DR molecules, different in their DRB1 first and second hypervariable regions (HV1 and HV2) but identical in their HV3 regions (i.e., DRw13,Dw18; DRw13,Dw19; DR4,Dw10; and DRw11,LDVII). The second clone, c1443, behaves nonconventionally. It responds to DRw13,Dw18; DRw13,Dw19; and DR4,Dw4 stimulator cells, although no specific amino acid sequence is shared between these specificities. The latter pattern of reactivity suggests the existence of a novel polymorphism recognized by alloreactive T cells. This particular polymorphism may also be biologically significant. PMID:2476425

  13. cDNA-derived amino-acid sequence of a land turtle (Geochelone carbonaria) beta-chain hemoglobin.

    PubMed

    Bordin, S; Meza, A N; Saad, S T; Ogo, S H; Costa, F F

    1997-06-01

    The cDNA sequence encoding the turtle Geochelone carbonaria beta-chain was determined. The isolation of hemoglobin mRNA was based on degenerate-primer PCR in combination with 5'- and 3'-RACE protocols. The full-length cDNA is 615 bp, with the ATG start codon at position 53 and the TGA stop codon at position 495; the AATAAA polyadenylation signal is found at position 599. The deduced polypeptide contains 146 amino-acid residues. The predicted amino acid sequence shares 83% identity with the beta-globin of a related species, the aquatic turtle C. p. belli. Otherwise, identity is higher when compared with chicken beta-Hb (80%) than with other reptilian orders (Squamata, 69%, and Crocodilia, 61%). Compared with human HbA, there is 67% identity, and at least three amino acid substitutions could be of some functional significance (Glu43 beta→Ser, His116 beta→Thr and His143 beta→Leu). To our knowledge this represents the first cDNA sequence of a reptile globin gene described. PMID:9238523

  14. Amino acid sequence of the serine-repeat antigen (SERA) of Plasmodium falciparum determined from cloned cDNA.

    PubMed

    Bzik, D J; Li, W B; Horii, T; Inselburg, J

    1988-09-01

    We report the isolation of cDNA clones for a Plasmodium falciparum gene that encodes the complete amino acid sequence of a previously identified exported blood stage antigen. The Mr of this antigen protein had been determined by sodium dodecylsulphate-polyacrylamide gel electrophoresis analysis, by different workers, to be 113,000, 126,000, and 140,000. We show, by cDNA nucleotide sequence analysis, that this antigen gene encodes a 989 amino acid protein (111 kDa) that contains a potential signal peptide, but not a membrane anchor domain. In the FCR3 strain the serine content of the protein was 11%, of which 57% of the serine residues were localized within a 201 amino acid sequence that included 35 consecutive serine residues. The protein also contained three possible N-linked glycosylation sites and numerous possible O-linked glycosylation sites. The mRNA was abundant during late trophozoite-schizont parasite stages. We propose to identify this antigen, which had been called p126, by the acronym SERA, serine-repeat antigen, based on its complete structure. The usefulness of the cloned cDNA as a source of a possible malaria vaccine is considered in view of the previously demonstrated ability of the antigen to induce parasite-inhibitory antibodies and a protective immune response in Saimiri monkeys. PMID:2847041

  15. Amino acid sequences of lysozymes newly purified from invertebrates imply wide distribution of a novel class in the lysozyme family.

    PubMed

    Ito, Y; Yoshikawa, A; Hotani, T; Fukuda, S; Sugimura, K; Imoto, T

    1999-01-01

    Lysozymes were purified from three invertebrates: a marine bivalve, a marine conch, and an earthworm. The purified lysozymes all showed a similar molecular weight of 13 kDa on SDS/PAGE. Their N-terminal sequences up to the 33rd residue determined here were apparently homologous among them; in addition, they had a homology with a partial sequence of a starfish lysozyme which had been reported before. The complete sequence of the bivalve lysozyme was determined by peptide mapping and subsequent sequence analysis. This was composed of 123 amino acids including as many as 14 cysteine residues and did not show a clear homology with the known types of lysozymes. However, the homology search of this protein on the protein or nucleic acid database revealed two homologous proteins. One of them was a gene product, CELF22 A3.6 of C. elegans, which was a functionally unknown protein. The other was an isopeptidase of a medicinal leech, named destabilase. Thus, a new type of lysozyme found in at least four species across the three classes of the invertebrates demonstrates a novel class of protein/lysozyme family in invertebrates. The bivalve lysozyme, first characterized here, showed extremely high protein stability and hen lysozyme-like enzymatic features. PMID:9914527

  16. Complete Genome Sequences of Escherichia coli O157:H7 Strains SRCC 1675 and 28RC, Which Vary in Acid Resistance

    PubMed Central

    Baranzoni, Gian Marco; Reichenberger, Erin R.; Kim, Gwang-Hee; Breidt, Frederick; Kay, Kathryn; Oh, Deog-Hwan

    2016-01-01

    The level of acid resistance among Escherichia coli O157:H7 strains varies, and strains with higher resistance to acid may have a lower infectious dose. The complete genome sequences belonging to two strains of Escherichia coli O157:H7 with different levels of acid resistance are presented here. PMID:27469964

  17. Complete Genome Sequences of Escherichia coli O157:H7 Strains SRCC 1675 and 28RC, Which Vary in Acid Resistance.

    PubMed

    Baranzoni, Gian Marco; Fratamico, Pina M; Reichenberger, Erin R; Kim, Gwang-Hee; Breidt, Frederick; Kay, Kathryn; Oh, Deog-Hwan

    2016-01-01

    The level of acid resistance among Escherichia coli O157:H7 strains varies, and strains with higher resistance to acid may have a lower infectious dose. The complete genome sequences belonging to two strains of Escherichia coli O157:H7 with different levels of acid resistance are presented here. PMID:27469964

  18. Complete genome sequences of Escherichia coli O157:H7 strains SRCC 1675 and 28RC that vary in acid resistance

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The level of acid resistance among Escherichia coli O157:H7 strains varies, and strains with higher resistance to acid may have a lower infectious dose. The complete genome sequences belonging to two strains of Escherichia coli O157:H7 with different levels of acid resistance are presented....

  19. Solving a sequencing problem in the vertebrate mitochondrial control region using phylogenetic comparisons.

    PubMed

    Feinstein, Julie; Cracraft, Joel

    2004-01-01

    The mitochondrial control region (mtCR) of the bird-of-paradise, Phonygammus keraudrenii, the Trumpet Manucode, contains a unique arrangement of homopolymers and short tandem repeats. Homopolymers occur within a few hundred bases of each other, trapping sequence information between unsequenceable barriers. A comparative strategy, involving other manucode species, allowed the prediction of primer sites in the inaccessible region. The method is suggested for similar sequencing problems. PMID:15621664

  20. Pierre Robin Sequence and Treacher Collins Hypoplastic Mandible Comparison Using Three-Dimensional Morphometric Analysis

    PubMed Central

    Chung, Michael T.; Levi, Benjamin; Hyun, Jeong S.; Lo, David D.; Montoro, Daniel T.; Lisiecki, Jeffrey; Bradley, James P.; Buchman, Steven R.; Longaker, Michael T.; Wan, Derrick C.

    2012-01-01

    Pierre Robin sequence and Treacher Collins syndrome are both associated with mandibular hypoplasia. It has been hypothesized, however, that the mandible may be differentially affected. The purpose of this study was to therefore compare mandibular morphology in children with Pierre Robin sequence to children with Treacher Collins syndrome using three-dimensional analysis of computed tomography (CT) scans. A retrospective analysis was performed identifying children with Pierre Robin sequence and Treacher Collins syndrome receiving CT scans. Three-dimensional reconstruction was performed and ramus height, mandibular body length, and gonial angle were measured. These were then compared to control children with normal mandibles and to clinical norms corrected for age and sex based on previously published measurements. Mandibular body length was found to be significantly shorter for children with Pierre Robin sequence while ramus height was significantly shorter for children with Treacher Collins syndrome. This resulted in distinctly different ramus height/mandibular body length ratios. In addition, the gonial angle was more obtuse in both the Pierre Robin sequence and Treacher Collins syndrome groups compared with the controls. Three-dimensional mandibular morphometric analysis in patients with Pierre Robin sequence and Treacher Collins syndrome thus revealed distinctly different patterns of mandibular hypoplasia relative to normal controls. These findings underscore distinct considerations which must be made in surgical planning for reconstruction. PMID:23154353

  1. Comparison of sequencing-based methods to profile DNA methylation and identification of monoallelic epigenetic modifications.

    PubMed

    Harris, R Alan; Wang, Ting; Coarfa, Cristian; Nagarajan, Raman P; Hong, Chibo; Downey, Sara L; Johnson, Brett E; Fouse, Shaun D; Delaney, Allen; Zhao, Yongjun; Olshen, Adam; Ballinger, Tracy; Zhou, Xin; Forsberg, Kevin J; Gu, Junchen; Echipare, Lorigail; O'Geen, Henriette; Lister, Ryan; Pelizzola, Mattia; Xi, Yuanxin; Epstein, Charles B; Bernstein, Bradley E; Hawkins, R David; Ren, Bing; Chung, Wen-Yu; Gu, Hongcang; Bock, Christoph; Gnirke, Andreas; Zhang, Michael Q; Haussler, David; Ecker, Joseph R; Li, Wei; Farnham, Peggy J; Waterland, Robert A; Meissner, Alexander; Marra, Marco A; Hirst, Martin; Milosavljevic, Aleksandar; Costello, Joseph F

    2010-10-01

    Analysis of DNA methylation patterns relies increasingly on sequencing-based profiling methods. The four most frequently used sequencing-based technologies are the bisulfite-based methods MethylC-seq and reduced representation bisulfite sequencing (RRBS), and the enrichment-based techniques methylated DNA immunoprecipitation sequencing (MeDIP-seq) and methylated DNA binding domain sequencing (MBD-seq). We applied all four methods to biological replicates of human embryonic stem cells to assess their genome-wide CpG coverage, resolution, cost, concordance and the influence of CpG density and genomic context. The methylation levels assessed by the two bisulfite methods were concordant (their difference did not exceed a given threshold) for 82% of CpGs and 99% of the non-CpG cytosines. Using binary methylation calls, the two enrichment methods were 99% concordant and regions assessed by all four methods were 97% concordant. We combined MeDIP-seq with methylation-sensitive restriction enzyme (MRE-seq) sequencing for comprehensive methylome coverage at lower cost. This, along with RNA-seq and ChIP-seq of the ES cells, enabled us to detect regions with allele-specific epigenetic states, identifying most known imprinted regions and new loci with monoallelic epigenetic marks and monoallelic expression. PMID:20852635
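
    The concordance measure described in this abstract (two methods agree at a site when their methylation levels differ by no more than a threshold) can be sketched as follows. The abstract does not state the threshold, so the default of 0.25 is an assumption, as are the function name and the toy methylation levels.

```python
def concordance(levels_a: dict[int, float],
                levels_b: dict[int, float],
                threshold: float = 0.25) -> float:
    """Fraction of shared sites whose methylation levels differ by <= threshold."""
    shared = levels_a.keys() & levels_b.keys()  # only sites covered by both methods
    if not shared:
        raise ValueError("no sites covered by both methods")
    agree = sum(1 for site in shared
                if abs(levels_a[site] - levels_b[site]) <= threshold)
    return agree / len(shared)


# Hypothetical per-CpG methylation levels (site position -> fraction methylated)
method_a = {101: 0.90, 245: 0.10, 312: 0.50, 478: 0.80}
method_b = {101: 0.85, 245: 0.50, 312: 0.45, 478: 0.70, 590: 0.20}
print(concordance(method_a, method_b))
```

    Restricting the comparison to sites covered by both methods matters here, because the four technologies differ in genome-wide CpG coverage.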

  2. Molecular characterization of the body site-specific human epidermal cytokeratin 9: cDNA cloning, amino acid sequence, and tissue specificity of gene expression.

    PubMed

    Langbein, L; Heid, H W; Moll, I; Franke, W W

    1993-12-01

    Differentiation of human plantar and palmar epidermis is characterized by the suprabasal synthesis of a major special intermediate-sized filament (IF) protein, the type I (acidic) cytokeratin 9 (CK 9). Using partial amino acid (aa) sequence information obtained by direct Edman sequencing of peptides resulting from proteolytic digestion of purified CK 9, we synthesized several redundant primers by 'back-translation'. Amplification by polymerase chain reaction (PCR) of cDNAs obtained by reverse transcription of mRNAs from human foot sole epidermis, including 5'-primer extension, resulted in multiple overlapping cDNA clones, from which the complete cDNA (2353 bp) could be constructed. This cDNA encoded the CK 9 polypeptide with a calculated molecular weight of 61,987 and an isoelectric point at about pH 5.0. The aa sequence deduced from cDNA was verified in several parts by comparison with the peptide sequences and showed the typical structure of type I CKs, with a head (153 aa), and alpha-helical coiled-coil-forming rod (306 aa), and a tail (163 aa) domain. The protein displayed the highest homology to human CK 10, not only in the highly conserved rod domain but also in large parts of the head and the tail domains. On the other hand, the aa sequence revealed some remarkable differences from CK 10 and other CKs, even in the most conserved segments of the rod domain. The nuclease digestion pattern seen on Southern blot analysis of human genomic DNA indicated the existence of a unique CK 9 gene. Using CK 9-specific riboprobes for hybridization on Northern blots of RNAs from various epithelia, a mRNA of about 2.4 kb in length could be identified only in foot sole epidermis, and a weaker cross-hybridization signal was seen in RNA from bovine heel pad epidermis at about 2.0 kb. A large number of tissues and cell cultures were examined by PCR of mRNA-derived cDNAs, using CK 9-specific primers. But even with this very sensitive signal amplification, only palmar

  3. Fad7 gene identification and fatty acids phenotypic variation in an olive collection by EcoTILLING and sequencing approaches.

    PubMed

    Sabetta, Wilma; Blanco, Antonio; Zelasco, Samanta; Lombardo, Luca; Perri, Enzo; Mangini, Giacomo; Montemurro, Cinzia

    2013-08-01

    The ω-3 fatty acid desaturases (FADs) are enzymes responsible for catalyzing the conversion of linoleic acid to α-linolenic acid localized in the plastid or in the endoplasmic reticulum. In this research we report the genotypic and phenotypic variation of Italian Olea europaea L. germplasm for the fatty acid composition. The phenotypic oil characterization was followed by the molecular analysis of the plastidial-type ω-3 FAD gene (fad7) (EC 1.14.19), whose full-length sequence has been identified here in cultivar Leccino. The gene consisted of 2635 bp with 8 exons and 5'- and 3'-UTRs of 336 and 282 bp respectively, and showed a high level of heterozygosity (1/110 bp). The natural allelic variation was investigated both by a LiCOR EcoTILLING assay and by direct sequencing of the PCR product. Only three haplotypes were identified among the 96 analysed cultivars, highlighting the strong degree of conservation of this gene. PMID:23685785

  4. Sequence-independent and reversible photocontrol of transcription/expression systems using a photosensitive nucleic acid binder

    PubMed Central

    Estévez-Torres, André; Crozatier, Cécile; Diguet, Antoine; Hara, Tomoaki; Saito, Hirohide; Yoshikawa, Kenichi; Baigl, Damien

    2009-01-01

    To understand non-trivial biological functions, it is crucial to develop minimal synthetic models that capture their basic features. Here, we demonstrate a sequence-independent, reversible control of transcription and gene expression using a photosensitive nucleic acid binder (pNAB). By introducing a pNAB whose affinity for nucleic acids is tuned by light, in vitro RNA production, EGFP translation, and GFP expression (a set of reactions including both transcription and translation) were successfully inhibited in the dark and recovered after a short illumination at 365 nm. Our results indicate that the accessibility of the protein machinery to one or several nucleic acid binding sites can be efficiently regulated by changing the conformational/condensation state of the nucleic acid (DNA conformation or mRNA aggregation), thus regulating gene activity in an efficient, reversible, and sequence-independent manner. The possibility offered by our approach to use light to trigger various gene expression systems in a system-independent way opens interesting perspectives to study gene expression dynamics as well as to develop photocontrolled biotechnological procedures. PMID:19617550

  5. Comparison of hepatocellular carcinoma miRNA expression profiling as evaluated by next generation sequencing and microarray.

    PubMed

    Murakami, Yoshiki; Tanahashi, Toshihito; Okada, Rina; Toyoda, Hidenori; Kumada, Takashi; Enomoto, Masaru; Tamori, Akihiro; Kawada, Norifumi; Taguchi, Y-h; Azuma, Takeshi

    2014-01-01

    MicroRNA (miRNA) expression profiling has proven useful in diagnosing and understanding the development and progression of several diseases. Microarray is the standard method for analyzing miRNA expression profiles; however, it has several disadvantages, including its limited detection of miRNAs. In recent years, advances in genome sequencing have led to the development of next-generation sequencing (NGS) technologies, which significantly advance genome sequencing speed and discovery. In this study, we compared the expression profiles obtained by NGS with the profiles created using microarray to assess whether NGS could produce a more accurate and complete miRNA profile. Total RNA from 14 hepatocellular carcinoma tumors (HCC) and 6 matched non-tumor control tissues was sequenced with Illumina MiSeq 50-bp single-end reads. miRNA expression profiles were estimated using miRDeep2 software. As a comparison, miRNA expression profiles for 11 out of 14 HCCs were also established by microarray (Agilent human microRNA microarray). Sequencing yielded on average more than 2.2 million reads per sample, of which approximately 57% mapped to the human genome. The average correlation for miRNA expression between microarray and NGS and subtraction were 0.613 and 0.587, respectively, while miRNA expression between technical replicates was 0.976. The diagnostic accuracy of HCC, p-value, and AUC were 90.0%, 7.22×10⁻⁴, and 0.92, respectively. In summary, NGS created an miRNA expression profile that was reproducible and comparable to that produced by microarray. Moreover, NGS discovered novel miRNAs that were otherwise undetectable by microarray. We believe that miRNA expression profiling by NGS can be a useful diagnostic tool applicable to multiple fields of medicine. PMID:25215888

  6. [International comparison APMP.QM-P23: determination of benzoic acid in orange juice].

    PubMed

    Guo, Zhen; Fu, Hui; Li, Xiuqin; Zhang, Qinghe

    2013-12-01

    A method was developed for the separation and determination of benzoic acid in orange juice by high performance liquid chromatography-ultraviolet detection (HPLC-UV) and liquid chromatography-isotope dilution mass spectrometry (LC-IDMS). The National Institute of Metrology (NIM) of China participated in the international comparison organized by the Asia Pacific Metrology Programme (APMP) and obtained good results using this method. The effects of several important factors, such as the chromatographic conditions and sample preparation conditions, were investigated to establish optimum conditions. A method of uncertainty evaluation was also developed which can be used in similar measurements. The limit of detection (LOD, S/N > 3) of the HPLC-UV method was 0.75 mg/kg, and the recovery of benzoic acid in orange juice at the spiked level of 100 mg/kg was 99.4%. The LOD of the LC-IDMS method was 0.05 mg/kg, and the recovery of benzoic acid at the same spiked level was 99.6%. The final determination result of benzoic acid in the orange juice sample by both methods was (102.0 ± 2.1) mg/kg (coverage factor k = 2). The two methods are both simple, accurate, reliable and reproducible. The LC-IDMS method is more suitable for the determination of benzoic acid at low concentrations due to its high sensitivity. PMID:24669711
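
    As a worked reading of the reported result, assuming the usual convention that the quoted ± value is an expanded uncertainty U obtained by multiplying a combined standard uncertainty u_c by the coverage factor of 2 (the symbols U, u_c, k and x are notation introduced here, not taken from the abstract):

```latex
U = k\,u_c
\quad\Longrightarrow\quad
u_c = \frac{U}{k} = \frac{2.1\ \text{mg/kg}}{2} = 1.05\ \text{mg/kg},
\qquad
\frac{u_c}{x} = \frac{1.05}{102.0} \approx 1.0\,\%,
\qquad
x \pm U = [\,99.9,\ 104.1\,]\ \text{mg/kg}
```

    Under that assumption the spiked level of 100 mg/kg falls inside the expanded interval.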

  7. Next-Generation Sequencing of the Bacterial 16S rRNA Gene for Forensic Soil Comparison: A Feasibility Study.

    PubMed

    Jesmok, Ellen M; Hopkins, James M; Foran, David R

    2016-05-01

    Soil has the potential to be valuable forensic evidence linking a person or item to a crime scene; however, there is no established soil individualization technique. In this study, the utility of soil bacterial profiling via next-generation sequencing of the 16S rRNA gene was examined for associating soils with their place of origin. Soil samples were collected from ten diverse and nine similar habitats over time, and within three habitats at various horizontal and vertical distances. Bacterial profiles were analyzed using four methods: abundance charts and nonmetric multidimensional scaling provided simplification and visualization of the massive datasets, potentially aiding in expert testimony, while analysis of similarities and k-nearest neighbor offered objective statistical comparisons. The vast majority of soil bacterial profiles (95.4%) were classified to their location of origin, highlighting the potential of bacterial profiling via next-generation sequencing for the forensic analysis of soil samples. PMID:27122396

  8. A direct comparison of MELCOR 1.8.3 and MAAP4 results for several PWR & BWR accident sequences

    SciTech Connect

    Leonard, M.T.; Ashbaugh, S.G.; Cole, R.K.; Bergeron, K.D.; Nagashima, K.

    1996-08-01

    This paper presents a comparison of calculations of severe accident progression for several postulated accident sequences for representative Pressurized Water Reactors (PWR) and Boiling Water Reactors (BWR) nuclear power plants performed with the MELCOR 1.8.3 and the MAAP4 computer codes. The PWR system examined in this study is a 1100 MWe system similar in design to a Westinghouse 3-loop plant with a large dry containment; the BWR is a 1100 MWe system similar in design to General Electric BWR/4 with a Mark I containment. A total of nine accident sequences were studied with both codes. Results of these calculations are compared to identify major differences in the timing of key events in the calculated accident progression or other important aspects of severe accident behavior, and to identify specific sources of the observed differences.

  9. Genomic-scale comparison of sequence- and structure-based methods of function prediction: Does structure provide additional insight?

    PubMed Central

    Fetrow, Jacquelyn S.; Siew, Naomi; Di Gennaro, Jeannine A.; Martinez-Yamout, Maria; Dyson, H. Jane; Skolnick, Jeffrey

    2001-01-01

    A function annotation method using the sequence-to-structure-to-function paradigm is applied to the identification of all disulfide oxidoreductases in the Saccharomyces cerevisiae genome. The method identifies 27 sequences as potential disulfide oxidoreductases. All previously known thioredoxins, glutaredoxins, and disulfide isomerases are correctly identified. Three of the 27 predictions are probable false-positives. Three novel predictions, which subsequently have been experimentally validated, are presented. Two additional novel predictions suggest a disulfide oxidoreductase regulatory mechanism for two subunits (OST3 and OST6) of the yeast oligosaccharyltransferase complex. Based on homology, this prediction can be extended to a potential tumor suppressor gene, N33, in humans, whose biochemical function was not previously known. Attempts to obtain a folded, active N33 construct to test the prediction were unsuccessful. The results show that structure prediction coupled with biochemically relevant structural motifs is a powerful method for the function annotation of genome sequences and can provide more detailed, robust predictions than function prediction methods that rely on sequence comparison alone. PMID:11316881

  10. Quantitative phase-flow MR imaging in dogs by using standard sequences: comparison with in vivo flow-meter measurements.

    PubMed

    Pettigrew, R I; Dannels, W; Galloway, J R; Pearson, T; Millikan, W; Henderson, J M; Peterson, J; Bernardino, M E

    1987-02-01

    For evaluation of the feasibility and clinical potential of using the phase data from standard MR imaging sequences to measure blood flow, 11 vessels with diameters of 4 to 7 mm were imaged in seven dogs. The flow in either the superior mesenteric vein or the inferior vena cava was measured first at laparotomy (in ml/min) with electromagnetic flow meters. Immediately thereafter, these vessels were imaged by MR in 25-mm thick sections by using a standard spin echo (SE) 750/30 sequence with a Philips 0.5-T imager. Previous phase-flow calibration of the imager and sequence allowed calculation of the blood flow rates from the phase images that were used to measure the vessels' cross-sectional areas and blood phase values. Comparison of the measurements obtained with each technique showed a significant correlation (r = .977, p < .05) between MR-imaging values and flow-meter measurements when the blood velocity was less than approximately 40 cm/sec, the known upper limit of the flow dynamic range for the MR hardware and sequence used. There was no correlation for blood velocities greater than 40 cm/sec. However, the range of blood flow velocities in dogs and man extends to more than 100 cm/sec. Thus, these results suggest that this technique might yield valuable adjunctive flow data in routine clinical imaging provided that improvements in hardware and software permit a larger dynamic range. PMID:2948376
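
    The conversion from a velocity and a cross-sectional area to a flow rate in ml/min can be sketched as below. This is a minimal illustration, not the authors' calibration: it assumes a circular vessel and a mean velocity that has already been derived from the phase image, and the function names and the example values (20 cm/sec, 5 mm) are hypothetical.

```python
import math

# Upper limit of the flow dynamic range reported in the abstract (cm/sec).
MAX_VELOCITY_CM_S = 40.0


def flow_ml_per_min(mean_velocity_cm_s, diameter_mm):
    """Volumetric flow = mean velocity x cross-sectional area (circular vessel assumed)."""
    radius_cm = diameter_mm / 10.0 / 2.0
    area_cm2 = math.pi * radius_cm ** 2
    # cm/sec * cm^2 = cm^3/sec = ml/sec; multiply by 60 for ml/min.
    return mean_velocity_cm_s * area_cm2 * 60.0


def within_dynamic_range(velocity_cm_s):
    """True if the velocity is below the reported 40 cm/sec limit."""
    return abs(velocity_cm_s) < MAX_VELOCITY_CM_S


# Hypothetical example: 5 mm vessel, 20 cm/sec mean velocity.
print(round(flow_ml_per_min(20.0, 5.0), 1))  # 235.6
print(within_dynamic_range(20.0))            # True
```

    Velocities at or above the 40 cm/sec limit are flagged rather than converted, since the abstract reports no correlation in that range.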

  11. Enzymatic generation of peptides flanked by basic amino acids to obtain MS/MS spectra with 2× sequence coverage

    PubMed Central

    Ebhardt, H Alexander; Nan, Jie; Chaulk, Steven G; Fahlman, Richard P; Aebersold, Ruedi

    2014-01-01

    RATIONALE Tandem mass (MS/MS) spectra generated by collision-induced dissociation (CID) typically lack redundant peptide sequence information in the form of e.g. b- and y-ion series due to frequent use of sequence-specific endopeptidases cleaving C- or N-terminal to Arg or Lys residues. METHODS Here we introduce arginyl-tRNA protein transferase (ATE, EC 2.3.2.8) for proteomics. ATE recognizes acidic amino acids or oxidized Cys at the N-terminus of a substrate peptide and conjugates an arginine from an aminoacylated tRNAArg onto the N-terminus of the substrate peptide. This enzymatic reaction is carried out under physiological conditions and, in combination with Lys-C/Asp-N double digest, results in arginylated peptides with basic amino acids on both termini. RESULTS We demonstrate that in vitro arginylation of peptides using yeast arginyl tRNA protein transferase 1 (yATE1) is a robust enzymatic reaction, specific to only modifying N-terminal acidic amino acids. Precursors originating from arginylated peptides generally have an increased protonation state compared with their non-arginylated forms. Furthermore, the product ion spectra of arginylated peptides show near complete 2× fragment ladders within the same MS/MS spectrum using commonly available electrospray ionization peptide fragmentation modes. Unexpectedly, arginylated peptides generate complete y- and c-ion series using electron transfer dissociation (ETD) despite having an internal proline residue. CONCLUSIONS We introduce a rapid enzymatic method to generate peptides flanked on either terminus by basic amino acids, resulting in a rich, redundant MS/MS fragment pattern. © 2014 The Authors. Rapid Communications in Mass Spectrometry published by John Wiley & Sons Ltd. PMID:25380496

  12. Site-directed gene mutation at mixed sequence targets by psoralen-conjugated pseudo-complementary peptide nucleic acids.

    PubMed

    Kim, Ki-Hyun; Nielsen, Peter E; Glazer, Peter M

    2007-01-01

    Sequence-specific DNA-binding molecules such as triple helix-forming oligonucleotides (TFOs) provide a means for inducing site-specific mutagenesis and recombination at chromosomal sites in mammalian cells. However, the utility of TFOs is limited by the requirement for homopurine stretches in the target duplex DNA. Here, we report the use of pseudo-complementary peptide nucleic acids (pcPNAs) for intracellular gene targeting at mixed sequence sites. Due to steric hindrance, pcPNAs are unable to form pcPNA-pcPNA duplexes but can bind to complementary DNA sequences by Watson-Crick pairing via double duplex-invasion complex formation. We show that psoralen-conjugated pcPNAs can deliver site-specific photoadducts and mediate targeted gene modification within both episomal and chromosomal DNA in mammalian cells without detectable off-target effects. Most of the induced psoralen-pcPNA mutations were single-base substitutions and deletions at the predicted pcPNA-binding sites. The pcPNA-directed mutagenesis was found to be dependent on PNA concentration and UVA dose and required matched pairs of pcPNAs. Neither of the individual pcPNAs alone had any effect nor did complementary PNA pairs of the same sequence. These results identify pcPNAs as new tools for site-specific gene modification in mammalian cells without purine sequence restriction, thereby providing a general strategy for designing gene targeting molecules. PMID:17977869

  13. Zucchini yellow mosaic virus: biological properties, detection procedures and comparison of coat protein gene sequences.

    PubMed

    Coutts, B A; Kehoe, M A; Webster, C G; Wylie, S J; Jones, R A C

    2011-12-01

    Between 2006 and 2010, 5324 samples from at least 34 weed, two cultivated legume and 11 native species were collected from three cucurbit-growing areas in tropical or subtropical Western Australia. Two new alternative hosts of zucchini yellow mosaic virus (ZYMV) were identified, the Australian native cucurbit Cucumis maderaspatanus, and the naturalised legume species Rhyncosia minima. Low-level (0.7%) seed transmission of ZYMV was found in seedlings grown from seed collected from zucchini (Cucurbita pepo) fruit infected with isolate Cvn-1. Seed transmission was absent in >9500 pumpkin (C. maxima and C. moschata) seedlings from fruit infected with isolate Knx-1. Leaf samples from symptomatic cucurbit plants collected from fields in five cucurbit-growing areas in four Australian states were tested for the presence of ZYMV. When 42 complete coat protein (CP) nucleotide (nt) sequences from the new ZYMV isolates obtained were compared to those of 101 complete CP nt sequences from five other continents, phylogenetic analysis of the 143 ZYMV sequences revealed three distinct groups (A, B and C), with four subgroups in A (I-IV) and two in B (I-II). The new Australian sequences grouped according to collection location, fitting within A-I, A-II and B-II. The 16 new sequences from one isolated location in tropical northern Western Australia all grouped into subgroup B-II, which contained no other isolates. In contrast, the three sequences from the Northern Territory fitted into A-II with 94.6-99.0% nt identities with isolates from the United States, Iran, China and Japan. The 23 new sequences from the central west coast and two east coast locations all fitted into A-I, with 95.9-98.9% nt identities to sequences from Europe and Japan. These findings suggest that (i) there have been at least three separate ZYMV introductions into Australia and (ii) there are few changes to local isolate CP sequences following their establishment in remote growing areas. 
Isolates from A-I and B

  14. Comparison of Custom Capture for Targeted Next-Generation DNA Sequencing

    PubMed Central

    Samorodnitsky, Eric; Datta, Jharna; Jewell, Benjamin M.; Hagopian, Raffi; Miya, Jharna; Wing, Michele R.; Damodaran, Senthilkumar; Lippus, Juliana M.; Reeser, Julie W.; Bhatt, Darshna; Timmers, Cynthia D.; Roychowdhury, Sameek

    2016-01-01

    Targeted, capture-based DNA sequencing is a cost-effective method to focus sequencing on a coding region or other customized region of the genome. There are multiple targeted sequencing methods available, but none has been systematically investigated and compared. We evaluated four commercially available custom-targeted DNA technologies for next-generation sequencing with respect to on-target sequencing, uniformity, and ability to detect single-nucleotide variations (SNVs) and copy number variations. The technologies that used sonication for DNA fragmentation displayed impressive uniformity of capture, whereas the others had shorter preparation times, but sacrificed uniformity. One of those technologies, which uses transposase for DNA fragmentation, has a drawback requiring sample pooling, and the last one, which uses restriction enzymes, has a limitation depending on restriction enzyme digest sites. Although all technologies displayed some level of concordance for calling SNVs, the technologies that require restriction enzymes or transposase missed several SNVs largely because of the lack of coverage. All technologies performed well for copy number variation calling when compared to single-nucleotide polymorphism arrays. These results enable laboratories to compare these methods to make informed decisions for their intended applications. PMID:25528188

  15. Comparison of custom capture for targeted next-generation DNA sequencing.

    PubMed

    Samorodnitsky, Eric; Datta, Jharna; Jewell, Benjamin M; Hagopian, Raffi; Miya, Jharna; Wing, Michele R; Damodaran, Senthilkumar; Lippus, Juliana M; Reeser, Julie W; Bhatt, Darshna; Timmers, Cynthia D; Roychowdhury, Sameek

    2015-01-01

    Targeted, capture-based DNA sequencing is a cost-effective method to focus sequencing on a coding region or other customized region of the genome. There are multiple targeted sequencing methods available, but none has been systematically investigated and compared. We evaluated four commercially available custom-targeted DNA technologies for next-generation sequencing with respect to on-target sequencing, uniformity, and ability to detect single-nucleotide variations (SNVs) and copy number variations. The technologies that used sonication for DNA fragmentation displayed impressive uniformity of capture, whereas the others had shorter preparation times, but sacrificed uniformity. One of those technologies, which uses transposase for DNA fragmentation, has a drawback requiring sample pooling, and the last one, which uses restriction enzymes, has a limitation depending on restriction enzyme digest sites. Although all technologies displayed some level of concordance for calling SNVs, the technologies that require restriction enzymes or transposase missed several SNVs largely because of the lack of coverage. All technologies performed well for copy number variation calling when compared to single-nucleotide polymorphism arrays. These results enable laboratories to compare these methods to make informed decisions for their intended applications. PMID:25528188

  16. Complete amino acid sequence of human plasma Zn-α₂-glycoprotein and its homology to histocompatibility antigens

    SciTech Connect

    Araki, T.; Gejyo, F.; Takagaki, K.; Haupt, H.; Schwick, H.G.; Buergi, W.; Marti, T.; Schaller, J.; Rickli, E.; Brossmer, R.

    1988-02-01

    In the present study the complete amino acid sequence of human plasma Zn-α₂-glycoprotein was determined. This protein, whose biological function is unknown, consists of a single polypeptide chain of 276 amino acid residues including 8 tryptophan residues and has a pyroglutamyl residue at the amino terminus. The location of the two disulfide bonds in the polypeptide chain was also established. The three glycans, whose structure was elucidated with the aid of 500 MHz ¹H NMR spectroscopy, were sialylated N-biantennas. The molecular weight calculated from the polypeptide and carbohydrate structure is 38,478, which is close to the reported value of approximately 41,000 based on physicochemical measurements. The predicted secondary structure appeared to comprise 23% α-helix, 27% β-sheet, and 22% β-turns. The three N-glycans were found to be located in β-turn regions. An unexpected finding was made by computer analysis of the sequence data; this revealed that Zn-α₂-glycoprotein is closely related to antigens of the major histocompatibility complex in amino acid sequence and in domain structure. There was an unusually high degree of sequence homology with the α chains of class I histocompatibility antigens. Moreover, this plasma protein was shown to be a member of the immunoglobulin gene superfamily. Zn-α₂-glycoprotein appears to be a truncated secretory major histocompatibility complex-related molecule, and it may have a role in the expression of the immune response.

  17. A comparison of virus genome sequences with their host silkworm, Bombyx mori.

    PubMed

    Tang, Xu-Dong; Yue, Ya-Jie; Wang, Wei; Li, Nan; Shen, Zhong-Yuan

    2016-01-15

    With the recent availability of the genomes of many viruses and the silkworm, Bombyx mori, as well as a variety of Basic Local Alignment Search Tool (BLAST) programs, a new opportunity to gain insight into the interaction of viruses with the silkworm is possible. This study aims to determine the possible existence of sequence identities between the genomes of viruses and the silkworm and attempts to explain this phenomenon. BLAST searches of the genomes of viruses against the silkworm genome were performed using the resources of the National Center for Biotechnology Information. All studied viruses contained variable numbers of short regions with sequence identity to the genome of the silkworm. The short regions of sequence identity in the genome of the silkworm may be derived from the genomes of viruses in the long history of silkworm-virus interaction. This study is the first to compare these genomes, and may contribute to research on the interaction between viruses and the silkworm. PMID:26432002

  18. Comparison of pulse sequences for R1-based electron paramagnetic resonance oxygen imaging.

    PubMed

    Epel, Boris; Halpern, Howard J

    2015-05-01

    Electron paramagnetic resonance (EPR) spin-lattice relaxation (SLR) oxygen imaging has proven to be an indispensable tool for assessing oxygen partial pressure in live animals. EPR oxygen images show remarkable oxygen accuracy when combined with high precision and spatial resolution. Developing more effective means for obtaining SLR rates is of great practical, biological and medical importance. In this work we compared different pulse EPR imaging protocols and pulse sequences to establish advantages and areas of applicability for each method. Tests were performed using phantoms containing spin probes with oxygen concentrations relevant to in vivo oxymetry. We have found that for small animal size objects the inversion recovery sequence combined with the filtered backprojection reconstruction method delivers the best accuracy and precision. For large animals, in which large radio frequency energy deposition might be critical, free induction decay and three pulse stimulated echo sequences might find better practical usage. PMID:25828242

  19. Comparison of pulse sequences for R1-based electron paramagnetic resonance oxygen imaging

    NASA Astrophysics Data System (ADS)

    Epel, Boris; Halpern, Howard J.

    2015-05-01

    Electron paramagnetic resonance (EPR) spin-lattice relaxation (SLR) oxygen imaging has proven to be an indispensable tool for assessing oxygen partial pressure in live animals. EPR oxygen images show remarkable oxygen accuracy when combined with high precision and spatial resolution. Developing more effective means for obtaining SLR rates is of great practical, biological and medical importance. In this work we compared different pulse EPR imaging protocols and pulse sequences to establish advantages and areas of applicability for each method. Tests were performed using phantoms containing spin probes with oxygen concentrations relevant to in vivo oxymetry. We have found that for small animal size objects the inversion recovery sequence combined with the filtered backprojection reconstruction method delivers the best accuracy and precision. For large animals, in which large radio frequency energy deposition might be critical, free induction decay and three pulse stimulated echo sequences might find better practical usage.

  20. Genotypic comparison of five isolates of Rickettsia prowazekii by multilocus sequence typing.

    PubMed

    Ge, Hong; Tong, Min; Jiang, Ju; Dasch, Gregory A; Richards, Allen L

    2007-06-01

    Genetic traits of five Rickettsia prowazekii isolates, including the first from Africa and North America, and representatives from human and flying squirrels were compared using multilocus sequence typing. Four rickettsial genes encoding 17 kDa genus-common antigen (17 kDa gene), citrate synthase (gltA), OmpB immunodominant antigen (ompB) and 120 kDa cytoplasmic antigen (sca4) were examined. Sequence identities of 17 kDa gene and gltA were 100% among the isolates. Limited sequence diversity of ompB (0.02-0.11%) and sca4 (0.03-0.20%) was enough to distinguish the isolates, and evaluation of the combined four genes provided a method to easily differentiate R. prowazekii from other rickettsiae. PMID:17419766
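
    Diversity figures of this kind are typically obtained by counting differing positions between aligned gene sequences. A minimal sketch follows, assuming the sequences are already aligned to equal length and that gap columns are skipped (one common convention, not necessarily the one used in the study); the function name and the toy sequences are hypothetical and are not R. prowazekii data.

```python
def percent_divergence(seq_a, seq_b):
    """Percentage of differing positions between two pre-aligned sequences.

    Columns where either sequence has a gap ('-') are skipped (an assumption).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    mismatches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a != b:
            mismatches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * mismatches / compared


# Toy example: one difference in 10 aligned positions.
print(percent_divergence("ACGTACGTAC", "ACGTACGTAA"))  # 10.0
```

    For scale, a divergence of 0.02% corresponds to one differing position per 5,000 compared.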

  1. Comparison of Pulse Sequences for R1–based Electron Paramagnetic Resonance Oxygen Imaging

    PubMed Central

    Epel, Boris; Halpern, Howard J.

    2015-01-01

    Electron paramagnetic resonance (EPR) spin-lattice relaxation (SLR) oxygen imaging has proven to be an indispensable tool for assessing oxygen partial pressure in live animals. EPR oxygen images show remarkable oxygen accuracy when combined with high precision and spatial resolution. Developing more effective means for obtaining SLR rates is of great practical, biological and medical importance. In this work we compared different pulse EPR imaging protocols and pulse sequences to establish advantages and areas of applicability for each method. Tests were performed using phantoms containing spin probes with oxygen concentrations relevant to in vivo oxymetry. We have found that for small animal size objects the inversion recovery sequence combined with the filtered backprojection reconstruction method delivers the best accuracy and precision. For large animals, in which large radio frequency energy deposition might be critical, free induction decay and three pulse stimulated echo sequences might find better practical usage. PMID:25828242

  2. A Phylogenetic Analysis of the Brassicales Clade Based on an Alignment-Free Sequence Comparison Method

    PubMed Central

    Hatje, Klas; Kollmar, Martin

    2012-01-01

    Phylogenetic analyses reveal the evolutionary derivation of species. A phylogenetic tree can be inferred from multiple sequence alignments of proteins or genes. The alignment of whole genome sequences of higher eukaryotes is a computationally intensive and ambitious task, as is the computation of phylogenetic trees based on these alignments. To overcome these limitations, we here used an alignment-free method to compare genomes of the Brassicales clade. For each nucleotide sequence a Chaos Game Representation (CGR) can be computed, which represents each nucleotide of the sequence as a point in a square defined by the four nucleotides as vertices. Each CGR is therefore a unique fingerprint of the underlying sequence. If the CGRs are divided by grid lines, each grid square denotes the occurrence of oligonucleotides of a specific length in the sequence (Frequency Chaos Game Representation, FCGR). Here, we used distance measures between FCGRs to infer phylogenetic trees of Brassicales species. Three types of data were analyzed because of their different characteristics: (A) Whole genome assemblies as far as available for species belonging to the Malvidae taxon. (B) EST data of species of the Brassicales clade. (C) Mitochondrial genomes of the Rosids branch, a supergroup of the Malvidae. The trees reconstructed based on the Euclidean distance method are in general agreement with single gene trees. The Fitch–Margoliash and Neighbor joining algorithms resulted in similar to identical trees. Here, for the first time we have applied the bootstrap re-sampling concept to trees based on FCGRs to determine the support of the branchings. FCGRs have the advantage that they are fast to calculate, and can be used as additional information to alignment based data and morphological characteristics to improve the phylogenetic classification of species in ambiguous cases. PMID:22952468
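
    The CGR and FCGR constructions described above can be sketched as follows. This is an illustrative sketch, not the authors' implementation: the corner assignment (A, C, G, T), the default word length k = 3, the skipping of ambiguous bases and all function names are assumptions made here. The FCGR is stored as a flat vector of normalised k-mer frequencies, which carries the same information as the 2^k by 2^k grid.

```python
import math
from collections import Counter
from itertools import product

# Assumed corner layout of the unit square; other layouts are in use.
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}


def cgr_points(seq):
    """CGR: start at the centre and move halfway towards each base's corner."""
    x, y = 0.5, 0.5
    points = []
    for base in seq.upper():
        if base not in CORNERS:
            continue  # ambiguous bases (e.g. N) are skipped in this sketch
        cx, cy = CORNERS[base]
        x, y = (x + cx) / 2.0, (y + cy) / 2.0
        points.append((x, y))
    return points


def fcgr(seq, k=3):
    """FCGR as normalised frequencies of all 4**k oligonucleotides of length k."""
    seq = seq.upper()
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts[m] for m in kmers)  # words with ambiguous bases are ignored
    if total == 0:
        raise ValueError("sequence contains no unambiguous k-mers")
    return [counts[m] / total for m in kmers]


def euclidean(a, b):
    """Euclidean distance between two FCGR vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


print(cgr_points("AG"))                                # [(0.25, 0.25), (0.625, 0.625)]
print(euclidean(fcgr("AAAA", k=1), fcgr("AAAA", k=1)))  # 0.0
```

    Pairwise distances computed this way would form the matrix passed to a tree-building method such as Neighbor joining; that step is omitted here.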

  3. Systematic comparison of three genomic enrichment methods for massively parallel DNA sequencing.

    PubMed

    Teer, Jamie K; Bonnycastle, Lori L; Chines, Peter S; Hansen, Nancy F; Aoyama, Natsuyo; Swift, Amy J; Abaan, Hatice Ozel; Albert, Thomas J; Margulies, Elliott H; Green, Eric D; Collins, Francis S; Mullikin, James C; Biesecker, Leslie G

    2010-10-01

    Massively parallel DNA sequencing technologies have greatly increased our ability to generate large amounts of sequencing data at a rapid pace. Several methods have been developed to enrich for genomic regions of interest for targeted sequencing. We have compared three of these methods: Molecular Inversion Probes (MIP), Solution Hybrid Selection (SHS), and Microarray-based Genomic Selection (MGS). Using HapMap DNA samples, we compared each of these methods with respect to their ability to capture an identical set of exons and evolutionarily conserved regions associated with 528 genes (2.61 Mb). For sequence analysis, we developed and used a novel Bayesian genotype-assigning algorithm, Most Probable Genotype (MPG). All three capture methods were effective, but sensitivities (percentage of targeted bases associated with high-quality genotypes) varied for an equivalent amount of pass-filtered sequence: for example, 70% (MIP), 84% (SHS), and 91% (MGS) for 400 Mb. In contrast, all methods yielded similar accuracies of >99.84% when compared to Infinium 1M SNP BeadChip-derived genotypes and >99.998% when compared to 30-fold coverage whole-genome shotgun sequencing data. We also observed a low false-positive rate with all three methods; of the heterozygous positions identified by each of the capture methods, >99.57% agreed with 1M SNP BeadChip, and >98.840% agreed with the whole-genome shotgun data. In addition, we successfully piloted the genomic enrichment of a set of 12 pooled samples via the MGS method using molecular bar codes. We find that these three genomic enrichment methods are highly accurate and practical, with sensitivities comparable to that of 30-fold coverage whole-genome shotgun data. PMID:20810667

  4. ENTPRISE: An Algorithm for Predicting Human Disease-Associated Amino Acid Substitutions from Sequence Entropy and Predicted Protein Structures

    PubMed Central

    Zhou, Hongyi; Gao, Mu; Skolnick, Jeffrey

    2016-01-01

    The advance of next-generation sequencing technologies has made exome sequencing rapid and relatively inexpensive. A major application of exome sequencing is the identification of genetic variations likely to cause Mendelian diseases. This requires processing large amounts of sequence information and therefore computational approaches that can accurately and efficiently identify the subset of disease-associated variations are needed. The accuracy and high false positive rates of existing computational tools leave much room for improvement. Here, we develop a boosted tree regression machine-learning approach to predict human disease-associated amino acid variations by utilizing a comprehensive combination of protein sequence and structure features. On comparing our method, ENTPRISE, to the state-of-the-art methods SIFT, PolyPhen-2, MUTATIONASSESSOR, MUTATIONTASTER, and FATHMM, ENTPRISE exhibits significant improvement. In particular, on a testing dataset consisting of only proteins with balanced disease-associated and neutral variations, defined as having the ratio of neutral/disease-associated variations between 0.3 and 3, the Matthews Correlation Coefficient by ENTPRISE is 0.493 as compared to 0.432 by PPH2-HumVar, 0.406 by SIFT, 0.403 by MUTATIONASSESSOR, 0.402 by PPH2-HumDiv, 0.305 by MUTATIONTASTER, and 0.181 by FATHMM. ENTPRISE is then applied to nucleic acid binding proteins in the human proteome. Disease-associated predictions are shown to be highly correlated with the number of protein-protein interactions. Both these predictions and the ENTPRISE server are freely available for academic users as a web service at http://cssb.biology.gatech.edu/entprise/. PMID:26982818
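
    For reference, the Matthews correlation coefficient used for the comparison above is computed from the four cells of a binary confusion matrix. The sketch below uses the standard definition; the function name, the argument order and the convention of returning 0.0 when the denominator is zero are choices made here, and the example counts are made up rather than taken from the study.

```python
import math


def matthews_corrcoef(tp, tn, fp, fn):
    """MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))."""
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0  # common convention when a row or column of the matrix is empty
    return (tp * tn - fp * fn) / denominator


# Made-up counts: perfect, inverted and chance-level predictions.
print(matthews_corrcoef(10, 10, 0, 0))  # 1.0
print(matthews_corrcoef(0, 0, 10, 10))  # -1.0
print(matthews_corrcoef(5, 5, 5, 5))    # 0.0
```

    The coefficient ranges from -1 to 1, with 0 corresponding to chance-level prediction.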

  5. The sequence diversity and expression among genes of the folic acid biosynthesis pathway in industrial Saccharomyces strains.

    PubMed

    Goncerzewicz, Anna; Misiewicz, Anna

    2015-01-01

    Folic acid is an important vitamin in human nutrition and its deficiency in pregnant women's diets results in neural tube defects and other neurological damage to the fetus. Additionally, in adults, DNA synthesis, cell division and intestinal absorption are inhibited. Since this discovery, governments and health organizations worldwide have made recommendations concerning folic acid supplementation of food for women planning to become pregnant. In many countries this has led to the introduction of fortification, where synthetic folic acid is added to flour. It is known that Saccharomyces strains (brewing and bakers' yeast) are among the main producers of folic acid and can be used as a natural source of this vitamin. Proper selection of the most efficient strains may enhance the folate content in bread, fermented vegetables, dairy products and beer by 100% and may be used in the food industry. The objective of this study was to select the optimal producing yeast strain by determining the differences in nucleotide sequences in the FOL2, FOL3 and DFR1 genes of the folic acid biosynthesis pathway. The Multitemperature Single Strand Conformation Polymorphism (MSSCP) method and further nucleotide sequencing for selected strains were applied to indicate SNPs in selected gene fragments. The RT qPCR technique was also applied to examine relative expression of the FOL3 gene. Furthermore, this is the first time that industrial yeast strains have been analysed regarding genes of the folic acid biosynthesis pathway. It was observed that a correlation exists between the amount of folic acid produced by industrial yeast strains and changes in the nucleotide sequence of the relevant genes. The most significant changes occur in the DFR1 gene, mostly in the first part, which causes major protein structure modifications in KKP 232, KKP 222 and KKP 277 strains. Our study shows that the large number of SNPs contributes to impairment of the selected enzymes and S. cerevisiae and S

  6. Fatty acid profile and Unigene-derived simple sequence repeat markers in tung tree (Vernicia fordii)

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Tung tree (Vernicia fordii) provides the sole source of tung oil widely used in industry. Lack of fatty acid composition and molecular markers hinders biochemical, genetic and breeding research. The objectives of this study were to determine fatty acid profiles and develop unigene-derived simple se...

  7. 37 CFR 1.821 - Nucleotide and/or amino acid sequence disclosures in patent applications.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... Director of the Federal Register in accordance with 5 U.S.C. 552(a) and 1 CFR part 51. Copies of WIPO... 37 Patents, Trademarks, and Copyrights 1 2010-07-01 2010-07-01 false Nucleotide and/or amino acid... Biotechnology Invention Disclosures Application Disclosures Containing Nucleotide And/or Amino Acid...

  8. 37 CFR 1.821 - Nucleotide and/or amino acid sequence disclosures in patent applications.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... Director of the Federal Register in accordance with 5 U.S.C. 552(a) and 1 CFR part 51. Copies of WIPO... 37 Patents, Trademarks, and Copyrights 1 2011-07-01 2011-07-01 false Nucleotide and/or amino acid... Biotechnology Invention Disclosures Application Disclosures Containing Nucleotide And/or Amino Acid...

  9. A comparison of nucleotide sequences of measles virus L genes derived from wild-type viruses and SSPE brain tissues.

    PubMed

    Komase, K; Rima, B K; Pardowitz, I; Kunz, C; Billeter, M A; ter Meulen, V; Baczko, K

    1995-04-20

    The nucleotide sequences of the large protein (L) gene derived from two wild-type measles viruses (MV) and two SSPE brain-derived viruses have been determined. All sequences have single large open reading frames encoding 2183 amino acid residues. The deduced L proteins are well conserved and the proposed functional domains which have been identified for rhabdo- and paramyxoviruses are completely conserved in all strains. The degree of variability of L proteins is the lowest of all structural proteins of MV, reflecting its role in virus reproduction and persistence. Biased hypermutation was not observed in the L genes derived from SSPE brain tissue. None of the nucleotide changes can be associated with the attenuated phenotype of the Edmonston vaccine viruses. PMID:7747453

  10. Applicability Comparison of Methods for Acid Generation Assessment of Rock Samples

    NASA Astrophysics Data System (ADS)

    Oh, Chamteut; Ji, Sangwoo; Yim, Giljae; Cheong, Youngwook

    2014-05-01

    Minerals including various forms of sulfur can generate AMD (Acid Mine Drainage) or ARD (Acid Rock Drainage) when exposed to air and/or water, which can have serious effects on the ecosystem and even on humans. To minimize the hazards of acid drainage, it is necessary to assess in advance the acid generation possibility of rocks and estimate the amount of acid generation. Because of its relatively simple and effective experimental procedure, the method of combining the results of ABA (Acid Base Accounting) and NAG (Net Acid Generation) tests has been commonly used in determining acid drainage conditions. The simplicity and effectiveness of this method, however, are derived from massive assumptions of simplified chemical reactions, and this often leads to samples being classified as UC (Uncertain), which then requires additional experimental or field data to reclassify them properly. This paper therefore attempts to find the reasons that cause samples to be classified as UC and to suggest a new series of experiments with which samples can be reclassified appropriately. Study precedents on evaluating potential acid generation and neutralization capacity were reviewed and as a result three individual experiments were selected in the light of applicability and compatibility of minimizing unnecessary influence among other experiments. The proposed experiments include sulfur speciation, ABCC (Acid Buffering Characteristic Curve), and Modified NAG, which are improved versions of the existing experiments of Total S, ANC (Acid Neutralizing Capacity), and NAG, respectively. To assure the applicability of the experiments, 36 samples from 19 sites with diverse geologies, field properties, and weathering conditions were collected. The samples were then subjected to the existing experiments and, as a result, 14 samples which either were classified as UC or could be used as a comparison group were selected. Afterwards, the selected samples were used to conduct the suggested

  11. "De-novo" amino acid sequence elucidation of protein G'e by combined "Top-Down" and "Bottom-Up" mass spectrometry

    NASA Astrophysics Data System (ADS)

    Yefremova, Yelena; Al-Majdoub, Mahmoud; Opuni, Kwabena F. M.; Koy, Cornelia; Cui, Weidong; Yan, Yuetian; Gross, Michael L.; Glocker, Michael O.

    2015-03-01

    Mass spectrometric de-novo sequencing was applied to review the amino acid sequence of a commercially available recombinant protein G' with great scientific and economic importance. Substantial deviations to the published amino acid sequence (Uniprot Q54181) were found by the presence of 46 additional amino acids at the N-terminus, including a so-called "His-tag" as well as an N-terminal partial α-N-gluconoylation and α-N-phosphogluconoylation, respectively. The unexpected amino acid sequence of the commercial protein G' comprised 241 amino acids and resulted in a molecular mass of 25,998.9 ± 0.2 Da for the unmodified protein. Due to the higher mass that is caused by its extended amino acid sequence compared with the original protein G' (185 amino acids), we named this protein "protein G'e." By means of mass spectrometric peptide mapping, the suggested amino acid sequence, as well as the N-terminal partial α-N-gluconoylations, was confirmed with 100% sequence coverage. After the protein G'e sequence was determined, we were able to determine the expression vector pET-28b from Novagen with the Xho I restriction enzyme cleavage site as the best option that was used for cloning and expressing the recombinant protein G'e in E. coli. A dissociation constant (Kd) value of 9.4 nM for protein G'e was determined thermophoretically, showing that the N-terminal flanking sequence extension did not cause significant changes in the binding affinity to immunoglobulins.

  12. "De-novo" amino acid sequence elucidation of protein G'e by combined "top-down" and "bottom-up" mass spectrometry.

    PubMed

    Yefremova, Yelena; Al-Majdoub, Mahmoud; Opuni, Kwabena F M; Koy, Cornelia; Cui, Weidong; Yan, Yuetian; Gross, Michael L; Glocker, Michael O

    2015-03-01

    Mass spectrometric de-novo sequencing was applied to review the amino acid sequence of a commercially available recombinant protein G' with great scientific and economic importance. Substantial deviations to the published amino acid sequence (Uniprot Q54181) were found by the presence of 46 additional amino acids at the N-terminus, including a so-called "His-tag" as well as an N-terminal partial α-N-gluconoylation and α-N-phosphogluconoylation, respectively. The unexpected amino acid sequence of the commercial protein G' comprised 241 amino acids and resulted in a molecular mass of 25,998.9 ± 0.2 Da for the unmodified protein. Due to the higher mass that is caused by its extended amino acid sequence compared with the original protein G' (185 amino acids), we named this protein "protein G'e." By means of mass spectrometric peptide mapping, the suggested amino acid sequence, as well as the N-terminal partial α-N-gluconoylations, was confirmed with 100% sequence coverage. After the protein G'e sequence was determined, we were able to determine the expression vector pET-28b from Novagen with the Xho I restriction enzyme cleavage site as the best option that was used for cloning and expressing the recombinant protein G'e in E. coli. A dissociation constant (Kd) value of 9.4 nM for protein G'e was determined thermophoretically, showing that the N-terminal flanking sequence extension did not cause significant changes in the binding affinity to immunoglobulins. PMID:25560987

  13. Complete genome sequence and comparison of two Shiga toxin-producing Escherichia coli O104 isolates

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Shiga toxin-producing Escherichia coli (STEC) O104 strains have been associated with sporadic cases of illness and have caused outbreaks associated with milk and sprouts. E. coli O104:H21 caused an outbreak associated with milk in the U.S. in 1994. In this study, next generation sequencing techno...

  14. Next-Generation Sequencing of Aquatic Oligochaetes: Comparison of Experimental Communities.

    PubMed

    Vivien, Régis; Lejzerowicz, Franck; Pawlowski, Jan

    2016-01-01

    Aquatic oligochaetes are a common group of freshwater benthic invertebrates known to be very sensitive to environmental changes and currently used as bioindicators in some countries. However, more extensive application of oligochaetes for assessing the ecological quality of sediments in watercourses and lakes would require overcoming the difficulties related to morphology-based identification of oligochaete species. This study tested the Next-Generation Sequencing (NGS) of a standard cytochrome c oxidase I (COI) barcode as a tool for the rapid assessment of oligochaete diversity in environmental samples, based on mixed specimen samples. To know the composition of each sample we Sanger sequenced every specimen present in these samples. Our study showed that a large majority of OTUs (Operational Taxonomic Units) could be detected by NGS analyses. We also observed congruence between the NGS and specimen abundance data for several but not all OTUs. Because the differences in sequence abundance data were consistent across samples, we exploited these variations to empirically design correction factors. We showed that such factors increased the congruence between the values of oligochaete-based indices inferred from the NGS and the Sanger-sequenced specimen data. The validation of these correction factors by further experimental studies will be needed for the adaptation and use of NGS technology in biomonitoring studies based on oligochaete communities. PMID:26866802

  15. Next-Generation Sequencing of Aquatic Oligochaetes: Comparison of Experimental Communities

    PubMed Central

    Vivien, Régis; Lejzerowicz, Franck; Pawlowski, Jan

    2016-01-01

    Aquatic oligochaetes are a common group of freshwater benthic invertebrates known to be very sensitive to environmental changes and currently used as bioindicators in some countries. However, more extensive application of oligochaetes for assessing the ecological quality of sediments in watercourses and lakes would require overcoming the difficulties related to morphology-based identification of oligochaete species. This study tested the Next-Generation Sequencing (NGS) of a standard cytochrome c oxidase I (COI) barcode as a tool for the rapid assessment of oligochaete diversity in environmental samples, based on mixed specimen samples. To know the composition of each sample we Sanger sequenced every specimen present in these samples. Our study showed that a large majority of OTUs (Operational Taxonomic Units) could be detected by NGS analyses. We also observed congruence between the NGS and specimen abundance data for several but not all OTUs. Because the differences in sequence abundance data were consistent across samples, we exploited these variations to empirically design correction factors. We showed that such factors increased the congruence between the values of oligochaete-based indices inferred from the NGS and the Sanger-sequenced specimen data. The validation of these correction factors by further experimental studies will be needed for the adaptation and use of NGS technology in biomonitoring studies based on oligochaete communities. PMID:26866802

  16. Bovine herpesvirus-1: comparison and differentiation of vaccine and field strains based on genomic sequence variation.

    PubMed

    Fulton, R W; d'Offay, J M; Eberle, R

    2013-03-01

    Bovine herpesvirus-1 (BoHV-1) causes significant disease in cattle including respiratory, fetal diseases, and reproductive tract infections. Control programs usually include vaccination with a modified live viral (MLV) vaccine. On occasion BoHV-1 strains are isolated from diseased animals or fetuses postvaccination. Currently there are no markers for differentiating MLV strains from field strains of BoHV-1. In this study several BoHV-1 strains were sequenced using whole-genome sequencing technologies and the data analyzed to identify single nucleotide polymorphisms (SNPs). Strains sequenced included the reference BoHV-1 Cooper strain (GenBank Accession JX898220), eight commercial MLV vaccine strains, and 14 field strains from cases presented for diagnosis. Based on SNP analyses, the viruses could be classified into groups having similar SNP patterns. The eight MLV strains could be differentiated from one another although some were closely related to each other. A number of field strains isolated from animals with a history of prior vaccination had SNP patterns similar to specific MLV viruses, while other field isolates were very distinct from all vaccine strains. The results indicate that some BoHV-1 isolates from clinically ill cattle/fetuses can be associated with a prior MLV vaccination history, but more information is needed on the rate of BoHV-1 genome sequence change before irrefutable associations can be drawn. PMID:23333211

  17. Bringing Next-Generation Sequencing into the Classroom through a Comparison of Molecular Biology Techniques

    ERIC Educational Resources Information Center

    Bowling, Bethany; Zimmer, Erin; Pyatt, Robert E.

    2014-01-01

    Although the development of next-generation (NextGen) sequencing technologies has revolutionized genomic research and medicine, the incorporation of these topics into the classroom is challenging, given an implied high degree of technical complexity. We developed an easy-to-implement, interactive classroom activity investigating the similarities…

  18. Comparison of Computer Vision and Photogrammetric Approaches for Epipolar Resampling of Image Sequence

    PubMed Central

    Kim, Jae-In; Kim, Taejung

    2016-01-01

    Epipolar resampling is the procedure of eliminating vertical disparity between stereo images. Due to its importance, many methods have been developed in the computer vision and photogrammetry field. However, we argue that epipolar resampling of image sequences, instead of a single pair, has not been studied thoroughly. In this paper, we compare epipolar resampling methods developed in both fields for handling image sequences. Firstly we briefly review the uncalibrated and calibrated epipolar resampling methods developed in computer vision and photogrammetric epipolar resampling methods. While it is well known that epipolar resampling methods developed in computer vision and in photogrammetry are mathematically identical, we also point out differences in parameter estimation between them. Secondly, we tested representative resampling methods in both fields and performed an analysis. We showed that for epipolar resampling of a single image pair all uncalibrated and photogrammetric methods tested could be used. More importantly, we also showed that, for image sequences, all methods tested, except the photogrammetric Bayesian method, showed significant variations in epipolar resampling performance. Our results indicate that the Bayesian method is favorable for epipolar resampling of image sequences. PMID:27011186

  19. Comparison of microbial DNA enrichment tools for metagenomic whole genome sequencing.

    PubMed

    Thoendel, Matthew; Jeraldo, Patricio R; Greenwood-Quaintance, Kerryl E; Yao, Janet Z; Chia, Nicholas; Hanssen, Arlen D; Abdel, Matthew P; Patel, Robin

    2016-08-01

    Metagenomic whole genome sequencing for detection of pathogens in clinical samples is an exciting new area for discovery and clinical testing. A major barrier to this approach is the overwhelming ratio of human to pathogen DNA in samples with low pathogen abundance, which is typical of most clinical specimens. Microbial DNA enrichment methods offer the potential to relieve this limitation by improving this ratio. Two commercially available enrichment kits, the NEBNext Microbiome DNA Enrichment Kit and the Molzym MolYsis Basic kit, were tested for their ability to enrich for microbial DNA from resected arthroplasty component sonicate fluids from prosthetic joint infections or uninfected sonicate fluids spiked with Staphylococcus aureus. Using spiked uninfected sonicate fluid there was a 6-fold enrichment of bacterial DNA with the NEBNext kit and 76-fold enrichment with the MolYsis kit. Metagenomic whole genome sequencing of sonicate fluid revealed 13- to 85-fold enrichment of bacterial DNA using the NEBNext enrichment kit. The MolYsis approach achieved 481- to 9580-fold enrichment, resulting in 7 to 59% of sequencing reads being from the pathogens known to be present in the samples. These results demonstrate the usefulness of these tools when testing clinical samples with low microbial burden using next generation sequencing. PMID:27237775
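
    The fold-enrichment figures in this abstract can be read as a ratio of microbial read fractions before and after treatment. The sketch below assumes that definition (the abstract does not state the exact calculation used), and the function name and read counts are hypothetical.

```python
def fold_enrichment(microbial_before: int, total_before: int,
                    microbial_after: int, total_after: int) -> float:
    """Ratio of the microbial read fraction after enrichment to that before.

    Assumed definition:
        (microbial_after / total_after) / (microbial_before / total_before)
    computed by cross-multiplication so only one division is performed.
    """
    if microbial_before <= 0 or total_before <= 0 or total_after <= 0:
        raise ValueError("counts before enrichment and totals must be positive")
    return (microbial_after * total_before) / (microbial_before * total_after)


if __name__ == "__main__":
    # Hypothetical counts: 1 microbial read per 1,000 before treatment,
    # 76 microbial reads per 1,000 after treatment.
    print(fold_enrichment(1, 1000, 76, 1000))
```

    Under this reading, a sample in which very few reads are microbial before treatment needs a very large fold enrichment before pathogen reads make up a substantial share of the sequencing output.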

  20. Comparison of Computer Vision and Photogrammetric Approaches for Epipolar Resampling of Image Sequence.

    PubMed

    Kim, Jae-In; Kim, Taejung

    2016-01-01

    Epipolar resampling is the procedure of eliminating vertical disparity between stereo images. Due to its importance, many methods have been developed in the computer vision and photogrammetry field. However, we argue that epipolar resampling of image sequences, instead of a single pair, has not been studied thoroughly. In this paper, we compare epipolar resampling methods developed in both fields for handling image sequences. Firstly we briefly review the uncalibrated and calibrated epipolar resampling methods developed in computer vision and photogrammetric epipolar resampling methods. While it is well known that epipolar resampling methods developed in computer vision and in photogrammetry are mathematically identical, we also point out differences in parameter estimation between them. Secondly, we tested representative resampling methods in both fields and performed an analysis. We showed that for epipolar resampling of a single image pair all uncalibrated and photogrammetric methods tested could be used. More importantly, we also showed that, for image sequences, all methods tested, except the photogrammetric Bayesian method, showed significant variations in epipolar resampling performance. Our results indicate that the Bayesian method is favorable for epipolar resampling of image sequences. PMID:27011186

  1. Comparison of Ribotyping and sequence-based typing for discriminating among isolates of Bordetella bronchiseptica

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Aims: Our goal was to compare the discriminatory power of PvuII ribotyping and MLST using a single set of diverse Bordetella bronchiseptica isolates and to determine whether subtyping based on repeat region sequences of the pertactin gene (prn) provides additional resolution. Methods and Results: ...

  2. Amino Acids Generated from Hydrated Titan Tholins: Comparison with Miller-Urey Electric Discharge Products

    NASA Technical Reports Server (NTRS)

    Cleaves, H. James, II; Neish, Catherine; Callahan, Michael P.; Parker, Eric; Fernandez, Facundo M.; Dworkin, Jason P.

    2014-01-01

    Various analogues of Titan haze particles (termed tholins) have been made in the laboratory. In certain geologic environments on Titan, these haze particles may come into contact with aqueous ammonia (NH3) solutions, hydrolyzing them into molecules of astrobiological interest. A Titan tholin analogue hydrolyzed in aqueous NH3 at room temperature for 2.5 years was analyzed for amino acids using highly sensitive ultra-high performance liquid chromatography coupled with fluorescence detection and time-of-flight mass spectrometry (UHPLC-FD/ToF-MS) analysis after derivatization with a fluorescent tag. We compare here the amino acids produced from this reaction sequence with those generated from room temperature Miller-Urey (MU) type electric discharge reactions. We find that most of the amino acids detected in low temperature MU CH4/N2/H2O electric discharge reactions are generated in Titan simulation reactions, as well as in previous simulations of Triton chemistry. This argues that many processes provide very similar mixtures of amino acids, and possibly other types of organic compounds, in disparate environments, regardless of the order of hydration. Although it is unknown how life began, it is likely that given reducing conditions, similar materials were available throughout the early Solar System and throughout the universe to facilitate chemical evolution.

  3. Amino acids generated from hydrated Titan tholins: Comparison with Miller-Urey electric discharge products

    NASA Astrophysics Data System (ADS)

    Cleaves, H. James; Neish, Catherine; Callahan, Michael P.; Parker, Eric; Fernández, Facundo M.; Dworkin, Jason P.

    2014-07-01

    Various analogues of Titan haze particles (termed ‘tholins’) have been made in the laboratory. In certain geologic environments on Titan, these haze particles may come into contact with aqueous ammonia (NH3) solutions, hydrolyzing them into molecules of astrobiological interest. A Titan tholin analogue hydrolyzed in aqueous NH3 at room temperature for 2.5 years was analyzed for amino acids using highly sensitive ultra-high performance liquid chromatography coupled with fluorescence detection and time-of-flight mass spectrometry (UHPLC-FD/ToF-MS) analysis after derivatization with a fluorescent tag. We compare here the amino acids produced from this reaction sequence with those generated from room temperature Miller-Urey (MU) type electric discharge reactions. We find that most of the amino acids detected in low temperature MU CH4/N2/H2O electric discharge reactions are generated in Titan simulation reactions, as well as in previous simulations of Triton chemistry. This argues that many processes provide very similar mixtures of amino acids, and possibly other types of organic compounds, in disparate environments, regardless of the order of hydration. Although it is unknown how life began, it is likely that given reducing conditions, similar materials were available throughout the early Solar System and throughout the universe to facilitate chemical evolution.

  4. Dynamic behavior of an intrinsically unstructured linker domain is conserved in the face of negligible amino acid sequence conservation.

    PubMed

    Daughdrill, Gary W; Narayanaswami, Pranesh; Gilmore, Sara H; Belczyk, Agniezka; Brown, Celeste J

    2007-09-01

    Proteins or regions of proteins that do not form compact globular structures are classified as intrinsically unstructured proteins (IUPs). IUPs are common in nature and have essential molecular functions, but even a limited understanding of the evolution of their dynamic behavior is lacking. The primary objective of this work was to test the evolutionary conservation of dynamic behavior for a particular class of IUPs that form intrinsically unstructured linker domains (IULD) that tether flanking folded domains. This objective was accomplished by measuring the backbone flexibility of several IULD homologues using nuclear magnetic resonance (NMR) spectroscopy. The backbone flexibility of five IULDs, representing three kingdoms, was measured and analyzed. Two IULDs from animals, one IULD from fungi, and two IULDs from plants showed similar levels of backbone flexibility that were consistent with the absence of a compact globular structure. In contrast, the amino acid sequences of the IULDs from these three taxa showed no significant similarity. To investigate how the dynamic behavior of the IULDs could be conserved in the absence of detectable sequence conservation, evolutionary rate studies were performed on a set of nine mammalian IULDs. The results of this analysis showed that many sites in the IULD are evolving neutrally, suggesting that dynamic behavior can be maintained in the absence of natural selection. This work represents the first experimental test of the evolutionary conservation of dynamic behavior and demonstrates that amino acid sequence conservation is not required for the conservation of dynamic behavior and presumably molecular function. PMID:17721672

  5. Cloning and nucleotide sequencing of genes for three small, acid-soluble proteins from Bacillus subtilis spores.

    PubMed Central

    Connors, M J; Mason, J M; Setlow, P

    1986-01-01

    Three Bacillus subtilis genes (termed sspA, sspB, and sspD) which code for small, acid-soluble spore proteins (SASPs) have been cloned, and their complete nucleotide sequences have been determined. The amino acid sequences of the SASPs coded for by these genes are similar to each other and to those of the SASP-1 of B. subtilis (coded for by the sspC gene) and the SASP-A/C family of B. megaterium. The sspA and sspB genes are expressed only in sporulation, in parallel with each other and with the sspC gene. Two regions upstream of the postulated transcription start sites for the sspA and sspB genes have significant homology with the analogous regions of the sspC gene and the SASP-A/C gene family. Purification of two of the three major B. subtilis SASPs (alpha and beta) and determination of their amino-terminal sequences indicated that the sspA gene codes for SASP-alpha and that the sspB gene codes for SASP-beta. This was confirmed by the introduction of deletion mutations into the cloned sspA and sspB genes and transfer of these deletions into the B. subtilis chromosome with concomitant loss of the wild-type gene. PMID:3009398

  6. Nucleotide sequence of the fadR gene, a multifunctional regulator of fatty acid metabolism in Escherichia coli.

    PubMed Central

    DiRusso, C C

    1988-01-01

    The Escherichia coli fadR gene is a multifunctional regulator of fatty acid and acetate metabolism. In the present work the nucleotide sequence of the 1.3 kb DNA fragment which encodes FadR has been determined. The coding sequence of the fadR gene is 714 nucleotides long and is preceded by a typical E. coli ribosome binding site and is followed by a sequence predicted to be sufficient for factor-independent chain termination. Primer extension experiments demonstrated that the transcription of the fadR gene initiates with an adenine nucleotide 33 nucleotides upstream from the predicted start of translation. The derived fadR peptide has a calculated molecular weight of 26,972. This is in reasonable agreement with the apparent molecular weight of 29,000 previously estimated on the basis of maxi-cell analysis of plasmid encoded proteins. There is a segment of twenty amino acids within the predicted peptide which resembles the DNA recognition and binding site of many transcriptional regulatory proteins. PMID:2843809

  7. The amino acid sequence of protein SCMK-B2C from the high-sulphur fraction of wool keratin

    PubMed Central

    Elleman, T. C.

    1972-01-01

    1. The amino acid sequence of a protein from the reduced and carboxymethylated high-sulphur fraction of wool has been determined. 2. The sequence of this S-carboxymethylkerateine (SCMK-B2C) of 151 amino acid residues displays much internal homology and an unusual residue distribution. Thus a ten-residue sequence occurs four times near the N-terminus and five times near the C-terminus with few changes. These regions contain much of the molecule's half-cystine, whereas between them there is a region of 19 residues that are mainly small and devoid of cystine and proline. 3. Certain models of the wool fibre based on its mechanical and physical properties propose a matrix of small compact globular units linked together to form beaded chains. The unusual distribution of the component residues of protein SCMK-B2C suggests structures in the wool-fibre matrix compatible with certain features of the proposed models. PMID:4678578

  8. The Complete Mitochondrial Genome Sequence of Bactericera cockerelli and Comparison with Three Other Psylloidea Species

    PubMed Central

    Wu, Fengnian; Cen, Yijing; Wallis, Christopher M.; Trumble, John T.; Prager, Sean; Yokomi, Ray; Zheng, Zheng; Deng, Xiaoling; Chen, Jianchi; Liang, Guangwen

    2016-01-01

    Potato psyllid (Bactericera cockerelli) is an important pest of potato, tomato and pepper. Not only can a toxin secreted by nymphs result in serious phytotoxemia in some host plants, but over the past few years B. cockerelli has also been shown to transmit “Candidatus Liberibacter solanacearum”, the putative bacterial pathogen of potato zebra chip (ZC) disease, to potato and tomato. ZC has caused devastating losses to potato production in the western U.S., Mexico, and elsewhere. New knowledge of the genetic diversity of B. cockerelli is needed to develop improved strategies to manage pest populations. Mitochondrial genome (mitogenome) sequencing provides important knowledge about insect evolution and diversity in and among populations. This report provides the first complete B. cockerelli mitogenome sequence as determined by next generation sequencing technology (Illumina MiSeq). The circular B. cockerelli mitogenome had a size of 15,220 bp with 13 protein-coding genes (PCGs), 2 ribosomal RNA genes (rRNAs), 22 transfer RNA genes (tRNAs), and a non-coding region of 975 bp. The overall gene order of the B. cockerelli mitogenome is identical to three other published Psylloidea mitogenomes: one species from the Triozidae, Paratrioza sinica; and two species from the Psyllidae, Cacopsylla coccinea and Pachypsylla venusta. This suggests all of these species share a common ancestral mitogenome. However, sequence analyses revealed differences between and among the insect families, in particular a unique region that can be folded into three stem-loop secondary structures present only within the B. cockerelli mitogenome. A phylogenetic tree based on the 13 PCGs matched an existing taxonomy scheme that was based on morphological characteristics. The complete mitogenome sequence makes all of its genes accessible for future evaluation of B. cockerelli population diversity. PMID:27227976

  9. The Complete Mitochondrial Genome Sequence of Bactericera cockerelli and Comparison with Three Other Psylloidea Species.

    PubMed

    Wu, Fengnian; Cen, Yijing; Wallis, Christopher M; Trumble, John T; Prager, Sean; Yokomi, Ray; Zheng, Zheng; Deng, Xiaoling; Chen, Jianchi; Liang, Guangwen

    2016-01-01

    Potato psyllid (Bactericera cockerelli) is an important pest of potato, tomato and pepper. Not only can a toxin secreted by nymphs result in serious phytotoxemia in some host plants, but over the past few years B. cockerelli has also been shown to transmit "Candidatus Liberibacter solanacearum", the putative bacterial pathogen of potato zebra chip (ZC) disease, to potato and tomato. ZC has caused devastating losses to potato production in the western U.S., Mexico, and elsewhere. New knowledge of the genetic diversity of B. cockerelli is needed to develop improved strategies to manage pest populations. Mitochondrial genome (mitogenome) sequencing provides important knowledge about insect evolution and diversity in and among populations. This report provides the first complete B. cockerelli mitogenome sequence as determined by next generation sequencing technology (Illumina MiSeq). The circular B. cockerelli mitogenome had a size of 15,220 bp with 13 protein-coding genes (PCGs), 2 ribosomal RNA genes (rRNAs), 22 transfer RNA genes (tRNAs), and a non-coding region of 975 bp. The overall gene order of the B. cockerelli mitogenome is identical to three other published Psylloidea mitogenomes: one species from the Triozidae, Paratrioza sinica; and two species from the Psyllidae, Cacopsylla coccinea and Pachypsylla venusta. This suggests all of these species share a common ancestral mitogenome. However, sequence analyses revealed differences between and among the insect families, in particular a unique region that can be folded into three stem-loop secondary structures present only within the B. cockerelli mitogenome. A phylogenetic tree based on the 13 PCGs matched an existing taxonomy scheme that was based on morphological characteristics. The complete mitogenome sequence now available makes all genes accessible for future evaluation of B. cockerelli population diversity. PMID:27227976

  10. The nucleotide sequence of HLA-B*2704 reveals a new amino acid substitution in exon 4 which is also present in HLA-B*2706

    SciTech Connect

    Rudwaleit, M.; Bowness, P.; Wordsworth, P.

    1996-12-31

    The HLA-B27 subtype HLA-B*2704 is virtually absent in Caucasians but common in Orientals, where it is associated with ankylosing spondylitis. The amino acid sequence of HLA-B*2704 has been established by peptide mapping and was shown to differ by two amino acids from HLA-B*2705. HLA-B*2704 is characterized by a serine for aspartic acid substitution at position 77 and glutamic acid for valine at position 152. To date, however, no nucleotide sequence confirming these changes at the DNA level has been published. 13 refs., 2 figs.

  11. JRC GMO-Amplicons: a collection of nucleic acid sequences related to genetically modified organisms.

    PubMed

    Petrillo, Mauro; Angers-Loustau, Alexandre; Henriksson, Peter; Bonfini, Laura; Patak, Alex; Kreysa, Joachim

    2015-01-01

    The DNA target sequence is the key element in designing detection methods for genetically modified organisms (GMOs). Unfortunately this information is frequently lacking, especially for unauthorized GMOs. In addition, patent sequences are generally poorly annotated, buried in complex and extensive documentation and hard to link to the corresponding GM event. Here, we present the JRC GMO-Amplicons, a database of amplicons collected by screening public nucleotide sequence databanks by in silico determination of PCR amplification with reference methods for GMO analysis. The European Union Reference Laboratory for Genetically Modified Food and Feed (EU-RL GMFF) provides these methods in the GMOMETHODS database to support enforcement of EU legislation and GM food/feed control. The JRC GMO-Amplicons database is composed of more than 240 000 amplicons, which can be easily accessed and screened through a web interface. To our knowledge, this is the first attempt at pooling and collecting publicly available sequences related to GMOs in food and feed. The JRC GMO-Amplicons supports control laboratories in the design and assessment of GMO methods, providing inter-alia in silico prediction of primers specificity and GM targets coverage. The new tool can assist the laboratories in the analysis of complex issues, such as the detection and identification of unauthorized GMOs. Notably, the JRC GMO-Amplicons database allows the retrieval and characterization of GMO-related sequences included in patents documentation. Finally, it can help annotating poorly described GM sequences and identifying new relevant GMO-related sequences in public databases. The JRC GMO-Amplicons is freely accessible through a web-based portal that is hosted on the EU-RL GMFF website. Database URL: http://gmo-crl.jrc.ec.europa.eu/jrcgmoamplicons/. PMID:26424080
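
    The "in silico determination of PCR amplification" mentioned above can be illustrated with a minimal sketch. It is not the JRC pipeline: it assumes exact primer matches on a single strand, and the function names, the `max_len` cutoff and the toy sequences are hypothetical choices made only for illustration.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string (A, C, G, T, N only)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def in_silico_pcr(template: str, forward: str, reverse: str, max_len: int = 2000) -> list:
    """List the amplicons predicted for one primer pair on one template strand.

    Assumption: primers must match exactly; real tools tolerate mismatches.
    """
    template = template.upper()
    fwd = forward.upper()
    # The reverse primer anneals to the opposite strand, so on the template
    # strand we look for its reverse complement downstream of the forward site.
    rev_site = reverse_complement(reverse)
    amplicons = []
    start = template.find(fwd)
    while start != -1:
        end = template.find(rev_site, start + len(fwd))
        while end != -1:
            stop = end + len(rev_site)
            if stop - start > max_len:
                break  # later sites would only give longer products
            amplicons.append(template[start:stop])
            end = template.find(rev_site, end + 1)
        start = template.find(fwd, start + 1)
    return amplicons


if __name__ == "__main__":
    toy_template = "TTTTACGTACGGGGCCCCTTGACAAAAA"  # hypothetical sequence
    print(in_silico_pcr(toy_template, "ACGTAC", "TGTCAA"))
```

    Screening a databank with such a function would amount to calling it once per sequence and per reference primer pair and keeping the non-empty results.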

  12. JRC GMO-Amplicons: a collection of nucleic acid sequences related to genetically modified organisms

    PubMed Central

    Petrillo, Mauro; Angers-Loustau, Alexandre; Henriksson, Peter; Bonfini, Laura; Patak, Alex; Kreysa, Joachim

    2015-01-01

    The DNA target sequence is the key element in designing detection methods for genetically modified organisms (GMOs). Unfortunately this information is frequently lacking, especially for unauthorized GMOs. In addition, patent sequences are generally poorly annotated, buried in complex and extensive documentation and hard to link to the corresponding GM event. Here, we present the JRC GMO-Amplicons, a database of amplicons collected by screening public nucleotide sequence databanks by in silico determination of PCR amplification with reference methods for GMO analysis. The European Union Reference Laboratory for Genetically Modified Food and Feed (EU-RL GMFF) provides these methods in the GMOMETHODS database to support enforcement of EU legislation and GM food/feed control. The JRC GMO-Amplicons database is composed of more than 240 000 amplicons, which can be easily accessed and screened through a web interface. To our knowledge, this is the first attempt at pooling and collecting publicly available sequences related to GMOs in food and feed. The JRC GMO-Amplicons supports control laboratories in the design and assessment of GMO methods, providing inter-alia in silico prediction of primers specificity and GM targets coverage. The new tool can assist the laboratories in the analysis of complex issues, such as the detection and identification of unauthorized GMOs. Notably, the JRC GMO-Amplicons database allows the retrieval and characterization of GMO-related sequences included in patents documentation. Finally, it can help annotating poorly described GM sequences and identifying new relevant GMO-related sequences in public databases. The JRC GMO-Amplicons is freely accessible through a web-based portal that is hosted on the EU-RL GMFF website. Database URL: http://gmo-crl.jrc.ec.europa.eu/jrcgmoamplicons/ PMID:26424080

  13. Amino acid sequence and molecular modelling of glycoprotein IIb-IIIa and fibronectin receptor iso-antagonists from Trimeresurus elegans venom.

    PubMed Central

    Scaloni, A; Di Martino, E; Miraglia, N; Pelagalli, A; Della Morte, R; Staiano, N; Pucci, P

    1996-01-01

    Low-molecular-mass Arg-Gly-Asp (RGD)-containing polypeptides were isolated from the venom of Trimeresurus elegans by a simple two-step procedure consisting of membrane filtration and reverse-phase HPLC. A combination of electrospray MS, fast-atom bombardment MS and Edman degradation allowed us to ascertain the presence in the venom of different isoforms and to determine their primary structures. The amino acid sequences resembled the structure of elegantin, the only disintegrin previously reported from the T. elegans venom [Williams, Rucinski, Holt and Niewiarowski (1990) Biochim. Biophys. Acta 1039, 81-89]. MS analyses indicated the occurrence of differential proteolytic processing at both the N-terminus and the C-terminus of the polypeptide chains. The amino acid sequence alignment of the elegantin isoforms with known components of the disintegrin family demonstrated the complete conservation of the 12 cysteine residues involved in disulphide bridges. Molecular modelling of elegantins predicted an overall folding of these molecules quite similar to that reported for the kistrin solution structure. The newly identified polypeptide isoforms strongly inhibited ADP-induced aggregation in both human and canine platelet-rich plasma but showed a different species-dependent specificity. These molecules were also able to inhibit B16-BL6 murine melanoma cell adhesion to immobilized fibronectin. The comparison of the structures and biological activities of elegantin isoforms and kistrin allowed us to highlight some structural features that, in addition to the RGD locus, might be involved in the interaction of these snake-venom polypeptides with the integrin receptors on the platelet and cell surface. PMID:8920980

  14. Comparison between Topical and Oral Tranexamic Acid in Management of Traumatic Hyphema

    PubMed Central

    Jahadi Hosseini, Seyed Hamid Reza; Khalili, Mohammad Reza; Motallebi, Mahmoud

    2014-01-01

    Background: We sought to determine the efficacy of topical tranexamic acid (5%) in the management of traumatic hyphema. Methods: Thirty eyes with gross traumatic hyphema were enrolled in this study. The patients were treated with tranexamic acid (5%) eye drops every 6 hours for 5 days. The main outcome measures were best corrected visual acuity (BCVA), intraocular pressure (IOP), day of clot absorption, and rate of rebleeding. These parameters were evaluated daily for 4 days and thereafter at the 8th and 14th days after treatment. The patients were also compared with two historical control groups of patients (80 eyes) with traumatic hyphema; the first control group was treated with oral placebo and the other group was treated with oral tranexamic acid at our department. Results: Prior to treatment, the mean logarithm of the minimum angle of resolution (logMAR) BCVA was 0.59±0.62. BCVA improved to a logMAR of 0.08±0.14 at day 14 (P<0.001). The mean IOP before treatment was 13.7±3.9 mm Hg, which was reduced to 11.4±1.8 mm Hg at day 14 (P=0.004). Rebleeding occurred in one (3.3%) patient on the 4th day post treatment. Comparison between the case group and the other two historical control groups with respect to the rebleeding rate demonstrated statistically significant differences between the case group and the first control group (P=0.008) but no statistically significant differences between the case group and the second control group (P=0.25). Conclusion: Topical tranexamic acid seems promising in the management of traumatic hyphema. However, the small sample size of the present study precludes the conclusion that topical tranexamic acid can replace oral tranexamic acid. PMID:24753640

  15. Comparison of automated pre-column and post-column analysis of amino acid oligomers

    NASA Technical Reports Server (NTRS)

    Chow, J.; Orenberg, J. B.; Nugent, K. D.

    1987-01-01

    It has been shown that various amino acids will polymerize under plausible prebiotic conditions on mineral surfaces, such as clays and soluble salts, to form varying amounts of oligomers (n = 2-6). The investigations of these surface reactions required a quantitative method for the separation and detection of these amino acid oligomers at the picomole level in the presence of nanomole levels of the parent amino acid. In initial high-performance liquid chromatography (HPLC) studies using a classical postcolumn o-phthalaldehyde (OPA) derivatization ion-exchange HPLC procedure with fluorescence detection, problems encountered included lengthy analysis time, inadequate separation and large relative differences in sensitivity for the separated species, expressed as a variable fluorescent yield, which contributed to poor quantitation. We have compared a simple, automated, pre-column OPA derivatization and reversed-phase HPLC method with the classical post-column OPA derivatization and ion-exchange HPLC procedure. A comparison of UV and fluorescent detection of the amino acid oligomers is also presented. The conclusion reached is that the pre-column OPA derivatization, reversed-phase HPLC and UV detection produces enhanced separation, improved sensitivity and faster analysis than post-column OPA derivatization, ion-exchange HPLC and fluorescence detection.

  16. Comparison of Four Strong Acids on the Precipitation Potential of Gypsum in Brines During Distillation of Pretreated, Augmented Urine

    NASA Technical Reports Server (NTRS)

    Muirhead, Dean

    2011-01-01

    Two batches of nominally pretreated and augmented urine were prepared with the baseline pretreatment formulation of sulfuric acid and chromium trioxide. The urine was augmented with inorganic salts and organic compounds in order to simulate urinary ionic concentrations representing the upper 95th percentile on orbit. Three strong mineral acids (phosphoric, hydrochloric, and nitric) were substituted for the sulfuric acid for comparison to the baseline sulfuric acid pretreatment formulation. Three concentrations of oxidizer in the pretreatment formulation were also tested. Pretreated urine was distilled to 85% water recovery to determine the effect of each acid and its conjugate base on the precipitation of minerals during distillation. The brines were analyzed for calcium and sulfate ions and for total, volatile, and fixed suspended solids. Test results verified that substitution of phosphoric, hydrochloric, or nitric acid for sulfuric acid would prevent the precipitation of gypsum up to 85% recovery from pretreated urine representing the upper 95th percentile calcium concentration on orbit.

  17. A molecular mechanism realizing sequence-specific recognition of nucleic acids by TDP-43

    PubMed Central

    Furukawa, Yoshiaki; Suzuki, Yoh; Fukuoka, Mami; Nagasawa, Kenichi; Nakagome, Kenta; Shimizu, Hideaki; Mukaiyama, Atsushi; Akiyama, Shuji

    2016-01-01

    TAR DNA-binding protein 43 (TDP-43) is a DNA/RNA-binding protein containing two consecutive RNA recognition motifs (RRM1 and RRM2) in tandem. Functional abnormality of TDP-43 has been proposed to cause neurodegeneration, but it remains obscure how the physiological functions of this protein are regulated. Here, we show distinct roles of RRM1 and RRM2 in the sequence-specific substrate recognition of TDP-43. RRM1 was found to bind a wide spectrum of ssDNA sequences, while no binding was observed between RRM2 and ssDNA. When two RRMs are fused in tandem as in native TDP-43, the fused construct almost exclusively binds ssDNA with a TG-repeat sequence. In contrast, such sequence-specificity was not observed in a simple mixture of RRM1 and RRM2. We thus propose that the spatial arrangement of multiple RRMs in DNA/RNA binding proteins provides steric effects on the substrate-binding site and thereby controls the specificity of its substrate nucleotide sequences. PMID:26838063

  18. Application of combined mass spectrometry and partial amino acid sequence to the identification of gel-separated proteins.

    PubMed

    Patterson, S D; Thomas, D; Bradshaw, R A

    1996-05-01

    The combined use of peptide mass information with amino acid sequence information derived by chemical sequencing or mass spectrometry (MS)-based approaches provides a powerful means of protein identification. We have used a two-part strategy to identify proteins from nerve growth factor (NGF)-stimulated rat adrenal pheochromocytoma cell line PC-12 cell lysates that associate with the adaptor protein Shc (Shc homologous and collagen protein). Initial experiments with metabolically radiolabeled cell extracts separated by sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE) revealed a number of proteins that coimmunoprecipitated with anti-Shc antibody compared with control (unstimulated) cell extracts. The experiment was scaled up and cell lysate from NGF-stimulated PC-12 cells was applied to a glutathione-S-transferase (GST)-Shc affinity column, eluted, separated by SDS-PAGE and blotted to Immobilon-CD. The blotted proteins were proteolytically digested in situ, and the masses obtained from the extracted peptides were used in a peptide-mass search program in an attempt to identify the protein. Even if a strong candidate was found using this search, an additional step was performed to confirm the identification. The mixtures were fractionated by reversed-phase high-performance liquid chromatography (RP-HPLC) and subjected to chemical sequencing to obtain (partial) sequence information, or post-source decay (PSD-) matrix-assisted laser-desorption ionization (MALDI)-MS to obtain sequence-specific fragment ions. This data was used in a peptide-sequence tag search to confirm the identity of the proteins. This combined approach allowed identification of four proteins of M(r) 43,000 to 200,000. In one case the identified protein clearly did not correspond to the radiolabeled band, but to a protein contaminant from the column. The advantages and pitfalls of the approach are discussed. PMID:8783013
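
    The peptide-mass search step described above can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not name the protease, so trypsin rules (cleave after K or R, not before P) are assumed; the mass tolerance, function names and toy database are hypothetical; and candidates are ranked simply by the number of matched masses.

```python
import re

# Monoisotopic residue masses (Da) of the 20 standard amino acids.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # one water is added per free peptide


def tryptic_peptides(protein: str) -> list:
    """Split after K or R unless the next residue is P (assumed trypsin rule)."""
    return [p for p in re.split(r"(?<=[KR])(?!P)", protein.upper()) if p]


def peptide_mass(peptide: str) -> float:
    """Monoisotopic mass of an unmodified, uncharged peptide."""
    return sum(MONO[aa] for aa in peptide) + WATER


def count_matches(observed: list, protein: str, tol: float = 0.5) -> int:
    """Count observed masses lying within tol Da of any theoretical peptide."""
    theoretical = [peptide_mass(p) for p in tryptic_peptides(protein)]
    return sum(any(abs(m - t) <= tol for t in theoretical) for m in observed)


def rank_candidates(observed: list, database: dict, tol: float = 0.5) -> list:
    """Rank database entries by the number of matched peptide masses."""
    scores = {name: count_matches(observed, seq, tol) for name, seq in database.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    toy_db = {"p1": "GKAAR", "p2": "WWWK"}  # hypothetical sequences
    print(rank_candidates([203.127, 316.186], toy_db))
```

    A high match count alone does not settle an identification, which is consistent with the abstract's point that sequence information from chemical sequencing or PSD-MALDI-MS was still used to confirm each candidate.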

  19. Comparison of the Equine Reference Sequence with Its Sanger Source Data and New Illumina Reads

    PubMed Central

    Rebolledo-Mendez, Jovan; Hestand, Matthew S.; Coleman, Stephen J.; Zeng, Zheng; Orlando, Ludovic; MacLeod, James N.; Kalbfleisch, Ted

    2015-01-01

    The reference assembly for the domestic horse, EquCab2, published in 2009, was built using approximately 30 million Sanger reads from a Thoroughbred mare named Twilight. Contiguity in the assembly was facilitated using nearly 315 thousand BAC end sequences from Twilight’s half brother Bravo. Since then, it has served as the foundation for many genome-wide analyses that include not only the modern horse, but ancient horses and other equid species as well. As data mapped to this reference has accumulated, consistent variation between mapped datasets and the reference, in terms of regions with no read coverage, single nucleotide variants, and small insertions/deletions, has become apparent. In many cases, it is not clear whether these differences are the result of true sequence variation between the research subjects’ and Twilight’s genome or due to errors in the reference. EquCab2 is regarded as “The Twilight Assembly.” The objective of this study was to identify inconsistencies between the EquCab2 assembly and the source Twilight Sanger data used to build it. To that end, the original Sanger and BAC end reads have been mapped back to this equine reference and assessed with the addition of approximately 40X coverage of new Illumina Paired-End sequence data. The resulting mapped datasets identify those regions with low Sanger read coverage, as well as variation in genomic content that is not consistent with either the original Twilight Sanger data or the new genomic sequence data generated from Twilight on the Illumina platform. As the haploid EquCab2 reference assembly was created using Sanger reads derived largely from a single individual, the vast majority of variation detected in a mapped dataset comprised of those same Sanger reads should be heterozygous. In contrast, homozygous variations would represent either errors in the reference or contributions from Bravo's BAC end sequences. Our analysis identifies 720,843 homozygous discrepancies between new

  20. Peptide mapping and amino acid sequencing of two catechol 1,2-dioxygenases (CD I1 and CD I2) from Acinetobacter lwoffii K24.

    PubMed

    Kim, S I; Ha, K S

    1997-10-31

    The partial amino acid sequences of two catechol 1,2-dioxygenases (CD I1 and CD I2) from Acinetobacter lwoffii K24 have been determined by analysis of peptides after cleavages with endopeptidase Lys-C, endopeptidase Glu-C, trypsin, and chemicals (cyanogen bromide and BNPS-skatole). They include 248 amino acid residues (4 fragments) of CD I1 and 211 amino acid residues (5 fragments) of CD I2. The two enzymes have more than 50% sequence homology with type I catechol 1,2-dioxygenases and less than 30% sequence homology with type II catechol 1,2-dioxygenases. The two enzymes have similar hydropathy profiles in the N-terminal region, suggesting that they have similar secondary structures. PMID:9387151
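
    A hydropathy profile comparison of the kind mentioned above might be sketched as follows. The abstract does not state which scale or window was used; this example assumes the Kyte-Doolittle scale with a sliding-window average, and the window size, function names and the use of a Pearson correlation to quantify similarity are hypothetical choices for illustration.

```python
import math

# Kyte-Doolittle hydropathy index for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq: str, window: int = 9) -> list:
    """Mean hydropathy of each full window along the sequence."""
    seq = seq.upper()
    if window < 1 or len(seq) < window:
        return []
    scores = [KD[aa] for aa in seq]
    return [sum(scores[i:i + window]) / window
            for i in range(len(seq) - window + 1)]


def profile_correlation(a: list, b: list) -> float:
    """Pearson correlation over the shared leading part of two profiles."""
    n = min(len(a), len(b))
    if n < 2:
        return 0.0
    a, b = a[:n], b[:n]
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a flat profile has no defined correlation
    return cov / math.sqrt(var_a * var_b)


if __name__ == "__main__":
    # Hypothetical N-terminal fragments, not the CD I1 / CD I2 sequences.
    p1 = hydropathy_profile("MKLIVAGSDEKRLLVAAG", window=5)
    p2 = hydropathy_profile("MRLVIAGTDEKKLIVSAG", window=5)
    print(round(profile_correlation(p1, p2), 3))
```

    A correlation close to 1 over the N-terminal windows would correspond to the "similar hydropathy profiles" the abstract reports, although the authors may have compared the plots by other means.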