Sample records for acid sequences based

  1. Composition for nucleic acid sequencing

    DOEpatents

    Korlach, Jonas [Ithaca, NY]; Webb, Watt W [Ithaca, NY]; Levene, Michael [Ithaca, NY]; Turner, Stephen [Ithaca, NY]; Craighead, Harold G [Ithaca, NY]; Foquet, Mathieu [Ithaca, NY]

    2008-08-26

    The present invention is directed to a method of sequencing a target nucleic acid molecule having a plurality of bases. In its principle, the temporal order of base additions during the polymerization reaction is measured on a molecule of nucleic acid, i.e. the activity of a nucleic acid polymerizing enzyme on the template nucleic acid molecule to be sequenced is followed in real time. The sequence is deduced by identifying which base is being incorporated into the growing complementary strand of the target nucleic acid by the catalytic activity of the nucleic acid polymerizing enzyme at each step in the sequence of base additions. A polymerase on the target nucleic acid molecule complex is provided in a position suitable to move along the target nucleic acid molecule and extend the oligonucleotide primer at an active site. A plurality of labelled types of nucleotide analogs are provided proximate to the active site, with each distinguishable type of nucleotide analog being complementary to a different nucleotide in the target nucleic acid sequence. The growing nucleic acid strand is extended by using the polymerase to add a nucleotide analog to the nucleic acid strand at the active site, where the nucleotide analog being added is complementary to the nucleotide of the target nucleic acid at the active site. The nucleotide analog added to the oligonucleotide primer as a result of the polymerizing step is identified. The steps of providing labelled nucleotide analogs, polymerizing the growing nucleic acid strand, and identifying the added nucleotide analog are repeated so that the nucleic acid strand is further extended and the sequence of the target nucleic acid is determined.
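
    The deduction step in this abstract can be illustrated with a toy sketch: if detection yields the order in which bases were incorporated into the growing strand, the template bases follow by complementarity. The function name and the example read are hypothetical, and the sketch says nothing about how the labelled analogs are actually detected.

```python
# Toy illustration only: recover template bases from the order of
# incorporated nucleotides. Names and the example read are hypothetical.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def template_from_incorporations(incorporated):
    """Return the template bases in the order the polymerase traversed them.

    `incorporated` lists bases in the order they were added to the growing
    strand (5'->3'), so the returned string reads the template 3'->5'.
    """
    return "".join(COMPLEMENT[base] for base in incorporated)

growing_strand = "ATGC"                      # hypothetical detected order
template_3_to_5 = template_from_incorporations(growing_strand)
template_5_to_3 = template_3_to_5[::-1]      # conventional 5'->3' orientation
```

    Reversing the string at the end only changes the reading direction; the complement mapping is what carries the sequence information.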

  2. Method for sequencing nucleic acid molecules

    DOEpatents

    Korlach, Jonas; Webb, Watt W.; Levene, Michael; Turner, Stephen; Craighead, Harold G.; Foquet, Mathieu

    2006-06-06

    The present invention is directed to a method of sequencing a target nucleic acid molecule having a plurality of bases. In its principle, the temporal order of base additions during the polymerization reaction is measured on a molecule of nucleic acid, i.e. the activity of a nucleic acid polymerizing enzyme on the template nucleic acid molecule to be sequenced is followed in real time. The sequence is deduced by identifying which base is being incorporated into the growing complementary strand of the target nucleic acid by the catalytic activity of the nucleic acid polymerizing enzyme at each step in the sequence of base additions. A polymerase on the target nucleic acid molecule complex is provided in a position suitable to move along the target nucleic acid molecule and extend the oligonucleotide primer at an active site. A plurality of labelled types of nucleotide analogs are provided proximate to the active site, with each distinguishable type of nucleotide analog being complementary to a different nucleotide in the target nucleic acid sequence. The growing nucleic acid strand is extended by using the polymerase to add a nucleotide analog to the nucleic acid strand at the active site, where the nucleotide analog being added is complementary to the nucleotide of the target nucleic acid at the active site. The nucleotide analog added to the oligonucleotide primer as a result of the polymerizing step is identified. The steps of providing labelled nucleotide analogs, polymerizing the growing nucleic acid strand, and identifying the added nucleotide analog are repeated so that the nucleic acid strand is further extended and the sequence of the target nucleic acid is determined.

  3. Method for sequencing nucleic acid molecules

    DOEpatents

    Korlach, Jonas; Webb, Watt W.; Levene, Michael; Turner, Stephen; Craighead, Harold G.; Foquet, Mathieu

    2006-05-30

    The present invention is directed to a method of sequencing a target nucleic acid molecule having a plurality of bases. In its principle, the temporal order of base additions during the polymerization reaction is measured on a molecule of nucleic acid, i.e. the activity of a nucleic acid polymerizing enzyme on the template nucleic acid molecule to be sequenced is followed in real time. The sequence is deduced by identifying which base is being incorporated into the growing complementary strand of the target nucleic acid by the catalytic activity of the nucleic acid polymerizing enzyme at each step in the sequence of base additions. A polymerase on the target nucleic acid molecule complex is provided in a position suitable to move along the target nucleic acid molecule and extend the oligonucleotide primer at an active site. A plurality of labelled types of nucleotide analogs are provided proximate to the active site, with each distinguishable type of nucleotide analog being complementary to a different nucleotide in the target nucleic acid sequence. The growing nucleic acid strand is extended by using the polymerase to add a nucleotide analog to the nucleic acid strand at the active site, where the nucleotide analog being added is complementary to the nucleotide of the target nucleic acid at the active site. The nucleotide analog added to the oligonucleotide primer as a result of the polymerizing step is identified. The steps of providing labelled nucleotide analogs, polymerizing the growing nucleic acid strand, and identifying the added nucleotide analog are repeated so that the nucleic acid strand is further extended and the sequence of the target nucleic acid is determined.

  4. Sequence quality analysis tool for HIV type 1 protease and reverse transcriptase.

    PubMed

    Delong, Allison K; Wu, Mingham; Bennett, Diane; Parkin, Neil; Wu, Zhijin; Hogan, Joseph W; Kantor, Rami

    2012-08-01

    Access to antiretroviral therapy is increasing globally and drug resistance evolution is anticipated. Currently, protease (PR) and reverse transcriptase (RT) sequence generation is increasing, including the use of in-house sequencing assays, and quality assessment prior to sequence analysis is essential. We created a computational HIV PR/RT Sequence Quality Analysis Tool (SQUAT) that runs in the R statistical environment. Sequence quality thresholds are calculated from a large dataset (46,802 PR and 44,432 RT sequences) from the published literature (http://hivdb.Stanford.edu). Nucleic acid sequences are read into SQUAT, identified, aligned, and translated. Nucleic acid sequences are flagged if they have >five 1-2-base insertions; >one 3-base insertion; >one deletion; >six PR or >18 RT ambiguous bases; >three consecutive PR or >four RT nucleic acid mutations; >zero stop codons; >three PR or >six RT ambiguous amino acids; >three consecutive PR or >four RT amino acid mutations; >zero unique amino acids; or <0.5% or >15% genetic distance from another submitted sequence. Thresholds are user modifiable. SQUAT output includes a summary report with detailed comments for troubleshooting of flagged sequences, histograms of pairwise genetic distances, neighbor joining phylogenetic trees, and aligned nucleic and amino acid sequences. SQUAT is a stand-alone, free, web-independent tool to ensure use of high-quality HIV PR/RT sequences in interpretation and reporting of drug resistance, while increasing awareness and expertise and facilitating troubleshooting of potentially problematic sequences.
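
    The flagging rules listed in this abstract amount to a set of "greater than" limits per gene. A minimal sketch of that logic follows, in Python rather than R. It is not SQUAT's code: the field names, the function name, and the way a single genetic distance is passed in are assumptions made for illustration.

```python
# Hypothetical sketch of threshold-based flagging; not SQUAT's actual R code.
# A sequence is flagged when a count is strictly greater than its limit.
GENE_LIMITS = {
    "PR": {"ambiguous_bases": 6, "consecutive_na_mutations": 3,
           "ambiguous_amino_acids": 3, "consecutive_aa_mutations": 3},
    "RT": {"ambiguous_bases": 18, "consecutive_na_mutations": 4,
           "ambiguous_amino_acids": 6, "consecutive_aa_mutations": 4},
}
# Limits that the abstract states without distinguishing PR from RT.
SHARED_LIMITS = {"short_insertions": 5, "three_base_insertions": 1,
                 "deletions": 1, "stop_codons": 0, "unique_amino_acids": 0}
DISTANCE_RANGE_PCT = (0.5, 15.0)  # acceptable genetic distance, in percent

def flag_reasons(gene, counts, distance_pct=None):
    """Return the names of all limits that the given counts exceed."""
    limits = {**SHARED_LIMITS, **GENE_LIMITS[gene]}
    reasons = [name for name, limit in limits.items()
               if counts.get(name, 0) > limit]
    if distance_pct is not None:
        low, high = DISTANCE_RANGE_PCT
        if distance_pct < low or distance_pct > high:
            reasons.append("genetic_distance")
    return reasons

pr_flags = flag_reasons("PR", {"ambiguous_bases": 7})
rt_flags = flag_reasons("RT", {"ambiguous_bases": 7})
```

    Keeping the limits in plain dictionaries mirrors the abstract's statement that thresholds are user modifiable: a caller can copy and edit them without touching the checking logic.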

  5. 37 CFR 1.822 - Symbols and format to be used for nucleotide and/or amino acid sequence data.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... base or modified or unusual amino acid may be presented in a given sequence as the corresponding unmodified base or amino acid if the modified base or modified or unusual amino acid is one of those listed... the Feature section. Otherwise, each occurrence of a base or amino acid not appearing in WIPO Standard...

  6. Labeled nucleotide phosphate (NP) probes

    DOEpatents

    Korlach, Jonas [Ithaca, NY]; Webb, Watt W [Ithaca, NY]; Levene, Michael [Ithaca, NY]; Turner, Stephen [Ithaca, NY]; Craighead, Harold G [Ithaca, NY]; Foquet, Mathieu [Ithaca, NY]

    2009-02-03

    The present invention is directed to a method of sequencing a target nucleic acid molecule having a plurality of bases. In its principle, the temporal order of base additions during the polymerization reaction is measured on a molecule of nucleic acid, i.e. the activity of a nucleic acid polymerizing enzyme on the template nucleic acid molecule to be sequenced is followed in real time. The sequence is deduced by identifying which base is being incorporated into the growing complementary strand of the target nucleic acid by the catalytic activity of the nucleic acid polymerizing enzyme at each step in the sequence of base additions. A polymerase on the target nucleic acid molecule complex is provided in a position suitable to move along the target nucleic acid molecule and extend the oligonucleotide primer at an active site. A plurality of labelled types of nucleotide analogs are provided proximate to the active site, with each distinguishable type of nucleotide analog being complementary to a different nucleotide in the target nucleic acid sequence. The growing nucleic acid strand is extended by using the polymerase to add a nucleotide analog to the nucleic acid strand at the active site, where the nucleotide analog being added is complementary to the nucleotide of the target nucleic acid at the active site. The nucleotide analog added to the oligonucleotide primer as a result of the polymerizing step is identified. The steps of providing labelled nucleotide analogs, polymerizing the growing nucleic acid strand, and identifying the added nucleotide analog are repeated so that the nucleic acid strand is further extended and the sequence of the target nucleic acid is determined.

  7. Nucleic acid analysis using terminal-phosphate-labeled nucleotides

    DOEpatents

    Korlach, Jonas [Ithaca, NY]; Webb, Watt W [Ithaca, NY]; Levene, Michael [Ithaca, NY]; Turner, Stephen [Ithaca, NY]; Craighead, Harold G [Ithaca, NY]; Foquet, Mathieu [Ithaca, NY]

    2008-04-22

    The present invention is directed to a method of sequencing a target nucleic acid molecule having a plurality of bases. In its principle, the temporal order of base additions during the polymerization reaction is measured on a molecule of nucleic acid, i.e. the activity of a nucleic acid polymerizing enzyme on the template nucleic acid molecule to be sequenced is followed in real time. The sequence is deduced by identifying which base is being incorporated into the growing complementary strand of the target nucleic acid by the catalytic activity of the nucleic acid polymerizing enzyme at each step in the sequence of base additions. A polymerase on the target nucleic acid molecule complex is provided in a position suitable to move along the target nucleic acid molecule and extend the oligonucleotide primer at an active site. A plurality of labelled types of nucleotide analogs are provided proximate to the active site, with each distinguishable type of nucleotide analog being complementary to a different nucleotide in the target nucleic acid sequence. The growing nucleic acid strand is extended by using the polymerase to add a nucleotide analog to the nucleic acid strand at the active site, where the nucleotide analog being added is complementary to the nucleotide of the target nucleic acid at the active site. The nucleotide analog added to the oligonucleotide primer as a result of the polymerizing step is identified. The steps of providing labelled nucleotide analogs, polymerizing the growing nucleic acid strand, and identifying the added nucleotide analog are repeated so that the nucleic acid strand is further extended and the sequence of the target nucleic acid is determined.

  8. Application of 2D graphic representation of protein sequence based on Huffman tree method.

    PubMed

    Qi, Zhao-Hui; Feng, Jun; Qi, Xiao-Qin; Li, Ling

    2012-05-01

    Based on the Huffman tree method, we propose a new 2D graphic representation of protein sequences. This representation completely avoids loss of information in the transfer of data from a protein sequence to its graphic representation. The method consists of two parts. The first derives 0-1 codes for the 20 amino acids from a Huffman tree built on amino acid frequency, where the frequency of an amino acid is defined as its count in the analyzed protein sequences. The second builds the 2D graphic representation of a protein sequence from these 0-1 codes. Applications of the method to ten ND5 genes and seven Escherichia coli strains are then presented in detail. The results show that the proposed model may provide new insights into the evolution patterns determined from protein sequences and complete genomes. Copyright © 2012 Elsevier Ltd. All rights reserved.
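
    The coding step of this method, assigning 0-1 codes to amino acids from their counts, can be sketched with a standard Huffman construction. The toy sequence and function name below are hypothetical, and the paper's 2D plotting step is not reproduced.

```python
import heapq
from collections import Counter
from itertools import count

def huffman_codes(frequencies):
    """Map each symbol to a 0-1 code string built from a Huffman tree."""
    if not frequencies:
        return {}
    tiebreak = count()  # unique index so heap entries never compare dicts
    heap = [(weight, next(tiebreak), {symbol: ""})
            for symbol, weight in frequencies.items()]
    heapq.heapify(heap)
    if len(heap) == 1:
        # A single symbol still needs a non-empty code.
        return {symbol: "0" for symbol in heap[0][2]}
    while len(heap) > 1:
        # Merge the two lightest subtrees, prefixing their codes with 0 and 1.
        w0, _, left = heapq.heappop(heap)
        w1, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (w0 + w1, next(tiebreak), merged))
    return heap[0][2]

sequence = "MKKLLAAAAL"  # hypothetical toy protein fragment
codes = huffman_codes(Counter(sequence))
encoded = "".join(codes[residue] for residue in sequence)
```

    Because Huffman codes are prefix-free, the encoded bit string can be decoded back to the original residues without separators, which is consistent with the abstract's claim of a lossless representation.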

  9. Detection of nucleic acid sequences by invader-directed cleavage

    DOEpatents

    Brow, Mary Ann D.; Hall, Jeff Steven Grotelueschen; Lyamichev, Victor; Olive, David Michael; Prudent, James Robert

    1999-01-01

    The present invention relates to means for the detection and characterization of nucleic acid sequences, as well as variations in nucleic acid sequences. The present invention also relates to methods for forming a nucleic acid cleavage structure on a target sequence and cleaving the nucleic acid cleavage structure in a site-specific manner. The 5' nuclease activity of a variety of enzymes is used to cleave the target-dependent cleavage structure, thereby indicating the presence of specific nucleic acid sequences or specific variations thereof. The present invention further relates to methods and devices for the separation of nucleic acid molecules based on charge.

  10. Method for high-volume sequencing of nucleic acids: random and directed priming with libraries of oligonucleotides

    DOEpatents

    Studier, F. William

    1995-04-18

    Random and directed priming methods for determining nucleotide sequences by enzymatic sequencing techniques, using libraries of primers of lengths 8, 9 or 10 bases, are disclosed. These methods permit direct sequencing of nucleic acids as large as 45,000 base pairs or larger without the necessity for subcloning. Individual primers are used repeatedly to prime sequence reactions in many different nucleic acid molecules. Libraries containing as few as 10,000 octamers, 14,200 nonamers, or 44,000 decamers would have the capacity to determine the sequence of almost any cosmid DNA. Random priming with a fixed set of primers from a smaller library can also be used to initiate the sequencing of individual nucleic acid molecules, with the sequence being completed by directed priming with primers from the library. In contrast to random cloning techniques, a combined random and directed priming strategy is far more efficient.
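
    To put the quoted library sizes in context, the short calculation below compares them with the number of all possible primers of each length, which is 4 raised to the primer length. Only the library sizes come from the abstract; the variable names are illustrative.

```python
# Library sizes quoted in the abstract, keyed by primer length in bases.
library_size = {8: 10_000, 9: 14_200, 10: 44_000}

# Every position can hold one of four bases, so there are 4**n possible n-mers.
all_possible = {n: 4 ** n for n in library_size}

# Fraction of the complete n-mer set that each quoted library covers.
fraction = {n: library_size[n] / all_possible[n] for n in library_size}

for n in sorted(library_size):
    print(f"{n}-mers: {library_size[n]:,} of {all_possible[n]:,} "
          f"({fraction[n]:.1%})")
```

    The quoted libraries therefore cover only a small share of each complete set, roughly 15% of octamers and about 4 to 5% of nonamers and decamers.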

  11. Method for high-volume sequencing of nucleic acids: random and directed priming with libraries of oligonucleotides

    DOEpatents

    Studier, F.W.

    1995-04-18

    Random and directed priming methods for determining nucleotide sequences by enzymatic sequencing techniques, using libraries of primers of lengths 8, 9 or 10 bases, are disclosed. These methods permit direct sequencing of nucleic acids as large as 45,000 base pairs or larger without the necessity for subcloning. Individual primers are used repeatedly to prime sequence reactions in many different nucleic acid molecules. Libraries containing as few as 10,000 octamers, 14,200 nonamers, or 44,000 decamers would have the capacity to determine the sequence of almost any cosmid DNA. Random priming with a fixed set of primers from a smaller library can also be used to initiate the sequencing of individual nucleic acid molecules, with the sequence being completed by directed priming with primers from the library. In contrast to random cloning techniques, a combined random and directed priming strategy is far more efficient. 2 figs.

  12. SNBRFinder: A Sequence-Based Hybrid Algorithm for Enhanced Prediction of Nucleic Acid-Binding Residues.

    PubMed

    Yang, Xiaoxia; Wang, Jia; Sun, Jun; Liu, Rong

    2015-01-01

    Protein-nucleic acid interactions are central to various fundamental biological processes. Automated methods capable of reliably identifying DNA- and RNA-binding residues in protein sequence are assuming ever-increasing importance. The majority of current algorithms rely on feature-based prediction, but their accuracy remains to be further improved. Here we propose a sequence-based hybrid algorithm SNBRFinder (Sequence-based Nucleic acid-Binding Residue Finder) by merging a feature predictor SNBRFinderF and a template predictor SNBRFinderT. SNBRFinderF was established using the support vector machine whose inputs include sequence profile and other complementary sequence descriptors, while SNBRFinderT was implemented with the sequence alignment algorithm based on profile hidden Markov models to capture the weakly homologous template of query sequence. Experimental results show that SNBRFinderF was clearly superior to the commonly used sequence profile-based predictor and SNBRFinderT can achieve comparable performance to the structure-based template methods. Leveraging the complementary relationship between these two predictors, SNBRFinder reasonably improved the performance of both DNA- and RNA-binding residue predictions. More importantly, the sequence-based hybrid prediction reached competitive performance relative to our previous structure-based counterpart. Our extensive and stringent comparisons show that SNBRFinder has obvious advantages over the existing sequence-based prediction algorithms. The value of our algorithm is highlighted by establishing an easy-to-use web server that is freely accessible at http://ibi.hzau.edu.cn/SNBRFinder.

  13. Hybridization and sequencing of nucleic acids using base pair mismatches

    DOEpatents

    Fodor, Stephen P. A.; Lipshutz, Robert J.; Huang, Xiaohua

    2001-01-01

    Devices and techniques for hybridization of nucleic acids and for determining the sequence of nucleic acids. Arrays of nucleic acids are formed by techniques, preferably high resolution, light-directed techniques. Positions of hybridization of a target nucleic acid are determined by, e.g., epifluorescence microscopy. Devices and techniques are proposed to determine the sequence of a target nucleic acid more efficiently and more quickly through such synthesis and detection techniques.

  14. Nucleic and Amino Acid Sequences Support Structure-Based Viral Classification.

    PubMed

    Sinclair, Robert M; Ravantti, Janne J; Bamford, Dennis H

    2017-04-15

    Viral capsids ensure viral genome integrity by protecting the enclosed nucleic acids. Interactions between the genome and capsid and between individual capsid proteins (i.e., capsid architecture) are intimate and are expected to be characterized by strong evolutionary conservation. For this reason, a capsid structure-based viral classification has been proposed as a way to bring order to the viral universe. The seeming lack of sufficient sequence similarity to reproduce this classification has made it difficult to reject structural convergence as the basis for the classification. We reinvestigate whether the structure-based classification for viral coat proteins making icosahedral virus capsids is in fact supported by previously undetected sequence similarity. Since codon choices can influence nascent protein folding cotranslationally, we searched for both amino acid and nucleotide sequence similarity. To demonstrate the sensitivity of the approach, we identify a candidate gene for the pandoravirus capsid protein. We show that the structure-based classification is strongly supported by amino acid and also nucleotide sequence similarities, suggesting that the similarities are due to common descent. The correspondence between structure-based and sequence-based analyses of the same proteins shown here allows them to be used in future analyses of the relationship between linear sequence information and macromolecular function, as well as between linear sequence and protein folds. IMPORTANCE Viral capsids protect nucleic acid genomes, which in turn encode capsid proteins. This tight coupling of protein shell and nucleic acids, together with strong functional constraints on capsid protein folding and architecture, leads to the hypothesis that capsid protein-coding nucleotide sequences may retain signatures of ancient viral evolution. We have been able to show that this is indeed the case, using the major capsid proteins of viruses forming icosahedral capsids. Importantly, we detected similarity at the nucleotide level between capsid protein-coding regions from viruses infecting cells belonging to all three domains of life, reproducing a previously established structure-based classification of icosahedral viral capsids. Copyright © 2017 Sinclair et al.

  15. Nucleic and Amino Acid Sequences Support Structure-Based Viral Classification

    PubMed Central

    Sinclair, Robert M.; Ravantti, Janne J.

    2017-01-01

    ABSTRACT Viral capsids ensure viral genome integrity by protecting the enclosed nucleic acids. Interactions between the genome and capsid and between individual capsid proteins (i.e., capsid architecture) are intimate and are expected to be characterized by strong evolutionary conservation. For this reason, a capsid structure-based viral classification has been proposed as a way to bring order to the viral universe. The seeming lack of sufficient sequence similarity to reproduce this classification has made it difficult to reject structural convergence as the basis for the classification. We reinvestigate whether the structure-based classification for viral coat proteins making icosahedral virus capsids is in fact supported by previously undetected sequence similarity. Since codon choices can influence nascent protein folding cotranslationally, we searched for both amino acid and nucleotide sequence similarity. To demonstrate the sensitivity of the approach, we identify a candidate gene for the pandoravirus capsid protein. We show that the structure-based classification is strongly supported by amino acid and also nucleotide sequence similarities, suggesting that the similarities are due to common descent. The correspondence between structure-based and sequence-based analyses of the same proteins shown here allows them to be used in future analyses of the relationship between linear sequence information and macromolecular function, as well as between linear sequence and protein folds. IMPORTANCE Viral capsids protect nucleic acid genomes, which in turn encode capsid proteins. This tight coupling of protein shell and nucleic acids, together with strong functional constraints on capsid protein folding and architecture, leads to the hypothesis that capsid protein-coding nucleotide sequences may retain signatures of ancient viral evolution. We have been able to show that this is indeed the case, using the major capsid proteins of viruses forming icosahedral capsids. Importantly, we detected similarity at the nucleotide level between capsid protein-coding regions from viruses infecting cells belonging to all three domains of life, reproducing a previously established structure-based classification of icosahedral viral capsids. PMID:28122979

  16. Quantitative thermodynamic predication of interactions between nucleic acid and non-nucleic acid species using Microsoft excel.

    PubMed

    Zou, Jiaqi; Li, Na

    2013-09-01

    Proper design of nucleic acid sequences is crucial for many applications. We have previously established a thermodynamics-based quantitative model to help design aptamer-based nucleic acid probes by predicting equilibrium concentrations of all interacting species. To facilitate customization of this thermodynamic model for different applications, here we present a generic and easy-to-use platform to implement the algorithm of the model with Microsoft® Excel formulas and VBA (Visual Basic for Applications) macros. Two Excel spreadsheets have been developed: one for applications involving only nucleic acid species, the other for applications involving both nucleic acid and non-nucleic acid species. The spreadsheets take the nucleic acid sequences and the initial concentrations of all species as input, guide the user to retrieve the necessary thermodynamic constants, and finally calculate equilibrium concentrations for all species in various bound and unbound conformations. The validity of both spreadsheets has been verified by comparing the modeling results with experimental results on nucleic acid sequences reported in the literature. The Excel-based platform described here will allow biomedical researchers to rationalize the sequence design of nucleic acid probes using thermodynamics-based modeling even without relevant theoretical and computational skills. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
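
    As a point of reference for what "calculating equilibrium concentrations" involves, the sketch below solves the simplest possible case: a single 1:1 binding equilibrium A + B <-> AB with a known dissociation constant. This is an assumption-laden illustration in Python, not the authors' spreadsheets or their multi-species model; the function name and the example concentrations are hypothetical.

```python
import math

def bound_complex_concentration(a_total, b_total, kd):
    """Equilibrium [AB] for A + B <-> AB; all inputs in the same molar units.

    Derived from Kd = [A][B]/[AB] with mass balance [A] = a_total - [AB]
    and [B] = b_total - [AB], taking the physically meaningful root of the
    resulting quadratic.
    """
    s = a_total + b_total + kd
    return (s - math.sqrt(s * s - 4.0 * a_total * b_total)) / 2.0

# Hypothetical example: 1 uM of each species, Kd = 100 nM.
a_total, b_total, kd = 1e-6, 1e-6, 1e-7
ab = bound_complex_concentration(a_total, b_total, kd)
free_a = a_total - ab
free_b = b_total - ab
```

    Systems with several coupled equilibria generally do not reduce to a single quadratic and need a numerical solver, which is presumably part of what the spreadsheet platform automates.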

  17. Chip-based sequencing nucleic acids

    DOEpatents

    Beer, Neil Reginald

    2014-08-26

    A system for fast DNA sequencing by amplification of genetic material within microreactors, denaturing, demulsifying, and then sequencing the material, while retaining it in a PCR/sequencing zone by a magnetic field. One embodiment includes sequencing nucleic acids on a microchip that includes a microchannel flow channel in the microchip. The nucleic acids are isolated and hybridized to magnetic nanoparticles or to magnetic polystyrene-coated beads. Microreactor droplets are formed in the microchannel flow channel. The microreactor droplets containing the nucleic acids and the magnetic nanoparticles are retained in a magnetic trap in the microchannel flow channel and sequenced.

  18. Methods and compositions for efficient nucleic acid sequencing

    DOEpatents

    Drmanac, Radoje

    2006-07-04

    Disclosed are novel methods and compositions for rapid and highly efficient nucleic acid sequencing based upon hybridization with two sets of small oligonucleotide probes of known sequences. Extremely large nucleic acid molecules, including chromosomes and non-amplified RNA, may be sequenced without prior cloning or subcloning steps. The methods of the invention also solve various current problems associated with sequencing technology such as, for example, high noise to signal ratios and difficult discrimination, attaching many nucleic acid fragments to a surface, preparing many, longer or more complex probes and labelling more species.

  19. Methods and compositions for efficient nucleic acid sequencing

    DOEpatents

    Drmanac, Radoje

    2002-01-01

    Disclosed are novel methods and compositions for rapid and highly efficient nucleic acid sequencing based upon hybridization with two sets of small oligonucleotide probes of known sequences. Extremely large nucleic acid molecules, including chromosomes and non-amplified RNA, may be sequenced without prior cloning or subcloning steps. The methods of the invention also solve various current problems associated with sequencing technology such as, for example, high noise to signal ratios and difficult discrimination, attaching many nucleic acid fragments to a surface, preparing many, longer or more complex probes and labelling more species.

  20. Computer-aided visualization and analysis system for sequence evaluation

    DOEpatents

    Chee, M.S.

    1998-08-18

    A computer system for analyzing nucleic acid sequences is provided. The computer system is used to perform multiple methods for determining unknown bases by analyzing the fluorescence intensities of hybridized nucleic acid probes. The results of individual experiments are improved by processing nucleic acid sequences together. Comparative analysis of multiple experiments is also provided by displaying reference sequences in one area and sample sequences in another area on a display device. 27 figs.

  1. Computer-aided visualization and analysis system for sequence evaluation

    DOEpatents

    Chee, Mark S.; Wang, Chunwei; Jevons, Luis C.; Bernhart, Derek H.; Lipshutz, Robert J.

    2004-05-11

    A computer system for analyzing nucleic acid sequences is provided. The computer system is used to perform multiple methods for determining unknown bases by analyzing the fluorescence intensities of hybridized nucleic acid probes. The results of individual experiments are improved by processing nucleic acid sequences together. Comparative analysis of multiple experiments is also provided by displaying reference sequences in one area and sample sequences in another area on a display device.

  2. Computer-aided visualization and analysis system for sequence evaluation

    DOEpatents

    Chee, Mark S.

    1998-08-18

    A computer system for analyzing nucleic acid sequences is provided. The computer system is used to perform multiple methods for determining unknown bases by analyzing the fluorescence intensities of hybridized nucleic acid probes. The results of individual experiments are improved by processing nucleic acid sequences together. Comparative analysis of multiple experiments is also provided by displaying reference sequences in one area and sample sequences in another area on a display device.

  3. Computer-aided visualization and analysis system for sequence evaluation

    DOEpatents

    Chee, Mark S.

    2003-08-19

    A computer system for analyzing nucleic acid sequences is provided. The computer system is used to perform multiple methods for determining unknown bases by analyzing the fluorescence intensities of hybridized nucleic acid probes. The results of individual experiments may be improved by processing nucleic acid sequences together. Comparative analysis of multiple experiments is also provided by displaying reference sequences in one area and sample sequences in another area on a display device.

  4. Computer-aided visualization and analysis system for sequence evaluation

    DOEpatents

    Chee, Mark S.

    1999-10-26

    A computer system (1) for analyzing nucleic acid sequences is provided. The computer system is used to perform multiple methods for determining unknown bases by analyzing the fluorescence intensities of hybridized nucleic acid probes. The results of individual experiments may be improved by processing nucleic acid sequences together. Comparative analysis of multiple experiments is also provided by displaying reference sequences in one area (814) and sample sequences in another area (816) on a display device (3).

  5. Computer-aided visualization and analysis system for sequence evaluation

    DOEpatents

    Chee, Mark S.

    2001-06-05

    A computer system (1) for analyzing nucleic acid sequences is provided. The computer system is used to perform multiple methods for determining unknown bases by analyzing the fluorescence intensities of hybridized nucleic acid probes. The results of individual experiments may be improved by processing nucleic acid sequences together. Comparative analysis of multiple experiments is also provided by displaying reference sequences in one area (814) and sample sequences in another area (816) on a display device (3).

  6. Microwave-assisted acid and base hydrolysis of intact proteins containing disulfide bonds for protein sequence analysis by mass spectrometry.

    PubMed

    Reiz, Bela; Li, Liang

    2010-09-01

    Controlled hydrolysis of proteins to generate peptide ladders combined with mass spectrometric analysis of the resultant peptides can be used for protein sequencing. In this paper, two methods of improving the microwave-assisted protein hydrolysis process are described to enable rapid sequencing of proteins containing disulfide bonds and increase sequence coverage, respectively. It was demonstrated that proteins containing disulfide bonds could be sequenced by MS analysis by first performing hydrolysis for less than 2 min, followed by 1 h of reduction to release the peptides originally linked by disulfide bonds. It was shown that a strong base could be used as a catalyst for microwave-assisted protein hydrolysis, producing complementary sequence information to that generated by microwave-assisted acid hydrolysis. However, using either acid or base hydrolysis, amide bond breakages in small regions of the polypeptide chains of the model proteins (e.g., cytochrome c and lysozyme) were not detected. Dynamic light scattering measurement of the proteins solubilized in an acid or base indicated that protein-protein interaction or aggregation was not the cause of the failure to hydrolyze certain amide bonds. It was speculated that there were some unknown local structures that might play a role in preventing an acid or base from reacting with the peptide bonds therein. 2010 American Society for Mass Spectrometry. Published by Elsevier Inc. All rights reserved.

  7. Implication of the cause of differences in 3D structures of proteins with high sequence identity based on analyses of amino acid sequences and 3D structures.

    PubMed

    Matsuoka, Masanari; Sugita, Masatake; Kikuchi, Takeshi

    2014-09-18

    Proteins that share a high sequence homology while exhibiting drastically different 3D structures are investigated in this study. Recently, artificial proteins related to the sequences of the GA and IgG binding GB domains of human serum albumin have been designed. These artificial proteins, referred to as GA and GB, share 98% amino acid sequence identity but exhibit different 3D structures, namely, a 3α bundle versus a 4β + α structure. Discriminating between their 3D structures based on their amino acid sequences is a very difficult problem. In the present work, in addition to using bioinformatics techniques, an analysis based on inter-residue average distance statistics is used to address this problem. It was hard to distinguish which structure a given sequence would take only with the results of ordinary analyses like BLAST and conservation analyses. However, in addition to these analyses, with the analysis based on the inter-residue average distance statistics and our sequence tendency analysis, we could infer which part would play an important role in its structural formation. The results suggest possible determinants of the different 3D structures for sequences with high sequence identity. The possibility of discriminating between the 3D structures based on the given sequences is also discussed.

  8. Brain cDNA clone for human cholinesterase

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    McTiernan, C.; Adkins, S.; Chatonnet, A.

    1987-10-01

    A cDNA library from human basal ganglia was screened with oligonucleotide probes corresponding to portions of the amino acid sequence of human serum cholinesterase. Five overlapping clones, representing 2.4 kilobases, were isolated. The sequenced cDNA contained 207 base pairs of coding sequence 5' to the amino terminus of the mature protein in which there were four ATG translation start sites in the same reading frame as the protein. Only the ATG coding for Met-(-28) lay within a favorable consensus sequence for functional initiators. There were 1722 base pairs of coding sequence corresponding to the protein found circulating in human serum. The amino acid sequence deduced from the cDNA exactly matched the 574 amino acid sequence of human serum cholinesterase, as previously determined by Edman degradation. Therefore, our clones represented cholinesterase rather than acetylcholinesterase. It was concluded that the amino acid sequences of cholinesterase from two different tissues, human brain and human serum, were identical. Hybridization of genomic DNA blots suggested that a single gene, or very few genes coded for cholinesterase.

  9. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Leong, JoAnn Ching

    The nucleotide sequence of the IHNV glycoprotein gene has been determined from a cDNA clone containing the entire coding region. The glycoprotein cDNA clone contained a leader sequence of 48 bases, a coding region of 1524 nucleotides, and 39 bases at the 3′ end. The entire cDNA clone contains 1609 nucleotides and encodes a protein of 508 amino acids. The deduced amino acid sequence gave a translated molecular weight of 56,795 daltons. A hydropathicity profile of the deduced amino acid sequence indicated that there were two major hydrophobic domains: one, at the N-terminus, delineating a signal peptide of 18 amino acids, and the other, at the C-terminus, delineating the transmembrane region. Five possible sites of N-linked glycosylation were identified. Although no nucleic acid homology existed between the IHNV glycoprotein gene and the glycoprotein genes of rabies and VSV, there was significant homology at the amino acid level between all three rhabdovirus glycoproteins.

  10. High speed nucleic acid sequencing

    DOEpatents

    Korlach, Jonas [Ithaca, NY; Webb, Watt W [Ithaca, NY; Levene, Michael [Ithaca, NY; Turner, Stephen [Ithaca, NY; Craighead, Harold G [Ithaca, NY; Foquet, Mathieu [Ithaca, NY

    2011-05-17

    The present invention is directed to a method of sequencing a target nucleic acid molecule having a plurality of bases. In its principle, the temporal order of base additions during the polymerization reaction is measured on a molecule of nucleic acid. Each type of labeled nucleotide comprises an acceptor fluorophore attached to a phosphate portion of the nucleotide such that the fluorophore is removed upon incorporation into a growing strand. Fluorescent signal is emitted via fluorescent resonance energy transfer between the donor fluorophore and the acceptor fluorophore as each nucleotide is incorporated into the growing strand. The sequence is deduced by identifying which base is being incorporated into the growing strand.

  11. Method of Identifying a Base in a Nucleic Acid

    DOEpatents

    Fodor, Stephen P. A.; Lipshutz, Robert J.; Huang, Xiaohua

    1999-01-01

    Devices and techniques for hybridization of nucleic acids and for determining the sequence of nucleic acids. Arrays of nucleic acids are formed by techniques, preferably high resolution, light-directed techniques. Positions of hybridization of a target nucleic acid are determined by, e.g., epifluorescence microscopy. Devices and techniques are proposed to determine the sequence of a target nucleic acid more efficiently and more quickly through such synthesis and detection techniques.

  12. Identifying a base in a nucleic acid

    DOEpatents

    Fodor, Stephen P. A.; Lipshutz, Robert J.; Huang, Xiaohua

    2005-02-08

    Devices and techniques for hybridization of nucleic acids and for determining the sequence of nucleic acids. Arrays of nucleic acids are formed by techniques, preferably high resolution, light-directed techniques. Positions of hybridization of a target nucleic acid are determined by, e.g., epifluorescence microscopy. Devices and techniques are proposed to determine the sequence of a target nucleic acid more efficiently and more quickly through such synthesis and detection techniques.

  13. A frequency-based linguistic approach to protein decoding and design: Simple concepts, diverse applications, and the SCS Package

    PubMed Central

    Motomura, Kenta; Nakamura, Morikazu; Otaki, Joji M.

    2013-01-01

    Protein structure and function information is coded in amino acid sequences. However, the relationship between primary sequences and three-dimensional structures and functions remains enigmatic. Our approach to this fundamental biochemistry problem is based on the frequencies of short constituent sequences (SCSs) or words. A protein amino acid sequence is considered analogous to an English sentence, where SCSs are equivalent to words. Availability scores, which are defined as real SCS frequencies in the non-redundant amino acid database relative to their probabilistically expected frequencies, demonstrate the biological usage bias of SCSs. As a result, this frequency-based linguistic approach is expected to have diverse applications, such as secondary structure specifications by structure-specific SCSs and immunological adjuvants with rare or non-existent SCSs. Linguistic similarities (e.g., wide ranges of scale-free distributions) and dissimilarities (e.g., behaviors of low-rank samples) between proteins and the natural English language have been revealed in the rank-frequency relationships of SCSs or words. We have developed a web server, the SCS Package, which contains five applications for analyzing protein sequences based on the linguistic concept. These tools have the potential to assist researchers in deciphering structurally and functionally important protein sites, species-specific sequences, and functional relationships between SCSs. The SCS Package also provides researchers with a tool to construct amino acid sequences de novo based on the idiomatic usage of SCSs. PMID:24688703
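
    The availability score described above can be sketched as follows. This is a minimal illustration, not the SCS Package's implementation: it assumes the score is the plain ratio of observed to expected word frequency, and that the expected frequency is the product of the single-residue frequencies in the same data set. The function name `availability_scores` and the word length `k` are hypothetical.

```python
from collections import Counter
from math import prod


def availability_scores(sequences, k=3):
    """Return observed/expected frequency ratios for every k-residue word seen.

    Assumption: expected frequency = product of the single-residue
    frequencies measured on the same input sequences.
    """
    residue_counts = Counter()
    word_counts = Counter()
    for seq in sequences:
        residue_counts.update(seq)
        # Overlapping words of length k (the "short constituent sequences").
        word_counts.update(seq[i:i + k] for i in range(len(seq) - k + 1))

    total_residues = sum(residue_counts.values())
    total_words = sum(word_counts.values())
    residue_freq = {aa: n / total_residues for aa, n in residue_counts.items()}

    scores = {}
    for word, n in word_counts.items():
        observed = n / total_words
        expected = prod(residue_freq[aa] for aa in word)
        scores[word] = observed / expected
    return scores
```

    Under these assumptions, a score above 1 marks a word used more often than chance predicts, and a score below 1 marks an under-used word.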

  14. A frequency-based linguistic approach to protein decoding and design: Simple concepts, diverse applications, and the SCS Package.

    PubMed

    Motomura, Kenta; Nakamura, Morikazu; Otaki, Joji M

    2013-01-01

    Protein structure and function information is coded in amino acid sequences. However, the relationship between primary sequences and three-dimensional structures and functions remains enigmatic. Our approach to this fundamental biochemistry problem is based on the frequencies of short constituent sequences (SCSs) or words. A protein amino acid sequence is considered analogous to an English sentence, where SCSs are equivalent to words. Availability scores, which are defined as real SCS frequencies in the non-redundant amino acid database relative to their probabilistically expected frequencies, demonstrate the biological usage bias of SCSs. As a result, this frequency-based linguistic approach is expected to have diverse applications, such as secondary structure specifications by structure-specific SCSs and immunological adjuvants with rare or non-existent SCSs. Linguistic similarities (e.g., wide ranges of scale-free distributions) and dissimilarities (e.g., behaviors of low-rank samples) between proteins and the natural English language have been revealed in the rank-frequency relationships of SCSs or words. We have developed a web server, the SCS Package, which contains five applications for analyzing protein sequences based on the linguistic concept. These tools have the potential to assist researchers in deciphering structurally and functionally important protein sites, species-specific sequences, and functional relationships between SCSs. The SCS Package also provides researchers with a tool to construct amino acid sequences de novo based on the idiomatic usage of SCSs.

  15. 37 CFR 1.821 - Nucleotide and/or amino acid sequence disclosures in patent applications.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ...” means those amino acids other than “Xaa” and those nucleotide bases other than “n”defined in accordance... 37 Patents, Trademarks, and Copyrights 1 2014-07-01 2014-07-01 false Nucleotide and/or amino acid... Biotechnology Invention Disclosures Application Disclosures Containing Nucleotide And/or Amino Acid Sequences...

  16. 37 CFR 1.821 - Nucleotide and/or amino acid sequence disclosures in patent applications.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ...” means those amino acids other than “Xaa” and those nucleotide bases other than “n”defined in accordance... 37 Patents, Trademarks, and Copyrights 1 2013-07-01 2013-07-01 false Nucleotide and/or amino acid... Biotechnology Invention Disclosures Application Disclosures Containing Nucleotide And/or Amino Acid Sequences...

  17. 37 CFR 1.821 - Nucleotide and/or amino acid sequence disclosures in patent applications.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ...” means those amino acids other than “Xaa” and those nucleotide bases other than “n”defined in accordance... 37 Patents, Trademarks, and Copyrights 1 2012-07-01 2012-07-01 false Nucleotide and/or amino acid... Biotechnology Invention Disclosures Application Disclosures Containing Nucleotide And/or Amino Acid Sequences...

  18. Construction Strategy for an Internal Amplification Control for Real-Time Diagnostic Assays Using Nucleic Acid Sequence-Based Amplification: Development and Clinical Application

    PubMed Central

    Rodríguez-Lázaro, David; D'Agostino, Martin; Pla, Maria; Cook, Nigel

    2004-01-01

    An important analytical control in molecular amplification-based methods is an internal amplification control (IAC), which should be included in each reaction mixture. An IAC is a nontarget nucleic acid sequence which is coamplified simultaneously with the target sequence. With negative results for the target nucleic acid, the absence of an IAC signal indicates that amplification has failed. A general strategy for the construction of an IAC for inclusion in molecular beacon-based real-time nucleic acid sequence-based amplification (NASBA) assays is presented. Construction proceeds in two phases. In the first phase, a double-stranded DNA molecule that contains nontarget sequences flanked by target sequences complementary to the NASBA primers is produced. At the 5′ end of this DNA molecule is a T7 RNA polymerase binding sequence. In the second phase of construction, RNA transcripts are produced from the DNA by T7 RNA polymerase. This RNA is the IAC; it is amplified by the target NASBA primers and is detected by a molecular beacon probe complementary to the internal nontarget sequences. As a practical example, an IAC for use in an assay for the detection of Mycobacterium avium subsp. paratuberculosis is described, its incorporation and optimization within the assay are detailed, and its application to spiked and natural clinical samples is shown to illustrate the correct interpretation of the diagnostic results. PMID:15583319
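
    The interpretation rule stated in the abstract (a negative target result is only trustworthy when the IAC signal is present) can be written as a small decision function. This is a sketch; the function name and the result labels are hypothetical, and treating a target-positive reaction as positive regardless of the IAC signal is an assumption not stated in the abstract.

```python
def interpret_reaction(target_signal: bool, iac_signal: bool) -> str:
    """Classify one reaction from its target and IAC signals."""
    if target_signal:
        # Assumption: a target signal is reported as positive even if the
        # IAC signal is weak or absent.
        return "positive"
    if iac_signal:
        # Amplification worked (IAC amplified) but no target was detected.
        return "negative"
    # Neither signal: amplification failed, so the result is uninformative.
    return "invalid"
```

    Only the last branch comes directly from the abstract: without an IAC signal, a negative target result cannot be distinguished from a failed reaction.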

  19. Molecular and Cellular Mechanisms for the Interaction between Gold Nanoparticles and Neuroimmune Cells Based on Size, Shape, and Charge

    DTIC Science & Technology

    2014-04-25

    IgG secretion. 2.3 Designing of Synthetic peptide The immunogenic peptides against the foot and mouth disease virus (FMDV) were designed and...synthesized based on viral protein 1 of type O FMDV. The amino acid sequence for pFMDV is NGSSKYGDTSTNNVRGDLQVLAQKAERTLC. An extra cysteine was added...peptides were synthesized based on the amino acid sequence of the VP1 coat protein of the FMDV (table 1). The peptide pFMDVD (19 amino acids in length

  20. Variability of the protein sequences of lcrV between epidemic and atypical rhamnose-positive strains of Yersinia pestis.

    PubMed

    Anisimov, Andrey P; Panfertsev, Evgeniy A; Svetoch, Tat'yana E; Dentovskaya, Svetlana V

    2007-01-01

    Sequencing of lcrV genes and comparison of the deduced amino acid sequences from ten Y. pestis strains belonging mostly to the group of atypical rhamnose-positive isolates (non-pestis subspecies or pestoides group) showed that the LcrV proteins analyzed could be classified into five sequence types. This classification was based on major amino acid polymorphisms among LcrV proteins in the four "hot points" of the protein sequences. Some additional minor polymorphisms were found throughout these sequence types. The "hot points" corresponded to amino acids 18 (Lys --> Asn), 72 (Lys --> Arg), 273 (Cys --> Ser), and 324-326 (Ser-Gly-Lys --> Arg) in the LcrV sequence of the reference Y. pestis strain CO92. One possible explanation for polymorphism in amino acid sequences of LcrV among different strains is that strain-specific variation resulted from adaptation of the plague pathogen to different rodent and lagomorph hosts.

  1. Probe kit for identifying a base in a nucleic acid

    DOEpatents

    Fodor, Stephen P. A.; Lipshutz, Robert J.; Huang, Xiaohua

    2001-01-01

    Devices and techniques for hybridization of nucleic acids and for determining the sequence of nucleic acids. Arrays of nucleic acids are formed by techniques, preferably high resolution, light-directed techniques. Positions of hybridization of a target nucleic acid are determined by, e.g., epifluorescence microscopy. Devices and techniques are proposed to determine the sequence of a target nucleic acid more efficiently and more quickly through such synthesis and detection techniques.

  2. Mouse Vk gene classification by nucleic acid sequence similarity.

    PubMed

    Strohal, R; Helmberg, A; Kroemer, G; Kofler, R

    1989-01-01

    Analyses of immunoglobulin (Ig) variable (V) region gene usage in the immune response, estimates of V gene germline complexity, and other nucleic acid hybridization-based studies depend on the extent to which such genes are related (i.e., sequence similarity) and their organization in gene families. While mouse Igh heavy chain V region (VH) gene families are relatively well-established, a corresponding systematic classification of Igk light chain V region (Vk) genes has not been reported. The present analysis, in the course of which we reviewed the known extent of the Vk germline gene repertoire and Vk gene usage in a variety of responses to foreign and self antigens, provides a classification of mouse Vk genes in gene families composed of members with greater than 80% overall nucleic acid sequence similarity. This classification differed in several aspects from that of VH genes: only some Vk gene families were as clearly separated (by greater than 25% sequence dissimilarity) as typical VH gene families; most Vk gene families were closely related and, in several instances, members from different families were very similar (greater than 80%) over large sequence portions; frequently, classification by nucleic acid sequence similarity diverged from existing classifications based on amino-terminal protein sequence similarity. Our data have implications for Vk gene analyses by nucleic acid hybridization and describe potentially important differences in sequence organization between VH and Vk genes.

  3. Detection of nucleic acids by multiple sequential invasive cleavages

    DOEpatents

    Hall, Jeff G.; Lyamichev, Victor I.; Mast, Andrea L.; Brow, Mary Ann D.

    1999-01-01

    The present invention relates to means for the detection and characterization of nucleic acid sequences, as well as variations in nucleic acid sequences. The present invention also relates to methods for forming a nucleic acid cleavage structure on a target sequence and cleaving the nucleic acid cleavage structure in a site-specific manner. The structure-specific nuclease activity of a variety of enzymes is used to cleave the target-dependent cleavage structure, thereby indicating the presence of specific nucleic acid sequences or specific variations thereof. The present invention further relates to methods and devices for the separation of nucleic acid molecules based on charge. The present invention also provides methods for the detection of non-target cleavage products via the formation of a complete and activated protein binding region. The invention further provides sensitive and specific methods for the detection of human cytomegalovirus nucleic acid in a sample.

  4. Nucleic acid detection kits

    DOEpatents

    Hall, Jeff G.; Lyamichev, Victor I.; Mast, Andrea L.; Brow, Mary Ann; Kwiatkowski, Robert W.; Vavra, Stephanie H.

    2005-03-29

    The present invention relates to means for the detection and characterization of nucleic acid sequences, as well as variations in nucleic acid sequences. The present invention also relates to methods for forming a nucleic acid cleavage structure on a target sequence and cleaving the nucleic acid cleavage structure in a site-specific manner. The structure-specific nuclease activity of a variety of enzymes is used to cleave the target-dependent cleavage structure, thereby indicating the presence of specific nucleic acid sequences or specific variations thereof. The present invention further relates to methods and devices for the separation of nucleic acid molecules based on charge. The present invention also provides methods for the detection of non-target cleavage products via the formation of a complete and activated protein binding region. The invention further provides sensitive and specific methods for the detection of nucleic acid from various viruses in a sample.

  5. Detection of nucleic acids by multiple sequential invasive cleavages 02

    DOEpatents

    Hall, Jeff G.; Lyamichev, Victor I.; Mast, Andrea L.; Brow, Mary Ann D.

    2002-01-01

    The present invention relates to means for the detection and characterization of nucleic acid sequences, as well as variations in nucleic acid sequences. The present invention also relates to methods for forming a nucleic acid cleavage structure on a target sequence and cleaving the nucleic acid cleavage structure in a site-specific manner. The structure-specific nuclease activity of a variety of enzymes is used to cleave the target-dependent cleavage structure, thereby indicating the presence of specific nucleic acid sequences or specific variations thereof. The present invention further relates to methods and devices for the separation of nucleic acid molecules based on charge. The present invention also provides methods for the detection of non-target cleavage products via the formation of a complete and activated protein binding region. The invention further provides sensitive and specific methods for the detection of human cytomegalovirus nucleic acid in a sample.

  6. Detection of nucleic acids by multiple sequential invasive cleavages

    DOEpatents

    Hall, Jeff G; Lyamichev, Victor I; Mast, Andrea L; Brow, Mary Ann D

    2012-10-16

    The present invention relates to means for the detection and characterization of nucleic acid sequences, as well as variations in nucleic acid sequences. The present invention also relates to methods for forming a nucleic acid cleavage structure on a target sequence and cleaving the nucleic acid cleavage structure in a site-specific manner. The structure-specific nuclease activity of a variety of enzymes is used to cleave the target-dependent cleavage structure, thereby indicating the presence of specific nucleic acid sequences or specific variations thereof. The present invention further relates to methods and devices for the separation of nucleic acid molecules based on charge. The present invention also provides methods for the detection of non-target cleavage products via the formation of a complete and activated protein binding region. The invention further provides sensitive and specific methods for the detection of human cytomegalovirus nucleic acid in a sample.

  7. WEB-server for search of a periodicity in amino acid and nucleotide sequences

    NASA Astrophysics Data System (ADS)

    Frenkel, F. E.; Skryabin, K. G.; Korotkov, E. V.

    2017-12-01

    A new web server (http://victoria.biengi.ac.ru/splinter/login.php) was designed and developed to search for periodicity in nucleotide and amino acid sequences. The web server is based on a new mathematical method of searching for multiple alignments, which relies on optimization of position weight matrices and on two-dimensional dynamic programming. This approach allows the construction of multiple alignments of weakly similar amino acid and nucleotide sequences that have accumulated more than 1.5 substitutions per amino acid or nucleotide, without performing pairwise sequence comparisons. The article examines the principles of the web server operation and two examples of studying amino acid and nucleotide sequences, as well as the information that can be obtained using the web server.

  8. Seq2Logo: a method for construction and visualization of amino acid binding motifs and sequence profiles including sequence weighting, pseudo counts and two-sided representation of amino acid enrichment and depletion

    PubMed Central

    Thomsen, Martin Christen Frølund; Nielsen, Morten

    2012-01-01

    Seq2Logo is a web-based sequence logo generator. Sequence logos are a graphical representation of the information content stored in a multiple sequence alignment (MSA) and provide a compact and highly intuitive representation of the position-specific amino acid composition of binding motifs, active sites, etc. in biological sequences. Accurate generation of sequence logos is often compromised by sequence redundancy and low number of observations. Moreover, most methods available for sequence logo generation focus on displaying the position-specific enrichment of amino acids, discarding the equally valuable information related to amino acid depletion. Seq2logo aims at resolving these issues allowing the user to include sequence weighting to correct for data redundancy, pseudo counts to correct for low number of observations and different logotype representations each capturing different aspects related to amino acid enrichment and depletion. Besides allowing input in the format of peptides and MSA, Seq2Logo accepts input as Blast sequence profiles, providing easy access for non-expert end-users to characterize and identify functionally conserved/variable amino acids in any given protein of interest. The output from the server is a sequence logo and a PSSM. Seq2Logo is available at http://www.cbs.dtu.dk/biotools/Seq2Logo (14 May 2012, date last accessed). PMID:22638583

  9. WebLogo

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Crooks, Gavin E.

    WebLogo is a web-based application designed to make the generation of sequence logos as easy and painless as possible. Sequence logos are a graphical representation of an amino acid or nucleic acid multiple sequence alignment developed by Tom Schneider and Mike Stephens. Each logo consists of stacks of symbols, one stack for each position in the sequence. The overall height of the stack indicates the sequence conservation at that position, while the height of symbols within the stack indicates the relative frequency of each amino or nucleic acid at that position. In general, a sequence logo provides a richer and more precise description of, for example, a binding site, than would a consensus sequence.
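
    The stack and symbol heights described above can be computed along the following lines. This is a minimal sketch, not WebLogo's code: it assumes the usual information-content measure (maximum entropy minus observed Shannon entropy, in bits), omits any small-sample correction, and treats every character in a column, gaps included, as a symbol. The function name `logo_heights` is hypothetical.

```python
from collections import Counter
from math import log2


def logo_heights(alignment, alphabet_size=4):
    """Return, for each alignment column, a dict of symbol -> height in bits.

    Assumptions: all sequences have equal length; alphabet_size is 4 for
    nucleic acids and 20 for proteins; no small-sample correction.
    """
    heights = []
    for column in zip(*alignment):
        counts = Counter(column)
        total = sum(counts.values())
        freqs = {sym: n / total for sym, n in counts.items()}
        entropy = -sum(p * log2(p) for p in freqs.values())
        # Stack height: conservation of the column.
        stack_height = log2(alphabet_size) - entropy
        # Symbol height: relative frequency times the stack height.
        heights.append({sym: p * stack_height for sym, p in freqs.items()})
    return heights
```

    A fully conserved DNA column yields a single symbol 2 bits tall, while a column with all four bases equally frequent yields a stack of zero height.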

  10. Amino acid and nucleotide recurrence in aligned sequences: synonymous substitution patterns in association with global and local base compositions.

    PubMed

    Nishizawa, M; Nishizawa, K

    2000-10-01

    The tendency for repetitiveness of nucleotides in DNA sequences has been reported for a variety of organisms. We show that the tendency for repetitive use of amino acids is widespread and is observed even for segments conserved between human and Drosophila melanogaster at the level of >50% amino acid identity. This indicates that repetitiveness influences not only the weakly constrained segments but also those sequence segments conserved among phyla. Not only glutamine (Q) but also many of the 20 amino acids show a comparable level of repetitiveness. Repetitiveness in bases at codon position 3 is stronger for human than for D.melanogaster, whereas local repetitiveness in intron sequences is similar between the two organisms. While genes for immune system-specific proteins, but not ancient human genes (i.e. human homologs of Escherichia coli genes), have repetitiveness at codon bases 1 and 2, repetitiveness at codon base 3 for these groups is similar, suggesting that the human genome has at least two mechanisms generating local repetitiveness. Neither amino acid nor nucleotide repetitiveness is observed beyond the exon boundary, denying the possibility that such repetitiveness could mainly stem from natural selection on mRNA or protein sequences. Analyses of mammalian sequence alignments show that while the 'between gene' GC content heterogeneity, which is linked to 'isochores', is a principal factor associated with the bias in substitution patterns in human, 'within gene' heterogeneity in nucleotide composition is also associated with such bias on a more local scale. The relationship amongst the various types of repetitiveness is discussed.

  11. Amino acid and nucleotide recurrence in aligned sequences: synonymous substitution patterns in association with global and local base compositions

    PubMed Central

    Nishizawa, Manami; Nishizawa, Kazuhisa

    2000-01-01

    The tendency for repetitiveness of nucleotides in DNA sequences has been reported for a variety of organisms. We show that the tendency for repetitive use of amino acids is widespread and is observed even for segments conserved between human and Drosophila melanogaster at the level of >50% amino acid identity. This indicates that repetitiveness influences not only the weakly constrained segments but also those sequence segments conserved among phyla. Not only glutamine (Q) but also many of the 20 amino acids show a comparable level of repetitiveness. Repetitiveness in bases at codon position 3 is stronger for human than for D.melanogaster, whereas local repetitiveness in intron sequences is similar between the two organisms. While genes for immune system-specific proteins, but not ancient human genes (i.e. human homologs of Escherichia coli genes), have repetitiveness at codon bases 1 and 2, repetitiveness at codon base 3 for these groups is similar, suggesting that the human genome has at least two mechanisms generating local repetitiveness. Neither amino acid nor nucleotide repetitiveness is observed beyond the exon boundary, denying the possibility that such repetitiveness could mainly stem from natural selection on mRNA or protein sequences. Analyses of mammalian sequence alignments show that while the ‘between gene’ GC content heterogeneity, which is linked to ‘isochores’, is a principal factor associated with the bias in substitution patterns in human, ‘within gene’ heterogeneity in nucleotide composition is also associated with such bias on a more local scale. The relationship amongst the various types of repetitiveness is discussed. PMID:11000273

  12. Arrays of nucleic acid probes on biological chips

    DOEpatents

    Chee, Mark; Cronin, Maureen T.; Fodor, Stephen P. A.; Huang, Xiaohua X.; Hubbell, Earl A.; Lipshutz, Robert J.; Lobban, Peter E.; Morris, MacDonald S.; Sheldon, Edward L.

    1998-11-17

    DNA chips containing arrays of oligonucleotide probes can be used to determine whether a target nucleic acid has a nucleotide sequence identical to or different from a specific reference sequence. The array of probes comprises probes exactly complementary to the reference sequence, as well as probes that differ by one or more bases from the exactly complementary probes.
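
    A probe set of the kind described above can be sketched as follows. This is an illustration only: it assumes fixed-length probes tiled one base apart along the reference, and mismatch probes that differ from the exactly complementary probe at the central position only. The patent's actual array designs are not specified in this abstract, and the names `reverse_complement` and `probe_set` are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (uppercase ACGT)."""
    return seq.translate(COMPLEMENT)[::-1]


def probe_set(reference, length=5):
    """Return (start, perfect_probe, mismatch_probes) for each window.

    Assumption: mismatch probes substitute only the central base.
    """
    middle = length // 2
    probes = []
    for start in range(len(reference) - length + 1):
        perfect = reverse_complement(reference[start:start + length])
        mismatches = [
            perfect[:middle] + base + perfect[middle + 1:]
            for base in "ACGT"
            if base != perfect[middle]
        ]
        probes.append((start, perfect, mismatches))
    return probes
```

    Comparing hybridization to the perfect probe against hybridization to its three mismatch probes is one way to tell whether the target matches the reference at that position.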

  13. Nucleotide sequence of the phosphoglycerate kinase gene from the extreme thermophile Thermus thermophilus. Comparison of the deduced amino acid sequence with that of the mesophilic yeast phosphoglycerate kinase.

    PubMed Central

    Bowen, D; Littlechild, J A; Fothergill, J E; Watson, H C; Hall, L

    1988-01-01

    Using oligonucleotide probes derived from amino acid sequencing information, the structural gene for phosphoglycerate kinase from the extreme thermophile, Thermus thermophilus, was cloned in Escherichia coli and its complete nucleotide sequence determined. The gene consists of an open reading frame corresponding to a protein of 390 amino acid residues (calculated Mr 41,791) with an extreme bias for G or C (93.1%) in the codon third base position. Comparison of the deduced amino acid sequence with that of the corresponding mesophilic yeast enzyme indicated a number of significant differences. These are discussed in terms of the unusual codon bias and their possible role in enhanced protein thermal stability. PMID:3052437
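
    The third-position G or C bias quoted above is a simple fraction over codons. A minimal sketch of that calculation, assuming the input is an in-frame coding sequence starting at the first base of the first codon; the function name `gc3_fraction` is hypothetical:

```python
def gc3_fraction(coding_sequence):
    """Return the fraction of complete codons whose third base is G or C."""
    seq = coding_sequence.upper()
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    third_bases = seq[2:usable:3]
    if not third_bases:
        raise ValueError("need at least one complete codon")
    return sum(base in "GC" for base in third_bases) / len(third_bases)
```

    For the three codons ATG, GCC and AAA, the third bases are G, C and A, so the function returns 2/3.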

  14. Complete cDNA sequence and amino acid analysis of a bovine ribonuclease K6 gene.

    PubMed

    Pietrowski, D; Förster, M

    2000-01-01

    The complete cDNA sequence of a ribonuclease k6 gene of Bos taurus has been determined. It codes for a protein with 154 amino acids and contains the invariant cysteine, histidine and lysine residues as well as the characteristic motifs specific to ribonuclease active sites. The deduced protein sequence is 27 residues longer than other known ribonucleases k6 and shows amino acid exchanges which could reflect a strain specificity or polymorphism within the bovine genome. Based on sequence similarity we have termed the identified gene bovine ribonuclease k6 b (brk6b).

  15. Diagnostics based on nucleic acid sequence variant profiling: PCR, hybridization, and NGS approaches.

    PubMed

    Khodakov, Dmitriy; Wang, Chunyan; Zhang, David Yu

    2016-10-01

    Nucleic acid sequence variations have been implicated in many diseases, and reliable detection and quantitation of DNA/RNA biomarkers can inform effective therapeutic action, enabling precision medicine. Nucleic acid analysis technologies being translated into the clinic can broadly be classified into hybridization, PCR, and sequencing, as well as their combinations. Here we review the molecular mechanisms of popular commercial assays, and their progress in translation into in vitro diagnostics. Copyright © 2016 The Authors. Published by Elsevier B.V. All rights reserved.

  16. The third annual BRDS on research and development of nucleic acid-based nanomedicines

    PubMed Central

    Chaudhary, Amit Kumar

    2017-01-01

    The completion of the Human Genome Project, the decrease in sequencing cost, and the correlation of genome sequencing data with specific diseases led to an exponential rise in nucleic acid-based therapeutic approaches. In the third annual Biopharmaceutical Research and Development Symposium (BRDS) held at the Center for Drug Discovery and Lozier Center for Pharmacy Sciences and Education at the University of Nebraska Medical Center (UNMC), we highlighted the remarkable features of nucleic acid-based nanomedicines, their significance, NIH funding opportunities for nanomedicine and gene therapy research, challenges and opportunities in the clinical translation of nucleic acids into therapeutics, and the role of intellectual property (IP) in drug discovery and development. PMID:27848223

  17. DNA and RNA sequencing by nanoscale reading through programmable electrophoresis and nanoelectrode-gated tunneling and dielectric detection

    DOEpatents

    Lee, James W.; Thundat, Thomas G.

    2005-06-14

    An apparatus and method for performing nucleic acid (DNA and/or RNA) sequencing on a single molecule. The genetic sequence information is obtained by probing through a DNA or RNA molecule base by base at nanometer scale as though looking through a strip of movie film. This DNA sequencing nanotechnology has the theoretical capability of performing DNA sequencing at a maximal rate of about 1,000,000 bases per second. This enhanced performance is made possible by a series of innovations including: novel applications of a fine-tuned nanometer gap for passage of a single DNA or RNA molecule; thin layer microfluidics for sample loading and delivery; and programmable electric fields for precise control of DNA or RNA movement. Detection methods include nanoelectrode-gated tunneling current measurements, dielectric molecular characterization, and atomic force microscopy/electrostatic force microscopy (AFM/EFM) probing for nanoscale reading of the nucleic acid sequences.

  18. Regulation of Nutrient Transport in Quiescent, Lactating, and Neoplastic Mammary Epithelia

    DTIC Science & Technology

    1998-10-01

    collected and solubilized with 1.25% dodecyl maltoside in the presence of 6-aminocaproic acid. After a 30-minute 13000 rpm centrifugation at 4°C, the... acids. Hydropathy plots based on amino acid sequences predicted from cDNA sequence suggest that all share a common topology, which includes... acid intracellular loop midway through the transporter. There is a striking degree of homology among these isoforms, which are 50-65% identical in

  19. Sequence-based screening for self-sufficient P450 monooxygenase from a metagenome library.

    PubMed

    Kim, B S; Kim, S Y; Park, J; Park, W; Hwang, K Y; Yoon, Y J; Oh, W K; Kim, B Y; Ahn, J S

    2007-05-01

    Cytochrome P450 monooxygenases (CYPs) are useful catalysts for oxidation reactions. Self-sufficient CYPs harbour a reductive domain covalently connected to a P450 domain and are known for their robust catalytic activity with great potential as biocatalysts. In an effort to expand genetic sources of self-sufficient CYPs, we devised a sequence-based screening system to identify them in a soil metagenome. We constructed a soil metagenome library and performed sequence-based screening for self-sufficient CYP genes. A new CYP gene, syk181, was identified from the metagenome library. Phylogenetic analysis revealed that SYK181 formed a distinct phylogenic line with 46% amino-acid-sequence identity to CYP102A1 which has been extensively studied as a fatty acid hydroxylase. The heterologously expressed SYK181 showed significant hydroxylase activity towards naphthalene and phenanthrene as well as towards fatty acids. Sequence-based screening of metagenome libraries is expected to be a useful approach for searching self-sufficient CYP genes. The translated product of syk181 shows self-sufficient hydroxylase activity towards fatty acids and aromatic compounds. SYK181 is the first self-sufficient CYP obtained directly from a metagenome library. The genetic and biochemical information on SYK181 are expected to be helpful for engineering self-sufficient CYPs with broader catalytic activities towards various substrates, which would be useful for bioconversion of natural products and biodegradation of organic chemicals.

  20. Synthesis and evaluations of an acid-cleavable, fluorescently labeled nucleotide as a reversible terminator for DNA sequencing.

    PubMed

    Tan, Lianjiang; Liu, Yazhi; Li, Xiaowei; Wu, Xin-Yan; Gong, Bing; Shen, Yu-Mei; Shao, Zhifeng

    2016-02-11

    An acid-cleavable linker based on a dimethylketal moiety was synthesized and used to connect a nucleotide with a fluorophore to produce a 3'-OH unblocked nucleotide analogue as an excellent reversible terminator for DNA sequencing by synthesis.

  1. The cDNA sequence of mouse Pgp-1 and homology to human CD44 cell surface antigen and proteoglycan core/link proteins.

    PubMed

    Wolffe, E J; Gause, W C; Pelfrey, C M; Holland, S M; Steinberg, A D; August, J T

    1990-01-05

    We describe the isolation and sequencing of a cDNA encoding mouse Pgp-1. An oligonucleotide probe corresponding to the NH2-terminal sequence of the purified protein was synthesized by the polymerase chain reaction and used to screen a mouse macrophage lambda gt11 library. A cDNA clone with an insert of 1.2 kilobases was selected and sequenced. In Northern blot analysis, only cells expressing Pgp-1 contained mRNA species that hybridized with this Pgp-1 cDNA. The nucleotide sequence of the cDNA has a single open reading frame that yields a protein-coding sequence of 1076 base pairs followed by a 132-base pair 3'-untranslated sequence that includes a putative polyadenylation signal but no poly(A) tail. The translated sequence comprises a 13-amino acid signal peptide followed by a polypeptide core of 345 residues corresponding to an Mr of 37,800. Portions of the deduced amino acid sequence were identical to those obtained by amino acid sequence analysis from the purified glycoprotein, confirming that the cDNA encodes Pgp-1. The predicted structure of Pgp-1 includes an NH2-terminal extracellular domain (residues 14-265), a transmembrane domain (residues 266-286), and a cytoplasmic tail (residues 287-358). Portions of the mouse Pgp-1 sequence are highly similar to that of the human CD44 cell surface glycoprotein implicated in cell adhesion. The protein also shows sequence similarity to the proteoglycan tandem repeat sequences found in cartilage link protein and cartilage proteoglycan core protein which are thought to be involved in binding to hyaluronic acid.

  2. The bglA Gene of Aspergillus kawachii Encodes Both Extracellular and Cell Wall-Bound β-Glucosidases

    PubMed Central

    Iwashita, Kazuhiro; Nagahara, Tatsuya; Kimura, Hitoshi; Takano, Makoto; Shimoi, Hitoshi; Ito, Kiyoshi

    1999-01-01

    We cloned the genomic DNA and cDNA of bglA, which encodes β-glucosidase in Aspergillus kawachii, based on a partial amino acid sequence of purified cell wall-bound β-glucosidase CB-1. The nucleotide sequence of the cloned bglA gene revealed a 2,933-bp open reading frame with six introns that encodes an 860-amino-acid protein. Based on the deduced amino acid sequence, we concluded that the bglA gene encodes cell wall-bound β-glucosidase CB-1. The amino acid sequence exhibited high levels of homology with the amino acid sequences of fungal β-glucosidases classified in subfamily B. We expressed the bglA cDNA in Saccharomyces cerevisiae and detected the recombinant β-glucosidase in the periplasm fraction of the recombinant yeast. A. kawachii can produce two extracellular β-glucosidases (EX-1 and EX-2) in addition to the cell wall-bound β-glucosidase. A. kawachii in which the bglA gene was disrupted produced none of the three β-glucosidases, as determined by enzyme assays and a Western blot analysis. Thus, we concluded that the bglA gene encodes both extracellular and cell wall-bound β-glucosidases in A. kawachii. PMID:10584016

  3. Crimean-Congo Hemorrhagic Fever

    DTIC Science & Technology

    2004-01-01

    aminocaproic acid were also indicated. Much emphasis was also placed on preventing reinfection, including the necessity of removing blood crusts from...The sequence is approximately 60% identical both at the nucleotide and amino acid levels to the L segment of Dugbe virus, the only other Nairovirus...However, more recent data based on nucleic acid sequence analysis have revealed extensive genetic diversity. The first published CCHFV sequence

  4. Predicting residue-wise contact orders in proteins by support vector regression.

    PubMed

    Song, Jiangning; Burrage, Kevin

    2006-10-03

    The residue-wise contact order (RWCO) describes the sequence separations between the residue of interest and its contacting residues in a protein sequence. It is a new kind of one-dimensional protein structure that represents the extent of long-range contacts and is considered a generalization of contact order. Together with secondary structure, accessible surface area, the B factor, and contact number, RWCO provides comprehensive and indispensable information for reconstructing the protein three-dimensional structure from a set of one-dimensional structural properties. Accurately predicting RWCO values could have many important applications in protein three-dimensional structure prediction and protein folding rate prediction, and give deep insights into protein sequence-structure relationships. We developed a novel approach to predict residue-wise contact order values in proteins based on support vector regression (SVR), starting from primary amino acid sequences. We explored seven different sequence encoding schemes to examine their effects on the prediction performance, including local sequence in the form of PSI-BLAST profiles, local sequence plus amino acid composition, local sequence plus molecular weight, local sequence plus secondary structure predicted by PSIPRED, local sequence plus molecular weight and amino acid composition, local sequence plus molecular weight and predicted secondary structure, and local sequence plus molecular weight, amino acid composition and predicted secondary structure. When using local sequences with multiple sequence alignments in the form of PSI-BLAST profiles, we could predict the RWCO distribution with a Pearson correlation coefficient (CC) between the predicted and observed RWCO values of 0.55, and root mean square error (RMSE) of 0.82, based on a well-defined dataset with 680 protein sequences. Moreover, by incorporating global features such as molecular weight and amino acid composition we could further improve the prediction performance with the CC to 0.57 and an RMSE of 0.79. In addition, combining the predicted secondary structure by PSIPRED was found to significantly improve the prediction performance and could yield the best prediction accuracy with a CC of 0.60 and RMSE of 0.78, which provided at least comparable performance compared with the other existing methods. The SVR method shows a prediction performance competitive with or at least comparable to the previously developed linear regression-based methods for predicting RWCO values. In contrast to support vector classification (SVC), SVR is very good at estimating the raw value profiles of the samples. The successful application of the SVR approach in this study reinforces the fact that support vector regression is a powerful tool in extracting the protein sequence-structure relationship and in estimating the protein structural profiles from amino acid sequences.
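
    The two evaluation measures reported above, the Pearson correlation coefficient (CC) and the root mean square error (RMSE), can be sketched in plain Python as follows. The function names and the toy values are assumptions for illustration; they are not the study's data or code.

```python
import math


def pearson_cc(predicted: list[float], observed: list[float]) -> float:
    """Pearson correlation coefficient between two equal-length value lists."""
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_o = sum(observed) / n
    cov = sum((p - mean_p) * (o - mean_o) for p, o in zip(predicted, observed))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_o = sum((o - mean_o) ** 2 for o in observed)
    return cov / math.sqrt(var_p * var_o)


def rmse(predicted: list[float], observed: list[float]) -> float:
    """Root mean square error between predicted and observed values."""
    squared = sum((p - o) ** 2 for p, o in zip(predicted, observed))
    return math.sqrt(squared / len(predicted))


if __name__ == "__main__":
    # Toy per-residue values (hypothetical, not from the paper).
    predicted = [1.0, 2.0, 3.0, 4.0]
    observed = [1.5, 1.5, 3.5, 3.5]
    print(pearson_cc(predicted, observed), rmse(predicted, observed))
```

    A higher CC and a lower RMSE both indicate predictions closer to the observed RWCO profile, which is how the encoding schemes in the abstract are ranked.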

  5. Nucleotide sequence analysis establishes the role of endogenous murine leukemia virus DNA segments in formation of recombinant mink cell focus-forming murine leukemia viruses.

    PubMed Central

    Khan, A S

    1984-01-01

    The sequence of 363 nucleotides near the 3' end of the pol gene and 564 nucleotides from the 5' terminus of the env gene in an endogenous murine leukemia viral (MuLV) DNA segment, cloned from AKR/J mouse DNA and designated as A-12, was obtained. For comparison, the nucleotide sequence in an analogous portion of AKR mink cell focus-forming (MCF) 247 MuLV provirus was also determined. Sequence features unique to MCF247 MuLV DNA in the 3' pol and 5' env regions were identified by comparison with nucleotide sequences in analogous regions of NFS-Th-1 xenotropic and AKR ecotropic MuLV proviruses. These included (i) an insertion of 12 base pairs encoding four amino acids located 60 base pairs from the 3' terminus of the pol gene and immediately preceding the env gene, (ii) the deletion of 12 base pairs (encoding four amino acids) and the insertion of 3 base pairs (encoding one amino acid) in the 5' portion of the env gene, and (iii) single base substitutions resulting in 2 MCF247-specific amino acids in the 3' pol and 23 in the 5' env regions. Nucleotide sequence comparison involving the 3' pol and 5' env regions of AKR MCF247, NFS xenotropic, and AKR ecotropic MuLV proviruses with the cloned endogenous MuLV DNA indicated that MCF247 proviral DNA sequences were conserved in the cloned endogenous MuLV proviral segment. In fact, total nucleotide sequence identity existed between the endogenous MuLV DNA and the MCF247 MuLV provirus in the 3' portion of the pol gene. In the 5' env region, only 4 of 564 nucleotides were different, resulting in three amino acid changes between AKR MCF247 MuLV DNA and the endogenous MuLV DNA present in clone A-12. In addition, nucleotide sequence comparison indicated that Moloney- and Friend-MCF MuLVs were also highly related in the 3' pol and 5' env regions to the cloned endogenous MuLV DNA. These results establish the role of endogenous MuLV DNA segments in generation of recombinant MCF viruses. PMID:6328017
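
    The "4 of 564 nucleotides" figure above corresponds to a percent identity of roughly 99.3%. A minimal sketch of that calculation is shown below, assuming the two sequences are already aligned and of equal length; the function name and the toy strings are illustrative only.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of positions with identical characters in two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


# Identity implied by 4 mismatches in the 564-nucleotide 5' env region.
env_identity = 100.0 * (564 - 4) / 564

if __name__ == "__main__":
    print(round(env_identity, 2))            # about 99.29
    print(percent_identity("ACGT", "ACGA"))  # toy aligned pair
```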

  6. Two-level QSAR network (2L-QSAR) for peptide inhibitor design based on amino acid properties and sequence positions.

    PubMed

    Du, Q S; Ma, Y; Xie, N Z; Huang, R B

    2014-01-01

    In the design of peptide inhibitors the huge possible variety of the peptide sequences is of high concern. In collaboration with the fast accumulation of the peptide experimental data and database, a statistical method is suggested for peptide inhibitor design. In the two-level peptide prediction network (2L-QSAR) one level is the physicochemical properties of amino acids and the other level is the peptide sequence position. The activity contributions of amino acids are the functions of physicochemical properties and the sequence positions. In the prediction equation two weight coefficient sets {ak} and {bl} are assigned to the physicochemical properties and to the sequence positions, respectively. After the two coefficient sets are optimized based on the experimental data of known peptide inhibitors using the iterative double least square (IDLS) procedure, the coefficients are used to evaluate the bioactivities of new designed peptide inhibitors. The two-level prediction network can be applied to the peptide inhibitor design that may aim for different target proteins, or different positions of a protein. A notable advantage of the two-level statistical algorithm is that there is no need for host protein structural information. It may also provide useful insight into the amino acid properties and the roles of sequence positions.
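
    One way to read the two-level prediction equation is as a sum over sequence positions, where each residue's property-weighted contribution is scaled by a position weight. The sketch below assumes that bilinear form; the abstract does not give the exact equation, so the form, the function name, and the property values are all hypothetical, and the IDLS fitting step is not shown.

```python
def predict_activity(
    peptide: str,
    properties: dict[str, list[float]],
    a: list[float],
    b: list[float],
) -> float:
    """Assumed bilinear form: sum over positions l of
    b[l] * sum over properties k of a[k] * property_k(residue at l)."""
    if len(peptide) != len(b):
        raise ValueError("need one position weight per residue")
    total = 0.0
    for position, residue in enumerate(peptide):
        # Level 1: physicochemical properties weighted by {a_k}.
        contribution = sum(ak * pk for ak, pk in zip(a, properties[residue]))
        # Level 2: sequence position weighted by {b_l}.
        total += b[position] * contribution
    return total


if __name__ == "__main__":
    # Hypothetical two-property scale for two residues (not real values).
    toy_properties = {"A": [1.0, 0.0], "K": [0.0, 1.0]}
    print(predict_activity("AK", toy_properties, a=[2.0, 3.0], b=[1.0, 0.5]))
```

    Because the model is linear in {ak} when {bl} is fixed and vice versa, alternating least-squares fits of the two sets are a natural match for this form, which is consistent with the iterative double least square procedure named in the abstract.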

  7. Nucleotide sequence of a complementary DNA encoding pea cytosolic copper/zinc superoxide dismutase. [Pisum sativum L

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    White, D.A.; Zilinskas, B.A.

    1991-08-01

    The authors now report the nucleotide sequence of the cytosolic Cu/Zn SOD cloned from a λgt11 cDNA library constructed from mRNA extracted from leaves of 7- to 10-d pea seedlings (Pisum sativum L.). The clone was isolated using a 22-base synthetic oligonucleotide complementary to the amino acid sequence CGIIGLQG. This sequence, found at the protein's carboxy terminus, is highly conserved among plant cytosolic Cu/Zn SODs but not chloroplastic Cu/Zn SODs. The 738-base pair sequence contains an open reading frame specifying 152 codons and a predicted Mr of 18,024 D. The deduced amino acid sequence is highly homologous (79-82% identity) with the sequences of other known plant cytosolic Cu/Zn SODs but less highly conserved (63-65%) when compared with several chloroplastic Cu/Zn SODs including pea (10).

  8. Regulation of Glucose Transport in Quiescent, Lactating, and Neoplastic Mammary Epithelia

    DTIC Science & Technology

    1998-10-01

    17000g pellet iodixanol density gradient was collected and solubilized with 1.25% dodecyl maltoside in the presence of 6-aminocaproic acid. After a...regulatory properties, tissue distributions, and kinetics. However, they are all integral membrane proteins containing approximately 500 amino acids...Hydropathy plots based on amino acid sequences predicted from cDNA sequence suggest that all share a common topology, which includes cytoplasmic N- and C

  9. A novel HLA-B allele, B*5214, detected in a Taiwanese volunteer bone marrow donor using a sequence-based typing method.

    PubMed

    Chen, M J; Chu, C C; Shyr, M H; Lin, C L; Lin, P Y; Yang, K L

    2010-02-01

    HLA-B*5214, a novel rare variant allele of HLA-B*52, was found in a Taiwanese volunteer bone marrow donor by a sequence-based typing method. The sequence of B*5214 is identical to that of B*520101 in exon 2 but differs from B*520101 in exon 3 at nucleotide positions 419 A-->T and 435 A-->G. Alteration of these two nucleotides resulted in an amino acid substitution at amino acid residue 116 Y-->F (TAC-->TTC) and a silent exchange at residue 121 K-->K (AAA-->AAG).
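
    The two codon changes reported above can be checked against the standard genetic code. The sketch below builds the standard codon table and labels a codon change as silent, missense or nonsense; the function and variable names are assumptions for illustration.

```python
# Standard genetic code, bases ordered T, C, A, G at each codon position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def classify_codon_change(ref_codon: str, alt_codon: str) -> tuple[str, str, str]:
    """Return (reference amino acid, variant amino acid, effect label)."""
    ref_aa = CODON_TABLE[ref_codon.upper()]
    alt_aa = CODON_TABLE[alt_codon.upper()]
    if ref_aa == alt_aa:
        effect = "silent"
    elif alt_aa == "*":
        effect = "nonsense"
    else:
        effect = "missense"
    return ref_aa, alt_aa, effect


if __name__ == "__main__":
    print(classify_codon_change("TAC", "TTC"))  # residue 116 in the abstract
    print(classify_codon_change("AAA", "AAG"))  # residue 121 in the abstract
```

    Under the standard code, TAC to TTC changes tyrosine to phenylalanine, while AAA and AAG both encode lysine, in agreement with the Y-->F and K-->K assignments in the abstract.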

  10. Sequence search on a supercomputer.

    PubMed

    Gotoh, O; Tagashira, Y

    1986-01-10

    A set of programs was developed for searching nucleic acid and protein sequence data bases for sequences similar to a given sequence. The programs, written in FORTRAN 77, were optimized for vector processing on a Hitachi S810-20 supercomputer. A search of a 500-residue protein sequence against the entire PIR data base Ver. 1.0 (1) (0.5 M residues) is carried out in a CPU time of 45 sec. About 4 min is required for an exhaustive search of a 1500-base nucleotide sequence against all mammalian sequences (1.2M bases) in Genbank Ver. 29.0. The CPU time is reduced to about a quarter with a faster version.
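
    A rough throughput estimate can be derived from the timings quoted above. The sketch assumes, which the abstract does not state, that every query residue is compared against every database residue, and it treats "about 4 min" as exactly 240 seconds; the function name is illustrative.

```python
def comparisons_per_second(query_len: int, database_len: int, seconds: float) -> float:
    """Residue-pair comparisons per second, assuming an all-against-all comparison."""
    return query_len * database_len / seconds


# 500-residue protein query against 0.5 M residues in 45 s of CPU time.
protein_rate = comparisons_per_second(500, 500_000, 45)

# 1500-base nucleotide query against 1.2 M bases in roughly 4 min.
nucleotide_rate = comparisons_per_second(1500, 1_200_000, 4 * 60)

if __name__ == "__main__":
    print(f"{protein_rate:.3g} protein residue pairs per second")
    print(f"{nucleotide_rate:.3g} nucleotide pairs per second")
```

    Under these assumptions both searches work out to a few million residue-pair comparisons per second.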

  11. Ultraselective electrochemiluminescence biosensor based on locked nucleic acid modified toehold-mediated strand displacement reaction and junction-probe.

    PubMed

    Zhang, Xi; Zhang, Jing; Wu, Dongzhi; Liu, Zhijing; Cai, Shuxian; Chen, Mei; Zhao, Yanping; Li, Chunyan; Yang, Huanghao; Chen, Jinghua

    2014-12-07

    Locked nucleic acid (LNA) is applied in toehold-mediated strand displacement reaction (TMSDR) to develop a junction-probe electrochemiluminescence (ECL) biosensor for single-nucleotide polymorphism (SNP) detection in the BRCA1 gene related to breast cancer. More than 65-fold signal difference can be observed with perfectly matched target sequence to single-base mismatched sequence under the same conditions, indicating good selectivity of the ECL biosensor.

  12. Conversion of amino-acid sequence in proteins to classical music: search for auditory patterns

    PubMed Central

    2007-01-01

    We have converted genome-encoded protein sequences into musical notes to reveal auditory patterns without compromising musicality. We derived a reduced range of 13 base notes by pairing similar amino acids and distinguishing them using variations of three-note chords and codon distribution to dictate rhythm. The conversion will help make genomic coding sequences more approachable for the general public, young children, and vision-impaired scientists. PMID:17477882

  13. PipeOnline 2.0: automated EST processing and functional data sorting.

    PubMed

    Ayoubi, Patricia; Jin, Xiaojing; Leite, Saul; Liu, Xianghui; Martajaja, Jeson; Abduraham, Abdurashid; Wan, Qiaolan; Yan, Wei; Misawa, Eduardo; Prade, Rolf A

    2002-11-01

    Expressed sequence tags (ESTs) are generated and deposited in the public domain, as redundant, unannotated, single-pass reactions, with virtually no biological content. PipeOnline automatically analyses and transforms large collections of raw DNA-sequence data from chromatograms or FASTA files by calling the quality of bases, screening and removing vector sequences, assembling and rewriting consensus sequences of redundant input files into a unigene EST data set and finally through translation, amino acid sequence similarity searches, annotation of public databases and functional data. PipeOnline generates an annotated database, retaining the processed unigene sequence, clone/file history, alignments with similar sequences, and proposed functional classification, if available. Functional annotation is automatic and based on a novel method that relies on homology of amino acid sequence multiplicity within GenBank records. Records are examined through a function ordered browser or keyword queries with automated export of results. PipeOnline offers customization for individual projects (MyPipeOnline), automated updating and alert service. PipeOnline is available at http://stress-genomics.org.

  14. Primary structure of prostaglandin G/H synthase from sheep vesicular gland determined from the complementary DNA sequence.

    PubMed Central

    DeWitt, D L; Smith, W L

    1988-01-01

    Prostaglandin G/H synthase (8,11,14-icosatrienoate, hydrogen-donor:oxygen oxidoreductase, EC 1.14.99.1) catalyzes the first step in the formation of prostaglandins and thromboxanes, the conversion of arachidonic acid to prostaglandin endoperoxides G and H. This enzyme is the site of action of nonsteroidal anti-inflammatory drugs. We have isolated a 2.7-kilobase complementary DNA (cDNA) encompassing the entire coding region of prostaglandin G/H synthase from sheep vesicular glands. This cDNA, cloned from a lambda gt 10 library prepared from poly(A)+ RNA of vesicular glands, hybridizes with a single 2.75-kilobase mRNA species. The cDNA clone was selected using oligonucleotide probes modeled from amino acid sequences of tryptic peptides prepared from the purified enzyme. The full-length cDNA encodes a protein of 600 amino acids, including a signal sequence of 24 amino acids. Identification of the cDNA as coding for prostaglandin G/H synthase is based on comparison of amino acid sequences of seven peptides comprising 103 amino acids with the amino acid sequence deduced from the nucleotide sequence of the cDNA. The molecular weight of the unglycosylated enzyme lacking the signal peptide is 65,621. The synthase is a glycoprotein, and there are three potential sites for N-glycosylation, two of them in the amino-terminal half of the molecule. The serine reported to be acetylated by aspirin is at position 530, near the carboxyl terminus. There is no significant similarity between the sequence of the synthase and that of any other protein in amino acid or nucleotide sequence libraries, and a heme binding site(s) is not apparent from the amino acid sequence. The availability of a full-length cDNA clone coding for prostaglandin G/H synthase should facilitate studies of the regulation of expression of this enzyme and the structural features important for catalysis and for interaction with anti-inflammatory drugs. PMID:3125548

  15. First draft genome sequencing of indole acetic acid producing and plant growth promoting fungus Preussia sp. BSL10.

    PubMed

    Khan, Abdul Latif; Asaf, Sajjad; Khan, Abdur Rahim; Al-Harrasi, Ahmed; Al-Rawahi, Ahmed; Lee, In-Jung

    2016-05-10

    Preussia sp. BSL10, family Sporormiaceae, was actively producing phytohormone (indole-3-acetic acid) and extra-cellular enzymes (phosphatases and glucosidases). The fungus was also promoting the growth of the arid-land tree Boswellia sacra. Looking at such prospects of this fungus, we sequenced its draft genome for the first time. The Illumina based sequence analysis reveals an approximate genome size of 31.4Mbp for Preussia sp. BSL10. Based on ab initio gene prediction, a total of 32,312 coding sequences were annotated consisting of 11,967 coding genes, pseudogenes, and 221 tRNA genes. Furthermore, 321 carbohydrate-active enzymes were predicted and classified into many functional families. Copyright © 2016 Elsevier B.V. All rights reserved.

  16. Identification of random nucleic acid sequence aberrations using dual capture probes which hybridize to different chromosome regions

    DOEpatents

    Lucas, J.N.; Straume, T.; Bogen, K.T.

    1998-03-24

    A method is provided for detecting nucleic acid sequence aberrations using two immobilization steps. According to the method, a nucleic acid sequence aberration is detected by detecting nucleic acid sequences having both a first nucleic acid sequence type (e.g., from a first chromosome) and a second nucleic acid sequence type (e.g., from a second chromosome), the presence of the first and the second nucleic acid sequence type on the same nucleic acid sequence indicating the presence of a nucleic acid sequence aberration. In the method, immobilization of a first hybridization probe is used to isolate a first set of nucleic acids in the sample which contain the first nucleic acid sequence type. Immobilization of a second hybridization probe is then used to isolate a second set of nucleic acids from within the first set of nucleic acids which contain the second nucleic acid sequence type. The second set of nucleic acids are then detected, their presence indicating the presence of a nucleic acid sequence aberration. 14 figs.

  17. Identification of random nucleic acid sequence aberrations using dual capture probes which hybridize to different chromosome regions

    DOEpatents

    Lucas, Joe N.; Straume, Tore; Bogen, Kenneth T.

    1998-01-01

    A method is provided for detecting nucleic acid sequence aberrations using two immobilization steps. According to the method, a nucleic acid sequence aberration is detected by detecting nucleic acid sequences having both a first nucleic acid sequence type (e.g., from a first chromosome) and a second nucleic acid sequence type (e.g., from a second chromosome), the presence of the first and the second nucleic acid sequence type on the same nucleic acid sequence indicating the presence of a nucleic acid sequence aberration. In the method, immobilization of a first hybridization probe is used to isolate a first set of nucleic acids in the sample which contain the first nucleic acid sequence type. Immobilization of a second hybridization probe is then used to isolate a second set of nucleic acids from within the first set of nucleic acids which contain the second nucleic acid sequence type. The second set of nucleic acids are then detected, their presence indicating the presence of a nucleic acid sequence aberration.

  18. Cloning of a cDNA encoding 1-aminocyclopropane-1-carboxylate synthase and expression of its mRNA in ripening apple fruit.

    PubMed

    Dong, J G; Kim, W T; Yip, W K; Thompson, G A; Li, L; Bennett, A B; Yang, S F

    1991-08-01

    1-Aminocyclopropane-1-carboxylate (ACC) synthase (EC 4.4.1.14) purified from apple (Malus sylvestris Mill.) fruit was subjected to trypsin digestion. Following separation by reversed-phase high-pressure liquid chromatography, ten tryptic peptides were sequenced. Based on the sequences of three tryptic peptides, three sets of mixed oligonucleotide probes were synthesized and used to screen a plasmid cDNA library prepared from poly(A)(+) RNA of ripe apple fruit. A 1.5-kb (kilobase) cDNA clone which hybridized to all three probes was isolated. The clone contained an open reading frame of 1214 base pairs (bp) encoding a sequence of 404 amino acids. While the polyadenine tail at the 3'-end was intact, it lacked a portion of sequence at the 5'-end. Using the RNA-based polymerase chain reaction, an additional sequence of 148 bp was obtained at the 5'-end. Thus, 1362 bp were sequenced and they encode 454 amino acids. The deduced amino-acid sequence contained peptide sequences corresponding to all ten tryptic fragments, confirming the identity of the cDNA clone. Comparison of the deduced amino-acid sequence between ACC synthase from apple fruit and those from tomato (Lycopersicon esculentum Mill.) and winter squash (Cucurbita maxima Duch.) fruits demonstrated the presence of seven highly conserved regions, including the previously identified region for the active site. The size of the translation product of ACC-synthase mRNA was similar to that of the mature protein on sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE), indicating that apple ACC-synthase undergoes only minor, if any, post-translational proteolytic processing. Analysis of ACC-synthase mRNA by in-vitro translation-immunoprecipitation, and by Northern blotting indicates that the ACC-synthase mRNA was undetectable in unripe fruit, but was accumulated massively during the ripening process. These data demonstrate that the expression of the ACC-synthase gene is developmentally regulated.
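
    The base-pair and amino-acid counts quoted above are linked by the three-bases-per-codon rule. The short check below reproduces that arithmetic; the function name is an assumption, and the figures are taken directly from the abstract.

```python
def codon_count(n_bp: int) -> tuple[int, int]:
    """Return (complete codons, leftover bases) for a stretch of n_bp bases."""
    return divmod(n_bp, 3)


partial_orf_bp = 1214   # open reading frame in the truncated clone
extension_bp = 148      # additional 5' sequence obtained by PCR
full_bp = partial_orf_bp + extension_bp

if __name__ == "__main__":
    print(codon_count(partial_orf_bp))  # complete codons in the partial clone
    print(full_bp, codon_count(full_bp))
```

    The partial clone holds 404 complete codons with two bases left over, and adding the 148 bp extension gives 1362 bp, which divides evenly into 454 codons, consistent with the 404 and 454 amino acids reported.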

  19. Evaluating the efficacy of a structure-derived amino acid substitution matrix in detecting protein homologs by BLAST and PSI-BLAST.

    PubMed

    Goonesekere, Nalin Cw

    2009-01-01

    The large numbers of protein sequences generated by whole genome sequencing projects require rapid and accurate methods of annotation. The detection of homology through computational sequence analysis is a powerful tool in determining the complex evolutionary and functional relationships that exist between proteins. Homology search algorithms employ amino acid substitution matrices to detect similarity between protein sequences. The substitution matrices in common use today are constructed using sequences aligned without reference to protein structure. Here we present amino acid substitution matrices constructed from the alignment of a large number of protein domain structures from the structural classification of proteins (SCOP) database. We show that when incorporated into the homology search algorithms BLAST and PSI-BLAST, the structure-based substitution matrices enhance the efficacy of detecting remote homologs.

  20. Method for identifying and quantifying nucleic acid sequence aberrations

    DOEpatents

    Lucas, Joe N.; Straume, Tore; Bogen, Kenneth T.

    1998-01-01

    A method for detecting nucleic acid sequence aberrations by detecting nucleic acid sequences having both a first and a second nucleic acid sequence type, the presence of the first and second sequence type on the same nucleic acid sequence indicating the presence of a nucleic acid sequence aberration. The method uses a first hybridization probe which includes a nucleic acid sequence that is complementary to a first sequence type and a first complexing agent capable of attaching to a second complexing agent and a second hybridization probe which includes a nucleic acid sequence that selectively hybridizes to the second nucleic acid sequence type over the first sequence type and includes a detectable marker for detecting the second hybridization probe.

  1. Method for identifying and quantifying nucleic acid sequence aberrations

    DOEpatents

    Lucas, J.N.; Straume, T.; Bogen, K.T.

    1998-07-21

    A method is disclosed for detecting nucleic acid sequence aberrations by detecting nucleic acid sequences having both a first and a second nucleic acid sequence type, the presence of the first and second sequence type on the same nucleic acid sequence indicating the presence of a nucleic acid sequence aberration. The method uses a first hybridization probe which includes a nucleic acid sequence that is complementary to a first sequence type and a first complexing agent capable of attaching to a second complexing agent and a second hybridization probe which includes a nucleic acid sequence that selectively hybridizes to the second nucleic acid sequence type over the first sequence type and includes a detectable marker for detecting the second hybridization probe. 11 figs.

  2. Sequence-dependent DNA deformability studied using molecular dynamics simulations.

    PubMed

    Fujii, Satoshi; Kono, Hidetoshi; Takenaka, Shigeori; Go, Nobuhiro; Sarai, Akinori

    2007-01-01

    Proteins recognize specific DNA sequences not only through direct contact between amino acids and bases, but also indirectly based on the sequence-dependent conformation and deformability of the DNA (indirect readout). We used molecular dynamics simulations to analyze the sequence-dependent DNA conformations of all 136 possible tetrameric sequences sandwiched between CGCG sequences. The deformability of dimeric steps obtained by the simulations is consistent with that derived from crystal structures. The simulation results further showed that the conformation and deformability of the tetramers can depend strongly on the flanking base pairs. The xATx tetramers are the most rigid and are not affected by the flanking base pairs, whereas the xYRx tetramers show the greatest flexibility and change their conformations depending on the base pairs at both ends, suggesting that tetramers with the same central dimer can show different deformabilities. These results suggest that analysis of dimeric steps alone may overlook some conformational features of DNA and provide insight into the mechanism of indirect readout during protein-DNA recognition. Moreover, the sequence dependence of DNA conformation and deformability may be used to estimate the contribution of indirect readout to the specificity of protein-DNA recognition as well as nucleosome positioning and large-scale behavior of nucleic acids.
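
    The figure of 136 tetramers is consistent with counting a sequence and its reverse complement as the same double-stranded tetramer. That reading is an assumption on our part (the abstract does not say how the number was obtained), and the short sketch below simply checks the arithmetic; the function names are illustrative.

```python
from itertools import product

COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def unique_duplex_kmers(k):
    """Return the k-mers that are distinct as double-stranded sequences."""
    seen = set()
    for bases in product("ACGT", repeat=k):
        seq = "".join(bases)
        # Keep one canonical representative per strand pair.
        seen.add(min(seq, reverse_complement(seq)))
    return sorted(seen)


if __name__ == "__main__":
    print(len(unique_duplex_kmers(4)))  # 136 unique tetramers
    print(len(unique_duplex_kmers(2)))  # 10 unique dimer steps
```

    Of the 256 single-strand tetramers, 16 equal their own reverse complement, giving (256 - 16) / 2 + 16 = 136.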

  3. Design of nucleic acid sequences for DNA computing based on a thermodynamic approach

    PubMed Central

    Tanaka, Fumiaki; Kameda, Atsushi; Yamamoto, Masahito; Ohuchi, Azuma

    2005-01-01

    We have developed an algorithm for designing multiple sequences of nucleic acids that have a uniform melting temperature between the sequence and its complement and that do not hybridize non-specifically with each other based on the minimum free energy (ΔGmin). Sequences that satisfy these constraints can be utilized in computations, various engineering applications such as microarrays, and nano-fabrications. Our algorithm is a random generate-and-test algorithm: it generates a candidate sequence randomly and tests whether the sequence satisfies the constraints. The novelty of our algorithm is that the filtering method uses a greedy search to calculate ΔGmin. This effectively excludes inappropriate sequences before ΔGmin is calculated, thereby reducing computation time drastically when compared with an algorithm without the filtering. Experimental results in silico showed the superiority of the greedy search over the traditional approach based on the Hamming distance. In addition, experimental results in vitro demonstrated that the experimental free energy (ΔGexp) of 126 sequences correlated better with ΔGmin (|R| = 0.90) than with the Hamming distance (|R| = 0.80). These results validate the rationality of a thermodynamic approach. We implemented our algorithm in a graphical user interface-based program written in Java. PMID:15701762
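
    The generate-and-test loop described above can be sketched as follows. Note that this sketch uses the Hamming distance as the acceptance test, i.e. the traditional baseline that the abstract compares against, not the authors' ΔGmin-based filter, and it omits the melting-temperature constraint entirely. The function names and parameters (`min_distance`, `max_tries`, `seed`) are hypothetical.

```python
import random


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def generate_and_test(n_seqs, length, min_distance, seed=0, max_tries=100000):
    """Randomly generate candidates and keep those far enough from all accepted ones."""
    rng = random.Random(seed)
    accepted = []
    for _ in range(max_tries):
        if len(accepted) == n_seqs:
            break
        candidate = "".join(rng.choice("ACGT") for _ in range(length))
        # Test step: reject the candidate if it is too similar to any accepted sequence.
        if all(hamming(candidate, s) >= min_distance for s in accepted):
            accepted.append(candidate)
    return accepted


if __name__ == "__main__":
    for seq in generate_and_test(n_seqs=5, length=12, min_distance=6):
        print(seq)
```

    A thermodynamic variant would replace the `hamming` check with a free-energy estimate of cross-hybridization, which is where a cheap pre-filter pays off, since that estimate is far more expensive than counting mismatches.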

  4. Identification and characterization of Theileria ovis surface protein (ToSp) resembled TaSp in Theileria annulata.

    PubMed

    Shayan, P; Jafari, S; Fattahi, R; Ebrahimzade, E; Amininia, N; Changizi, E

    2016-05-01

    Ovine theileriosis is an important hemoprotozoal disease of sheep and goats in tropical and subtropical regions that causes high economic losses in the livestock industry. Theileria annulata surface protein (TaSp) was used previously as a tool for serological analysis in livestock. Since the amino acid sequence of TaSp is, at least in part, highly conserved in T. annulata, Theileria lestoquardi and Theileria China I and II, it is very important to determine the amino acid sequence of this protein in Theileria ovis as well, to avoid false interpretation of serological data based on this protein in small animals. In the present study, the nucleotide sequence and amino acid sequence of T. ovis surface protein (ToSp) were determined. The comparison of the nucleotide sequence of ToSp showed 96, 96, 99, and 86 % homology to the corresponding nucleotide sequences of the TaSp genes of T. annulata, T. China I, T. China II and T. lestoquardi, previously registered in GenBank under accession nos. AJ316260.1, AY274329.1, DQ120058.1, and EF092924.1, respectively. The amino acid sequence analysis showed 95, 81, 98 and 70 % homology to the corresponding amino acid sequences of T. annulata, T. China I, T. China II and T. lestoquardi, registered in GenBank under accession nos. CAC87478.1, AAP36993.1, AAZ30365.1 and AAP36999.11, respectively. Interestingly, in contrast to the C terminus, a significant difference in amino acid sequence in the N terminus of the ToSp protein could be determined compared to the other known corresponding TaSp sequences, which makes this region attractive for designing a suitable tool for serological diagnosis.

  5. Development and Evaluation of Novel Real-Time Reverse Transcription-PCR Assays with Locked Nucleic Acid Probes Targeting Leader Sequences of Human-Pathogenic Coronaviruses

    PubMed Central

    Chan, Jasper Fuk-Woo; Choi, Garnet Kwan-Yue; Tsang, Alan Ka-Lun; Tee, Kah-Meng; Lam, Ho-Yin; Yip, Cyril Chik-Yan; To, Kelvin Kai-Wang; Cheng, Vincent Chi-Chung; Yeung, Man-Lung; Lau, Susanna Kar-Pui; Woo, Patrick Chiu-Yat; Chan, Kwok-Hung; Tang, Bone Siu-Fai

    2015-01-01

    Based on findings in small RNA-sequencing (Seq) data analysis, we developed highly sensitive and specific real-time reverse transcription (RT)-PCR assays with locked nucleic acid probes targeting the abundantly expressed leader sequences of Middle East respiratory syndrome coronavirus (MERS-CoV) and other human coronaviruses. Analytical and clinical evaluations showed their noninferiority to a commercial multiplex PCR test for the detection of these coronaviruses. PMID:26019210

  6. Reference System of DNA and Protein Sequences on CD-ROM

    NASA Astrophysics Data System (ADS)

    Nasu, Hisanori; Ito, Toshiaki

    DNASIS-DBREF31 is a database of DNA and protein sequences on optical compact disc (CD-ROM), developed and commercialized by Hitachi Software Engineering Co., Ltd. Both nucleic acid base sequences and protein amino acid sequences can be retrieved from a single CD-ROM. Existing databases are offered as on-line services, floppy disks, or magnetic tape, all of which have problems with usability or storage capacity. DNASIS-DBREF31 adopts CD-ROM as the database medium to provide mass storage and personal use of the database.

  7. [Molecular cloning and characterization of an acetylcholinesterase gene Dd-ace-2 from sweet potato stem nematode Ditylenchus destructor].

    PubMed

    Ding, Zhong; Peng, Deliang; Huang, Wenkun; He, Wenting; Gao, Bida

    2008-02-01

    A cDNA, named Dd-ace-2, encoding an acetylcholinesterase (AChE, EC3.1.1.7), was isolated from the sweet potato stem nematode, Ditylenchus destructor. The nucleotide and amino acid sequences of different nematode species were compared and analyzed with the DNAMAN5.0 and MEGA3.0 software. The results showed that the complete nucleotide sequence of the Dd-ace-2 gene of Ditylenchus destructor contains 2425 base pairs, from which 734 amino acids were deduced (GenBank accession No. EF583058). The homology rates of the Dd-ace-2 amino acid sequence between Ditylenchus destructor and Meloidogyne incognita, Caenorhabditis elegans and Dictyocaulus viviparus were 48.0%, 42.7% and 42.1%, respectively. The mature acetylcholinesterase sequence of Ditylenchus destructor may correspond to the first 701 residues of the deduced 734 amino acids. The conserved motifs involved in the catalytic triad, the choline binding site and 10 aromatic residues lining the catalytic gorge were present in the Dd-ace-2 deduced protein. Phylogenetic analysis based on AChEs of other nematodes and species showed that the deduced AChE formed the same cluster with ACE-2s.

  8. Molecular Recognition and Structural Influences on Function in Bio-nanosystems of Nucleic Acids and Proteins

    NASA Astrophysics Data System (ADS)

    Sethaphong, Latsavongsakda

    This work examines smart material properties of rational self-assembly and molecular recognition found in nano-biosystems. Exploiting the sequence and structural information encoded within nucleic acids and proteins will permit programmed synthesis of nanomaterials and help create molecular machines that may carry out new roles involving chemical catalysis and bioenergy. Responsive to different ionic environments through self-reorganization, nucleic acids (NA) are nature's signature smart material; organisms such as viruses and bacteria use features of NAs to react to their environment and orchestrate their lifecycle. Furthermore, nucleic acid systems (both RNA and DNA) are currently exploited as scaffolds; recent applications have been showcased to build bioelectronics and biotemplated nanostructures via directed assembly of multidimensional nanoelectronic devices 1. Since the most stable and rudimentary structure of nucleic acids is the helical duplex, these were modeled in order to examine the influence of the microenvironment, sequence, and cation-dependent perturbations of their canonical forms. Due to their negatively charged phosphate backbone, NAs rely on counterions to overcome the inherent repulsive forces that arise from the assembly of two complementary strands. As a realistic model system, we chose the HIV-TAR helix (PDB ID: 397D) to study specific sequence motifs on cation sequestration. At physiologically relevant concentrations of sodium and potassium ions, we observed sequence-based effects where purine stretches were adept at retaining high residency cations. The transitional space between adenine and guanosine nucleotides (ApG step) in a sequence proved the most favorable. This work was the first to directly show these subtle interactions of sequence-based cationic sequestration and may be useful for controlling metallization of nucleic acids in conductive nanowires. 
Extending the study further, we explored the degree to which the structure of NA duplexes alone interacted with cations distinct from a specific sequence. Under physiologically relevant conditions, a duplex of RNA polyguanine-polycytidine was highly responsive and able to sequester cations to the middle of the purine stretches. The least responsive structure was a DNA polyadenine-polythymine duplex. A random sequence DNA duplex contorted into an RNA-like helix resulted in cationic dynamics similar to RNA systems. These studies showed that cation diffusive binding events in nucleic acid duplex structures are sequence specific and heavily influenced by structural aspects of helical forms, which account for much of the differences observed. Although structural information in nucleic acids is encoded within their sequence, linking amino acid sequence to protein structure is murkier; the structural information within proteins is encoded by the folding process itself: a complex phenomenon driven toward the equilibrium state of the active conformation. Upwards of two thirds of a protein's sequence can be substituted with similar amino acids without significantly perturbing its function; conserved residues of about 10% seem to be vital; since evolutionary selection pressure in proteins operates three-dimensionally, a linear sequence is only partially informative. We explored this problem by folding de novo the cytosolic portion of the membrane protein, cellulose synthase, CESA1 from upland cotton, Gossypium hirsutum (Ghcesa1). The cytoplasmic region was generated by homology modeling and refined with molecular dynamics. These mutations impair local structural flexibility which likely results in cellulose that is produced at a lower rate and is less crystalline. Additional modeling of fragments of cellulose synthases from the model plant, Arabidopsis thaliana, offered novel insights into the function of conserved cytosolic domains within plant cellulose synthases. 
Transport mechanisms related to the transmembrane region revealed significant differences between plants and a bacterial complex. These studies generated possible mutations that may allow for the creation of new synthases and identified other avenues of research in order to develop technologies that may alter the crystallinity and other useful properties of cellulose. 1. Karplus, K., SAM-T08, HMM-based protein structure prediction. Nucleic Acids Research, 2009. 37: p. W492-W497.

  9. GENETIC-BASED ANALYTICAL METHODS FOR BACTERIA AND FUNGI

    EPA Science Inventory

    In the past two decades, advances in high-throughput sequencing technologies have led to a veritable explosion in the generation of nucleic acid sequence information (1). While these advances are illustrated most prominently by the successful sequencing of the human genome, they...

  10. Molecular cloning of two human liver 3 alpha-hydroxysteroid/dihydrodiol dehydrogenase isoenzymes that are identical with chlordecone reductase and bile-acid binder.

    PubMed Central

    Deyashiki, Y; Ogasawara, A; Nakayama, T; Nakanishi, M; Miyabe, Y; Sato, K; Hara, A

    1994-01-01

    Human liver contains two dihydrodiol dehydrogenases, DD2 and DD4, associated with 3 alpha-hydroxysteroid dehydrogenase activity. We have raised polyclonal antibodies that cross-reacted with the two enzymes and isolated two 1.2 kb cDNA clones (C9 and C11) for the two enzymes from a human liver cDNA library using the antibodies. The clones of C9 and C11 contained coding sequences corresponding to 306 and 321 amino acid residues respectively, but lacked 5'-coding regions around the initiation codon. Sequence analyses of several peptides obtained by enzymic and chemical cleavages of the two purified enzymes verified that the C9 and C11 clones encoded DD2 and DD4 respectively, and further indicated that the sequence of DD2 had at least 16 additional residues upstream of the N-terminal sequence deduced from the cDNA. There was 82% amino acid sequence identity between the two enzymes, indicating that the enzymes are genetic isoenzymes. A computer-based comparison of the cDNAs of the isoenzymes with the DNA sequence database revealed that the nucleotide and amino acid sequences of DD2 and DD4 are virtually identical with those of human bile-acid binder and human chlordecone reductase cDNAs respectively. PMID:8172617

  11. Method for isolating chromosomal DNA in preparation for hybridization in suspension

    DOEpatents

    Lucas, Joe N.

    2000-01-01

    A method is provided for detecting nucleic acid sequence aberrations using two immobilization steps. According to the method, a nucleic acid sequence aberration is detected by detecting nucleic acid sequences having both a first nucleic acid sequence type (e.g., from a first chromosome) and a second nucleic acid sequence type (e.g., from a second chromosome), the presence of the first and the second nucleic acid sequence type on the same nucleic acid sequence indicating the presence of a nucleic acid sequence aberration. In the method, immobilization of a first hybridization probe is used to isolate a first set of nucleic acids in the sample which contain the first nucleic acid sequence type. Immobilization of a second hybridization probe is then used to isolate a second set of nucleic acids from within the first set of nucleic acids which contain the second nucleic acid sequence type. The second set of nucleic acids are then detected, their presence indicating the presence of a nucleic acid sequence aberration. Chromosomal DNA in a sample containing cell debris is prepared for hybridization in suspension by treating the mixture with RNase. The treated DNA can also be fixed prior to hybridization.

  12. Codes in the codons: construction of a codon/amino acid periodic table and a study of the nature of specific nucleic acid-protein interactions.

    PubMed

    Benyo, B; Biro, J C; Benyo, Z

    2004-01-01

    The theory of "codon-amino acid coevolution" was first proposed by Woese in 1967. It suggests that there is a stereochemical matching - that is, affinity - between amino acids and certain of the base triplet sequences that code for those amino acids. We have constructed a common periodic table of codons and amino acids, where the nucleic acid table showed perfect axial symmetry for codons and the corresponding amino acid table also displayed periodicity regarding the biochemical properties (charge and hydrophobicity) of the 20 amino acids and the position of the stop signals. The table indicates that the middle (2nd) base in the codon has a prominent role in determining some of the structural features of the amino acids. The possibility that physical contact between codons and amino acids might exist was tested on restriction enzymes. Many recognition site-like sequences were found in the coding sequences of these enzymes and as many as 73 examples of codon-amino acid co-location were observed in the 7 known 3D structures (December 2003) of endonuclease-nucleic acid complexes. These results indicate that the smallest possible units of specific nucleic acid-protein interaction are indeed the stereochemically compatible codons and amino acids.

  13. Cloning and sequence analysis of Hemonchus contortus HC58cDNA.

    PubMed

    Muleke, Charles I; Ruofeng, Yan; Lixin, Xu; Xinwen, Bo; Xiangrui, Li

    2007-06-01

    The complete coding sequence of Hemonchus contortus HC58cDNA was generated by rapid amplification of cDNA ends and polymerase chain reaction using primers based on the 5' and 3' ends of the parasite mRNA, accession no. AF305964. The HC58cDNA gene was 851 bp long, with an open reading frame of 717 bp encoding a 239-amino-acid precursor of approximately 27 kDa. Analysis of the amino acid sequence revealed conserved residues of cysteine, histidine and asparagine, the occluding loop pattern, the hemoglobinase motif and the glutamine of the oxyanion hole characteristic of cathepsin B-like proteases (CBL). Comparison of the predicted amino acid sequences showed that the protein shared 33.5-58.7% identity with cathepsin B homologues in the papain clan CA family (family C1). Phylogenetic analysis revealed close evolutionary proximity of the protein sequence to counterpart sequences in the CBL, suggesting that HC58cDNA was a member of the papain family.

  14. Studying the evolutionary relationships and phylogenetic trees of 21 groups of tRNA sequences based on complex networks.

    PubMed

    Wei, Fangping; Chen, Bowen

    2012-03-01

    To find out the evolutionary relationships among different tRNA sequences of 21 amino acids, 22 networks are constructed. One is constructed from whole tRNAs, and the other 21 networks are constructed from the tRNAs which carry the same amino acids. A new method is proposed such that the alignment scores of any two amino acid groups are determined by the average degree and the average clustering coefficient of their networks. The anticodon feature of isolated tRNA and the phylogenetic trees of 21 group networks are discussed. We find that some isolated tRNA sequences in 21 networks still connect with other tRNAs outside their group, which reflects the fact that those tRNAs might evolve by intercrossing among these 21 groups. We also find that most anticodons within the same cluster differ by only one base at the same sites when S ≥ 70, and they stay in the same rank in the ladder of evolutionary relationships. These observations seem to agree that some tRNAs might have mutated from the same ancestor sequences through point mutation mechanisms.
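
    The two network statistics named above, average degree and average clustering coefficient, can be computed for a small undirected graph as in the sketch below. The four-node example graph and the function names are hypothetical; how the authors turn these statistics into alignment scores is not given in the abstract and is not attempted here.

```python
def average_degree(adj):
    """Mean number of neighbours per node in an undirected graph."""
    return sum(len(nbrs) for nbrs in adj.values()) / len(adj)


def local_clustering(adj, node):
    """Fraction of a node's neighbour pairs that are themselves connected."""
    nbrs = list(adj[node])
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(
        1
        for i in range(k)
        for j in range(i + 1, k)
        if nbrs[j] in adj[nbrs[i]]
    )
    return 2.0 * links / (k * (k - 1))


def average_clustering(adj):
    """Mean of the local clustering coefficients over all nodes."""
    return sum(local_clustering(adj, n) for n in adj) / len(adj)


# Hypothetical toy network: a triangle (a, b, c) with one pendant node d.
EXAMPLE_GRAPH = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}

if __name__ == "__main__":
    print(average_degree(EXAMPLE_GRAPH))
    print(average_clustering(EXAMPLE_GRAPH))
```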

  15. Molecular cloning and sequence analysis of full-length growth hormone cDNAs from six important economic fishes.

    PubMed

    Zhang, Jing-Nan; Song, Ping; Hu, Jia-Rui; Mo, Sai-Jun; Peng, Mao-Yu; Zhou, Wei; Zou, Ji-Xing; Hu, Yin-Chang

    2005-01-01

    In this study, the full-length cDNAs of the GH (growth hormone) gene were isolated from six important economic fishes, Siniperca kneri, Epinephelus coioides, Monopterus albus, Silurus asotus, Misgurnus anguillicaudatus and Carassius auratus gibelio Bloch. This is the first time these GH sequences have been cloned, except for E. coioides GH. The lengths of the above cDNAs are as follows: 953 bp, 1 023 bp, 825 bp, 1 082 bp, 1 154 bp and 1 180 bp. Each sequence includes an ORF of about 600 bp which encodes a protein of about 200 amino acids: S. kneri, E. coioides and M. albus GHs of 204 amino acids, S. asotus GH of 200 amino acids, M. anguillicaudatus and C. auratus gibelio GHs of 210 amino acids. A detailed sequence analysis of the six GHs together with many other fish sequences was then performed. The six sequences all showed high homology to other sequences, especially to sequences within the same order, and many conserved residues were identified, most localized in five domains. The phylogenetic trees (MP and NJ) of many fish GH ORF sequences (including the new six) with Amia calva as outgroup were generally resolved and largely congruent with the morphology-based tree though some incongruities were observed, suggesting that the GH ORF deserves more attention in teleostean phylogeny.

  16. Sequence signatures of allosteric proteins towards rational design.

    PubMed

    Namboodiri, Saritha; Verma, Chandra; Dhar, Pawan K; Giuliani, Alessandro; Nair, Achuthsankar S

    2010-12-01

    Allostery is the phenomenon of changes in the structure and activity of proteins that appear as a consequence of ligand binding at sites other than the active site. Studying the mechanistic basis of allostery, leading to protein design with predetermined functional endpoints, is an important unmet need of synthetic biology. Here, we screened the amino acid sequence landscape in search of sequence signatures of allostery using the Recurrence Quantitative Analysis (RQA) method. A characteristic vector comprising 10 features extracted from RQA was defined for amino acid sequences. Using Principal Component Analysis, four factors were found to be important determinants of allosteric behavior. Our sequence-based predictor method shows 82.6% accuracy, 85.7% sensitivity and 77.9% specificity with the current dataset. Further, we show that Laminarity-Mean-hydrophobicity, representing repeated hydrophobic patches, is the most crucial indicator of allostery. To the best of our knowledge, this is the first report that describes sequence determinants of allostery based on hydrophobicity. As an outcome of these findings, we plan to explore the possibility of inducing allostery in proteins.
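
    For readers unfamiliar with the three reported figures, the sketch below shows the standard definitions of accuracy, sensitivity and specificity in terms of confusion-matrix counts. The counts used in the example are hypothetical and are not taken from the paper's dataset; the function name is illustrative.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute accuracy, sensitivity and specificity from confusion-matrix counts."""
    return {
        # Fraction of all predictions that are correct.
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        # Fraction of true positives that are detected (recall).
        "sensitivity": tp / (tp + fn),
        # Fraction of true negatives that are correctly rejected.
        "specificity": tn / (tn + fp),
    }


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    metrics = classification_metrics(tp=6, fp=2, tn=7, fn=1)
    for name, value in metrics.items():
        print(f"{name}: {value:.3f}")
```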

  17. Veillonella infantium sp. nov., an anaerobic, Gram-stain-negative coccus isolated from tongue biofilm of a Thai child.

    PubMed

    Mashima, Izumi; Liao, Yu-Chieh; Miyakawa, Hiroshi; Theodorea, Citra F; Thawboon, Boonyanit; Thaweboon, Sroisiri; Scannapieco, Frank A; Nakazawa, Futoshi

    2018-04-01

    A strain of a novel anaerobic, Gram-stain-negative coccus was isolated from the tongue biofilm of a Thai child. This strain was shown, at the phenotypic level and based on 16S rRNA gene sequencing, to be a member of the genus Veillonella. Comparative analysis of the 16S rRNA, dnaK and rpoB gene sequences indicated that phylogenetically the strain comprised a distinct novel branch within the genus Veillonella. The novel strain showed 99.8, 95.1 and 95.9 % similarity to partial 16S rRNA, dnaK and rpoB gene sequences, respectively, to the type strains of the two most closely related species, Veillonella dispar ATCC 17748T and Veillonella tobetsuensis ATCC BAA-2400T. The novel strain could be discriminated from previously reported species of the genus Veillonella based on partial dnaK and rpoB gene sequencing and average nucleotide identity values. The major acid end-product produced by this strain was acetic acid under anaerobic conditions in trypticase-yeast extract-haemin with 1 % (w/v) glucose or fructose medium. Lactate was fermented to acetic acid and propionic acid. Based on these observations, this strain represents a novel species, for which the name Veillonella infantium sp. nov. is proposed. The type strain is T11011-4T (=JCM 31738T =TSD-88T).

  18. A comprehensive bioinformatic analysis of hepatitis D virus full-length genomes.

    PubMed

    Delfino, C M; Cerrudo, C S; Biglione, M; Oubiña, J R; Ghiringhelli, P D; Mathet, V L

    2018-02-06

    In association with hepatitis B virus (HBV), hepatitis delta virus (HDV) is a subviral agent that may promote severe acute and chronic forms of liver disease. Based on the percentage of nucleotide identity of the genome, HDV was initially classified into three genotypes. However, since 2006, the original classification has been further expanded into eight clades/genotypes. The intergenotype divergence may be as high as 35%-40% over the entire RNA genome, whereas sequence heterogeneity among the isolates of a given genotype is <20%; furthermore, HDV recombinants have been clearly demonstrated. The genetic diversity of HDV is related to the geographic origin of the isolates. This study shows the first comprehensive bioinformatic analysis of the complete available set of HDV sequences, using both nucleotide and protein phylogenies (based on an evolutionary model selection, gamma distribution estimation, tree inference and phylogenetic distance estimation), protein composition analysis and comparison (based on the presence of invariant residues, molecular signatures, amino acid frequencies and mono- and di-amino acid compositional distances), as well as amino acid changes in sequence evolution. Taking into account the congruent and consistent results of both nucleotide and amino acid analyses of GenBank available sequences (recorded as of January, 2017), we propose that the eight hepatitis D virus genotypes may be grouped into three large genogroups fully supported by their shared characteristics. © 2018 John Wiley & Sons Ltd.

  19. NASBA: A detection and amplification system uniquely suited for RNA

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sooknanan, R.; Malek, L.T.

    1995-06-01

    The invention of PCR (polymerase chain reaction) has revolutionized our ability to amplify and manipulate a nucleic acid sequence in vitro. The commercial rewards of this revolution have driven the development of other nucleic acid amplification and detection methodologies. This has created an alphabet soup of technologies that use different amplification methods, including NASBA (nucleic acid sequence-based amplification), LCR (ligase chain reaction), SDA (strand displacement amplification), QBR (Q-beta replicase), CPR (cycling probe reaction), and bDNA (branched DNA). Despite the differences in their processes, these amplification systems can be separated into two broad categories based on how they achieve their goal: sequence-based amplification systems, such as PCR, NASBA, and SDA, amplify a target nucleic acid sequence. Signal-based amplification systems, such as LCR, QBR, CPR and bDNA, amplify or alter a signal from a detection reaction that is target-dependent. While the various methods have relative strengths and weaknesses, only NASBA offers the unique ability to homogeneously amplify an RNA analyte in the presence of homologous genomic DNA under isothermal conditions. Since the detection of RNA sequences almost invariably measures biological activity, it is an excellent prognostic indicator of activities as diverse as virus production, gene expression, and cell viability. The isothermal nature of the reaction makes NASBA especially suitable for large-scale manual screening. These features extend NASBA's application range from research to commercial diagnostic applications. Field test kits are presently under development for human diagnostics as well as the burgeoning fields of food and environmental diagnostic testing. These developments suggest future integration of NASBA into robotic workstations for high-throughput screening as well. 17 refs., 1 tab.

  20. Rapid Threat Organism Recognition Pipeline

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Williams, Kelly P.; Solberg, Owen D.; Schoeniger, Joseph S.

    2013-05-07

    The RAPTOR computational pipeline identifies microbial nucleic acid sequences present in sequence data from clinical samples. It takes as input raw short-read genomic sequence data (in particular, the type generated by the Illumina sequencing platforms) and outputs taxonomic evaluation of detected microbes in various human-readable formats. This software was designed to assist in the diagnosis or characterization of infectious disease, by detecting pathogen sequences in nucleic acid sequence data from clinical samples. It has also been applied in the detection of algal pathogens, when algal biofuel ponds became unproductive. RAPTOR first trims and filters genomic sequence reads based on quality and related considerations, then performs a quick alignment to the human (or other host) genome to filter out host sequences, then performs a deeper search against microbial genomes. Alignment to a protein sequence database is optional. Alignment results are summarized and placed in a taxonomic framework using the Lowest Common Ancestor algorithm.
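
    The Lowest Common Ancestor step mentioned above can be sketched as follows: when a read aligns to several organisms, it is assigned to the deepest taxon that all of them share. This is a generic illustration, not RAPTOR's actual code; the function name and the simplified lineage lists are hypothetical.

```python
def lowest_common_ancestor(lineages):
    """Return the deepest taxon shared by all lineages (each a root-first list)."""
    if not lineages:
        return None
    common = None
    # Walk down the ranks in parallel until the lineages diverge.
    for ranks in zip(*lineages):
        if all(rank == ranks[0] for rank in ranks):
            common = ranks[0]
        else:
            break
    return common


# Simplified, hypothetical lineages for two organisms hit by the same read.
HIT_LINEAGES = [
    ["Bacteria", "Enterobacterales", "Enterobacteriaceae", "Escherichia", "Escherichia coli"],
    ["Bacteria", "Enterobacterales", "Enterobacteriaceae", "Salmonella", "Salmonella enterica"],
]

if __name__ == "__main__":
    print(lowest_common_ancestor(HIT_LINEAGES))
```

    A read that matches only one organism keeps that organism's full lineage, while an ambiguous read is reported at a higher, less specific rank.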

  1. Protein binding hot spots prediction from sequence only by a new ensemble learning method.

    PubMed

    Hu, Shan-Shan; Chen, Peng; Wang, Bing; Li, Jinyan

    2017-10-01

    Hot spots are interfacial core areas of binding proteins, which have been applied as targets in drug design. Experimental methods for locating hot spot areas are costly in both time and expense. Recently, in silico computational methods have been widely used for hot spot prediction through sequence or structure characterization. As the structural information of proteins is not always available, hot spot identification from amino acid sequences alone is more useful for real-life applications. This work proposes a new sequence-based model that combines physicochemical features with the relative accessible surface area of amino acid sequences for hot spot prediction. The model consists of 83 classifiers involving the IBk (Instance-based k means) algorithm, where instances are encoded by important properties extracted from a total of 544 properties in the AAindex1 (Amino Acid Index) database. Then top-performance classifiers are selected to form an ensemble by a majority voting technique. The ensemble classifier outperforms the state-of-the-art computational methods, yielding an F1 score of 0.80 on the benchmark binding interface database (BID) test set. http://www2.ahu.edu.cn/pchen/web/HotspotEC.htm .
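
    The majority-voting step and the F1 score mentioned above can be sketched in a few lines. The three sets of binary predictions below are hypothetical stand-ins for individual classifiers; the feature encoding, the IBk classifiers and the selection of top performers described in the abstract are not reproduced here.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-classifier label lists into one label per instance by majority."""
    voted = []
    for labels in zip(*predictions):
        voted.append(Counter(labels).most_common(1)[0][0])
    return voted


def f1_score(y_true, y_pred, positive=1):
    """F1 = 2*TP / (2*TP + FP + FN) for the given positive label."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


# Hypothetical predictions from three classifiers on four instances.
CLASSIFIER_OUTPUTS = [
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [0, 1, 1, 0],
]
TRUE_LABELS = [1, 1, 1, 1]

if __name__ == "__main__":
    ensemble = majority_vote(CLASSIFIER_OUTPUTS)
    print(ensemble)
    print(f1_score(TRUE_LABELS, ensemble))
```

    Using an odd number of binary classifiers avoids ties in the vote; with an even number, a tie-breaking rule would have to be chosen.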

  2. DNA tetrominoes: the construction of DNA nanostructures using self-organised heterogeneous deoxyribonucleic acids shapes.

    PubMed

    Ong, Hui San; Rahim, Mohd Syafiq; Firdaus-Raih, Mohd; Ramlan, Effirul Ikhwan

    2015-01-01

    The unique programmability of nucleic acids offers an alternative route to constructing excitable and functional nanostructures. This work introduces an autonomous protocol to construct DNA Tetris shapes (L-Shape, B-Shape, T-Shape and I-Shape) using modular DNA blocks. The protocol exploits the rich number of sequence combinations available from the nucleic acid alphabets, thus allowing for diversity to be applied in designing various DNA nanostructures. Instead of a deterministic set of sequences corresponding to a particular design, the protocol promotes a large pool of DNA shapes that can assemble to conform to any desired structure. By utilising evolutionary programming in the design stage, DNA blocks are subjected to processes such as sequence insertion, deletion and base shifting in order to enrich the diversity of the resulting shapes based on a set of cascading filters. The optimisation algorithm allows mutation to be exerted indefinitely on the candidate sequences until these sequences comply with all four fitness criteria. Candidates generated by the protocol are in agreement with the filter cascades and thermodynamic simulation. Further validation using gel electrophoresis indicated the formation of the designed shapes, supporting the plausibility of constructing DNA nanostructures in a more hierarchical, modular, and interchangeable manner.

  3. Silver ions-mediated conformational switch: facile design of structure-controllable nucleic acid probes.

    PubMed

    Wang, Yongxiang; Li, Jishan; Wang, Hao; Jin, Jianyu; Liu, Jinhua; Wang, Kemin; Tan, Weihong; Yang, Ronghua

    2010-08-01

    Conformationally constrained nucleic acid probes have usually been designed by forming an intramolecular duplex based on Watson-Crick hydrogen bonds. The disadvantages of these approaches are the inflexibility of the Watson-Crick-based duplex and its instability in complex environments. We report that this hydrogen bonding pattern can be replaced by metal-ligation between specific metal ions and the natural bases. To demonstrate the feasibility of this principle, two linear oligonucleotides and silver ions were examined as models for DNA hybridization assay and adenosine triphosphate detection. Both nucleic acids contain target binding sequences in the middle and cytosine (C)-rich sequences at the lateral portions. The strong interaction between Ag(+) ions and cytosines forms stable C-Ag(+)-C structures, which allow the oligonucleotides to adopt conformationally constrained structures. In the presence of its target, interaction between the loop sequences and the target unfolds the C-Ag(+)-C structures, and the corresponding unfolding of the probe can be detected by a change in its fluorescence emission. We discuss the thermodynamic and kinetic opportunities that are provided by using Ag(+) ion complexes instead of the traditional Watson-Crick-based duplex. In particular, the intrinsic feature of the metal-ligation motif facilitates the design of functional nucleic acid probes by independently varying the concentration of Ag(+) ions in the medium.

  4. Improving protein complex classification accuracy using amino acid composition profile.

    PubMed

    Huang, Chien-Hung; Chou, Szu-Yu; Ng, Ka-Lok

    2013-09-01

    Protein complex prediction approaches are based on the assumptions that complexes have dense protein-protein interactions and high functional similarity between their subunits. We investigated those assumptions by studying the subunits' interaction topology, sequence similarity and molecular function for human and yeast protein complexes. Inclusion of amino acids' physicochemical properties can provide better understanding of protein complex properties. Principal component analysis is carried out to determine the major features. Adopting amino acid composition profile information with the SVM classifier serves as an effective post-processing step for complexes classification. Improvement is based on primary sequence information only, which is easy to obtain. Copyright © 2013 Elsevier Ltd. All rights reserved.

  5. Cloning, sequencing, and expression of cDNA for human β-glucuronidase

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Oshima, A.; Kyle, J.W.; Miller, R.D.

    1987-02-01

    The authors report here the cDNA sequence for human placental β-glucuronidase (β-D-glucuronoside glucuronosohydrolase, EC 3.2.1.31) and demonstrate expression of the human enzyme in transfected COS cells. They also sequenced a partial cDNA clone from human fibroblasts that contained a 153-base-pair deletion within the coding sequence and found a second type of cDNA clone from placenta that contained the same deletion. Nuclease S1 mapping studies demonstrated two types of mRNAs in human placenta that corresponded to the two types of cDNA clones isolated. The NH2-terminal amino acid sequence determined for human spleen β-glucuronidase agreed with that inferred from the DNA sequence of the two placental clones, beginning at amino acid 23, suggesting a cleaved signal sequence of 22 amino acids. When transfected into COS cells, plasmids containing either placental clone expressed an immunoprecipitable protein that contained N-linked oligosaccharides as evidenced by sensitivity to endoglycosidase F. However, only transfection with the clone containing the 153-base-pair segment led to expression of human β-glucuronidase activity. These studies provide the sequence for the full-length cDNA for human β-glucuronidase, demonstrate the existence of two populations of mRNA for β-glucuronidase in human placenta, only one of which specifies a catalytically active enzyme, and illustrate the importance of expression studies in verifying that a cDNA is functionally full-length.

  6. Determination of a mutational spectrum

    DOEpatents

    Thilly, William G.; Keohavong, Phouthone

    1991-01-01

    A method of resolving (physically separating) mutant DNA from nonmutant DNA and a method of defining or establishing a mutational spectrum or profile of alterations present in nucleic acid sequences from a sample to be analyzed, such as a tissue or body fluid. The present method is based on the fact that it is possible, through the use of DGGE, to separate nucleic acid sequences which differ by only a single base change and on the ability to detect the separate mutant molecules. The present invention, in another aspect, relates to a method for determining a mutational spectrum in a DNA sequence of interest present in a population of cells. The method of the present invention is useful as a diagnostic or analytical tool in forensic science in assessing environmental and/or occupational exposures to potentially genetically toxic materials (also referred to as potential mutagens); in biotechnology, particularly in the study of the relationship between the amino acid sequence of enzymes and other biologically-active proteins or protein-containing substances and their respective functions; and in determining the effects of drugs, cosmetics and other chemicals for which toxicity data must be obtained.

  7. Sequence-Specific Recognition of DNA by Proteins: Binding Motifs Discovered Using a Novel Statistical/Computational Analysis

    PubMed Central

    Jakubec, David; Laskowski, Roman A.; Vondrasek, Jiri

    2016-01-01

    Decades of intensive experimental studies of the recognition of DNA sequences by proteins have provided us with a view of a diverse and complicated world in which few to no features are shared between individual DNA-binding protein families. The originally conceived direct readout of DNA residue sequences by amino acid side chains offers very limited capacity for sequence recognition, while the effects of the dynamic properties of the interacting partners remain difficult to quantify and almost impossible to generalise. In this work we investigated the energetic characteristics of all DNA residue—amino acid side chain combinations in the conformations found at the interaction interface in a very large set of protein—DNA complexes by the means of empirical potential-based calculations. General specificity-defining criteria were derived and utilised to look beyond the binding motifs considered in previous studies. Linking energetic favourability to the observed geometrical preferences, our approach reveals several additional amino acid motifs which can distinguish between individual DNA bases. Our results remained valid in environments with various dielectric properties. PMID:27384774

  8. An Alignment-Free Algorithm in Comparing the Similarity of Protein Sequences Based on Pseudo-Markov Transition Probabilities among Amino Acids

    PubMed Central

    Li, Yushuang; Yang, Jiasheng; Zhang, Yi

    2016-01-01

    In this paper, we propose a novel alignment-free method for comparing the similarity of protein sequences. We first encode a protein sequence into a 440-dimensional feature vector consisting of a 400-dimensional Pseudo-Markov transition probability vector among the 20 amino acids, a 20-dimensional content ratio vector, and a 20-dimensional position ratio vector of the amino acids in the sequence. By evaluating the Euclidean distances among the representing vectors, we compare the similarity of protein sequences. We then apply this method to the ND5 dataset consisting of the ND5 protein sequences of 9 species, and to the F10 and G11 datasets representing two of the xylanase-containing glycoside hydrolase families, i.e., families 10 and 11. As a result, our method achieves a correlation coefficient of 0.962 with the canonical protein sequence aligner ClustalW on the ND5 dataset, much higher than those of five other popular alignment-free methods. In addition, we successfully separate the xylanase sequences in the F10 family and the G11 family and illustrate that the F10 family is more heat stable than the G11 family, consistent with a few previous studies. Moreover, we prove mathematically an identity equation involving the Pseudo-Markov transition probability vector and the amino acid content ratio vector. PMID:27918587
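    The 440-dimensional encoding described in this record (400 + 20 + 20) can be sketched as below. The abstract does not give the exact formulas, so the three definitions used here are assumptions chosen for illustration: transition probabilities are row-normalised counts of adjacent residue pairs, content ratios are residue frequencies, and position ratios are each residue's share of the summed 1-based positions. The paper's own definitions may differ.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def encode(seq):
    """Encode a protein sequence as a 440-dimensional feature vector."""
    # Non-standard residue symbols are skipped (an assumption).
    residues = [aa for aa in seq.upper() if aa in INDEX]
    n = len(residues)
    if n == 0:
        raise ValueError("sequence contains no standard amino acids")

    # 400 values: counts of adjacent pairs (a, b), normalised per first residue a.
    pair_counts = [[0] * 20 for _ in range(20)]
    for a, b in zip(residues, residues[1:]):
        pair_counts[INDEX[a]][INDEX[b]] += 1
    transition = []
    for row in pair_counts:
        total = sum(row)
        transition.extend(c / total if total else 0.0 for c in row)

    # 20 + 20 values: residue frequency and share of summed positions.
    content = [0.0] * 20
    position = [0.0] * 20
    position_total = n * (n + 1) / 2
    for i, aa in enumerate(residues, start=1):
        content[INDEX[aa]] += 1 / n
        position[INDEX[aa]] += i / position_total

    return transition + content + position


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))
```

    Two sequences are then compared with `euclidean(encode(seq_a), encode(seq_b))`; a smaller distance is read as higher similarity.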

  9. Molecular beacon sequence design algorithm.

    PubMed

    Monroe, W Todd; Haselton, Frederick R

    2003-01-01

    A method based on Web-based tools is presented to design optimally functioning molecular beacons. Molecular beacons, fluorogenic hybridization probes, are a powerful tool for the rapid and specific detection of a particular nucleic acid sequence. However, their synthesis costs can be considerable. Since molecular beacon performance is based on its sequence, it is imperative to rationally design an optimal sequence before synthesis. The algorithm presented here uses simple Microsoft Excel formulas and macros to rank candidate sequences. This analysis is carried out using mfold structural predictions along with other free Web-based tools. For smaller laboratories where molecular beacons are not the focus of research, the public domain algorithm described here may be usefully employed to aid in molecular beacon design.

  10. Nanopores and nucleic acids: prospects for ultrarapid sequencing

    NASA Technical Reports Server (NTRS)

    Deamer, D. W.; Akeson, M.

    2000-01-01

    DNA and RNA molecules can be detected as they are driven through a nanopore by an applied electric field at rates ranging from several hundred microseconds to a few milliseconds per molecule. The nanopore can rapidly discriminate between pyrimidine and purine segments along a single-stranded nucleic acid molecule. Nanopore detection and characterization of single molecules represents a new method for directly reading information encoded in linear polymers. If single-nucleotide resolution can be achieved, it is possible that nucleic acid sequences can be determined at rates exceeding a thousand bases per second.

  11. A multi-model approach to nucleic acid-based drug development.

    PubMed

    Gautherot, Isabelle; Sodoyer, Regís

    2004-01-01

    With the advent of functional genomics and the shift of interest towards sequence-based therapeutics, the past decades have witnessed intense research efforts on nucleic acid-mediated gene regulation technologies. Today, RNA interference is emerging as a groundbreaking discovery, holding promise for development of genetic modulators of unprecedented potency. Twenty-five years after the discovery of antisense RNA and ribozymes, gene control therapeutics are still facing developmental difficulties, with only one US FDA-approved antisense drug currently available in the clinic. Limited predictability of target site selection models is recognized as one major stumbling block that is shared by all of the so-called complementary technologies, slowing the progress towards a commercial product. Currently employed in vitro systems for target site selection include RNAse H-based mapping, antisense oligonucleotide microarrays, and functional screening approaches using libraries of catalysts with randomized target-binding arms to identify optimal ribozyme/DNAzyme cleavage sites. Individually, each strategy has its drawbacks from a drug development perspective. Utilization of message-modulating sequences as therapeutic agents requires that their action on a given target transcript meets criteria of potency and selectivity in the natural physiological environment. In addition to sequence-dependent characteristics, other factors will influence annealing reactions and duplex stability, as well as nucleic acid-mediated catalysis. Parallel consideration of physiological selection systems thus appears essential for screening for nucleic acid compounds proposed for therapeutic applications. Cellular message-targeting studies face issues relating to efficient nucleic acid delivery and appropriate analysis of response. 
    For reliability and simplicity, prokaryotic systems can provide a rapid and cost-effective means of studying message targeting under pseudo-cellular conditions, but such approaches also have limitations. To streamline nucleic acid drug discovery, we propose a multi-model strategy integrating high-throughput-adapted bacterial screening, followed by reporter-based and/or natural cellular models and potentially also in vitro assays for characterization of the most promising candidate sequences, before final in vivo testing.

  12. Acid–base bifunctional shell cross-linked micelle nanoreactor for one-pot tandem reaction

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Lee, Li -Chen; Lu, Jie; Weck, Marcus

    Shell cross-linked micelles (SCMs) containing acid sites in the shell and base sites in the core are prepared from amphiphilic poly(2-oxazoline) triblock copolymers. These materials are utilized as two-chamber nanoreactors for a prototypical acid-base bifunctional tandem deacetalization-nitroaldol reaction. Furthermore, the acid and base sites are localized in different regions of the micelle, allowing the two steps in the reaction sequence to largely proceed in separate compartments, akin to the compartmentalization that occurs in biological systems.

  13. Array-Based Rational Design of Short Peptide Probe-Derived from an Anti-TNT Monoclonal Antibody.

    PubMed

    Okochi, Mina; Muto, Masaki; Yanai, Kentaro; Tanaka, Masayoshi; Onodera, Takeshi; Wang, Jin; Ueda, Hiroshi; Toko, Kiyoshi

    2017-10-09

    Complementarity-determining regions (CDRs) are sites on the variable chains of antibodies responsible for binding to specific antigens. In this study, a short peptide probe for recognition of 2,4,6-trinitrotoluene (TNT) was identified by testing sequences derived from the CDRs of an anti-TNT monoclonal antibody. The major TNT-binding site in this antibody was identified in the heavy chain CDR3 by antigen docking simulation and confirmed by an immunoassay using a spot-synthesis based peptide array comprising amino acid sequences of six CDRs in the variable region. A peptide derived from heavy chain CDR3 (RGYSSFIYWF) bound to TNT with a dissociation constant of 1.3 μM measured by surface plasmon resonance. Substitution of selected amino acids with basic residues increased TNT binding, while substitution with acidic amino acids decreased affinity; an isoleucine-to-arginine change showed the greatest improvement, of 1.8-fold. The ability to create simple peptide binders of volatile organic compounds from sequence information provided by the immune system in the creation of an immune response will be beneficial for sensor developments in the future.

  14. Protein location prediction using atomic composition and global features of the amino acid sequence

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Cherian, Betsy Sheena, E-mail: betsy.skb@gmail.com; Nair, Achuthsankar S.

    2010-01-22

    Subcellular location of a protein is constructive information in determining its function, screening for drug candidates, vaccine design, annotation of gene products and in selecting relevant proteins for further studies. Computational prediction of subcellular localization deals with predicting the location of a protein from its amino acid sequence. For a computational localization prediction method to be more accurate, it should exploit all possible relevant biological features that contribute to the subcellular localization. In this work, we extracted the biological features from the full length protein sequence to incorporate more biological information. A new biological feature, distribution of atomic composition, is effectively used with multiple physiochemical properties, amino acid composition, three part amino acid composition, and sequence similarity for predicting the subcellular location of the protein. Support Vector Machines are designed for four modules and prediction is made by a weighted voting system. Our system makes predictions with accuracies of 100, 82.47 and 88.81 for the self-consistency test, jackknife test and independent data test, respectively. Our results provide evidence that prediction based on the biological features derived from the full length amino acid sequence gives better accuracy than prediction based on features derived from the N-terminal alone. Considering the features as a distribution within the entire sequence will bring out the underlying property distribution in greater detail to enhance the prediction accuracy.

  15. Nucleic acid arrays and methods of synthesis

    DOEpatents

    Sabanayagam, Chandran R.; Sano, Takeshi; Misasi, John; Hatch, Anson; Cantor, Charles

    2001-01-01

    The present invention generally relates to high density nucleic acid arrays and methods of synthesizing nucleic acid sequences on a solid surface. Specifically, the present invention contemplates the use of stabilized nucleic acid primer sequences immobilized on solid surfaces, and circular nucleic acid sequence templates combined with the use of isothermal rolling circle amplification to thereby increase nucleic acid sequence concentrations in a sample or on an array of nucleic acid sequences.

  16. Multiple copies of a bile acid-inducible gene in Eubacterium sp. strain VPI 12708.

    PubMed Central

    Gopal-Srivastava, R; Mallonee, D H; White, W B; Hylemon, P B

    1990-01-01

    Eubacterium sp. strain VPI 12708 is an anaerobic intestinal bacterium which possesses inducible bile acid 7-dehydroxylation activity. Several new polypeptides are produced in this strain following induction with cholic acid. Genes coding for two copies of a bile acid-inducible 27,000-dalton polypeptide (baiA1 and baiA2) have been previously cloned and sequenced. We now report on a gene coding for a third copy of this 27,000-dalton polypeptide (baiA3). The baiA3 gene has been cloned in lambda DASH on an 11.2-kilobase DNA fragment from a partial Sau3A digest of the Eubacterium DNA. DNA sequence analysis of the baiA3 gene revealed 100% homology with the baiA1 gene within the coding region of the 27,000-dalton polypeptides. The baiA2 gene shares 81% sequence identity with the other two genes at the nucleotide level. The flanking nucleotide sequences associated with the baiA1 and baiA3 genes are identical for 930 bases in the 5' direction from the initiation codon and for at least 325 bases in the 3' direction from the stop codon, including the putative promoter regions for the genes. An additional open reading frame (occupying from 621 to 648 bases, depending on the correct start codon) was found in the identical 5' regions associated with the baiA1 and baiA3 clones. The 5' sequence 930 bases upstream from the baiA1 and baiA3 genes was totally divergent. The baiA2 gene, which is part of a large bile acid-inducible operon, showed no homology with the other two genes either in the 5' or 3' direction from the polypeptide coding region, except for a 15-base-pair presumed ribosome-binding site in the 5' region. These studies strongly suggest that a gene duplication (baiA1 and baiA3) has occurred and is stably maintained in this bacterium. PMID:2376563

  17. RaptorX server: a resource for template-based protein structure modeling.

    PubMed

    Källberg, Morten; Margaryan, Gohar; Wang, Sheng; Ma, Jianzhu; Xu, Jinbo

    2014-01-01

    Assigning functional properties to a newly discovered protein is a key challenge in modern biology. To this end, computational modeling of the three-dimensional atomic arrangement of the amino acid chain is often crucial in determining the role of the protein in biological processes. We present a community-wide web-based protocol, RaptorX server (http://raptorx.uchicago.edu), for automated protein secondary structure prediction, template-based tertiary structure modeling, and probabilistic alignment sampling. Given a target sequence, RaptorX server is able to detect even remotely related template sequences by means of a novel nonlinear context-specific alignment potential and probabilistic consistency algorithm. Using the protocol presented here it is thus possible to obtain high-quality structural models for many target protein sequences when only distantly related protein domains have experimentally solved structures. At present, RaptorX server can perform secondary and tertiary structure prediction of a 200 amino acid target sequence in approximately 30 min.

  18. Genome analysis and identification of gelatinase encoded gene in Enterobacter aerogenes

    NASA Astrophysics Data System (ADS)

    Shahimi, Safiyyah; Mutalib, Sahilah Abdul; Khalid, Rozida Abdul; Repin, Rul Aisyah Mat; Lamri, Mohd Fadly; Bakar, Mohd Faizal Abu; Isa, Mohd Noor Mat

    2016-11-01

    In this study, bioinformatic analysis of the genome sequence of Enterobacter aerogenes was carried out to identify the gene encoding gelatinase. E. aerogenes was isolated from hot spring water and is a gelatinase-producing bacterium with species specificity toward porcine and fish gelatin. This bacterium therefore offers the possibility of producing enzymes specific to the gelatin of each species. The E. aerogenes genome was partially sequenced, giving a total sequence size of 5.0 megabase pairs (Mbp). The pre-processing pipeline yielded 87.6 Mbp of total reads and 68.8 Mbp of high-quality reads, with a high-quality percentage of 78.58%. Genome assembly produced 120 contigs, with 67.5% of contigs over 1 kilobase pair (kbp), an N50 contig length of 124856 bp and a GC content of 55.17%. About 4705 protein-coding genes were identified by protein prediction analysis. The two candidate genes selected had the highest percentage identity to gelatinase enzymes available in the Swiss-Prot and NCBI online databases. They were NODE_9_length_26866_cov_148.013245_12, containing a 1029 base pair (bp) sequence encoding 342 amino acids, and NODE_24_length_155103_cov_177.082458_62, containing a 717 bp sequence encoding 238 amino acids. Two pairs of primers (forward and reverse) were then designed based on the open reading frames (ORFs) of the selected genes. Genome analysis of E. aerogenes thus identified genes encoding gelatinase.
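    The assembly statistics quoted in this record (N50 contig length and GC percentage) follow standard definitions, sketched below. This is not the authors' pipeline: the function names are illustrative, and excluding ambiguous bases such as N from the GC denominator is an assumption.

```python
def n50(lengths):
    """Length L such that contigs of length >= L hold at least half the assembly."""
    if not lengths:
        raise ValueError("need at least one contig length")
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    # Walk from the longest contig down until half the total length is covered.
    for length in ordered:
        running += length
        if running >= half:
            return length


def gc_percent(seq):
    """Percentage of G and C among unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return 100.0 * (seq.count("G") + seq.count("C")) / acgt
```

    For contig lengths `[2, 3, 4, 5, 6]` the total is 20, and the two longest contigs (6 + 5 = 11) already cover half of it, so the N50 is 5.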

  19. Acid–base bifunctional shell cross-linked micelle nanoreactor for one-pot tandem reaction

    DOE PAGES

    Lee, Li -Chen; Lu, Jie; Weck, Marcus; ...

    2015-12-29

    Shell cross-linked micelles (SCMs) containing acid sites in the shell and base sites in the core are prepared from amphiphilic poly(2-oxazoline) triblock copolymers. These materials are utilized as two-chamber nanoreactors for a prototypical acid-base bifunctional tandem deacetalization-nitroaldol reaction. Furthermore, the acid and base sites are localized in different regions of the micelle, allowing the two steps in the reaction sequence to largely proceed in separate compartments, akin to the compartmentalization that occurs in biological systems.

  20. Molecular characterization of the vitamin D receptor (VDR) gene in Holstein cows.

    PubMed

    Ali, Mayar O; El-Adl, Mohamed A; Ibrahim, Hussam M M; Elseedy, Youssef Y; Rizk, Mohamed A; El-Khodery, Sabry A

    2018-06-01

    Vitamin D plays a vital role in calcium homeostasis, growth, and immunoregulation. Because little is known about the vitamin D receptor (VDR) gene in cattle, the aim of the present investigation was to present the molecular characterization of exons 5 and 6 of the VDR gene in Holstein cows. DNA extraction, genomic sequencing, phylogenetic analysis, synteny mapping and single nucleotide gene polymorphism analysis of the VDR gene were performed to assess blood samples collected from 50 clinically healthy Holstein cows. The results revealed the presence of a 450-base pair (bp) nucleotide sequence that resembled exons 5 and 6 with intron 5 enclosed between these exons. Sequence alignment and phylogenetic analysis revealed a close relationship between the sequenced VDR region and that found in Hereford cattle. A close association between this region and the corresponding region in small ruminants was also documented. Moreover, a single nucleotide polymorphism (SNP) that caused the replacement of a glutamate with an arginine in the deduced amino acid sequence was detected at position 7 of exon 5. In conclusion, Holstein and Hereford cattle differ with respect to exon 5 of the VDR gene. Phylogenetic analysis of the VDR gene based on nucleotide sequence produced different results from prior analyses based on amino acid sequence. Copyright © 2018 Elsevier Ltd. All rights reserved.

  1. Students' Understanding of Acids/Bases in Organic Chemistry Contexts

    ERIC Educational Resources Information Center

    Cartrette, David P.; Mayo, Provi M.

    2011-01-01

    Understanding key foundational principles is vital to learning chemistry across different contexts. One such foundational principle is the acid/base behavior of molecules. In the general chemistry sequence, the Bronsted-Lowry theory is stressed, because it lends itself well to studying equilibrium and kinetics. However, the Lewis theory of…

  2. Molecular cloning and nucleotide sequence of the alpha and beta subunits of allophycocyanin from the cyanelle genome of Cyanophora paradoxa.

    PubMed Central

    Bryant, D A; de Lorimier, R; Lambert, D H; Dubbs, J M; Stirewalt, V L; Stevens, S E; Porter, R D; Tam, J; Jay, E

    1985-01-01

    The genes for the alpha- and beta-subunit apoproteins of allophycocyanin (AP) were isolated from the cyanelle genome of Cyanophora paradoxa and subjected to nucleotide sequence analysis. The AP beta-subunit apoprotein gene was localized to a 7.8-kilobase-pair Pst I restriction fragment from cyanelle DNA by hybridization with a tetradecameric oligonucleotide probe. Sequence analysis using that oligonucleotide and its complement as primers for the dideoxy chain-termination sequencing method confirmed the presence of both AP alpha- and beta-subunit genes on this restriction fragment. Additional oligonucleotide primers were synthesized as sequencing progressed and were used to determine rapidly the nucleotide sequence of a 1336-base-pair region of this cloned fragment. This strategy allowed the sequencing to be completed without a detailed restriction map and without extensive and time-consuming subcloning. The sequenced region contains two open reading frames whose deduced amino acid sequences are 81-85% homologous to cyanobacterial and red algal AP subunits whose amino acid sequences have been determined. The two open reading frames are in the same orientation and are separated by 39 base pairs. AP alpha is 5' to AP beta and both coding sequences are preceded by a polypurine, Shine-Dalgarno-type sequence. Sequences upstream from AP alpha closely resemble the Escherichia coli consensus promoter sequences and also show considerable homology to promoter sequences for several chloroplast-encoded psbA genes. A 56-base-pair palindromic sequence downstream from the AP beta gene could play a role in the termination of transcription or translation. The allophycocyanin apoprotein subunit genes are located on the large single-copy region of the cyanelle genome. PMID:2987916

  3. Creation of a data base for sequences of ribosomal nucleic acids and detection of conserved restriction endonucleases sites through computerized processing.

    PubMed Central

    Patarca, R; Dorta, B; Ramirez, J L

    1982-01-01

    As part of a project pertaining to the organization of ribosomal genes in Kinetoplastidae, we have created a data base for published sequences of ribosomal nucleic acids, with information in Spanish. As a first step in their processing, we have written a computer program which introduces the new feature of determining the length of the fragments produced after single or multiple digestion with any of the known restriction enzymes. With this information we have detected conserved SAU 3A sites: (i) at the 5' end of the 5.8S rRNA and at the 3' end of the small subunit rRNA, both included in similar larger sequences; (ii) in the 5.8S rRNA of vertebrates (a second one), which is not present in lower eukaryotes, showing a clear evolutionary divergence; and (iii) at the 5' terminus of the small subunit rRNA, included in a larger conserved sequence. The possible biological importance of these sequences is discussed. PMID:6278402
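    The program feature described in this record, computing fragment lengths after single or multiple restriction digestion, can be sketched in a few lines. This is a modern illustration and not the 1982 program: the `ENZYMES` table and function names are assumptions, only top-strand cut positions on a linear molecule are modelled, and the enzyme written SAU 3A in the abstract appears here as Sau3AI (site GATC, cutting before the G).

```python
# Recognition site and 0-based cut offset within the site (top strand).
# Sau3AI cuts ^GATC; EcoRI cuts G^AATTC.
ENZYMES = {
    "Sau3AI": ("GATC", 0),
    "EcoRI": ("GAATTC", 1),
}


def cut_positions(seq, site, offset):
    """Return every cut coordinate for one recognition site, overlaps included."""
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start + offset)
        start = seq.find(site, start + 1)
    return positions


def fragment_lengths(seq, enzyme_names):
    """Fragment lengths, 5' to 3', after digesting a linear sequence."""
    seq = seq.upper()
    cuts = set()
    for name in enzyme_names:
        site, offset = ENZYMES[name]
        cuts.update(cut_positions(seq, site, offset))
    # A cut at either end of the molecule does not create a new fragment.
    bounds = [0] + sorted(c for c in cuts if 0 < c < len(seq)) + [len(seq)]
    return [b - a for a, b in zip(bounds, bounds[1:])]
```

    Passing one enzyme name models a single digest and passing several models a multiple digest, for example `fragment_lengths("AAGATCTTTGAATTCGG", ["Sau3AI", "EcoRI"])`.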

  4. Guiding principles for peptide nanotechnology through directed discovery.

    PubMed

    Lampel, A; Ulijn, R V; Tuttle, T

    2018-05-21

    Life's diverse molecular functions are largely based on only a small number of highly conserved building blocks - the twenty canonical amino acids. These building blocks are chemically simple, but when they are organized in three-dimensional structures of tremendous complexity, new properties emerge. This review explores recent efforts in the directed discovery of functional nanoscale systems and materials based on these same amino acids, but that are not guided by copying or editing biological systems. The review summarises insights obtained using three complementary approaches of searching the sequence space to explore sequence-structure relationships for assembly, reactivity and complexation, namely: (i) strategic editing of short peptide sequences; (ii) computational approaches to predicting and comparing assembly behaviours; (iii) dynamic peptide libraries that explore the free energy landscape. These approaches give rise to guiding principles on controlling order/disorder, complexation and reactivity by peptide sequence design.

  5. Computational design of enzyme-ligand binding using a combined energy function and deterministic sequence optimization algorithm.

    PubMed

    Tian, Ye; Huang, Xiaoqiang; Zhu, Yushan

    2015-08-01

    Enzyme amino-acid sequences at ligand-binding interfaces are evolutionarily optimized for reactions, and the natural conformation of an enzyme-ligand complex must have a low free energy relative to alternative conformations in native-like or non-native sequences. Based on this assumption, a combined energy function was developed for enzyme design and then evaluated by recapitulating native enzyme sequences at ligand-binding interfaces for 10 enzyme-ligand complexes. In this energy function, the electrostatic interaction between polar or charged atoms at buried interfaces is described by an explicitly orientation-dependent hydrogen-bonding potential and a pairwise-decomposable generalized Born model based on the general side chain in the protein design framework. The energy function is augmented with a pairwise surface-area based hydrophobic contribution for nonpolar atom burial. Using this function, on average, 78% of the amino acids at ligand-binding sites were predicted correctly in the minimum-energy sequences, whereas 84% were predicted correctly in the most-similar sequences, which were selected from the top 20 sequences for each enzyme-ligand complex. Hydrogen bonds at the enzyme-ligand binding interfaces in the 10 complexes were usually recovered with the correct geometries. The binding energies calculated using the combined energy function helped to discriminate the active sequences from a pool of alternative sequences that were generated by repeatedly solving a series of mixed-integer linear programming problems for sequence selection with increasing integer cuts.

  6. Cloning and purification of alpha-neurotoxins from king cobra (Ophiophagus hannah).

    PubMed

    He, Ying-Ying; Lee, Wei-Hui; Zhang, Yun

    2004-09-01

    Thirteen complete and three partial cDNA sequences were cloned from the constructed king cobra (Ophiophagus hannah) venom gland cDNA library. Phylogenetic analysis of nucleotide sequences of king cobra with those from other snake venoms revealed that obtained cDNAs are highly homologous to snake venom alpha-neurotoxins. Alignment of deduced mature peptide sequences of the obtained clones with those of other reported alpha-neurotoxins from the king cobra venom indicates that our obtained 16 clones belong to long-chain neurotoxins (seven), short-chain neurotoxins (seven), weak toxin (one) and variant (one), respectively. Up to now, two out of 16 newly cloned king cobra alpha-neurotoxins have identical amino acid sequences with CM-11 and Oh-6A/6B, which have been characterized from the same venom. Furthermore, five long-chain alpha-neurotoxins and two short-chain alpha-neurotoxins were purified from crude venom and their N-terminal amino acid sequences were determined. The cDNAs encoding the putative precursors of the purified native peptide were also determined based on the N-terminal amino acid sequencing. The purified alpha-neurotoxins showed different lethal activities on mice.

  7. Retention of nucleic acids in ion-pair reversed-phase high-performance liquid chromatography depends not only on base composition but also on base sequence.

    PubMed

    Qiao, Jun-Qin; Liang, Chao; Wei, Lan-Chun; Cao, Zhao-Ming; Lian, Hong-Zhen

    2016-12-01

    Studies of nucleic acid retention in ion-pair reversed-phase high-performance liquid chromatography mainly focus on size dependence; however, other factors influencing retention behavior have not been comprehensively clarified to date. In the present work, the retention behaviors of oligonucleotides and double-stranded DNAs were investigated on a silica-based C18 stationary phase by ion-pair reversed-phase high-performance liquid chromatography. It was found that the retention of oligonucleotides was influenced by base composition and base sequence as well as size, and that oligonucleotides prone to self-dimerization have weaker retention than those not prone to self-dimerization but with the same base composition. However, homo-oligonucleotides are suitable for size-dependent separation as a special case of oligonucleotides. For double-stranded DNAs, the retention is also influenced by base composition and base sequence, as well as size. This may be attributed to the interaction of exposed bases in the major or minor grooves with the hydrophobic alkyl chains of the stationary phase. In addition, no specific influence of guanine and cytosine content on the retention of double-stranded DNAs was confirmed. Notably, the space effect resulting from the stereostructure of nucleic acids also influences the retention behavior in ion-pair reversed-phase high-performance liquid chromatography. © 2016 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  8. Molecular characterization of two genotypes of a new polerovirus infecting brassicas in China.

    PubMed

    Xiang, Hai-Ying; Dong, Shu-Wei; Shang, Qiao-Xia; Zhou, Cui-Ji; Li, Da-Wei; Yu, Jia-Lin; Han, Cheng-Gui

    2011-12-01

    The genomic RNA sequences of two genotypes of a brassica-infecting polerovirus from China were determined. Sequence analysis revealed that the virus was closely related to but significantly different from turnip yellows virus (TuYV). This virus and other poleroviruses, including TuYV, had less than 90% amino acid sequence identity in all gene products except the coat protein. Based on the molecular criterion (>10% amino acid sequence difference) for species demarcation in the genus Polerovirus, the virus represents a distinct species for which the name Brassica yellows virus (BrYV) is proposed. Interestingly, there were two genotypes of BrYV, which mainly differed in the 5'-terminal half of the genome.
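
    The demarcation criterion quoted above (>10% amino acid sequence difference) can be illustrated with a short sketch. It assumes the two sequences are already aligned to equal length with "-" as the gap symbol; the function names, the 90% threshold parameter, and the reading that a single gene product below the threshold is enough are assumptions made for illustration, not the authors' procedure.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent identity over aligned columns (columns that are gaps in both are skipped)."""
    if len(a) != len(b):
        raise ValueError("sequences must be pre-aligned to equal length")
    columns = [(x, y) for x, y in zip(a, b) if not (x == "-" and y == "-")]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for x, y in columns if x == y and x != "-")
    return 100.0 * matches / len(columns)


def is_distinct_species(identities: dict, threshold: float = 90.0) -> bool:
    """Hypothetical helper: True if any gene product falls below the identity threshold,
    i.e. differs by more than 10% at the amino acid level (assumed reading of the criterion)."""
    return any(value < threshold for value in identities.values())
```

    For example, `is_distinct_species({"P0": 85.0, "CP": 95.0})` evaluates to True under these assumptions, because one gene product differs by more than 10%.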

  9. Structure-based conformational preferences of amino acids

    PubMed Central

    Koehl, Patrice; Levitt, Michael

    1999-01-01

    Proteins can be very tolerant to amino acid substitution, even within their core. Understanding the factors responsible for this behavior is of critical importance for protein engineering and design. Mutations in proteins have been quantified in terms of the changes in stability they induce. For example, guest residues in specific secondary structures have been used as probes of conformational preferences of amino acids, yielding propensity scales. Predicting these amino acid propensities would be a good test of any new potential energy functions used to mimic protein stability. We have recently developed a protein design procedure that optimizes whole sequences for a given target conformation based on the knowledge of the template backbone and on a semiempirical potential energy function. This energy function is purely physical, including steric interactions based on a Lennard-Jones potential, electrostatics based on a Coulomb potential, and hydrophobicity in the form of an environment free energy based on accessible surface area and interatomic contact areas. Sequences designed by this procedure for 10 different proteins were analyzed to extract conformational preferences for amino acids. The resulting structure-based propensity scales show significant agreements with experimental propensity scale values, both for α-helices and β-sheets. These results indicate that amino acid conformational preferences are a natural consequence of the potential energy we use. This confirms the accuracy of our potential and indicates that such preferences should not be added as a design criterion. PMID:10535955

  10. Chemical property based sequence characterization of PpcA and its homolog proteins PpcB-E: A mathematical approach

    PubMed Central

    Pal Choudhury, Pabitra

    2017-01-01

    Periplasmic c7-type cytochrome A (PpcA) has been identified in Geobacter sulfurreducens along with four homologs (PpcB-E). Crystal structures show that PpcA can bind deoxycholate (DXCA), while its homologs do not, but the reason for this has yet to be established with certainty from primary protein sequence information. This study is based primarily on primary protein sequence analysis through the chemical properties of the constituent amino acids. Firstly, we look for the chemical group specific score of amino acids. Along with this, we have developed a new methodology for phylogenetic analysis based on chemical group dissimilarities of amino acids. This new methodology is applied to the cytochrome c7 family members to pinpoint how a particular sequence differs from the others. Secondly, we build a graph theoretic model using amino acid sequences, which is also applied to the cytochrome c7 family members, and some unique characteristics and their domains are highlighted. Thirdly, we search for unique patterns as subsequences which are common among the group or specific to an individual member. In all cases, we are able to show some distinct features of PpcA that set it apart from its homologs and may account for its binding with deoxycholate. Similarly, some notable features of the structurally dissimilar protein PpcD compared to the other homologs are also brought out. Further, because the five members of the cytochrome family are homologous proteins, they must share some significant common features, which are also enumerated in this study. PMID:28362850

  11. Sequentially distant but structurally similar proteins exhibit fold specific patterns based on their biophysical properties.

    PubMed

    Rajendran, Senthilnathan; Jothi, Arunachalam

    2018-05-16

    The three-dimensional structure of a protein depends on the interactions between its amino acid residues. These interactions are in turn influenced by various biophysical properties of the amino acids. There are several examples of proteins that share the same fold but are very dissimilar at the sequence level. For proteins to share a common fold, some crucial interactions should be maintained despite insignificant sequence similarity. Since the interactions arise from the biophysical properties of the amino acids, we should be able to detect descriptive patterns for folds at such a property level. Accordingly, the main focus of our research is to analyze such proteins and to characterize them in terms of their biophysical properties. Protein structures with sequence similarity less than 40% were selected for ten different subfolds from three different mainfolds (according to CATH classification) and were used for this analysis. We used the normalized values of 49 physicochemical, energetic and conformational properties of amino acids. We characterize the folds based on the average biophysical property values. We also observed a fold-specific correlational behavior of biophysical properties despite a very low sequence similarity in our data. We further trained three different binary classification models (Naive Bayes, NB; Support Vector Machines, SVM; and Bayesian Generalized Linear Model, BGLM) which could discriminate mainfolds based on the biophysical properties. We also show that among the three generated models, the BGLM classifier was able to discriminate protein sequences in the all-beta category with 81.43% accuracy and all-alpha and alpha-beta proteins with 83.37% accuracy. Copyright © 2018 Elsevier Ltd. All rights reserved.

  12. SCMPSP: Prediction and characterization of photosynthetic proteins based on a scoring card method.

    PubMed

    Vasylenko, Tamara; Liou, Yi-Fan; Chen, Hong-An; Charoenkwan, Phasit; Huang, Hui-Ling; Ho, Shinn-Ying

    2015-01-01

    Photosynthetic proteins (PSPs) greatly differ in their structure and function as they are involved in numerous subprocesses that take place inside an organelle called a chloroplast. Few studies predict PSPs from sequences due to their high variety of sequences and structures. This work aims to predict and characterize PSPs by establishing the datasets of PSP and non-PSP sequences and developing prediction methods. A novel bioinformatics method of predicting and characterizing PSPs based on the scoring card method (SCMPSP) was used. First, a dataset consisting of 649 PSPs was established by using the Gene Ontology term GO:0015979 and 649 non-PSPs from the SwissProt database with sequence identity <= 25%. Several prediction methods are presented based on support vector machine (SVM), decision tree J48, Bayes, BLAST, and SCM. The SVM method using dipeptide features performed well and yielded a test accuracy of 72.31%. The SCMPSP method uses the estimated propensity scores of 400 dipeptides as PSPs and has a test accuracy of 71.54%, which is comparable to that of the SVM method. The derived propensity scores of 20 amino acids were further used to identify informative physicochemical properties for characterizing PSPs. The analytical results reveal the following four characteristics of PSPs: 1) PSPs favour hydrophobic side chain amino acids; 2) PSPs are composed of the amino acids prone to form helices in membrane environments; 3) PSPs have low interaction with water; and 4) PSPs prefer to be composed of the amino acids of electron-reactive side chains. The SCMPSP method not only estimates the propensity of a sequence to be a PSP, it also discovers characteristics that further improve understanding of PSPs. The SCMPSP source code and the datasets used in this study are available at http://iclab.life.nctu.edu.tw/SCMPSP/.
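
    The 400 dipeptide features mentioned above correspond to all ordered pairs of the 20 canonical amino acids. A minimal sketch of how such a feature vector might be computed follows; the function name and the normalisation by the number of counted dipeptides are assumptions, and this is not the SCMPSP source code.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# All 20 x 20 = 400 ordered dipeptides, in a fixed order so vectors are comparable.
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]


def dipeptide_composition(seq: str) -> list:
    """Return a 400-element vector of dipeptide frequencies for one protein sequence."""
    seq = seq.upper()
    counts = dict.fromkeys(DIPEPTIDES, 0)
    total = 0
    for i in range(len(seq) - 1):
        dipeptide = seq[i:i + 2]
        if dipeptide in counts:  # skip pairs containing non-canonical symbols
            counts[dipeptide] += 1
            total += 1
    return [counts[d] / total if total else 0.0 for d in DIPEPTIDES]
```

    A vector of this kind could serve as input to a classifier such as an SVM, or as the basis for estimating per-dipeptide propensity scores.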

  13. DNA sequence of the lymphotropic variant of minute virus of mice, MVM(i), and comparison with the DNA sequence of the fibrotropic prototype strain.

    PubMed

    Astell, C R; Gardiner, E M; Tattersall, P

    1986-02-01

    The sequence of molecular clones of the genome of MVM(i), a lymphotropic variant of minute virus of mice, was determined and compared with that of MVM(p), the fibrotropic prototype strain. At the nucleotide level there are 163 base changes: 129 transitions and 34 transversions. Most nucleotide changes are silent, with only 27 amino acid changes predicted, of which 22 are conservative. Notable differences between the MVM(i) and MVM(p) genomes which may account for the cell specificities of these viruses occur within the 3' nontranslated regions. The differences discussed include the absence of a 65-base-pair direct repeat in MVM(i), the presence of only two polyadenylation sites in MVM(i) compared with four in MVM(p), and sequences that bear a resemblance to enhancer sequences. Also included in this paper is an important correction to the MVM(p) sequence (C.R. Astell, M. Thomson, M. Merchlinsky, and D. C. Ward, Nucleic Acids Res. 11:999-1018, 1983).
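
    The split of base changes into transitions (purine to purine, or pyrimidine to pyrimidine) and transversions (purine to pyrimidine or the reverse) can be sketched as follows. The function assumes two pre-aligned sequences of equal length and ignores any column with a non-ACGT symbol; its name and input format are illustrative assumptions.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def count_substitutions(ref: str, alt: str) -> tuple:
    """Count (transitions, transversions) between two aligned DNA sequences."""
    if len(ref) != len(alt):
        raise ValueError("sequences must be aligned to equal length")
    transitions = transversions = 0
    for r, a in zip(ref.upper(), alt.upper()):
        if r == a or r not in "ACGT" or a not in "ACGT":
            continue  # identical column, gap, or ambiguity code
        pair = {r, a}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1  # A<->G or C<->T
        else:
            transversions += 1  # purine <-> pyrimidine
    return transitions, transversions
```

    The two counts always sum to the total number of base changes, as in the 129 + 34 = 163 reported above.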

  14. Typing of canine parvovirus isolates using mini-sequencing based single nucleotide polymorphism analysis.

    PubMed

    Naidu, Hariprasad; Subramanian, B Mohana; Chinchkar, Shankar Ramchandra; Sriraman, Rajan; Rana, Samir Kumar; Srinivasan, V A

    2012-05-01

    The antigenic types of canine parvovirus (CPV) are defined based on differences in the amino acids of the major capsid protein VP2. Type specificity is conferred by a limited number of amino acid changes and in particular by few nucleotide substitutions. PCR based methods are not particularly suitable for typing circulating variants which differ in a few specific nucleotide substitutions. Assays for determining SNPs can detect efficiently nucleotide substitutions and can thus be adapted to identify CPV types. In the present study, CPV typing was performed by single nucleotide extension using the mini-sequencing technique. A mini-sequencing signature was established for all the four CPV types (CPV2, 2a, 2b and 2c) and feline panleukopenia virus. The CPV typing using the mini-sequencing reaction was performed for 13 CPV field isolates and the two vaccine strains available in our repository. All the isolates had been typed earlier by full-length sequencing of the VP2 gene. The typing results obtained from mini-sequencing matched completely with that of sequencing. Typing could be achieved with less than 100 copies of standard plasmid DNA constructs or ≤10¹ FAID₅₀ of virus by mini-sequencing technique. The technique was also efficient for detecting multiple types in mixed infections. Copyright © 2012 Elsevier B.V. All rights reserved.

  15. Species specific identification of spore-producing microbes using the gene sequence of small acid-soluble spore coat proteins for amplification based diagnostics

    DOEpatents

    McKinney, Nancy

    2002-01-01

    PCR (polymerase chain reaction) primers for the detection of certain Bacillus species, such as Bacillus anthracis. The primers specifically amplify only DNA found in the target species and can distinguish closely related species. Species-specific PCR primers for Bacillus anthracis, Bacillus globigii and Clostridium perfringens are disclosed. The primers are directed to unique sequences within sasp (small acid soluble protein) genes.

  16. Numeric promoter description - A comparative view on concepts and general application.

    PubMed

    Beier, Rico; Labudde, Dirk

    2016-01-01

    Nucleic acid molecules play a key role in a variety of biological processes. Starting from storage and transfer tasks, this also comprises the triggering of biological processes, regulatory effects and the active influence gained by target binding. Based on the experimental output (in this case promoter sequences), further in silico analyses aid in gaining new insights into these processes and interactions. The numerical description of nucleic acids thereby constitutes a bridge between the concrete biological issues and the analytical methods. Hence, this study compares 26 descriptor sets obtained by applying well-known numerical description concepts to an established dataset of 38 DNA promoter sequences. The suitability of the description sets was evaluated by computing partial least squares regression models and assessing the model accuracy. We conclude that the major importance regarding the descriptive power is attached to positional information rather than to explicitly incorporated physico-chemical information, since a sufficient amount of implicit physico-chemical information is already encoded in the nucleobase classification. The regression models especially benefited from employing the information that is encoded in the sequential and structural neighborhood of the nucleobases. Thus, the analyses of n-grams (short fragments of length n) suggested that they are valuable descriptors for DNA target interactions. A mixed n-gram descriptor set thereby yielded the best description of the promoter sequences. The corresponding regression model was checked and found to be plausible as it was able to reproduce the characteristic binding motifs of promoter sequences in a reasonable degree. As most functional nucleic acids are based on the principle of molecular recognition, the findings are not restricted to promoter sequences, but can rather be transferred to other kinds of functional nucleic acids. 
    Thus, the concepts presented in this study could provide advantages for future nucleic acid-based technologies, like biosensoring, therapeutics and molecular imaging. Copyright © 2015 Elsevier Inc. All rights reserved.
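
    The n-gram descriptors discussed in this abstract (short fragments of length n) can be illustrated with a simple counting sketch. The function names and the choice of sizes 1 to 3 are assumptions; the descriptor sets actually compared in the study may encode additional positional or physico-chemical information.

```python
from collections import Counter


def ngram_counts(seq: str, n: int) -> Counter:
    """Count all overlapping fragments of length n in a nucleotide sequence."""
    seq = seq.upper()
    return Counter(seq[i:i + n] for i in range(len(seq) - n + 1))


def mixed_ngram_descriptor(seq: str, sizes=(1, 2, 3)) -> dict:
    """Combine counts for several n into one descriptor (keys of different length never collide)."""
    descriptor = {}
    for n in sizes:
        descriptor.update(ngram_counts(seq, n))
    return descriptor
```

    For the hexamer TATAAT, `ngram_counts("TATAAT", 2)` counts TA twice, AT twice and AA once.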

  17. Partial nucleotide sequences, and routine typing by polymerase chain reaction-restriction fragment length polymorphism, of the brown trout (Salmo trutta) lactate dehydrogenase, LDH-C1*90 and *100 alleles.

    PubMed

    McMeel, O M; Hoey, E M; Ferguson, A

    2001-01-01

    The cDNA nucleotide sequences of the lactate dehydrogenase alleles LDH-C1*90 and *100 of brown trout (Salmo trutta) were found to differ at position 308 where an A is present in the *100 allele but a G is present in the *90 allele. This base substitution results in an amino acid change from aspartic acid at position 82 in the LDH-C1 100 allozyme to a glycine in the 90 allozyme. Since aspartic acid has a net negative charge whilst glycine is uncharged, this is consistent with the electrophoretic observation that the LDH-C1 100 allozyme has a more anodal mobility relative to the LDH-C1 90 allozyme. Based on alignment of the cDNA sequence with the mouse genomic sequence, a local primer set was designed, incorporating the variable position, and was found to give very good amplification with brown trout genomic DNA. Sequencing of this fragment confirmed the difference in both homozygous and heterozygous individuals. Digestion of the polymerase chain reaction products with BslI, a restriction enzyme specific for the site difference, gave one, two and three fragments for the two homozygotes and the heterozygote, respectively, following electrophoretic separation. This provides a DNA-based means of routine screening of the highly informative LDH-C1* polymorphism in brown trout population genetic studies. Primer sets presented could be used to sequence cDNA of other LDH* genes of brown trout and other species.

  18. Elman RNN based classification of proteins sequences on account of their mutual information.

    PubMed

    Mishra, Pooja; Nath Pandey, Paras

    2012-10-21

    In the present work we have employed the method of estimating residue correlation within protein sequences by using the mutual information (MI) of adjacent residues, based on structural and solvent accessibility properties of amino acids. The long-range correlation between nonadjacent residues is improved by constructing a mutual information vector (MIV) for a single protein sequence; in this way, each protein sequence is associated with its corresponding MIVs. These MIVs are given to an Elman RNN to obtain the classification of protein sequences. The modeling power of MIV was shown to be significantly better, giving a new approach towards alignment-free classification of protein sequences. We also conclude that MIVs based on sequence structural and solvent accessibility properties are better predictors. Copyright © 2012 Elsevier Ltd. All rights reserved.
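
    As an illustration of the mutual information of adjacent residues, the sketch below estimates MI (in bits) from the pairs of neighbouring symbols in one sequence, using the standard definition MI = sum over (x, y) of p(x, y) * log2(p(x, y) / (p(x) * p(y))). It works on whatever symbols it is given; mapping residues to structural or solvent accessibility classes beforehand, as the abstract describes, is left to the caller, and the function name is an assumption.

```python
from collections import Counter
from math import log2


def adjacent_mutual_information(seq: str) -> float:
    """Mutual information (bits) between each symbol and its right-hand neighbour."""
    pairs = list(zip(seq, seq[1:]))
    if not pairs:
        return 0.0
    n = len(pairs)
    joint = Counter(pairs)
    left = Counter(x for x, _ in pairs)   # marginal of the first position
    right = Counter(y for _, y in pairs)  # marginal of the second position
    mi = 0.0
    for (x, y), count in joint.items():
        p_xy = count / n
        mi += p_xy * log2(p_xy / ((left[x] / n) * (right[y] / n)))
    return mi
```

    A strictly alternating sequence gives a high value because each symbol determines its neighbour, whereas a homopolymer gives zero.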

  19. A reduced amino acid alphabet for understanding and designing protein adaptation to mutation.

    PubMed

    Etchebest, C; Benros, C; Bornot, A; Camproux, A-C; de Brevern, A G

    2007-11-01

    The protein sequence world is considerably larger than the structure world. In consequence, numerous unrelated sequences may adopt similar 3D folds, and different kinds of amino acids may thus be found in similar 3D structures. By grouping the 20 amino acids into a smaller number of representative residues with similar features, the sequence world can be simplified. This clustering hence defines a reduced amino acid alphabet (reduced AAA). Numerous works have shown that protein 3D structures are composed of a limited number of building blocks, defining a structural alphabet. We previously identified such an alphabet composed of 16 representative structural motifs (five residues in length) called Protein Blocks (PBs). This alphabet makes it possible to translate the structure (3D) into a sequence of PBs (1D). Based on these two concepts, reduced AAA and PBs, we analyzed the distributions of the different kinds of amino acids and their equivalences in the structural context. Different reduced sets were considered. Recurrent amino acid associations were found in all the local structures, while others were specific to some local structures (PBs) (e.g., Cysteine, Histidine, Threonine and Serine for the alpha-helix Ncap). Some similar associations are found in other reduced AAAs, e.g., Ile with Val, or hydrophobic aromatic residues Trp with Phe and Tyr. We also found interesting alternative associations. This highlights the dependence on the information considered (sequence or structure). This approach, equivalent to a substitution matrix, could be useful for designing protein sequences with different features (for instance adaptation to environment) while largely preserving the 3D fold.

  20. Prediction of glutathionylation sites in proteins using minimal sequence information and their experimental validation.

    PubMed

    Pal, Debojyoti; Sharma, Deepak; Kumar, Mukesh; Sandur, Santosh K

    2016-09-01

    S-glutathionylation of proteins plays an important role in various biological processes and is known to be a protective modification during oxidative stress. Since experimental detection of S-glutathionylation is labor intensive and time consuming, a bioinformatics-based approach is a viable alternative. Available methods require relatively longer sequence information, which may prevent prediction if sequence information is incomplete. Here, we present a model to predict glutathionylation sites from pentapeptide sequences. It is based upon differential association of amino acids with glutathionylated and non-glutathionylated cysteines from a database of experimentally verified sequences. These data were used to calculate position-dependent F-scores, which measure how a particular amino acid at a particular position may affect the likelihood of a glutathionylation event. A glutathionylation score (G-score), indicating the propensity of a sequence to undergo glutathionylation, was calculated using position-dependent F-scores for each amino acid. Cut-off values were used for prediction. Our model returned an accuracy of 58% with a Matthews correlation coefficient (MCC) value of 0.165. On an independent dataset, our model outperformed the currently available model, in spite of needing much less sequence information. Pentapeptide motifs having high abundance among glutathionylated proteins were identified. A list of potential glutathionylation hotspot sequences was obtained by assigning G-scores, and subsequent Protein-BLAST analysis revealed a total of 254 putative glutathionable proteins, a number of which were already known to be glutathionylated. Our model predicted glutathionylation sites in 93.93% of experimentally verified glutathionylated proteins. The outcome of this study may assist in discovering novel glutathionylation sites and finding candidate proteins for glutathionylation.
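
    Accuracy and the Matthews correlation coefficient (MCC) reported above are both derived from the four cells of a binary confusion matrix. The sketch below uses the standard definitions; the function names are illustrative, and returning 0.0 when the MCC denominator is zero is a common convention adopted here as an assumption.

```python
from math import sqrt


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)); ranges from -1 to +1."""
    denominator = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0  # convention when any marginal total is zero
    return (tp * tn - fp * fn) / denominator
```

    Unlike accuracy, MCC stays near zero for a predictor that performs no better than chance, even on unbalanced data, which is why the two figures are usually reported together.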

  1. 37 CFR 1.822 - Symbols and format to be used for nucleotide and/or amino acid sequence data.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... for nucleotide and/or amino acid sequence data. 1.822 Section 1.822 Patents, Trademarks, and... Amino Acid Sequences § 1.822 Symbols and format to be used for nucleotide and/or amino acid sequence data. (a) The symbols and format to be used for nucleotide and/or amino acid sequence data shall...

  2. Identification and characterization of a NBS–LRR class resistance gene analog in Pistacia atlantica subsp. Kurdica

    PubMed Central

    Bahramnejad, Bahman

    2014-01-01

    P. atlantica subsp. Kurdica, with the local name of Baneh, is a wild medicinal plant which grows in Kurdistan, Iran. The identification of resistance gene analogs holds great promise for the development of resistant cultivars. A PCR approach with degenerate primers designed according to conserved NBS-LRR (nucleotide binding site-leucine rich repeat) regions of known disease-resistance (R) genes was used to amplify and clone homologous sequences from P. atlantica subsp. Kurdica. A DNA fragment of the expected 500-bp size was amplified. The nucleotide sequence of this amplicon was obtained through sequencing and the predicted amino acid sequence compared to the amino acid sequences of known R-genes revealed significant sequence similarity. Alignment of the deduced amino acid sequence of P. atlantica subsp. Kurdica resistance gene analog (RGA) showed strong identity, ranging from 68% to 77%, to the non-toll interleukin receptor (non-TIR) R-gene subfamily from other plants. A P-loop motif (GMMGGEGKTT), a conserved and hydrophobic motif GLPLAL, a kinase-2a motif (LLVLDDV), when replaced by IAVFDDI in PAKRGA1 and a kinase-3a (FGPGSRIII) were presented in all RGA. A phylogenetic tree, based on the deduced amino-acid sequences of PAKRGA1 and RGAs from different species indicated that they were separated in two clusters, PAKRGA1 being on cluster II. The isolated NBS analogs can be eventually used as guidelines to isolate numerous R-genes in Pistachio. PMID:27843981

  3. Simplified Identification of mRNA or DNA in Whole Cells

    NASA Technical Reports Server (NTRS)

    Almeida, Eduardo; Kadambi, Geeta

    2007-01-01

    A recently invented method of detecting a selected messenger ribonucleic acid (mRNA) or deoxyribonucleic acid (DNA) sequence offers two important advantages over prior such methods: it is simpler and can be implemented by means of compact equipment. The simplification and miniaturization achieved by this invention are such that this method is suitable for use outside laboratories, in field settings in which space and power supplies may be limited. The present method is based partly on hybridization of nucleic acid, which is a powerful technique for detection of specific complementary nucleic acid sequences and is increasingly being used for detection of changes in gene expression in microarrays containing thousands of gene probes.

  4. A statistical physics perspective on alignment-independent protein sequence comparison.

    PubMed

    Chattopadhyay, Amit K; Nasiev, Diar; Flower, Darren R

    2015-08-01

    Within bioinformatics, the textual alignment of amino acid sequences has long dominated the determination of similarity between proteins, with all that implies for shared structure, function and evolutionary descent. Despite the relative success of modern-day sequence alignment algorithms, so-called alignment-free approaches offer a complementary means of determining and expressing similarity, with potential benefits in certain key applications, such as regression analysis of protein structure-function studies, where alignment-based similarity has performed poorly. Here, we offer a fresh, statistical physics-based perspective focusing on the question of alignment-free comparison, in the process adapting results from 'first passage probability distribution' to summarize statistics of ensemble averaged amino acid propensity values. In this article, we introduce and elaborate this approach. © The Author 2015. Published by Oxford University Press.

  5. Prediction of cis/trans isomerization in proteins using PSI-BLAST profiles and secondary structure information.

    PubMed

    Song, Jiangning; Burrage, Kevin; Yuan, Zheng; Huber, Thomas

    2006-03-09

    The majority of peptide bonds in proteins are found to occur in the trans conformation. However, for proline residues, a considerable fraction of Prolyl peptide bonds adopt the cis form. Proline cis/trans isomerization is known to play a critical role in protein folding, splicing, cell signaling and transmembrane active transport. Accurate prediction of proline cis/trans isomerization in proteins would have many important applications towards the understanding of protein structure and function. In this paper, we propose a new approach to predict the proline cis/trans isomerization in proteins using support vector machine (SVM). The preliminary results indicated that using Radial Basis Function (RBF) kernels could lead to better prediction performance than that of polynomial and linear kernel functions. We used single sequence information of different local window sizes, amino acid compositions of different local sequences, multiple sequence alignment obtained from PSI-BLAST and the secondary structure information predicted by PSIPRED. We explored these different sequence encoding schemes in order to investigate their effects on the prediction performance. The training and testing of this approach was performed on a newly enlarged dataset of 2424 non-homologous proteins determined by X-Ray diffraction method using 5-fold cross-validation. Selecting the window size 11 provided the best performance for determining the proline cis/trans isomerization based on the single amino acid sequence. It was found that using multiple sequence alignments in the form of PSI-BLAST profiles could significantly improve the prediction performance, the prediction accuracy increased from 62.8% with single sequence to 69.8% and Matthews Correlation Coefficient (MCC) improved from 0.26 with single local sequence to 0.40. 
    Furthermore, if coupled with the predicted secondary structure information by PSIPRED, our method yielded a prediction accuracy of 71.5% and MCC of 0.43, 9% and 0.17 higher than the accuracy achieved based on the single sequence information, respectively. A new method has been developed to predict the proline cis/trans isomerization in proteins based on support vector machine, which used the single amino acid sequence with different local window sizes, the amino acid compositions of local sequence flanking centered proline residues, the position-specific scoring matrices (PSSMs) extracted by PSI-BLAST and the predicted secondary structures generated by PSIPRED. The successful application of SVM approach in this study reinforced that SVM is a powerful tool in predicting proline cis/trans isomerization in proteins and biological sequence analysis.
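
    The local-window encoding described in this abstract (for example, a window of size 11 centred on each proline) can be sketched as below. Padding the sequence ends with "X" so that prolines near the termini still yield full-length windows is an assumption, as are the function name and the 0-based positions it returns.

```python
def proline_windows(seq: str, window: int = 11, pad: str = "X") -> list:
    """Return (position, window) pairs for every proline, with the proline at the window centre."""
    if window % 2 == 0:
        raise ValueError("window size must be odd so the proline sits at the centre")
    seq = seq.upper()
    half = window // 2
    padded = pad * half + seq + pad * half  # assumed padding for residues near the termini
    # Residue i of seq sits at index i + half in padded, so its window starts at index i.
    return [(i, padded[i:i + window]) for i, residue in enumerate(seq) if residue == "P"]
```

    Each extracted window could then be encoded (for instance one-hot, or with PSSM columns) before being passed to a classifier.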

  6. Random Amplification and Pyrosequencing for Identification of Novel Viral Genome Sequences

    PubMed Central

    Hang, Jun; Forshey, Brett M.; Kochel, Tadeusz J.; Li, Tao; Solórzano, Víctor Fiestas; Halsey, Eric S.; Kuschner, Robert A.

    2012-01-01

    ssRNA viruses have high levels of genomic divergence, which can lead to difficulty in genomic characterization of new viruses using traditional PCR amplification and sequencing methods. In this study, random reverse transcription, anchored random PCR amplification, and high-throughput pyrosequencing were used to identify orthobunyavirus sequences from total RNA extracted from viral cultures of acute febrile illness specimens. Draft genome sequence for the orthobunyavirus L segment was assembled and sequentially extended using de novo assembly contigs from pyrosequencing reads and orthobunyavirus sequences in GenBank as guidance. Accuracy and continuous coverage were achieved by mapping all reads to the L segment draft sequence. Subsequently, RT-PCR and Sanger sequencing were used to complete the genome sequence. The complete L segment was found to be 6936 bases in length, encoding a 2248-aa putative RNA polymerase. The identified L segment was distinct from previously published South American orthobunyaviruses, sharing 63% and 54% identity at the nucleotide and amino acid level, respectively, with the complete Oropouche virus L segment and 73% and 81% identity at the nucleotide and amino acid level, respectively, with a partial Caraparu virus L segment. The result demonstrated the effectiveness of a sequence-independent amplification and next-generation sequencing approach for obtaining complete viral genomes from total nucleic acid extracts and its use in pathogen discovery. PMID:22468136

  7. Solid phase sequencing of double-stranded nucleic acids

    DOEpatents

    Fu, Dong-Jing; Cantor, Charles R.; Koster, Hubert; Smith, Cassandra L.

    2002-01-01

    This invention relates to methods for detecting and sequencing target double-stranded nucleic acid sequences, to nucleic acid probes and arrays of probes useful in these methods, and to kits and systems which contain these probes. Useful methods involve hybridizing the nucleic acids or nucleic acids which represent complementary or homologous sequences of the target to an array of nucleic acid probes. These probes comprise a single-stranded portion, an optional double-stranded portion and a variable sequence within the single-stranded portion. The molecular weights of the hybridized nucleic acids of the set can be determined by mass spectroscopy, and the sequence of the target determined from the molecular weights of the fragments. Nucleic acids whose sequences can be determined include nucleic acids in biological samples such as patient biopsies and environmental samples. Probes may be fixed to a solid support such as a hybridization chip to facilitate automated determination of molecular weights and identification of the target sequence.

  8. DNA Music.

    ERIC Educational Resources Information Center

    Miner, Carol; della Villa, Paula

    1997-01-01

    Describes an activity in which students reverse-translate proteins from their amino acid sequences back to their DNA sequences then assign musical notes to represent the adenine, guanine, cytosine, and thymine bases. Data is obtained from the National Institutes of Health (NIH) on the Internet. (DDR)
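
    A minimal sketch of the kind of exercise described above. Reverse translation is ambiguous because most amino acids have several codons, so this sketch assumes one arbitrary codon per amino acid; the base-to-note mapping is likewise hypothetical, since the record does not state which notes the activity uses.

```python
# One valid standard-code codon per amino acid (an arbitrary choice; most
# amino acids are encoded by several synonymous codons).
CODON_FOR = {
    "A": "GCT", "R": "CGT", "N": "AAT", "D": "GAT", "C": "TGT",
    "Q": "CAA", "E": "GAA", "G": "GGT", "H": "CAT", "I": "ATT",
    "L": "CTT", "K": "AAA", "M": "ATG", "F": "TTT", "P": "CCT",
    "S": "TCT", "T": "ACT", "W": "TGG", "Y": "TAT", "V": "GTT",
}

# Hypothetical mapping from bases to musical notes. T has no note name of
# its own, so it is assigned E here purely for illustration.
NOTE_FOR = {"A": "A", "G": "G", "C": "C", "T": "E"}


def reverse_translate(protein):
    """Return one possible DNA sequence encoding the protein."""
    return "".join(CODON_FOR[aa] for aa in protein.upper())


def to_notes(dna):
    """Map each base of a DNA sequence to a note name."""
    return [NOTE_FOR[base] for base in dna.upper()]
```

    For example, `to_notes(reverse_translate("MK"))` turns a two-residue peptide into a six-note phrase.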

  9. Genome Sequence of Lactobacillus saerimneri 30a (Formerly Lactobacillus sp. Strain 30a), a Reference Lactic Acid Bacterium Strain Producing Biogenic Amines

    PubMed Central

    Romano, Andrea; Trip, Hein; Campbell-Sills, Hugo; Bouchez, Olivier; Sherman, David; Lolkema, Juke S.

    2013-01-01

    Lactobacillus sp. strain 30a (Lactobacillus saerimneri) produces the biogenic amines histamine, putrescine, and cadaverine by decarboxylating their amino acid precursors. We report its draft genome sequence (1,634,278 bases, 42.6% G+C content) and the principal findings from its annotation, which might shed light onto the enzymatic machineries that are involved in its production of biogenic amines. PMID:23405290
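
    The G+C content quoted above is the percentage of bases in the sequence that are G or C. A minimal sketch of that calculation, assuming the sequence is available as a plain string (the function name is illustrative, and ambiguous bases such as N are simply ignored here):

```python
def gc_content(sequence):
    """Percentage of unambiguous bases (A, C, G, T) that are G or C."""
    seq = sequence.upper()
    gc = sum(seq.count(base) for base in "GC")
    at = sum(seq.count(base) for base in "AT")
    total = gc + at
    return 100.0 * gc / total if total else 0.0
```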

  10. Assessing quality of Medicago sativa silage by monitoring bacterial composition with single molecule, real-time sequencing technology and various physiological parameters

    PubMed Central

    Bao, Weichen; Mi, Zhihui; Xu, Haiyan; Zheng, Yi; Kwok, Lai Yu; Zhang, Heping; Zhang, Wenyi

    2016-01-01

    The present study applied the PacBio single molecule, real-time sequencing technology (SMRT) in evaluating the quality of silage production. Specifically, we produced four types of Medicago sativa silages by using four different lactic acid bacteria-based additives (AD-I, AD-II, AD-III and AD-IV). We monitored the changes in pH, organic acids (including butyric acid, the ratio of acetic acid/lactic acid, γ-aminobutyric acid, 4-hydroxy benzoic acid and phenyl lactic acid), mycotoxins, and bacterial microbiota during silage fermentation. Our results showed that the use of the additives was beneficial to the silage fermentation by enhancing overall pH and mycotoxin reduction, while increasing the organic acid content. By SMRT analysis of the microbial composition in eight silage samples, we found that the number of bacterial species and their relative abundances shifted noticeably after fermentation. Such changes were specific to the LAB species in the additives. In particular, Bacillus megaterium was the initial dominant species in the raw materials; after the fermentation process, Pediococcus acidilactici and Lactobacillus plantarum became the most prevalent species, both of which were intrinsically present in the LAB additives. Our data have demonstrated that the SMRT sequencing platform is applicable in assessing the quality of silage. PMID:27340760

  11. Assessing quality of Medicago sativa silage by monitoring bacterial composition with single molecule, real-time sequencing technology and various physiological parameters.

    PubMed

    Bao, Weichen; Mi, Zhihui; Xu, Haiyan; Zheng, Yi; Kwok, Lai Yu; Zhang, Heping; Zhang, Wenyi

    2016-06-24

    The present study applied the PacBio single molecule, real-time sequencing technology (SMRT) in evaluating the quality of silage production. Specifically, we produced four types of Medicago sativa silages by using four different lactic acid bacteria-based additives (AD-I, AD-II, AD-III and AD-IV). We monitored the changes in pH, organic acids (including butyric acid, the ratio of acetic acid/lactic acid, γ-aminobutyric acid, 4-hydroxy benzoic acid and phenyl lactic acid), mycotoxins, and bacterial microbiota during silage fermentation. Our results showed that the use of the additives was beneficial to the silage fermentation by enhancing overall pH and mycotoxin reduction, while increasing the organic acid content. By SMRT analysis of the microbial composition in eight silage samples, we found that the number of bacterial species and their relative abundances shifted noticeably after fermentation. Such changes were specific to the LAB species in the additives. In particular, Bacillus megaterium was the initial dominant species in the raw materials; after the fermentation process, Pediococcus acidilactici and Lactobacillus plantarum became the most prevalent species, both of which were intrinsically present in the LAB additives. Our data have demonstrated that the SMRT sequencing platform is applicable in assessing the quality of silage.

  12. Statistical potential-based amino acid similarity matrices for aligning distantly related protein sequences.

    PubMed

    Tan, Yen Hock; Huang, He; Kihara, Daisuke

    2006-08-15

    Aligning distantly related protein sequences is a long-standing problem in bioinformatics, and a key for successful protein structure prediction. Its importance has recently increased in the context of structural genomics projects because more and more experimentally solved structures are available as templates for protein structure modeling. Toward this end, recent structure prediction methods employ profile-profile alignments, and various ways of aligning two profiles have been developed. More fundamentally, a better amino acid similarity matrix can improve a profile itself, thereby resulting in more accurate profile-profile alignments. Here we have developed novel amino acid similarity matrices from knowledge-based amino acid contact potentials. Contact potentials are used because the contact propensity to the other amino acids would be one of the most conserved features of each position of a protein structure. The derived amino acid similarity matrices are tested on benchmark alignments at three different levels, namely, the family, the superfamily, and the fold level. Compared to BLOSUM45 and the other existing matrices, the contact potential-based matrices perform comparably in the family level alignments, but clearly outperform them in the fold level alignments. The contact potential-based matrices perform even better when suboptimal alignments are considered. Comparing the matrices themselves with each other revealed that the contact potential-based matrices are very different from BLOSUM45 and the other matrices, indicating that they are located in a different basin in the amino acid similarity matrix space.

  13. CRISPR Spacer Arrays for Detection of Viral Signatures from Acidic Hot Springs

    NASA Astrophysics Data System (ADS)

    Snyder, J. C.; Bateson, M. M.; Suciu, D.; Young, M. J.

    2010-04-01

    Viruses are the most abundant life-like entities on the planet Earth. Using CRISPR spacer sequences, we have developed a microarray-based approach to detecting viral signatures in the acidic hot springs of Yellowstone.

  14. Quantum-Sequencing: Biophysics of quantum tunneling through nucleic acids

    NASA Astrophysics Data System (ADS)

    Casamada Ribot, Josep; Chatterjee, Anushree; Nagpal, Prashant

    2014-03-01

    Tunneling microscopy and spectroscopy have been used extensively in the physical surface sciences to study quantum tunneling, to measure the electronic local density of states of nanomaterials, and to characterize adsorbed species. Quantum-Sequencing (Q-Seq) is a new method based on tunneling microscopy for electronic sequencing of single molecules of nucleic acids. A major goal of third-generation sequencing technologies is to develop a fast, reliable, enzyme-free single-molecule sequencing method. Here, we present the unique "electronic fingerprints" for all nucleotides on DNA and RNA using Q-Seq, along with their intrinsic biophysical parameters. We have analyzed tunneling spectra for the nucleotides at different pH conditions and analyzed the HOMO, LUMO and energy gap for all of them. In addition, we show a number of biophysical parameters to further characterize all nucleobases (electron and hole transition voltage and energy barriers). These results highlight the robustness of Q-Seq as a technique for next-generation sequencing.

  15. Terminal region sequence variations in variola virus DNA.

    PubMed

    Massung, R F; Loparev, V N; Knight, J C; Totmenin, A V; Chizhikov, V E; Parsons, J M; Safronov, P F; Gutorov, V V; Shchelkunov, S N; Esposito, J J

    1996-07-15

    Genome DNA terminal region sequences were determined for a Brazilian alastrim variola minor virus strain Garcia-1966 that was associated with a 0.8% case-fatality rate and African smallpox strains Congo-1970 and Somalia-1977 associated with variola major (9.6%) and minor (0.4%) mortality rates, respectively. A base sequence identity of ≥ 98.8% was determined after aligning 30 kb of the left- or right-end region sequences with cognate sequences previously determined for Asian variola major strains India-1967 (31% death rate) and Bangladesh-1975 (18.5% death rate). The deduced amino acid sequences of putative proteins of ≥ 65 amino acids also showed relatively high identity, although the Asian and African viruses were clearly more related to each other than to alastrim virus. Alastrim virus contained only 10 of 70 proteins that were 100% identical to homologs in Asian strains, and 7 alastrim-specific proteins were noted.

  16. A knowledge engineering approach to recognizing and extracting sequences of nucleic acids from scientific literature.

    PubMed

    García-Remesal, Miguel; Maojo, Victor; Crespo, José

    2010-01-01

    In this paper we present a knowledge engineering approach to automatically recognize and extract genetic sequences from scientific articles. To carry out this task, we use a preliminary recognizer based on a finite state machine to extract all candidate DNA/RNA sequences. The latter are then fed into a knowledge-based system that automatically discards false positives and refines noisy and incorrectly merged sequences. We created the knowledge base by manually analyzing different manuscripts containing genetic sequences. Our approach was evaluated using a test set of 211 full-text articles in PDF format containing 3134 genetic sequences. For such set, we achieved 87.76% precision and 97.70% recall respectively. This method can facilitate different research tasks. These include text mining, information extraction, and information retrieval research dealing with large collections of documents containing genetic sequences.
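
    The first stage described above, a finite-state recognizer that proposes candidate sequences, can be sketched with a regular expression, which compiles to an equivalent automaton. This is only an illustration under stated assumptions: the minimum length of 10 is a hypothetical threshold, the function names are invented here, and the paper's knowledge-based filtering stage is not reproduced.

```python
import re

# Hypothetical threshold: runs shorter than this are not treated as candidates.
MIN_LENGTH = 10

# A run of nucleotide letters (DNA or RNA alphabet, either case).
CANDIDATE = re.compile(r"[ACGTUacgtu]{%d,}" % MIN_LENGTH)


def extract_candidates(text):
    """Return every run of nucleotide letters of at least MIN_LENGTH."""
    return [match.group(0).upper() for match in CANDIDATE.finditer(text)]


def precision_recall(predicted, gold):
    """Precision and recall of predicted sequences against a gold set."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall
```

    A recognizer this permissive favors recall over precision, which is why a second stage that discards false positives, as the abstract describes, is needed.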

  17. Solid phase sequencing of biopolymers

    DOEpatents

    Cantor, Charles; Koster, Hubert

    2010-09-28

    This invention relates to methods for detecting and sequencing target nucleic acid sequences, to mass modified nucleic acid probes and arrays of probes useful in these methods, and to kits and systems which contain these probes. Useful methods involve hybridizing the nucleic acids or nucleic acids which represent complementary or homologous sequences of the target to an array of nucleic acid probes. These probes comprise a single-stranded portion, an optional double-stranded portion and a variable sequence within the single-stranded portion. The molecular weights of the hybridized nucleic acids of the set can be determined by mass spectroscopy, and the sequence of the target determined from the molecular weights of the fragments. Nucleic acids whose sequences can be determined include DNA or RNA in biological samples such as patient biopsies and environmental samples. Probes may be fixed to a solid support such as a hybridization chip to facilitate automated molecular weight analysis and identification of the target sequence.

  18. Depletion of Unwanted Nucleic Acid Templates by Selective Cleavage: LNAzymes, Catalytically Active Oligonucleotides Containing Locked Nucleic Acids, Open a New Window for Detecting Rare Microbial Community Members

    PubMed Central

    Dolinšek, Jan; Dorninger, Christiane; Lagkouvardos, Ilias; Wagner, Michael

    2013-01-01

    Many studies of molecular microbial ecology rely on the characterization of microbial communities by PCR amplification, cloning, sequencing, and phylogenetic analysis of genes encoding rRNAs or functional marker enzymes. However, if the established clone libraries are dominated by one or a few sequence types, the cloned diversity is difficult to analyze by random clone sequencing. Here we present a novel approach to deplete unwanted sequence types from complex nucleic acid mixtures prior to cloning and downstream analyses. It employs catalytically active oligonucleotides containing locked nucleic acids (LNAzymes) for the specific cleavage of selected RNA targets. When combined with in vitro transcription and reverse transcriptase PCR, this LNAzyme-based technique can be used with DNA or RNA extracts from microbial communities. The simultaneous application of more than one specific LNAzyme allows the concurrent depletion of different sequence types from the same nucleic acid preparation. This new method was evaluated with defined mixtures of cloned 16S rRNA genes and then used to identify accompanying bacteria in an enrichment culture dominated by the nitrite oxidizer “Candidatus Nitrospira defluvii.” In silico analysis revealed that the majority of publicly deposited rRNA-targeted oligonucleotide probes may be used as specific LNAzymes with no or only minor sequence modifications. This efficient and cost-effective approach will greatly facilitate tasks such as the identification of microbial symbionts in nucleic acid preparations dominated by plastid or mitochondrial rRNA genes from eukaryotic hosts, the detection of contaminants in microbial cultures, and the analysis of rare organisms in microbial communities of highly uneven composition. PMID:23263968

  19. Isolation and characterization of the chicken trypsinogen gene family.

    PubMed Central

    Wang, K; Gan, L; Lee, I; Hood, L

    1995-01-01

    Based on genomic Southern hybridizations and cDNA sequence analyses, the chicken trypsinogen gene family can be divided into two multi-member subfamilies, a six-member trypsinogen I subfamily which encodes the cationic trypsin isoenzymes and a three-member trypsinogen II subfamily which encodes the anionic trypsin isoenzymes. The chicken cDNA and genomic clones containing these two subfamilies were isolated and characterized by DNA sequence analysis. The results indicated that the chicken trypsinogen genes encoded a signal peptide of 15 to 16 amino acid residues, an activation peptide of 9 to 10 residues and a trypsin of 223 amino acid residues. The chicken trypsinogens contain all the common catalytic and structural features for trypsins, including the catalytic triad His, Asp and Ser and the six disulphide bonds. The trypsinogen I and II subfamilies share approximately 70% sequence identity at the nucleotide and amino acid level. The sequence comparison among chicken trypsinogen subfamily members and trypsin sequences from other species suggested that the chicken trypsinogen genes may have evolved in coincidental or concerted fashion. PMID:7733885

  20. CODEHOP (COnsensus-DEgenerate Hybrid Oligonucleotide Primer) PCR primer design

    PubMed Central

    Rose, Timothy M.; Henikoff, Jorja G.; Henikoff, Steven

    2003-01-01

    We have developed a new primer design strategy for PCR amplification of distantly related gene sequences based on consensus-degenerate hybrid oligonucleotide primers (CODEHOPs). An interactive program has been written to design CODEHOP PCR primers from conserved blocks of amino acids within multiply-aligned protein sequences. Each CODEHOP consists of a pool of related primers containing all possible nucleotide sequences encoding 3–4 highly conserved amino acids within a 3′ degenerate core. A longer 5′ non-degenerate clamp region contains the most probable nucleotide predicted for each flanking codon. CODEHOPs are used in PCR amplification to isolate distantly related sequences encoding the conserved amino acid sequence. The primer design software and the CODEHOP PCR strategy have been utilized for the identification and characterization of new gene orthologs and paralogs in different plant, animal and bacterial species. In addition, this approach has been successful in identifying new pathogen species. The CODEHOP designer (http://blocks.fhcrc.org/codehop.html) is linked to BlockMaker and the Multiple Alignment Processor within the Blocks Database World Wide Web (http://blocks.fhcrc.org). PMID:12824413
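
    Because the 3′ core of a CODEHOP contains every nucleotide sequence that can encode the chosen amino acids, the size of the primer pool grows with the codon degeneracy of those residues. The sketch below estimates that pool size from the standard genetic code, counting whole codons for each residue. It is an illustration only and does not reproduce the CODEHOP program's own design rules or scoring; the function name is hypothetical.

```python
# Number of synonymous codons per amino acid in the standard genetic code.
CODON_COUNT = {
    "L": 6, "S": 6, "R": 6,
    "A": 4, "G": 4, "P": 4, "T": 4, "V": 4,
    "I": 3,
    "C": 2, "D": 2, "E": 2, "F": 2, "H": 2,
    "K": 2, "N": 2, "Q": 2, "Y": 2,
    "M": 1, "W": 1,
}


def core_degeneracy(amino_acids):
    """Number of distinct nucleotide sequences encoding the given residues."""
    degeneracy = 1
    for residue in amino_acids.upper():
        degeneracy *= CODON_COUNT[residue]
    return degeneracy
```

    Under this count, a core such as W-D-K needs a pool of only 4 sequences, whereas L-S-R needs 216, which illustrates why the choice of conserved residues for the core matters.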

  1. Haloarcula hispanica CRISPR authenticates PAM of a target sequence to prime discriminative adaptation

    PubMed Central

    Li, Ming; Wang, Rui; Xiang, Hua

    2014-01-01

    The prokaryotic immune system CRISPR/Cas (Clustered Regularly Interspaced Short Palindromic Repeats/CRISPR-associated genes) adapts to foreign invaders by acquiring their short deoxyribonucleic acid (DNA) fragments as spacers, which guide subsequent interference to foreign nucleic acids based on sequence matching. The adaptation mechanism avoiding acquiring ‘self’ DNA fragments is poorly understood. In Haloarcula hispanica, we previously showed that CRISPR adaptation requires being primed by a pre-existing spacer partially matching the invader DNA. Here, we further demonstrate that flanking a fully-matched target sequence, a functional PAM (protospacer adjacent motif) is still required to prime adaptation. Interestingly, interference utilizes only four PAM sequences, whereas adaptation-priming tolerates as many as 23 PAM sequences. This relaxed PAM selectivity explains how adaptation-priming maximizes its tolerance of PAM mutations (that escape interference) while avoiding mis-targeting the spacer DNA within CRISPR locus. We propose that the primed adaptation, which hitches and cooperates with the interference pathway, distinguishes target from non-target by CRISPR ribonucleic acid guidance and PAM recognition. PMID:24803673

  2. 42 CFR 73.1 - Definitions.

    Code of Federal Regulations, 2014 CFR

    2014-10-01

    ... otherwise modified but can base pair with naturally occurring nucleic acid molecules (i.e., synthetic... conotoxins containing the following amino acid sequence X1CCX2PACGX3X4X5X6CX7, whereas: (1) C = Cysteine... well as α-GIA, Ac1.1a, α-CnIA, α-CnIB; (3) X1 = any amino acid(s) or Des-X; (4) X2 = Asparagine or...

  3. 42 CFR 73.1 - Definitions.

    Code of Federal Regulations, 2013 CFR

    2013-10-01

    ... otherwise modified but can base pair with naturally occurring nucleic acid molecules (i.e., synthetic... conotoxins containing the following amino acid sequence X1CCX2PACGX3X4X5X6CX7, whereas: (1) C = Cysteine... well as α-GIA, Ac1.1a, α-CnIA, α-CnIB; (3) X1 = any amino acid(s) or Des-X; (4) X2 = Asparagine or...

  4. Characterization of the complete mitochondrial genome of Marshallagia marshalli and phylogenetic implications for the superfamily Trichostrongyloidea.

    PubMed

    Sun, Miao-Miao; Han, Liang; Zhang, Fu-Kai; Zhou, Dong-Hui; Wang, Shu-Qing; Ma, Jun; Zhu, Xing-Quan; Liu, Guo-Hua

    2018-01-01

    Marshallagia marshalli (Nematoda: Trichostrongylidae) infection can lead to serious parasitic gastroenteritis in sheep, goats, and wild ruminants, causing significant socioeconomic losses worldwide. To date, studies of the molecular biology of M. marshalli have been limited. Herein, we sequenced the complete mitochondrial (mt) genome of M. marshalli and examined its phylogenetic relationship with selected members of the superfamily Trichostrongyloidea using Bayesian inference (BI) based on concatenated mt amino acid sequence datasets. The complete mt genome sequence of M. marshalli is 13,891 bp, including 12 protein-coding genes, 22 transfer RNA genes, and 2 ribosomal RNA genes. All protein-coding genes are transcribed in the same direction. Phylogenetic analyses based on concatenated amino acid sequences of the 12 protein-coding genes supported the monophylies of the families Haemonchidae, Molineidae, and Dictyocaulidae with strong statistical support, but rejected the monophyly of the family Trichostrongylidae. The determination of the complete mt genome sequence of M. marshalli provides novel genetic markers for studying the systematics, population genetics, and molecular epidemiology of M. marshalli and its congeners.

  5. Sequences of heavy and light chain variable regions from four bovine immunoglobulins.

    PubMed

    Armour, K L; Tempest, P R; Fawcett, P H; Fernie, M L; King, S I; White, P; Taylor, G; Harris, W J

    1994-12-01

    Oligodeoxyribonucleotide primers based on the 5' ends of bovine IgG1/2 and lambda constant (C) region genes, together with primers encoding conserved amino acids at the N-terminus of mature variable (V) regions from other species, have been used in cDNA and polymerase chain reactions (PCRs) to amplify heavy and light chain V region cDNA from bovine heterohybridomas. The amino acid sequences of VH and V lambda from four bovine immunoglobulins of different specificities are presented.

  6. Analyses of mitochondrial amino acid sequence datasets support the proposal that specimens of Hypodontus macropi from three species of macropodid hosts represent distinct species

    PubMed Central

    2013-01-01

    Background: Hypodontus macropi is a common intestinal nematode of a range of kangaroos and wallabies (macropodid marsupials). Based on previous multilocus enzyme electrophoresis (MEE) and nuclear ribosomal DNA sequence data sets, H. macropi has been proposed to be a complex of species. To test this proposal using independent molecular data, we sequenced the whole mitochondrial (mt) genomes of individuals of H. macropi from three different species of hosts (Macropus robustus robustus, Thylogale billardierii and Macropus [Wallabia] bicolor) as well as that of Macropicola ocydromi (a related nematode), and undertook a comparative analysis of the amino acid sequence datasets derived from these genomes. Results: The mt genomes sequenced by next-generation (454) technology from H. macropi from the three host species varied from 13,634 bp to 13,699 bp in size. Pairwise comparisons of the amino acid sequences predicted from these three mt genomes revealed differences of 5.8% to 18%. Phylogenetic analysis of the amino acid sequence data sets using Bayesian Inference (BI) showed that H. macropi from the three different host species formed distinct, well-supported clades. In addition, sliding window analysis of the mt genomes defined variable regions for future population genetic studies of H. macropi in different macropodid hosts and geographical regions around Australia. Conclusions: The present analyses of inferred mt protein sequence datasets clearly supported the hypothesis that H. macropi from M. robustus robustus, M. bicolor and T. billardierii represent distinct species. PMID:24261823
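
    The pairwise differences reported above are percentages of aligned amino acid positions that differ between two sequences. A minimal sketch of such a comparison, assuming the two sequences are already aligned to equal length and that gapped columns are skipped (the record does not say how the authors treated gaps; the function name is illustrative):

```python
def percent_difference(seq_a, seq_b, gap="-"):
    """Percentage of ungapped aligned positions at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue  # skip columns where either sequence has a gap
        compared += 1
        if a != b:
            differing += 1
    return 100.0 * differing / compared if compared else 0.0
```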

  7. Four distinct types of E.C. 1.2.1.30 enzymes can catalyze the reduction of carboxylic acids to aldehydes.

    PubMed

    Stolterfoht, Holly; Schwendenwein, Daniel; Sensen, Christoph W; Rudroff, Florian; Winkler, Margit

    2017-09-10

    Increasing demand for chemicals from renewable resources calls for the development of new biotechnological methods for the reduction of oxidized bio-based compounds. Enzymatic carboxylate reduction is highly selective, both in terms of chemo- and product selectivity, but not many carboxylate reductase enzymes (CARs) have been identified on the sequence level to date. Thus far, their phylogeny is unexplored and very little is known about their structure-function-relationship. CARs minimally contain an adenylation domain, a phosphopantetheinylation domain and a reductase domain. We have recently identified new enzymes of fungal origin, using similarity searches against genomic sequences from organisms in which aldehydes were detected upon incubation with carboxylic acids. Analysis of sequences with known CAR functionality and CAR enzymes recently identified in our laboratory suggests that the three-domain architecture mentioned above is modular. The construction of a distance tree with a subsequent 1000-replicate bootstrap analysis showed that the CAR sequences included in our study fall into four distinct subgroups (one of bacterial origin and three of fungal origin, respectively), each with a bootstrap value of 100%. The multiple sequence alignment of all experimentally confirmed CAR protein sequences revealed fingerprint sequences of residues which are likely to be involved in substrate and co-substrate binding and one of the three catalytic substeps, respectively. The fingerprint sequences broaden our understanding of the amino acids that might be essential for the reduction of organic acids to the corresponding aldehydes in CAR proteins. Copyright © 2017 Elsevier B.V. All rights reserved.

  8. Sequence-based analysis of the microbial composition of water kefir from multiple sources.

    PubMed

    Marsh, Alan J; O'Sullivan, Orla; Hill, Colin; Ross, R Paul; Cotter, Paul D

    2013-11-01

    Water kefir is a water-sucrose-based beverage, fermented by a symbiosis of bacteria and yeast to produce a final product that is lightly carbonated, acidic and that has a low alcohol percentage. The microorganisms present in water kefir are introduced via water kefir grains, which consist of a polysaccharide matrix in which the microorganisms are embedded. We aimed to provide a comprehensive sequencing-based analysis of the bacterial population of water kefir beverages and grains, while providing an initial insight into the corresponding fungal population. To facilitate this objective, four water kefirs were sourced from the UK, Canada and the United States. Culture-independent, high-throughput, sequencing-based analyses revealed that the bacterial fraction of each water kefir and grain was dominated by Zymomonas, an ethanol-producing bacterium, which has not previously been detected at such a scale. The other genera detected were representatives of the lactic acid bacteria and acetic acid bacteria. Our analysis of the fungal component established that it was comprised of the genera Dekkera, Hanseniaspora, Saccharomyces, Zygosaccharomyces, Torulaspora and Lachancea. This information will assist in the ultimate identification of the microorganisms responsible for the potentially health-promoting attributes of these beverages. © 2013 Federation of European Microbiological Societies. Published by John Wiley & Sons Ltd. All rights reserved.

  9. Statistical theory for protein combinatorial libraries. Packing interactions, backbone flexibility, and the sequence variability of a main-chain structure.

    PubMed

    Kono, H; Saven, J G

    2001-02-23

    Combinatorial experiments provide new ways to probe the determinants of protein folding and to identify novel folding amino acid sequences. These types of experiments, however, are complicated both by enormous conformational complexity and by large numbers of possible sequences. Therefore, a quantitative computational theory would be helpful in designing and interpreting these types of experiments. Here, we present and apply a statistically based, computational approach for identifying the properties of sequences compatible with a given main-chain structure. Protein side-chain conformations are included in an atom-based fashion. Calculations are performed for a variety of similar backbone structures to identify sequence properties that are robust with respect to minor changes in main-chain structure. Rather than specific sequences, the method yields the likelihood of each of the amino acids at preselected positions in a given protein structure. The theory may be used to quantify the characteristics of sequence space for a chosen structure without explicitly tabulating sequences. To account for hydrophobic effects, we introduce an environmental energy that is consistent with other simple hydrophobicity scales and show that it is effective for side-chain modeling. We apply the method to calculate the identity probabilities of selected positions of the immunoglobulin light chain-binding domain of protein L, for which many variant folding sequences are available. The calculations compare favorably with the experimentally observed identity probabilities.

  10. Genotype-specific signal generation based on digestion of 3-way DNA junctions: application to KRAS variation detection.

    PubMed

    Amicarelli, Giulia; Adlerstein, Daniel; Shehi, Erlet; Wang, Fengfei; Makrigiorgos, G Mike

    2006-10-01

    Genotyping methods that reveal single-nucleotide differences are useful for a wide range of applications. We used digestion of 3-way DNA junctions in a novel technology, OneCutEventAmplificatioN (OCEAN) that allows sequence-specific signal generation and amplification. We combined OCEAN with peptide-nucleic-acid (PNA)-based variant enrichment to detect and simultaneously genotype v-Ki-ras2 Kirsten rat sarcoma viral oncogene homolog (KRAS) codon 12 sequence variants in human tissue specimens. We analyzed KRAS codon 12 sequence variants in 106 lung cancer surgical specimens. We conducted a PNA-PCR reaction that suppresses wild-type KRAS amplification and genotyped the product with a set of OCEAN reactions carried out in fluorescence microplate format. The isothermal OCEAN assay enabled a 3-way DNA junction to form between the specific target nucleic acid, a fluorescently labeled "amplifier", and an "anchor". The amplifier-anchor contact contains the recognition site for a restriction enzyme. Digestion produces a cleaved amplifier and generation of a fluorescent signal. The cleaved amplifier dissociates from the 3-way DNA junction, allowing a new amplifier to bind and propagate the reaction. The system detected and genotyped KRAS sequence variants down to approximately 0.3% variant-to-wild-type alleles. PNA-PCR/OCEAN had a concordance rate with PNA-PCR/sequencing of 93% to 98%, depending on the exact implementation. Concordance rate with restriction endonuclease-mediated selective-PCR/sequencing was 89%. OCEAN is a practical and low-cost novel technology for sequence-specific signal generation. Reliable analysis of KRAS sequence alterations in human specimens circumvents the requirement for sequencing. Application is expected in genotyping KRAS codon 12 sequence variants in surgical specimens or in bodily fluids, as well as single-base variations and sequence alterations in other genes.

  11. GASP: Gapped Ancestral Sequence Prediction for proteins

    PubMed Central

    Edwards, Richard J; Shields, Denis C

    2004-01-01

    Background: The prediction of ancestral protein sequences from multiple sequence alignments is useful for many bioinformatics analyses. Predicting ancestral sequences is not a simple procedure and relies on accurate alignments and phylogenies. Several algorithms exist based on Maximum Parsimony or Maximum Likelihood methods but many current implementations are unable to process residues with gaps, which may represent insertion/deletion (indel) events or sequence fragments. Results: Here we present a new algorithm, GASP (Gapped Ancestral Sequence Prediction), for predicting ancestral sequences from phylogenetic trees and the corresponding multiple sequence alignments. Alignments may be of any size and contain gaps. GASP first assigns the positions of gaps in the phylogeny before using a likelihood-based approach centred on amino acid substitution matrices to assign ancestral amino acids. Important outgroup information is used by first working down from the tips of the tree to the root, using descendant data only to assign probabilities, and then working back up from the root to the tips using descendant and outgroup data to make predictions. GASP was tested on a number of simulated datasets based on real phylogenies. Prediction accuracy for ungapped data was similar to three alternative algorithms tested, with GASP performing better in some cases and worse in others. Adding simple insertions and deletions to the simulated data did not have a detrimental effect on GASP accuracy. Conclusions: GASP (Gapped Ancestral Sequence Prediction) will predict ancestral sequences from multiple protein alignments of any size. Although not as accurate in all cases as some of the more sophisticated maximum likelihood approaches, it can process a wide range of input phylogenies and will predict ancestral sequences for gapped and ungapped residues alike. PMID:15350199

  12. Design and preparation of beta-sheet forming repetitive and block-copolymerized polypeptides.

    PubMed

    Higashiya, Seiichiro; Topilina, Natalya I; Ngo, Silvana C; Zagorevskii, Dmitri; Welch, John T

    2007-05-01

    The design and rapid construction of libraries of genes coding beta-sheet forming repetitive and block-copolymerized polypeptides bearing various C- and N-terminal sequences are described. The design was based on the assembly of DNA cassettes coding for the (GA)3GX amino acid sequence where the (GAGAGA) sequences would constitute the beta-strand units of a larger beta-sheet assembly. The edges of this beta-sheet would be functionalized by the turn-inducing amino acids (GX). The polypeptides were expressed in Escherichia coli using conventional vectors and were purified by Ni-nitriloacetic acid (NTA) chromatography. The correlation of polymer structure with molecular weight was investigated by gel electrophoresis and mass spectrometry. The monomer sequences and post-translational chemical modifications were found to influence the mobility of the polypeptides over the full range of polypeptide molecular weights while the electrophoretic mobility of lower molecular weight polypeptides was more susceptible to C- and N-termini polypeptide modifications.

  13. Pseudomonas sp. strain CA5 (a selenite-reducing bacterium) 16S rRNA gene complete sequence. National Institutes of Health, National Center for Biotechnology Information, GenBank sequence. Accession FJ422810.1.

    USDA-ARS's Scientific Manuscript database

    This study used 1321 base pair 16S rRNA gene sequence methods to confirm the phylogenetic position of a soil isolate as a bacterium belonging to the genus Pseudomonas. Morphological and biochemical characteristics and fatty acid profiles are consistent with the 16S rRNA gene sequence identification...

  14. Cloning and characterization of an abalone (Haliotis discus hannai) actin gene

    NASA Astrophysics Data System (ADS)

    Ma, Hongming; Xu, Wei; Mai, Kangsen; Liufu, Zhiguo; Chen, Hong

    2004-10-01

    An actin-encoding gene was cloned by using RT-PCR, 3′ RACE and 5′ RACE from abalone Haliotis discus hannai. The full length of the gene is 1532 base pairs, which contains a long 3′ untranslated region of 307 base pairs and 79 base pairs of 5′ untranslated sequence. The open reading frame encodes 376 amino acid residues. Sequence comparison with those of human and other mollusks showed high conservation among species at the amino acid level. The identities were 96%, 97% and 96%, respectively, compared with Aplysia californica, Biomphalaria glabrata and Homo sapiens β-actin. The comparison also indicates that this actin is more similar to the human cytoplasmic actin (β-actin) than to human muscle actin.
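
    The percent identities quoted above are the kind of figure obtained by counting matching columns in a pairwise alignment. The sketch below is only an illustration under stated assumptions: the function name, the gap character and the toy sequences are hypothetical, gapped columns are excluded from the denominator (other conventions exist), and nothing here reproduces the authors' actual comparison.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of identical residues over the ungapped columns of an alignment.

    Both inputs must already be aligned (equal length). Columns where either
    sequence has a gap are skipped; this is one convention among several.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap or y == gap:
            continue  # ignore gapped columns
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy alignment (hypothetical, not actin): 4 matches over 5 ungapped columns.
toy_identity = percent_identity("MKT-AY", "MKSLAY")
```

    Whether gaps count as mismatches or are skipped changes the reported value, so the convention should be stated alongside any identity figure.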

  15. 37 CFR 1.823 - Requirements for nucleotide and/or amino acid sequences as part of the application.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... and/or amino acid sequences as part of the application. 1.823 Section 1.823 Patents, Trademarks, and... Amino Acid Sequences § 1.823 Requirements for nucleotide and/or amino acid sequences as part of the... incorporation-by-reference of the Sequence Listing as required by § 1.52(e)(5). The presentation of the...

  16. 37 CFR 1.823 - Requirements for nucleotide and/or amino acid sequences as part of the application.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... and/or amino acid sequences as part of the application. 1.823 Section 1.823 Patents, Trademarks, and... Amino Acid Sequences § 1.823 Requirements for nucleotide and/or amino acid sequences as part of the... incorporation-by-reference of the Sequence Listing as required by § 1.52(e)(5). The presentation of the...

  17. 37 CFR 1.823 - Requirements for nucleotide and/or amino acid sequences as part of the application.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... and/or amino acid sequences as part of the application. 1.823 Section 1.823 Patents, Trademarks, and... Amino Acid Sequences § 1.823 Requirements for nucleotide and/or amino acid sequences as part of the... incorporation-by-reference of the Sequence Listing as required by § 1.52(e)(5). The presentation of the...

  18. 37 CFR 1.823 - Requirements for nucleotide and/or amino acid sequences as part of the application.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... and/or amino acid sequences as part of the application. 1.823 Section 1.823 Patents, Trademarks, and... Amino Acid Sequences § 1.823 Requirements for nucleotide and/or amino acid sequences as part of the... incorporation-by-reference of the Sequence Listing as required by § 1.52(e)(5). The presentation of the...

  19. 37 CFR 1.823 - Requirements for nucleotide and/or amino acid sequences as part of the application.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... and/or amino acid sequences as part of the application. 1.823 Section 1.823 Patents, Trademarks, and... Amino Acid Sequences § 1.823 Requirements for nucleotide and/or amino acid sequences as part of the... incorporation-by-reference of the Sequence Listing as required by § 1.52(e)(5). The presentation of the...

  20. Evolution-Based Functional Decomposition of Proteins

    PubMed Central

    Rivoire, Olivier; Reynolds, Kimberly A.; Ranganathan, Rama

    2016-01-01

    The essential biological properties of proteins—folding, biochemical activities, and the capacity to adapt—arise from the global pattern of interactions between amino acid residues. The statistical coupling analysis (SCA) is an approach to defining this pattern that involves the study of amino acid coevolution in an ensemble of sequences comprising a protein family. This approach indicates a functional architecture within proteins in which the basic units are coupled networks of amino acids termed sectors. This evolution-based decomposition has potential for new understandings of the structural basis for protein function. To facilitate its usage, we present here the principles and practice of the SCA and introduce new methods for sector analysis in a python-based software package (pySCA). We show that the pattern of amino acid interactions within sectors is linked to the divergence of functional lineages in a multiple sequence alignment—a model for how sector properties might be differentially tuned in members of a protein family. This work provides new tools for studying proteins and for generally testing the concept of sectors as the principal units of function and adaptive variation. PMID:27254668

  1. Autonomous replication of nucleic acids by polymerization/nicking enzyme/DNAzyme cascades for the amplified detection of DNA and the aptamer-cocaine complex.

    PubMed

    Wang, Fuan; Freage, Lina; Orbach, Ron; Willner, Itamar

    2013-09-03

    The progressive development of amplified DNA sensors and aptasensors using replication/nicking enzymes/DNAzyme machineries is described. The sensing platforms are based on the tailoring of a DNA template on which the recognition of the target DNA or the formation of the aptamer-substrate complex triggers the autonomous isothermal replication/nicking processes and the displacement of a Mg(2+)-dependent DNAzyme that catalyzes the generation of a fluorophore-labeled nucleic acid acting as the readout signal for the analyses. Three different DNA sensing configurations are described, where in the ultimate configuration the target sequence is incorporated into a nucleic acid blocker structure associated with the sensing template. The target-triggered isothermal autonomous replication/nicking process on the modified template results in the formation of the Mg(2+)-dependent DNAzyme tethered to a free strand consisting of the target sequence. This activates additional template units for the nucleic acid self-replication process, resulting in the ultrasensitive detection of the target DNA (detection limit 1 aM). Similarly, amplified aptamer-based sensing platforms for cocaine are developed along these concepts. The modification of the cocaine-detection template by the addition of a nucleic acid sequence that enables the autonomous secondary coupled activation of a polymerization/nicking machinery and DNAzyme generation path leads to an improved analysis of cocaine (detection limit 10 nM).

  2. Identification, Classification, and Phylogeny of the Pathogenic Species Exophiala jeanselmei and Related Species by Mitochondrial Cytochrome b Gene Analysis

    PubMed Central

    Wang, Li; Yokoyama, Koji; Miyaji, Makoto; Nishimura, Kazuko

    2001-01-01

    We analyzed a 402-bp sequence of the mitochondrial cytochrome b gene of 34 strains of Exophiala jeanselmei and 16 strains representing 12 related species. The strains of E. jeanselmei were classified into 20 DNA types and 17 amino acid types. The differences between these strains were found in 1 to 60 nucleotides and 1 to 17 amino acids. On the basis of the identities and similarities of nucleotide and amino acid sequences, some strains were reidentified: i.e., two strains of E. jeanselmei var. heteromorpha and one strain of E. castellanii as E. dermatitidis (including the type strain), three strains of E. jeanselmei as E. jeanselmei var. lecanii-corni (including the type strain), three strains of E. jeanselmei as E. bergeri (including the type strain), seven strains of E. jeanselmei as E. pisciphila (including the type strain), seven strains of E. jeanselmei as E. jeanselmei var. jeanselmei (including the type strain), one strain of E. jeanselmei as Fonsecaea pedrosoi (including the type strain), and one strain of E. jeanselmei as E. spinifera (including the type strain). Some E. jeanselmei strains showed distinct nucleotide and amino acid sequences. The amino-acid-based UPGMA (unweighted pair group method with the arithmetic mean) tree exhibited nearly the same topology as those of the DNA-based trees obtained by neighbor joining, maximum parsimony, and maximum likelihood methods. PMID:11724862
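
    For readers unfamiliar with UPGMA, the clustering step named in the abstract can be sketched as follows. This is a minimal illustration, not the software the authors used: the function name, the input layout and the four-taxon distance matrix are hypothetical, branch lengths are omitted, and ties are broken by label order.

```python
from itertools import combinations


def upgma(names, dist):
    """Return a Newick-like topology string built by UPGMA.

    names: list of taxon labels.
    dist:  dict mapping a pair (a, b) of labels to their distance.
    """
    sizes = {name: 1 for name in names}  # leaves per active cluster
    d = {frozenset(pair): value for pair, value in dist.items()}
    while len(sizes) > 1:
        # Join the two closest active clusters.
        a, b = min(combinations(sorted(sizes), 2),
                   key=lambda pair: d[frozenset(pair)])
        merged = f"({a},{b})"
        na, nb = sizes.pop(a), sizes.pop(b)
        # Distance to the new cluster is the size-weighted mean, which equals
        # the arithmetic mean over all leaf pairs (the defining UPGMA rule).
        for other in sizes:
            d[frozenset((merged, other))] = (
                na * d[frozenset((a, other))] + nb * d[frozenset((b, other))]
            ) / (na + nb)
        sizes[merged] = na + nb
    return next(iter(sizes))


# Hypothetical pairwise distances (e.g. counts of differing residues).
example_dist = {
    ("A", "B"): 2, ("A", "C"): 6, ("A", "D"): 10,
    ("B", "C"): 6, ("B", "D"): 10, ("C", "D"): 10,
}
example_tree = upgma(["A", "B", "C", "D"], example_dist)
```

    With these toy distances A and B join first, then C, then D. UPGMA assumes a roughly constant rate of change across lineages, which is one reason its topology is usually compared with other methods, as the abstract does.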

  3. Intervening sequences in a plant gene: comparison of the partial sequence of cDNA and genomic DNA of French bean phaseolin

    NASA Astrophysics Data System (ADS)

    Sun, S. M.; Slightom, J. L.; Hall, T. C.

    1981-01-01

    A plant gene coding for the major storage protein (phaseolin, G1-globulin) of the French bean was isolated from a genomic library constructed in the phage vector Charon 24A. Comparison of the nucleotide sequence of part of the gene with that of the cloned messenger RNA (cDNA) revealed the presence of three intervening sequences, all beginning with GT and ending with AG. The 5' and 3' boundaries of intervening sequences IVS-A (88 base pairs) and IVS-B (124 base pairs) are similar to those described for animal and viral genes, but the 3' boundary of IVS-C (129 base pairs) shows some differences. A sequence of 185 amino acids deduced from the cloned DNAs represents about 40% of a phaseolin polypeptide.

  4. Universal digital high-resolution melt: a novel approach to broad-based profiling of heterogeneous biological samples.

    PubMed

    Fraley, Stephanie I; Hardick, Justin; Masek, Billie J; Athamanolap, Pornpat; Rothman, Richard E; Gaydos, Charlotte A; Carroll, Karen C; Wakefield, Teresa; Wang, Tza-Huei; Yang, Samuel

    2013-10-01

    Comprehensive profiling of nucleic acids in genetically heterogeneous samples is important for clinical and basic research applications. Universal digital high-resolution melt (U-dHRM) is a new approach to broad-based PCR diagnostics and profiling technologies that can overcome issues of poor sensitivity due to contaminating nucleic acids and poor specificity due to primer or probe hybridization inaccuracies for single nucleotide variations. The U-dHRM approach uses broad-based primers or ligated adapter sequences to universally amplify all nucleic acid molecules in a heterogeneous sample, which have been partitioned, as in digital PCR. Extensive assay optimization enables direct sequence identification by algorithm-based matching of melt curve shape and Tm to a database of known sequence-specific melt curves. We show that single-molecule detection and single nucleotide sensitivity are possible. The feasibility and utility of U-dHRM are demonstrated through detection of bacteria associated with polymicrobial blood infection and microRNAs (miRNAs) associated with host response to infection. U-dHRM using broad-based 16S rRNA gene primers demonstrates universal single cell detection of bacterial pathogens, even in the presence of larger amounts of contaminating bacteria; U-dHRM using universally adapted Lethal-7 miRNAs in a heterogeneous mixture showcases the single copy sensitivity and single nucleotide specificity of this approach.

  5. Pyrin gene and mutants thereof, which cause familial Mediterranean fever

    DOEpatents

    Kastner, Daniel L [Bethesda, MD; Aksentijevich, Ivona [Bethesda, MD; Centola, Michael [Takoma Park, MD; Deng, Zuoming [Gaithersburg, MD; Sood, Ramen [Rockville, MD; Collins, Francis S [Rockville, MD; Blake, Trevor [Laytonsville, MD; Liu, P Paul [Ellicott City, MD; Fischel-Ghodsian, Nathan [Los Angeles, CA; Gumucio, Deborah L [Ann Arbor, MI; Richards, Robert I [North Adelaide, AU; Ricke, Darrell O [San Diego, CA; Doggett, Norman A [Santa Cruz, NM; Pras, Mordechai [Tel-Hashomer, IL

    2003-09-30

    The invention provides the nucleic acid sequence encoding the protein associated with familial Mediterranean fever (FMF). The cDNA sequence is designated as MEFV. The invention is also directed towards fragments of the DNA sequence, as well as the corresponding sequence for the RNA transcript and fragments thereof. Another aspect of the invention provides the amino acid sequence for a protein (pyrin) associated with FMF. The invention is directed towards both the full length amino acid sequence, fusion proteins containing the amino acid sequence and fragments thereof. The invention is also directed towards mutants of the nucleic acid and amino acid sequences associated with FMF. In particular, the invention discloses three missense mutations, clustered within about 40 to 50 amino acids, in the highly conserved rfp (B30.2) domain at the C-terminus of the protein. These mutants include M680I, M694V, K695R, and V726A. Additionally, the invention includes methods for diagnosing a patient at risk for having FMF and kits therefor.

  6. Characterization of a tandemly repeated DNA sequence family originally derived by retroposition of tRNA(Glu) in the newt.

    PubMed

    Nagahashi, S; Endoh, H; Suzuki, Y; Okada, N

    1991-11-20

    A previous report from this laboratory showed that in vitro transcription of total genomic DNA of the newt Cynops pyrrhogaster resulted in a discrete sized 8 S RNA, which represented highly repetitive and transcribable sequences with a glutamic acid tRNA-like structure in the newt genome. We isolated four independent clones from a newt genomic library and determined the complete sequences of three 2000 to 2400 base-pair PstI fragments spanning the 8 S RNA gene. The glutamic acid tRNA-related segment in the 8 S RNA gene contains the CCA sequence expected as the 3' terminus of a tRNA molecule. Further, the 11 nucleotides located 13 nucleotides upstream from one of the two transcription initiation sites of the 8 S RNA were found to be repeated in the region upstream from the termination site, suggesting that the original unit, which is shorter than the 8 S RNA, was retrotransposed via cDNA intermediates from the PolIII transcript. In the upstream region of the 8 S RNA gene, a 360 nucleotide unit containing the glutamic acid tRNA-related segment was found to be duplicated (clones NE1 and NE10) or triplicated (clone NE3). Except for the difference in the number of the 360 nucleotide unit, the three sequences of the 2000 to 2400 base-pair PstI fragment were essentially the same with only a few mutations and minor deletions. Inverse polymerase chain reaction and sequence determination of the products, together with a Southern hybridization experiment, demonstrated that the family consists of a tandemly repeated unit of 3300, 3700 or 4100 base-pairs. Thus during evolution, this family in the newt was created by retroposition via cDNA intermediates, followed by duplication or triplication of the 360 nucleotide unit and multiplication of the 3300 to 4100 base-pair region at the DNA level.

  7. Rhodotorula svalbardensis sp. nov., a novel yeast species isolated from cryoconite holes of Ny-Ålesund, Arctic.

    PubMed

    Singh, Purnima; Singh, Shiv M; Tsuji, Masaharu; Prasad, Gandham S; Hoshino, Tamotsu

    2014-02-01

    A psychrophilic yeast species was isolated from glacier cryoconite holes of Svalbard. Nucleotide sequences of the strains were studied using D1/D2 domain, ITS region and partial sequences of mitochondrial cytochrome b gene. The strains belonged to a clade of psychrophilic yeasts, but showed marked differences from related species in the D1/D2 domain and biochemical characters. Effects of temperature, salt and media on growth of the cultures were also studied. Screening of the cultures for amylase, cellulase, protease, lipase, urease and catalase activities was carried out. The strains expressed high amylase and lipase activities. Freeze tolerance ability of the isolates indicated the formation of unique hexagonal ice crystal structures due to presence of 'antifreeze proteins' (AFPs). FAME analysis of cultures showed a unique trend of increase in unsaturated fatty acids with decrease in temperature. The major fatty acids recorded were oleic acid, linoleic acid, linolenic acid, palmitic acid, stearic acid, myristic acid and pentadecanoic acid. Based on sequence data and on physiological and morphological properties of the strains, we propose a novel species, Rhodotorula svalbardensis and designate strains MLB-I (CCP-II) and CRY-YB-1 (CBS 12863, JCM 19699, JCM 19700, MTCC 10952) as its type strains (Etymology: sval.bar.den'sis. N.L. fem. adj. svalbardensis pertaining to Svalbard). Copyright © 2014 Elsevier Inc. All rights reserved.

  8. Closed cycle ion exchange method for regenerating acids, bases and salts

    DOEpatents

    Dreyfuss, Robert M.

    1976-01-01

    A method for conducting a chemical reaction in acidic, basic, or neutral solution as required and then regenerating the acid, base, or salt by means of ion exchange in a closed cycle reaction sequence which comprises contacting the spent acid, base, or salt with an ion exchanger, preferably a synthetic organic ion-exchange resin, so selected that the counter ions thereof are ions also produced as a by-product in the closed reaction cycle, and then regenerating the spent ion exchanger by contact with the by-product counter ions. The method is particularly applicable to closed cycle processes for the thermochemical production of hydrogen.

  9. Molecular mechanisms of adaptation emerging from the physics and evolution of nucleic acids and proteins.

    PubMed

    Goncearenco, Alexander; Ma, Bin-Guang; Berezovsky, Igor N

    2014-03-01

    DNA, RNA and proteins are major biological macromolecules that coevolve and adapt to environments as components of one highly interconnected system. We explore here sequence/structure determinants of mechanisms of adaptation of these molecules, links between them, and results of their mutual evolution. We complemented statistical analysis of genomic and proteomic sequences with folding simulations of RNA molecules, unraveling causal relations between compositional and sequence biases reflecting molecular adaptation on DNA, RNA and protein levels. We found many compositional peculiarities related to environmental adaptation and the life style. Specifically, thermal adaptation of protein-coding sequences in Archaea is characterized by a stronger codon bias than in Bacteria. Guanine and cytosine load in the third codon position is important for supporting the aerobic life style, and it is highly pronounced in Bacteria. The third codon position also provides a tradeoff between arginine and lysine, which are favorable for thermal adaptation and aerobicity, respectively. Dinucleotide composition provides stability of nucleic acids via strong base-stacking in ApG dinucleotides. In relation to coevolution of nucleic acids and proteins, thermostability-related demands on the amino acid composition affect the nucleotide content in the second codon position in Archaea.
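
    The "guanine and cytosine load in the third codon position" discussed above is commonly summarized as a GC3 fraction. A minimal sketch of that calculation follows; the function name and the toy coding sequence are hypothetical, the input is assumed to be in frame, and whether to include the stop codon is a choice left to the reader. It is not the authors' pipeline.

```python
def gc3_fraction(cds):
    """Fraction of codons whose third base is G or C.

    cds: in-frame coding sequence. Trailing bases that do not complete a
    codon are ignored.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3      # drop an incomplete final codon
    third_bases = cds[2:usable:3]         # every third base, starting at index 2
    if not third_bases:
        raise ValueError("sequence is shorter than one codon")
    return sum(base in "GC" for base in third_bases) / len(third_bases)


# Toy sequence (hypothetical): codons ATG GCC AAG TAA, third bases G, C, G, A.
toy_gc3 = gc3_fraction("ATGGCCAAGTAA")
```

    Comparing this value across genomes, rather than within one gene, is what reveals the kind of compositional bias the abstract describes.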

  10. Molecular mechanisms of adaptation emerging from the physics and evolution of nucleic acids and proteins

    PubMed Central

    Goncearenco, Alexander; Ma, Bin-Guang; Berezovsky, Igor N.

    2014-01-01

    DNA, RNA and proteins are major biological macromolecules that coevolve and adapt to environments as components of one highly interconnected system. We explore here sequence/structure determinants of mechanisms of adaptation of these molecules, links between them, and results of their mutual evolution. We complemented statistical analysis of genomic and proteomic sequences with folding simulations of RNA molecules, unraveling causal relations between compositional and sequence biases reflecting molecular adaptation on DNA, RNA and protein levels. We found many compositional peculiarities related to environmental adaptation and the life style. Specifically, thermal adaptation of protein-coding sequences in Archaea is characterized by a stronger codon bias than in Bacteria. Guanine and cytosine load in the third codon position is important for supporting the aerobic life style, and it is highly pronounced in Bacteria. The third codon position also provides a tradeoff between arginine and lysine, which are favorable for thermal adaptation and aerobicity, respectively. Dinucleotide composition provides stability of nucleic acids via strong base-stacking in ApG dinucleotides. In relation to coevolution of nucleic acids and proteins, thermostability-related demands on the amino acid composition affect the nucleotide content in the second codon position in Archaea. PMID:24371267

  11. Electron microscopic analysis and structural characterization of novel NADP(H)-containing methanol: N,N'-dimethyl-4-nitrosoaniline oxidoreductases from the gram-positive methylotrophic bacteria Amycolatopsis methanolica and Mycobacterium gastri MB19.

    PubMed Central

    Bystrykh, L V; Vonck, J; van Bruggen, E F; van Beeumen, J; Samyn, B; Govorukhina, N I; Arfman, N; Duine, J A; Dijkhuizen, L

    1993-01-01

    The quaternary protein structure of two methanol:N,N'-dimethyl-4-nitrosoaniline (NDMA) oxidoreductases purified from Amycolatopsis methanolica and Mycobacterium gastri MB19 was analyzed by electron microscopy and image processing. The enzymes are decameric proteins (displaying fivefold symmetry) with estimated molecular masses of 490 to 500 kDa based on their subunit molecular masses of 49 to 50 kDa. Both methanol:NDMA oxidoreductases possess a tightly but noncovalently bound NADP(H) cofactor at an NADPH-to-subunit molar ratio of 0.7. These cofactors are redox active toward alcohol and aldehyde substrates. Both enzymes contain significant amounts of Zn2+ and Mg2+ ions. The primary amino acid sequences of the A. methanolica and M. gastri MB19 methanol:NDMA oxidoreductases share a high degree of identity, as indicated by N-terminal sequence analysis (63% identity among the first 27 N-terminal amino acids), internal peptide sequence analysis, and overall amino acid composition. The amino acid sequence analysis also revealed significant similarity to a decameric methanol dehydrogenase of Bacillus methanolicus C1. PMID:8449887

  12. 77 FR 65537 - Requirements for Patent Applications Containing Nucleotide Sequence and/or Amino Acid Sequence...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-10-29

    ... DEPARTMENT OF COMMERCE Patent and Trademark Office Requirements for Patent Applications Containing Nucleotide Sequence and/or Amino Acid Sequence Disclosures ACTION: Proposed collection; comment request... Patent applications that contain nucleotide and/or amino acid sequence disclosures must include a copy of...

  13. 5S ribosomal ribonucleic acid sequences in Bacteroides and Fusobacterium: evolutionary relationships within these genera and among eubacteria in general

    NASA Technical Reports Server (NTRS)

    Van den Eynde, H.; De Baere, R.; Shah, H. N.; Gharbia, S. E.; Fox, G. E.; Michalik, J.; Van de Peer, Y.; De Wachter, R.

    1989-01-01

    The 5S ribosomal ribonucleic acid (rRNA) sequences were determined for Bacteroides fragilis, Bacteroides thetaiotaomicron, Bacteroides capillosus, Bacteroides veroralis, Porphyromonas gingivalis, Anaerorhabdus furcosus, Fusobacterium nucleatum, Fusobacterium mortiferum, and Fusobacterium varium. A dendrogram constructed by a clustering algorithm from these sequences, which were aligned with all other hitherto known eubacterial 5S rRNA sequences, showed differences as well as similarities with respect to results derived from 16S rRNA analyses. In the 5S rRNA dendrogram, Bacteroides clustered together with Cytophaga and Fusobacterium, as in 16S rRNA analyses. Intraphylum relationships deduced from 5S rRNAs suggested that Bacteroides is specifically related to Cytophaga rather than to Fusobacterium, as was suggested by 16S rRNA analyses. Previous taxonomic considerations concerning the genus Bacteroides, based on biochemical and physiological data, were confirmed by the 5S rRNA sequence analysis.

  14. A Support Vector Machine based method to distinguish proteobacterial proteins from eukaryotic plant proteins

    PubMed Central

    2012-01-01

    Background Members of the phylum Proteobacteria are most prominent among bacteria causing plant diseases that result in a diminution of the quantity and quality of food produced by agriculture. To ameliorate these losses, there is a need to identify infections in early stages. Recent developments in next generation nucleic acid sequencing and mass spectrometry open the door to screening plants by the sequences of their macromolecules. Such an approach requires the ability to recognize the organismal origin of unknown DNA or peptide fragments. There are many ways to approach this problem but none has emerged as the best protocol. Here we attempt a systematic way to determine organismal origins of peptides by using a machine learning algorithm. The algorithm that we implement is a Support Vector Machine (SVM). Result The amino acid compositions of proteobacterial proteins were found to be different from those of plant proteins. We developed an SVM model based on amino acid and dipeptide compositions to distinguish between a proteobacterial protein and a plant protein. The amino acid composition (AAC) based SVM model had an accuracy of 92.44% with 0.85 Matthews correlation coefficient (MCC) while the dipeptide composition (DC) based SVM model had a maximum accuracy of 94.67% and 0.89 MCC. We also developed SVM models based on a hybrid approach (AAC and DC), which gave a maximum accuracy of 94.86% and a 0.90 MCC. The models were tested on unseen or untrained datasets to assess their validity. Conclusion The results indicate that the SVM based on the AAC and DC hybrid approach can be used to distinguish proteobacterial from plant protein sequences. PMID:23046503
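
    The feature vectors and the evaluation metric named in this abstract have standard definitions, which can be sketched as below. This is an illustration only: the function names, the treatment of non-standard residues (they are ignored) and the toy inputs are assumptions, and the SVM training step itself is not shown.

```python
from collections import Counter
from itertools import product
from math import sqrt

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
DIPEPTIDES = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]  # 400 pairs


def amino_acid_composition(seq):
    """AAC: 20 fractions in AMINO_ACIDS order (non-standard letters ignored)."""
    counts = Counter(seq.upper())
    total = sum(counts[a] for a in AMINO_ACIDS)
    if total == 0:
        raise ValueError("no standard amino acids in sequence")
    return [counts[a] / total for a in AMINO_ACIDS]


def dipeptide_composition(seq):
    """DC: 400 fractions, one per ordered pair of adjacent standard residues."""
    seq = seq.upper()
    counts = Counter(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts[dp] for dp in DIPEPTIDES)
    if total == 0:
        raise ValueError("no standard dipeptides in sequence")
    return [counts[dp] / total for dp in DIPEPTIDES]


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from a 2x2 confusion matrix."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hybrid feature vector (AAC followed by DC) for a toy peptide: 420 values.
toy_features = amino_acid_composition("AAGG") + dipeptide_composition("AAGG")
```

    Unlike plain accuracy, the MCC stays informative when the two classes are of unequal size, which is presumably why the abstract reports both.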

  15. STING Millennium: a web-based suite of programs for comprehensive and simultaneous analysis of protein structure and sequence

    PubMed Central

    Neshich, Goran; Togawa, Roberto C.; Mancini, Adauto L.; Kuser, Paula R.; Yamagishi, Michel E. B.; Pappas, Georgios; Torres, Wellington V.; Campos, Tharsis Fonseca e; Ferreira, Leonardo L.; Luna, Fabio M.; Oliveira, Adilton G.; Miura, Ronald T.; Inoue, Marcus K.; Horita, Luiz G.; de Souza, Dimas F.; Dominiquini, Fabiana; Álvaro, Alexandre; Lima, Cleber S.; Ogawa, Fabio O.; Gomes, Gabriel B.; Palandrani, Juliana F.; dos Santos, Gabriela F.; de Freitas, Esther M.; Mattiuz, Amanda R.; Costa, Ivan C.; de Almeida, Celso L.; Souza, Savio; Baudet, Christian; Higa, Roberto H.

    2003-01-01

    STING Millennium Suite (SMS) is a new web-based suite of programs and databases providing visualization and a complex analysis of molecular sequence and structure for the data deposited at the Protein Data Bank (PDB). SMS operates with a collection of both publicly available data (PDB, HSSP, Prosite) and its own data (contacts, interface contacts, surface accessibility). Biologists find SMS useful because it provides a variety of algorithms and validated data, wrapped up in a user-friendly web interface. Using SMS it is now possible to analyze sequence to structure relationships, the quality of the structure, nature and volume of atomic contacts of intra and inter chain type, relative conservation of amino acids at the specific sequence position based on multiple sequence alignment, indications of folding essential residue (FER) based on the relationship of the residue conservation to the intra-chain contacts and Cα–Cα and Cβ–Cβ distance geometry. Specific emphasis in SMS is given to interface forming residues (IFR), the amino acids that define the interactive portion of the protein surfaces. SMS may simultaneously display and analyze previously superimposed structures. PDB updates trigger SMS updates in a synchronized fashion. SMS is freely accessible for public data at http://www.cbi.cnptia.embrapa.br, http://mirrors.rcsb.org/SMS and http://trantor.bioc.columbia.edu/SMS. PMID:12824333

  16. Modeling and prediction of peptide drift times in ion mobility spectrometry using sequence-based and structure-based approaches.

    PubMed

    Zhang, Yiming; Jin, Quan; Wang, Shuting; Ren, Ren

    2011-05-01

    The mobile behavior of 1481 peptides in ion mobility spectrometry (IMS), which are generated by protease digestion of the Drosophila melanogaster proteome, is modeled and predicted based on two different types of characterization methods, i.e. a sequence-based approach and a structure-based approach. In this procedure, the sequence-based approach considers both the amino acid composition of a peptide and the local environment profile of each amino acid in the peptide; the structure-based approach is performed with the CODESSA protocol, which regards a peptide as a common organic compound and generates more than 200 statistically significant variables to characterize the whole structure profile of a peptide molecule. Subsequently, the nonlinear support vector machine (SVM) and Gaussian process (GP) as well as linear partial least squares (PLS) regression are employed to correlate the structural parameters of the characterizations with the IMS drift times of these peptides. The obtained quantitative structure-spectrum relationship (QSSR) models are evaluated rigorously and investigated systematically via both one-deep and two-deep cross-validations as well as the rigorous Monte Carlo cross-validation (MCCV). We also give a comprehensive comparison of the resulting statistics arising from the different combinations of variable types with modeling methods and find that the sequence-based approach can give QSSR models with better fitting ability and predictive power but worse interpretability than the structure-based approach. In addition, because QSSR modeling using the sequence-based approach does not require preparing minimized structures of the peptides before the modeling, it is considerably more efficient than modeling using the structure-based approach. Copyright © 2011 Elsevier Ltd. All rights reserved.

  17. Cleavage of nucleic acids

    DOEpatents

    Prudent, James R.; Hall, Jeff G.; Lyamichev, Victor I.; Brow, Mary Ann D.; Dahlberg, James E.

    2007-12-11

    The present invention relates to means for the detection and characterization of nucleic acid sequences, as well as variations in nucleic acid sequences. The present invention also relates to methods for forming a nucleic acid cleavage structure on a target sequence and cleaving the nucleic acid cleavage structure in a site-specific manner. The structure-specific nuclease activity of a variety of enzymes is used to cleave the target-dependent cleavage structure, thereby indicating the presence of specific nucleic acid sequences or specific variations thereof.

  18. Invasive cleavage of nucleic acids

    DOEpatents

    Prudent, James R.; Hall, Jeff G.; Lyamichev, Victor I.; Brow, Mary Ann D.; Dahlberg, James E.

    1999-01-01

    The present invention relates to means for the detection and characterization of nucleic acid sequences, as well as variations in nucleic acid sequences. The present invention also relates to methods for forming a nucleic acid cleavage structure on a target sequence and cleaving the nucleic acid cleavage structure in a site-specific manner. The structure-specific nuclease activity of a variety of enzymes is used to cleave the target-dependent cleavage structure, thereby indicating the presence of specific nucleic acid sequences or specific variations thereof.

  19. Invasive cleavage of nucleic acids

    DOEpatents

    Prudent, James R.; Hall, Jeff G.; Lyamichev, Victor I.; Brow, Mary Ann D.; Dahlberg, James E.

    2002-01-01

    The present invention relates to means for the detection and characterization of nucleic acid sequences, as well as variations in nucleic acid sequences. The present invention also relates to methods for forming a nucleic acid cleavage structure on a target sequence and cleaving the nucleic acid cleavage structure in a site-specific manner. The structure-specific nuclease activity of a variety of enzymes is used to cleave the target-dependent cleavage structure, thereby indicating the presence of specific nucleic acid sequences or specific variations thereof.

  20. Cleavage of nucleic acids

    DOEpatents

    Prudent, James R.; Hall, Jeff G.; Lyamichev, Victor I.; Brow, Mary Ann D.; Dahlberg, James E.

    2010-11-09

    The present invention relates to means for the detection and characterization of nucleic acid sequences, as well as variations in nucleic acid sequences. The present invention also relates to methods for forming a nucleic acid cleavage structure on a target sequence and cleaving the nucleic acid cleavage structure in a site-specific manner. The structure-specific nuclease activity of a variety of enzymes is used to cleave the target-dependent cleavage structure, thereby indicating the presence of specific nucleic acid sequences or specific variations thereof.

  1. Cleavage of nucleic acids

    DOEpatents

    Prudent, James R.; Hall, Jeff G.; Lyamichev, Victor I.; Brow, Mary Ann D.; Dahlberg, James E.

    2000-01-01

    The present invention relates to means for the detection and characterization of nucleic acid sequences, as well as variations in nucleic acid sequences. The present invention also relates to methods for forming a nucleic acid cleavage structure on a target sequence and cleaving the nucleic acid cleavage structure in a site-specific manner. The structure-specific nuclease activity of a variety of enzymes is used to cleave the target-dependent cleavage structure, thereby indicating the presence of specific nucleic acid sequences or specific variations thereof.

  2. Nucleic acid detection assays

    DOEpatents

    Prudent, James R.; Hall, Jeff G.; Lyamichev, Victor I.; Brow, Mary Ann; Dahlberg, James E.

    2005-04-05

    The present invention relates to means for the detection and characterization of nucleic acid sequences, as well as variations in nucleic acid sequences. The present invention also relates to methods for forming a nucleic acid cleavage structure on a target sequence and cleaving the nucleic acid cleavage structure in a site-specific manner. The structure-specific nuclease activity of a variety of enzymes is used to cleave the target-dependent cleavage structure, thereby indicating the presence of specific nucleic acid sequences or specific variations thereof.

  3. Summary of International Exhibition and Congress (3rd): BIOTECHNICA '87 Hannover Held in Hannover (Germany, F.R.) on 22-24 September 1987

    DTIC Science & Technology

    1988-01-21

    nucleic acids which occur in DNA and seem to play an e Improved theoretical analysis of the important role in determining gene reg’- fntra- and...developed two retroviral vectors, based on the murine new peptide-based animal vaccines which myeloproliferative sarcoma virus (MPSV), are currently...Structure tides are part of a precursor molecule elucidation is performed by gas-phase composed of 126 amino acids. From a pre- amino acid sequence analysis

  4. Structure/Function Analyses of Human Serum Paraoxonase (HuPON1) Mutants Designed from a DFPase-Like Homology Model

    DTIC Science & Technology

    2004-08-23

    purified HuPON1 Substitution of amino acid residues in the HuPON1 enzyme was accomplished by PCR-based site-directed Two methods were utilized to...including organophosphates and lactones, and exhibits anti-atherogenic properties. A few amino acids have been shown to be essential for the enzyme’s...not been assigned to those residues. Based on sequence-structure alignment studies, we have folded the amino acid sequence of HuPON1 onto the sixfold

  5. A Simple Base-Mediated Halogenation of Acidic sp2 C-H Bonds under Non-Cryogenic Conditions

    PubMed Central

    Do, Hien-Quang; Daugulis, Olafs

    2009-01-01

    A new method has been developed for in situ halogenation of acidic sp2 carbon-hydrogen bonds in heterocycles and electron-deficient arenes. Either selective monohalogenation or one-step exhaustive polyhalogenation is possible for substrates possessing several C-H bonds that are flanked by electron-withdrawing groups. For the most acidic arenes, such as pentafluorobenzene, K3PO4 base can be employed instead of BuLi for metalation/halogenation sequences. PMID:19102661

  6. The lignin component of humic substances: Distribution among soil and sedimentary humic, fulvic, and base-insoluble fractions

    NASA Astrophysics Data System (ADS)

    Ertel, John R.; Hedges, John I.

    1984-10-01

    Vanillyl, syringyl and cinnamyl phenols occur as CuO oxidation products of humic, fulvic and base-insoluble residual fractions from soils, peat and nearshore marine sediments. However, none of these lignin-derived phenols were released by CuO oxidation of deep-sea sediment or its base-extractable organic fractions. Lignin analysis indicated that peat and coastal marine sediments contained significantly higher levels of recognizable vascular plant carbon (20-50%) than soils and offshore marine sediments (0-10%). Although accounting for less than 20% of the total sedimentary (bulk) lignin, lignin components of humic acid fractions compositionally and quantitatively resembled the corresponding bulk samples and base-insoluble residues. Recognizable lignin, presumably present as intact phenylpropanoid units, accounted for up to 5% of the carbon in peat and coastal humic acids but less than 1% in soil humic acids. Fulvic acid fractions uniformly yielded less lignin-derived phenols in mixtures that were depleted in syringyl and cinnamyl phenols relative to the corresponding humic acid fractions. Within the vanillyl and syringyl families the relative distribution of acidic and aldehydic phenols is a sensitive measure of the degree of oxidative alteration of the lignin component. The high acid/aldehyde ratios and the low phenol yields of soils and their humic fractions compared to peat and coastal sediments indicate extensive degradation of the lignin source material. Likewise, the progressively higher acid/aldehyde ratios and lower phenol yields along the sequence: plant tissues (plant debris)-humic acids-fulvic acids suggest that this pattern represents the diagenetic sequence for the aerobic degradation of lignin biopolymers.

  7. Complete genome sequence of keunjorong mosaic virus, a potyvirus from Cynanchum wilfordii.

    PubMed

    Nam, Moon; Lee, Joo-Hee; Choi, Hong Soo; Lim, Hyoun-Sub; Moon, Jae Sun; Lee, Su-Heon

    2013-08-01

    We have determined the complete genome sequence of keunjorong mosaic virus (KjMV). The KjMV genome is composed of 9,611 nucleotides, excluding the 3'-terminal poly(A) tail. It contains two open reading frames (ORFs), with the large one encoding a polyprotein of 3,070 amino acids and the small overlapping ORF encoding a PIPO protein of 81 amino acids. The KjMV genome shared the highest nucleotide sequence identity (57.5%) with pepper mottle virus and freesia mosaic virus, two members of the genus Potyvirus. Based on the phylogenetic relatedness to known potyviruses, KjMV appears to be a member of a new species in the genus Potyvirus.

  8. Comparative Analysis and Distribution of Omega-3 lcPUFA Biosynthesis Genes in Marine Molluscs

    PubMed Central

    Surm, Joachim M.; Prentis, Peter J.; Pavasovic, Ana

    2015-01-01

    Recent research has identified marine molluscs as an excellent source of omega-3 long-chain polyunsaturated fatty acids (lcPUFAs), based on their potential for endogenous synthesis of lcPUFAs. In this study we generated a representative list of fatty acyl desaturase (Fad) and elongation of very long-chain fatty acid (Elovl) genes from major orders of Phylum Mollusca, through the interrogation of transcriptome and genome sequences, and various publicly available databases. We have identified novel and uncharacterised Fad and Elovl sequences in the following species: Anadara trapezia, Nerita albicilla, Nerita melanotragus, Crassostrea gigas, Lottia gigantea, Aplysia californica, Loligo pealeii and Chlamys farreri. Based on alignments of translated protein sequences of Fad and Elovl genes, the haeme binding motif and histidine boxes of Fad proteins, and the histidine box and seventeen important amino acids in Elovl proteins, were highly conserved. Phylogenetic analysis of aligned reference sequences was used to reconstruct the evolutionary relationships for Fad and Elovl genes separately. Multiple, well resolved clades for both the Fad and Elovl sequences were observed, suggesting that repeated rounds of gene duplication best explain the distribution of Fad and Elovl proteins across the major orders of molluscs. For Elovl sequences, one clade contained the functionally characterised Elovl5 proteins, while another clade contained proteins hypothesised to have Elovl4 function. Additional well resolved clades consisted only of uncharacterised Elovl sequences. One clade from the Fad phylogeny contained only uncharacterised proteins, while the other clade contained functionally characterised delta-5 desaturase proteins. The discovery of an uncharacterised Fad clade is particularly interesting as these divergent proteins may have novel functions. 
    Overall, this paper presents a number of novel Fad and Elovl genes suggesting that many mollusc groups possess most of the required enzymes for the synthesis of lcPUFAs. PMID:26308548

  9. Cloning, sequencing and characterization of lipase from a polyhydroxyalkanoate- (PHA-) synthesizing Pseudomonas resinovorans

    USDA-ARS's Scientific Manuscript database

    Lipase gene (lip) of a biodegradable polyhydroxyalkanoate- (PHA-) synthesizing bacterium P. resinovorans NRRL B-2649 was cloned, sequenced and characterized by using consensus primers and PCR-based genome walking method. The ORF of the putative Lip (314 amino acids) and its active site (Ser111, Asp...

  10. Method for nucleic acid hybridization using single-stranded DNA binding protein

    DOEpatents

    Tabor, Stanley; Richardson, Charles C.

    1996-01-01

    Method of nucleic acid hybridization for detecting the presence of a specific nucleic acid sequence in a population of different nucleic acid sequences using a nucleic acid probe. The nucleic acid probe hybridizes with the specific nucleic acid sequence but not with other nucleic acid sequences in the population. The method includes contacting a sample (potentially including the nucleic acid sequence) with the nucleic acid probe under hybridizing conditions in the presence of a single-stranded DNA binding protein provided in an amount which stimulates renaturation of a dilute solution (i.e., one in which the t1/2 of renaturation is longer than 3 weeks) of single-stranded DNA greater than 500 fold (i.e., to a t1/2 less than 60 min, preferably less than 5 min, and most preferably about 1 min) in the absence of nucleotide triphosphates.
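
    The "greater than 500 fold" figure can be checked against the stated half-times with a little arithmetic. The sketch below assumes that the fold stimulation is simply the ratio of the renaturation half-time without the protein to the half-time with it; the function name is hypothetical.

```python
MINUTES_PER_WEEK = 7 * 24 * 60  # 10080 minutes


def fold_stimulation(t_half_before_min, t_half_after_min):
    """Ratio of renaturation half-times (without protein / with protein)."""
    return t_half_before_min / t_half_after_min


dilute_t_half = 3 * MINUTES_PER_WEEK  # the 3-week lower bound, 30240 minutes
for after_min in (60, 5, 1):
    fold = fold_stimulation(dilute_t_half, after_min)
    print(f"{after_min} min -> {fold:.0f}-fold")
# 60 min -> 504-fold
# 5 min -> 6048-fold
# 1 min -> 30240-fold
```

    On this reading, shortening a half-time of 3 weeks to 60 min already corresponds to a stimulation just above 500 fold.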

  11. Protein Tertiary Structure Prediction Based on Main Chain Angle Using a Hybrid Bees Colony Optimization Algorithm

    NASA Astrophysics Data System (ADS)

    Mahmood, Zakaria N.; Mahmuddin, Massudi; Mahmood, Mohammed Nooraldeen

    Classifying proteins into their respective families and subfamilies from their encoded amino acid sequences is an important research area. However, for a given protein, the exact action (hormonal, enzymatic, transmembrane or nuclear receptor) depends not only on the amino acid sequence but also on the way the amino acid chain folds. This study provides a prototype system that is able to predict a protein's tertiary structure. Several methods are used to develop and evaluate the system in order to improve accuracy in protein 3D structure prediction. The Bees Optimization algorithm, which is inspired by the food-foraging behavior of honey bees, is used in the searching phase. In this study, the experiment is conducted on short-sequence proteins that have been used in previous research with well-known tools. The proposed approach shows a promising result.

  12. A novel chaotic based image encryption using a hybrid model of deoxyribonucleic acid and cellular automata

    NASA Astrophysics Data System (ADS)

    Enayatifar, Rasul; Sadaei, Hossein Javedani; Abdullah, Abdul Hanan; Lee, Malrey; Isnin, Ismail Fauzi

    2015-08-01

    Currently, many studies have been conducted on improving the security of digital images in order to protect such data while they are sent over the internet. This work proposes a new approach based on a hybrid model of the Tinkerbell chaotic map, deoxyribonucleic acid (DNA) and cellular automata (CA). DNA rules, the DNA sequence XOR operator and CA rules are used simultaneously to encrypt the plain-image pixels. To determine the rule number in the DNA sequence and in the CA, a 2-dimensional Tinkerbell chaotic map is employed. Experimental results and computer simulations both confirm that the proposed scheme not only demonstrates outstanding encryption but also resists various typical attacks.
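
    To make the DNA-coding and XOR steps concrete, here is a minimal sketch. It is not the authors' scheme: it assumes one fixed coding rule (A=00, C=01, G=10, T=11), uses the commonly quoted Tinkerbell parameters and starting point, invents a simple way of turning map values into key bytes, and omits the cellular automata and the rule selection entirely. All function names are hypothetical.

```python
BASES = "ACGT"  # assumed fixed coding rule: A=00, C=01, G=10, T=11


def byte_to_dna(value):
    """Encode one 8-bit pixel as four bases, most significant bit pair first."""
    return "".join(BASES[(value >> shift) & 0b11] for shift in (6, 4, 2, 0))


def dna_to_byte(strand):
    """Decode four bases back into one 8-bit value."""
    value = 0
    for base in strand:
        value = (value << 2) | BASES.index(base)
    return value


def dna_xor(left, right):
    """Base-wise XOR, defined through the 2-bit codes of the bases."""
    return "".join(BASES[BASES.index(a) ^ BASES.index(b)] for a, b in zip(left, right))


def tinkerbell_keystream(n, x=-0.72, y=-0.64, a=0.9, b=-0.6013, c=2.0, d=0.5):
    """Iterate the Tinkerbell map and quantize x into key bytes (assumed quantization)."""
    key = []
    for _ in range(n):
        x, y = x * x - y * y + a * x + b * y, 2 * x * y + c * x + d * y
        key.append(int(abs(x) * 1e6) % 256)
    return key


def xor_cipher(pixels, key_bytes):
    """Encrypt or decrypt: XOR is its own inverse, so the same call does both."""
    return [dna_to_byte(dna_xor(byte_to_dna(p), byte_to_dna(k)))
            for p, k in zip(pixels, key_bytes)]


pixels = [0, 27, 128, 255]  # hypothetical plain-image pixels
key = tinkerbell_keystream(len(pixels))
cipher = xor_cipher(pixels, key)
assert xor_cipher(cipher, key) == pixels  # decryption restores the pixels
print(byte_to_dna(27))  # ACGT
```

    Because the XOR is defined on the 2-bit codes, applying the same keystream twice recovers the plain pixels.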

  13. SNP in Chalcone Synthase gene is associated with variation of 6-gingerol content in contrasting landraces of Zingiber officinale.Roscoe.

    PubMed

    Ghosh, Subhabrata; Mandi, Swati Sen

    2015-07-25

    Zingiber officinale, medicinally the most important species within Zingiber genus, contains 6-gingerol as the active principle. This compound, obtained from rhizomes of Z. officinale, has immense medicinal importance and is used in various herbal drug formulations. Our record of variation in content of this active principle, viz. 6-gingerol, in landraces of this drug plant collected from different locations correlated with our gene expression studies, which showed higher Chalcone Synthase gene expression (Chalcone Synthase is the rate-limiting enzyme of the 6-gingerol biosynthesis pathway) in high 6-gingerol containing landraces than in low 6-gingerol containing landraces. Sequencing of Chalcone Synthase cDNA and subsequent multiple sequence alignment revealed seven SNPs between these contrasting genotypes. On converting this nucleotide sequence to amino acid sequence, alteration of two amino acids becomes evident; one amino acid change (asparagine to serine at position 336) is associated with a base change (A→G) and another change (serine to leucine at position 142) is associated with a base change (C→T). Since asparagine at position 336 is one of the critical amino acids of the catalytic triad of Chalcone Synthase enzyme, responsible for substrate binding, our study suggests that in landraces with this specific amino acid change, viz. asparagine (found in high 6-gingerol containing landraces) to serine, the change causes low 6-gingerol content. This is probably due to a weak enzyme-substrate association caused by the absence of asparagine in the catalytic triad. Detailed study of this finding could also help to understand the molecular mechanism associated with variation in 6-gingerol content in Z. officinale genotypes and thereby support strategies for developing elite genotypes containing high 6-gingerol content. Copyright © 2015 Elsevier B.V. All rights reserved.
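
    The two amino acid changes can be related to their base changes with the standard genetic code. The sketch below lists every codon pair in which the stated single-base substitution produces the stated amino acid change; it assumes the base changes are reported on the coding strand, and it does not claim to identify the codons actually present in the Chalcone Synthase gene. The function name is hypothetical.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def explaining_substitutions(from_aa, to_aa, ref_base, alt_base):
    """Return (codon, mutant codon, codon position) triples for one base change."""
    hits = []
    for codon, aa in CODON_TABLE.items():
        if aa != from_aa:
            continue
        for pos, base in enumerate(codon):
            if base != ref_base:
                continue
            mutant = codon[:pos] + alt_base + codon[pos + 1:]
            if CODON_TABLE[mutant] == to_aa:
                hits.append((codon, mutant, pos + 1))
    return hits


# Asparagine (N) to serine (S) through A->G; serine (S) to leucine (L) through C->T.
print(explaining_substitutions("N", "S", "A", "G"))
print(explaining_substitutions("S", "L", "C", "T"))
```

    Under the standard code, both changes can only arise from a substitution at the second codon position (AAT→AGT or AAC→AGC, and TCA→TTA or TCG→TTG).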

  14. Generate Optimized Genetic Rhythm for Enzyme Expression in Non-native systems

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    2016-11-03

    Most amino acids are represented by more than one codon, resulting in redundancy in the genetic code. Silent codon substitutions that do not alter the amino acid sequence still have an effect on protein expression. We have developed an algorithm, GoGREEN, to enhance the expression of foreign proteins in a host organism. GoGREEN selects codons according to frequency patterns seen in the gene of interest using the codon usage table from the host organism. GoGREEN is also designed to accommodate gaps in the sequence. This software takes for input (1) the aligned protein sequences for genes the user wishes to express, (2) the codon usage table for the host organism, and (3) the DNA sequence for the target protein found in the host organism. The program will select codons based on codon usage patterns for the target DNA sequence. The program will also select codons for “gaps” found in the aligned protein sequences using the codon usage table from the host organism.
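
    The idea of a silent codon substitution guided by a host codon usage table can be illustrated with a short sketch. This is not the GoGREEN algorithm: it assumes the simplest possible rule (always pick the host's most frequent codon), uses a hypothetical three-residue usage table with made-up frequencies, and leaves out frequency-pattern matching and gap handling.

```python
# Hypothetical toy codon usage table: fraction of each codon per amino acid.
TOY_USAGE = {
    "M": {"ATG": 1.00},
    "K": {"AAA": 0.74, "AAG": 0.26},
    "L": {"CTG": 0.50, "TTA": 0.13, "TTG": 0.13, "CTT": 0.10, "CTC": 0.10, "CTA": 0.04},
}


def translate(dna, usage):
    """Translate DNA using the codons listed in the usage table."""
    codon_to_aa = {codon: aa for aa, codons in usage.items() for codon in codons}
    return "".join(codon_to_aa[dna[i:i + 3]] for i in range(0, len(dna), 3))


def recode(dna, usage):
    """Replace every codon with the most frequent synonymous codon in the table."""
    protein = translate(dna, usage)
    return "".join(max(usage[aa], key=usage[aa].get) for aa in protein)


native = "ATGAAGTTA"  # hypothetical gene fragment encoding Met-Lys-Leu
optimized = recode(native, TOY_USAGE)
print(optimized)  # ATGAAACTG
assert translate(optimized, TOY_USAGE) == translate(native, TOY_USAGE)  # substitution is silent
```

    The DNA changes at two codons while the encoded protein stays the same, which is the redundancy the abstract refers to.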

  15. Comparison of the nucleotide and amino acid sequences of the RsrI and EcoRI restriction endonucleases.

    PubMed

    Stephenson, F H; Ballard, B T; Boyer, H W; Rosenberg, J M; Greene, P J

    1989-12-21

    The RsrI endonuclease, a type-II restriction endonuclease (ENase) found in Rhodobacter sphaeroides, is an isoschizomer of the EcoRI ENase. A clone containing an 11-kb BamHI fragment was isolated from an R. sphaeroides genomic DNA library by hybridization with synthetic oligodeoxyribonucleotide probes based on the N-terminal amino acid (aa) sequence of RsrI. Extracts of E. coli containing a subclone of the 11-kb fragment display RsrI activity. Nucleotide sequence analysis reveals an 831-bp open reading frame encoding a polypeptide of 277 aa. A 50% identity exists within a 266-aa overlap between the deduced aa sequences of RsrI and EcoRI. Regions of 75-100% aa sequence identity correspond to key structural and functional regions of EcoRI. The type-II ENases have many common properties, and a common origin might have been expected. Nevertheless, this is the first demonstration of aa sequence similarity between ENases produced by different organisms.

  16. Identification of Delta5-fatty acid desaturase from the cellular slime mold Dictyostelium discoideum.

    PubMed

    Saito, T; Ochiai, H

    1999-10-01

    cDNA fragments putatively encoding amino acid sequences characteristic of the fatty acid desaturase were obtained using expressed sequence tag (EST) information of the Dictyostelium cDNA project. Using this sequence, we have determined the cDNA sequence and genomic sequence of a desaturase. The cloned cDNA is 1489 nucleotides long and the deduced amino acid sequence comprised 464 amino acid residues containing an N-terminal cytochrome b5 domain. The whole sequence was 38.6% identical to the initially identified Delta5-desaturase of Mortierella alpina. We have confirmed its function as Delta5-desaturase by overexpression mutation in D. discoideum and also the gain of function mutation in the yeast Saccharomyces cerevisiae. Analysis of the lipids from transformed D. discoideum and yeast demonstrated the accumulation of Delta5-desaturated products. This is the first report concerning fatty acid desaturase in cellular slime molds.

  17. Molecular evaluation of five cardiac genes in Doberman Pinschers with dilated cardiomyopathy.

    PubMed

    Meurs, Kathryn M; Hendrix, Kristina P; Norgard, Michelle M

    2008-08-01

    To sequence the exonic and splice site regions of 5 cardiac genes associated with the human form of familial dilated cardiomyopathy (DCM) in Doberman Pinschers with DCM and to identify a causative mutation. 5 unrelated Doberman Pinschers with DCM and 2 unaffected Labrador Retrievers (control dogs). Exonic and splice site regions of the 5 genes encoding the cardiac proteins troponin C, lamin A/C, cysteine- and glycine-rich protein 3, cardiac troponin T, and the beta-myosin heavy chain were sequenced. Sequences were compared for nucleotide changes between affected dogs and the published canine sequences and 2 control dogs. Base pair changes were considered to be causative for DCM if they were present in an affected dog but not in the control dogs or published sequences and if they involved a conserved amino acid and changed that amino acid to a different polarity, acid-base status, or structure. A causative mutation for DCM in Doberman Pinschers was not identified, although single nucleotide polymorphisms were detected in some dogs in the cysteine- and glycine-rich protein 3, beta-myosin heavy chain, and troponin T genes. Mutations in 5 of the cardiac genes associated with the development of DCM in humans did not appear to be causative for DCM in Doberman Pinschers. Continued evaluation of additional candidate genes or a focused approach with an association analysis is warranted to elucidate the molecular cause of this important cardiac disease in Doberman Pinschers.

  18. Relationships between functional genes in Lactobacillus delbrueckii ssp. bulgaricus isolates and phenotypic characteristics associated with fermentation time and flavor production in yogurt elucidated using multilocus sequence typing.

    PubMed

    Liu, Wenjun; Yu, Jie; Sun, Zhihong; Song, Yuqin; Wang, Xueni; Wang, Hongmei; Wuren, Tuoya; Zha, Musu; Menghe, Bilige; Heping, Zhang

    2016-01-01

    Lactobacillus delbrueckii ssp. bulgaricus (L. bulgaricus) is well known for its worldwide application in yogurt production. Flavor production and acid producing are considered as the most important characteristics for starter culture screening. To our knowledge this is the first study applying functional gene sequence multilocus sequence typing technology to predict the fermentation and flavor-producing characteristics of yogurt-producing bacteria. In the present study, phenotypic characteristics of 35 L. bulgaricus strains were quantified during the fermentation of milk to yogurt and during its subsequent storage; these included fermentation time, acidification rate, pH, titratable acidity, and flavor characteristics (acetaldehyde concentration). Furthermore, multilocus sequence typing analysis of 7 functional genes associated with fermentation time, acid production, and flavor formation was done to elucidate the phylogeny and genetic evolution of the same L. bulgaricus isolates. The results showed that strains significantly differed in fermentation time, acidification rate, and acetaldehyde production. Combining functional gene sequence analysis with phenotypic characteristics demonstrated that groups of strains established using genotype data were consistent with groups identified based on their phenotypic traits. This study has established an efficient and rapid molecular genotyping method to identify strains with good fermentation traits; this has the potential to replace time-consuming conventional methods based on direct measurement of phenotypic traits. Copyright © 2016 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.

  19. Ranalexin. A novel antimicrobial peptide from bullfrog (Rana catesbeiana) skin, structurally related to the bacterial antibiotic, polymyxin.

    PubMed

    Clark, D P; Durell, S; Maloy, W L; Zasloff, M

    1994-04-08

    Antimicrobial peptides comprise a diverse class of molecules used in host defense by plants, insects, and animals. In this study we have isolated a novel antimicrobial peptide from the skin of the bullfrog, Rana catesbeiana. This 20 amino acid peptide, which we have termed Ranalexin, has the amino acid sequence: NH2-Phe-Leu-Gly-Gly-Leu-Ile-Lys-Ile-Val-Pro-Ala-Met-Ile-Cys-Ala-Val-Thr-Lys-Lys-Cys-COOH, and it contains a single intramolecular disulfide bond which forms a heptapeptide ring within the molecule. Structurally, Ranalexin resembles the bacterial antibiotic, polymyxin, which contains a similar heptapeptide ring. We have also cloned the cDNA for Ranalexin from a metamorphic R. catesbeiana tadpole cDNA library. Based on the cDNA sequence, it appears that Ranalexin is initially synthesized as a propeptide with a putative signal sequence and an acidic amino acid-rich region at its amino-terminal end. Interestingly, the putative signal sequence of the Ranalexin cDNA is strikingly similar to the signal sequence of opioid peptide precursors isolated from the skin of the South American frogs Phyllomedusa sauvagei and Phyllomedusa bicolor. Northern blot analysis and in situ hybridization experiments demonstrated that Ranalexin mRNA is first expressed in R. catesbeiana skin at metamorphosis and continues to be expressed into adulthood.
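
    The stated length and ring size can be checked directly from the sequence given in this abstract. The short sketch below converts the three-letter sequence to one-letter code, counts the residues and locates the two cysteines; it assumes only that the single disulfide bond joins those two cysteines, and the variable names are illustrative.

```python
# Three-letter to one-letter codes for the residues that occur in Ranalexin.
THREE_TO_ONE = {
    "Phe": "F", "Leu": "L", "Gly": "G", "Ile": "I", "Lys": "K", "Val": "V",
    "Pro": "P", "Ala": "A", "Met": "M", "Cys": "C", "Thr": "T",
}

RANALEXIN = ("Phe-Leu-Gly-Gly-Leu-Ile-Lys-Ile-Val-Pro-"
             "Ala-Met-Ile-Cys-Ala-Val-Thr-Lys-Lys-Cys")

residues = RANALEXIN.split("-")
one_letter = "".join(THREE_TO_ONE[r] for r in residues)

# 1-based positions of the cysteines that form the disulfide bond.
cys_positions = [i for i, r in enumerate(residues, start=1) if r == "Cys"]
ring_size = cys_positions[-1] - cys_positions[0] + 1  # residues spanned, inclusive

print(one_letter)       # FLGGLIKIVPAMICAVTKKC
print(len(residues))    # 20
print(cys_positions)    # [14, 20]
print(ring_size)        # 7
```

    A bond between Cys14 and Cys20 encloses seven residues, which matches the heptapeptide ring described in the abstract.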

  20. A dehydrin cognate protein from pea (Pisum sativum L.) with an atypical pattern of expression.

    PubMed

    Robertson, M; Chandler, P M

    1994-11-01

    Dehydrins are a family of proteins characterised by conserved amino acid motifs, and induced in plants by dehydration or treatment with ABA. An antiserum was raised against a synthetic oligopeptide based on the most highly conserved dehydrin amino acid motif, the lysine-rich block (core sequence KIKEK-LPG). This antiserum detected a novel M(r) 40,000 polypeptide and enabled isolation of a corresponding cDNA clone, pPsB61 (B61). The deduced amino acid sequence contained two lysine-rich blocks; however, the remainder of the sequence differed markedly from other pea dehydrins. Surprisingly, the sequence contained a stretch of serine residues, a characteristic common to dehydrins from many plant species but which is missing in pea dehydrin. The expression patterns of B61 mRNA and polypeptide were distinctively different from those of the pea dehydrins during seed development, germination and in young seedlings exposed to dehydration stress or treated with ABA. In particular, dehydration stress led to slightly reduced levels of B61 RNA, and ABA application to young seedlings had no marked effect on its abundance. The M(r) 40,000 polypeptide is thus related to pea dehydrin by the presence of the most highly conserved amino acid sequence motifs, but lacks the characteristic expression pattern of dehydrin. By analogy with heat shock cognate proteins we refer to this protein as a dehydrin cognate.

  1. Isolation, cDNA cloning and gene expression of an antibacterial protein from larvae of the coconut rhinoceros beetle, Oryctes rhinoceros.

    PubMed

    Yang, J; Yamamoto, M; Ishibashi, J; Taniai, K; Yamakawa, M

    1998-08-01

    An antibacterial protein, designated rhinocerosin, was purified to homogeneity from larvae of the coconut rhinoceros beetle, Oryctes rhinoceros, immunized with Escherichia coli. Based on the amino acid sequence of the N-terminal region, a degenerate primer was synthesized and reverse-transcriptase PCR was performed to clone rhinocerosin cDNA. As a result, a 279-bp fragment was obtained. The complete nucleotide sequence was determined by sequencing the extended rhinocerosin cDNA clone by 5' rapid amplification of cDNA ends. The deduced amino acid sequence of the mature portion of rhinocerosin was composed of 72 amino acids without cysteine residues and was shown to be rich in glycine (11.1%) and proline (11.1%) residues. Comparison of the deduced amino acid sequence of rhinocerosin with those of other antibacterial proteins indicated that it has 77.8% and 44.6% identity with holotricin 2 and coleoptrecin, respectively. Rhinocerosin had strong antibacterial activity against E. coli, Streptococcus pyogenes, Staphylococcus aureus but not against Pseudomonas aeruginosa. Results of reverse-transcriptase PCR analysis of gene expression in different tissues indicated that the rhinocerosin gene is strongly expressed in the fat body and the Malpighian tubule, and weakly expressed in hemocytes and midgut. In addition, gene expression was inducible by bacteria in the fat body, the Malpighian tubule and hemocyte but constitutive expression was observed in the midgut.
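
    The composition percentages can be turned back into residue counts with a quick calculation. The sketch below assumes the percentages are simple residue fractions of the 72-residue mature sequence, rounded to one decimal place; the helper names are hypothetical.

```python
MATURE_LENGTH = 72  # residues in the mature portion, as stated in the abstract


def residues_from_percent(percent, length):
    """Nearest whole number of residues implied by a composition percentage."""
    return round(percent / 100 * length)


def percent_of(count, length):
    """Composition percentage for a residue count, to one decimal place."""
    return round(100 * count / length, 1)


glycine_count = residues_from_percent(11.1, MATURE_LENGTH)
print(glycine_count)                             # 8
print(percent_of(glycine_count, MATURE_LENGTH))  # 11.1
```

    On that assumption, 11.1% corresponds to 8 glycine and 8 proline residues out of 72.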

  2. Development of chemiluminescent probe hybridization, RT-PCR and nucleic acid cycle sequencing assays of Sabin type 3 isolates to identify base pair 472 Sabin type 3 mutants associated with vaccine associated paralytic poliomyelitis.

    PubMed

    Old, M O; Logan, L H; Maldonado, Y A

    1997-11-01

    Sabin type 3 polio vaccine virus is the most common cause of poliovaccine associated paralytic poliomyelitis. Vaccine associated paralytic poliomyelitis cases have been associated with Sabin type 3 revertants containing a single U to C substitution at bp 472 of Sabin type 3. A rapid method of identification of Sabin type 3 bp 472 mutants is described. An enterovirus group-specific probe for use in a chemiluminescent dot blot hybridization assay was developed to identify enterovirus positive viral lysates. A reverse transcription-polymerase chain reaction (RT-PCR) assay producing a 319 bp PCR product containing the Sabin type 3 bp 472 mutation site was then employed to identify Sabin type 3 isolates. Chemiluminescent nucleic acid cycle sequencing of the purified 319 bp PCR product was then employed to identify nucleic acid sequences at bp 472. The enterovirus group probe hybridization procedure and isolation of the Sabin type 3 PCR product were highly sensitive and specific; nucleic acid cycle sequencing corresponded to the known sequence of stock Sabin type 3 isolates. These methods will be used to identify the Sabin type 3 reversion rate from sequential stool samples of infants obtained after the first and second doses of oral poliovirus vaccine.

  3. Computational mining for hypothetical patterns of amino acid side chains in protein data bank (PDB)

    NASA Astrophysics Data System (ADS)

    Ghani, Nur Syatila Ab; Firdaus-Raih, Mohd

    2018-04-01

    The three-dimensional structure of a protein can provide insights regarding its function. Functional relationships between proteins can be inferred from fold and sequence similarities. In certain cases, sequence or fold comparison fails to establish homology between proteins with a similar mechanism. Since structure is more conserved than sequence, a constellation of functional residues can be similarly arranged among proteins of similar mechanism. Local structural similarity searches are able to detect such constellations of amino acids among distinct proteins, which can be useful for annotating proteins of unknown function. Detection of such patterns of amino acids on a large scale can increase the repertoire of important 3D motifs, since the currently known 3D motifs cannot keep pace with the ever-increasing number of uncharacterized proteins to be annotated. Here, a computational platform for automated detection of 3D motifs is described. A fuzzy-pattern searching algorithm derived from IMagine an Amino Acid 3D Arrangement search EnGINE (IMAAAGINE) was implemented to develop an automated method for searching for hypothetical patterns of amino acid side chains in the Protein Data Bank (PDB), without the need for prior knowledge of the related sequence or structure of the pattern of interest. We present an example of the searches, which is the detection of a hypothetical pattern derived from the known C2H2 structural motif of zinc fingers. The conservation of particular patterns of amino acid side chains in unrelated proteins is highlighted. This approach can act as a complementary method for available structure- and sequence-based platforms and may contribute to improving functional association between proteins.

  4. Dipeptide Sequence Determination: Analyzing Phenylthiohydantoin Amino Acids by HPLC

    NASA Astrophysics Data System (ADS)

    Barton, Janice S.; Tang, Chung-Fei; Reed, Steven S.

    2000-02-01

    Amino acid composition and sequence determination, important techniques for characterizing peptides and proteins, are essential for predicting conformation and studying sequence alignment. This experiment presents improved, fundamental methods of sequence analysis for an upper-division biochemistry laboratory. Working in pairs, students use the Edman reagent to prepare phenylthiohydantoin derivatives of amino acids for determination of the sequence of an unknown dipeptide. With a single HPLC technique, students identify both the N-terminal amino acid and the composition of the dipeptide. This method yields good precision of retention times and allows use of a broad range of amino acids as components of the dipeptide. Students learn fundamental principles and techniques of sequence analysis and HPLC.

  5. Promoting Student Development of Models and Scientific Inquiry Skills in Acid-Base Chemistry: An Important Skill Development in Preparation for AP Chemistry

    ERIC Educational Resources Information Center

    Hale-Hanes, Cara

    2015-01-01

    In this study, two groups of 11th grade chemistry students (n = 210) performed a sequence of hands-on and virtual laboratories that were progressively more inquiry-based. One-half of the students did the laboratory sequence with the addition of a teacher-led discussion connecting student data to student-generated visual representations of…

  6. Novel rod-shaped viruses isolated from garlic, Allium sativum, possessing a unique genome organization.

    PubMed

    Sumi, S; Tsuneyoshi, T; Furutani, H

    1993-09-01

    Rod-shaped flexuous viruses were partially purified from garlic plants (Allium sativum) showing typical mosaic symptoms. The genome was shown to be composed of RNA with a poly(A) tail of an estimated size of 10 kb as shown by denaturing agarose gel electrophoresis. We constructed cDNA libraries and screened four independent clones, which were designated GV-A, GV-B, GV-C and GV-D, using Northern and Southern blot hybridization. Nucleotide sequence determination of the cDNAs, two of which correspond to nearly one-third of the virus genomic RNA, shows that all of these viruses possess an identical genomic structure and that also at least four proteins are encoded in the viral cDNA, their M(r)s being estimated to be 15K, 27K, 40K and 11K. The 15K open reading frame (ORF) encodes the core-like sequence of a zinc finger protein preceded by a cluster of basic amino acid residues. The 27K ORF probably encodes the viral coat protein (CP), based on both the existence of some conserved sequences observed in many other rod-shaped or flexuous virus CPs and an overall amino acid sequence similarity to potexvirus and carlavirus CPs. The 11K ORF shows significant amino acid sequence similarities to the corresponding 12K proteins of the potexviruses and carlaviruses. On the other hand, the 40K ORF product does not resemble any other plant virus gene products reported so far. The genomic organization in the 3' region of the garlic viruses resembles, but clearly differs from, that of carlaviruses. Phylogenetic analysis based upon the amino acid sequence of the viral capsid protein also indicates that the garlic viruses have a unique and distinct domain different from those of the potexvirus and carlavirus groups. The results suggest that the garlic viruses described here belong to an unclassified and new virus group closely related to the carlaviruses.

  7. Structure and characterization of a cDNA clone for phenylalanine ammonia-lyase from cut-injured roots of sweet potato

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Tanaka, Yoshiyuki; Matsuoka, Makoto; Yamamoto, Naoki

    A cDNA clone for phenylalanine ammonia-lyase (PAL) induced in wounded sweet potato (Ipomoea batatas Lam.) root was obtained by immunoscreening a cDNA library. The protein produced in Escherichia coli cells containing the plasmid pPAL02 was indistinguishable from sweet potato PAL as judged by Ouchterlony double diffusion assays. The M(r) of its subunit was 77,000. The cells converted [14C]-L-phenylalanine into [14C]-t-cinnamic acid and PAL activity was detected in the homogenate of the cells. The activity was dependent on the presence of the pPAL02 plasmid DNA. The nucleotide sequence of the cDNA contained a 2,121-base pair (bp) open-reading frame capable of coding for a polypeptide with 707 amino acids (M(r) 77,137), a 22-bp 5'-noncoding region and a 207-bp 3'-noncoding region. The results suggest that the insert DNA fully encoded the amino acid sequence for sweet potato PAL that is induced by wounding. Comparison of the deduced amino acid sequence with that of a PAL cDNA fragment from Phaseolus vulgaris revealed 78.9% homology. The sequence from amino acid residues 258 to 494 was highly conserved, showing 90.7% homology.

  8. Nucleotide sequence of the Varkud mitochondrial plasmid of Neurospora and synthesis of a hybrid transcript with a 5' leader derived from mitochondrial RNA.

    PubMed

    Akins, R A; Grant, D M; Stohl, L L; Bottorff, D A; Nargang, F E; Lambowitz, A M

    1988-11-05

    The Mauriceville and Varkud mitochondrial plasmids of Neurospora are closely related, closed circular DNAs (3.6 and 3.7 kb, respectively; 1 kb = 10(3) bases or base-pairs), whose characteristics suggest relationships to mitochondrial DNA introns and retrotransposons. Here, we characterized the structure of the Varkud plasmid, determined its complete nucleotide sequence and mapped its major transcripts. The Mauriceville and Varkud plasmids have more than 97% positional identity. Both plasmids contain a 710 amino acid open reading frame that encodes a reverse transcriptase-like protein. The amino acid sequence of this open reading frame is strongly conserved between the two plasmids (701/710 amino acids) as expected for a functionally important protein. Both plasmids have a 0.4 kb region that contains five PstI palindromes and a direct repeat of approximately 160 base-pairs. Comparison of sequences in this region suggests that the Varkud plasmid has diverged less from a common ancestor than has the Mauriceville plasmid. Two major transcripts of the Varkud plasmid were detected by Northern hybridization experiments: a full-length linear RNA of 3.7 kb and an additional prominent transcript of 4.9 kb, 1.2 kb longer than monomer plasmid. Remarkably, we find that the 4.9 kb transcript is a hybrid RNA consisting of the full-length 3.7 kb Varkud plasmid transcript plus a 5' leader of 1.2 kb that is derived from the 5' end of the mitochondrial small rRNA. This and other findings suggest that the Varkud plasmid, like certain RNA viruses, has a mechanism for joining heterologous RNAs to the 5' end of its major transcript, and that, under some circumstances, nucleotide sequences in mitochondria may be recombined at the RNA level.

  9. Sequence similarity is more relevant than species specificity in probabilistic backtranslation.

    PubMed

    Ferro, Alfredo; Giugno, Rosalba; Pigola, Giuseppe; Pulvirenti, Alfredo; Di Pietro, Cinzia; Purrello, Michele; Ragusa, Marco

    2007-02-21

    Backtranslation is the process of decoding a sequence of amino acids into the corresponding codons. All synthetic gene design systems include a backtranslation module. The degeneracy of the genetic code makes backtranslation potentially ambiguous since most amino acids are encoded by multiple codons. The common approach to overcome this difficulty is based on imitation of codon usage within the target species. This paper describes EasyBack, a new parameter-free, fully-automated software for backtranslation using Hidden Markov Models. EasyBack is not based on imitation of codon usage within the target species, but instead uses a sequence-similarity criterion. The model is trained with a set of proteins with known cDNA coding sequences, constructed from the input protein by querying the NCBI databases with BLAST. Unlike existing software, the proposed method allows the quality of prediction to be estimated. When tested on a group of proteins that show different degrees of sequence conservation, EasyBack outperforms other published methods in terms of precision. The prediction quality of a protein backtranslation method is markedly increased by replacing the criterion of most used codon in the same species with a Hidden Markov Model trained with a set of most similar sequences from all species. Moreover, the proposed method allows the quality of prediction to be estimated probabilistically.
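
    To make the ambiguity concrete, the following is a minimal sketch of the "most used codon" baseline that the abstract contrasts with EasyBack. It is not EasyBack's Hidden Markov Model method. The codon assignments come from the standard genetic code, but the frequencies in CODON_USAGE and the function names are hypothetical, chosen only for illustration.

```python
# Toy codon-usage table. The codons listed for each amino acid follow the
# standard genetic code; the frequencies are HYPOTHETICAL illustration values,
# not measured codon usage for any species.
CODON_USAGE = {
    "M": {"ATG": 1.00},
    "K": {"AAA": 0.74, "AAG": 0.26},
    "L": {"CTG": 0.50, "TTA": 0.13, "TTG": 0.13,
          "CTT": 0.10, "CTC": 0.10, "CTA": 0.04},
    "W": {"TGG": 1.00},
}


def backtranslate(protein: str) -> str:
    """Choose the most frequent codon for each residue (the baseline approach)."""
    codons = []
    for residue in protein:
        usage = CODON_USAGE[residue]  # KeyError if the toy table lacks the residue
        codons.append(max(usage, key=usage.get))
    return "".join(codons)


def count_encodings(protein: str) -> int:
    """Count the distinct DNA sequences that encode the same protein."""
    total = 1
    for residue in protein:
        total *= len(CODON_USAGE[residue])
    return total


if __name__ == "__main__":
    peptide = "MKLW"
    print(backtranslate(peptide))
    print(count_encodings(peptide))
```

    Even this four-residue peptide has several possible coding sequences, because lysine and leucine are each encoded by more than one codon; the number grows multiplicatively with protein length.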

  10. The sequence of sequencers: The history of sequencing DNA

    PubMed Central

    Heather, James M.; Chain, Benjamin

    2016-01-01

    Determining the order of nucleic acid residues in biological samples is an integral component of a wide variety of research applications. Over the last fifty years large numbers of researchers have applied themselves to the production of techniques and technologies to facilitate this feat, sequencing DNA and RNA molecules. This time-scale has witnessed tremendous changes, moving from sequencing short oligonucleotides to millions of bases, from struggling towards the deduction of the coding sequence of a single gene to rapid and widely available whole genome sequencing. This article traverses those years, iterating through the different generations of sequencing technology, highlighting some of the key discoveries, researchers, and sequences along the way. PMID:26554401

  11. EGVII endoglucanase and nucleic acids encoding the same

    DOEpatents

    Dunn-Coleman, Nigel; Goedegebuur, Frits; Ward, Michael; Yao, Jian

    2014-02-25

    The present invention provides a novel endoglucanase nucleic acid sequence, designated egl7, and the corresponding EGVII amino acid sequence. The invention also provides expression vectors and host cells comprising a nucleic acid sequence encoding EGVII, recombinant EGVII proteins and methods for producing the same.

  12. EGVII endoglucanase and nucleic acids encoding the same

    DOEpatents

    Dunn-Coleman, Nigel; Goedegebuur, Frits; Ward, Michael; Yao, Jian

    2006-05-16

    The present invention provides a novel endoglucanase nucleic acid sequence, designated egl7, and the corresponding EGVII amino acid sequence. The invention also provides expression vectors and host cells comprising a nucleic acid sequence encoding EGVII, recombinant EGVII proteins and methods for producing the same.

  13. EGVI endoglucanase and nucleic acids encoding the same

    DOEpatents

    Dunn-Coleman, Nigel [Los Gatos, CA; Goedegebuur, Frits [Vlaardingen, NL; Ward, Michael [San Francisco, CA; Yao, Jian [Sunnyvale, CA

    2008-04-01

    The present invention provides a novel endoglucanase nucleic acid sequence, designated egl6, and the corresponding EGVI amino acid sequence. The invention also provides expression vectors and host cells comprising a nucleic acid sequence encoding EGVI, recombinant EGVI proteins and methods for producing the same.

  14. EGVI endoglucanase and nucleic acids encoding the same

    DOEpatents

    Dunn-Coleman, Nigel; Goedegebuur, Frits; Ward, Michael; Yao, Jian

    2010-10-12

    The present invention provides a novel endoglucanase nucleic acid sequence, designated egl6, and the corresponding EGVI amino acid sequence. The invention also provides expression vectors and host cells comprising a nucleic acid sequence encoding EGVI, recombinant EGVI proteins and methods for producing the same.

  15. EGVIII endoglucanase and nucleic acids encoding the same

    DOEpatents

    Dunn-Coleman, Nigel; Goedegebuur, Frits; Ward, Michael; Yao, Jian

    2006-05-23

    The present invention provides a novel endoglucanase nucleic acid sequence, designated egl8, and the corresponding EGVIII amino acid sequence. The invention also provides expression vectors and host cells comprising a nucleic acid sequence encoding EGVIII, recombinant EGVIII proteins and methods for producing the same.

  16. EGVI endoglucanase and nucleic acids encoding the same

    DOEpatents

    Dunn-Coleman, Nigel; Goedegebuur, Frits; Ward, Michael; Yao, Jian

    2010-10-05

    The present invention provides a novel endoglucanase nucleic acid sequence, designated egl6, and the corresponding EGVI amino acid sequence. The invention also provides expression vectors and host cells comprising a nucleic acid sequence encoding EGVI, recombinant EGVI proteins and methods for producing the same.

  17. EGVI endoglucanase and nucleic acids encoding the same

    DOEpatents

    Dunn-Coleman, Nigel; Goedegebuur, Frits; Ward, Michael; Yao, Jian

    2006-06-06

    The present invention provides a novel endoglucanase nucleic acid sequence, designated egl6, and the corresponding EGVI amino acid sequence. The invention also provides expression vectors and host cells comprising a nucleic acid sequence encoding EGVI, recombinant EGVI proteins and methods for producing the same.

  18. EGVII endoglucanase and nucleic acids encoding the same

    DOEpatents

    Dunn-Coleman, Nigel [Los Gatos, CA; Goedegebuur, Frits [Vlaardingen, NL; Ward, Michael [San Francisco, CA; Yao, Jian [Sunnyvale, CA

    2009-05-05

    The present invention provides an endoglucanase nucleic acid sequence, designated egl7, and the corresponding EGVII amino acid sequence. The invention also provides expression vectors and host cells comprising a nucleic acid sequence encoding EGVII, recombinant EGVII proteins and methods for producing the same.

  19. EGVII endoglucanase and nucleic acids encoding the same

    DOEpatents

    Dunn-Coleman, Nigel; Goedegebuur, Frits; Ward, Michael; Yao, Jian

    2013-07-16

    The present invention provides a novel endoglucanase nucleic acid sequence, designated egl7, and the corresponding EGVII amino acid sequence. The invention also provides expression vectors and host cells comprising a nucleic acid sequence encoding EGVII, recombinant EGVII proteins and methods for producing the same.

  20. EGVII endoglucanase and nucleic acids encoding the same

    DOEpatents

    Dunn-Coleman, Nigel [Los Gatos, CA; Goedegebuur, Frits [Vlaardingen, NL; Ward, Michael [San Francisco, CA; Yao, Jian [Sunnyvale, CA

    2012-02-14

    The present invention provides a novel endoglucanase nucleic acid sequence, designated egl7, and the corresponding EGVII amino acid sequence. The invention also provides expression vectors and host cells comprising a nucleic acid sequence encoding EGVII, recombinant EGVII proteins and methods for producing the same.

  1. EGVII endoglucanase and nucleic acids encoding the same

    DOEpatents

    Dunn-Coleman, Nigel; Goedegebuur, Frits; Ward, Michael; Yao, Jian

    2015-04-14

    The present invention provides a novel endoglucanase nucleic acid sequence, designated egl7, and the corresponding EGVII amino acid sequence. The invention also provides expression vectors and host cells comprising a nucleic acid sequence encoding EGVII, recombinant EGVII proteins and methods for producing the same.

  2. Geometric Patterns for Neighboring Bases Near the Stacked State in Nucleic Acid Strands.

    PubMed

    Sedova, Ada; Banavali, Nilesh K

    2017-03-14

    Structural variation in base stacking has been analyzed frequently in isolated double helical contexts for nucleic acids, but not as often in nonhelical geometries or in complex biomolecular environments. In this study, conformations of two neighboring bases near their stacked state in any environment are comprehensively characterized for single-strand dinucleotide (SSD) nucleic acid crystal structure conformations. An ensemble clustering method is used to identify a reduced set of representative stacking geometries based on pairwise distances between select atoms in consecutive bases, with multiple separable conformational clusters obtained for categories divided by nucleic acid type (DNA/RNA), SSD sequence, stacking face orientation, and the presence or absence of a protein environment. For both DNA and RNA, SSD conformations are observed that are either close to the A-form, or close to the B-form, or intermediate between the two forms, or further away from either form, illustrating the local structural heterogeneity near the stacked state. Among this large variety of distinct conformations, several common stacking patterns are observed between DNA and RNA, and between nucleic acids in isolation or in complex with proteins, suggesting that these might be stable stacking orientations. Noncanonical face/face orientations of the two bases are also observed for neighboring bases in the same strand, but their frequency is much lower, with multiple SSD sequences across categories showing no occurrences of such unusual stacked conformations. The resulting reduced set of stacking geometries is directly useful for stacking-energy comparisons between empirical force fields, prediction of plausible localized variations in single-strand structures near their canonical states, and identification of analogous stacking patterns in newly solved nucleic acid containing structures.

  3. Kit for detecting nucleic acid sequences using competitive hybridization probes

    DOEpatents

    Lucas, Joe N.; Straume, Tore; Bogen, Kenneth T.

    2001-01-01

    A kit is provided for detecting a target nucleic acid sequence in a sample, the kit comprising: a first hybridization probe which includes a nucleic acid sequence that is sufficiently complementary to selectively hybridize to a first portion of the target sequence, the first hybridization probe including a first complexing agent for forming a binding pair with a second complexing agent; and a second hybridization probe which includes a nucleic acid sequence that is sufficiently complementary to selectively hybridize to a second portion of the target sequence to which the first hybridization probe does not selectively hybridize, the second hybridization probe including a detectable marker; a third hybridization probe which includes a nucleic acid sequence that is sufficiently complementary to selectively hybridize to a first portion of the target sequence, the third hybridization probe including the same detectable marker as the second hybridization probe; and a fourth hybridization probe which includes a nucleic acid sequence that is sufficiently complementary to selectively hybridize to a second portion of the target sequence to which the third hybridization probe does not selectively hybridize, the fourth hybridization probe including the first complexing agent for forming a binding pair with the second complexing agent; wherein the first and second hybridization probes are capable of simultaneously hybridizing to the target sequence and the third and fourth hybridization probes are capable of simultaneously hybridizing to the target sequence, the detectable marker is not present on the first or fourth hybridization probes and the first, second, third, and fourth hybridization probes each include a competitive nucleic acid sequence which is sufficiently complementary to a third portion of the target sequence that the competitive sequences of the first, second, third, and fourth hybridization probes compete with each other to hybridize to the third portion of the target sequence.

  4. Intramolecular interactions in aminoacyl nucleotides: Implications regarding the origin of genetic coding and protein synthesis

    NASA Technical Reports Server (NTRS)

    Lacey, J. C., Jr.; Mullins, D. W., Jr.; Watkins, C. L.; Hall, L. M.

    1986-01-01

    Cellular organisms store information as sequences of nucleotides in double stranded DNA. This information is useless unless it can be converted into the active molecular species, protein. This is done in contemporary creatures first by transcription of one strand to give a complementary strand of mRNA. The sequence of nucleotides is then translated into a specific sequence of amino acids in a protein. Translation is made possible by a genetic coding system in which a sequence of three nucleotides codes for a specific amino acid. The origin and evolution of any chemical system can be understood through elucidation of the properties of the chemical entities which make up the system. There is an underlying logic to the coding system revealed by a correlation of the hydrophobicities of amino acids and their anticodonic nucleotides (i.e., the complement of the codon). Its importance lies in the fact that every amino acid going into protein synthesis must first be activated. This is universally accomplished with ATP. Past studies have concentrated on the chemistry of the adenylates, but more recently we have found, through the use of NMR, that we can observe intramolecular interactions even at low concentrations, between amino acid side chains and nucleotide base rings in these adenylates. The use of this type of compound thus affords a novel way of elucidating the manner in which amino acids and nucleotides interact with each other. In aqueous solution, when a hydrophobic amino acid is attached to the most hydrophobic nucleotide, AMP, a hydrophobic interaction takes place between the amino acid side chain and the adenine ring. The studies to be reported concern these hydrophobic interactions.
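
    The two-step flow described above (transcription of one DNA strand into complementary mRNA, then translation of nucleotide triplets into amino acids) can be sketched as follows. This is an illustrative sketch only: the template sequence is made up, CODON_TABLE is a small subset of the standard genetic code, and the function names are hypothetical.

```python
# Base-pairing rules for building mRNA from a DNA template strand.
DNA_TO_RNA_COMPLEMENT = {"A": "U", "T": "A", "G": "C", "C": "G"}

# A small subset of the standard genetic code ("*" marks a stop codon).
# A full table would have 64 entries.
CODON_TABLE = {
    "AUG": "M",  # methionine (start)
    "UUU": "F",  # phenylalanine
    "GGC": "G",  # glycine
    "AAA": "K",  # lysine
    "UAA": "*",  # stop
}


def transcribe(template_3_to_5: str) -> str:
    """Return the mRNA (5'->3') complementary to a template strand given 3'->5'."""
    return "".join(DNA_TO_RNA_COMPLEMENT[base] for base in template_3_to_5)


def translate(mrna: str) -> str:
    """Read the mRNA three nucleotides at a time until a stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


if __name__ == "__main__":
    template = "TACAAACCGTTTATT"  # made-up template strand, written 3'->5'
    mrna = transcribe(template)
    print(mrna)
    print(translate(mrna))
```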

  5. Can identification of a fourth domain of life be made from sequence data alone, and could it be done on Mars?

    PubMed

    Poole, Anthony M; Willerslev, Eske

    2007-10-01

    A central question in astrobiology is whether life exists elsewhere in the universe. If so, is it related to Earth life? Technologies exist that enable identification of DNA- or RNA-based microbial life directly from environmental samples here on Earth. Such technologies could, in principle, be applied to the search for life elsewhere; indeed, efforts are underway to initiate such a search. However, surveying for nucleic acid-based life on other planets, if attempted, must be carried out with caution, owing to the risk of contamination by Earth-based life. Here we argue that the null hypothesis must be that any DNA discovered and sequenced from samples taken elsewhere in the universe are Earth-based contaminants. Experience from studies of low-biomass ancient DNA demonstrates that some results, by their very nature, will not enable complete rejection of the null hypothesis. In terms of eliminating contamination as an explanation of the data, there may be value in identification of sequences that lie outside the known diversity of the three domains of life. We therefore have examined whether a fourth domain could be readily identified from environmental DNA sequence data alone. We concluded that, even on Earth, this would be far from trivial, and we illustrate this point by way of examples drawn from the literature. Overall, our conclusions do not bode well for planned PCR-based surveys for life on Mars, and we argue that other independent biosignatures will be essential in corroborating any claims for the presence of life based on nucleic acid sequences.

  6. The Functional Human C-Terminome

    PubMed Central

    Hedden, Michael; Lyon, Kenneth F.; Brooks, Steven B.; David, Roxanne P.; Limtong, Justin; Newsome, Jacklyn M.; Novakovic, Nemanja; Rajasekaran, Sanguthevar; Thapar, Vishal; Williams, Sean R.; Schiller, Martin R.

    2016-01-01

    All translated proteins end with a carboxylic acid commonly called the C-terminus. Many short functional sequences (minimotifs) are located on or immediately proximal to the C-terminus. However, information about the function of protein C-termini has not been consolidated into a single source. Here, we built a new “C-terminome” database and web system focused on human proteins. Approximately 3,600 C-termini in the human proteome have a minimotif with an established molecular function. To help evaluate the function of the remaining C-termini in the human proteome, we inferred minimotifs identified by experimentation in rodent cells, predicted minimotifs based upon consensus sequence matches, and predicted novel highly repetitive sequences in C-termini. Predictions can be ranked by enrichment scores or Gene Evolutionary Rate Profiling (GERP) scores, a measurement of evolutionary constraint. By searching for new anchored sequences on the last 10 amino acids of proteins in the human proteome with lengths between 3–10 residues and up to 5 degenerate positions in the consensus sequences, we have identified new consensus sequences that predict instances in the majority of human genes. All of this information is consolidated into a database that can be accessed through a C-terminome web system with search and browse functions for minimotifs and human proteins. A known consensus sequence-based predicted function is assigned to nearly half the proteins in the human proteome. Weblink: http://cterminome.bio-toolkit.com. PMID:27050421
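
    As a rough illustration of searching for consensus sequences anchored at the C-terminus, the sketch below matches a pattern against only the last 10 residues of each protein and requires the match to end at the final residue. It is not the C-terminome system's implementation; the function name, the example sequences, and the consensus pattern are all hypothetical.

```python
import re


def cterminal_matches(proteins, consensus, window=10):
    """Return {name: matched_tail} for proteins whose C-terminus fits the consensus.

    `consensus` is a regular-expression fragment in which "." stands for a
    degenerate position. Only the last `window` residues are searched, and the
    match must end exactly at the C-terminal residue.
    """
    pattern = re.compile(consensus + r"$")
    hits = {}
    for name, sequence in proteins.items():
        tail = sequence[-window:]
        match = pattern.search(tail)
        if match:
            hits[name] = match.group(0)
    return hits


if __name__ == "__main__":
    # Made-up sequences for illustration only.
    proteins = {
        "protA": "MKTAYIAKQRQISFVKSHFSRQETSV",
        "protB": "MSTLVAAAKK",
    }
    # Hypothetical three-residue consensus with one degenerate position.
    print(cterminal_matches(proteins, "[ST].[VIL]"))
```

    In this toy input, protB contains the pattern near its N-terminus but not at its end, so the anchor excludes it; only protA is reported.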

  7. Isolation and characterization of NBS-LRR- resistance gene candidates in turmeric (Curcuma longa cv. surama).

    PubMed

    Joshi, R K; Mohanty, S; Subudhi, E; Nayak, S

    2010-09-08

    Turmeric (Curcuma longa), an important asexually reproducing spice crop of the family Zingiberaceae is highly susceptible to bacterial and fungal pathogens. The identification of resistance gene analogs holds great promise for development of resistant turmeric cultivars. Degenerate primers designed based on known resistance genes (R-genes) were used in combinations to elucidate resistance gene analogs from Curcuma longa cultivar surama. The three primers resulted in amplicons with expected sizes of 450-600 bp. The nucleotide sequence of these amplicons was obtained through sequencing; their predicted amino acid sequences compared to each other and to the amino acid sequences of known R-genes revealed significant sequence similarity. The finding of conserved domains, viz., kinase-1a, kinase-2 and hydrophobic motif, provided evidence that the sequences belong to the NBS-LRR class gene family. The presence of tryptophan as the last residue of kinase-2 motif further qualified them to be in the non-TIR-NBS-LRR subfamily of resistance genes. A cluster analysis based on the neighbor-joining method was carried out using Curcuma NBS analogs together with several resistance gene analogs and known R-genes, which classified them into two distinct subclasses, corresponding to clades N3 and N4 of non-TIR-NBS sequences described in plants. The NBS analogs that we isolated can be used as guidelines to eventually isolate numerous R-genes in turmeric.

  8. "De-novo" amino acid sequence elucidation of protein G'e by combined "top-down" and "bottom-up" mass spectrometry.

    PubMed

    Yefremova, Yelena; Al-Majdoub, Mahmoud; Opuni, Kwabena F M; Koy, Cornelia; Cui, Weidong; Yan, Yuetian; Gross, Michael L; Glocker, Michael O

    2015-03-01

    Mass spectrometric de-novo sequencing was applied to review the amino acid sequence of a commercially available recombinant protein G' with great scientific and economic importance. Substantial deviations to the published amino acid sequence (Uniprot Q54181) were found by the presence of 46 additional amino acids at the N-terminus, including a so-called "His-tag" as well as an N-terminal partial α-N-gluconoylation and α-N-phosphogluconoylation, respectively. The unexpected amino acid sequence of the commercial protein G' comprised 241 amino acids and resulted in a molecular mass of 25,998.9 ± 0.2 Da for the unmodified protein. Due to the higher mass that is caused by its extended amino acid sequence compared with the original protein G' (185 amino acids), we named this protein "protein G'e." By means of mass spectrometric peptide mapping, the suggested amino acid sequence, as well as the N-terminal partial α-N-gluconoylations, was confirmed with 100% sequence coverage. After the protein G'e sequence was determined, we were able to determine the expression vector pET-28b from Novagen with the Xho I restriction enzyme cleavage site as the best option that was used for cloning and expressing the recombinant protein G'e in E. coli. A dissociation constant (K(d)) value of 9.4 nM for protein G'e was determined thermophoretically, showing that the N-terminal flanking sequence extension did not cause significant changes in the binding affinity to immunoglobulins.

  9. Distribution and diversity of Verrucomicrobia methanotrophs in geothermal and acidic environments.

    PubMed

    Sharp, Christine E; Smirnova, Angela V; Graham, Jaime M; Stott, Matthew B; Khadka, Roshan; Moore, Tim R; Grasby, Stephen E; Strack, Maria; Dunfield, Peter F

    2014-06-01

    Recently, methanotrophic members of the phylum Verrucomicrobia have been described, but little is known about their distribution in nature. We surveyed methanotrophic bacteria in geothermal springs and acidic wetlands via pyrosequencing of 16S rRNA gene amplicons. Putative methanotrophic Verrucomicrobia were found in samples covering a broad temperature range (22.5-81.6°C), but only in acidic conditions (pH 1.8-5.0) and only in geothermal environments, not in acidic bogs or fens. Phylogenetically, three 16S rRNA gene sequence clusters of putative methanotrophic Verrucomicrobia were observed. Those detected in high-temperature geothermal samples (44.1-81.6°C) grouped with known thermoacidiphilic 'Methylacidiphilum' isolates. A second group dominated in moderate-temperature geothermal samples (22.5-40.1°C) and a representative mesophilic methanotroph from this group was isolated (strain LP2A). Genome sequencing verified that strain LP2A possessed particulate methane monooxygenase, but its 16S rRNA gene sequence identity to 'Methylacidiphilum infernorum' strain V4 was only 90.6%. A third group clustered distantly with known methanotrophic Verrucomicrobia. Using pmoA-gene targeted quantitative polymerase chain reaction, two geothermal soil profiles showed a dominance of LP2A-like pmoA sequences in the cooler surface layers and 'Methylacidiphilum'-like pmoA sequences in deeper, hotter layers. Based on these results, there appears to be a thermophilic group and a mesophilic group of methanotrophic Verrucomicrobia. However, both were detected only in acidic geothermal environments. © 2014 Society for Applied Microbiology and John Wiley & Sons Ltd.

  10. Transposon Tn10 contains two structural genes with opposite polarity between tetA and IS10R.

    PubMed Central

    Schollmeier, K; Hillen, W

    1984-01-01

    The nucleotide sequence of the central part of Tn10 has been determined from the rightmost HindIII site to IS10R. This sequence contains two open reading frames with opposite polarity. The in vivo transcription start points in this sequence have been determined by S1 mapping. These results define one minor and two major promoters. The transcription starts of the two major promoters are only 18 base pairs apart, and the transcripts show different polarity and overlap by 18 base pairs. The nucleotide sequence reveals two regions with palindromic symmetry which may serve as operators. Their possible involvement in the regulation of transcription of both genes is discussed. Taken together these results allow for a maximal coding capacity of 138 amino acids directed toward IS10R and 197 amino acids directed toward tetA. The possible function of these gene products is discussed. The accompanying article (Braus et al., J. Bacteriol. 160:504-509, 1984) presents evidence that these genes are expressed. PMID:6094471

  11. Comparative characterization of random-sequence proteins consisting of 5, 12, and 20 kinds of amino acids

    PubMed Central

    Tanaka, Junko; Doi, Nobuhide; Takashima, Hideaki; Yanagawa, Hiroshi

    2010-01-01

    Screening of functional proteins from a random-sequence library has been used to evolve novel proteins in the field of evolutionary protein engineering. However, random-sequence proteins consisting of the 20 natural amino acids tend to aggregate, and the occurrence rate of functional proteins in a random-sequence library is low. From the viewpoint of the origin of life, it has been proposed that primordial proteins consisted of a limited set of amino acids that could have been abundantly formed early during chemical evolution. We have previously found that members of a random-sequence protein library constructed with five primitive amino acids show high solubility (Doi et al., Protein Eng Des Sel 2005;18:279–284). Although such a library is expected to be appropriate for finding functional proteins, the functionality may be limited, because they have no positively charged amino acid. Here, we constructed three libraries of 120-amino acid, random-sequence proteins using alphabets of 5, 12, and 20 amino acids by preselection using mRNA display (to eliminate sequences containing stop codons and frameshifts) and characterized and compared the structural properties of random-sequence proteins arbitrarily chosen from these libraries. We found that random-sequence proteins constructed with the 12-member alphabet (including five primitive amino acids and positively charged amino acids) have higher solubility than those constructed with the 20-member alphabet, though other biophysical properties are very similar in the two libraries. Thus, a library of moderate complexity constructed from 12 amino acids may be a more appropriate resource for functional screening than one constructed from 20 amino acids. PMID:20162614

  12. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Reiser, Steven E.; Somerville, Chris R.

    The present invention relates to bacterial enzymes, in particular to an acyl-CoA reductase and a gene encoding an acyl-CoA reductase, the amino acid and nucleic acid sequences corresponding to the reductase polypeptide and gene, respectively, and to methods of obtaining such enzymes, amino acid sequences and nucleic acid sequences. The invention also relates to the use of such sequences to provide transgenic host cells capable of producing fatty alcohols and fatty aldehydes.

  13. BGL7 beta-glucosidase and nucleic acids encoding the same

    DOEpatents

    Dunn-Coleman, Nigel; Ward, Michael

    2013-01-29

    The present invention provides a novel beta-glucosidase nucleic acid sequence, designated bgl7, and the corresponding BGL7 amino acid sequence. The invention also provides expression vectors and host cells comprising a nucleic acid sequence encoding BGL7, recombinant BGL7 proteins and methods for producing the same.

  14. BGL6 beta-glucosidase and nucleic acids encoding the same

    DOEpatents

    Dunn-Coleman, Nigel; Ward, Michael

    2012-10-02

    The present invention provides a novel beta-glucosidase nucleic acid sequence, designated bgl6, and the corresponding BGL6 amino acid sequence. The invention also provides expression vectors and host cells comprising a nucleic acid sequence encoding BGL6, recombinant BGL6 proteins and methods for producing the same.

  15. BGL5 beta-glucosidase and nucleic acids encoding the same

    DOEpatents

    Dunn-Coleman, Nigel; Goedegebuur, Frits; Ward, Michael; Yao, Jian

    2006-02-28

    The present invention provides a novel beta-glucosidase nucleic acid sequence, designated bgl5, and the corresponding BGL5 amino acid sequence. The invention also provides expression vectors and host cells comprising a nucleic acid sequence encoding BGL5, recombinant BGL5 proteins and methods for producing the same.

  16. BGL5 beta-glucosidase and nucleic acids encoding the same

    DOEpatents

    Dunn-Coleman, Nigel [Los Gatos, CA]; Goedegebuur, Frits [Vlaardingen, NL]; Ward, Michael [San Francisco, CA]; Yao, Jian [Sunnyvale, CA]

    2008-03-18

    The present invention provides a novel beta-glucosidase nucleic acid sequence, designated bgl5, and the corresponding BGL5 amino acid sequence. The invention also provides expression vectors and host cells comprising a nucleic acid sequence encoding BGL5, recombinant BGL5 proteins and methods for producing the same.

  17. BGL6 beta-glucosidase and nucleic acids encoding the same

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Dunn-Coleman, Nigel; Ward, Michael

    The present invention provides a novel beta-glucosidase nucleic acid sequence, designated bgl6, and the corresponding BGL6 amino acid sequence. The invention also provides expression vectors and host cells comprising a nucleic acid sequence encoding BGL6, recombinant BGL6 proteins and methods for producing the same.

  18. BGL6 beta-glucosidase and nucleic acids encoding the same

    DOEpatents

    Dunn-Coleman, Nigel; Ward, Michael

    2014-03-04

    The present invention provides a novel beta-glucosidase nucleic acid sequence, designated bgl6, and the corresponding BGL6 amino acid sequence. The invention also provides expression vectors and host cells comprising a nucleic acid sequence encoding BGL6, recombinant BGL6 proteins and methods for producing the same.

  19. BGL7 beta-glucosidase and nucleic acids encoding the same

    DOEpatents

    Dunn-Coleman, Nigel; Ward, Michael

    2015-04-14

    The present invention provides a novel beta-glucosidase nucleic acid sequence, designated bgl7, and the corresponding BGL7 amino acid sequence. The invention also provides expression vectors and host cells comprising a nucleic acid sequence encoding BGL7, recombinant BGL7 proteins and methods for producing the same.

  20. BGL7 beta-glucosidase and nucleic acids encoding the same

    DOEpatents

    Dunn-Coleman, Nigel; Ward, Michael

    2014-03-25

    The present invention provides a novel beta-glucosidase nucleic acid sequence, designated bgl7, and the corresponding BGL7 amino acid sequence. The invention also provides expression vectors and host cells comprising a nucleic acid sequence encoding BGL7, recombinant BGL7 proteins and methods for producing the same.

  1. BGL6 beta-glucosidase and nucleic acids encoding the same

    DOEpatents

    Dunn-Coleman, Nigel; Ward, Michael

    2015-08-11

    The present invention provides a novel beta-glucosidase nucleic acid sequence, designated bgl6, and the corresponding BGL6 amino acid sequence. The invention also provides expression vectors and host cells comprising a nucleic acid sequence encoding BGL6, recombinant BGL6 proteins and methods for producing the same.

  2. BGL3 beta-glucosidase and nucleic acids encoding the same

    DOEpatents

    Dunn-Coleman, Nigel; Goedegebuur, Frits; Ward, Michael; Yao, Jian

    2007-09-25

    The present invention provides a novel beta-glucosidase nucleic acid sequence, designated bgl3, and the corresponding BGL3 amino acid sequence. The invention also provides expression vectors and host cells comprising a nucleic acid sequence encoding BGL3, recombinant BGL3 proteins and methods for producing the same.

  3. BGL3 beta-glucosidase and nucleic acids encoding the same

    DOEpatents

    Dunn-Coleman, Nigel [Los Gatos, CA]; Goedegebuur, Frits [Vlaardingen, NL]; Ward, Michael [San Francisco, CA]; Yao, Jian [Sunnyvale, CA]

    2008-04-01

    The present invention provides a novel beta-glucosidase nucleic acid sequence, designated bgl3, and the corresponding BGL3 amino acid sequence. The invention also provides expression vectors and host cells comprising a nucleic acid sequence encoding BGL3, recombinant BGL3 proteins and methods for producing the same.

  4. BGL4 beta-glucosidase and nucleic acids encoding the same

    DOEpatents

    Dunn-Coleman, Nigel [Los Gatos, CA]; Goedegebuur, Frits [Vlaardingen, NL]; Ward, Michael [San Francisco, CA]; Yao, Jian [Sunnyvale, CA]

    2011-12-06

    The present invention provides a novel beta-glucosidase nucleic acid sequence, designated bgl4, and the corresponding BGL4 amino acid sequence. The invention also provides expression vectors and host cells comprising a nucleic acid sequence encoding BGL4, recombinant BGL4 proteins and methods for producing the same.

  5. BGL4 beta-glucosidase and nucleic acids encoding the same

    DOEpatents

    Dunn-Coleman, Nigel; Goedegebuur, Frits; Ward, Michael; Yao, Jian

    2006-05-16

    The present invention provides a novel beta-glucosidase nucleic acid sequence, designated bgl4, and the corresponding BGL4 amino acid sequence. The invention also provides expression vectors and host cells comprising a nucleic acid sequence encoding BGL4, recombinant BGL4 proteins and methods for producing the same.

  6. BGL3 beta-glucosidase and nucleic acids encoding the same

    DOEpatents

    Dunn-Coleman, Nigel [Los Gatos, CA]; Goedegebuur, Frits [Vlaardingen, NL]; Ward, Michael [San Francisco, CA]; Yao, Jian [Sunnyvale, CA]

    2011-06-14

    The present invention provides a novel beta-glucosidase nucleic acid sequence, designated bgl3, and the corresponding BGL3 amino acid sequence. The invention also provides expression vectors and host cells comprising a nucleic acid sequence encoding BGL3, recombinant BGL3 proteins and methods for producing the same.

  7. BGL6 beta-glucosidase and nucleic acids encoding the same

    DOEpatents

    Dunn-Coleman, Nigel [Los Gatos, CA]; Ward, Michael [San Francisco, CA]

    2009-09-01

    The present invention provides a novel beta-glucosidase nucleic acid sequence, designated bgl6, and the corresponding BGL6 amino acid sequence. The invention also provides expression vectors and host cells comprising a nucleic acid sequence encoding BGL6, recombinant BGL6 proteins and methods for producing the same.

  8. BGL3 beta-glucosidase and nucleic acids encoding the same

    DOEpatents

    Dunn-Coleman, Nigel; Goedegebuur, Frits; Ward, Michael; Yao, Jian

    2012-10-30

    The present invention provides a novel beta-glucosidase nucleic acid sequence, designated bgl3, and the corresponding BGL3 amino acid sequence. The invention also provides expression vectors and host cells comprising a nucleic acid sequence encoding BGL3, recombinant BGL3 proteins and methods for producing the same.

  9. BGL4 beta-glucosidase and nucleic acids encoding the same

    DOEpatents

    Dunn-Coleman, Nigel [Los Gatos, CA]; Goedegebuur, Frits [Vlaardingen, NL]; Ward, Michael [San Francisco, CA]; Yao, Jian [Sunnyvale, CA]

    2008-01-22

    The present invention provides a novel beta-glucosidase nucleic acid sequence, designated bgl4, and the corresponding BGL4 amino acid sequence. The invention also provides expression vectors and host cells comprising a nucleic acid sequence encoding BGL4, recombinant BGL4 proteins and methods for producing the same.

  10. Molecular characterization of southern bluefin tuna myoglobin (Thunnus maccoyii).

    PubMed

    Nurilmala, Mala; Ochiai, Yoshihiro

    2016-10-01

    The primary structure of southern bluefin tuna (Thunnus maccoyii) myoglobin (Mb) has been elucidated by molecular cloning techniques. The Mb-encoding cDNA of this tuna contained 776 nucleotides, with an open reading frame of 444 nucleotides encoding 147 amino acids. The nucleotide sequence of the coding region was identical to those of other bluefin tunas (T. thynnus and T. orientalis), thus giving the same amino acid sequence. Based on the deduced amino acid sequence, bioinformatic analysis was performed, including a phylogenetic tree, hydropathy plot and homology modeling. To investigate the autoxidation profiles, Mb was isolated from the dark muscle. The water-soluble fraction was subjected to ammonium sulfate fractionation (60-90% saturation) followed by preparative gel electrophoresis. Autoxidation profiles of Mb were delineated at pH 5.6, 6.5 and 7.4 at 37 °C. The autoxidation rate of tuna Mb was slightly higher than that of horse Mb at all pH values examined. These results revealed that tuna Mb was less stable than horse Mb, mainly at acidic pH.
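    The relationship between the reported open reading frame length and protein length can be checked with simple arithmetic. The sketch below is illustrative only: the function name is hypothetical, and it assumes the reported 444 nucleotides include the terminating stop codon, which is consistent with 444 / 3 = 148 codons and 147 amino acids.

```python
def orf_protein_length(orf_nt: int, includes_stop: bool = True) -> int:
    """Return the number of amino acids encoded by an ORF of orf_nt nucleotides.

    Assumption: the ORF is in frame, so its length is a multiple of 3.
    If includes_stop is True, the final codon is a stop codon and
    does not contribute an amino acid.
    """
    if orf_nt <= 0 or orf_nt % 3 != 0:
        raise ValueError("ORF length must be a positive multiple of 3")
    codons = orf_nt // 3
    return codons - 1 if includes_stop else codons


# 444 nt -> 148 codons -> 147 residues plus one stop codon
print(orf_protein_length(444))
```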

  11. Pseudopropionibacterium sp. nov., a novel red-pigmented species isolated from human gingival sulcus.

    PubMed

    Saito, Masanori; Shinozaki-Kuwahara, Noriko; Tsudukibashi, Osamu; Hashizume-Takizawa, Tomomi; Kobayashi, Ryoki; Kurita-Ochiai, Tomoko

    2018-04-24

    Strain SK-1T is a novel Gram-stain-positive, pleomorphic, rod-shaped, non-spore-forming, non-motile organism isolated from human gingival sulcus that produces acetic acid, propionic acid, lactic acid, and succinic acid as end products of glucose fermentation. Strain SK-1T was most closely related to Pseudopropionibacterium (Propionibacterium) propionicum, with sequence homologies of the 16S rRNA and RNA polymerase β subunit (rpoB) genes of 96.6% and 93.1%, respectively. The genomic DNA G + C content of the isolate was 61.8 mol%. Based on the sequence data of the 16S rRNA and housekeeping (rpoB) genes, we propose a novel taxon, Pseudopropionibacterium rubrum sp. nov. (type strain SK-1T = JCM 31317T = DSM 100122T). The 16S rRNA and rpoB gene sequences of strain SK-1T were deposited in the DNA Data Bank of Japan under the accession numbers LC002971 and LC102236, respectively. © 2018 The Societies and John Wiley & Sons Australia, Ltd.

  12. Construction and characterization of a normalized cDNA library of Nannochloropsis oculata (Eustigmatophyceae)

    NASA Astrophysics Data System (ADS)

    Yu, Jianzhong; Ma, Xiaolei; Pan, Kehou; Yang, Guanpin; Yu, Wengong

    2010-07-01

    We constructed and characterized a normalized cDNA library of Nannochloropsis oculata CS-179, and obtained 905 nonredundant sequences (NRSs) ranging from 431 to 1,756 bp in length. Among them, 496 were very similar to nonredundant sequences in GenBank (E ≤ 1.0e-05), and 349 ESTs had significant hits with the clusters of eukaryotic orthologous groups (KOG). Bases G and/or C at the third position of codons of 14 amino acid residues suggested a strong bias in the conserved domain of 362 NRSs (>60%). We also identified the unigenes encoding phosphorus and nitrogen transporters, suggesting that N. oculata could efficiently transport and metabolize phosphorus and nitrogen, and recognized the unigenes that are involved in biosynthesis and storage of both fatty acids and polyunsaturated fatty acids (PUFAs), which will facilitate the demonstration of the eicosapentaenoic acid (EPA) biosynthesis pathway of N. oculata. In comparison with the original cDNA library, the normalized library significantly increased the efficiency of random sequencing and of discovering rarely expressed genes, and decreased the frequency of abundant gene sequences.

  13. Acid-base chemical reaction model for nucleation rates in the polluted atmospheric boundary layer.

    PubMed

    Chen, Modi; Titcombe, Mari; Jiang, Jingkun; Jen, Coty; Kuang, Chongai; Fischer, Marc L; Eisele, Fred L; Siepmann, J Ilja; Hanson, David R; Zhao, Jun; McMurry, Peter H

    2012-11-13

    Climate models show that particles formed by nucleation can affect cloud cover and, therefore, the earth's radiation budget. Measurements worldwide show that nucleation rates in the atmospheric boundary layer are positively correlated with concentrations of sulfuric acid vapor. However, current nucleation theories do not correctly predict either the observed nucleation rates or their functional dependence on sulfuric acid concentrations. This paper develops an alternative approach for modeling nucleation rates, based on a sequence of acid-base reactions. The model uses empirical estimates of sulfuric acid evaporation rates obtained from new measurements of neutral molecular clusters. The model predicts that nucleation rates equal the sulfuric acid vapor collision rate times a prefactor that is less than unity and that depends on the concentrations of basic gaseous compounds and preexisting particles. Predicted nucleation rates and their dependence on sulfuric acid vapor concentrations are in reasonable agreement with measurements from Mexico City and Atlanta.

  14. Exploring the Sequence-based Prediction of Folding Initiation Sites in Proteins.

    PubMed

    Raimondi, Daniele; Orlando, Gabriele; Pancsa, Rita; Khan, Taushif; Vranken, Wim F

    2017-08-18

    Protein folding is a complex process that can lead to disease when it fails. Especially poorly understood are the very early stages of protein folding, which are likely defined by intrinsic local interactions between amino acids close to each other in the protein sequence. We here present EFoldMine, a method that predicts, from the primary amino acid sequence of a protein, which amino acids are likely involved in early folding events. The method is based on early folding data from hydrogen deuterium exchange (HDX) data from NMR pulsed labelling experiments, and uses backbone and sidechain dynamics as well as secondary structure propensities as features. The EFoldMine predictions give insights into the folding process, as illustrated by a qualitative comparison with independent experimental observations. Furthermore, on a quantitative proteome scale, the predicted early folding residues tend to become the residues that interact the most in the folded structure, and they are often residues that display evolutionary covariation. The connection of the EFoldMine predictions with both folding pathway data and the folded protein structure suggests that the initial statistical behavior of the protein chain with respect to local structure formation has a lasting effect on its subsequent states.

  15. Homology between DNA polymerases of poxviruses, herpesviruses, and adenoviruses: nucleotide sequence of the vaccinia virus DNA polymerase gene.

    PubMed Central

    Earl, P L; Jones, E V; Moss, B

    1986-01-01

    A 5400-base-pair segment of the vaccinia virus genome was sequenced and an open reading frame of 938 codons was found precisely where the DNA polymerase had been mapped by transfer of a phosphonoacetate-resistance marker. A single nucleotide substitution changing glycine at position 347 to aspartic acid accounts for the drug resistance of the mutant vaccinia virus. The 5' end of the DNA polymerase mRNA was located 80 base pairs before the methionine codon initiating the open reading frame. Correspondence between the predicted Mr 108,577 polypeptide and the 110,000 purified enzyme indicates that little or no proteolytic processing occurs. Extensive homology, extending over 435 amino acids, was found upon comparing the DNA polymerase of vaccinia virus and DNA polymerase of Epstein-Barr virus. A highly conserved sequence of 14 amino acids in the carboxyl-terminal regions of the above DNA polymerases is also present at a similar location in adenovirus DNA polymerase. This structure, which is predicted to form a turn flanked by beta-pleated sheets, may form part of an essential binding or catalytic site that accounts for its presence in DNA polymerases of poxviruses, herpesviruses, and adenoviruses. PMID:3012524

  16. Comparative genomics of citric-acid-producing Aspergillus niger ATCC 1015 versus enzyme-producing CBS 513.88

    PubMed Central

    Andersen, Mikael R.; Salazar, Margarita P.; Schaap, Peter J.; van de Vondervoort, Peter J.I.; Culley, David; Thykaer, Jette; Frisvad, Jens C.; Nielsen, Kristian F.; Albang, Richard; Albermann, Kaj; Berka, Randy M.; Braus, Gerhard H.; Braus-Stromeyer, Susanna A.; Corrochano, Luis M.; Dai, Ziyu; van Dijck, Piet W.M.; Hofmann, Gerald; Lasure, Linda L.; Magnuson, Jon K.; Menke, Hildegard; Meijer, Martin; Meijer, Susan L.; Nielsen, Jakob B.; Nielsen, Michael L.; van Ooyen, Albert J.J.; Pel, Herman J.; Poulsen, Lars; Samson, Rob A.; Stam, Hein; Tsang, Adrian; van den Brink, Johannes M.; Atkins, Alex; Aerts, Andrea; Shapiro, Harris; Pangilinan, Jasmyn; Salamov, Asaf; Lou, Yigong; Lindquist, Erika; Lucas, Susan; Grimwood, Jane; Grigoriev, Igor V.; Kubicek, Christian P.; Martinez, Diego; van Peij, Noël N.M.E.; Roubos, Johannes A.; Nielsen, Jens; Baker, Scott E.

    2011-01-01

    The filamentous fungus Aspergillus niger exhibits great diversity in its phenotype. It is found globally, both as marine and terrestrial strains, produces both organic acids and hydrolytic enzymes in high amounts, and some isolates exhibit pathogenicity. Although the genome of an industrial enzyme-producing A. niger strain (CBS 513.88) has already been sequenced, the versatility and diversity of this species compel additional exploration. We therefore undertook whole-genome sequencing of the acidogenic A. niger wild-type strain (ATCC 1015) and produced a genome sequence of very high quality. Only 15 gaps are present in the sequence, and half the telomeric regions have been elucidated. Moreover, sequence information from ATCC 1015 was used to improve the genome sequence of CBS 513.88. Chromosome-level comparisons uncovered several genome rearrangements, deletions, a clear case of strain-specific horizontal gene transfer, and identification of 0.8 Mb of novel sequence. Single nucleotide polymorphisms per kilobase (SNPs/kb) between the two strains were found to be exceptionally high (average: 7.8, maximum: 160 SNPs/kb). High variation within the species was confirmed with exo-metabolite profiling and phylogenetics. Detailed lists of alleles were generated, and genotypic differences were observed to accumulate in metabolic pathways essential to acid production and protein synthesis. A transcriptome analysis supported up-regulation of genes associated with biosynthesis of amino acids that are abundant in glucoamylase A, tRNA-synthases, and protein transporters in the protein producing CBS 513.88 strain. Our results and data sets from this integrative systems biology analysis resulted in a snapshot of fungal evolution and will support further optimization of cell factories based on filamentous fungi. PMID:21543515

  17. Degenerative minimalism in the genome of a psyllid endosymbiont.

    PubMed

    Clark, M A; Baumann, L; Thao, M L; Moran, N A; Baumann, P

    2001-03-01

    Psyllids, like aphids, feed on plant phloem sap and are obligately associated with prokaryotic endosymbionts acquired through vertical transmission from an ancestral infection. We have sequenced 37 kb of DNA of the genome of Carsonella ruddii, the endosymbiont of psyllids, and found that it has a number of unusual properties revealing a more extreme case of degeneration than was previously reported from studies of eubacterial genomes, including that of the aphid endosymbiont Buchnera aphidicola. Among the unusual properties are an exceptionally low guanine-plus-cytosine content (19.9%), almost complete absence of intergenic spaces, operon fusion, and lack of the usual promoter sequences upstream of 16S rDNA. These features suggest the synthesis of long mRNAs and translational coupling. The most extreme instances of base compositional bias occur in the genes encoding proteins that have less highly conserved amino acid sequences; the guanine-plus-cytosine content of some protein-coding sequences is as low as 10%. The shift in base composition has a large effect on proteins: in polypeptides of C. ruddii, half of the residues consist of five amino acids with codons low in guanine plus cytosine. Furthermore, the proteins of C. ruddii are reduced in size, with an average of about 9% fewer amino acids than in homologous proteins of related bacteria. These observations suggest that the C. ruddii genome is not subject to constraints that limit the evolution of other known eubacteria.
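    The guanine-plus-cytosine content quoted above is a simple base-composition statistic. The following is a minimal sketch of how such a percentage is typically computed; the function name and the toy sequence are hypothetical and are not taken from the C. ruddii data, and ambiguous bases are simply ignored, which is an assumption of this example.

```python
def gc_content(seq: str) -> float:
    """Return the G+C content of a DNA sequence as a percentage.

    Only unambiguous bases (A, C, G, T) are counted; any other
    symbols (N, gaps, whitespace) are skipped.
    """
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no A, C, G or T bases")
    gc = sum(1 for b in bases if b in "GC")
    return 100.0 * gc / len(bases)


# Toy AT-rich sequence: 2 of 10 bases are G or C
print(gc_content("ATATATATGC"))
```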

  18. Neutral changes during divergent evolution of hemoglobins

    NASA Technical Reports Server (NTRS)

    Jukes, T. H.

    1978-01-01

    A comparison of the mRNAs for rabbit and human beta-hemoglobins shows that synonymous changes in codons have accumulated three times as rapidly as nucleotide replacements that produced changes in amino acids. This agrees with predictions based on the so-called neutral theory. In addition, seven codon changes that appear to be single-base changes (according to maximum parsimony) are actually two-base changes. This indicates that the construction of primordial sequences is of limited significance when based on inferences that assume minimum base changes for amino acid replacements.
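    The distinction between synonymous codon changes and amino acid replacements can be illustrated with the standard genetic code. The sketch below is not the procedure used in the study: the function names and the short example sequences are hypothetical, and the comparison assumes two in-frame coding sequences of equal length with no insertions or deletions.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G
# (third position varying fastest); '*' marks stop codons.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def classify_codon_change(codon_a: str, codon_b: str) -> str:
    """Classify a codon pair as 'identical', 'synonymous' or 'replacement'."""
    a = codon_a.upper().replace("U", "T")  # accept mRNA or DNA alphabet
    b = codon_b.upper().replace("U", "T")
    if a == b:
        return "identical"
    return "synonymous" if CODON_TABLE[a] == CODON_TABLE[b] else "replacement"


def count_codon_changes(cds_a: str, cds_b: str) -> dict:
    """Tally codon-by-codon change classes for two aligned coding sequences."""
    if len(cds_a) != len(cds_b) or len(cds_a) % 3 != 0:
        raise ValueError("sequences must be equal length and a multiple of 3")
    counts = {"identical": 0, "synonymous": 0, "replacement": 0}
    for i in range(0, len(cds_a), 3):
        counts[classify_codon_change(cds_a[i:i + 3], cds_b[i:i + 3])] += 1
    return counts


# CTG->CTA keeps Leu, GAG->GTG changes Glu to Val, AAA is unchanged
print(count_codon_changes("CTGGAGAAA", "CTAGTGAAA"))
```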

  19. The Biomolecule Sequencer Project: Nanopore Sequencing as a Dual-Use Tool for Crew Health and Astrobiology Investigations

    NASA Technical Reports Server (NTRS)

    John, K. K.; Botkin, D. S.; Burton, A. S.; Castro-Wallace, S. L.; Chaput, J. D.; Dworkin, J. P.; Lehman, N.; Lupisella, M. L.; Mason, C. E.; Smith, D. J.

    2016-01-01

    Human missions to Mars will fundamentally transform how the planet is explored, enabling new scientific discoveries through more sophisticated sample acquisition and processing than can currently be implemented in robotic exploration. The presence of humans also poses new challenges, including ensuring astronaut safety and health and monitoring contamination. Because the capability to transfer materials to Earth will be extremely limited, there is a strong need for in situ diagnostic capabilities. Nucleotide sequencing is a particularly powerful tool because it can be used to: (1) mitigate microbial risks to crew by allowing identification of microbes in water, in air, and on surfaces; (2) identify optimal treatment strategies for infections that arise in crew members; and (3) track how crew members, microbes, and mission-relevant organisms (e.g., farmed plants) respond to conditions on Mars through transcriptomic and genomic changes. Sequencing would also offer benefits for science investigations occurring on the surface of Mars by permitting identification of Earth-derived contamination in samples. If Mars contains indigenous life, and that life is based on nucleic acids or other closely related molecules, sequencing would serve as a critical tool for the characterization of those molecules. Therefore, spaceflight-compatible nucleic acid sequencing would be an important capability for both crew health and astrobiology exploration. Advances in sequencing technology on Earth have been driven largely by needs for higher throughput and read accuracy. Although some reduction in size has been achieved, nearly all commercially available sequencers are not compatible with spaceflight due to size, power, and operational requirements. Exceptions are nanopore-based sequencers that measure changes in current caused by DNA passing through pores; these devices are inherently much smaller and require significantly less power than sequencers using other detection methods. Consequently, nanopore-based sequencers could be made flight-ready with only minimal modifications.

  20. Characterization and prediction of residues determining protein functional specificity.

    PubMed

    Capra, John A; Singh, Mona

    2008-07-01

    Within a homologous protein family, proteins may be grouped into subtypes that share specific functions that are not common to the entire family. Often, the amino acids present in a small number of sequence positions determine each protein's particular functional specificity. Knowledge of these specificity determining positions (SDPs) aids in protein function prediction, drug design and experimental analysis. A number of sequence-based computational methods have been introduced for identifying SDPs; however, their further development and evaluation have been hindered by the limited number of known experimentally determined SDPs. We combine several bioinformatics resources to automate a process, typically undertaken manually, to build a dataset of SDPs. The resulting large dataset, which consists of SDPs in enzymes, enables us to characterize SDPs in terms of their physicochemical and evolutionary properties. It also facilitates the large-scale evaluation of sequence-based SDP prediction methods. We present a simple sequence-based SDP prediction method, GroupSim, and show that, surprisingly, it is competitive with a representative set of current methods. We also describe ConsWin, a heuristic that considers sequence conservation of neighboring amino acids, and demonstrate that it improves the performance of all methods tested on our large dataset of enzyme SDPs. Datasets and GroupSim code are available online at http://compbio.cs.princeton.edu/specificity/. Supplementary data are available at Bioinformatics online.

  1. Preparation and properties of pure, full-length IclR protein of Escherichia coli. Use of time-of-flight mass spectrometry to investigate the problems encountered.

    PubMed Central

    Donald, L. J.; Chernushevich, I. V.; Zhou, J.; Verentchikov, A.; Poppe-Schriemer, N.; Hosfield, D. J.; Westmore, J. B.; Ens, W.; Duckworth, H. W.; Standing, K. G.

    1996-01-01

    IclR protein, the repressor of the aceBAK operon of Escherichia coli, has been examined by time-of-flight mass spectrometry, with ionization by matrix-assisted laser desorption or by electrospray. The purified protein was found to have a smaller mass than that predicted from the base sequence of the cloned iclR gene. Additional measurements were made on mixtures of peptides derived from IclR by treatment with trypsin and cyanogen bromide. They showed that the amino acid sequence is that predicted from the gene sequence, except that the protein has suffered truncation by removal of the N-terminal eight or, in some cases, nine amino acid residues. The peptide bond whose hydrolysis would remove eight residues is a typical target for the E. coli protease OmpT. We find that, by taking precautions to minimize OmpT proteolysis, or by eliminating it through mutation of the host strain, we can isolate full-length IclR protein (lacking only the N-terminal methionine residue). Full-length IclR is a much better DNA-binding protein than the truncated versions: it binds the aceBAK operator sequence 44-fold more tightly, presumably because of additional contacts that the N-terminal residues make with the DNA. Our experience thus demonstrates the advantages of using mass spectrometry to characterize newly purified proteins produced from cloned genes, especially where proteolysis or other covalent modification is a concern. This technique gives mass spectra from complex peptide mixtures that can be analyzed completely, without any fractionation of the mixtures, by reference to the amino acid sequence inferred from the base sequence of the cloned gene. PMID:8844850

  2. Human jagged polypeptide, encoding nucleic acids and methods of use

    DOEpatents

    Li, Linheng; Hood, Leroy

    2000-01-01

    The present invention provides an isolated polypeptide exhibiting substantially the same amino acid sequence as JAGGED, or an active fragment thereof, provided that the polypeptide does not have the amino acid sequence of SEQ ID NO:5 or SEQ ID NO:6. The invention further provides an isolated nucleic acid molecule containing a nucleotide sequence encoding substantially the same amino acid sequence as JAGGED, or an active fragment thereof, provided that the nucleotide sequence does not encode the amino acid sequence of SEQ ID NO:5 or SEQ ID NO:6. Also provided herein is a method of inhibiting differentiation of hematopoietic progenitor cells by contacting the progenitor cells with an isolated JAGGED polypeptide, or active fragment thereof. The invention additionally provides a method of diagnosing Alagille Syndrome in an individual. The method consists of detecting an Alagille Syndrome disease-associated mutation linked to a JAGGED locus.

  3. Polypeptide having or assisting in carbohydrate material degrading activity and uses thereof

    DOEpatents

    Schooneveld-Bergmans, Margot Elisabeth Francoise; Heijne, Wilbert Herman Marie; Los, Alrik Pieter

    2016-02-16

    The invention relates to a polypeptide which comprises the amino acid sequence set out in SEQ ID NO: 2 or an amino acid sequence encoded by the nucleotide sequence of SEQ ID NO: 1, or a variant polypeptide or variant polynucleotide thereof, wherein the variant polypeptide has at least 76% sequence identity with the sequence set out in SEQ ID NO: 2 or the variant polynucleotide encodes a polypeptide that has at least 76% sequence identity with the sequence set out in SEQ ID NO: 2. The invention features the full length coding sequence of the novel gene as well as the amino acid sequence of the full-length functional polypeptide and functional equivalents of the gene or the amino acid sequence. The invention also relates to methods for using the polypeptide in industrial processes. Also included in the invention are cells transformed with a polynucleotide according to the invention suitable for producing these proteins.
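    A sequence identity threshold such as the one in this record is evaluated on an alignment of the variant against the reference sequence. The sketch below shows one simple convention, assuming the two sequences are already aligned with '-' as the gap character and taking identical residues over all aligned columns. The function name, the denominator convention, and the toy sequences are assumptions of this example; the patent itself may define identity through a specific alignment program and parameters.

```python
def percent_identity(aln_a: str, aln_b: str) -> float:
    """Percent identity between two pre-aligned sequences.

    Assumptions: both strings have the same length, '-' marks a gap,
    columns where both sequences have a gap are ignored, and a gap
    paired with a residue counts as a mismatch.
    """
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aln_a.upper(), aln_b.upper())
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("alignment has no informative columns")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


# Toy alignment: 8 identical columns out of 10
identity = percent_identity("MKTAYIAKQR", "MKTAHIA-QR")
print(identity, identity >= 76.0)
```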

  4. Polypeptide having beta-glucosidase activity and uses thereof

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Schooneveld-Bergmans, Margot Elisabeth Francoise; Heijne, Wilbert Herman Marie; De Jong, Rene Marcel

    The invention relates to a polypeptide comprising the amino acid sequence set out in SEQ ID NO: 2 or an amino acid sequence encoded by the nucleotide sequence of SEQ ID NO: 1, or a variant polypeptide or variant polynucleotide thereof, wherein the variant polypeptide has at least 96% sequence identity with the sequence set out in SEQ ID NO: 2 or the variant polynucleotide encodes a polypeptide that has at least 96% sequence identity with the sequence set out in SEQ ID NO: 2. The invention features the full length coding sequence of the novel gene as well as the amino acid sequence of the full-length functional polypeptide and functional equivalents of the gene or the amino acid sequence. The invention also relates to methods for using the polypeptide in industrial processes. Also included in the invention are cells transformed with a polynucleotide according to the invention suitable for producing these proteins.

  5. Polypeptide having swollenin activity and uses thereof

    DOEpatents

    Schoonneveld-Bergmans, Margot Elizabeth Francoise; Heijne, Wilbert Herman Marie; Vlasie, Monica D; Damveld, Robbertus Antonius

    2015-11-04

    The invention relates to a polypeptide comprising the amino acid sequence set out in SEQ ID NO: 2 or an amino acid sequence encoded by the nucleotide sequence of SEQ ID NO: 1, or a variant polypeptide or variant polynucleotide thereof, wherein the variant polypeptide has at least 73% sequence identity with the sequence set out in SEQ ID NO: 2 or the variant polynucleotide encodes a polypeptide that has at least 73% sequence identity with the sequence set out in SEQ ID NO: 2. The invention features the full length coding sequence of the novel gene as well as the amino acid sequence of the full-length functional polypeptide and functional equivalents of the gene or the amino acid sequence. The invention also relates to methods for using the polypeptide in industrial processes. Also included in the invention are cells transformed with a polynucleotide according to the invention suitable for producing these proteins.

  6. Polypeptide having beta-glucosidase activity and uses thereof

    DOEpatents

    Schooneveld-Bergmans, Margot Elisabeth Francoise; Heijne, Wilbert Herman Marie; De Jong, Rene Marcel; Damveld, Robbertus Antonius

    2015-09-01

    The invention relates to a polypeptide comprising the amino acid sequence set out in SEQ ID NO: 2 or an amino acid sequence encoded by the nucleotide sequence of SEQ ID NO: 1, or a variant polypeptide or variant polynucleotide thereof, wherein the variant polypeptide has at least 70% sequence identity with the sequence set out in SEQ ID NO: 2 or the variant polynucleotide encodes a polypeptide that has at least 70% sequence identity with the sequence set out in SEQ ID NO: 2. The invention features the full length coding sequence of the novel gene as well as the amino acid sequence of the full-length functional polypeptide and functional equivalents of the gene or the amino acid sequence. The invention also relates to methods for using the polypeptide in industrial processes. Also included in the invention are cells transformed with a polynucleotide according to the invention suitable for producing these proteins.

  7. Polypeptide having cellobiohydrolase activity and uses thereof

    DOEpatents

    Sagt, Cornelis Maria Jacobus; Schooneveld-Bergmans, Margot Elisabeth Francoise; Roubos, Johannes Andries; Los, Alrik Pieter

    2015-09-15

    The invention relates to a polypeptide comprising the amino acid sequence set out in SEQ ID NO: 2 or an amino acid sequence encoded by the nucleotide sequence of SEQ ID NO: 1, or a variant polypeptide or variant polynucleotide thereof, wherein the variant polypeptide has at least 93% sequence identity with the sequence set out in SEQ ID NO: 2 or the variant polynucleotide encodes a polypeptide that has at least 93% sequence identity with the sequence set out in SEQ ID NO: 2. The invention features the full length coding sequence of the novel gene as well as the amino acid sequence of the full-length functional polypeptide and functional equivalents of the gene or the amino acid sequence. The invention also relates to methods for using the polypeptide in industrial processes. Also included in the invention are cells transformed with a polynucleotide according to the invention suitable for producing these proteins.

  8. Polypeptide having acetyl xylan esterase activity and uses thereof

    DOEpatents

    Schoonneveld-Bergmans, Margot Elisabeth Francoise; Heijne, Wilbert Herman Marie; Los, Alrik Pieter

    2015-10-20

    The invention relates to a polypeptide comprising the amino acid sequence set out in SEQ ID NO: 2 or an amino acid sequence encoded by the nucleotide sequence of SEQ ID NO: 1, or a variant polypeptide or variant polynucleotide thereof, wherein the variant polypeptide has at least 82% sequence identity with the sequence set out in SEQ ID NO: 2 or the variant polynucleotide encodes a polypeptide that has at least 82% sequence identity with the sequence set out in SEQ ID NO: 2. The invention features the full length coding sequence of the novel gene as well as the amino acid sequence of the full-length functional polypeptide and functional equivalents of the gene or the amino acid sequence. The invention also relates to methods for using the polypeptide in industrial processes. Also included in the invention are cells transformed with a polynucleotide according to the invention suitable for producing these proteins.

  9. Polypeptide having carbohydrate degrading activity and uses thereof

    DOEpatents

    Schooneveld-Bergmans, Margot Elisabeth Francoise; Heijne, Wilbert Herman Marie; Vlasie, Monica Diana; Damveld, Robbertus Antonius

    2015-08-18

    The invention relates to a polypeptide comprising the amino acid sequence set out in SEQ ID NO: 2 or an amino acid sequence encoded by the nucleotide sequence of SEQ ID NO: 1, or a variant polypeptide or variant polynucleotide thereof, wherein the variant polypeptide has at least 73% sequence identity with the sequence set out in SEQ ID NO: 2 or the variant polynucleotide encodes a polypeptide that has at least 73% sequence identity with the sequence set out in SEQ ID NO: 2. The invention features the full length coding sequence of the novel gene as well as the amino acid sequence of the full-length functional polypeptide and functional equivalents of the gene or the amino acid sequence. The invention also relates to methods for using the polypeptide in industrial processes. Also included in the invention are cells transformed with a polynucleotide according to the invention suitable for producing these proteins.
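
    The patent records above each hinge on a minimum percent sequence identity with SEQ ID NO: 2 (70%, 73%, 82%, 93%, 96%). As a rough illustration only, the sketch below computes percent identity over two already-aligned sequences and checks it against a threshold. The function names are hypothetical, and the denominator (columns where neither sequence has a gap) is an assumption; the patents may define identity differently, for example over the full alignment length.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns where neither sequence has a gap.

    Assumption: both inputs come from the same pairwise alignment, use '-'
    for gaps, and gapped columns are excluded from the denominator.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # skip columns that contain a gap
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


def meets_threshold(aligned_a: str, aligned_b: str, threshold_percent: float) -> bool:
    """True if the pair reaches the stated minimum percent identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold_percent


# Toy sequences (not taken from any patent): 7 of 8 positions match.
print(percent_identity("MKTAYIAK", "MKTAYLAK"))     # 87.5
print(meets_threshold("MKTAYIAK", "MKTAYLAK", 73))  # True
print(meets_threshold("MKTAYIAK", "MKTAYLAK", 93))  # False
```

    The same pair can therefore satisfy one claim's threshold and fail another's, which is why the stated percentage matters when comparing these records.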

  10. The alphabet of intrinsic disorder

    PubMed Central

    Uversky, Vladimir N

    2013-01-01

    The ability of a protein to fold into unique functional state or to stay intrinsically disordered is encoded in its amino acid sequence. Both ordered and intrinsically disordered proteins (IDPs) are natural polypeptides that use the same arsenal of 20 proteinogenic amino acid residues as their major building blocks. The exceptional structural plasticity of IDPs, their capability to exist as heterogeneous structural ensembles and their wide array of important disorder-based biological functions that complements functional repertoire of ordered proteins are all rooted within the peculiar differential usage of these building blocks by ordered proteins and IDPs. In fact, some residues (so-called disorder-promoting residues) are noticeably more common in IDPs than in sequences of ordered proteins, which, in their turn, are enriched in several order-promoting residues. Furthermore, residues can be arranged according to their “disorder promoting potencies,” which are evaluated based on the relative abundances of various amino acids in ordered and disordered proteins. This review continues a series of publications on the roles of different amino acids in defining the phenomenon of protein intrinsic disorder and concerns glutamic acid, which is the second most disorder-promoting residue. PMID:28516010

  11. Viral Genome DataBase: storing and analyzing genes and proteins from complete viral genomes.

    PubMed

    Hiscock, D; Upton, C

    2000-05-01

    The Viral Genome DataBase (VGDB) contains detailed information of the genes and predicted protein sequences from 15 completely sequenced genomes of large (>100 kb) viruses (2847 genes). The data that is stored includes DNA sequence, protein sequence, GenBank and user-entered notes, molecular weight (MW), isoelectric point (pI), amino acid content, A + T%, nucleotide frequency, dinucleotide frequency and codon use. The VGDB is a mySQL database with a user-friendly JAVA GUI. Results of queries can be easily sorted by any of the individual parameters. The software and additional figures and information are available at http://athena.bioc.uvic.ca/genomes/index.html .
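
    Several of the per-gene values listed in this record (A + T%, nucleotide frequency, dinucleotide frequency, codon use) are simple counts over the DNA sequence. The sketch below shows one plausible way to derive them; it is not the VGDB implementation, the function names are hypothetical, and it assumes an in-frame coding sequence over the alphabet A, C, G, T with overlapping dinucleotides.

```python
from collections import Counter

DNA = set("ACGT")


def nucleotide_frequency(dna: str) -> Counter:
    """Count each unambiguous base (A, C, G, T); other symbols are ignored."""
    return Counter(base for base in dna.upper() if base in DNA)


def at_percent(dna: str) -> float:
    """A + T content as a percentage of the unambiguous bases."""
    counts = nucleotide_frequency(dna)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return 100.0 * (counts["A"] + counts["T"]) / total


def dinucleotide_frequency(dna: str) -> Counter:
    """Count overlapping dinucleotides made only of unambiguous bases."""
    seq = dna.upper()
    pairs = (seq[i:i + 2] for i in range(len(seq) - 1))
    return Counter(pair for pair in pairs if set(pair) <= DNA)


def codon_usage(cds: str) -> Counter:
    """Count codons, reading from position 0 and dropping a trailing partial codon."""
    seq = cds.upper()
    usable = len(seq) - len(seq) % 3
    return Counter(seq[i:i + 3] for i in range(0, usable, 3))


toy_gene = "ATGGCCATTTAA"  # invented 4-codon example, not a VGDB entry
print(round(at_percent(toy_gene), 1))          # 66.7
print(dinucleotide_frequency(toy_gene)["TT"])  # 2
print(dict(codon_usage(toy_gene)))
```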

  12. Molecular classification based on apomorphic amino acids (Arthropoda, Hexapoda): Integrative taxonomy in the era of phylogenomics.

    PubMed

    Wu, Hao-Yang; Wang, Yan-Hui; Xie, Qiang; Ke, Yun-Ling; Bu, Wen-Jun

    2016-06-17

    With the great development of sequencing technologies and systematic methods, our understanding of evolutionary relationships at deeper levels within the tree of life has greatly improved over the last decade. However, the current taxonomic methodology is insufficient to describe the growing levels of diversity in both a standardised and general way due to the limitations of using only morphological traits to describe clades. Herein, we propose the idea of a molecular classification based on hierarchical and discrete amino acid characters. Clades are classified based on the results of phylogenetic analyses and described using amino acids with group specificity in phylograms. Practices based on the recently published phylogenomic datasets of insects together with 15 de novo sequenced transcriptomes in this study demonstrate that such a methodology can accommodate various higher ranks of taxonomy. Such an approach has the advantage of describing organisms in a standard and discrete way within a phylogenetic framework, thereby facilitating the recognition of clades from the view of the whole lineage, as indicated by PhyloCode. By combining identification keys and phylogenies, the molecular classification based on hierarchical and discrete characters may greatly boost the progress of integrative taxonomy.

  13. Molecular classification based on apomorphic amino acids (Arthropoda, Hexapoda): Integrative taxonomy in the era of phylogenomics

    PubMed Central

    Wu, Hao-Yang; Wang, Yan-Hui; Xie, Qiang; Ke, Yun-Ling; Bu, Wen-Jun

    2016-01-01

    With the great development of sequencing technologies and systematic methods, our understanding of evolutionary relationships at deeper levels within the tree of life has greatly improved over the last decade. However, the current taxonomic methodology is insufficient to describe the growing levels of diversity in both a standardised and general way due to the limitations of using only morphological traits to describe clades. Herein, we propose the idea of a molecular classification based on hierarchical and discrete amino acid characters. Clades are classified based on the results of phylogenetic analyses and described using amino acids with group specificity in phylograms. Practices based on the recently published phylogenomic datasets of insects together with 15 de novo sequenced transcriptomes in this study demonstrate that such a methodology can accommodate various higher ranks of taxonomy. Such an approach has the advantage of describing organisms in a standard and discrete way within a phylogenetic framework, thereby facilitating the recognition of clades from the view of the whole lineage, as indicated by PhyloCode. By combining identification keys and phylogenies, the molecular classification based on hierarchical and discrete characters may greatly boost the progress of integrative taxonomy. PMID:27312960

  14. Influence of physicochemical treatments on iron-based spent catalyst for catalytic oxidation of toluene.

    PubMed

    Kim, Sang Chai; Shim, Wang Geun

    2008-06-15

    The catalytic oxidation of toluene was studied over iron-based spent and regenerated catalysts. Air, hydrogen, or four different acid solutions (oxalic acid (C2H2O4), citric acid (C6H8O7), acetic acid (CH3COOH), and nitric acid (HNO3)) were employed to regenerate the spent catalyst. The properties of pretreated spent catalyst were characterized by the Brunauer-Emmett-Teller (BET), inductively coupled plasma (ICP), temperature programmed reduction (TPR), and X-ray diffraction (XRD) analyses. The air pretreatment significantly enhanced the catalytic activity of the spent catalyst in the pretreatment temperature range of 200-400 degrees C, but its catalytic activity diminished at the pretreatment temperature of 600 degrees C. The catalytic activity sequence with respect to the air pretreatment temperatures was 400 degrees C > 200 degrees C > parent > 600 degrees C. The TPR results indicated that the catalytic activity was correlated with both the oxygen mobility and the amount of available oxygen on the catalyst. In contrast, the hydrogen pretreatment had a negative effect on the catalytic activity, and toluene conversion decreased with increasing pretreatment temperatures (200-600 degrees C). The XRD and TPR results confirmed the formation of metallic iron which had a negative effect on the catalytic activity with increasing pretreatment temperature. The acid pretreatment improved the catalytic activity of the spent catalyst. The catalytic activity sequence with respect to the different acid pretreatments was found to be oxalic acid > citric acid > acetic acid ≥ nitric acid > parent. The TPR results of acid pretreated samples showed an increased amount of available oxygen which gave a positive effect on the catalytic activity. Accordingly, air or acid pretreatments were more promising methods of regenerating the iron-based spent catalyst. In particular, the oxalic acid pretreatment was found to be most effective in the formation of FeC2O4 species which contributed highly to the catalytic combustion of toluene.

  15. 37 CFR 1.821 - Nucleotide and/or amino acid sequence disclosures in patent applications.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... 37 Patents, Trademarks, and Copyrights 1 2010-07-01 2010-07-01 false Nucleotide and/or amino acid... Biotechnology Invention Disclosures Application Disclosures Containing Nucleotide And/or Amino Acid Sequences § 1.821 Nucleotide and/or amino acid sequence disclosures in patent applications. (a) Nucleotide and...

  16. 37 CFR 5.31-5.33 - [Reserved]

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... from abandonment 1.135 Amino Acid Sequences. (See Nucleotide and/or Amino Acid Sequences) Appeal to... Appeals and Interference 41.47 Of rejection of an application 1.104(a) Nucleotide and/or Amino Acid...) Symbols for nucleotide and/or amino acid sequence data 1.822 T Tables in patent applications 1.58 Terminal...

  17. 37 CFR 1.821 - Nucleotide and/or amino acid sequence disclosures in patent applications.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... 37 Patents, Trademarks, and Copyrights 1 2011-07-01 2011-07-01 false Nucleotide and/or amino acid... Biotechnology Invention Disclosures Application Disclosures Containing Nucleotide And/or Amino Acid Sequences § 1.821 Nucleotide and/or amino acid sequence disclosures in patent applications. (a) Nucleotide and...

  18. Isolation of prolactin and growth hormone from the pituitary of the holostean fish Amia calva.

    PubMed

    Dores, R M; Noso, T; Rand-Weaver, M; Kawauchi, H

    1993-06-01

    Pituitaries from adult male and female Amia calva (Order Holostei) were acid extracted and fractionated by gel filtration column chromatography and reversed-phase high performance liquid chromatography. This two-step isolation procedure yielded homogeneous pools of Amia prolactin (PRL) and growth hormone (GH). The amino acid composition of both purified polypeptides was determined. Primary sequence analysis of the first 22 positions at the N-terminal of Amia PRL revealed that this region has 63% sequence identity with eel PRL-1. The N-terminal region of Amia PRL lacks the disulfide bridge which is characteristic of tetrapod PRLs. Primary sequence analysis of the first 24 positions at the N-terminal of Amia GH revealed that this region has 62% sequence identity with eel GH and 54% sequence identity with both blue shark GH and sea turtle GH. Based on N-terminal analysis, it appears that Amia PRL and GH are more closely related to teleost PRLs and GHs than they are to tetrapod PRLs and GHs.

  19. Automated design evolution of stereochemically randomized protein foldamers

    NASA Astrophysics Data System (ADS)

    Ranbhor, Ranjit; Kumar, Anil; Patel, Kirti; Ramakrishnan, Vibin; Durani, Susheel

    2018-05-01

    Diversification of chain stereochemistry opens up the possibilities of an ‘in principle’ increase in the design space of proteins. This huge increase in the sequence and consequent structural variation is aimed at the generation of smart materials. To diversify protein structure stereochemically, we introduced L- and D-α-amino acids as the design alphabet. With a sequence design algorithm, we explored the usage of specific variables such as chirality and the sequence of this alphabet in independent steps. With molecular dynamics, we folded stereochemically diverse homopolypeptides and evaluated their ‘fitness’ for possible design as protein-like foldamers. We propose a fitness function to prune the most optimal fold among 1000 structures simulated with an automated repetitive simulated annealing molecular dynamics (AR-SAMD) approach. The highly scored poly-leucine fold with sequence lengths of 24 and 30 amino acids were later sequence-optimized using a Dead End Elimination cum Monte Carlo based optimization tool. This paper demonstrates a novel approach for the de novo design of protein-like foldamers.

  20. Self-Organizing Hidden Markov Model Map (SOHMMM).

    PubMed

    Ferles, Christos; Stafylopatis, Andreas

    2013-12-01

    A hybrid approach combining the Self-Organizing Map (SOM) and the Hidden Markov Model (HMM) is presented. The Self-Organizing Hidden Markov Model Map (SOHMMM) establishes a cross-section between the theoretic foundations and algorithmic realizations of its constituents. The respective architectures and learning methodologies are fused in an attempt to meet the increasing requirements imposed by the properties of deoxyribonucleic acid (DNA), ribonucleic acid (RNA), and protein chain molecules. The fusion and synergy of the SOM unsupervised training and the HMM dynamic programming algorithms bring forth a novel on-line gradient descent unsupervised learning algorithm, which is fully integrated into the SOHMMM. Since the SOHMMM carries out probabilistic sequence analysis with little or no prior knowledge, it can have a variety of applications in clustering, dimensionality reduction and visualization of large-scale sequence spaces, and also, in sequence discrimination, search and classification. Two series of experiments based on artificial sequence data and splice junction gene sequences demonstrate the SOHMMM's characteristics and capabilities. Copyright © 2013 Elsevier Ltd. All rights reserved.

  1. Molecular cloning of crustins from the hemocytes of Brazilian penaeid shrimps.

    PubMed

    Rosa, Rafael Diego; Bandeira, Paula Terra; Barracco, Margherita Anna

    2007-09-01

    Crustins are antimicrobial peptides initially identified in the hemocytes of the crab Carcinus maenas (11.5-kDa peptide or carcinin) and recently also recognized in penaeid shrimps and other crustacean species. The aim of this study was to identify sequences encoding for crustins from the hemocytes of four Brazilian penaeid species: Farfantepenaeus paulensis, Farfantepenaeus subtilis, Farfantepenaeus brasiliensis and Litopenaeus schmitti. Using primers based on consensus nucleotide alignment of crustins from different crustaceans, cDNA sequences coding for crustins in all indigenous penaeid species were amplified. The obtained four crustin sequences encoded for peptides containing a hydrophobic N-terminal region rich in glycine repeats and a C-terminal part with 12 cysteine residues and a conserved whey acidic protein domain. All obtained crustin sequences showed high amino acidic similarity among each other and with crustins from litopenaeid shrimps (76-98%). This is the first report of crustins in native Brazilian penaeid shrimps.

  2. Kinetics of coffee industrial residue pyrolysis using distributed activation energy model and components separation of bio-oil by sequencing temperature-raising pyrolysis.

    PubMed

    Chen, Nanwei; Ren, Jie; Ye, Ziwei; Xu, Qizhi; Liu, Jingyong; Sun, Shuiyu

    2016-12-01

    This study was carried out to investigate the kinetics of coffee industrial residue (CIR) pyrolysis, the effect of pyrolysis factors on the yield of bio-oil components, and the separation of bio-oil components. The kinetics of CIR pyrolysis was analyzed using the distributed activation energy model (DAEM), based on experiments in a thermogravimetric analyzer (TGA), which indicated that the average activation energy (E) is 187.86 kJ·mol⁻¹. The bio-oils were prepared from CIR pyrolysis in a vacuum tube furnace, and their components were determined by gas chromatography/mass spectrometry (GC-MS). Among pyrolysis factors, pyrolysis temperature is the most influential factor on the component yield of bio-oil, directly affecting the volatilization and yield of components (palmitic acid, linoleic acid, oleic acid, octadecanoic acid and caffeine). Furthermore, a new method (sequencing temperature-raising pyrolysis) was put forward and applied to the separation of bio-oil components. Based on experiments, a scheme for separating the bio-oil components was worked out. Copyright © 2016 Elsevier Ltd. All rights reserved.

  3. The sequence of sequencers: The history of sequencing DNA.

    PubMed

    Heather, James M; Chain, Benjamin

    2016-01-01

    Determining the order of nucleic acid residues in biological samples is an integral component of a wide variety of research applications. Over the last fifty years large numbers of researchers have applied themselves to the production of techniques and technologies to facilitate this feat, sequencing DNA and RNA molecules. This time-scale has witnessed tremendous changes, moving from sequencing short oligonucleotides to millions of bases, from struggling towards the deduction of the coding sequence of a single gene to rapid and widely available whole genome sequencing. This article traverses those years, iterating through the different generations of sequencing technology, highlighting some of the key discoveries, researchers, and sequences along the way. Copyright © 2015 The Authors. Published by Elsevier Inc. All rights reserved.

  4. MACARON: A python framework to identify and re-annotate multi-base affected codons in whole genome/exome sequence data.

    PubMed

    Khan, Waqasuddin; Saripella, Ganapathi Varma-; Ludwig, Thomas; Cuppens, Tania; Thibord, Florian; Génin, Emmanuelle; Deleuze, Jean-Francois; Trégouët, David-Alexandre

    2018-05-03

    Predicted deleteriousness of coding variants is a frequently used criterion to filter out variants detected in next-generation sequencing projects and to select candidates impacting on the risk of human diseases. Most available dedicated tools implement a base-to-base annotation approach that could be biased in the presence of several variants in the same genetic codon. We here propose the MACARON program that, from a standard VCF file, identifies, re-annotates and predicts the amino acid change resulting from multiple single nucleotide variants (SNVs) within the same genetic codon. Applied to the whole exome dataset of 573 individuals, MACARON identifies 114 situations where multiple SNVs within a genetic codon induce an amino acid change that is different from those predicted by a standard single-SNV annotation tool. Such events are not uncommon and deserve to be studied in sequencing projects with inconclusive findings. MACARON is written in Python with code available on the GENMED website (www.genmed.fr). Contact: david-alexandre.tregouet@inserm.fr. Supplementary data are available at Bioinformatics online.
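
    The core idea in this abstract, that SNVs falling in the same codon should be translated together rather than one at a time, can be illustrated with a small sketch. This is not MACARON's code or interface: the function name, the input format (0-based positions within a forward-strand coding sequence instead of VCF records), and the assumption that all SNVs sit on the same haplotype are simplifications made here for illustration.

```python
from collections import defaultdict
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def annotate_codons(cds, snvs):
    """Compare per-SNV and combined amino acid changes for each affected codon.

    cds  : coding sequence, in frame, forward strand (assumption).
    snvs : list of (0-based position in cds, alternate base) tuples,
           all assumed to lie on the same haplotype.
    """
    by_codon = defaultdict(list)
    for pos, alt in snvs:
        by_codon[pos // 3].append((pos % 3, alt))

    report = {}
    for index, changes in by_codon.items():
        ref_codon = cds[3 * index:3 * index + 3]
        # Base-by-base annotation: each SNV applied to the reference codon alone.
        single_aa = []
        for offset, alt in changes:
            codon = list(ref_codon)
            codon[offset] = alt
            single_aa.append(CODON_TABLE["".join(codon)])
        # Codon-level annotation: all SNVs in the codon applied together.
        combined = list(ref_codon)
        for offset, alt in changes:
            combined[offset] = alt
        report[index] = {
            "ref_aa": CODON_TABLE[ref_codon],
            "single_aa": single_aa,
            "combined_aa": CODON_TABLE["".join(combined)],
        }
    return report


# Invented example: codon GAA (Glu) carries two SNVs, G>A and A>C.
print(annotate_codons("ATGGAATAA", [(3, "A"), (4, "C")]))
# Separately the SNVs give AAA (Lys) and GCA (Ala); together they give ACA (Thr).
```

    In this toy case neither single-SNV prediction matches the combined result, which is the kind of discrepancy the abstract describes.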

  5. In Vitro Selection for Small-Molecule-Triggered Strand Displacement and Riboswitch Activity.

    PubMed

    Martini, Laura; Meyer, Adam J; Ellefson, Jared W; Milligan, John N; Forlin, Michele; Ellington, Andrew D; Mansy, Sheref S

    2015-10-16

    An in vitro selection method for ligand-responsive RNA sensors was developed that exploited strand displacement reactions. The RNA library was based on the thiamine pyrophosphate (TPP) riboswitch, and RNA sequences capable of hybridizing to a target duplex DNA in a TPP regulated manner were identified. After three rounds of selection, RNA molecules that mediated a strand exchange reaction upon TPP binding were enriched. The enriched sequences also showed riboswitch activity. Our results demonstrated that small-molecule-responsive nucleic acid sensors can be selected to control the activity of target nucleic acid circuitry.

  6. Molecular cloning and sequencing of the cDNA and gene for a novel elastinolytic metalloproteinase from Aspergillus fumigatus and its expression in Escherichia coli.

    PubMed Central

    Sirakova, T D; Markaryan, A; Kolattukudy, P E

    1994-01-01

    An extracellular elastinolytic metalloproteinase, purified from Aspergillus fumigatus isolated from an aspergillosis patient, and an internal peptide derived from it were subjected to N-terminal sequencing. Oligonucleotide primers based on these sequences were used to PCR amplify a segment of the metalloproteinase cDNA, which was used as a probe to isolate the cDNA and gene for this enzyme. The gene sequence matched exactly with the cDNA sequence except for the four introns that interrupted the open reading frame. According to the deduced amino acid sequence, the metalloproteinase has a signal sequence and 227 additional amino acids preceding the sequence for the mature protein of 389 amino acids with a calculated molecular mass of 42 kDa, which is close to the size of the purified mature fungal proteinase. This sequence contains segments that matched both the N terminus of the mature protein and the internal peptide. A. fumigatus metalloproteinase contains some of the conserved zinc-binding and active-site motifs characteristic of metalloproteinases but shows no overall homology with known metalloproteinases. The cDNA of the mature protein when introduced into Escherichia coli directed the expression of a protein with a size, N-terminal sequence, and immunological cross-reactivity identical to those of the native fungal enzyme. Although the enzyme in the inclusion bodies could not be renatured, expression at 30 degrees C yielded soluble enzyme that showed chromatographic behavior identical to that of the native fungal enzyme and catalyzed hydrolysis of elastin. The metalloproteinase gene described here was not found in Aspergillus flavus. PMID:7927676

  7. Cloning and expression of a cDNA coding for a human monocyte-derived plasminogen activator inhibitor.

    PubMed

    Antalis, T M; Clark, M A; Barnes, T; Lehrbach, P R; Devine, P L; Schevzov, G; Goss, N H; Stephens, R W; Tolstoshev, P

    1988-02-01

    Human monocyte-derived plasminogen activator inhibitor (mPAI-2) was purified to homogeneity from the U937 cell line and partially sequenced. Oligonucleotide probes derived from this sequence were used to screen a cDNA library prepared from U937 cells. One positive clone was sequenced and contained most of the coding sequence as well as a long incomplete 3' untranslated region (1112 base pairs). This cDNA sequence was shown to encode mPAI-2 by hybrid-select translation. A cDNA clone encoding the remainder of the mPAI-2 mRNA was obtained by primer extension of U937 poly(A)+ RNA using a probe complementary to the mPAI-2 coding region. The coding sequence for mPAI-2 was placed under the control of the lambda PL promoter, and the protein expressed in Escherichia coli formed a complex with urokinase that could be detected immunologically. By nucleotide sequence analysis, mPAI-2 cDNA encodes a protein containing 415 amino acids with a predicted unglycosylated Mr of 46,543. The predicted amino acid sequence of mPAI-2 is very similar to placental PAI-2 (3 amino acid differences) and shows extensive homology with members of the serine protease inhibitor (serpin) superfamily. mPAI-2 was found to be more homologous to ovalbumin (37%) than the endothelial plasminogen activator inhibitor, PAI-1 (26%). Like ovalbumin, mPAI-2 appears to have no typical amino-terminal signal sequence. The 3' untranslated region of the mPAI-2 cDNA contains a putative regulatory sequence that has been associated with the inflammatory mediators.

  8. Cloning and expression of a cDNA coding for a human monocyte-derived plasminogen activator inhibitor.

    PubMed Central

    Antalis, T M; Clark, M A; Barnes, T; Lehrbach, P R; Devine, P L; Schevzov, G; Goss, N H; Stephens, R W; Tolstoshev, P

    1988-01-01

    Human monocyte-derived plasminogen activator inhibitor (mPAI-2) was purified to homogeneity from the U937 cell line and partially sequenced. Oligonucleotide probes derived from this sequence were used to screen a cDNA library prepared from U937 cells. One positive clone was sequenced and contained most of the coding sequence as well as a long incomplete 3' untranslated region (1112 base pairs). This cDNA sequence was shown to encode mPAI-2 by hybrid-select translation. A cDNA clone encoding the remainder of the mPAI-2 mRNA was obtained by primer extension of U937 poly(A)+ RNA using a probe complementary to the mPAI-2 coding region. The coding sequence for mPAI-2 was placed under the control of the lambda PL promoter, and the protein expressed in Escherichia coli formed a complex with urokinase that could be detected immunologically. By nucleotide sequence analysis, mPAI-2 cDNA encodes a protein containing 415 amino acids with a predicted unglycosylated Mr of 46,543. The predicted amino acid sequence of mPAI-2 is very similar to placental PAI-2 (3 amino acid differences) and shows extensive homology with members of the serine protease inhibitor (serpin) superfamily. mPAI-2 was found to be more homologous to ovalbumin (37%) than the endothelial plasminogen activator inhibitor, PAI-1 (26%). Like ovalbumin, mPAI-2 appears to have no typical amino-terminal signal sequence. The 3' untranslated region of the mPAI-2 cDNA contains a putative regulatory sequence that has been associated with the inflammatory mediators. PMID:3257578

  9. Gene encoding a novel extracellular metalloprotease in Bacillus subtilis.

    PubMed Central

    Sloma, A; Rudolph, C F; Rufo, G A; Sullivan, B J; Theriault, K A; Ally, D; Pero, J

    1990-01-01

    The gene for a novel extracellular metalloprotease was cloned, and its nucleotide sequence was determined. The gene (mpr) encodes a primary product of 313 amino acids that has little similarity to other known Bacillus proteases. The amino acid sequence of the mature protease was preceded by a signal sequence of approximately 34 amino acids and a pro sequence of 58 amino acids. Four cysteine residues were found in the deduced amino acid sequence of the mature protein, indicating the possible presence of disulfide bonds. The mpr gene mapped in the cysA-aroI region of the chromosome and was not required for growth or sporulation. PMID:2105291

  10. Signal peptide discrimination and cleavage site identification using SVM and NN.

    PubMed

    Kazemian, H B; Yusuf, S A; White, K

    2014-02-01

    About 15% of all proteins in a genome contain a signal peptide (SP) sequence, at the N-terminus, that targets the protein to intracellular secretory pathways. Once the protein is targeted correctly in the cell, the SP is cleaved, releasing the mature protein. Accurate prediction of the presence of these short amino-acid SP chains is crucial for modelling the topology of membrane proteins, since SP sequences can be confused with transmembrane domains due to similar composition of hydrophobic amino acids. This paper presents a cascaded Support Vector Machine (SVM)-Neural Network (NN) classification methodology for SP discrimination and cleavage site identification. The proposed method utilises a dual phase classification approach using SVM as a primary classifier to discriminate SP sequences from Non-SP. The methodology further employs NNs to predict the most suitable cleavage site candidates. In phase one, a SVM classification utilises hydrophobic propensities as a primary feature vector extraction using symmetric sliding window amino-acid sequence analysis for discrimination of SP and Non-SP. In phase two, a NN classification uses asymmetric sliding window sequence analysis for prediction of cleavage site identification. The proposed SVM-NN method was tested using Uni-Prot non-redundant datasets of eukaryotic and prokaryotic proteins with SP and Non-SP N-termini. Computer simulation results demonstrate an overall accuracy of 0.90 for SP and Non-SP discrimination based on Matthews Correlation Coefficient (MCC) tests using SVM. For SP cleavage site prediction, the overall accuracy is 91.5% based on cross-validation tests using the novel SVM-NN model. © 2013 Published by Elsevier Ltd.
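
    Phase one of the method described above builds its features from hydrophobic propensities read through a symmetric sliding window. The sketch below shows what such a windowed feature might look like. It is an illustration under stated assumptions, not the authors' pipeline: the abstract does not name the hydrophobicity scale or window size, so the Kyte-Doolittle hydropathy values and the default window of 7 are choices made here, and the function name is hypothetical.

```python
# Kyte-Doolittle hydropathy index (assumed scale; the abstract names none).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def window_hydrophobicity(sequence, window=7):
    """Mean hydropathy of each symmetric window centred on a residue.

    Only centres with a full window on both sides are scored, so a
    sequence of length n yields n - window + 1 values. Residues outside
    the 20 standard amino acids raise KeyError.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd number")
    half = window // 2
    seq = sequence.upper()
    scores = []
    for centre in range(half, len(seq) - half):
        segment = seq[centre - half:centre + half + 1]
        scores.append(sum(KYTE_DOOLITTLE[aa] for aa in segment) / window)
    return scores


# Invented N-terminal fragment with a hydrophobic core, for illustration only.
profile = window_hydrophobicity("MKKLLLLLLAVSA", window=7)
print(len(profile), [round(score, 2) for score in profile])
```

    A feature vector of this kind could then be passed to a classifier; the asymmetric window used in phase two would differ only in taking unequal numbers of residues on each side of the candidate cleavage site.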

  11. Methods for chromosome-specific staining

    DOEpatents

    Gray, Joe W.; Pinkel, Daniel

    1995-01-01

    Methods and compositions for chromosome-specific staining are provided. Compositions comprise heterogenous mixtures of labeled nucleic acid fragments having substantially complementary base sequences to unique sequence regions of the chromosomal DNA for which their associated staining reagent is specific. Methods include methods for making the chromosome-specific staining compositions of the invention, and methods for applying the staining compositions to chromosomes.

  12. Leuconostoc pseudomesenteroides WCFur3 partial 16S rRNA gene

    USDA-ARS's Scientific Manuscript database

    This study used a partial 535 base pair 16S rRNA gene sequence to identify a bacterial isolate. Fatty acid profiles are consistent with the 16S rRNA gene sequence identification of this bacterium. The isolate was obtained from a compost bin in Fort Collins, Colorado, USA. The 16S rRNA gene sequen...

  13. FASH: A web application for nucleotides sequence search.

    PubMed

    Veksler-Lublinksy, Isana; Barash, Danny; Avisar, Chai; Troim, Einav; Chew, Paul; Kedem, Klara

    2008-05-27

    FASH (Fourier Alignment Sequence Heuristics) is a web application, based on the Fast Fourier Transform, for finding remote homologs within a long nucleic acid sequence. Given a query sequence and a long text-sequence (e.g., the human genome), FASH detects subsequences within the text that are remotely similar to the query. FASH offers an alternative approach to Blast/Fasta for querying long RNA/DNA sequences. FASH differs from these other approaches in that it does not depend on the existence of contiguous seed-sequences in its initial detection phase. The FASH web server is user friendly and very easy to operate. FASH can be accessed at https://fash.bgu.ac.il:8443/fash/default.jsp (secured website).
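    The general idea of seed-free, FFT-based matching can be sketched as follows. This is a generic textbook technique (counting matching bases at every alignment offset via FFT cross-correlation of per-base indicator vectors), not FASH's own algorithm, which the abstract does not spell out; the function names are hypothetical.

```python
import cmath


def fft(a, invert=False):
    """Recursive radix-2 FFT; len(a) must be a power of two."""
    n = len(a)
    if n == 1:
        return list(a)
    even = fft(a[0::2], invert)
    odd = fft(a[1::2], invert)
    sign = 1 if invert else -1
    out = [0] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + w
        out[k + n // 2] = even[k] - w
    return out


def match_counts(text, query):
    """Number of matching bases for each alignment offset of query in text."""
    n_off = len(text) - len(query) + 1
    size = 1
    while size < len(text) + len(query):
        size *= 2  # pad so the circular convolution equals the linear one
    totals = [0.0] * n_off
    for base in "ACGT":
        t = [1.0 if c == base else 0.0 for c in text]
        t += [0.0] * (size - len(text))
        # Reversing the query turns convolution into cross-correlation.
        q = [1.0 if c == base else 0.0 for c in reversed(query)]
        q += [0.0] * (size - len(query))
        prod = [x * y for x, y in zip(fft(t), fft(q))]
        conv = [v.real / size for v in fft(prod, invert=True)]
        for off in range(n_off):
            totals[off] += conv[off + len(query) - 1]
    return [round(v) for v in totals]
```

    Offsets with a high count are candidate similar regions even when no contiguous exact seed exists, which is the property the abstract contrasts with seed-based tools.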

  14. Thermophilic cellobiohydrolase

    DOEpatents

    Sapra, Rajat; Park, Joshua I.; Datta, Supratim; Simmons, Blake A.

    2017-04-18

    The present invention provides for a composition comprising a polypeptide comprising a first amino acid sequence having at least 70% identity with the amino acid sequence of Csac GH5 wherein said first amino acid sequence has a thermostable or thermophilic cellobiohydrolase (CBH) or exoglucanase activity.

  15. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kerr, J.M.; Fisher, L.W.; Termine, J.D.

    The authors have isolated and partially sequenced the human bone sialoprotein gene (IBSP). IBSP has been sublocalized by in situ hybridization to chromosome 4q38-q31 and is composed of six small exons (51 to 159 bp) and 1 large exon (approximately 2.6 kb). The intron/exon junctions defined by sequence analysis are of class O, retaining an intact coding triplet. Sequence analysis of the 5′ upstream region revealed a TATAA (nucleotides -30 to -25 from the transcriptional start point) and a CCAAT (nucleotides -56 to -52) box, both in the reverse orientation. Intron 1 contains interesting structural elements composed of polypyrimidine repeats followed by a poly(AC)n tract. Both types of structural elements have been detected in promoter regions of other genes and have been implicated in transcriptional regulation. Several differences between the previously published cDNA sequence and the authors' sequence have been identified, most of which are contained within the untranslated exon 1. Three base revisions in the coding region include a G to T (Gly to Val, amino acid 195), T to C (Val to Ala, amino acid 268), and T to A (Glu to Asp, amino acid 270). In conclusion, the genomic organization and potential regulatory elements of human IBSP have been elucidated. 42 refs., 4 figs., 1 tab.

  16. Sequencing of the amylopullulanase (apu) gene of Thermoanaerobacter ethanolicus 39E, and identification of the active site by site-directed mutagenesis.

    PubMed

    Mathupala, S P; Lowe, S E; Podkovyrov, S M; Zeikus, J G

    1993-08-05

    The complete nucleotide sequence of the gene encoding the dual active amylopullulanase of Thermoanaerobacter ethanolicus 39E (formerly Clostridium thermohydrosulfuricum) was determined. The structural gene (apu) contained a single open reading frame 4443 base pairs in length, corresponding to 1481 amino acids, with an estimated molecular weight of 162,780. Analysis of the deduced sequence of apu with sequences of alpha-amylases and alpha-1,6 debranching enzymes enabled the identification of four conserved regions putatively involved in substrate binding and in catalysis. The conserved regions were localized within a 2.9-kilobase pair gene fragment, which encoded a M(r) 100,000 protein that maintained the dual activities and thermostability of the native enzyme. The catalytic residues of amylopullulanase were tentatively identified by using hydrophobic cluster analysis for comparison of amino acid sequences of amylopullulanase and other amylolytic enzymes. Asp597, Glu626, and Asp703 were individually modified to their respective amide form, or the alternate acid form, and in all cases both alpha-amylase and pullulanase activities were lost, suggesting the possible involvement of 3 residues in a catalytic triad, and the presence of a putative single catalytic site within the enzyme. These findings substantiate amylopullulanase as a new type of amylosaccharidase.

  17. Gordonia caeni sp. nov., isolated from sludge of a sewage disposal plant.

    PubMed

    Srinivasan, Sathiyaraj; Park, Giho; Yang, Hyejin; Hwang, Supyong; Bae, Yoonjung; Jung, Yong-An; Kim, Myung Kyum; Lee, Myungjin

    2012-11-01

    A Gram-stain-positive, strictly aerobic, short-rod-shaped, non-motile strain (designated MJ32(T)) was isolated from a sludge sample of the Daejeon sewage disposal plant in South Korea. A polyphasic approach was applied to study the taxonomic position of strain MJ32(T). Strain MJ32(T) showed highest 16S rRNA gene sequence similarity to Gordonia hirsuta DSM 44140(T) (98.1%) and Gordonia hydrophobica DSM 44015(T) (97.0%); levels of sequence similarity to the type strains of other recognized Gordonia species were less than 97.0%. Phylogenetic analysis based on 16S rRNA gene sequences showed that strain MJ32(T) belonged to the clade formed by members of the genus Gordonia in the family Gordoniaceae. The G+C content of the genomic DNA of strain MJ32(T) was 69.2 mol%. Chemotaxonomically, strain MJ32(T) showed features typical of the genus Gordonia. The predominant respiratory quinone was MK-9(H(2)), the mycolic acids present had C(56)-C(60) carbon atoms, and the major fatty acids were C(16:0) (34.6%), tuberculostearic acid (21.8%), C(16:1)ω7c (19.5%) and C(18:1)ω9c (12.7%). The peptidoglycan type was based on meso-2,6-diaminopimelic acid as the diagnostic diamino acid with glycolated sugars. On the basis of phylogenetic inference, fatty acid profile and other phenotypic properties, strain MJ32(T) is considered to represent a novel species of the genus Gordonia, for which the name Gordonia caeni sp. nov. is proposed. The type strain is MJ32(T) (=KCTC 19771(T)=JCM 16923(T)).

  18. Tobacco rattle virus (TRV) based silencing of cotton enoyl-CoA reductase (ECR) gene and the role of very long chain fatty acids in normal leaf development and resistance to wilt disease

    USDA-ARS's Scientific Manuscript database

    A Tobacco rattle virus (TRV) based virus-induced gene silencing (VIGS) assay was employed as a reverse genetic approach to study gene function in cotton (Gossypium hirsutum). This approach was used to investigate the function of Enoyl-CoA reductase (GhECR) in pathogen defense. Amino acid sequence al...

  19. Isolation of Lactobacillus sakei strain KJ-2008 and its removal of characteristic malodorous gases under anaerobic culture conditions.

    PubMed

    Kim, Jeong-Dong; Kang, Kook-Hee

    2004-12-01

    Samples from a number of different sources, such as composts, leachates, and pig feces, were collected from different pig farms in Korea. Several microorganisms were screened for their ability to deodorize malodorous gases. As a result, a novel malodorous gas-deodorizing bacterial strain, KJ-2008, was isolated on the basis of its abundant growth in nitrate-supplemented minimal media under anaerobic conditions. Crimp-sealed serum bottles containing nitrate-supplemented minimal medium (MM-NO(3)(-)) in airtight conditions were inoculated with KJ-2008. Nitrate concentration decreased rapidly after 20 h incubation, and nitrite production remained almost zero for the duration of the experiment. Taxonomic identification including 16S rDNA base sequencing and phylogenetic analysis indicated that the isolate KJ-2008 had a 99.8% homology in its 16S rDNA base sequence with Lactobacillus sakei. Among the volatile fatty acids, acetic acid, which is present in large amounts in fresh piggery slurry, decreased by about 40% after 50 h incubation of the strain KJ-2008. n-Butyric acid, n-valeric acid, and iso-valeric acid gradually decreased, and iso-butyric acid and capronic acid were eliminated rapidly at the start of the treatment. Moreover, NH(3) removal efficiency reached a maximum of 98.5% after 50 h of incubation. The concentration of H(2)S did not change.

  20. Cell culture compositions

    DOEpatents

    Dunn-Coleman, Nigel; Goedegebuur, Frits; Ward, Michael; Yiao, Jian

    2014-03-18

    The present invention provides a novel endoglucanase nucleic acid sequence, designated egl6 (SEQ ID NO:1 encodes the full length endoglucanase; SEQ ID NO:4 encodes the mature form), and the corresponding endoglucanase VI amino acid sequence ("EGVI"; SEQ ID NO:3 is the signal sequence; SEQ ID NO:2 is the mature sequence). The invention also provides expression vectors and host cells comprising a nucleic acid sequence encoding EGVI, recombinant EGVI proteins and methods for producing the same.

  1. Capturing the genetic makeup of the active microbiome in situ.

    PubMed

    Singer, Esther; Wagner, Michael; Woyke, Tanja

    2017-09-01

    More than any other technology, nucleic acid sequencing has enabled microbial ecology studies to be complemented with the data volumes necessary to capture the extent of microbial diversity and dynamics in a wide range of environments. In order to truly understand and predict environmental processes, however, the distinction between active, inactive and dead microbial cells is critical. Also, experimental designs need to be sensitive toward varying population complexity and activity, and temporal as well as spatial scales of process rates. There are a number of approaches, including single-cell techniques, which were designed to study in situ microbial activity and that have been successively coupled to nucleic acid sequencing. The exciting new discoveries regarding in situ microbial activity provide evidence that future microbial ecology studies will indispensably rely on techniques that specifically capture members of the microbiome active in the environment. Herein, we review those currently used activity-based approaches that can be directly linked to shotgun nucleic acid sequencing, evaluate their relevance to ecology studies, and discuss future directions.

  2. Cloning and characterization of the novel D-aspartyl endopeptidase, paenidase, from Paenibacillus sp. B38.

    PubMed

    Nirasawa, Satoru; Nakahara, Kazuhiko; Takahashi, Saori

    2018-02-27

    Paenidase is the first microorganism-derived D-aspartyl endopeptidase that specifically recognizes an internal D-Asp residue to cleave [D-Asp]-X peptide bonds. Using peptide sequences obtained from the protein, we performed PCR with degenerate primers to amplify the paenidase I-encoding gene. Nucleotide sequencing revealed that mature paenidase I consists of 322 amino acid residues and that the protein is encoded as a pro-protein with a 197-amino-acid N-terminal extension compared to the mature protein. Paenidase I exhibits amino acid sequence similarity to several penicillin-binding proteins. In addition, paenidase I was classified into peptidase family S12 based on a MEROPS database search. Family S12 contains serine-type D-Ala-D-Ala carboxypeptidases that have three active site residues (Ser, Lys, and Tyr) in the conserved motifs Ser-Xaa-Thr-Lys and Tyr-Xaa-Asn. These motifs were conserved in the primary structure of paenidase I, and the role of these residues was confirmed by site-directed mutagenesis.

  3. Practical multipeptide synthesis: dedicated software for the definition of multiple, overlapping peptides covering polypeptide sequences.

    PubMed

    Heegaard, P M; Holm, A; Hagerup, M

    1993-01-01

    A personal computer program for the conversion of linear amino acid sequences to multiple, small, overlapping peptide sequences has been developed. Peptide lengths and "jumps" (the distance between two consecutive overlapping peptides) are defined by the user. To facilitate the use of the program for parallel solid-phase chemical peptide syntheses for the synchronous production of multiple peptides, amino acids at each acylation step are laid out by the program in a convenient standard multi-well setup. Also, the total number of equivalents, as well as the derived amount in milligrams (depending on user-defined equivalent weights and molar surplus), of each amino acid are given. The program facilitates the implementation of multipeptide synthesis, e.g., for the elucidation of polypeptide structure-function relationships, and greatly reduces the risk of introducing mistakes at the planning step. It is written in Pascal and runs on any DOS-based personal computer. No special graphic display is needed.
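    The splitting step described above can be sketched in a few lines. This is a minimal illustration in Python rather than the original Pascal program; the function names are hypothetical, and the layout function assumes only that solid-phase synthesis couples residues from the C-terminus toward the N-terminus.

```python
from collections import Counter


def overlapping_peptides(sequence, length, jump):
    """Peptides of a fixed length whose start positions are `jump` residues apart.

    Trailing residues that do not fill a complete peptide are not covered.
    """
    if length <= 0 or jump <= 0:
        raise ValueError("length and jump must be positive")
    last_start = len(sequence) - length
    return [sequence[i:i + length] for i in range(0, last_start + 1, jump)]


def acylation_layout(peptides):
    """Residue coupled in each well at each step, starting at the C-terminus."""
    steps = max(len(p) for p in peptides)
    return [
        [p[-1 - s] if s < len(p) else None for p in peptides]
        for s in range(steps)
    ]


def residue_totals(peptides):
    """How many couplings of each amino acid the whole plate requires."""
    return Counter("".join(peptides))
```

    For example, `overlapping_peptides("ACDEFGHK", 4, 2)` yields three peptides, and the first row of `acylation_layout` lists the C-terminal residue to load in each well. Multiplying `residue_totals` by a user-chosen equivalent weight and molar surplus would give reagent amounts.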

  4. Capturing the genetic makeup of the active microbiome in situ

    PubMed Central

    Singer, Esther; Wagner, Michael; Woyke, Tanja

    2017-01-01

    More than any other technology, nucleic acid sequencing has enabled microbial ecology studies to be complemented with the data volumes necessary to capture the extent of microbial diversity and dynamics in a wide range of environments. In order to truly understand and predict environmental processes, however, the distinction between active, inactive and dead microbial cells is critical. Also, experimental designs need to be sensitive toward varying population complexity and activity, and temporal as well as spatial scales of process rates. There are a number of approaches, including single-cell techniques, which were designed to study in situ microbial activity and that have been successively coupled to nucleic acid sequencing. The exciting new discoveries regarding in situ microbial activity provide evidence that future microbial ecology studies will indispensably rely on techniques that specifically capture members of the microbiome active in the environment. Herein, we review those currently used activity-based approaches that can be directly linked to shotgun nucleic acid sequencing, evaluate their relevance to ecology studies, and discuss future directions. PMID:28574490

  5. Lactobacillus allii sp. nov. isolated from scallion kimchi.

    PubMed

    Jung, Min Young; Lee, Se Hee; Lee, Moeun; Song, Jung Hee; Chang, Ji Yoon

    2017-12-01

    A novel strain of lactic acid bacteria, WiKim39T, was isolated from a scallion kimchi sample consisting of fermented chili peppers and vegetables. The isolate was a Gram-positive, rod-shaped, non-motile, catalase-negative and facultatively anaerobic lactic acid bacterium. Phylogenetic analysis of the 16S rRNA gene sequence showed that strain WiKim39T belonged to the genus Lactobacillus, and shared 97.1-98.2 % pair-wise sequence similarities with related type strains, Lactobacillus nodensis, Lactobacillus insicii, Lactobacillus versmoldensis, Lactobacillus tucceti and Lactobacillus furfuricola. The G+C content of the strain based on its genome sequence was 35.3 mol%. The ANI values between WiKim39T and the closest relatives were lower than 80 %. Based on the phenotypic, biochemical, and phylogenetic analyses, strain WiKim39T represents a novel species of the genus Lactobacillus, for which the name Lactobacillus allii sp. nov. is proposed. The type strain is WiKim39T (=KCTC 21077T=JCM 31938T).

  6. Detection and quantification of Plasmodium falciparum in blood samples using quantitative nucleic acid sequence-based amplification.

    PubMed

    Schoone, G J; Oskam, L; Kroon, N C; Schallig, H D; Omar, S A

    2000-11-01

    A quantitative nucleic acid sequence-based amplification (QT-NASBA) assay for the detection of Plasmodium parasites has been developed. Primers and probes were selected on the basis of the sequence of the small-subunit rRNA gene. Quantification was achieved by coamplification of the RNA in the sample with one modified in vitro RNA as a competitor in a single-tube NASBA reaction. Parasite densities ranging from 10 to 10(8) Plasmodium falciparum parasites per ml could be demonstrated and quantified in whole blood. This is approximately 1,000 times more sensitive than conventional microscopy analysis of thick blood smears. Comparison of the parasite densities obtained by microscopy and QT-NASBA with 120 blood samples from Kenyan patients with clinical malaria revealed that for 112 of 120 (93%) of the samples results were within a 1-log difference. QT-NASBA may be especially useful for the detection of low parasite levels in patients with early-stage malaria and for the monitoring of the efficacy of drug treatment.
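    To make the "within a 1-log difference" criterion concrete, the sketch below checks whether two density estimates differ by at most a factor of ten and reproduces the reported proportion. The function names are hypothetical and the example densities in the usage note are invented for illustration only.

```python
import math


def within_one_log(density_a, density_b):
    """True when two parasite densities (per ml) differ by at most one log10 unit."""
    return abs(math.log10(density_a) - math.log10(density_b)) <= 1.0


def percent_agreement(pairs):
    """Percentage of (microscopy, QT-NASBA) pairs that agree within 1 log."""
    agree = sum(1 for a, b in pairs if within_one_log(a, b))
    return 100.0 * agree / len(pairs)


# The proportion reported in the abstract: 112 of 120 samples.
reported = 100.0 * 112 / 120  # about 93.3, which the abstract rounds to 93%
```

    For instance, a microscopy count of 100 parasites per ml and a QT-NASBA estimate of 500 per ml would count as agreeing, whereas 10 versus 5,000 would not.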

  7. Lactobacillus allii sp. nov. isolated from scallion kimchi

    PubMed Central

    Jung, Min Young; Lee, Se Hee; Lee, Moeun; Song, Jung Hee; Chang, Ji Yoon

    2017-01-01

    A novel strain of lactic acid bacteria, WiKim39T, was isolated from a scallion kimchi sample consisting of fermented chili peppers and vegetables. The isolate was a Gram-positive, rod-shaped, non-motile, catalase-negative and facultatively anaerobic lactic acid bacterium. Phylogenetic analysis of the 16S rRNA gene sequence showed that strain WiKim39T belonged to the genus Lactobacillus, and shared 97.1–98.2 % pair-wise sequence similarities with related type strains, Lactobacillus nodensis, Lactobacillus insicii, Lactobacillus versmoldensis, Lactobacillus tucceti and Lactobacillus furfuricola. The G+C content of the strain based on its genome sequence was 35.3 mol%. The ANI values between WiKim39T and the closest relatives were lower than 80 %. Based on the phenotypic, biochemical, and phylogenetic analyses, strain WiKim39T represents a novel species of the genus Lactobacillus, for which the name Lactobacillus allii sp. nov. is proposed. The type strain is WiKim39T (=KCTC 21077T=JCM 31938T). PMID:29043955

  8. Genetic characterization of the non-structural protein-3 gene of bluetongue virus serotype-2 isolate from India.

    PubMed

    Pudupakam, Raghavendra Sumanth; Raghunath, Shobana; Pudupakam, Meghanath; Daggupati, Sreenivasulu

    2017-03-01

    Sequence analysis and phylogenetic studies based on non-structural protein-3 (NS3) gene are important in understanding the evolution and epidemiology of bluetongue virus (BTV). This study was aimed at characterizing the NS3 gene sequence of Indian BTV serotype-2 (BTV2) to elucidate its genetic relationship to global BTV isolates. The NS3 gene of BTV2 was amplified from infected BHK-21 cell cultures, cloned and subjected to sequence analysis. The generated NS3 gene sequence was compared with the corresponding sequences of different BTV serotypes across the world, and a phylogenetic relationship was established. The NS3 gene of BTV2 showed moderate levels of variability in comparison to different BTV serotypes, with nucleotide sequence identities ranging from 81% to 98%. The region showed high sequence homology of 93-99% at amino acid level with various BTV serotypes. The PPXY/PTAP late domain motifs, glycosylation sites, hydrophobic domains, and the amino acid residues critical for virus-host interactions were conserved in NS3 protein. Phylogenetic analysis revealed that BTV isolates segregate into four topotypes and that the Indian BTV2 in subclade IA is closely related to Asian and Australian origin strains. Analysis of the NS3 gene indicated that Indian BTV2 isolate is closely related to strains from Asia and Australia, suggesting a common origin of infection. Although the pattern of evolution of BTV2 isolate is different from other global isolates, the deduced amino acid sequence of NS3 protein demonstrated high molecular stability.

  9. Genetic characterization of the non-structural protein-3 gene of bluetongue virus serotype-2 isolate from India

    PubMed Central

    Pudupakam, Raghavendra Sumanth; Raghunath, Shobana; Pudupakam, Meghanath; Daggupati, Sreenivasulu

    2017-01-01

    Aim: Sequence analysis and phylogenetic studies based on non-structural protein-3 (NS3) gene are important in understanding the evolution and epidemiology of bluetongue virus (BTV). This study was aimed at characterizing the NS3 gene sequence of Indian BTV serotype-2 (BTV2) to elucidate its genetic relationship to global BTV isolates. Materials and Methods: The NS3 gene of BTV2 was amplified from infected BHK-21 cell cultures, cloned and subjected to sequence analysis. The generated NS3 gene sequence was compared with the corresponding sequences of different BTV serotypes across the world, and a phylogenetic relationship was established. Results: The NS3 gene of BTV2 showed moderate levels of variability in comparison to different BTV serotypes, with nucleotide sequence identities ranging from 81% to 98%. The region showed high sequence homology of 93-99% at amino acid level with various BTV serotypes. The PPXY/PTAP late domain motifs, glycosylation sites, hydrophobic domains, and the amino acid residues critical for virus-host interactions were conserved in NS3 protein. Phylogenetic analysis revealed that BTV isolates segregate into four topotypes and that the Indian BTV2 in subclade IA is closely related to Asian and Australian origin strains. Conclusion: Analysis of the NS3 gene indicated that Indian BTV2 isolate is closely related to strains from Asia and Australia, suggesting a common origin of infection. Although the pattern of evolution of BTV2 isolate is different from other global isolates, the deduced amino acid sequence of NS3 protein demonstrated high molecular stability. PMID:28435199

  10. Diff-seq: A high throughput sequencing-based mismatch detection assay for DNA variant enrichment and discovery

    PubMed Central

    Karas, Vlad O; Sinnott-Armstrong, Nicholas A; Varghese, Vici; Shafer, Robert W; Greenleaf, William J; Sherlock, Gavin

    2018-01-01

    Much of the within-species genetic variation is in the form of single nucleotide polymorphisms (SNPs), typically detected by whole genome sequencing (WGS) or microarray-based technologies. However, WGS produces mostly uninformative reads that perfectly match the reference, while microarrays require genome-specific reagents. We have developed Diff-seq, a sequencing-based mismatch detection assay for SNP discovery without the requirement for specialized nucleic-acid reagents. Diff-seq leverages the Surveyor endonuclease to cleave mismatched DNA molecules that are generated after cross-annealing of a complex pool of DNA fragments. Sequencing libraries enriched for Surveyor-cleaved molecules result in increased coverage at the variant sites. Diff-seq detected all mismatches present in an initial test substrate, with specific enrichment dependent on the identity and context of the variation. Application to viral sequences resulted in increased observation of variant alleles in a biologically relevant context. Diff-seq has the potential to increase the sensitivity and efficiency of high-throughput sequencing in the detection of variation. PMID:29361139

  11. An unusual mode of DNA duplex association: Watson-Crick interaction of all-purine deoxyribonucleic acids.

    PubMed

    Battersby, Thomas R; Albalos, Maria; Friesenhahn, Michel J

    2007-05-01

    Nucleic acid duplexes associating through purine-purine base pairing have been constructed and characterized in a remarkable demonstration of nucleic acids with mixed sequence and a natural backbone in an alternative duplex structure. The antiparallel deoxyribose all-purine duplexes associate specifically through Watson-Crick pairing, violating the nucleobase size-complementarity pairing convention found in Nature. Sequence-specific recognition displayed by these structures makes the duplexes suitable, in principle, for information storage and replication fundamental to molecular evolution in all living organisms. All-purine duplexes can be formed through association of purines found in natural ribonucleosides. Key to the formation of these duplexes is the N(3)-H tautomer of isoguanine, preferred in the duplex, but not in aqueous solution. The duplexes have relevance to evolution of the modern genetic code and can be used for molecular recognition of natural nucleic acids.

  12. Biosynthesis of Lipoic Acid in Arabidopsis: Cloning and Characterization of the cDNA for Lipoic Acid Synthase

    PubMed Central

    Yasuno, Rie; Wada, Hajime

    1998-01-01

    Lipoic acid is a coenzyme that is essential for the activity of enzyme complexes such as those of pyruvate dehydrogenase and glycine decarboxylase. We report here the isolation and characterization of LIP1 cDNA for lipoic acid synthase of Arabidopsis. The Arabidopsis LIP1 cDNA was isolated using an expressed sequence tag homologous to the lipoic acid synthase of Escherichia coli. This cDNA was shown to code for Arabidopsis lipoic acid synthase by its ability to complement a lipA mutant of E. coli defective in lipoic acid synthase. DNA-sequence analysis of the LIP1 cDNA revealed an open reading frame predicting a protein of 374 amino acids. Comparisons of the deduced amino acid sequence with those of E. coli and yeast lipoic acid synthase homologs showed a high degree of sequence similarity and the presence of a leader sequence presumably required for import into the mitochondria. Southern-hybridization analysis suggested that LIP1 is a single-copy gene in Arabidopsis. Western analysis with an antibody against lipoic acid synthase demonstrated that this enzyme is located in the mitochondrial compartment in Arabidopsis cells as a 43-kD polypeptide. PMID:9808738

  13. Virulence and molecular polymorphism of Prunus necrotic ringspot virus isolates.

    PubMed

    Hammond, R W; Crosslin, J M

    1998-07-01

    Prunus necrotic ringspot virus (PNRSV) occurs as numerous strains or isolates that vary widely in their pathogenic, biophysical and serological properties. Prior attempts to distinguish pathotypes based upon physical properties have not been successful; our approach was to examine the molecular properties that may distinguish these isolates. The nucleic acid sequence was determined from 1.65 kbp RT-PCR products derived from RNA 3 of seven distinct isolates of PNRSV that differ serologically and in pathology on sweet cherry. Sequence comparisons of ORF 3a (putative movement protein) and ORF 3b (coat protein) revealed single nucleotide and amino acid differences with strong correlations to serology and symptom types (pathotypes). Sequence differences between serotypes and pathotypes were also reflected in the overall phylogenetic relationships between the isolates.

  14. Using amphiphilic pseudo amino acid composition to predict enzyme subfamily classes.

    PubMed

    Chou, Kuo-Chen

    2005-01-01

    With protein sequences entering into databanks at an explosive pace, the early determination of the family or subfamily class for a newly found enzyme molecule becomes important because this is directly related to the detailed information about which specific target it acts on, as well as to its catalytic process and biological function. Unfortunately, it is both time-consuming and costly to do so by experiments alone. In a previous study, the covariant-discriminant algorithm was introduced to identify the 16 subfamily classes of oxidoreductases. Although the results were quite encouraging, the entire prediction process was based on the amino acid composition alone without including any sequence-order information. Therefore, it is worthy of further investigation. To incorporate the sequence-order effects into the predictor, the 'amphiphilic pseudo amino acid composition' is introduced to represent the statistical sample of a protein. The novel representation contains 20 + 2λ discrete numbers: the first 20 numbers are the components of the conventional amino acid composition; the next 2λ numbers are a set of correlation factors that reflect different hydrophobicity and hydrophilicity distribution patterns along a protein chain. Based on such a concept and formulation scheme, a new predictor is developed. It is shown by the self-consistency test, jackknife test and independent dataset tests that the success rates obtained by the new predictor are all significantly higher than those by the previous predictors. The significant enhancement in success rates also implies that the distribution of hydrophobicity and hydrophilicity of the amino acid residues along a protein chain plays a very important role in its structure and function.
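    A simplified sketch of such a 20 + 2λ representation is given below. It follows the structure described in the abstract (20 composition terms followed by 2λ hydrophobicity and hydrophilicity correlation factors), but the details are assumptions: the Kyte-Doolittle and Hopp-Woods scales are swapped in for illustration, the scales are standardized to zero mean and unit variance, and the weight of 0.5 and the function names are hypothetical choices rather than values taken from the paper.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Assumed scales, listed in AMINO_ACIDS order (illustrative, not the paper's tables).
HYDROPHOBICITY = dict(zip(AMINO_ACIDS, [  # Kyte-Doolittle
    1.8, 2.5, -3.5, -3.5, 2.8, -0.4, -3.2, 4.5, -3.9, 3.8,
    1.9, -3.5, -1.6, -3.5, -4.5, -0.8, -0.7, 4.2, -0.9, -1.3]))
HYDROPHILICITY = dict(zip(AMINO_ACIDS, [  # Hopp-Woods
    -0.5, -1.0, 3.0, 3.0, -2.5, 0.0, -0.5, -1.8, 3.0, -1.8,
    -1.3, 0.2, 0.0, 0.2, 3.0, 0.3, -0.4, -1.5, -3.4, -2.3]))


def standardize(scale):
    """Rescale a property table to zero mean and unit standard deviation."""
    vals = list(scale.values())
    mean = sum(vals) / len(vals)
    sd = (sum((v - mean) ** 2 for v in vals) / len(vals)) ** 0.5
    return {aa: (v - mean) / sd for aa, v in scale.items()}


def amphiphilic_pseaac(seq, lam, weight=0.5):
    """Return 20 composition terms followed by 2*lam correlation terms."""
    n = len(seq)
    if not 0 <= lam < n:
        raise ValueError("lam must be non-negative and smaller than len(seq)")
    h1, h2 = standardize(HYDROPHOBICITY), standardize(HYDROPHILICITY)
    freqs = [seq.count(aa) / n for aa in AMINO_ACIDS]
    taus = []
    for k in range(1, lam + 1):
        # For each gap k: one hydrophobicity and one hydrophilicity factor.
        for h in (h1, h2):
            pairs = (h[seq[i]] * h[seq[i + k]] for i in range(n - k))
            taus.append(sum(pairs) / (n - k))
    denom = sum(freqs) + weight * sum(taus)
    return [f / denom for f in freqs] + [weight * t / denom for t in taus]
```

    With λ = 3 the vector has 26 entries; the first 20 keep the relative residue frequencies, while the remaining six encode how the two properties co-vary between residues 1, 2, and 3 positions apart.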

  15. A symmetry model for genetic coding via a wallpaper group composed of the traditional four bases and an imaginary base E: towards category theory-like systematization of molecular/genetic biology.

    PubMed

    Sawamura, Jitsuki; Morishita, Shigeru; Ishigooka, Jun

    2014-05-07

    Previously, we suggested prototypal models that describe some clinical states based on group postulates. Here, we demonstrate a group/category theory-like model for molecular/genetic biology as an alternative application of our previous model. Specifically, we focus on deoxyribonucleic acid (DNA) base sequences. We construct a wallpaper pattern based on a five-letter cruciform motif with letters C, A, T, G, and E. Whereas the first four letters represent the standard DNA bases, the fifth is introduced for ease in formulating group operations that reproduce insertions and deletions of DNA base sequences. A basic group Z5 = {r, u, d, l, n} of operations is defined for the wallpaper pattern, with which a sequence of points can be generated corresponding to changes of a base in a DNA sequence by following the orbit of a point of the pattern under operations in group Z5. Other manipulations of DNA sequence can be treated using a vector-like notation 'Dj' corresponding to a DNA sequence but based on the five-letter base set; also, 'Dj's are expressed graphically. Insertions and deletions of a series of letters 'E' are admitted to assist in describing DNA recombination. Likewise, a vector-like notation Rj can be constructed for sequences of ribonucleic acid (RNA). The wallpaper group B = {Z5×∞, ●} (an ∞-fold Cartesian product of Z5) acts on Dj (or Rj) yielding changes to Dj (or Rj) denoted by 'Dj◦B(j→k) = Dk' (or 'Rj◦B(j→k) = Rk'). Based on the operations of this group, two types of groups (a modulo 5 linear group and a rotational group over the Gaussian plane, acting on the five bases) are linked as parts of the wallpaper group for broader applications. As a result, changes, insertions/deletions and DNA (RNA) recombination (partial/total conversion) are described. As an exploratory study, a notation for the canonical "central dogma" in a category theory-like way is presented for future developments. Despite the large incompleteness of our methodology, there is fertile ground to consider a symmetry model for genetic coding based on our specific wallpaper group. A more integrated formulation containing "central dogma" for future molecular/genetic biology remains to be explored.

  16. Development and evaluation of novel sensing materials for detecting food contamination

    NASA Astrophysics Data System (ADS)

    Sankaran, Sindhuja

    Rapid detection of food-borne volatile organic compounds (VOCs) such as organic acids and alcohols released by bacterial pathogens is being used as an indicator for detecting bacterial contamination in food by our research group. One of our current research thrusts is to develop novel sensors that will be sensitive to specific compounds (at low operating temperature) associated with food safety. This study evaluates two approaches employed to develop sensors for detecting acid and alcohols at low concentrations. Chemoresistive and piezoelectric sensors were developed based on metal oxides and olfactory system based biomaterials, respectively to detect acetic acid, butanol, 3-methyl-1-butanol, 1-pentanol, and 1-hexanol. The metal oxide based sensors were developed by the sol-gel method. A zinc oxide (ZnO) sensor was found to be sensitive to acetic acid with lower detection limit ranging from 13-40 ppm. The three-layered dip-coated gold electrode based ZnO sensors had a LDL of 18 ppm for acetic acid detection. The ZnO-iron oxide (Fe2O3) based nanocomposite sensors were developed to detect butanol operating at 100°C. The 5% Fe/Zn mole ratio based ZnO-Fe2O3 nanocomposite sensors had high correlation coefficients (>0.90) of calibration curves, low butanol LDLs (26 +/- 7 ppm), and lower variation among the sensor responses. The ZnO and ZnO-Fe2O3 nanocomposite sensors showed potential to detect acetic acid and butanol at low concentrations, respectively at 100°C. QCM based olfactory sensors were developed from olfactory receptor and odorant binding protein based sequences to detect low concentrations of acetic acid and alcohols (3-methyl-1-butanol and 1-hexanol), respectively. The average LDLs for acetic acid as well as alcohols detection of the QCM sensors were < 5 ppm. The linear calibration curve based correlation coefficients of the QCM sensors were > 0.80. 
Finally, peptide sequences based on computational simulation were designed from olfactory receptors and evaluated as sensor material for the detection of alcohols at low concentrations. The results indicated that the QCM sensors exhibited good sensitivity to 1-hexanol and 1-pentanol, with estimated LDLs in the range of 2-3 ppm and 3-5 ppm, respectively. This research work was successful in developing multiple novel sensing materials to detect alcohols and acid associated with meat contamination at low concentrations.

  17. Subcellular location prediction of proteins using support vector machines with alignment of block sequences utilizing amino acid composition.

    PubMed

    Tamura, Takeyuki; Akutsu, Tatsuya

    2007-11-30

    Subcellular location prediction of proteins is an important and well-studied problem in bioinformatics. This is the problem of predicting which part of a cell a given protein is transported to, where the amino acid sequence of the protein is given as input. This problem is becoming more important since information on subcellular location is helpful for annotation of proteins and genes and the number of complete genomes is rapidly increasing. Since existing predictors are based on various heuristics, it is important to develop a simple method with high prediction accuracies. In this paper, we propose a novel and general prediction method by combining techniques for sequence alignment and feature vectors based on amino acid composition. We implemented this method with support vector machines on plant data sets extracted from the TargetP database. Through fivefold cross validation tests, the obtained overall accuracy and average MCC were 0.9096 and 0.8655, respectively. We also applied our method to other datasets including that of WoLF PSORT. Although there is a predictor which uses the information of gene ontology and yields higher accuracy than ours, our accuracies are higher than those of existing predictors which use only sequence information. Since such information as gene ontology can be obtained only for known proteins, our predictor is considered to be useful for subcellular location prediction of newly discovered proteins. Furthermore, the idea of combining alignment and amino acid frequency is novel and general, so it may be applied to other problems in bioinformatics. Our method for plant data is also implemented as a web system and is available at http://sunflower.kuicr.kyoto-u.ac.jp/~tamura/slpfa.html.
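
    The amino acid composition features mentioned in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names and the equal-width block scheme are assumptions made here for illustration, and the alignment step of the paper is not reproduced.

```python
from collections import Counter

# The 20 standard amino acids in a fixed order, so every vector has the same layout.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def aa_composition(seq):
    """Return the relative frequency of each standard amino acid in seq."""
    counts = Counter(ch for ch in seq.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        # Empty or fully non-standard input: return an all-zero vector.
        return [0.0] * len(AMINO_ACIDS)
    return [counts[aa] / total for aa in AMINO_ACIDS]


def block_compositions(seq, n_blocks):
    """Split seq into n_blocks near-equal blocks and concatenate their compositions.

    The equal-width block scheme is a hypothetical simplification, not the paper's.
    """
    if n_blocks < 1:
        raise ValueError("n_blocks must be at least 1")
    features = []
    for i in range(n_blocks):
        start = i * len(seq) // n_blocks
        end = (i + 1) * len(seq) // n_blocks
        features.extend(aa_composition(seq[start:end]))
    return features
```

    A vector of this kind (20 values per block) is the sort of fixed-length input a support vector machine classifier expects; which kernel and parameters suit the task is left open here.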

  18. Trichoderma β-glucosidase

    DOEpatents

    Dunn-Coleman, Nigel; Goedegebuur, Frits; Ward, Michael; Yao, Jian

    2006-01-03

    The present invention provides a novel β-glucosidase nucleic acid sequence, designated bgl3, and the corresponding BGL3 amino acid sequence. The invention also provides expression vectors and host cells comprising a nucleic acid sequence encoding BGL3, recombinant BGL3 proteins and methods for producing the same.

  19. Carbohydrate degrading polypeptide and uses thereof

    DOEpatents

    Sagt, Cornelis Maria Jacobus; Schooneveld-Bergmans, Margot Elisabeth Francoise; Roubos, Johannes Andries; Los, Alrik Pieter

    2015-10-20

    The invention relates to a polypeptide having carbohydrate material degrading activity which comprises the amino acid sequence set out in SEQ ID NO: 2 or an amino acid sequence encoded by the nucleotide sequence of SEQ ID NO: 1 or SEQ ID NO: 4, or a variant polypeptide or variant polynucleotide thereof, wherein the variant polypeptide has at least 96% sequence identity with the sequence set out in SEQ ID NO: 2 or the variant polynucleotide encodes a polypeptide that has at least 96% sequence identity with the sequence set out in SEQ ID NO: 2. The invention features the full length coding sequence of the novel gene as well as the amino acid sequence of the full-length functional protein and functional equivalents of the gene or the amino acid sequence. The invention also relates to methods for using the polypeptide in industrial processes. Also included in the invention are cells transformed with a polynucleotide according to the invention suitable for producing these proteins.

  20. 2-Methoxypyridine as a Thymidine Mimic in Watson-Crick Base Pairs of DNA and PNA: Synthesis, Thermal Stability, and NMR Structural Studies.

    PubMed

    Novosjolova, Irina; Kennedy, Scott D; Rozners, Eriks

    2017-11-02

    The development of nucleic acid base-pair analogues that use new modes of molecular recognition is important both for fundamental research and practical applications. The goal of this study was to evaluate 2-methoxypyridine as a cationic thymidine mimic in the A-T base pair. The hypothesis was that including protonation in the Watson-Crick base pairing scheme would enhance the thermal stability of the DNA double helix without compromising the sequence selectivity. DNA and peptide nucleic acid (PNA) sequences containing the new 2-methoxypyridine nucleobase (P) were synthesized and studied by using UV thermal melting and NMR spectroscopy. Introduction of P nucleobase caused a loss of thermal stability of ≈10 °C in DNA-DNA duplexes and ≈20 °C in PNA-DNA duplexes over a range of mildly acidic to neutral pH. Despite the decrease in thermal stability, the NMR structural studies showed that P-A formed the expected protonated base pair at pH 4.3. Our study demonstrates the feasibility of cationic unnatural base pairs; however, future optimization of such analogues will be required. © 2017 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim.

  1. A simple nucleic acid hybridization/latex agglutination assay for the rapid detection of polymerase chain reaction amplicons.

    PubMed

    Vollenhofer-Schrumpf, Sabine; Buresch, Ronald; Schinkinger, Manfred

    2007-03-01

    We have developed a new method for the detection of nucleic acid hybridization, based on a simple latex agglutination test that can be evaluated by the unaided eye. Nucleic acid, e.g., a polymerase chain reaction (PCR) product, is denatured and incubated with polystyrene beads carrying covalently bound complementary oligonucleotide sequences. Hybridization of the nucleic acids leads to aggregation of the latex particles, thereby verifying the presence of target sequence. The test is performed at room temperature, and results are available within 10 min. As a proof of principle, the hybridization/latex agglutination assay was applied to the detection of purified PCR fragments either specific for Salmonella spp. or a synthetic sequence, and to the detection of Salmonella enterica in artificially contaminated chicken samples. A few nanograms of purified PCR fragments were detectable. In artificially contaminated chicken samples, 3 colony-forming units (cfu)/25 g were detected in one of three replicates, and 30 cfu/25 g were detected in both of two replicates when samples for PCR were taken directly from primary enrichment, demonstrating the practical applicability of this test system. Even multiplex detection might be achievable. This novel kind of assay could be useful for a range of applications where hybridization of nucleic acids, e.g., PCR fragments, is to be detected.

  2. Sequence diversity within the reovirus S2 gene: reovirus genes reassort in nature, and their termini are predicted to form a panhandle motif.

    PubMed Central

    Chapell, J D; Goral, M I; Rodgers, S E; dePamphilis, C W; Dermody, T S

    1994-01-01

    To better understand genetic diversity within mammalian reoviruses, we determined S2 nucleotide and deduced sigma 2 amino acid sequences of nine reovirus strains and compared these sequences with those of prototype strains of the three reovirus serotypes. The S2 gene and sigma 2 protein are highly conserved among the four type 1, one type 2, and seven type 3 strains studied. Phylogenetic analyses based on S2 nucleotide sequences of the 12 reovirus strains indicate that diversity within the S2 gene is independent of viral serotype. Additionally, we found marked topological differences between phylogenetic trees generated from S1 and S2 gene nucleotide sequences of the seven type 3 strains. These results demonstrate that reovirus S1 and S2 genes have distinct evolutionary histories, thus providing phylogenetic evidence for lateral transfer of reovirus genes in nature. When variability among the 12 sigma 2-encoding S2 nucleotide sequences was analyzed at synonymous positions, we found that approximately 60 nucleotides at the 5' terminus and 30 nucleotides at the 3' terminus were markedly conserved in comparison with other sigma 2-encoding regions of S2. Predictions of RNA secondary structures indicate that the more conserved S2 sequences participate in the formation of an extended region of duplex RNA interrupted by a pair of stem-loops. Among the 12 deduced sigma 2 amino acid sequences examined, substitutions were observed at only 11% of amino acid positions. This finding suggests that constraints on the structure or function of sigma 2, perhaps in part because of its location in the virion core, have limited sequence diversity within this protein. PMID:8289378

  3. Automated Sanger Analysis Pipeline (ASAP): A Tool for Rapidly Analyzing Sanger Sequencing Data with Minimum User Interference.

    PubMed

    Singh, Aditya; Bhatia, Prateek

    2016-12-01

    Sanger sequencing platforms, such as Applied Biosystems instruments, generate chromatogram files. Generally, for one region of a sequence, both forward and reverse primers are used to sequence that area, which yields two sequences that need to be aligned and a consensus generated before mutation detection studies. This work is cumbersome and takes time, especially if the gene is large with many exons. Hence, we devised a rapid automated command system to filter, build, and align consensus sequences and also optionally extract exonic regions, translate them in all frames, and perform an amino acid alignment starting from raw sequence data within a very short time. At its full capabilities, the Automated Sanger Analysis Pipeline (ASAP) is able to read "*.ab1" chromatogram files through a command line interface, convert them to the FASTQ format, trim the low-quality regions, reverse-complement the reverse sequence, create a consensus sequence, extract the exonic regions using a reference exonic sequence, translate the sequence in all frames, and align the nucleic acid and amino acid sequences to reference nucleic acid and amino acid sequences, respectively. All files are created and can be used for further analysis. ASAP is available as a Python 3.x executable at https://github.com/aditya-88/ASAP. The version described in this paper is 0.28.
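
    Two of the steps listed in this abstract, reverse-complementing a read and translating a sequence in all frames, can be sketched in a few lines of Python. This is an illustrative sketch only, not code from ASAP: the function names are hypothetical, the standard genetic code is assumed, and any codon containing a character other than A, C, G, or T is reported as "X".

```python
from itertools import product

# Standard genetic code, built from the conventional T, C, A, G codon ordering.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(_BASES, repeat=3), _AMINO)
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of an A/C/G/T string."""
    return dna.upper().translate(_COMPLEMENT)[::-1]


def translate(dna):
    """Translate from position 0, dropping any trailing partial codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, usable, 3)
    )


def six_frame_translations(dna):
    """Return translations of forward frames 0-2, then reverse frames 0-2."""
    forward = dna.upper()
    reverse = reverse_complement(forward)
    return [translate(strand[offset:])
            for strand in (forward, reverse)
            for offset in range(3)]
```

    For example, translate("ATGGCCTAA") gives "MA*", where "*" marks the stop codon. Quality trimming, consensus building, and alignment are left out of this sketch.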

  4. Organization of the hao gene cluster of Nitrosomonas europaea: genes for two tetraheme c cytochromes.

    PubMed

    Bergmann, D J; Arciero, D M; Hooper, A B

    1994-06-01

    The organization of genes for three proteins involved in ammonia oxidation in Nitrosomonas europaea has been investigated. The amino acid sequence of the N-terminal region and four heme-containing peptides produced by proteolysis of the tetraheme cytochrome c554 of N. europaea were determined by Edman degradation. The gene (cycA) encoding this cytochrome is present in three copies per genome (H. McTavish, F. LaQuier, D. Arciero, M. Logan, G. Mundfrom, J.A. Fuchs, and A. B. Hooper, J. Bacteriol. 175:2445-2447, 1993). Three clones, representing at least two copies of cycA, were isolated and sequenced by the dideoxy-chain termination procedure. In both copies, the sequences of 211 amino acids derived from the gene sequence are identical and include all amino acids predicted by the proteolytic peptides. In two copies, the cycA open reading frame (ORF) is followed closely (three bases in one copy) by a second ORF predicted to encode a 28-kDa tetraheme c cytochrome not previously characterized but similar to the nirT gene product of Pseudomonas stutzeri. In one copy of the cycA gene cluster, the second ORF is absent.

  5. Proteins without unique 3D structures: biotechnological applications of intrinsically unstable/disordered proteins.

    PubMed

    Uversky, Vladimir N

    2015-03-01

    Intrinsically disordered proteins (IDPs) and intrinsically disordered protein regions (IDPRs) are functional proteins or regions that do not have unique 3D structures under functional conditions. Therefore, from the viewpoint of their lack of stable 3D structure, IDPs/IDPRs are inherently unstable. As much as structure and function of normal ordered globular proteins are determined by their amino acid sequences, the lack of unique 3D structure in IDPs/IDPRs and their disorder-based functionality are also encoded in the amino acid sequences. Because of their specific sequence features and distinctive conformational behavior, these intrinsically unstable proteins or regions have several applications in biotechnology. This review introduces some of the most characteristic features of IDPs/IDPRs (such as peculiarities of amino acid sequences of these proteins and regions, their major structural features, and peculiar responses to changes in their environment) and describes how these features can be used in the biotechnology, for example for the proteome-wide analysis of the abundance of extended IDPs, for recombinant protein isolation and purification, as polypeptide nanoparticles for drug delivery, as solubilization tools, and as thermally sensitive carriers of active peptides and proteins. Copyright © 2014 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  6. Replacement of all arginine residues with canavanine in MazF-bs mRNA interferase changes its specificity.

    PubMed

    Ishida, Yojiro; Park, Jung-Ho; Mao, Lili; Yamaguchi, Yoshihiro; Inouye, Masayori

    2013-03-15

    Replacement of a specific amino acid residue in a protein with nonnatural analogues is highly challenging because of their cellular toxicity. We demonstrate for the first time the replacement of all arginine (Arg) residues in a protein with canavanine (Can), a toxic Arg analogue. All Arg residues in the 5-base specific (UACAU) mRNA interferase from Bacillus subtilis (MazF-bs(arg)) were replaced with Can by using the single-protein production system in Escherichia coli. The resulting MazF-bs(can) gained a 6-base recognition sequence, UACAUA, for RNA cleavage instead of the 5-base sequence, UACAU, for MazF-bs(arg). Mass spectrometry analysis confirmed that all Arg residues were replaced with Can. The present system offers a novel approach to create new functional proteins by replacing a specific amino acid in a protein with its analogues.
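
    The difference between the 5-base and 6-base recognition sequences can be made concrete with a short Python sketch that lists where each motif occurs in an RNA string. The function name and the example RNA are hypothetical, and the sketch only locates motifs; it does not model where within the motif cleavage occurs.

```python
def find_sites(rna, motif):
    """Return the 0-based start positions of every (possibly overlapping) motif match."""
    rna = rna.upper()
    motif = motif.upper()
    if not motif:
        raise ValueError("motif must not be empty")
    positions = []
    start = rna.find(motif)
    while start != -1:
        positions.append(start)
        # Advance by one so that overlapping occurrences are also reported.
        start = rna.find(motif, start + 1)
    return positions


# Made-up RNA used only to illustrate the two recognition sequences.
example_rna = "GGUACAUAGGUACAUC"
sites_5base = find_sites(example_rna, "UACAU")
sites_6base = find_sites(example_rna, "UACAUA")
```

    Every UACAUA site is also a UACAU site, so the 6-base sites are always a subset of the 5-base sites; in this sketch that corresponds to a narrower set of recognized positions.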

  7. Advances in Understanding Stimulus Responsive Phase Behavior of Intrinsically Disordered Protein Polymers.

    PubMed

    Ruff, Kiersten M; Roberts, Stefan; Chilkoti, Ashutosh; Pappu, Rohit V

    2018-06-24

    Proteins and synthetic polymers can undergo phase transitions in response to changes to intensive solution parameters such as temperature, proton chemical potentials (pH), and hydrostatic pressure. For proteins and protein-based polymers, the information required for stimulus responsive phase transitions is encoded in their amino acid sequence. Here, we review some of the key physical principles that govern the phase transitions of archetypal intrinsically disordered protein polymers (IDPPs). These are disordered proteins with highly repetitive amino acid sequences. Advances in recombinant technologies have enabled the design and synthesis of protein sequences of a variety of sequence complexities and lengths. We summarize insights that have been gleaned from the design and characterization of IDPPs that undergo thermo-responsive phase transitions and build on these insights to present a general framework for IDPPs with pH and pressure responsive phase behavior. In doing so, we connect the stimulus responsive phase behavior of IDPPs with repetitive sequences to the coil-to-globule transitions that these sequences undergo at the single chain level in response to changes in stimuli. The proposed framework and ongoing studies of stimulus responsive phase behavior of designed IDPPs have direct implications in bioengineering, where designing sequences with bespoke material properties broadens the spectrum of applications, and in biology and medicine for understanding the sequence-specific driving forces for the formation of protein-based membraneless organelles as well as biological matrices that act as scaffolds for cells and mediators of cell-to-cell communication. Copyright © 2018. Published by Elsevier Ltd.

  8. Pseudomonas japonica sp. nov., a novel species that assimilates straight chain alkylphenols.

    PubMed

    Pungrasmi, Wiboonluk; Lee, Haeng-Seog; Yokota, Akira; Ohta, Akinori

    2008-02-01

    A bacterial strain, WL(T), which was isolated from an activated sludge, was able to degrade alkylphenols. 16S rDNA sequence analysis indicated that strain WL(T) belonged to the genus Pseudomonas (sensu stricto) and formed a monophyletic clade with the type strain of Pseudomonas graminis and other members in the Pseudomonas putida subcluster with sequence similarity values higher than 97%. Genomic relatedness based on DNA-DNA hybridization of strain WL(T) to these strains is 2-41%. Strain WL(T) contained ubiquinone-9 as the main respiratory quinone, and the G+C content of DNA was 66 mol%. The organism contained hexadecanoic acid (16:0), hexadecenoic acid (16:1) and octadecenoic acid (18:1) as major cellular fatty acids. The hydroxy fatty acids detected were 3-hydroxydecanoic acid (3-OH 10:0), 3-hydroxydodecanoic acid (3-OH 12:0) and 2-hydroxydodecanoic acid (2-OH 12:0). These results, as well as physiological and biochemical characteristics clearly indicate that the strain WL(T) represents a new Pseudomonas species, for which the name Pseudomonas japonica is proposed. The type strain is strain WL(T) (=IAM 15071T=TISTR 1526T).

  9. A robust and cost-effective approach to sequence and analyze complete genomes of small RNA viruses

    USDA-ARS's Scientific Manuscript database

    Background: Next-generation sequencing (NGS) allows ultra-deep sequencing of nucleic acids. The use of sequence-independent amplification of viral nucleic acids without utilization of target-specific primers provides advantages over traditional sequencing methods and allows detection of unsuspected ...

  10. β-glucosidase 5 (BGL5) compositions

    DOEpatents

    Dunn-Coleman, Nigel; Goedegebuur, Frits; Ward, Michael; Yao, Jian

    2010-06-01

    The present invention provides a novel β-glucosidase nucleic acid sequence, designated bgl5, and the corresponding BGL5 amino acid sequence. The invention also provides expression vectors and host cells comprising a nucleic acid sequence encoding BGL5, recombinant BGL5 proteins and methods for producing the same.

  11. Evidence of Divergent Amino Acid Usage in Comparative Analyses of R5- and X4-Associated HIV-1 Vpr Sequences

    PubMed Central

    Antell, Gregory C.; Zhong, Wen; Kercher, Katherine; Passic, Shendra; Williams, Jean; Liu, Yucheng; James, Tony; Jacobson, Jeffrey M.; Szep, Zsofia

    2017-01-01

    Vpr is an HIV-1 accessory protein that plays numerous roles during viral replication, some of which are cell type dependent. To test the hypothesis that HIV-1 tropism extends beyond the envelope into the vpr gene, studies were performed to identify the associations between coreceptor usage and Vpr variation in HIV-1-infected patients. Colinear HIV-1 Env-V3 and Vpr amino acid sequences were obtained from the LANL HIV-1 sequence database and from well-suppressed patients in the Drexel/Temple Medicine CNS AIDS Research and Eradication Study (CARES) Cohort. Genotypic classification of Env-V3 sequences as X4 (CXCR4-utilizing) or R5 (CCR5-utilizing) was used to group colinear Vpr sequences. To reveal the sequences associated with a specific coreceptor usage genotype, Vpr amino acid sequences were assessed for amino acid diversity and Jensen-Shannon divergence between the two groups. Five amino acid alphabets were used to comprehensively examine the impact of amino acid substitutions involving side chains with similar physicochemical properties. Positions 36, 37, 41, 89, and 96 of Vpr were characterized by statistically significant divergence across multiple alphabets when X4 and R5 sequence groups were compared. In addition, consensus amino acid switches were found at positions 37 and 41 in comparisons of the R5 and X4 sequence populations. These results suggest an evolutionary link between Vpr and gp120 in HIV-1-infected patients. PMID:28620613
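
    The per-position Jensen-Shannon divergence named in this abstract can be sketched in Python as follows. This is a generic illustration under stated assumptions, not the study's pipeline: the function names are hypothetical, base-2 logarithms are used so the result lies between 0 and 1, and no pseudocounts or significance testing are applied.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def column_distribution(residues, alphabet=AMINO_ACIDS):
    """Relative frequency of each alphabet symbol among the residues at one position."""
    counts = Counter(r for r in residues if r in alphabet)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no residues from the alphabet at this position")
    return [counts[a] / total for a in alphabet]


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two discrete distributions."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]

    def kl(a, b):
        # Terms with a zero numerator contribute nothing to the sum.
        return sum(ai * math.log2(ai / bi) for ai, bi in zip(a, b) if ai > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def position_divergence(group_a, group_b, position):
    """Divergence at one 0-based alignment position between two sequence groups."""
    col_a = [seq[position] for seq in group_a]
    col_b = [seq[position] for seq in group_b]
    return js_divergence(column_distribution(col_a), column_distribution(col_b))
```

    With two groups of colinear sequences, position_divergence(r5_seqs, x4_seqs, 36) would score the position numbered 37 in 1-based terms; the variable names r5_seqs and x4_seqs are placeholders. A reduced amino acid alphabet could be handled by mapping residues to class labels before calling column_distribution.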

  12. Methods of diagnosing alagille syndrome

    DOEpatents

    Li, Linheng; Hood, Leroy; Krantz, Ian D.; Spinner, Nancy B.

    2004-03-09

    The present invention provides an isolated polypeptide exhibiting substantially the same amino acid sequence as JAGGED, or an active fragment thereof, provided that the polypeptide does not have the amino acid sequence of SEQ ID NO:5 or SEQ ID NO:6. The invention further provides an isolated nucleic acid molecule containing a nucleotide sequence encoding substantially the same amino acid sequence as JAGGED, or an active fragment thereof, provided that the nucleotide sequence does not encode the amino acid sequence of SEQ ID NO:5 or SEQ ID NO:6. Also provided herein is a method of inhibiting differentiation of hematopoietic progenitor cells by contacting the progenitor cells with an isolated JAGGED polypeptide, or active fragment thereof. The invention additionally provides a method of diagnosing Alagille Syndrome in an individual. The method consists of detecting an Alagille Syndrome disease-associated mutation linked to a JAGGED locus.

  13. Methods for chromosome-specific staining

    DOEpatents

    Gray, J.W.; Pinkel, D.

    1995-09-05

    Methods and compositions for chromosome-specific staining are provided. Compositions comprise heterogeneous mixtures of labeled nucleic acid fragments having substantially complementary base sequences to unique sequence regions of the chromosomal DNA for which their associated staining reagent is specific. Methods include ways for making the chromosome-specific staining compositions of the invention, and methods for applying the staining compositions to chromosomes. 3 figs.

  14. Methods and compositions for chromosome-specific staining

    DOEpatents

    Gray, Joe W.; Pinkel, Daniel

    2003-07-22

    Methods and compositions for chromosome-specific staining are provided. Compositions comprise heterogenous mixtures of labeled nucleic acid fragments having substantially complementary base sequences to unique sequence regions of the chromosomal DNA for which their associated staining reagent is specific. Methods include methods for making the chromosome-specific staining compositions of the invention, and methods for applying the staining compositions to chromosomes.

  15. [Sequence analysis of LEAFY homologous gene from Dendrobium moniliforme and application for identification of medicinal Dendrobium].

    PubMed

    Xing, Wen-Rui; Hou, Bei-Wei; Guan, Jing-Jiao; Luo, Jing; Ding, Xiao-Yu

    2013-04-01

    The LEAFY (LFY) homologous gene of Dendrobium moniliforme (L.) Sw. was cloned using new primers designed from the conserved region of known orchid LEAFY gene sequences. A partial LFY homologous gene was cloned by conventional PCR, and the complete LFY homologous gene, DenLFY, was then obtained by TAIL-PCR. The complete sequence of the DenLFY gene was 3,575 bp and contained three exons and two introns. Using the BLAST method, comparison of the exons of LFY homologous genes indicated that the DenLFY gene had high identity with orchid LFY homologs, including the related fragment of PhalLFY (84%) in a Phalaenopsis hybrid cultivar, the LFY homologous gene in Oncidium (90%), and those in other orchids (over 80%). Using MP analysis, Dendrobium was found to be sister to Oncidium and Phalaenopsis. Homology analysis demonstrated that the C-terminal amino acids were highly conserved. When the exons and introns were considered separately, the exons and the amino acid sequence were good markers for functional research on the DenLFY gene. The second intron can be used in authentication research on Dendrobium, based on the length polymorphism between Dendrobium moniliforme and Dendrobium officinale.

  16. Degenerative Minimalism in the Genome of a Psyllid Endosymbiont

    PubMed Central

    Clark, Marta A.; Baumann, Linda; Thao, MyLo Ly; Moran, Nancy A.; Baumann, Paul

    2001-01-01

    Psyllids, like aphids, feed on plant phloem sap and are obligately associated with prokaryotic endosymbionts acquired through vertical transmission from an ancestral infection. We have sequenced 37 kb of DNA of the genome of Carsonella ruddii, the endosymbiont of psyllids, and found that it has a number of unusual properties revealing a more extreme case of degeneration than was previously reported from studies of eubacterial genomes, including that of the aphid endosymbiont Buchnera aphidicola. Among the unusual properties are an exceptionally low guanine-plus-cytosine content (19.9%), almost complete absence of intergenic spaces, operon fusion, and lack of the usual promoter sequences upstream of 16S rDNA. These features suggest the synthesis of long mRNAs and translational coupling. The most extreme instances of base compositional bias occur in the genes encoding proteins that have less highly conserved amino acid sequences; the guanine-plus-cytosine content of some protein-coding sequences is as low as 10%. The shift in base composition has a large effect on proteins: in polypeptides of C. ruddii, half of the residues consist of five amino acids with codons low in guanine plus cytosine. Furthermore, the proteins of C. ruddii are reduced in size, with an average of about 9% fewer amino acids than in homologous proteins of related bacteria. These observations suggest that the C. ruddii genome is not subject to constraints that limit the evolution of other known eubacteria. PMID:11222582
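
    The guanine-plus-cytosine content figures quoted in this abstract are simple base-composition ratios. A minimal Python sketch of the calculation is given below; the function name is hypothetical, and the choice to ignore ambiguous characters such as N is an assumption made here.

```python
def gc_content(dna):
    """Fraction of G and C among the unambiguous bases (A, C, G, T) of a sequence."""
    dna = dna.upper()
    gc = dna.count("G") + dna.count("C")
    at = dna.count("A") + dna.count("T")
    if gc + at == 0:
        raise ValueError("sequence contains no A, C, G, or T")
    return gc / (gc + at)
```

    Applied gene by gene, a function like this would yield the kind of per-gene comparison described in the abstract, where some protein-coding sequences fall as low as 10% G+C.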

  17. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Denef, Vincent; Shah, Manesh B; Verberkmoes, Nathan C

    The recent surge in microbial genomic sequencing, combined with the development of high-throughput liquid chromatography-mass-spectrometry-based (LC/LC-MS/MS) proteomics, has raised the question of the extent to which genomic information of one strain or environmental sample can be used to profile proteomes of related strains or samples. Even with decreasing sequencing costs, it remains impractical to obtain genomic sequence for every strain or sample analyzed. Here, we evaluate how shotgun proteomics is affected by amino acid divergence between the sample and the genomic database using a probability-based model and a random mutation simulation model constrained by experimental data. To assess the effects of nonrandom distribution of mutations, we also evaluated identification levels using in silico peptide data from sequenced isolates with average amino acid identities (AAI) varying between 76 and 98%. We compared the predictions to experimental protein identification levels for a sample that was evaluated using a database that included genomic information for the dominant organism and for a closely related variant (95% AAI). The range of models set the boundaries at which half of the proteins in a proteomic experiment can be identified to be 77-92% AAI between orthologs in the sample and database. Consistent with this prediction, experimental data indicated loss of half the identifiable proteins at 90% AAI. Additional analysis indicated a 6.4% reduction of the initial protein coverage per 1% amino acid divergence and total identification loss at 86% AAI. Consequently, shotgun proteomics is capable of cross-strain identifications but avoids most cross-species false positives.

  18. Hydrophobic cluster analysis of G protein-coupled receptors: a powerful tool to derive structural and functional information from 2D-representation of protein sequences.

    PubMed

    Lentes, K U; Mathieu, E; Bischoff, R; Rasmussen, U B; Pavirani, A

    1993-01-01

    Current methods for comparative analyses of protein sequences are 1D-alignments of amino acid sequences based on the maximization of amino acid identity (homology) and the prediction of secondary structure elements. This method has a major drawback once the amino acid identity drops below 20-25%, since maximization of a homology score does not take into account any structural information. A new technique called Hydrophobic Cluster Analysis (HCA) has been developed by Lemesle-Varloot et al. (Biochimie 72, 555-574, 1990). This consists of comparing several sequences simultaneously and combining homology detection with secondary structure analysis. HCA is primarily based on the detection and comparison of structural segments constituting the hydrophobic core of globular protein domains, with or without transmembrane domains. We have applied HCA to the analysis of different families of G-protein coupled receptors, such as catecholamine receptors as well as peptide hormone receptors. Utilizing HCA, the thrombin receptor, a new and as yet unique member of the family of G-protein coupled receptors, can be clearly classified as being closely related to the family of neuropeptide receptors rather than to the catecholamine receptors, for which the shape of the hydrophobic clusters and the length of their third cytoplasmic loop are very different. Furthermore, the potential of HCA to predict relationships between new putative and already characterized members of this family of receptors will be presented.

  19. An Accurate Scalable Template-based Alignment Algorithm

    PubMed Central

    Gardner, David P.; Xu, Weijia; Miranker, Daniel P.; Ozer, Stuart; Cannone, Jamie J.; Gutell, Robin R.

    2013-01-01

    The rapid determination of nucleic acid sequences is increasing the number of sequences that are available. Inherent in a template or seed alignment is the culmination of structural and functional constraints that are selecting those mutations that are viable during the evolution of the RNA. While we might not understand these structural and functional constraints, template-based alignment programs utilize the patterns of sequence conservation to encapsulate the characteristics of viable RNA sequences that are aligned properly. We have developed a program that utilizes the different dimensions of information in rCAD, a large RNA informatics resource, to establish a profile for each position in an alignment. The most significant include sequence identity and column composition in different phylogenetic taxa. We have compared our methods with a maximum of eight alternative alignment methods on different sets of 16S and 23S rRNA sequences with sequence percent identities ranging from 50% to 100%. The results showed that CRWAlign outperformed the other alignment methods in both speed and accuracy. A web-based alignment server is available at http://www.rna.ccbb.utexas.edu/SAE/2F/CRWAlign. PMID:24772376

  20. Complete amino acid sequence of bovine colostrum low-Mr cysteine proteinase inhibitor.

    PubMed

    Hirado, M; Tsunasawa, S; Sakiyama, F; Niinobe, M; Fujii, S

    1985-07-01

    The complete amino acid sequence of bovine colostrum cysteine proteinase inhibitor was determined by sequencing native inhibitor and peptides obtained by cyanogen bromide degradation, Achromobacter lysylendopeptidase digestion and partial acid hydrolysis of reduced and S-carboxymethylated protein. Achromobacter peptidase digestion was successfully used to isolate two disulfide-containing peptides. The inhibitor consists of 112 amino acids with an Mr of 12787. Two disulfide bonds were established between Cys 66 and Cys 77 and between Cys 90 and Cys 110. A high degree of homology in the sequence was found between the colostrum inhibitor and human gamma-trace, human salivary acidic protein and chicken egg-white cystatin.

  1. Sequence determination and analysis of the NSs genes of two tospoviruses.

    PubMed

    Hallwass, Mariana; Leastro, Mikhail O; Lima, Mirtes F; Inoue-Nagata, Alice K; Resende, Renato O

    2012-03-01

    The tospoviruses groundnut ringspot virus (GRSV) and zucchini lethal chlorosis virus (ZLCV) cause severe losses in many crops, especially in solanaceous and cucurbit species. In this study, the non-structural NSs gene and the 5'UTRs of these two biologically distinct tospoviruses were cloned and sequenced. The NSs sequences of GRSV and ZLCV were both 1,404 nucleotides long. Pairwise comparison showed that the NSs amino acid sequence of GRSV shared 69.6% identity with that of ZLCV and 75.9% identity with that of tomato spotted wilt virus (TSWV), while the NSs sequences of ZLCV and TSWV shared 67.9% identity. Phylogenetic analysis based on NSs sequences confirmed that these viruses cluster in the American clade.
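
    Pairwise identity figures such as those quoted above are ratios computed over an alignment. The sketch below shows one common convention (identical positions divided by positions where neither sequence has a gap). The abstract does not state which convention the authors used, so the denominator is an assumption, and the two short sequences are invented.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over aligned positions where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # gapped columns are skipped under this convention
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared

# Invented toy alignment: 5 ungapped columns, 4 of them identical.
print(percent_identity("MKT-LV", "MRTALV"))  # 80.0
```

    Other conventions divide by the alignment length or by the length of the shorter sequence, which gives lower values whenever gaps are present.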

  2. Complete genome sequence of lymphocystis disease virus isolated from China.

    PubMed

    Zhang, Qi-Ya; Xiao, Feng; Xie, Jian; Li, Zheng-Qiu; Gui, Jian-Fang

    2004-07-01

    Lymphocystis diseases in fish throughout the world have been extensively described. Here we report the complete genome sequence of lymphocystis disease virus isolated in China (LCDV-C), an LCDV isolated from cultured flounder (Paralichthys olivaceus) with lymphocystis disease in China. The LCDV-C genome is 186,250 bp, with a base composition of 27.25% G+C. Computer-assisted analysis revealed 240 potential open reading frames (ORFs) and 176 nonoverlapping putative viral genes, which encode polypeptides ranging from 40 to 1,193 amino acids. The percent coding density is 67%, and the average length of each ORF is 702 bp. A search of the GenBank database using the 176 individual putative genes revealed 103 homologues to the corresponding ORFs of LCDV-1 and 73 potential genes that were not found in LCDV-1 and other iridoviruses. Among the 73 genes, there are 8 genes that contain conserved domains of cellular genes and 65 novel genes that do not show any significant homology with the sequences in public databases. Although a certain extent of similarity between putative gene products of LCDV-C and corresponding proteins of LCDV-1 was revealed, no colinearity was detected when their ORF arrangements and coding strategies were compared to each other, suggesting that a high degree of genetic rearrangements between them has occurred. And a large number of tandem and overlapping repeated sequences were observed in the LCDV-C genome. The deduced amino acid sequence of the major capsid protein (MCP) presents the highest identity to those of LCDV-1 and other iridoviruses among the LCDV-C gene products. Furthermore, a phylogenetic tree was constructed based on the multiple alignments of nine MCP amino acid sequences. Interestingly, LCDV-C and LCDV-1 were clustered together, but their amino acid identity is much less than that in other clusters. 
    The unexpected levels of divergence between their genomes in size, gene organization, and gene product identity suggest that LCDV-C and LCDV-1 should not belong to the same species and that LCDV-C should be considered a species different from LCDV-1.
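
    The genome statistics quoted in this abstract (G+C content, coding density) are simple ratios. The sketch below shows how they might be computed. The helper names are hypothetical, and estimating coding density as gene count times mean ORF length over genome length is an assumption, since the abstract does not say how the 67% figure was derived.

```python
def gc_percent(seq):
    """G+C content as a percentage of unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    acgt = gc + seq.count("A") + seq.count("T")
    return 100.0 * gc / acgt if acgt else 0.0

def coding_density_percent(n_genes, mean_orf_bp, genome_bp):
    """Rough coding density: total ORF length divided by genome length."""
    return 100.0 * n_genes * mean_orf_bp / genome_bp

# Toy sequence, only to show the G+C calculation.
print(gc_percent("ATGC"))  # 50.0

# Figures taken from the abstract: 176 putative genes, 702 bp mean ORF, 186,250 bp genome.
print(round(coding_density_percent(176, 702, 186250), 1))  # 66.3
```

    Under this assumption the estimate is about 66.3%, close to the reported 67%.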

  3. Complete Genome Sequence of Lymphocystis Disease Virus Isolated from China

    PubMed Central

    Zhang, Qi-Ya; Xiao, Feng; Xie, Jian; Li, Zheng-Qiu; Gui, Jian-Fang

    2004-01-01

    Lymphocystis diseases in fish throughout the world have been extensively described. Here we report the complete genome sequence of lymphocystis disease virus isolated in China (LCDV-C), an LCDV isolated from cultured flounder (Paralichthys olivaceus) with lymphocystis disease in China. The LCDV-C genome is 186,250 bp, with a base composition of 27.25% G+C. Computer-assisted analysis revealed 240 potential open reading frames (ORFs) and 176 nonoverlapping putative viral genes, which encode polypeptides ranging from 40 to 1,193 amino acids. The percent coding density is 67%, and the average length of each ORF is 702 bp. A search of the GenBank database using the 176 individual putative genes revealed 103 homologues to the corresponding ORFs of LCDV-1 and 73 potential genes that were not found in LCDV-1 and other iridoviruses. Among the 73 genes, there are 8 genes that contain conserved domains of cellular genes and 65 novel genes that do not show any significant homology with the sequences in public databases. Although a certain extent of similarity between putative gene products of LCDV-C and corresponding proteins of LCDV-1 was revealed, no colinearity was detected when their ORF arrangements and coding strategies were compared to each other, suggesting that a high degree of genetic rearrangements between them has occurred. And a large number of tandem and overlapping repeated sequences were observed in the LCDV-C genome. The deduced amino acid sequence of the major capsid protein (MCP) presents the highest identity to those of LCDV-1 and other iridoviruses among the LCDV-C gene products. Furthermore, a phylogenetic tree was constructed based on the multiple alignments of nine MCP amino acid sequences. Interestingly, LCDV-C and LCDV-1 were clustered together, but their amino acid identity is much less than that in other clusters. 
    The unexpected levels of divergence between their genomes in size, gene organization, and gene product identity suggest that LCDV-C and LCDV-1 should not belong to the same species and that LCDV-C should be considered a species different from LCDV-1. PMID:15194775

  4. The shikimate pathway: review of amino acid sequence, function and three-dimensional structures of the enzymes.

    PubMed

    Mir, Rafia; Jallu, Shais; Singh, T P

    2015-06-01

    Aromatic compounds such as aromatic amino acids, vitamin K and ubiquinone are important prerequisites for the metabolism of an organism. All organisms can synthesize these aromatic metabolites through the shikimate pathway, except for mammals, which are dependent on their diet for these compounds. The pathway converts phosphoenolpyruvate and erythrose 4-phosphate to chorismate through seven enzymatically catalyzed steps, and chorismate serves as a precursor for the synthesis of a variety of aromatic compounds. These enzymes have been shown to play a vital role in the viability of microorganisms and thus are suggested to present attractive molecular targets for the design of novel antimicrobial drugs. This review focuses on the seven enzymes of the shikimate pathway, highlighting their primary sequences, functions and three-dimensional structures. The understanding of their active site amino acid maps, functions and three-dimensional structures will provide a framework on which the rational design of antimicrobial drugs can be based. Comparing the full-length amino acid sequences and the X-ray crystal structures of these enzymes from bacterial, fungal and plant sources would contribute to designing a specific drug and/or developing broad-spectrum compounds with efficacy against a variety of pathogens.

  5. Detection of viral infection and gene expression in clinical tissue specimens using branched DNA (bDNA) in situ hybridization.

    PubMed

    Kenny, Daryn; Shen, Lu-Ping; Kolberg, Janice A

    2002-09-01

    In situ hybridization (ISH) methods for detection of nucleic acid sequences have proved especially powerful for revealing genetic markers and gene expression in a morphological context. Although target and signal amplification technologies have enabled researchers to detect relatively low-abundance molecules in cell extracts, the sensitive detection of nucleic acid sequences in tissue specimens has proved more challenging. We recently reported the development of a branched DNA (bDNA) ISH method for detection of DNA and mRNA in whole cells. Based on bDNA signal amplification technology, bDNA ISH is highly sensitive and can detect one or two copies of DNA per cell. In this study we evaluated bDNA ISH for detection of nucleic acid sequences in tissue specimens. Using normal and human papillomavirus (HPV)-infected cervical biopsy specimens, we explored the cell type-specific distribution of HPV DNA and mRNA by bDNA ISH. We found that bDNA ISH allowed rapid, sensitive detection of nucleic acids with high specificity while preserving tissue morphology. As an adjunct to conventional histopathology, bDNA ISH may improve diagnostic accuracy and prognosis for viral and neoplastic diseases.

  6. Studies of the structure-activity relationships of peptides and proteins involved in growth and development based on their three-dimensional structures.

    PubMed

    Nagata, Koji

    2010-01-01

    Peptides and proteins with similar amino acid sequences can have different biological functions. Knowledge of their three-dimensional molecular structures is critically important in identifying their functional determinants. In this review, I describe the results of our and other groups' structure-based functional characterization of insect insulin-like peptides, a crustacean hyperglycemic hormone-family peptide, a mammalian epidermal growth factor-family protein, and an intracellular signaling domain that recognizes proline-rich sequence.

  7. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Gelb, Bruce D; Tartaglia, Marco; Pennacchio, Len

    Diagnostic and therapeutic applications for Noonan Syndrome are described. The diagnostic and therapeutic applications are based on certain mutations in a RAS-specific guanine nucleotide exchange factor gene SOS1 or its expression product. The diagnostic and therapeutic applications are also based on certain mutations in a serine/threonine protein kinase gene RAF1 or its expression product. Also described are nucleotide sequences, amino acid sequences, probes, and primers related to RAF1 or SOS1, and variants thereof, as well as host cells expressing such variants.

  8. Detection and isolation of nucleic acid sequences using competitive hybridization probes

    DOEpatents

    Lucas, Joe N.; Straume, Tore; Bogen, Kenneth T.

    1997-01-01

    A method for detecting a target nucleic acid sequence in a sample is provided using hybridization probes which competitively hybridize to a target nucleic acid. According to the method, a target nucleic acid sequence is hybridized to first and second hybridization probes which are complementary to overlapping portions of the target nucleic acid sequence, the first hybridization probe including a first complexing agent capable of forming a binding pair with a second complexing agent and the second hybridization probe including a detectable marker. The first complexing agent attached to the first hybridization probe is contacted with a second complexing agent, the second complexing agent being attached to a solid support such that when the first and second complexing agents are attached, target nucleic acid sequences hybridized to the first hybridization probe become immobilized on to the solid support. The immobilized target nucleic acids are then separated and detected by detecting the detectable marker attached to the second hybridization probe. A kit for performing the method is also provided.

  9. Detection and isolation of nucleic acid sequences using competitive hybridization probes

    DOEpatents

    Lucas, J.N.; Straume, T.; Bogen, K.T.

    1997-04-01

    A method for detecting a target nucleic acid sequence in a sample is provided using hybridization probes which competitively hybridize to a target nucleic acid. According to the method, a target nucleic acid sequence is hybridized to first and second hybridization probes which are complementary to overlapping portions of the target nucleic acid sequence, the first hybridization probe including a first complexing agent capable of forming a binding pair with a second complexing agent and the second hybridization probe including a detectable marker. The first complexing agent attached to the first hybridization probe is contacted with a second complexing agent, the second complexing agent being attached to a solid support such that when the first and second complexing agents are attached, target nucleic acid sequences hybridized to the first hybridization probe become immobilized on to the solid support. The immobilized target nucleic acids are then separated and detected by detecting the detectable marker attached to the second hybridization probe. A kit for performing the method is also provided. 7 figs.

  10. Molecular cloning of an inducible serine esterase gene from human cytotoxic lymphocytes.

    PubMed Central

    Trapani, J A; Klein, J L; White, P C; Dupont, B

    1988-01-01

    A cDNA clone encoding a human serine esterase gene was isolated from a library constructed from poly(A)+ RNA of allogeneically stimulated, interleukin 2-expanded peripheral blood mononuclear cells. The clone, designated HSE26.1, represents a full-length copy of a 0.9-kilobase mRNA present in human cytotoxic cells but absent from a wide variety of noncytotoxic cell lines. Clone HSE26.1 contains an 892-base-pair sequence, including a single 741-base-pair open reading frame encoding a putative 247-residue polypeptide. The first 20 amino acids of the polypeptide form a leader sequence. The mature protein is predicted to have an unglycosylated Mr of approximately 26,000 and contains a single potential site for N-linked glycosylation. The nucleotide and predicted amino acid sequences of clone HSE26.1 are homologous with all murine and human serine esterases cloned thus far but are most similar to mouse granzyme B (70% nucleotide and 68% amino acid identity). HSE26.1 protein is expressed weakly in unstimulated peripheral blood mononuclear cells but is strongly induced within 6 hr of incubation in medium containing phytohemagglutinin. The data suggest that the protein encoded by HSE26.1 plays a role in cell-mediated cytotoxicity. PMID:3261871

  11. Differences in acid tolerance between Bifidobacterium breve BB8 and its acid-resistant derivative B. breve BB8dpH, revealed by RNA-sequencing and physiological analysis.

    PubMed

    Yang, Xu; Hang, Xiaomin; Tan, Jing; Yang, Hong

    2015-06-01

    Bifidobacteria are common inhabitants of the human gastrointestinal tract, and their application has increased dramatically in recent years due to their health-promoting effects. The ability of bifidobacteria to tolerate acidic environments is particularly important for their function as probiotics because they encounter such environments in food products and during passage through the gastrointestinal tract. In this study, we generated a derivative, Bifidobacterium breve BB8dpH, which displayed a stable, acid-resistant phenotype. To investigate the possible reasons for the higher acid tolerance of B. breve BB8dpH, as compared with its parental strain B. breve BB8, a combined transcriptome and physiological approach was used to characterize differences between the two strains. An analysis of the transcriptome by RNA-sequencing indicated that the expression of 121 genes was increased by more than 2-fold, while the expression of 146 genes was reduced more than 2-fold, in B. breve BB8dpH. Validation of the RNA-sequencing data using real-time quantitative PCR analysis demonstrated that the RNA-sequencing results were highly reliable. The comparison analysis, based on differentially expressed genes, suggested that the acid tolerance of B. breve BB8dpH was enhanced by regulating the expression of genes involved in carbohydrate transport and metabolism, energy production, synthesis of cell envelope components (peptidoglycan and exopolysaccharide), synthesis and transport of glutamate and glutamine, and histidine synthesis. Furthermore, an analysis of physiological data showed that B. breve BB8dpH displayed higher production of exopolysaccharide and lower H(+)-ATPase activity than B. breve BB8. The results presented here will improve our understanding of acid tolerance in bifidobacteria, and they will lead to the development of new strategies to enhance the acid tolerance of bifidobacterial strains. Copyright © 2015 Elsevier Ltd. All rights reserved.

  12. Sequence specificity of single-stranded DNA-binding proteins: a novel DNA microarray approach

    PubMed Central

    Morgan, Hugh P.; Estibeiro, Peter; Wear, Martin A.; Max, Klaas E.A.; Heinemann, Udo; Cubeddu, Liza; Gallagher, Maurice P.; Sadler, Peter J.; Walkinshaw, Malcolm D.

    2007-01-01

    We have developed a novel DNA microarray-based approach for identification of the sequence-specificity of single-stranded nucleic-acid-binding proteins (SNABPs). For verification, we have shown that the major cold shock protein (CspB) from Bacillus subtilis binds with high affinity to pyrimidine-rich sequences, with a binding preference for the consensus sequence, 5′-GTCTTTG/T-3′. The sequence was modelled onto the known structure of CspB and a cytosine-binding pocket was identified, which explains the strong preference for a cytosine base at position 3. This microarray method offers a rapid high-throughput approach for determining the specificity and strength of ss DNA–protein interactions. Further screening of this newly emerging family of transcription factors will help provide an insight into their cellular function. PMID:17488853
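
    A consensus such as 5′-GTCTTTG/T-3′ can be searched for in a sequence with a regular expression. The sketch below reads "G/T" as "either G or T at the last position", which is an interpretation of the notation in the abstract. The function name and the test sequence are invented for illustration.

```python
import re

# Assumed reading of the consensus: GTCTTT followed by G or T.
CONSENSUS = re.compile(r"GTCTTT[GT]")

def find_consensus_sites(ssdna):
    """Return (0-based start, matched text) for each consensus match, 5' to 3'."""
    return [(m.start(), m.group(0)) for m in CONSENSUS.finditer(ssdna.upper())]

# Made-up single-stranded sequence containing both variants of the motif.
print(find_consensus_sites("AAGTCTTTGCCGTCTTTTAA"))
# [(2, 'GTCTTTG'), (11, 'GTCTTTT')]
```

    An exact-match search like this ignores binding affinity; the microarray approach in the abstract measures binding strength, which a pattern match cannot capture.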

  13. Identification of single amino acid substitutions (SAAS) in neuraminidase from influenza A virus (H1N1) via mass spectrometry analysis coupled with de novo peptide sequencing.

    PubMed

    Peng, Qisheng; Wang, Zijian; Wu, Donglin; Li, Xiaoou; Liu, Xiaofeng; Sun, Wanchun; Liu, Ning

    2016-08-01

    Amino acid substitutions in the neuraminidase of the influenza virus are the main cause of the emergence of resistance to zanamivir or oseltamivir during seasonal influenza treatment; they are the result of non-synonymous mutations in the viral genome that can be successfully detected by polymerase chain reaction (PCR)-based approaches. There is always an urgent need to detect variation in amino acid sequences directly at the protein level. Mass spectrometry coupled with de novo sequencing has also been explored as an alternative and straightforward strategy for detecting amino acid substitutions; this approach is the primary focus of the present study. Influenza virus (A/Puerto Rico/8/1934 H1N1) propagated in embryonated chicken eggs was purified by ultracentrifugation, followed by PNGase F treatment. The deglycosylated virion was lysed and separated by sodium dodecyl sulfate polyacrylamide gel electrophoresis (SDS-PAGE). The gel band corresponding to neuraminidase was excised and subjected to liquid chromatography tandem mass spectrometry (LC-MS/MS) analysis. LC-MS/MS analyses, coupled with manual de novo sequencing, allowed the determination of three amino acid substitutions: R346K, S349N, and S370I/L, in the neuraminidase from the influenza virus (A/Puerto Rico/8/1934 H1N1), which were located in three mutated peptides of the neuraminidase: YGNGVWIGK, TKNHSSR, and PNGWTETDI/LK, respectively. We found that the amino acid substitutions in the proteins of RNA viruses (including influenza A virus) resulting from non-synonymous gene mutations can indeed be directly analyzed via mass spectrometry, and that manual interpretation of the MS/MS data may be beneficial. Copyright © 2016 John Wiley & Sons, Ltd.
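
    The "I/L" notation in S370I/L reflects the fact that isoleucine and leucine have the same elemental composition and therefore the same mass, so a mass measurement alone cannot tell them apart. The sketch below makes this concrete using standard monoisotopic residue masses. The function names are hypothetical, and this is not the authors' analysis pipeline.

```python
# Monoisotopic residue masses in daltons (standard values, rounded to 5 decimals).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # one water is added for the free N- and C-termini

def peptide_mass(peptide):
    """Monoisotopic mass of an unmodified, uncharged peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER

def substitution_shift(original, replacement):
    """Mass change (Da) caused by replacing one residue with another."""
    return RESIDUE_MASS[replacement] - RESIDUE_MASS[original]

# The two readings of the third peptide from the abstract have identical masses.
print(peptide_mass("PNGWTETDIK") == peptide_mass("PNGWTETDLK"))  # True

# The other two substitutions produce clearly measurable shifts.
print(round(substitution_shift("R", "K"), 4))  # R346K
print(round(substitution_shift("S", "N"), 4))  # S349N
```

    Distinguishing I from L generally requires additional fragmentation evidence or knowledge of the underlying gene sequence.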

  14. Contribution of silent mutations to thermal adaptation of RNA bacteriophage Qβ.

    PubMed

    Kashiwagi, Akiko; Sugawara, Ryu; Sano Tsushima, Fumie; Kumagai, Tomofumi; Yomo, Tetsuya

    2014-10-01

    Changes in protein function and other biological properties, such as RNA structure, are crucial for adaptation of organisms to novel or inhibitory environments. To investigate how mutations that do not alter amino acid sequence may be positively selected, we performed a thermal adaptation experiment using the single-stranded RNA bacteriophage Qβ in which the culture temperature was increased from 37.2°C to 41.2°C and finally to an inhibitory temperature of 43.6°C in a stepwise manner in three independent lines. Whole-genome analysis revealed 31 mutations, including 14 mutations that did not result in amino acid sequence alterations, in this thermal adaptation. Eight of the 31 mutations were observed in all three lines. Reconstruction and fitness analyses of Qβ strains containing only mutations observed in all three lines indicated that five mutations that did not result in amino acid sequence changes but increased the amplification ratio appeared in the course of adaptation to growth at 41.2°C. Moreover, these mutations provided a suitable genetic background for subsequent mutations, altering the fitness contribution from deleterious to beneficial. These results clearly showed that mutations that do not alter the amino acid sequence play important roles in adaptation of this single-stranded RNA virus to elevated temperature. Recent studies using whole-genome analysis technology suggested the importance of mutations that do not alter the amino acid sequence for adaptation of organisms to novel environmental conditions. It is necessary to investigate how these mutations may be positively selected and to determine to what degree such mutations that do not alter amino acid sequences contribute to adaptive evolution. Here, we report the roles of these silent mutations in thermal adaptation of RNA bacteriophage Qβ based on experimental evolution during which Qβ showed adaptation to growth at an inhibitory temperature. 
    Intriguingly, four synonymous mutations and one mutation in the untranslated region that spread widely in the Qβ population during the adaptation process at moderately high temperature provided a suitable genetic background to alter the fitness contribution of subsequent mutations from deleterious to beneficial at a higher temperature. Copyright © 2014, American Society for Microbiology. All Rights Reserved.
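
    Whether a point mutation in a coding region is silent (synonymous) can be checked by translating the codon before and after the change. The sketch below assumes the standard genetic code and uses invented example codons; it does not reproduce any mutation reported in the study.

```python
from itertools import product

# Standard genetic code in the conventional TCAG ordering (third base varies fastest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def classify_substitution(codon_before, codon_after):
    """Classify a codon change as 'unchanged', 'synonymous' or 'non-synonymous'.

    RNA codons are accepted: U is treated as T before lookup.
    """
    before = codon_before.upper().replace("U", "T")
    after = codon_after.upper().replace("U", "T")
    if before == after:
        return "unchanged"
    if CODON_TABLE[before] == CODON_TABLE[after]:
        return "synonymous"
    return "non-synonymous"

# Illustrative codons only.
print(classify_substitution("GCU", "GCC"))  # Ala -> Ala: synonymous
print(classify_substitution("AAA", "AGA"))  # Lys -> Arg: non-synonymous
```

    A mutation in an untranslated region, like the one mentioned in the abstract, falls outside this classification because it does not lie in a codon.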

  15. A Unique (3+2) Annulation Reaction between Meldrum's Acid and Nitrones: Mechanistic Insight by ESI-IMS-MS and DFT Studies.

    PubMed

    Lespes, Nicolas; Pair, Etienne; Maganga, Clisy; Bretier, Marie; Tognetti, Vincent; Joubert, Laurent; Levacher, Vincent; Hubert-Roux, Marie; Afonso, Carlos; Loutelier-Bourhis, Corinne; Brière, Jean-François

    2018-03-15

    The fragile intermediates of the domino process leading to an isoxazolidin-5-one, triggered by unique reactivity between Meldrum's acid and an N-benzyl nitrone in the presence of a Brønsted base, were determined thanks to the softness and accuracy of electrospray ionization mass spectrometry coupled to ion mobility spectrometry (ESI-IMS-MS). The combined DFT study shed light on the overall organocatalytic sequence that starts with a stepwise (3+2) annulation reaction that is followed by a decarboxylative protonation sequence encompassing a stereoselective pathway issue. © 2018 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim.

  16. Molecular characterization and phylogenetic analysis of a yak (Bos grunniens) κ-casein cDNA from lactating mammary gland.

    PubMed

    Bai, W L; Yin, R H; Dou, Q L; Jiang, W Q; Zhao, S J; Ma, Z J; Luo, G B; Zhao, Z H

    2011-04-01

    κ-Casein is one of the major proteins in the milk of mammals. It plays an important role in determining the size and specific function of milk micelles. We have previously identified and characterized a genetic variant of yak κ-casein by evaluating genomic DNA. Here, we isolate and characterize a yak κ-casein cDNA harboring the full-length open reading frame (ORF) from lactating mammary gland. Total RNA was extracted from mammary tissue of lactating female yak, and the κ-casein cDNA was synthesized by RT-PCR, then cloned and sequenced. The obtained cDNA of 660 bp contained an ORF sufficient to encode the entire amino acid sequence of the κ-casein precursor protein, consisting of 190 amino acids with a signal peptide of 21 amino acids. Yak κ-casein has a predicted molecular mass of 19,006.588 Da with a calculated isoelectric point of 7.245. Compared with the corresponding sequences in GenBank of cattle, buffalo, sheep, goat, Arabian camel, horse, and rabbit, the yak κ-casein sequence had identity of 64.76-98.78% in cDNA, and identity of 44.79-98.42% and similarity of 53.65-98.42% in deduced amino acids, revealing a high homology with the other livestock species. Based on κ-casein cDNA sequences, the phylogenetic analysis indicated that yak κ-casein had a close relationship with that of cattle. This work might be useful in genetic engineering research on yak κ-casein.

  17. Identification of potential platelet alloantigens in the Equidae family by comparison of gene sequences encoding major platelet membrane glycoproteins.

    PubMed

    Boudreaux, Mary K; Humphries, Drew M

    2013-12-01

    Platelet alloantigens in horses may play an important role in the development of neonatal alloimmune thrombocytopenia (NAIT). The objective of this study was to evaluate genes encoding major platelet glycoproteins within the Equidae family in an effort to identify potential alloantigens. DNA was isolated from blood samples obtained from Equidae family members, including a Holsteiner-Oldenburg cross, a Quarter horse, a donkey, and a Plains zebra (Equus burchelli). Gene sequences encoding equine platelet membrane glycoproteins IIb, IIIa (integrin subunits αIIb and β3), Ia (integrin subunit α2), and Ibα were determined using PCR. Gene sequences were compared to the equine genome available on GenBank. Polymorphisms that would be predicted to result in amino acid changes on platelet surfaces were documented and compared with known alloantigenic sites documented on human platelets. Amino acid differences were predicted based on nucleotide sequences for all 4 genes. Nine differences were documented for αIIb, 5 differences were documented for β3, 7 differences were documented for α2, and 16 differences were documented for Ibα outside the macroglycopeptide region. This study represents the first effort at identifying potential platelet alloantigens in members of the Equidae Family based on evaluation of gene sequences. The data obtained form the groundwork for identifying potential platelet alloantigens involved in transfusion reactions and neonatal alloimmune thrombocytopenia (NAIT). More work is required to determine whether the predicted amino acid differences documented in this study play a role in alloimmunity, and whether other polymorphisms not detected in this study are present that may result in alloimmunity. © 2013 American Society for Veterinary Clinical Pathology.

  18. Size and sequence polymorphisms in the glutamate-rich protein gene of the human malaria parasite Plasmodium falciparum in Thailand.

    PubMed

    Pattaradilokrat, Sittiporn; Trakoolsoontorn, Chawinya; Simpalipan, Phumin; Warrit, Natapot; Kaewthamasorn, Morakot; Harnyuttanakorn, Pongchai

    2018-01-22

    The glutamate-rich protein (GLURP) of the malaria parasite Plasmodium falciparum is a key surface antigen that serves as a component of a clinical vaccine. Moreover, the GLURP gene is also employed routinely as a genetic marker for malarial genotyping in epidemiological studies. While extensive size polymorphisms in GLURP are well recorded, the extent of the sequence diversity of this gene is rarely investigated. The present study aimed to explore the genetic diversity of GLURP in natural populations of P. falciparum. The polymorphic C-terminal repetitive R2 region of GLURP sequences from 65 P. falciparum isolates in Thailand were generated and combined with the data from 103 worldwide isolates to generate a GLURP database. The collection comprised 168 alleles, encoding 105 unique GLURP subtypes, characterized by 18 types of amino acid repeat units (AAU). Of these, 28 GLURP subtypes, formed by 10 AAU types, were detected in P. falciparum in Thailand. Among them, 19 GLURP subtypes and 2 AAU types are described for the first time in the Thai parasite population. The AAU sequences were highly conserved, which is likely due to negative selection. Standard Fst analysis revealed the shared distributions of GLURP types among the P. falciparum populations, providing evidence of gene flow among the different demographic populations. Sequence diversity causing size variations in GLURP in Thai P. falciparum populations was detected, and was caused by non-synonymous substitutions in repeat units and by some insertions/deletions of aspartic acid or glutamic acid codons between repeat units. The P. falciparum population structure based on GLURP showed promising implications for the development of GLURP-based vaccines and for monitoring vaccine efficacy.

  19. Quaranfil, Johnston Atoll, and Lake Chad viruses are novel members of the family Orthomyxoviridae.

    PubMed

    Presti, Rachel M; Zhao, Guoyan; Beatty, Wandy L; Mihindukulasuriya, Kathie A; da Rosa, Amelia P A Travassos; Popov, Vsevolod L; Tesh, Robert B; Virgin, Herbert W; Wang, David

    2009-11-01

    Arboviral infections are an important cause of emerging infections due to the movements of humans, animals, and hematophagous arthropods. Quaranfil virus (QRFV) is an unclassified arbovirus originally isolated from children with mild febrile illness in Quaranfil, Egypt, in 1953. It has subsequently been isolated in multiple geographic areas from ticks and birds. We used high-throughput sequencing to classify QRFV as a novel orthomyxovirus. The genome of this virus is comprised of multiple RNA segments; five were completely sequenced. Proteins with limited amino acid similarity to conserved domains in polymerase (PA, PB1, and PB2) and hemagglutinin (HA) genes from known orthomyxoviruses were predicted to be present in four of the segments. The fifth sequenced segment shared no detectable similarity to any protein and is of uncertain function. The end-terminal sequences of QRFV are conserved between segments and are different from those of the known orthomyxovirus genera. QRFV is known to cross-react serologically with two other unclassified viruses, Johnston Atoll virus (JAV) and Lake Chad virus (LKCV). The complete open reading frames of PB1 and HA were sequenced for JAV, while a fragment of PB1 of LKCV was identified by mass sequencing. QRFV and JAV PB1 and HA shared 80% and 70% amino acid identity to each other, respectively; the LKCV PB1 fragment shared 83% amino acid identity with the corresponding region of QRFV PB1. Based on phylogenetic analyses, virion ultrastructural features, and the unique end-terminal sequences identified, we propose that QRFV, JAV, and LKCV comprise a novel genus of the family Orthomyxoviridae.

  20. Quaranfil, Johnston Atoll, and Lake Chad Viruses Are Novel Members of the Family Orthomyxoviridae

    PubMed Central

    Presti, Rachel M.; Zhao, Guoyan; Beatty, Wandy L.; Mihindukulasuriya, Kathie A.; Travassos da Rosa, Amelia P. A.; Popov, Vsevolod L.; Tesh, Robert B.; Virgin, Herbert W.; Wang, David

    2009-01-01

    Arboviral infections are an important cause of emerging infections due to the movements of humans, animals, and hematophagous arthropods. Quaranfil virus (QRFV) is an unclassified arbovirus originally isolated from children with mild febrile illness in Quaranfil, Egypt, in 1953. It has subsequently been isolated in multiple geographic areas from ticks and birds. We used high-throughput sequencing to classify QRFV as a novel orthomyxovirus. The genome of this virus is comprised of multiple RNA segments; five were completely sequenced. Proteins with limited amino acid similarity to conserved domains in polymerase (PA, PB1, and PB2) and hemagglutinin (HA) genes from known orthomyxoviruses were predicted to be present in four of the segments. The fifth sequenced segment shared no detectable similarity to any protein and is of uncertain function. The end-terminal sequences of QRFV are conserved between segments and are different from those of the known orthomyxovirus genera. QRFV is known to cross-react serologically with two other unclassified viruses, Johnston Atoll virus (JAV) and Lake Chad virus (LKCV). The complete open reading frames of PB1 and HA were sequenced for JAV, while a fragment of PB1 of LKCV was identified by mass sequencing. QRFV and JAV PB1 and HA shared 80% and 70% amino acid identity to each other, respectively; the LKCV PB1 fragment shared 83% amino acid identity with the corresponding region of QRFV PB1. Based on phylogenetic analyses, virion ultrastructural features, and the unique end-terminal sequences identified, we propose that QRFV, JAV, and LKCV comprise a novel genus of the family Orthomyxoviridae. PMID:19726499

  1. Bioinformatics analysis and detection of gelatinase encoded gene in Lysinibacillus sphaericus

    NASA Astrophysics Data System (ADS)

    Repin, Rul Aisyah Mat; Mutalib, Sahilah Abdul; Shahimi, Safiyyah; Khalid, Rozida Mohd.; Ayob, Mohd. Khan; Bakar, Mohd. Faizal Abu; Isa, Mohd Noor Mat

    2016-11-01

    In this study, we performed bioinformatics analysis of the genome sequence of Lysinibacillus sphaericus (L. sphaericus) to identify genes encoding gelatinase. L. sphaericus was isolated from soil and produces gelatinase that is species-specific to porcine and bovine gelatin. This bacterium therefore offers the possibility of producing enzymes specific to each of these two meat species. The main focus of this research is to identify the gelatinase-encoding gene in L. sphaericus using bioinformatics analysis of the partially sequenced genome. Three candidate genes were identified: gelatinase candidate gene 1 (P1), NODE_71_length_93919_cov_158.931839_21, which is 1563 base pairs (bp) long and encodes a 520-amino-acid sequence; gelatinase candidate gene 2 (P2), NODE_23_length_52851_cov_190.061386_17, which is 1776 bp long and encodes a 591-amino-acid sequence; and gelatinase candidate gene 3 (P3), NODE_106_length_32943_cov_169.147919_8, which is 1701 bp long and encodes a 566-amino-acid sequence. Three pairs of oligonucleotide primers, named F1/R1, F2/R2 and F3/R3, were designed to target short sequences of cDNA by PCR. The amplicons reliably yielded products of 1563 bp for candidate gene P1 and 1701 bp for candidate gene P3. Thus, bioinformatics analysis of L. sphaericus identified genes encoding gelatinase.
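
    The gene sizes and protein lengths quoted in this abstract are consistent with one another if each reported size includes a terminal stop codon. The abstract does not state this, so it is an assumption of the sketch below, which simply checks the arithmetic; the function name is illustrative.

```python
def protein_length(orf_bp: int, includes_stop: bool = True) -> int:
    """Return the number of amino acids encoded by an ORF of orf_bp base pairs.

    Assumption: when includes_stop is True, the last codon is a stop codon
    and does not contribute an amino acid.
    """
    if orf_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    codons = orf_bp // 3
    return codons - 1 if includes_stop else codons


# Gene sizes (bp) as reported in the abstract for candidates P1, P2 and P3.
candidates = {"P1": 1563, "P2": 1776, "P3": 1701}
lengths = {name: protein_length(bp) for name, bp in candidates.items()}
print(lengths)  # {'P1': 520, 'P2': 591, 'P3': 566}
```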

  2. Identification of short single disulfide-containing contryphans from the venom of cone snails using de novo mass spectrometry-based sequencing methods.

    PubMed

    Franklin, Jayaseelan Benjamin; Rajesh, Rajaian Pushpabai; Vinithkumar, Nambali Valsalan; Kirubagaran, Ramalingam

    2017-06-15

    We identified 12 short single disulfide-containing conopeptides from the venom of Conus coronatus, C. leopardus, C. lividus and C. zonatus. Interestingly, we detected the shortest contryphan sequence thus far characterized which contains only six amino acid residues. We also identified three distinct contryphan sequences of C. lividus without any proline residues and one sequence with an unusual post-translational modification (bromination of tryptophan). Furthermore, we characterized venom peptides of C. zonatus for the first time. Copyright © 2017 Elsevier Ltd. All rights reserved.

  3. Conservation and variability of West Nile virus proteins.

    PubMed

    Koo, Qi Ying; Khan, Asif M; Jung, Keun-Ok; Ramdas, Shweta; Miotto, Olivo; Tan, Tin Wee; Brusic, Vladimir; Salmon, Jerome; August, J Thomas

    2009-01-01

    West Nile virus (WNV) has emerged globally as an increasingly important pathogen for humans and domestic animals. Studies of the evolutionary diversity of the virus over its known history will help to elucidate conserved sites, and characterize their correspondence to other pathogens and their relevance to the immune system. We describe a large-scale analysis of the entire WNV proteome, aimed at identifying and characterizing evolutionarily conserved amino acid sequences. This study, which used 2,746 WNV protein sequences collected from the NCBI GenPept database, focused on analysis of peptides of length 9 amino acids or more, which are immunologically relevant as potential T-cell epitopes. Entropy-based analysis of the diversity of WNV sequences revealed the presence of numerous evolutionarily stable nonamer positions across the proteome (entropy value of ≤ 1). The representation (frequency) of nonamers variant to the predominant peptide at these stable positions was generally low (≤ 10% of the WNV sequences analyzed). Eighty-eight fragments of length 9-29 amino acids, representing approximately 34% of the WNV polyprotein length, were identified to be identical and evolutionarily stable in all analyzed WNV sequences. Of the 88 completely conserved sequences, 67 are also present in other flaviviruses, and several have been associated with the functional and structural properties of viral proteins. Immunoinformatic analysis revealed that the majority (78/88) of conserved sequences are potentially immunogenic, while 44 contained experimentally confirmed human T-cell epitopes. This study identified a comprehensive catalogue of completely conserved WNV sequences, many of which are shared by other flaviviruses, and the majority of which are potential epitopes. The complete conservation of these immunologically relevant sequences through the entire recorded WNV history suggests they will be valuable as components of peptide-specific vaccines or other therapeutic applications, for sequence-specific diagnosis of a wide range of Flavivirus infections, and for studies of homologous sequences among other flaviviruses.
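
    The entropy-based nonamer analysis described in this abstract can be illustrated with a minimal sketch. The abstract does not give the exact formula, so Shannon entropy in bits over the nonamers observed at one alignment position is an assumption here, and the toy sequences and function name are invented for illustration.

```python
from collections import Counter
from math import log2


def nonamer_entropy(aligned_seqs, start, k=9):
    """Shannon entropy (bits) of the k-mers starting at `start`.

    Assumes the sequences are already aligned and gap-free in this window.
    """
    peptides = [seq[start:start + k] for seq in aligned_seqs]
    counts = Counter(peptides)
    total = len(peptides)
    return -sum((c / total) * log2(c / total) for c in counts.values())


# Toy alignment: four sequences, the last differs at its final residue.
toy = ["MKNPKKKSGG", "MKNPKKKSGG", "MKNPKKKSGG", "MKNPKKKSGA"]
conserved = nonamer_entropy(toy, 0)  # all four nonamers identical
variable = nonamer_entropy(toy, 1)   # one of four nonamers differs
print(conserved, variable)
```

    A position whose nonamers are all identical has zero entropy; the more evenly the sequences split among variants, the higher the value.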

  4. Complete nucleotide and derived amino acid sequence of cDNA encoding the mitochondrial uncoupling protein of rat brown adipose tissue: lack of a mitochondrial targeting presequence.

    PubMed Central

    Ridley, R G; Patel, H V; Gerber, G E; Morton, R C; Freeman, K B

    1986-01-01

    A cDNA clone spanning the entire amino acid sequence of the nuclear-encoded uncoupling protein of rat brown adipose tissue mitochondria has been isolated and sequenced. With the exception of the N-terminal methionine the deduced N-terminus of the newly synthesized uncoupling protein is identical to the N-terminal 30 amino acids of the native uncoupling protein as determined by protein sequencing. This proves that the protein contains no N-terminal mitochondrial targeting prepiece and that a targeting region must reside within the amino acid sequence of the mature protein. PMID:3012461

  5. Acid–base chemical reaction model for nucleation rates in the polluted atmospheric boundary layer

    PubMed Central

    Chen, Modi; Titcombe, Mari; Jiang, Jingkun; Jen, Coty; Kuang, Chongai; Fischer, Marc L.; Eisele, Fred L.; Siepmann, J. Ilja; Hanson, David R.; Zhao, Jun; McMurry, Peter H.

    2012-01-01

    Climate models show that particles formed by nucleation can affect cloud cover and, therefore, the earth's radiation budget. Measurements worldwide show that nucleation rates in the atmospheric boundary layer are positively correlated with concentrations of sulfuric acid vapor. However, current nucleation theories do not correctly predict either the observed nucleation rates or their functional dependence on sulfuric acid concentrations. This paper develops an alternative approach for modeling nucleation rates, based on a sequence of acid–base reactions. The model uses empirical estimates of sulfuric acid evaporation rates obtained from new measurements of neutral molecular clusters. The model predicts that nucleation rates equal the sulfuric acid vapor collision rate times a prefactor that is less than unity and that depends on the concentrations of basic gaseous compounds and preexisting particles. Predicted nucleation rates and their dependence on sulfuric acid vapor concentrations are in reasonable agreement with measurements from Mexico City and Atlanta. PMID:23091030

  6. Hierarchical assembly of viral nanotemplates with encoded microparticles via nucleic acid hybridization.

    PubMed

    Tan, Wui Siew; Lewis, Christina L; Horelik, Nicholas E; Pregibon, Daniel C; Doyle, Patrick S; Yi, Hyunmin

    2008-11-04

    We demonstrate hierarchical assembly of tobacco mosaic virus (TMV)-based nanotemplates with hydrogel-based encoded microparticles via nucleic acid hybridization. TMV nanotemplates possess a highly defined structure and a genetically engineered high density thiol functionality. The encoded microparticles are produced in a high throughput microfluidic device via stop-flow lithography (SFL) and consist of spatially discrete regions containing encoded identity information, an internal control, and capture DNAs. For the hybridization-based assembly, partially disassembled TMVs were programmed with linker DNAs that contain sequences complementary to both the virus 5' end and a selected capture DNA. Fluorescence microscopy, atomic force microscopy (AFM), and confocal microscopy results clearly indicate facile assembly of TMV nanotemplates onto microparticles with high spatial and sequence selectivity. We anticipate that our hybridization-based assembly strategy could be employed to create multifunctional viral-synthetic hybrid materials in a rapid and high-throughput manner. Additionally, we believe that these viral-synthetic hybrid microparticles may find broad applications in high capacity, multiplexed target sensing.

  7. An Amino Acid Code for β-sheet Packing Structure

    PubMed Central

    Joo, Hyun; Tsai, Jerry

    2014-01-01

    To understand the relationship between protein sequence and structure, this work extends the knob-socket model in an investigation of β-sheet packing. Over a comprehensive set of β-sheet folds, the contacts between residues were used to identify packing cliques: sets of residues that all contact each other. These packing cliques were then classified based on size and contact order. From this analysis, the 2 types of 4 residue packing cliques necessary to describe β-sheet packing were characterized. Both occur between 2 adjacent hydrogen bonded β-strands. First, defining the secondary structure packing within β-sheets, the combined socket or XY:HG pocket consists of 4 residues i,i+2 on one strand and j,j+2 on the other. Second, characterizing the tertiary packing between β-sheets, the knob-socket XY:H+B consists of a 3 residue XY:H socket (i,i+2 on one strand and j on the other) packed against a knob B residue (residue k distant in sequence). Depending on the packing depth of the knob B residue, 2 types of knob-sockets are found: side-chain and main-chain sockets. The amino acid composition of the pockets and knob-sockets reveals the sequence specificity of β-sheet packing. For β-sheet formation, the XY:HG pocket clearly shows sequence specificity of amino acids. For tertiary packing, the XY:H+B side-chain and main-chain sockets exhibit distinct amino acid preferences at each position. These relationships define an amino acid code for β-sheet structure and provide an intuitive topological mapping of β-sheet packing. PMID:24668690
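
    A packing clique, as defined in this abstract, is a set of residues that all contact each other, which corresponds to a clique in the residue contact graph. The sketch below enumerates maximal cliques with the basic Bron-Kerbosch recursion. This is a generic graph technique rather than the authors' stated procedure, and the residue numbers and contact list are hypothetical.

```python
def maximal_cliques(adj):
    """Yield every maximal clique of an undirected graph {node: set(neighbours)}."""
    def expand(r, p, x):
        # r: current clique, p: candidates to add, x: already-explored nodes
        if not p and not x:
            yield frozenset(r)
            return
        for v in list(p):
            yield from expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}
    yield from expand(set(), set(adj), set())


# Hypothetical contacts: residues 10, 12 (one strand) and 31, 33 (its
# neighbour) are mutually in contact; residue 45 touches only 10 and 12.
contacts = [(10, 12), (10, 31), (10, 33), (12, 31), (12, 33), (31, 33),
            (10, 45), (12, 45)]
adj = {}
for a, b in contacts:
    adj.setdefault(a, set()).add(b)
    adj.setdefault(b, set()).add(a)

cliques = sorted(sorted(c) for c in maximal_cliques(adj))
print(cliques)  # [[10, 12, 31, 33], [10, 12, 45]]
```

    In this toy graph the 4-residue clique has the i, i+2 / j, j+2 shape described for the XY:HG pocket, and the 3-residue clique stands in for a smaller socket.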

  8. Method of increasing conversion of a fatty acid to its corresponding dicarboxylic acid

    DOEpatents

    Craft, David L.; Wilson, C. Ron; Eirich, Dudley; Zhang, Yeyan

    2004-09-14

    A nucleic acid sequence including a CYP promoter operably linked to nucleic acid encoding a heterologous protein is provided to increase transcription of the nucleic acid. Expression vectors and host cells containing the nucleic acid sequence are also provided. The methods and compositions described herein are especially useful in the production of polycarboxylic acids by yeast cells.

  9. Variations in gut microbiota and fecal metabolic phenotype associated with depression by 16S rRNA gene sequencing and LC/MS-based metabolomics.

    PubMed

    Yu, Meng; Jia, Hongmei; Zhou, Chao; Yang, Yong; Zhao, Yang; Yang, Maohua; Zou, Zhongmei

    2017-05-10

    As a prevalent, life-threatening and highly recurrent psychiatric illness, depression is characterized by a wide range of pathological changes; however, its etiology remains incompletely understood. Accumulating evidence supports that gut microbiota affects not only gastrointestinal physiology but also central nervous system (CNS) function and behavior through the microbiota-gut-brain axis. To assess the impact of gut microbiota on fecal metabolic phenotype in depressive conditions, an integrated approach of 16S rRNA gene sequencing combined with ultra high-performance liquid chromatography-mass spectrometry (UHPLC-MS)-based metabolomics was applied to a chronic variable stress (CVS)-induced rat model of depression. Interestingly, depression led to significant gut microbiota changes at the phylum and genus levels in rats treated with CVS compared to controls. The relative abundances of the bacterial genera Marvinbryantia, Corynebacterium, Psychrobacter, Christensenella, Lactobacillus, Peptostreptococcaceae incertae sedis, Anaerovorax, Clostridiales incertae sedis and Coprococcus were significantly decreased, whereas Candidatus Arthromitus and Oscillibacter were markedly increased in model rats compared with normal controls. Meanwhile, distinct changes in fecal metabolic phenotype of depressive rats were also found, including lower levels of amino acids and fatty acids, and higher amounts of bile acids, hypoxanthine and stercobilins. Moreover, there were substantial associations of perturbed gut microbiota genera with the altered fecal metabolites, especially compounds involved in the metabolism of tryptophan and bile acids. These results showed that the gut microbiota was altered in association with fecal metabolism in depressive conditions. These findings suggest that the 16S rRNA gene sequencing and LC-MS based metabolomics approach can be further applied to assess pathogenesis of depression. Copyright © 2017 Elsevier B.V. All rights reserved.

  10. Genetic diversity and phylogenetic analysis of Aleutian mink disease virus isolates in north-east China.

    PubMed

    Leng, Xue; Liu, Dongxu; Li, Jianming; Shi, Kun; Zeng, Fanli; Zong, Ying; Liu, Yi; Sun, Zhibo; Zhang, Shanshan; Liu, Yadong; Du, Rui

    2018-05-01

    Aleutian mink disease is the most important disease in the mink-farming industry worldwide. So far, few large-scale molecular epidemiological studies of AMDV, based on the NS1 and VP2 genes, have been conducted in China. Here, eight new Chinese isolates of AMDV from three provinces in north-east China were analyzed to clarify the molecular epidemiology of AMDV. The seroprevalence of AMDV in north-east China was 41.8% according to counterimmuno-electrophoresis. Genetic variation analysis of the eight isolates showed significant non-synonymous substitutions in the NS1 and VP2 genes, especially in the NS1 gene. All eight isolates included the caspase-recognition sequence NS1:285 (DQTD↓S), but not the caspase recognition sequence NS1:227 (INTD↓S). The LN1 and LN2 strains had a new 10-amino-acid deletion in-between amino acids 28-37, while the JL3 strain had a one-amino-acid deletion at position 28 in the VP2 protein, compared with the AMDV-G strain. Phylogenetic analysis based on most of NS1 (1755 bp) and complete VP2 showed that the AMDV genotypes did not cluster according to their pathogenicity or geographic origin. Local and imported AMDV species are all prevalent in mink-farming populations in the north-east of China. This is the first study to report the molecular epidemiology of AMDV in north-east China based on most of NS1 and the complete VP2, and further provides information about polyG deletions and new variations in the amino acid sequences of NS1 and VP2 proteins. This report is a good foundation for further study of AMDV in China.

  11. Detection and molecular characterization of infectious bronchitis virus isolated from recent outbreaks in broiler flocks in Thailand.

    PubMed

    Pohuang, Tawatchai; Chansiripornchai, Niwat; Tawatsin, Achara; Sasipreeyajan, Jiroj

    2009-09-01

    Thirteen field isolates of infectious bronchitis virus (IBV) were isolated from broiler flocks in Thailand between January and June 2008. The 878-bp of the S1 gene covering a hypervariable region was amplified and sequenced. Phylogenetic analysis based on that region revealed that these viruses were separated into two groups (I and II). IBV isolates in group I were not related to other IBV strains published in the GenBank database. Group I nucleotide sequence identities were less than 85% and amino acid sequence identities less than 84% in common with IBVs published in the GenBank database. This group likely represents the strains indigenous to Thailand. The isolates in group II showed a close relationship with Chinese IBVs. They had nucleotide sequence identities of 97-98% and amino acid sequence identities of 96-98% in common with Chinese IBVs (strain A2, SH and QXIBV). This finding indicated that the recent Thai IBVs evolved separately and at least two groups of viruses are circulating in Thailand.
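
    Percent sequence identity figures such as those quoted in this abstract are computed from aligned sequences. The sketch below shows one common convention, counting matches over columns where neither sequence has a gap. The abstract does not say which denominator the authors used, so this choice is an assumption, and the example sequences are made up.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent identity of two pre-aligned sequences of equal length.

    Columns containing a gap ('-') in either sequence are skipped.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no comparable (ungapped) columns")
    matches = sum(x == y for x, y in pairs)
    return 100.0 * matches / len(pairs)


# Two toy 10-nt sequences differing at two positions.
identity = percent_identity("ATGCTAGCTA", "ATGCTTGCAA")
print(identity)  # 80.0
```

    The same function works for aligned amino acid sequences, since it only compares characters column by column.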

  12. A putative carbohydrate-binding domain of the lactose-binding Cytisus sessilifolius anti-H(O) lectin has a similar amino acid sequence to that of the L-fucose-binding Ulex europaeus anti-H(O) lectin.

    PubMed

    Konami, Y; Yamamoto, K; Osawa, T; Irimura, T

    1995-04-01

    The complete amino acid sequence of a lactose-binding Cytisus sessilifolius anti-H(O) lectin II (CSA-II) was determined using a protein sequencer. After digestion of CSA-II with endoproteinase Lys-C or Asp-N, the resulting peptides were purified by reversed-phase high performance liquid chromatography (HPLC) and then subjected to sequence analysis. Comparison of the complete amino acid sequence of CSA-II with the sequences of other leguminous seed lectins revealed regions of extensive homology. The amino acid sequence of a putative carbohydrate-binding domain of CSA-II was found to be similar to those of several anti-H(O) leguminous lectins, especially to that of the L-fucose-binding Ulex europaeus lectin I (UEA-I).

  13. Aminocella lysinolytica gen. nov., sp. nov., a L-lysine-degrading, strictly anaerobic bacterium in the class Clostridia isolated from a methanogenic reactor of cattle farms.

    PubMed

    Ueki, Atsuko; Shibuya, Toru; Kaku, Nobuo; Ueki, Katsuji

    2015-01-01

    A strictly anaerobic bacterial strain (WN037(T)) was isolated from a methanogenic reactor. Cells were Gram-positive rods. Strain WN037(T) was asaccharolytic. The strain fermented L-lysine in the presence of B-vitamin mixture or vitamin B12 and produced acetate and butyrate. L-arginine and casamino acids poorly supported the growth. Strain WN037(T) used neither other amino acids nor organic acids examined. The strain had C18:1 ω7c, C16:0 and C18:1 ω7c DMA as the predominant cellular fatty acids. The genomic DNA G + C content was 44.2 mol %. Phylogenetic analysis based on the 16S rRNA gene sequence placed strain WN037(T) in the family Eubacteriaceae in the class Clostridia. The closest relative was Eubacterium pyruvativorans (sequence similarity, 92.8 %). Based on the comprehensive analyses, the novel genus and species, Aminocella lysinolytica gen. nov., sp. nov. was proposed to accommodate the strain. The type strain is WN037(T) (= JCM 19863(T) = DSM 28287(T)).
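
    The genomic G + C content reported in this abstract is the percentage of guanine and cytosine among all bases. The abstract does not state how it was measured, so the sketch below only shows how the quantity is computed from a sequence, on a made-up fragment rather than real data from the strain.

```python
def gc_content(seq: str) -> float:
    """Return the G+C percentage of a DNA sequence.

    Only unambiguous bases (A, C, G, T) are counted; other symbols are skipped.
    """
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no A/C/G/T bases")
    gc = sum(b in "GC" for b in bases)
    return 100.0 * gc / len(bases)


# Toy 10-nt fragment with four G/C bases.
toy_gc = gc_content("ATGCGCATTA")
print(toy_gc)  # 40.0
```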

  14. The ABC transporter Rv1272c of Mycobacterium tuberculosis enhances the import of long-chain fatty acids in Escherichia coli.

    PubMed

    Martin, Audrey; Daniel, Jaiyanth

    2018-02-05

    Mycobacterium tuberculosis (Mtb), which causes tuberculosis, is capable of accumulating triacylglycerol (TAG) by utilizing fatty acids from host cells. ATP-binding cassette (ABC) transporters are involved in transport processes in all organisms. Among the classical ABC transporters in Mtb none have been implicated in fatty acid import. Since the transport of fatty acids from the host cell is important for dormancy-associated TAG synthesis in the pathogen, mycobacterial ABC transporter(s) could potentially be involved in this process. Based on sequence identities with a bacterial ABC transporter that mediates fatty acid import for TAG synthesis, we identified Rv1272c, a hitherto uncharacterized ABC-transporter in Mtb that also shows sequence identities with a plant ABC transporter involved in fatty acid transport. We expressed Rv1272c in E. coli and show that it enhances the import of radiolabeled fatty acids. We also show that Rv1272c causes a significant increase in the metabolic incorporation of radiolabeled long-chain fatty acids into cardiolipin, a tetra-acylated phospholipid, and phosphatidylglycerol in E. coli. This is the first report on the function of Rv1272c showing that it displays a long-chain fatty acid transport function. Copyright © 2018 Elsevier Inc. All rights reserved.

  15. Development of SSR Markers Linked to Low Hydrocyanic Acid Content in Sorghum-Sudan Grass Hybrid Based on BSA Method.

    PubMed

    Xiao-Xia, Yu; Zhi-Hua, Liu; Zhuo, Yu; Yue, Shi; Xiao-Yu, Li

    2016-01-01

    Sorghum-Sudan grass hybrid containing high hydrocyanic acid content can cause hydrocyanic acid poisoning in livestock and limit the popularization of this forage crop. Molecular markers associated with low hydrocyanic acid content can speed up the identification of genotypes with low hydrocyanic acid content. In the present study, 11 polymorphic SSR primers were screened and used for bulked segregant analysis (BSA) and single marker analysis. Three SSR markers, Xtxp7230, Xtxp7375 and Bnlg667960, associated with low hydrocyanic acid content were rapidly identified by BSA. In single marker analysis, six markers, Xtxp7230, Xtxp7375, Bnlg667960, Xtxp67-11, Xtxp295-7 and Xtxp12-9, were linked to low hydrocyanic acid content and explained from 7.6 % to 41.2 % of the phenotypic variation. The markers identified by BSA were also verified by single marker analysis. The three SSR marker bands were then cloned and sequenced for sequence homology analysis in NCBI. This is the first report on the development of molecular markers associated with low hydrocyanic acid content in sorghum-Sudan grass hybrid. These markers will be useful for genetic improvement of low hydrocyanic acid sorghum-Sudan grass hybrid by marker-assisted breeding.

  16. A modular DNA signal translator for the controlled release of a protein by an aptamer.

    PubMed

    Beyer, Stefan; Simmel, Friedrich C

    2006-01-01

    Owing to the intimate linkage of sequence and structure in nucleic acids, DNA is an extremely attractive molecule for the development of molecular devices, in particular when a combination of information processing and chemomechanical tasks is desired. Many of the previously demonstrated devices are driven by hybridization between DNA 'effector' strands and specific recognition sequences on the device. For applications it is of great interest to link several of such molecular devices together within artificial reaction cascades. Often it will not be possible to choose DNA sequences freely, e.g. when functional nucleic acids such as aptamers are used. In such cases translation of an arbitrary 'input' sequence into a desired effector sequence may be required. Here we demonstrate a molecular 'translator' for information encoded in DNA and show how it can be used to control the release of a protein by an aptamer using an arbitrarily chosen DNA input strand. The function of the translator is based on branch migration and the action of the endonuclease FokI. The modular design of the translator facilitates the adaptation of the device to various input or output sequences.

  17. A modular DNA signal translator for the controlled release of a protein by an aptamer

    PubMed Central

    Beyer, Stefan; Simmel, Friedrich C.

    2006-01-01

    Owing to the intimate linkage of sequence and structure in nucleic acids, DNA is an extremely attractive molecule for the development of molecular devices, in particular when a combination of information processing and chemomechanical tasks is desired. Many of the previously demonstrated devices are driven by hybridization between DNA ‘effector’ strands and specific recognition sequences on the device. For applications it is of great interest to link several of such molecular devices together within artificial reaction cascades. Often it will not be possible to choose DNA sequences freely, e.g. when functional nucleic acids such as aptamers are used. In such cases translation of an arbitrary ‘input’ sequence into a desired effector sequence may be required. Here we demonstrate a molecular ‘translator’ for information encoded in DNA and show how it can be used to control the release of a protein by an aptamer using an arbitrarily chosen DNA input strand. The function of the translator is based on branch migration and the action of the endonuclease FokI. The modular design of the translator facilitates the adaptation of the device to various input or output sequences. PMID:16547201

  18. Molecular characterization of chikungunya virus from Andhra Pradesh, India & phylogenetic relationship with Central African isolates.

    PubMed

    M Naresh Kumar, C V; Anthony Johnson, A M; R Sai Gopal, D V

    2007-12-01

    Chikungunya virus has caused numerous large outbreaks in India. Blood samples from suspected cases in the epidemic were collected from the Rayalaseema region of Andhra Pradesh and characterized to identify the causative agent. RT-PCR was used for screening of the suspected blood samples. Primers were designed to amplify a partial E1 gene, and the amplified fragment was cloned and sequenced. The sequence was analyzed and compared with other geographical isolates to determine the phylogenetic relationship. The sequence was submitted to the GenBank DNA database (accession DQ888620). Comparative nucleotide homology analysis of the AP Ra-CTR isolate with the other isolates revealed 94.7 ± 3.6 per cent homology of CHIKAPRa-CTR with other isolates of Chikungunya virus at the nucleotide level and 96.8 ± 3.2 per cent homology at the amino acid level. The current epidemic was caused by the Central African genotype of CHIKV, grouped in the Central Africa cluster in phylogenetic trees generated based on nucleotide and amino acid sequences.

  19. RNAHelix: computational modeling of nucleic acid structures with Watson-Crick and non-canonical base pairs.

    PubMed

    Bhattacharyya, Dhananjay; Halder, Sukanya; Basu, Sankar; Mukherjee, Debasish; Kumar, Prasun; Bansal, Manju

    2017-02-01

    Comprehensive analyses of structural features of non-canonical base pairs within a nucleic acid double helix are limited by the availability of a small number of three dimensional structures. Therefore, a procedure for model building of double helices containing any given nucleotide sequence and base pairing information, either canonical or non-canonical, is seriously needed. Here we describe a program RNAHelix, which is an updated version of our widely used software, NUCGEN. The program can regenerate duplexes using the dinucleotide step and base pair orientation parameters for a given double helical DNA or RNA sequence with defined Watson-Crick or non-Watson-Crick base pairs. The original structure and the corresponding regenerated structure of double helices were found to be very close, as indicated by the small RMSD values between positions of the corresponding atoms. Structures of several usual and unusual double helices have been regenerated and compared with their original structures in terms of base pair RMSD, torsion angles and electrostatic potentials, and very high agreement has been noted. RNAHelix can also be used to generate a structure with a sequence completely different from an experimentally determined one or to introduce single to multiple mutations, but with the same set of parameters, and hence can also be an important tool in homology modeling and the study of mutation-induced structural changes.
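
    The RMSD between corresponding atoms, used in this abstract to compare original and regenerated structures, can be sketched as follows. The sketch assumes the two coordinate sets are already superimposed and listed in the same atom order; it performs no fitting, and the coordinates are invented for illustration.

```python
from math import sqrt


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equal-length lists of (x, y, z)."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return sqrt(squared / len(coords_a))


# Two toy "structures": the second is the first shifted by 1 unit along z.
original = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
regenerated = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
deviation = rmsd(original, regenerated)
print(deviation)  # 1.0
```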

  20. PubDNA Finder: a web database linking full-text articles to sequences of nucleic acids.

    PubMed

    García-Remesal, Miguel; Cuevas, Alejandro; Pérez-Rey, David; Martín, Luis; Anguita, Alberto; de la Iglesia, Diana; de la Calle, Guillermo; Crespo, José; Maojo, Víctor

    2010-11-01

    PubDNA Finder is an online repository that we have created to link PubMed Central manuscripts to the sequences of nucleic acids appearing in them. It extends the search capabilities provided by PubMed Central by enabling researchers to perform advanced searches involving sequences of nucleic acids. This includes, among other features (i) searching for papers mentioning one or more specific sequences of nucleic acids and (ii) retrieving the genetic sequences appearing in different articles. These additional query capabilities are provided by a searchable index that we created by using the full text of the 176 672 papers available at PubMed Central at the time of writing and the sequences of nucleic acids appearing in them. To automatically extract the genetic sequences occurring in each paper, we used an original method we have developed. The database is updated monthly by automatically connecting to the PubMed Central FTP site to retrieve and index new manuscripts. Users can query the database via the web interface provided. PubDNA Finder can be freely accessed at http://servet.dia.fi.upm.es:8080/pubdnafinder
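
    The abstract does not describe how the sequences are extracted from article text. Purely as an illustration of the general idea, and not as the authors' method, the sketch below pulls candidate DNA sequences out of free text with a regular expression. The minimum length of 12 and the restriction to the letters A, C, G and T are arbitrary assumptions.

```python
import re


def find_sequences(text: str, min_len: int = 12):
    """Return runs of A/C/G/T at least min_len long that stand alone as words."""
    pattern = re.compile(r"\b[ACGT]{%d,}\b" % min_len)
    return pattern.findall(text)


# Invented sentence: the primer is long enough to match, 'CAT' is not.
sample = "The forward primer 5'-ATGGCGTACGTTAGC-3' was used with CAT buffer."
hits = find_sequences(sample)
print(hits)  # ['ATGGCGTACGTTAGC']
```

    A real extractor would also need to handle sequences broken across lines, lowercase or RNA alphabets, and ordinary words that happen to use only these letters.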

  1. Computational analysis of sequence selection mechanisms.

    PubMed

    Meyerguz, Leonid; Grasso, Catherine; Kleinberg, Jon; Elber, Ron

    2004-04-01

    Mechanisms leading to gene variations are responsible for the diversity of species and are important components of the theory of evolution. One constraint on gene evolution is that of protein foldability; the three-dimensional shapes of proteins must be thermodynamically stable. We explore the impact of this constraint and calculate properties of foldable sequences using 3660 structures from the Protein Data Bank. We seek a selection function that receives sequences as input, and outputs survival probability based on sequence fitness to structure. We compute the number of sequences that match a particular protein structure with energy lower than the native sequence, the density of the number of sequences, the entropy, and the "selection" temperature. The mechanism of structure selection for sequences longer than 200 amino acids is approximately universal. For shorter sequences, it is not. We speculate on concrete evolutionary mechanisms that show this behavior.

  2. Structure-Specific Ribonucleases for MS-Based Elucidation of Higher-Order RNA Structure

    NASA Astrophysics Data System (ADS)

    Scalabrin, Matteo; Siu, Yik; Asare-Okai, Papa Nii; Fabris, Daniele

    2014-07-01

    Supported by high-throughput sequencing technologies, structure-specific nucleases are experiencing a renaissance as biochemical probes for genome-wide mapping of nucleic acid structure. This report explores the benefits and pitfalls of the application of Mung bean (Mb) and V1 nuclease, which attack specifically single- and double-stranded regions of nucleic acids, as possible structural probes to be employed in combination with MS detection. Both enzymes were found capable of operating in ammonium-based solutions that are preferred for high-resolution analysis by direct infusion electrospray ionization (ESI). Sequence analysis by tandem mass spectrometry (MS/MS) was performed to confirm mapping assignments and to resolve possible ambiguities arising from the concomitant formation of isobaric products with identical base composition and different sequences. The observed products grouped together into ladder-type series that facilitated their assignment to unique regions of the substrate, but revealed also a certain level of uncertainty in identifying the boundaries between paired and unpaired regions. Various experimental factors that are known to stabilize nucleic acid structure, such as higher ionic strength, presence of Mg(II), etc., increased the accuracy of cleavage information, but did not completely eliminate deviations from expected results. These observations suggest extreme caution in interpreting the results afforded by these types of reagents. Regardless of the analytical platform of choice, the results highlighted the need to repeat probing experiments under the most diverse possible conditions to recognize potential artifacts and to increase the level of confidence in the observed structural information.

  3. Rarimicrobium hominis gen. nov., sp. nov., representing the fifth genus in the phylum Synergistetes that includes human clinical isolates.

    PubMed

    Jumas-Bilak, Estelle; Bouvet, Philippe; Allen-Vercoe, Emma; Aujoulat, Fabien; Lawson, Paul A; Jean-Pierre, Hélène; Marchandin, Hélène

    2015-11-01

    Five human clinical isolates of an unknown, strictly anaerobic, slow-growing, Gram-stain-negative, rod-shaped micro-organism were subjected to a polyphasic taxonomic study. Comparative 16S rRNA gene sequence-based phylogeny showed that the isolates grouped in a clade that included members of the genera Pyramidobacter, Jonquetella, and Dethiosulfovibrio; the type strain of Pyramidobacter piscolens was the closest relative with 91.5-91.7 % 16S rRNA gene sequence similarity. The novel strains were mainly asaccharolytic and unreactive in most conventional biochemical tests. Major metabolic end products in trypticase/glucose/yeast extract broth were acetic acid and propionic acid and the major cellular fatty acids were C13 : 0 and C16 : 0, each of which could be used to differentiate the strains from P. piscolens. The DNA G+C content based on whole genome sequencing for the reference strain 22-5-S 12D6FAA was 57 mol%. Based on these data, a new genus, Rarimicrobium gen. nov., is proposed with one novel species, Rarimicrobium hominis sp. nov., named after the exclusive and rare finding of the taxon in human samples. Rarimicrobium is the fifth genus of the 14 currently characterized in the phylum Synergistetes and the third one in subdivision B that includes human isolates. The type strain of Rarimicrobium hominis is ADV70T ( = LMG 28163T = CCUG 65426T).
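
    The G+C content quoted above (57 mol%) is the percentage of G and C bases among all bases in the genome. As a minimal sketch of that arithmetic, assuming a plain string of A/C/G/T characters, the calculation might look as follows; the function name and the example sequence are hypothetical and are not taken from the study.

```python
def gc_content_mol_percent(sequence: str) -> float:
    """Return G+C content as a percentage of the unambiguous A/C/G/T bases."""
    seq = sequence.upper()
    # Count only unambiguous bases; characters such as N are ignored.
    counts = {base: seq.count(base) for base in "ACGT"}
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T bases")
    return 100.0 * (counts["G"] + counts["C"]) / total


if __name__ == "__main__":
    example = "ATGCGCGTTAACCGGA"  # hypothetical sequence, for illustration only
    print(f"G+C content: {gc_content_mol_percent(example):.1f} mol%")
```

    For a whole genome the same ratio would be accumulated over all contigs rather than over a single string.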

  4. Mesonia hippocampi sp. nov., isolated from the brood pouch of a diseased Barbour's Seahorse (Hippocampus barbouri).

    PubMed

    Kolberg, Judy; Busse, Hans-Jürgen; Wilke, Thomas; Schubert, Patrick; Kämpfer, Peter; Glaeser, Stefanie P

    2015-07-01

    An orange-pigmented, Gram-staining-negative, rod-shaped bacterium, designated 96_Hippo_TS_3/13(T) was isolated from the brood pouch of a diseased seahorse male of the species Hippocampus barbouri from the animal facility of the University of Giessen, Germany. Phylogenetic analyses based on the nearly full-length 16S rRNA gene sequence placed strain 96_Hippo_TS_3/13(T) into the monophyletic cluster of the genus Mesonia within the family Flavobacteriaceae. However, the strain shared only 92.2-93.8% sequence similarity to type strains of species of the genus Mesonia, with highest sequence similarity to the type strain of Mesonia aquimarina. Cellular fatty acid analysis showed a Mesonia-typical fatty acid profile including several branched and hydroxyl fatty acids with highest amounts of iso-C15 : 0 (40.9%) followed by iso-C17 : 0 3-OH (14.8%). In the polyamine pattern, sym-homospermidine was predominant. The diagnostic diamino acid of the peptidoglycan was meso-diaminopimelic acid. The quinone system contained exclusively menaquinone MK-6. The only identified compound in the polar lipid profile was phosphatidylethanolamine present in major amounts. Additionally, major amounts of an unidentified aminolipid and two unidentified lipids not containing a phosphate group, an amino group or a sugar residue were detected. The genomic G+C content of strain 96_Hippo_TS_3/13(T) was 30 mol%. Based on genotypic, chemotaxonomic and physiological characterizations we propose a novel species of the genus Mesonia, Mesonia hippocampi sp. nov., with strain 96_Hippo_TS_3/13(T) ( = CIP 110839(T) = LMG 28572(T) = CCM 8557(T)) as the type strain. An emended description of the genus Mesonia is also provided.

  5. Phylogenetic analysis of β-xylanase SRXL1 of Sporisorium reilianum and its relationship with families (GH10 and GH11) of Ascomycetes and Basidiomycetes

    PubMed Central

    Álvarez-Cervantes, Jorge; Díaz-Godínez, Gerardo; Mercado-Flores, Yuridia; Gupta, Vijai Kumar; Anducho-Reyes, Miguel Angel

    2016-01-01

    In this paper, the amino acid sequence of the β-xylanase SRXL1 of Sporisorium reilianum, a pathogenic fungus of maize, was used as a model protein to determine its phylogenetic relationship with other xylanases of Ascomycetes and Basidiomycetes; the information obtained made it possible to establish a hypothesis of monophyly and of biological role. A total of 84 amino acid sequences of β-xylanase obtained from the GenBank database were used. Higher-level grouping analysis in the Pfam database showed that the proteins under study fall into the GH10 and GH11 families, based on the regions of highly conserved amino acids, 233–318 and 180–193 respectively, where glutamate residues are responsible for catalysis. PMID:27040368

  6. A machine-learning approach for predicting palmitoylation sites from integrated sequence-based features.

    PubMed

    Li, Liqi; Luo, Qifa; Xiao, Weidong; Li, Jinhui; Zhou, Shiwen; Li, Yongsheng; Zheng, Xiaoqi; Yang, Hua

    2017-02-01

    Palmitoylation is the covalent attachment of lipids to amino acid residues in proteins. As an important form of protein posttranslational modification, it increases the hydrophobicity of proteins, which contributes to protein transport, organelle localization, and function, and it therefore plays an important role in a variety of cell biological processes. Identification of palmitoylation sites is necessary for understanding protein-protein interaction, protein stability, and activity. Since conventional experimental techniques to determine palmitoylation sites in proteins are both labor intensive and costly, a fast and accurate computational approach to predict palmitoylation sites from protein sequences is urgently needed. In this study, a support vector machine (SVM)-based method was proposed that integrates PSI-BLAST profiles, physicochemical properties, [Formula: see text]-mer amino acid compositions (AACs), and [Formula: see text]-mer pseudo AACs into the principal feature vector. A recursive feature selection scheme was subsequently implemented to single out the most discriminative features. Finally, an SVM method was implemented to predict palmitoylation sites in proteins based on the optimal features. The proposed method achieved an accuracy of 99.41% and a Matthews Correlation Coefficient of 0.9773 on a benchmark dataset. The result indicates the efficiency and accuracy of our method in predicting palmitoylation sites from protein sequences.
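
    The two figures of merit quoted above, accuracy and the Matthews Correlation Coefficient (MCC), are both computed from the confusion matrix of a binary classifier. The sketch below shows the standard formulas only; the function names and the counts in the usage example are hypothetical and do not come from the study's benchmark.

```python
import math


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))."""
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        # Common convention: MCC is reported as 0 when a row or column is empty.
        return 0.0
    return (tp * tn - fp * fn) / denominator


if __name__ == "__main__":
    # Hypothetical confusion-matrix counts, for illustration only.
    tp, tn, fp, fn = 90, 95, 5, 10
    print(f"accuracy = {accuracy(tp, tn, fp, fn):.4f}")
    print(f"MCC      = {matthews_corrcoef(tp, tn, fp, fn):.4f}")
```

    Unlike accuracy, MCC stays informative when positive and negative sites are very unevenly represented, which is typical of modification-site datasets.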

  7. LenVarDB: database of length-variant protein domains.

    PubMed

    Mutt, Eshita; Mathew, Oommen K; Sowdhamini, Ramanathan

    2014-01-01

    Protein domains are functionally and structurally independent modules, which add to the functional variety of proteins. This array of functional diversity has been enabled by evolutionary changes, such as amino acid substitutions or insertions or deletions, occurring in these protein domains. Length variations (indels) can introduce changes at structural, functional and interaction levels. LenVarDB (freely available at http://caps.ncbs.res.in/lenvardb/) traces these length variations, starting from structure-based sequence alignments in our Protein Alignments organized as Structural Superfamilies (PASS2) database, across 731 structural classification of proteins (SCOP)-based protein domain superfamilies connected to 2 730 625 sequence homologues. Alignment of sequence homologues corresponding to a structural domain is available, starting from a structure-based sequence alignment of the superfamily. Orientation of the length-variant (indel) regions in protein domains can be visualized by mapping them on the structure and on the alignment. Knowledge about location of length variations within protein domains and their visual representation will be useful in predicting changes within structurally or functionally relevant sites, which may ultimately regulate protein function. Non-technical summary: Evolutionary changes bring about natural changes to proteins that may be found in many organisms. Such changes could be reflected as amino acid substitutions or insertions-deletions (indels) in protein sequences. LenVarDB is a database that provides an early overview of observed length variations that were set among 731 protein families and after examining >2 million sequences. Indels are followed up to observe if they are close to the active site such that they can affect the activity of proteins. Inclusion of such information can aid the design of bioengineering experiments.

  8. Nucleotide sequence analysis of the gene encoding the Deinococcus radiodurans surface protein, derived amino acid sequence, and complementary protein chemical studies

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Peters, J.; Peters, M.; Lottspeich, F.

    1987-11-01

    The complete nucleotide sequence of the gene encoding the surface (hexagonally packed intermediate (HPI))-layer polypeptide of Deinococcus radiodurans Sark was determined and found to encode a polypeptide of 1036 amino acids. Amino acid sequence analysis of about 30% of the residues revealed that the mature polypeptide consists of at least 978 amino acids. The N terminus was blocked to Edman degradation. The results of proteolytic modification of the HPI layer in situ and M_r estimations of the HPI polypeptide expressed in Escherichia coli indicated that there is a leader sequence. The N-terminal region contained a very high percentage (29%) of threonine and serine, including a cluster of nine consecutive serine or threonine residues, whereas a stretch near the C terminus was extremely rich in aromatic amino acids (29%). The protein contained at least two disulfide bridges, as well as tightly bound reducing sugars and fatty acids.

  9. Artificial mismatch hybridization

    DOEpatents

    Guo, Zhen; Smith, Lloyd M.

    1998-01-01

    An improved nucleic acid hybridization process is provided which employs a modified oligonucleotide and improves the ability to discriminate a control nucleic acid target from a variant nucleic acid target containing a sequence variation. The modified probe contains at least one artificial mismatch relative to the control nucleic acid target in addition to any mismatch(es) arising from the sequence variation. The invention has direct and advantageous application to numerous existing hybridization methods, including applications that employ, for example, the Polymerase Chain Reaction, allele-specific nucleic acid sequencing methods, and diagnostic hybridization methods.

  10. Detection and isolation of nucleic acid sequences using a bifunctional hybridization probe

    DOEpatents

    Lucas, Joe N.; Straume, Tore; Bogen, Kenneth T.

    2000-01-01

    A method for detecting and isolating a target sequence in a sample of nucleic acids is provided using a bifunctional hybridization probe capable of hybridizing to the target sequence that includes a detectable marker and a first complexing agent capable of forming a binding pair with a second complexing agent. A kit is also provided for detecting a target sequence in a sample of nucleic acids using a bifunctional hybridization probe according to this method.

  11. Peptide array-based interaction assay of solid-bound peptides and anchorage-dependant cells and its effectiveness in cell-adhesive peptide design.

    PubMed

    Kato, Ryuji; Kaga, Chiaki; Kunimatsu, Mitoshi; Kobayashi, Takeshi; Honda, Hiroyuki

    2006-06-01

    A peptide array, a designable peptide library covalently synthesized on a cellulose support, was applied to assay the interaction between solid-bound peptides and anchorage-dependent cells in order to study objective peptide design. As a model case, cell-adhesive peptides that could enhance cell growth as tissue engineering scaffold material were studied. On the peptide array, the relative cell-adhesion ratio of NIH/3T3 cells was 2.5-fold higher on the RGDS (Arg-Gly-Asp-Ser) peptide spot as compared to the spot with no peptide, thus indicating integrin-mediated peptide-cell interaction. Such strong cell adhesion mediated by the RGDS peptide was easily disrupted by single residue substitution on the peptide array, thus indicating that the sequence recognition accuracy of cells was strictly conserved in our optimized scheme. The observed cellular morphological extension with active actin stress-fiber on the RGD motif-containing peptide supported our strategy that the peptide array-based interaction assay of solid-bound peptides and anchorage-dependent cells (PIASPAC) could provide quantitative data on biological peptide-cell interaction. The analysis of 180 peptides obtained from fibronectin type III domain (no. 1447-1629) yielded 18 novel cell-adhesive peptides without the RGD motif. Taken together with the novel candidates, representative rules of ineffective amino acid usage were obtained from non-effective candidate sequences for the effective designing of cell-adhesive peptides. On comparing the amino acid usage of the top 20 and last 20 peptides from the 180 peptides, the following four brief design rules were indicated: (i) Arg or Lys of positively charged amino acids (except His) could enhance cell adhesion, (ii) small hydrophilic amino acids are favored in cell-adhesion peptides, (iii) negatively charged amino acids and small amino acids (except Gly) could reduce cell adhesion, and (iv) Cys and Met could be excluded from the sequence combination since they have less influence on the peptide design. Such rules, which are indicative of the nature of the functional peptide sequence, can be obtained only by the mass comparison analysis of PIASPAC using a peptide array. By following such indicative rules, numerous amino acid combinations can be effectively screened for further examination of novel peptide design.

  12. The primitive code and repeats of base oligomers as the primordial protein-encoding sequence.

    PubMed Central

    Ohno, S; Epplen, J T

    1983-01-01

    Even if the prebiotic self-replication of nucleic acids and the subsequent emergence of primitive, enzyme-independent tRNAs are accepted as plausible, the origin of life by spontaneous generation still appears improbable. This is because the just-emerged primitive translational machinery had to cope with base sequences that were not preselected for their coding potentials. Particularly if the primitive mitochondria-like code with four chain-terminating base triplets preceded the universal code, the translation of long, randomly generated, base sequences at this critical stage would have merely resulted in the production of short oligopeptides instead of long polypeptide chains. We present the base sequence of a mouse transcript containing tetranucleotide repeats conserved during evolution. Even if translated in accordance with the primitive mitochondria-like code, this transcript in its three reading frames can yield 245-, 246-, and 251-residue-long tetrapeptidic periodical polypeptides that are already acquiring longer periodicities. We contend that the first set of base sequences translated at the beginning of life were such oligonucleotide repeats. By quickly acquiring longer periodicities, their products must have soon gained characteristic secondary structures--alpha-helical or beta-sheet or both. PMID:6574491
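
    The periodicity argument above can be illustrated with a small translation exercise: because 4 and 3 share no common factor, a tetranucleotide repeat read in any of the three reading frames yields a polypeptide that repeats every four residues. The sketch below is an illustration under stated assumptions, not the authors' analysis: it uses the standard genetic code rather than the primitive mitochondria-like code discussed in the abstract, and the repeat unit CAGG is a hypothetical example chosen so that no frame contains a stop codon.

```python
# Standard genetic code, built from the conventional TCAG-ordered table.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna: str, frame: int = 0) -> str:
    """Translate a DNA string in reading frame 0, 1 or 2 ('*' marks a stop)."""
    seq = dna.upper()[frame:]
    usable = len(seq) - len(seq) % 3  # drop an incomplete trailing codon
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


if __name__ == "__main__":
    repeat = "CAGG" * 6  # hypothetical tetranucleotide repeat
    for frame in range(3):
        print(frame, translate(repeat, frame))
```

    Each frame gives the same four-residue period, shifted by one position, which is the "tetrapeptidic periodical polypeptide" behaviour the abstract describes.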

  13. Sequence Alignment to Predict Across Species Susceptibility ...

    EPA Pesticide Factsheets

    Conservation of a molecular target across species can be used as a line-of-evidence to predict the likelihood of chemical susceptibility. The web-based Sequence Alignment to Predict Across Species Susceptibility (SeqAPASS) tool was developed to simplify, streamline, and quantitatively assess protein sequence/structural similarity across taxonomic groups as a means to predict relative intrinsic susceptibility. The intent of the tool is to allow for evaluation of any potential protein target, so it is amenable to variable degrees of protein characterization, depending on available information about the chemical/protein interaction and the molecular target itself. To allow for flexibility in the analysis, a layered strategy was adopted for the tool. The first level of the SeqAPASS analysis compares primary amino acid sequences to a query sequence, calculating a metric for sequence similarity (including detection of candidate orthologs), the second level evaluates sequence similarity within selected domains (e.g., ligand-binding domain, DNA binding domain), and the third level of analysis compares individual amino acid residue positions identified as being of importance for protein conformation and/or ligand binding upon chemical perturbation. Each level of the SeqAPASS analysis provides increasing evidence to apply toward rapid, screening-level assessments of probable cross species susceptibility. Such analyses can support prioritization of chemicals for further ev

  14. G-quadruplex prediction in E. coli genome reveals a conserved putative G-quadruplex-Hairpin-Duplex switch.

    PubMed

    Kaplan, Oktay I; Berber, Burak; Hekim, Nezih; Doluca, Osman

    2016-11-02

    Many studies show that short non-coding sequences are widely conserved among regulatory elements. More and more conserved sequences are being discovered since the development of next generation sequencing technology. A common approach to identify conserved sequences with regulatory roles relies on topological changes such as hairpin formation at the DNA or RNA level. G-quadruplexes, non-canonical nucleic acid topologies with little established biological roles, are increasingly considered for conserved regulatory element discovery. Since the tertiary structure of G-quadruplexes is strongly dependent on the loop sequence which is disregarded by the generally accepted algorithm, we hypothesized that G-quadruplexes with similar topology and, indirectly, similar interaction patterns, can be determined using phylogenetic clustering based on differences in the loop sequences. Phylogenetic analysis of 52 G-quadruplex forming sequences in the Escherichia coli genome revealed two conserved G-quadruplex motifs with a potential regulatory role. Further analysis revealed that both motifs tend to form hairpins and G quadruplexes, as supported by circular dichroism studies. The phylogenetic analysis as described in this work can greatly improve the discovery of functional G-quadruplex structures and may explain unknown regulatory patterns. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
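
    The abstract does not spell out the "generally accepted algorithm" it refers to. A widely used pattern for putative G-quadruplexes is four runs of at least three guanines separated by loops of one to seven bases, and the sketch below assumes that pattern. The function name and the test sequence are hypothetical, and only the given strand is scanned.

```python
import re

# Assumed motif: G3+ N1-7 G3+ N1-7 G3+ N1-7 G3+ (loop composition is not constrained).
G4_PATTERN = re.compile(r"G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}")


def find_putative_g4(sequence: str) -> list[tuple[int, str]]:
    """Return (start, motif) for non-overlapping matches on the given strand."""
    return [(m.start(), m.group()) for m in G4_PATTERN.finditer(sequence.upper())]


if __name__ == "__main__":
    example = "TTGGGAGGGTGGGCGGGTT"  # hypothetical sequence, for illustration only
    for start, motif in find_putative_g4(example):
        print(start, motif)
```

    Because the pattern accepts any loop sequence, two hits with very different loops look identical to it, which is the limitation that motivates clustering by loop sequence in the work above.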

  15. Capturing the genetic makeup of the active microbiome in situ

    DOE PAGES

    Singer, Esther; Wagner, Michael; Woyke, Tanja

    2017-06-02

    More than any other technology, nucleic acid sequencing has enabled microbial ecology studies to be complemented with the data volumes necessary to capture the extent of microbial diversity and dynamics in a wide range of environments. In order to truly understand and predict environmental processes, however, the distinction between active, inactive and dead microbial cells is critical. Also, experimental designs need to be sensitive toward varying population complexity and activity, and temporal as well as spatial scales of process rates. There are a number of approaches, including single-cell techniques, which were designed to study in situ microbial activity and that have been successively coupled to nucleic acid sequencing. The exciting new discoveries regarding in situ microbial activity provide evidence that future microbial ecology studies will indispensably rely on techniques that specifically capture members of the microbiome active in the environment. Herein, we review those currently used activity-based approaches that can be directly linked to shotgun nucleic acid sequencing, evaluate their relevance to ecology studies, and discuss future directions.

  16. Capturing the genetic makeup of the active microbiome in situ

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Singer, Esther; Wagner, Michael; Woyke, Tanja

    More than any other technology, nucleic acid sequencing has enabled microbial ecology studies to be complemented with the data volumes necessary to capture the extent of microbial diversity and dynamics in a wide range of environments. In order to truly understand and predict environmental processes, however, the distinction between active, inactive and dead microbial cells is critical. Also, experimental designs need to be sensitive toward varying population complexity and activity, and temporal as well as spatial scales of process rates. There are a number of approaches, including single-cell techniques, which were designed to study in situ microbial activity and that have been successively coupled to nucleic acid sequencing. The exciting new discoveries regarding in situ microbial activity provide evidence that future microbial ecology studies will indispensably rely on techniques that specifically capture members of the microbiome active in the environment. Herein, we review those currently used activity-based approaches that can be directly linked to shotgun nucleic acid sequencing, evaluate their relevance to ecology studies, and discuss future directions.

  17. Characterization of the genetic elements required for site-specific integration of plasmid pSE211 in Saccharopolyspora erythraea.

    PubMed Central

    Brown, D P; Idler, K B; Katz, L

    1990-01-01

    The 18.1-kilobase plasmid pSE211 integrates into the chromosome of Saccharopolyspora erythraea at a specific attB site. Restriction analysis of the integrated plasmid, pSE211int, and adjacent chromosomal sequences allowed identification of attP, the plasmid attachment site. Nucleotide sequencing of attP, attB, attL, and attR revealed a 57-base-pair sequence common to all sites with no duplications of adjacent plasmid or chromosomal sequences in the integrated state, indicating that integration takes place through conservative, reciprocal strand exchange. An analysis of the sequences indicated the presence of a putative gene for Phe-tRNA at attB which is preserved at attL after integration has occurred. A comparison of the attB site for a number of actinomycete plasmids is presented. Integration at attB was also observed when a 2.4-kilobase segment of pSE211 containing attP and the adjacent plasmid sequence was used to transform a pSE211- host. Nucleotide sequencing of this segment revealed the presence of two complete open reading frames (ORFs) and a segment of a third ORF. The ORF adjacent to attP encodes a putative polypeptide 437 amino acids in length that shows similarity, at its C-terminal domain, to sequences of site-specific recombinases of the integrase family. The adjacent ORF encodes a putative 98-amino-acid basic polypeptide that contains a helix-turn-helix motif at its N terminus which corresponds to domains in the Xis proteins of a number of bacteriophages. A proposal for the function of this polypeptide is presented. The deduced amino acid sequence of the third ORF did not reveal similarities to polypeptide sequences in the current data banks. PMID:2180909

  18. Computational identification of epitopes in the glycoproteins of novel bunyavirus (SFTS virus) recognized by a human monoclonal antibody (MAb 4-5)

    NASA Astrophysics Data System (ADS)

    Zhang, Wenshuai; Zeng, Xiaoyan; Zhang, Li; Peng, Haiyan; Jiao, Yongjun; Zeng, Jun; Treutlein, Herbert R.

    2013-06-01

    In this work, we have developed a new approach to predict the epitopes of antigens that are recognized by a specific antibody. Our method is based on the "multiple copy simultaneous search" (MCSS) approach, which identifies optimal locations of small chemical functional groups on the surface of the antibody, and on identifying sequence patterns of peptides that can bind to that surface. The identified sequence patterns are then used to search the amino-acid sequence of the antigen protein. The approach was validated by reproducing the binding epitope of the HIV gp120 envelope glycoprotein for the human neutralizing antibody as revealed in the available crystal structure. Our method was then applied to predict the epitopes of two glycoproteins of a newly discovered bunyavirus recognized by an antibody named MAb 4-5. These predicted epitopes can be verified by experimental methods. We also discuss the involvement of different amino acids in the antigen-antibody recognition based on the distributions of MCSS minima of different functional groups.

  19. Identification of Biomolecular Building Blocks by Recognition Tunneling: Stride towards Nanopore Sequencing of Biomolecules

    NASA Astrophysics Data System (ADS)

    Sen, Suman

    DNA, RNA and protein are three pivotal biomolecules in humans and other organisms, playing decisive roles in functionality, appearance, disease development and other physiological phenomena. Hence, sequencing of these biomolecules is of prime interest to the scientific community. Single-molecule identification of their building blocks can be done by a technique called Recognition Tunneling (RT), based on the Scanning Tunneling Microscope (STM). A single layer of specially designed recognition molecules is attached to the STM electrodes, which trap the targeted molecules (DNA nucleoside monophosphates, RNA nucleoside monophosphates or amino acids) inside the STM nanogap. Depending on their different binding interactions with the recognition molecules, the analyte molecules generate stochastic signal trains carrying their "electronic fingerprints". Signal features are used to detect the molecules using a machine learning algorithm, and different molecules can be identified with significantly high accuracy. This, in turn, paves the way for a rapid, economical nanopore sequencing platform, overcoming the drawbacks of Next Generation Sequencing (NGS) techniques. To read DNA nucleotides with high accuracy in an STM tunnel junction, a series of nitrogen-based heterocycles were designed and examined to check their capabilities to interact with naturally occurring DNA nucleotides by hydrogen bonding in the tunnel junction. These recognition molecules are Benzimidazole, Imidazole, Triazole and Pyrrole. Benzimidazole proved to be the best among them, showing DNA nucleotide classification accuracy close to 99%. Also, the Imidazole reader can read an abasic monophosphate (AP), a product of depurination or depyrimidination that occurs 10,000 times per human cell per day. In another study, I have investigated a new universal reader, 1-(2-mercaptoethyl)pyrene (Pyrene reader), based on stacking interactions, which should be more specific to the canonical DNA nucleosides. In addition, the Pyrene reader showed higher DNA base-calling accuracy compared to the Imidazole reader, the workhorse in our previous projects. In my other projects, various amino acids and RNA nucleoside monophosphates were also classified with significantly high accuracy using RT. Twenty naturally occurring amino acids and various RNA nucleosides (four canonical and two modified) were successfully identified. Thus, we envision nanopore sequencing of biomolecules using Recognition Tunneling (RT), which should provide a comprehensive improvement over current technologies in terms of time, chemical and instrumental cost, and capability of de novo sequencing.

  20. Optical resolution of phenylthiohydantoin-amino acids by capillary electrophoresis and identification of the phenylthiohydantoin-D-amino acid residue of [D-Ala2]-methionine enkephalin.

    PubMed

    Kurosu, Y; Murayama, K; Shindo, N; Shisa, Y; Ishioka, N

    1996-11-01

    This is an initial report proposing a protein sequence analysis system with DL differentiation using capillary electrophoresis (CE). This system consists of a protein sequencer and a CE system. After fractionation of phenylthiohydantoin (PTH)-amino acids using a protein sequencer, optical resolution for each PTH-amino acid is performed by CE using chiral selectors such as digitonin, beta-escin and others. As a model peptide, [D-Ala2]-methionine enkephalin (L-Tyr-D-Ala-Gly-L-Phe-L-Met) was used, and the sequence with DL differentiation was determined, with the exception of the fourth amino acid, L-Phe, using our proposed system.

  1. RNA-seq based transcriptomic analysis uncovers α-linolenic acid and jasmonic acid biosynthesis pathways respond to cold acclimation in Camellia japonica

    PubMed Central

    Li, Qingyuan; Lei, Sheng; Du, Kebing; Li, Lizhi; Pang, Xufeng; Wang, Zhanchang; Wei, Ming; Fu, Shao; Hu, Limin; Xu, Lin

    2016-01-01

    Camellia is a well-known ornamental flower native to the southeast of Asia, including regions such as Japan, Korea and South China. However, most species in the genus Camellia are cold sensitive. To elucidate the cold stress responses in camellia plants, we carried out deep transcriptome sequencing of ‘Jiangxue’, a cold-tolerant cultivar of Camellia japonica, and approximately 1,006 million clean reads were generated using Illumina sequencing technology. The assembly of the clean reads produced 367,620 transcripts, including 207,592 unigenes. Overall, 28,038 differentially expressed genes were identified during cold acclimation. Detailed elucidation of the responses of transcription factors, protein kinases and plant hormone signalling-related genes described the interplay of signals that allowed the plant to fine-tune cold stress responses. On the basis of global gene regulation of unsaturated fatty acid biosynthesis- and jasmonic acid biosynthesis-related genes, unsaturated fatty acid biosynthesis and jasmonic acid biosynthesis pathways were deduced to be involved in the low temperature responses in C. japonica. These results were supported by the determination of the fatty acid composition and jasmonic acid content. Our results provide insights into the genetic and molecular basis of the responses to cold acclimation in camellia plants. PMID:27819341

  2. Expression of arginine kinase enzymatic activity and mRNA in gills of the euryhaline crabs Carcinus maenas and Callinectes sapidus.

    PubMed

    Kotlyar, S; Weihrauch, D; Paulsen, R S; Towle, D W

    2000-08-01

    Phosphagen kinases catalyze the reversible dephosphorylation of guanidino phosphagens such as phosphocreatine and phosphoarginine, contributing to the restoration of adenosine triphosphate concentrations in cells experiencing high and variable demands on their reserves of high-energy phosphates. The major invertebrate phosphagen kinase, arginine kinase, is expressed in the gills of two species of euryhaline crabs, the blue crab Callinectes sapidus and the shore crab Carcinus maenas, in which energy-requiring functions include monovalent ion transport, acid-base balance, nitrogen excretion and gas exchange. The enzymatic activity of arginine kinase approximately doubles in the ion-transporting gills of C. sapidus, a strong osmoregulator, when the crabs are transferred from high to low salinity, but does not change in C. maenas, a more modest osmoregulator. Amplification and sequencing of arginine kinase cDNA from both species, accomplished by reverse transcription of gill mRNA and the polymerase chain reaction, revealed an open reading frame coding for a 357-amino-acid protein. The predicted amino acid sequences showed a minimum of 75 % identity with arginine kinase sequences of other arthropods. Ten of the 11 amino acid residues believed to participate in arginine binding are completely conserved among the arthropod sequences analyzed. An estimation of arginine kinase mRNA abundance indicated that acclimation salinity has no effect on arginine kinase gene transcription. Thus, the observed enhancement of enzyme activity in C. sapidus probably results from altered translation rates or direct activation of pre-existing enzyme protein.

  3. Sequence Based Structural Characterization and Genetic Diversity Analysis of Full Length TLR4 CDS in Crossbred and Indigenous Cattle.

    PubMed

    Mishra, Chinmoy; Kumar, Subodh; Sonwane, Arvind Asaram; Yathish, H M; Chaudhary, Rajni

    2017-01-02

    The exploration of candidate genes for immune response in cattle may be vital for improving our understanding regarding the species specific response to pathogens. Toll-like receptor 4 (TLR4) is mostly involved in protection against the deleterious effects of Gram negative pathogens. Approximately 2.6 kb long cDNA sequence of TLR4 gene covering the entire coding region was characterized in two Indian milk cattle (Vrindavani and Tharparkar). The phylogenetic analysis confirmed that the bovine TLR4 apparently evolved from an ancestral form that predated the appearance of vertebrates, and it is grouped with buffalo, yak, and mithun TLR4s. Sequence analysis revealed a 2526-nucleotide long open reading frame (ORF) encoding 841 amino acids, similar to other cattle breeds. The calculated molecular weight of the translated ORF was 96144 and 96040.9 Da; the isoelectric point was 6.35 and 6.42 in Vrindavani and Tharparkar cattle, respectively. The Simple Modular Architecture Research Tool (SMART) analysis identified 14 leucine-rich repeat (LRR) motifs in bovine TLR4 protein. The deduced TLR4 amino acid sequence of Tharparkar had 4 different substitutions as compared to Bos taurus, Sahiwal, and Vrindavani. The signal peptide cleavage site was predicted to lie between the 16th and 17th amino acids of the mature peptide. The transmembrane helix was identified between amino acids 635 and 657 in the mature peptide.
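
    The ORF length and protein length quoted above can be cross-checked with simple codon arithmetic. The sketch below is illustrative only: the function name is hypothetical, and it assumes the reported 2526-nucleotide ORF includes the stop codon, which is consistent with 2526 / 3 = 842 codons and 841 encoded amino acids.

```python
def orf_to_protein_length(orf_nt: int, includes_stop: bool = True) -> int:
    """Return the number of amino acids encoded by an ORF of orf_nt nucleotides.

    Assumption: when includes_stop is True, the final codon is a stop codon
    and does not contribute an amino acid.
    """
    if orf_nt <= 0 or orf_nt % 3 != 0:
        raise ValueError("ORF length must be a positive multiple of 3")
    codons = orf_nt // 3
    return codons - 1 if includes_stop else codons


# TLR4 ORF from the abstract: 2526 nt, 842 codons, 841 amino acids plus stop.
tlr4_residues = orf_to_protein_length(2526)
print(tlr4_residues)
```

    Whether a reported ORF length counts the stop codon varies between papers, so the flag is left explicit rather than assumed silently.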

  4. Properties and cDNA cloning of antihemorrhagic factors in sera of Chinese and Japanese mamushi (Gloydius blomhoffi).

    PubMed

    Aoki, Narumi; Tsutsumi, Kadzuyo; Deshimaru, Masanobu; Terada, Shigeyuki

    2008-02-01

    An antihemorrhagic protein has been isolated from the serum of Chinese mamushi (Gloydius blomhoffi brevicaudus) by using a combination of ethanol precipitation and a reverse-phase high-performance liquid chromatography (HPLC) on a C8 column. This protein, designated Chinese mamushi serum factor (cMSF), suppressed mamushi venom-induced hemorrhage in a dose-dependent manner. It had no effect on trypsin, chymotrypsin, thermolysin, and papain but inhibited the proteinase activities of several snake venom metalloproteinases (SVMPs) including hemorrhagic enzymes isolated from the venoms of mamushi and habu (Trimeresurus flavoviridis). A similar protein (Japanese MSF, jMSF) with antihemorrhagic activity has also been purified from the sera of Japanese mamushi (G. blomhoffi). The N-terminal 70 and 51 residues of the intact cMSF and jMSF were directly analyzed; a similarity between the sequences of two MSFs to that of antihemorrhagic protein (HSF) from habu serum was noticed. To obtain the complete amino acid sequences of MSFs, cDNAs encoding these proteins were cloned from the liver mRNA of Chinese and Japanese vipers based on their N-terminal amino acid sequences. The mature forms of both MSFs consisted of 305 amino acids with a 19-residue signal sequence, and a unique 17-residue deletion was detected in their His-rich domains.

  5. Classifying Membrane Proteins in the Proteome by Using Artificial Neural Networks Based on the Preferential Parameters of Amino Acids

    NASA Astrophysics Data System (ADS)

    Bose, Subrata K.; Browne, Antony; Kazemian, Hassan; White, Kenneth

    Membrane proteins (MPs) are a large set of biological macromolecules that play a fundamental role in physiology and pathophysiology for survival. From a pharma-economical perspective, although MPs constitute ~75% of possible targets for novel drugs, they are one of the most understudied groups of proteins in biochemical research. This is mainly because of the technical difficulties of obtaining structural information about transmembrane regions (small sequences that cross the lipid bilayer membrane). It is useful to predict the location of transmembrane segments along the sequence, since these are the elementary structural building blocks defining the protein's topology. There have been several attempts over the last 20 years to develop tools for predicting membrane-spanning regions, but current tools are far from achieving reliable prediction. This study aims to exploit current understanding in the field of artificial neural networks (ANNs), in particular data representation, through the development of a system to identify and predict membrane-spanning regions by analysing primary amino acid sequences. In this paper we present a novel neural network (NN) architecture and algorithms for predicting membrane-spanning regions from primary amino acid sequences by using their preference parameters.

  6. Physico-chemical foundations underpinning microarray and next-generation sequencing experiments

    PubMed Central

    Harrison, Andrew; Binder, Hans; Buhot, Arnaud; Burden, Conrad J.; Carlon, Enrico; Gibas, Cynthia; Gamble, Lara J.; Halperin, Avraham; Hooyberghs, Jef; Kreil, David P.; Levicky, Rastislav; Noble, Peter A.; Ott, Albrecht; Pettitt, B. Montgomery; Tautz, Diethard; Pozhitkov, Alexander E.

    2013-01-01

    Hybridization of nucleic acids on solid surfaces is a key process involved in high-throughput technologies such as microarrays and, in some cases, next-generation sequencing (NGS). A physical understanding of the hybridization process helps to determine the accuracy of these technologies. The goal of a widespread research program is to develop reliable transformations between the raw signals reported by the technologies and individual molecular concentrations from an ensemble of nucleic acids. This research has inputs from many areas, from bioinformatics and biostatistics, to theoretical and experimental biochemistry and biophysics, to computer simulations. A group of leading researchers met in Ploen, Germany in 2011 to discuss present knowledge and limitations of our physico-chemical understanding of high-throughput nucleic acid technologies. This meeting inspired us to write this summary, which provides an overview of the state-of-the-art approaches, based on physico-chemical foundations, to modeling the nucleic acid hybridization process on solid surfaces. In addition, practical application of current knowledge is emphasized. PMID:23307556

  7. Inhibitor profiling of the Pseudomonas aeruginosa virulence factor LasB using N-alpha mercaptoamide template-based inhibitors.

    PubMed

    Cathcart, George R; Gilmore, Brendan F; Greer, Brett; Harriott, Pat; Walker, Brian

    2009-11-01

    We report on the synthesis and biological evaluation of a focussed library of N-alpha mercaptoamide containing dipeptides as inhibitors of the zinc metallopeptidase Pseudomonas aeruginosa elastase (LasB, EC 3.4.24.26). The aim of the study was to derive an inhibitor profile for LasB with regard to mapping the S'1 binding site of the enzyme. Consequently, a focussed library of 160 members has been synthesised, using standard Fmoc-solid phase methods (on a Rink-amide resin), in which a subset of amino acids including examples of those with basic (Lys, Arg), aromatic (Phe, Trp), large aliphatic (Val, Leu) and acidic (Asp, Glu) side-chains populated the P'2 position of the inhibitor sequence and all 20 natural amino acids were incorporated, in turn, at the P'1 position. The study has revealed a preference for aromatic and/or large aliphatic amino acids at P'1 and a distinct bias against acidic residues at P'2. Ten inhibitor sequences were discovered that exhibited sub to low micromolar Ki values.

  8. High γ-aminobutyric acid production from lactic acid bacteria: Emphasis on Lactobacillus brevis as a functional dairy starter.

    PubMed

    Wu, Qinglong; Shah, Nagendra P

    2017-11-22

    γ-Aminobutyric acid (GABA) and GABA-rich foods have shown anti-hypertensive and anti-depressant activities as the major functions in humans and animals. Hence, high GABA-producing lactic acid bacteria (LAB) could be used as functional starters for manufacturing novel fermented dairy foods. Glutamic acid decarboxylases (GADs) from LAB are highly conserved at the species level based on the phylogenetic tree of GADs from LAB. Moreover, two functionally distinct GADs and one intact gad operon were observed in all the completely sequenced Lactobacillus brevis strains suggesting its common capability to synthesize GABA. Difficulties and strategies for the manufacture of GABA-rich fermented dairy foods have been discussed and proposed, respectively. In addition, a genetic survey on the sequenced LAB strains demonstrated the absence of cell envelope proteinases in the majority of LAB including Lb. brevis, which diminishes their cell viabilities in milk environments due to their non-proteolytic nature. Thus, several strategies have been proposed to overcome the non-proteolytic nature of Lb. brevis in order to produce GABA-rich dairy foods.

  9. Concurrent Automated Sequencing of the Glycan and Peptide Portions of O-Linked Glycopeptide Anions by Ultraviolet Photodissociation Mass Spectrometry

    PubMed Central

    Madsen, James A.; Ko, Byoung Joon; Xu, Hua; Iwashkiw, Jeremy A.; Robotham, Scott A.; Shaw, Jared B.; Feldman, Mario F.; Brodbelt, Jennifer S.

    2013-01-01

    O-glycopeptides are often acidic owing to the frequent occurrence of acidic saccharides in the glycan, rendering traditional proteomic workflows that rely on positive mode tandem mass spectrometry (MS/MS) less effective. In this report, we demonstrate the utility of negative mode ultraviolet photodissociation (UVPD) MS for the characterization of acidic O-linked glycopeptide anions. This method was evaluated for a series of singly- and multiply-deprotonated glycopeptides from the model glycoprotein kappa casein, resulting in production of both peptide and glycan product ions that afforded 100% sequence coverage of the peptide and glycan moieties from a single MS/MS event. The most abundant and frequent peptide sequence ions were a/x-type products, which, importantly, were found to retain the labile glycan modifications. The glycan-specific ions mainly arose from glycosidic bond cleavages (B, Y, C, and Z ions) in addition to some less common cross-ring cleavages. Based on the UVPD fragmentation patterns, an automated database searching strategy (based on the MassMatrix algorithm) was designed that is specific for the analysis of glycopeptide anions by UVPD. This algorithm was used to identify glycopeptides from mixtures of glycosylated and non-glycosylated peptides, sequence both glycan and peptide moieties simultaneously, and pinpoint the correct site(s) of glycosylation. This methodology was applied to uncover novel site-specificity of the O-linked glycosylated OmpA/MotB from the “superbug” A. baumannii to aid in the elucidation of the functional role that protein glycosylation plays in pathogenesis. PMID:24006841

  10. Proline: The Distribution, Frequency, Positioning, and Common Functional Roles of Proline and Polyproline Sequences in the Human Proteome

    PubMed Central

    Morgan, Alexander A.; Rubenstein, Edward

    2013-01-01

    Proline is an anomalous amino acid. Its nitrogen atom is covalently locked within a ring, thus it is the only proteinogenic amino acid with a constrained phi angle. Sequences of three consecutive prolines can fold into polyproline helices, structures that join alpha helices and beta pleats as architectural motifs in protein configuration. Triproline helices are participants in protein-protein signaling interactions. Longer spans of repeat prolines also occur, containing as many as 27 consecutive proline residues. Little is known about the frequency, positioning, and functional significance of these proline sequences. Therefore we have undertaken a systematic bioinformatics study of proline residues in proteins. We analyzed the distribution and frequency of 687,434 proline residues among 18,666 human proteins, identifying single residues, dimers, trimers, and longer repeats. Proline accounts for 6.3% of the 10,882,808 protein amino acids. Of all proline residues, 4.4% are in trimers or longer spans. We detected patterns that influence function based on proline location, spacing, and concentration. We propose a classification based on proline-rich, polyproline-rich, and proline-poor status. Whereas singlet proline residues are often found in proteins that display recurring architectural patterns, trimers or longer proline sequences tend to be associated with the absence of repetitive structural motifs. Spans of 6 or more are associated with DNA/RNA processing, actin, and developmental processes. We also suggest a role for proline in Kruppel-type zinc finger protein control of DNA expression, and in the nucleation and translocation of actin by the formin complex. PMID:23372670
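
    The kind of tally described above (single prolines, dimers, trimers, and longer runs) can be sketched in a few lines. This is a minimal illustration, not the authors' pipeline: the function names and the two toy sequences are hypothetical, and only the residue totals in the last line come from the abstract.

```python
import re
from collections import Counter


def proline_run_lengths(seq: str) -> Counter:
    """Count maximal runs of consecutive proline (P) residues, keyed by run length."""
    return Counter(len(m.group()) for m in re.finditer(r"P+", seq.upper()))


def fraction_in_long_runs(seqs, min_len: int = 3) -> float:
    """Fraction of all proline residues that sit in runs of at least min_len."""
    total = 0
    in_long = 0
    for seq in seqs:
        for length, count in proline_run_lengths(seq).items():
            total += length * count
            if length >= min_len:
                in_long += length * count
    return in_long / total if total else 0.0


# Toy sequences (hypothetical, not from the study).
toy = ["MPAPPGPPPK", "APPPPPPA"]
runs = proline_run_lengths(toy[0])       # one singlet, one dimer, one trimer
toy_fraction = fraction_in_long_runs(toy)  # 9 of 12 prolines are in runs >= 3

# Cross-check of the reported proline share of all residues.
proline_share = round(100 * 687_434 / 10_882_808, 1)
print(runs, toy_fraction, proline_share)
```

    Using maximal runs means a span of six prolines is counted once as a run of six, not as several overlapping trimers; the abstract does not say which convention the authors used.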

  11. Identification and Analysis of Novel Amino-Acid Sequence Repeats in Bacillus anthracis str. Ames Proteome Using Computational Tools

    PubMed Central

    Hemalatha, G. R.; Rao, D. Satyanarayana; Guruprasad, L.

    2007-01-01

    We have identified four repeats and ten domains that are novel in proteins encoded by the Bacillus anthracis str. Ames proteome using automated in silico methods. A “repeat” corresponds to a region comprising fewer than 55 amino acid residues that occurs more than once in the protein sequence and is sometimes present in tandem. A “domain” corresponds to a conserved region with more than 55 amino acid residues and may be present as single or multiple copies in the protein sequence. These correspond to (1) 57-amino-acid-residue PxV domain, (2) 122-amino-acid-residue FxF domain, (3) 111-amino-acid-residue YEFF domain, (4) 109-amino-acid-residue IMxxH domain, (5) 103-amino-acid-residue VxxT domain, (6) 84-amino-acid-residue ExW domain, (7) 104-amino-acid-residue NTGFIG domain, (8) 36-amino-acid-residue NxGK repeat, (9) 95-amino-acid-residue VYV domain, (10) 75-amino-acid-residue KEWE domain, (11) 59-amino-acid-residue AFL domain, (12) 53-amino-acid-residue RIDVK repeat, (13) (a) 41-amino-acid-residue AGQF repeat and (b) 42-amino-acid-residue GSAL repeat. A repeat or domain type is characterized by specific conserved sequence motifs. We discuss the presence of these repeats and domains in proteins from other genomes and their probable secondary structure. PMID:17538688
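
    The 55-residue length rule stated above can be applied directly to the listed elements as a consistency check. The sketch below uses the names and lengths given in the abstract; the function name and the handling of a length of exactly 55 (which the abstract leaves undefined) are assumptions.

```python
# Element lengths (amino acid residues) as listed in the abstract.
ELEMENT_LENGTHS = {
    "PxV": 57, "FxF": 122, "YEFF": 111, "IMxxH": 109, "VxxT": 103,
    "ExW": 84, "NTGFIG": 104, "NxGK": 36, "VYV": 95, "KEWE": 75,
    "AFL": 59, "RIDVK": 53, "AGQF": 41, "GSAL": 42,
}


def classify_by_length(length: int, threshold: int = 55) -> str:
    """Apply the length rule: shorter than threshold is a repeat, longer is a domain."""
    if length < threshold:
        return "repeat"
    if length > threshold:
        return "domain"
    return "undefined"  # assumption: the abstract does not cover exactly 55


labels = {name: classify_by_length(n) for name, n in ELEMENT_LENGTHS.items()}
repeats = sorted(name for name, kind in labels.items() if kind == "repeat")
domains = sorted(name for name, kind in labels.items() if kind == "domain")
print(len(repeats), "repeats;", len(domains), "domains")
```

    Length alone reproduces the four repeats and ten domains reported; the abstract's definition of a repeat also requires more than one occurrence in the sequence, which this check does not model.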

  12. Genetic diversity of pneumococcal surface protein A in invasive pneumococcal isolates from Korean children, 1991-2016.

    PubMed

    Yun, Ki Wook; Choi, Eun Hwa; Lee, Hoan Jong

    2017-01-01

    Pneumococcal surface protein A (PspA) is an important virulence factor of pneumococci and has been investigated as a primary component of a capsular serotype-independent pneumococcal vaccine. Thus, we sought to determine the genetic diversity of PspA to explore its potential as a vaccine candidate. Among the 190 invasive pneumococcal isolates collected from Korean children between 1991 and 2016, two (1.1%) isolates were found to have no pspA by multiple polymerase chain reactions. The full length pspA genes from 185 pneumococcal isolates were sequenced. The length of pspA varied, ranging from 1,719 to 2,301 base pairs with 55.7-100% nucleotide identity. Based on the sequences of the clade-defining regions, 68.7% and 49.7% were in PspA family 2 and clade 3/family 2, respectively. PspA clade types were correlated with genotypes using multilocus sequence typing and divided into several subclades based on diversity analysis of the N-terminal α-helical regions, which showed nucleotide sequence identities of 45.7-100% and amino acid sequence identities of 23.1-100%. Putative antigenicity plots were also diverse among individual clades and subclades. The differences in antigenicity patterns were concentrated within the N-terminal 120 amino acids. In conclusion, the N-terminal α-helical domain, which is known to be the major immunogenic portion of PspA, is genetically variable and should be further evaluated for antigenic differences and cross-reactivity between various PspA types from pneumococcal isolates.

  13. Lion (Panthera leo) and cheetah (Acinonyx jubatus) IFN-gamma sequences.

    PubMed

    Maas, Miriam; Van Rhijn, Ildiko; Allsopp, Maria T E P; Rutten, Victor P M G

    2010-04-15

    Cloning and sequencing of the full length lion and cheetah interferon-gamma (IFN-gamma) transcript will enable the expression of the recombinant cytokine, to be used for production of monoclonal antibodies and to set up lion and cheetah-specific IFN-gamma ELISAs. These are relevant in blood-based diagnosis of bovine tuberculosis, an important threat to lions in the Kruger National Park. Alignment of nucleotide and amino acid sequences of lion and cheetah and that of domestic cats showed homologies of 97-100%. Copyright 2009 Elsevier B.V. All rights reserved.

  14. 37 CFR 1.824 - Form and format for nucleotide and/or amino acid sequence submissions in computer readable form.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... 37 Patents, Trademarks, and Copyrights 1 2010-07-01 2010-07-01 false Form and format for... And/or Amino Acid Sequences § 1.824 Form and format for nucleotide and/or amino acid sequence... Code for Information Interchange (ASCII) text. No other formats shall be allowed. (3) The computer...

  15. Molecular identification of catalases from Nicotiana plumbaginifolia (L.).

    PubMed

    Willekens, H; Villarroel, R; Van Montagu, M; Inzé, D; Van Camp, W

    1994-09-19

    We have isolated three different catalase cDNAs from Nicotiana plumbaginifolia (cat1, cat2, and cat3) and a partial sequence of a fourth catalase gene (cat4) that shows no discernible expression based on Northern analysis. The catalase sequences were used to determine the similarity with other plant catalases and to study the transcriptional response to paraquat, 3-aminotriazole, and salicylic acid. 3-Aminotriazole induces mRNA levels of cat1, cat2 and cat3, indicating that a reduction in catalase activity positively affects catalase mRNA abundance. Salicylic acid, which binds catalase in vitro, had no effect on catalase transcript levels at physiological concentrations. Paraquat resulted in the induction of cat1.

  16. A streamlined method for analysing genome-wide DNA methylation patterns from low amounts of FFPE DNA.

    PubMed

    Ludgate, Jackie L; Wright, James; Stockwell, Peter A; Morison, Ian M; Eccles, Michael R; Chatterjee, Aniruddha

    2017-08-31

    Formalin fixed paraffin embedded (FFPE) tumor samples are a major source of DNA from patients in cancer research. However, FFPE is a challenging material to work with due to macromolecular fragmentation and nucleic acid crosslinking. FFPE tissue poses particular challenges for methylation analysis and for preparing sequencing-based libraries relying on bisulfite conversion. Successful bisulfite conversion is a key requirement for sequencing-based methylation analysis. Here we describe a complete and streamlined workflow for preparing next generation sequencing libraries for methylation analysis from FFPE tissues. This includes counting cells from FFPE blocks and extracting DNA from FFPE slides, testing bisulfite conversion efficiency with a polymerase chain reaction (PCR) based test, preparing reduced representation bisulfite sequencing libraries and massively parallel sequencing. The main features and advantages of this protocol are: (1) an optimized method for extracting good quality DNA from FFPE tissues; (2) an efficient bisulfite conversion and next generation sequencing library preparation protocol that uses 50 ng DNA from FFPE tissue; and (3) incorporation of a PCR-based test to assess bisulfite conversion efficiency prior to sequencing. We provide a complete workflow and an integrated protocol for performing DNA methylation analysis at the genome-scale and we believe this will facilitate clinical epigenetic research that involves the use of FFPE tissue.

  17. Opsin cDNA sequences of a UV and green rhodopsin of the satyrine butterfly Bicyclus anynana.

    PubMed

    Vanhoutte, K J A; Eggen, B J L; Janssen, J J M; Stavenga, D G

    2002-11-01

    The cDNAs of an ultraviolet (UV) and long-wavelength (LW) (green) absorbing rhodopsin of the bush brown Bicyclus anynana were partially identified. The UV sequence, encoding 377 amino acids, is 76-79% identical to the UV sequences of the papilionids Papilio glaucus and Papilio xuthus and the moth Manduca sexta. A dendrogram derived from aligning the amino acid sequences reveals an equidistant position of Bicyclus between Papilio and Manduca. The sequence of the green opsin cDNA fragment, which encodes 242 amino acids, represents six of the seven transmembrane regions. At the amino acid level, this fragment is more than 80% identical to the corresponding LW opsin sequences of Dryas, Heliconius, Papilio (rhodopsin 2) and Manduca. Whereas three LW absorbing rhodopsins were identified in the papilionid butterflies, only one green opsin was found in B. anynana.
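
    Identity percentages such as the 76-79% quoted above are computed from a pairwise alignment. The following is a minimal sketch under stated assumptions: the function name is hypothetical, the inputs are already aligned with "-" marking gaps, and gapped columns are excluded from the denominator. Published figures may use a different denominator, and the abstract does not say which one was used here.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over ungapped columns of two pre-aligned sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns where neither sequence has a gap character.
    columns = [(x, y) for x, y in zip(aligned_a, aligned_b) if x != "-" and y != "-"]
    if not columns:
        return 0.0
    matches = sum(1 for x, y in columns if x == y)
    return 100.0 * matches / len(columns)


# Toy alignment (hypothetical residues, not opsin sequences).
toy_identity = percent_identity("MKT-LV", "MKSALV")
print(toy_identity)
```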

  18. Complete amino acid sequence of ananain and a comparison with stem bromelain and other plant cysteine proteases.

    PubMed Central

    Lee, K L; Albee, K L; Bernasconi, R J; Edmunds, T

    1997-01-01

    The amino acid sequences of ananain (EC 3.4.22.31) and stem bromelain (EC 3.4.22.32), two cysteine proteases from pineapple stem, are similar, yet ananain and stem bromelain possess distinct specificities towards synthetic peptide substrates and different reactivities towards the cysteine protease inhibitors E-64 and chicken egg white cystatin. We present here the complete amino acid sequence of ananain and compare it with the reported sequences of pineapple stem bromelain, papain and chymopapain from papaya and actinidin from kiwifruit. Ananain is comprised of 216 residues with a theoretical mass of 23464 Da. This primary structure includes a sequence insert between residues 170 and 174 not present in stem bromelain or papain and a hydrophobic series of amino acids adjacent to His-157. It is possible that these sequence differences contribute to the different substrate and inhibitor specificities exhibited by ananain and stem bromelain. PMID:9355753

  19. Protein structure recognition: From eigenvector analysis to structural threading method

    NASA Astrophysics Data System (ADS)

    Cao, Haibo

    In this work, we try to understand the protein folding problem using pair-wise hydrophobic interaction as the dominant interaction for the protein folding process. We found a strong correlation between amino acid sequence and the corresponding native structure of the protein. Some applications of this correlation discussed in this dissertation include domain partition and a new structural threading method, as well as the performance of this method in the CASP5 competition. In the first part, we give a brief introduction to the protein folding problem. Some essential knowledge and progress from other research groups are discussed. This part includes discussions of interactions among amino acid residues, the lattice HP model, and the designability principle. In the second part, we try to establish the correlation between amino acid sequence and the corresponding native structure of the protein. This correlation was observed in our eigenvector study of the protein contact matrix. We believe the correlation is universal, thus it can be used in automatic partition of protein structures into folding domains. In the third part, we discuss a threading method based on the correlation between amino acid sequence and the dominant eigenvector of the structure contact matrix. A mathematically straightforward iteration scheme provides a self-consistent optimum global sequence-structure alignment. The computational efficiency of this method makes it possible to search whole protein structure databases for structural homology without relying on sequence similarity. The sensitivity and specificity of this method are discussed, along with a case of blind test prediction. In the appendix, we list the overall performance of this threading method in the CASP5 blind test in comparison with other existing approaches.
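
    The dominant eigenvector of a contact matrix, which the abstract uses as a structural descriptor, can be approximated by power iteration. The sketch below is an illustration rather than the dissertation's method: the function name and the three-residue toy matrix are hypothetical, and the iteration multiplies by (A + I) instead of A, a shift that leaves the eigenvectors unchanged and avoids oscillation on bipartite contact graphs.

```python
import math


def dominant_eigenvector(matrix, iterations: int = 200):
    """Approximate the dominant eigenvector of a symmetric non-negative matrix.

    Power iteration on (A + I): same eigenvectors as A, eigenvalues shifted by 1.
    """
    n = len(matrix)
    v = [1.0] * n
    for _ in range(iterations):
        # w = (A + I) v
        w = [v[i] + sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


# Toy contact matrix: residue 1 contacts residues 0 and 2 (hypothetical structure).
contacts = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]
vec = dominant_eigenvector(contacts)
print(vec)
```

    In the toy example the central residue, which has the most contacts, receives the largest component; the abstract correlates this kind of per-residue weight with the amino acid sequence.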

  20. Differentiation of highly virulent strains of Streptococcus suis serotype 2 according to glutamate dehydrogenase electrophoretic and sequence type.

    PubMed

    Kutz, Russell; Okwumabua, Ogi

    2008-10-01

    The glutamate dehydrogenase (GDH) enzymes of 19 Streptococcus suis serotype 2 strains, consisting of 18 swine isolates and 1 human clinical isolate from a geographically varied collection, were analyzed by activity staining on a nondenaturing gel. All seven (100%) of the highly virulent strains tested produced an electrophoretic type (ET) distinct from those of moderately virulent and nonvirulent strains. By PCR and nucleotide sequence determination, the gdh genes of the 19 strains and of 2 highly virulent strains involved in recent Chinese outbreaks yielded a 1,820-bp fragment containing an open reading frame of 1,344 nucleotides, which encodes a protein of 448 amino acid residues with a calculated molecular mass of approximately 49 kDa. The nucleotide sequences contained base pair differences, but most were silent. Cluster analysis of the deduced amino acid sequences separated the isolates into three groups. Group I (ETI) consisted of the seven highly virulent isolates and the two Chinese outbreak strains, containing Ala(299)-to-Ser, Glu(305)-to-Lys, and Glu(330)-to-Lys amino acid substitutions compared with groups II and III (ETII). Groups II and III consisted of moderately virulent and nonvirulent strains, which are separated from each other by Tyr(72)-to-Asp and Thr(296)-to-Ala substitutions. Gene exchange studies resulted in the change of ETI to ETII and vice versa. A spectrophotometric activity assay for GDH did not show significant differences between the groups. These results suggest that the GDH ETs and sequence types may serve as useful markers in predicting the pathogenic behavior of strains of this serotype and that the molecular basis for the observed differences in the ETs was amino acid substitutions and not deletion, insertion, or processing uniqueness.

  1. KM+, a mannose-binding lectin from Artocarpus integrifolia: amino acid sequence, predicted tertiary structure, carbohydrate recognition, and analysis of the beta-prism fold.

    PubMed Central

    Rosa, J. C.; De Oliveira, P. S.; Garratt, R.; Beltramini, L.; Resing, K.; Roque-Barreira, M. C.; Greene, L. J.

    1999-01-01

    The complete amino acid sequence of the lectin KM+ from Artocarpus integrifolia (jackfruit), which contains 149 residues/mol, is reported and compared to those of other members of the Moraceae family, particularly that of jacalin, also from jackfruit, with which it shares 52% sequence identity. KM+ presents an acetyl-blocked N-terminus and is not posttranslationally modified by proteolytic cleavage as is the case for jacalin. Rather, it possesses a short, glycine-rich linker that unites the regions homologous to the alpha- and beta-chains of jacalin. The results of homology modeling implicate the linker sequence in sterically impeding rotation of the side chain of Asp141 within the binding site pocket. As a consequence, the aspartic acid is locked into a conformation adequate only for the recognition of equatorial hydroxyl groups on the C4 epimeric center (alpha-D-mannose, alpha-D-glucose, and their derivatives). In contrast, the internal cleavage of the jacalin chain permits free rotation of the homologous aspartic acid, rendering it capable of accepting hydrogen bonds from both possible hydroxyl configurations on C4. We suggest that, together with direct recognition of epimeric hydroxyls and the steric exclusion of disfavored ligands, conformational restriction of the lectin should be considered to be a new mechanism by which selectivity may be built into carbohydrate binding sites. Jacalin and KM+ adopt the beta-prism fold already observed in two unrelated protein families. Despite presenting little or no sequence similarity, an analysis of the beta-prism reveals a canonical feature repeatedly present in all such structures, which is based on six largely hydrophobic residues within a beta-hairpin containing two classic-type beta-bulges. We suggest the term beta-prism motif to describe this feature. PMID:10210179

  2. A symmetry model for genetic coding via a wallpaper group composed of the traditional four bases and an imaginary base E: Towards category theory-like systematization of molecular/genetic biology

    PubMed Central

    2014-01-01

    Background: Previously, we suggested prototypal models that describe some clinical states based on group postulates. Here, we demonstrate a group/category theory-like model for molecular/genetic biology as an alternative application of our previous model. Specifically, we focus on deoxyribonucleic acid (DNA) base sequences. Results: We construct a wallpaper pattern based on a five-letter cruciform motif with letters C, A, T, G, and E. Whereas the first four letters represent the standard DNA bases, the fifth is introduced for ease in formulating group operations that reproduce insertions and deletions of DNA base sequences. A basic group Z5 = {r, u, d, l, n} of operations is defined for the wallpaper pattern, with which a sequence of points can be generated corresponding to changes of a base in a DNA sequence by following the orbit of a point of the pattern under operations in group Z5. Other manipulations of DNA sequence can be treated using a vector-like notation ‘Dj’ corresponding to a DNA sequence but based on the five-letter base set; also, ‘Dj’s are expressed graphically. Insertions and deletions of a series of letters ‘E’ are admitted to assist in describing DNA recombination. Likewise, a vector-like notation Rj can be constructed for sequences of ribonucleic acid (RNA). The wallpaper group B = {Z5×∞, ●} (an ∞-fold Cartesian product of Z5) acts on Dj (or Rj) yielding changes to Dj (or Rj) denoted by ‘Dj◦B(j→k) = Dk’ (or ‘Rj◦B(j→k) = Rk’). Based on the operations of this group, two types of groups (a modulo 5 linear group and a rotational group over the Gaussian plane, acting on the five bases) are linked as parts of the wallpaper group for broader applications. As a result, changes, insertions/deletions and DNA (RNA) recombination (partial/total conversion) are described. As an exploratory study, a notation for the canonical “central dogma” via a category theory-like way is presented for future developments. Conclusions: Despite the large incompleteness of our methodology, there is fertile ground to consider a symmetry model for genetic coding based on our specific wallpaper group. A more integrated formulation containing “central dogma” for future molecular/genetic biology remains to be explored. PMID:24885369

  3. Identification and biochemical characterization of an Arabidopsis indole-3-acetic acid glucosyltransferase.

    PubMed

    Jackson, R G; Lim, E K; Li, Y; Kowalczyk, M; Sandberg, G; Hoggett, J; Ashford, D A; Bowles, D J

    2001-02-09

    Biochemical characterization of recombinant gene products following a phylogenetic analysis of the UDP-glucosyltransferase (UGT) multigene family of Arabidopsis has identified one enzyme (UGT84B1) with high activity toward the plant hormone indole-3-acetic acid (IAA) and three related enzymes (UGT84B2, UGT75B1, and UGT75B2) with trace activities. The identity of the IAA conjugate has been confirmed to be 1-O-indole acetyl glucose ester. A sequence annotated as a UDP-glucose:IAA glucosyltransferase (IAA-UGT) in the Arabidopsis genome and expressed sequence tag data bases given its similarity to the maize iaglu gene sequence showed no activity toward IAA. This study describes the first biochemical analysis of a recombinant IAA-UGT and provides the foundation for future genetic approaches to understand the role of 1-O-indole acetyl glucose ester in Arabidopsis.

  4. The primary structure of L37--a rat ribosomal protein with a zinc finger-like motif.

    PubMed

    Chan, Y L; Paz, V; Olvera, J; Wool, I G

    1993-04-30

    The amino acid sequence of the rat 60S ribosomal subunit protein L37 was deduced from the sequence of nucleotides in a recombinant cDNA. Ribosomal protein L37 has 96 amino acids, the NH2-terminal methionine is removed after translation of the mRNA, and has a molecular weight of 10,939. Ribosomal protein L37 has a single zinc finger-like motif of the C2-C2 type. Hybridization of the cDNA to digests of nuclear DNA suggests that there are 13 or 14 copies of the L37 gene. The mRNA for the protein is about 500 nucleotides in length. Rat L37 is related to Saccharomyces cerevisiae ribosomal protein YL35 and to Caenorhabditis elegans L37. We have identified in the data base a DNA sequence that encodes the chicken homolog of rat L37.

  5. The point mutation process in proteins

    NASA Technical Reports Server (NTRS)

    Schwartz, R. M.; Dayhoff, M. O.

    1978-01-01

    An optimized scoring matrix for residue-by-residue comparisons of distantly related protein sequences has been developed. The scoring matrix is based on observed exchanges and mutabilities of amino acids in 1572 closely related sequences derived from a cross-section of protein groups. Very few superimposed or parallel mutations are included in the data. The scoring matrix is most useful for demonstrating the relatedness of proteins between 65 and 85% different.

  6. Geodermatophilus sabuli sp. nov., a γ-radiation-resistant actinobacterium isolated from desert limestone.

    PubMed

    Hezbri, Karima; Ghodhbane-Gtari, Faten; Montero-Calasanz, Maria del Carmen; Sghaier, Haïtham; Rohde, Manfred; Schumann, Peter; Klenk, Hans-Peter; Gtari, Maher

    2015-10-01

    A novel γ-radiation-resistant and Gram-staining-positive actinobacterium designated BMG 8133T was isolated from a limestone collected in the Sahara desert of Tunisia. The strain produced dry, pale-pink colonies with an optimum growth at 35–40 °C and pH 6.5–8.0. Chemotaxonomic and molecular characteristics of the isolate matched those described for members of the genus Geodermatophilus. The peptidoglycan contained meso-diaminopimelic acid as diagnostic diamino acid. The main polar lipids were phosphatidylcholine, diphosphatidylglycerol, phosphatidylinositol, phosphatidylethanolamine and one unspecified glycolipid. MK-9(H4) was the dominant menaquinone. Galactose and glucose were detected as diagnostic sugars. The major cellular fatty acids were branched-chain saturated acids iso-C16 : 0 and iso-C15 : 0. The DNA G+C content of the novel strain was 74.5 %. The 16S rRNA gene sequence showed highest sequence identity with Geodermatophilus ruber (98.3 %). Based on phenotypic results and 16S rRNA gene sequence analysis, strain BMG 8133T is proposed to represent a novel species, Geodermatophilus sabuli sp. nov. The type strain is BMG 8133T ( = DSM 46844T = CECT 8820T).

  7. A new earthworm cellulase and its possible role in the innate immunity.

    PubMed

    Park, In Yong; Cha, Ju Roung; Ok, Suk-Mi; Shin, Chuog; Kim, Jin-Se; Kwak, Hee-Jin; Yu, Yun-Sang; Kim, Yu-Kyung; Medina, Brenda; Cho, Sung-Jin; Park, Soon Cheol

    2017-02-01

    A new endogenous cellulase (Ean-EG) from the earthworm, Eisenia andrei, and its expression pattern are demonstrated. Based on a deduced amino acid sequence, the open reading frame (ORF) of Ean-EG consisted of 1368 bp corresponding to a polypeptide of 456 amino acid residues, which contains the conserved region specific to GHF9 that has the essential amino acid residues for enzyme activity. In multiple alignments and phylogenetic analysis, the deduced amino acid sequence of Ean-EG showed the highest sequence similarity (about 79%) to that of an annelid (Pheretima hilgendorfi) and could be clustered together with other GHF9 cellulases, indicating that Ean-EG could be categorized as a member of the GHF9 to which most animal cellulases belong. The histological expression pattern of Ean-EG mRNA using in situ hybridization revealed that the most distinct expression was observed in epithelial cells with positive hybridization signal in epidermis, chloragogen tissue cells, coelomic cell-aggregate, and even blood vessel, which could strongly support the fact that at least in the earthworm, Eisenia andrei, cellulase function must not be limited to digestive process but be possibly extended to the innate immunity. Copyright © 2016 Elsevier Ltd. All rights reserved.

  8. Effects of a Non-Conservative Sequence on the Properties of β-glucuronidase from Aspergillus terreus Li-20

    PubMed Central

    Liu, Yanli; Huangfu, Jie; Qi, Feng; Kaleem, Imdad; E, Wenwen; Li, Chun

    2012-01-01

    We cloned the β-glucuronidase gene (AtGUS) from Aspergillus terreus Li-20 encoding 657 amino acids (aa), which can transform glycyrrhizin into glycyrrhetinic acid monoglucuronide (GAMG) and glycyrrhetinic acid (GA). Based on sequence alignment, the C-terminal non-conservative sequence showed low identity with those of other species; thus, the partial sequence AtGUS(-3t) (1–592 aa) was amplified to determine the effects of the non-conservative sequence on the enzymatic properties. AtGUS and AtGUS(-3t) were expressed in E. coli BL21, producing AtGUS-E and AtGUS(-3t)-E, respectively. At similar optimum temperature (55°C) and pH (AtGUS-E, 6.6; AtGUS(-3t)-E, 7.0) conditions, the thermal stability of AtGUS(-3t)-E was enhanced at 65°C, and the metal ions Co2+, Ca2+ and Ni2+ showed opposite effects on AtGUS-E and AtGUS(-3t)-E, respectively. Furthermore, Km of AtGUS(-3t)-E (1.95 mM) was nearly one-seventh that of AtGUS-E (12.9 mM), whereas the catalytic efficiency of AtGUS(-3t)-E was 3.2-fold higher than that of AtGUS-E (7.16 vs. 2.24 mM s−1), revealing that the truncation of non-conservative sequence can significantly improve the catalytic efficiency of AtGUS. Conformational analysis illustrated significant difference in the secondary structure between AtGUS-E and AtGUS(-3t)-E by circular dichroism (CD). The results showed that the truncation of the non-conservative sequence could preferably alter and influence the stability and catalytic efficiency of enzyme. PMID:22347419
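
    The two comparisons quoted in this abstract are simple ratios, and a short sketch makes the arithmetic explicit. The values are copied from the abstract; the function name fold_change is hypothetical, and the units are left as reported.

```python
def fold_change(value, reference):
    """Ratio of a value to a reference value."""
    if reference == 0:
        raise ValueError("reference must be non-zero")
    return value / reference


# Values as reported in the abstract (Km in mM; efficiencies in the reported units).
km_truncated, km_full = 1.95, 12.9
eff_truncated, eff_full = 7.16, 2.24

km_ratio = fold_change(km_truncated, km_full)      # about 0.15, close to 1/7 (0.143)
eff_ratio = fold_change(eff_truncated, eff_full)   # about 3.2

print(f"Km ratio: {km_ratio:.3f}, efficiency fold change: {eff_ratio:.2f}")
```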

  9. An atypical topoisomerase II sequence from the slime mold Physarum polycephalum.

    PubMed

    Hugodot, Yannick; Dutertre, Murielle; Duguet, Michel

    2004-01-21

    We have determined the complete nucleotide sequence of the cDNA encoding DNA topoisomerase II from Physarum polycephalum. Using degenerate primers, based on the conserved amino acid sequences of other eukaryotic enzymes, a 250-bp fragment was polymerase chain reaction (PCR) amplified. This fragment was used as a probe to screen a Physarum cDNA library. A partial cDNA clone was isolated that was truncated at the 3' end. Rapid amplification of cDNA ends (RACE)-PCR was employed to isolate the remaining portion of the gene. The complete sequence of 4613 bp contains an open reading frame of 4494 bp that codes for 1498 amino acid residues with a theoretical molecular weight of 167 kDa. The predicted amino acid sequence shares similarity with those of other eukaryotes and shows the highest degree of identity with the enzyme of Dictyostelium discoideum. However, the enzyme of P. polycephalum contains an atypical amino-terminal domain very rich in serine and proline, whose function is unknown. Remarkably, both a mitochondrial targeting sequence and a nuclear localization signal were predicted respectively in the amino and carboxy-terminus of the protein, as in the case of human topoisomerase III alpha. At the Physarum genomic level, the topoisomerase II gene encompasses a region of about 16 kbp suggesting a large proportion of intronic sequences, an unusual situation for a gene of a lower eukaryote, often free of introns. Finally, expression of topoisomerase II mRNA does not appear significantly dependent on the plasmodium cycle stage, possibly due to the lack of G1 phase or (and) to a mitochondrial localization of the enzyme.

  10. DNATagger, colors for codons.

    PubMed

    Scherer, N M; Basso, D M

    2008-09-16

    DNATagger is a web-based tool for coloring and editing DNA, RNA and protein sequences and alignments. It is dedicated to the visualization of protein coding sequences and also protein sequence alignments to facilitate the comprehension of evolutionary processes in sequence analysis. The distinctive feature of DNATagger is the use of codons as informative units for coloring DNA and RNA sequences. The codons are colored according to their corresponding amino acids. It is the first program that colors codons in DNA sequences without being affected by "out-of-frame" gaps of alignments. It can handle single gaps and gaps inside the triplets. The program also provides the possibility to edit the alignments and change color patterns and translation tables. DNATagger is a JavaScript application, following the W3C guidelines, designed to work on standards-compliant web browsers. It therefore requires no installation and is platform independent. The web-based DNATagger is available as free and open source software at http://www.inf.ufrgs.br/~dmbasso/dnatagger/.
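
    The idea of treating codons as the unit of annotation while ignoring alignment gaps, including gaps inside a triplet, can be sketched as follows. This is a Python illustration under stated assumptions, not DNATagger's own JavaScript code; the function name tag_codons and the returned tuple layout are hypothetical. The codon table is the standard genetic code, and a colour scheme would simply map each returned amino acid to a colour.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def tag_codons(aligned, gap="-"):
    """Group non-gap columns into codons.

    Returns a list of (column_indices, codon, amino_acid) tuples, so that
    each alignment column can be coloured by the amino acid of its codon
    even when gaps fall inside a triplet.
    """
    tagged, cols, letters = [], [], []
    for i, ch in enumerate(aligned.upper()):
        if ch == gap:
            continue  # gaps never consume a codon position
        cols.append(i)
        letters.append("T" if ch == "U" else ch)  # accept RNA input
        if len(letters) == 3:
            codon = "".join(letters)
            tagged.append((tuple(cols), codon, CODON_TABLE.get(codon, "X")))
            cols, letters = [], []
    return tagged  # a trailing incomplete codon is left untagged


for columns, codon, amino in tag_codons("AT-GGC--T"):
    print(columns, codon, amino)
```

    In the example, the gap at column 2 falls inside the first triplet, yet columns 0, 1 and 3 are still grouped as ATG.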

  11. Contribution of Tryptophan Residues to the Combining Site of a Monoclonal Anti Dinitrophenyl Spin-Label Antibody

    DTIC Science & Technology

    1987-01-01

    identified in the difference spectra, implying that there are five to seven tryptophans within 17 Å of the spin-label hapten. Amino acid sequences...of the heavy and light chains were obtained by a combination of amino acid and DNA sequencing. A molecular model was constructed from the sequence...Clore & acids yields detailed information about the amino acid com- Gronenborn, 1982, 1983). This technique should also identify position of the combining

  12. Identification of the srtC1 Transcription Start Site and Catalytically Essential Residues Required for Actinomyces oris T14V SrtC1 Activity

    DTIC Science & Technology

    2011-07-27

    domain (type 2 phosphatidic acid phosphatase) and may be a PAP2 like superfamily member. In order to localize the promoter(s) for these three genes...which amino acid residue(s) was critical for the enzyme activity. This enzyme possesses a...analyzed the role of eight conserved amino acid residues. The amino acids to be mutated were chosen based on the sequence alignment of several class C

  13. Methods And Devices For Characterizing Duplex Nucleic Acid Molecules

    DOEpatents

    Akeson, Mark; Vercoutere, Wenonah; Haussler, David; Winters-Hilt, Stephen

    2005-08-30

    Methods and devices are provided for characterizing a duplex nucleic acid, e.g., a duplex DNA molecule. In the subject methods, a fluid conducting medium that includes a duplex nucleic acid molecule is contacted with a nanopore under the influence of an applied electric field and the resulting changes in current through the nanopore caused by the duplex nucleic acid molecule are monitored. The observed changes in current through the nanopore are then employed as a set of data values to characterize the duplex nucleic acid, where the set of data values may be employed in raw form or manipulated, e.g., into a current blockade profile. Also provided are nanopore devices for practicing the subject methods, where the subject nanopore devices are characterized by the presence of an algorithm which directs a processing means to employ monitored changes in current through a nanopore to characterize a duplex nucleic acid molecule responsible for the current changes. The subject methods and devices find use in a variety of applications, including, among other applications, the identification of an analyte duplex DNA molecule in a sample, the specific base sequence at a single nucleotide polymorphism (SNP), and the sequencing of duplex DNA molecules.
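
    One simple way to turn a raw current trace into a blockade profile is to threshold against the open-pore current and summarize each run of reduced current. The sketch below is only an illustration of that idea: the function name, the 0.7 threshold fraction, and the example trace are all hypothetical, and the patent does not specify this algorithm.

```python
def blockade_events(current, open_level, threshold=0.7):
    """Find runs where the current drops below threshold * open_level.

    Returns a list of (start_index, end_index, mean_current) tuples,
    with end_index exclusive.
    """
    cutoff = threshold * open_level
    events, start = [], None
    for i, value in enumerate(current):
        if value < cutoff and start is None:
            start = i  # blockade begins
        elif value >= cutoff and start is not None:
            segment = current[start:i]
            events.append((start, i, sum(segment) / len(segment)))
            start = None  # blockade ends
    if start is not None:  # trace ended during a blockade
        segment = current[start:]
        events.append((start, len(current), sum(segment) / len(segment)))
    return events


# Made-up trace in arbitrary units, for illustration only.
trace = [100, 100, 40, 42, 38, 100, 100, 55, 100]
print(blockade_events(trace, open_level=100))
```

    Event duration (end minus start) and mean blockade depth are the kind of derived values that could then be compared between molecules.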

  14. Quantum Mechanical Calculations of Cytosine, Thiocytosine and Their Radical Ions

    NASA Astrophysics Data System (ADS)

    Singh, Rashmi

    2010-08-01

    RNA and DNA are polymers that share some interesting similarities; for instance, it is well known that cytosine is one of the common nucleic acid bases. Sulfur is a very reactive element and has been used in chemical warfare agents. Because genetic information is based on the sequence of the nucleic acid bases, quantum mechanical calculations of the energies, geometries, charges and vibrational characteristics of cytosine and thiocytosine and their corresponding radicals were carried out using the DFT method with the B3LYP functional and the 6-311++G** basis set.

  15. Identification of a novel bovine enterovirus possessing highly divergent amino acid sequences in capsid protein.

    PubMed

    Tsuchiaka, Shinobu; Rahpaya, Sayed Samim; Otomaru, Konosuke; Aoki, Hiroshi; Kishimoto, Mai; Naoi, Yuki; Omatsu, Tsutomu; Sano, Kaori; Okazaki-Terashima, Sachiko; Katayama, Yukie; Oba, Mami; Nagai, Makoto; Mizutani, Tetsuya

    2017-01-17

    Bovine enterovirus (BEV) belongs to the species Enterovirus E or F, genus Enterovirus and family Picornaviridae. Although numerous studies have identified BEVs in the feces of cattle with diarrhea, the pathogenicity of BEVs remains unclear. Previously, we reported the detection of novel kobu-like virus in calf feces, by metagenomics analysis. In the present study, we identified a novel BEV in diarrheal feces collected for that survey. Complete genome sequences were determined by deep sequencing in feces. Secondary RNA structure analysis of the 5' untranslated region (UTR), phylogenetic tree construction and pairwise identity analysis were conducted. The complete genome sequences of BEV were genetically distant from other EVs and the VP1 coding region contained novel and unique amino acid sequences. We named this strain as BEV AN12/Bos taurus/JPN/2014 (referred to as BEV-AN12). According to genome analysis, the genome length of this virus is 7414 nucleotides excluding the poly (A) tail and its genome consists of a 5'UTR, open reading frame encoding a single polyprotein, and 3'UTR. The results of secondary RNA structure analysis showed that in the 5'UTR, BEV-AN12 had an additional clover leaf structure and small stem loop structure, similarly to other BEVs. In pairwise identity analysis, BEV-AN12 showed high amino acid (aa) identities to Enterovirus F in the polyprotein, P2 and P3 regions (aa identity ≥82.4%). Therefore, BEV-AN12 is closely related to Enterovirus F. However, aa sequences in the capsid protein regions, particularly the VP1 encoding region, showed significantly low aa identity to other viruses in genus Enterovirus (VP1 aa identity ≤58.6%). In addition, BEV-AN12 branched separately from Enterovirus E and F in phylogenetic trees based on the aa sequences of P1 and VP1, although it clustered with Enterovirus F in trees based on sequences in the P2 and P3 genome region. 
    We identified a novel BEV possessing highly divergent aa sequences in the VP1 coding region in Japan. According to the species definition, we proposed naming this strain "Enterovirus K", which is a novel species within genus Enterovirus. Further genomic studies are needed to understand the pathogenicity of BEVs.
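
    Pairwise identity values such as those quoted in this abstract are computed from aligned sequences. A minimal sketch follows, assuming identity is counted over columns where neither sequence has a gap; other denominators (for example, full alignment length) are also in use, and the abstract does not say which was applied. The function name and example sequences are hypothetical.

```python
def percent_identity(seq_a, seq_b, gap="-"):
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = matches = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue  # skip gapped columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy aligned peptides: one mismatch in six compared columns.
print(round(percent_identity("ACDEFG", "ACDQFG"), 1))
```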

  16. Characterization of a molt-inhibiting hormone (MIH) of the crayfish, Orconectes limosus, by cDNA cloning and mass spectrometric analysis.

    PubMed

    Bulau, Patrick; Okuno, Atsuro; Thome, Elke; Schmitz, Tina; Peter-Katalinic, Jasna; Keller, Rainer

    2005-11-01

    The structure of the precursor of a molt-inhibiting hormone (MIH) of the American crayfish, Orconectes limosus was determined by cloning of a cDNA based on RNA from the neurosecretory perikarya of the X-organ in the eyestalk ganglia. The open reading frame includes the complete precursor sequence, consisting of a signal peptide of 29, and the MIH sequence of 77 amino acids. In addition, the mature peptide was isolated by HPLC from the neurohemal sinus gland and analyzed by ESI-MS and MALDI-TOF-MS peptide mapping. This showed that the mature peptide (Mass 8664.29 Da) consists of only 75 amino acids, having Ala75-NH2 as C-terminus. Thus, C-terminal Arg77 of the precursor is removed during processing, and Gly76 serves as an amide donor. Sequence comparison confirms this peptide as a novel member of the large family, which includes crustacean hyperglycaemic hormone (CHH), MIH and gonad (vitellogenesis)-inhibiting hormone (GIH/VIH). The lack of a CPRP (CHH-precursor related peptide) in the hormone precursor, the size and specific sequence characteristics show that Orl MIH belongs to the MIH/GIH(VIH) subgroup of this larger family. Comparison with the MIH of Procambarus clarkii, the only other MIH that has thus far been identified in freshwater crayfish, shows extremely high sequence conservation. The two MIHs differ in only one amino acid residue (approximately 99% identity), whereas the sequence identity to several other known MIHs is between 40 and 46%.

  17. Identification of cancer-specific motifs in mimotope profiles of serum antibody repertoire.

    PubMed

    Gerasimov, Ekaterina; Zelikovsky, Alex; Măndoiu, Ion; Ionov, Yurij

    2017-06-07

    For fighting cancer, earlier detection is crucial. Circulating auto-antibodies produced by the patient's own immune system after exposure to cancer proteins are promising bio-markers for the early detection of cancer. Since an antibody recognizes not the whole antigen but 4-7 critical amino acids within the antigenic determinant (epitope), the whole proteome can be represented by a random peptide phage display library. This opens the possibility to develop an early cancer detection test based on a set of peptide sequences identified by comparing cancer patients' and healthy donors' global peptide profiles of antibody specificities. Due to the enormously large number of peptide sequences contained in global peptide profiles generated by next generation sequencing, a large number of cancer and control sera is required to identify cancer-specific peptides with a high degree of statistical significance. To decrease the number of peptides in profiles generated by next generation sequencing without losing cancer-specific sequences, we generated the profiles using a phage library enriched by panning on a pool of cancer sera. To further decrease the complexity of profiles, we used computational methods for transforming the list of peptides constituting the mimotope profiles into a list of motifs formed by similar peptide sequences. We have shown that the amino-acid order is meaningful in mimotope motifs since they contain significantly more peptides than motifs among peptides where amino-acids are randomly permuted. Also, the single-sample motifs significantly differ from motifs in peptides drawn from multiple samples. Finally, multiple cancer-specific motifs have been identified.

  18. C-terminal Amidation of an Osteocalcin-derived Peptide Promotes Hydroxyapatite Crystallization*

    PubMed Central

    Hosseini, Samaneh; Naderi-Manesh, Hossein; Mountassif, Driss; Cerruti, Marta; Vali, Hojatollah; Faghihi, Shahab

    2013-01-01

    Genesis of natural biocomposite-based materials, such as bone, cartilage, and teeth, involves interactions between organic and inorganic systems. Natural biopolymers, such as peptide motif sequences, can be used as a template to direct the nucleation and crystallization of hydroxyapatite (HA). In this study, a natural motif sequence consisting of 13 amino acids present in the first helix of osteocalcin was selected based on its calcium binding ability and used as substrate for nucleation of HA crystals. The acidic (acidic osteocalcin-derived peptide (OSC)) and amidic (amidic osteocalcin-derived peptide (OSN)) forms of this sequence were synthesized to investigate the effects of different C termini on the process of biomineralization. Electron microscopy analyses show the formation of plate-like HA crystals with random size and shape in the presence of OSN. In contrast, spherical amorphous calcium phosphate is formed in the presence of OSC. Circular dichroism experiments indicate conformational changes of amidic peptide to an open and regular structure as a consequence of interaction with calcium and phosphate. There is no conformational change detectable in OSC. It is concluded that HA crystal formation, which only occurred in OSN, is attributable to C-terminal amidation of a natural peptide derived from osteocalcin. It is also proposed that natural peptides with the ability to promote biomineralization have the potential to be utilized in hard tissue regeneration. PMID:23362258

  19. Proposal of Mucilaginibacter galii sp. nov. isolated from leaves of Galium album.

    PubMed

    Aydogan, Ebru L; Busse, Hans-Jürgen; Moser, Gerald; Müller, Christoph; Kämpfer, Peter; Glaeser, Stefanie P

    2017-05-01

    A pale-pink-pigmented, Gram-stain-negative, rod-shaped, non-spore-forming bacterial strain, PP-F2F-G47T, was isolated from the phyllosphere of the herbaceous plant Galium album. Phylogenetic analysis based on the nearly full-length 16S rRNA gene sequence revealed highest sequence similarity to the type strains of Mucilaginibacter daejeonensis (96.2 %), Mucilaginibacter dorajii (95.7 %) and Mucilaginibacter phyllosphaerae (95.5 %). 16S rRNA gene sequence similarities to all other type strains were below 95.5 %. The predominant cellular fatty acids of the strain were C16 : 1ω7c/iso-C15 : 0 2-OH (measured as summed feature 3) and iso-C15 : 0. The major compound in the polyamine pattern was sym-homospermidine and major quinone was menaquinone MK-7. The polar lipid profile was composed of phosphatidylethanolamine and several unidentified aminolipids, phospholipids, aminophospholipids and lipids without a functional group. A sphingophospholipid could not be detected but a ninhydrin-positive alkaline-stable lipid was visible. The diagnostic diamino acid of the peptidoglycan was meso-diaminopimelic acid. Based on phylogenetic, chemotaxonomic and phenotypic analyses a novel species is proposed, Mucilaginibacter galii sp. nov., with PP-F2F-G47T (=CCM 8711T=CIP 111182T=LMG 29767T) as the type strain.

  20. Sequence of a cDNA encoding pancreatic preprosomatostatin-22.

    PubMed Central

    Magazin, M; Minth, C D; Funckes, C L; Deschenes, R; Tavianini, M A; Dixon, J E

    1982-01-01

    We report the nucleotide sequence of a precursor to somatostatin that upon proteolytic processing may give rise to a hormone of 22 amino acids. The nucleotide sequence of a cDNA from the channel catfish (Ictalurus punctatus) encodes a precursor to somatostatin that is 105 amino acids (Mr, 11,500). The cDNA coding for somatostatin-22 consists of 36 nucleotides in the 5' untranslated region, 315 nucleotides that code for the precursor to somatostatin-22, 269 nucleotides at the 3' untranslated region, and a variable length of poly(A). The putative preprohormone contains a sequence of hydrophobic amino acids at the amino terminus that has the properties of a "signal" peptide. A connecting sequence of approximately 57 amino acids is followed by a single Arg-Arg sequence, which immediately precedes the hormone. Somatostatin-22 is homologous to somatostatin-14 in 7 of the 14 amino acids, including the Phe-Trp-Lys sequence. Hybridization selection of mRNA, followed by its translation in a wheat germ cell-free system, resulted in the synthesis of a single polypeptide having a molecular weight of approximately 10,000 as estimated on Na-DodSO4/polyacrylamide gels. PMID:6127673

  1. Phylogenetic Relationship of Necoclí Virus to Other South American Hantaviruses (Bunyaviridae: Hantavirus).

    PubMed

    Montoya-Ruiz, Carolina; Cajimat, Maria N B; Milazzo, Mary Louise; Diaz, Francisco J; Rodas, Juan David; Valbuena, Gustavo; Fulhorst, Charles F

    2015-07-01

    The results of a previous study suggested that Cherrie's cane rat (Zygodontomys cherriei) is the principal host of Necoclí virus (family Bunyaviridae, genus Hantavirus) in Colombia. Bayesian analyses of complete nucleocapsid protein gene sequences and complete glycoprotein precursor gene sequences in this study confirmed that Necoclí virus is phylogenetically closely related to Maporal virus, which is principally associated with the delicate pygmy rice rat (Oligoryzomys delicatus) in western Venezuela. In pairwise comparisons, nonidentities between the complete amino acid sequence of the nucleocapsid protein of Necoclí virus and the complete amino acid sequences of the nucleocapsid proteins of other hantaviruses were ≥8.7%. Likewise, nonidentities between the complete amino acid sequence of the glycoprotein precursor of Necoclí virus and the complete amino acid sequences of the glycoprotein precursors of other hantaviruses were ≥11.7%. Collectively, the unique association of Necoclí virus with Z. cherriei in Colombia, results of the Bayesian analyses of complete nucleocapsid protein gene sequences and complete glycoprotein precursor gene sequences, and results of the pairwise comparisons of amino acid sequences strongly support the notion that Necoclí virus represents a novel species in the genus Hantavirus. Further work is needed to determine whether Calabazo virus (a hantavirus associated with Z. brevicauda cherriei in Panama) and Necoclí virus are conspecific.

  2. Graphene for amino acid biosensing: Theoretical study of the electronic transport

    NASA Astrophysics Data System (ADS)

    Rodríguez, S. J.; Makinistian, L.; Albanesi, E. A.

    2017-10-01

    The study of biosensors based on graphene has increased in recent years; the combination of excellent electrical properties and low noise makes graphene a material for next-generation electronic devices. This work discusses the application of a graphene-based biosensor for the detection of the amino acids histidine (His), alanine (Ala), aspartic acid (Asp), and tyrosine (Tyr). First, we present the results of modeling from first principles the adsorption of the four amino acids on a graphene sheet; we calculate adsorption energy, substrate-adsorbate distance, equilibrium geometrical configurations (upon relaxation) and densities of states (DOS) for each biomolecule adsorbed. Furthermore, in order to evaluate the effects of amino acid adsorption on the electronic transport of graphene, we modeled a device using first-principles calculations with a combination of Density Functional Theory (DFT) and Nonequilibrium Green's Functions (NEGF). We provide a detailed discussion in terms of transmission, current-voltage curves, and charge transfer. We found evidence of differences in the electronic transport through the graphene sheet due to amino acid adsorption, reinforcing the possibility of graphene-based sensors for amino acid sequencing of proteins.

  3. TaALMT1 promoter sequence compositions, acid tolerance, and Al tolerance in wheat cultivars and landraces from Sichuan in China.

    PubMed

    Han, C; Dai, S F; Liu, D C; Pu, Z J; Wei, Y M; Zheng, Y L; Wen, D J; Zhao, L; Yan, Z H

    2013-11-18

    Previous genetic studies on wheat from various sources have indicated that aluminum (Al) tolerance may have originated independently in USA, Brazil, and China. Here, TaALMT1 promoter sequences of 92 landraces and cultivars from Sichuan, China, were sequenced. Five promoter types (I', II, III, IV, and V) were observed in 39 cultivars, and only three promoter types (I, II, and III) were observed in 53 landraces. Among the wheat collections worldwide, only the Chinese Spring (CS) landrace native to Sichuan, China, carried the TaALMT1 promoter type III. Besides CS, two other Sichuan-bred landraces and six cultivars with TaALMT1 promoter type III were identified in this study. In the phylogenetic tree constructed based on the TaALMT1 promoter sequences, type III formed a separate branch, which was supported by a high bootstrap value. It is likely that TaALMT1 promoter type III originated from Sichuan-bred wheat landraces of China. In addition, the landraces with promoter type I showed the lowest Al tolerance among all landraces and cultivars. Furthermore, the cultivars with promoter type IV showed better Al tolerance than landraces with promoter type II. A comparison of acid tolerance and Al tolerance between cultivars and landraces showed that the landraces had better acid tolerance than the cultivars, whereas the cultivars showed better Al tolerance than the landraces. Moreover, significant difference in Al tolerance was also observed between the cultivars raised by the National Ministry of Agriculture and by Sichuan Province. Among the landraces from different regions, those from the East showed better acid tolerance and Al tolerance than those from the South and West of Sichuan. Additional Al-tolerant and acid-tolerant wheat lines were also identified.

  4. Molecular cloning, sequence identification and tissue expression profile of three novel sheep (Ovis aries) genes - BCKDHA, NAGA and HEXA.

    PubMed

    Liu, G Y; Gao, S Z

    2009-01-01

    The complete coding sequences of three sheep genes, BCKDHA, NAGA and HEXA, were amplified using the reverse transcriptase polymerase chain reaction (RT-PCR), based on the conserved sequence information of the mouse or other mammals. The nucleotide sequences of these three genes revealed that the sheep BCKDHA gene encodes a protein of 313 amino acids which has high homology with the BCKDHA gene that encodes a protein of 447 amino acids that has high homology with the branched chain keto acid dehydrogenase E1, alpha polypeptide (BCKDHA) of five species: chimpanzee (93%), human (96%), crab-eating macaque (93%), bovine (98%) and mouse (91%). The sheep NAGA gene encodes a protein of 411 amino acids that has high homology with the alpha-N-acetylgalactosaminidase (NAGA) of five species: human (85%), bovine (94%), mouse (91%), rat (83%) and chicken (74%). The sheep HEXA gene encodes a protein of 529 amino acids that has high homology with the hexosaminidase A (HEXA) of five species: bovine (98%), human (84%), Bornean orangutan (84%), rat (80%) and mouse (81%). Finally, these three novel sheep genes were assigned to GeneIDs: 100145857, 100145858 and 100145856. The phylogenetic tree analysis revealed that the sheep BCKDHA, NAGA, and HEXA all have closer genetic relationships to the BCKDHA, NAGA, and HEXA of bovine. Tissue expression profile analysis was also carried out and results revealed that sheep BCKDHA, NAGA and HEXA genes were differentially expressed in tissues including muscle, heart, liver, fat, kidney, lung, small and large intestine. Our experiment is the first to establish the primary foundation for further research on these three sheep genes.

  5. Megasphaera hexanoica sp. nov., a medium-chain carboxylic acid-producing bacterium isolated from a cow rumen.

    PubMed

    Jeon, Byoung Seung; Kim, Seil; Sang, Byoung-In

    2017-07-01

    Strain MHT, a strictly anaerobic, Gram-stain-negative, non-spore-forming, spherical coccus or coccoid-shaped microorganism, was isolated from a cow rumen during a screen for hexanoic acid-producing bacteria. The microorganism grew at 30-40 °C and pH 5.5-7.5 and exhibited production of various short- and medium-chain carboxylic acids (acetic acid, butyric acid, pentanoic acid, isobutyric acid, isovaleric acid, hexanoic acid, heptanoic acid and octanoic acid), as well as H2 and CO2 as biogas. Phylogenetic analysis based on 16S rRNA gene sequencing demonstrated that MHT represents a member of the genus Megasphaera, with the closest relatives being Megasphaera indica NMBHI-10T (94.1 % 16S rRNA sequence similarity), Megasphaera elsdenii DSM 20460T (93.8 %) and Megasphaera paucivorans DSM 16981T (93.8 %). The major cellular fatty acids produced by MHT included C12 : 0, C16 : 0, C18 : 1cis 9, and C18 : 0, and the DNA G+C content of the MHT genome is 51.8 mol%. Together, the distinctive phenotypic and phylogenetic characteristics of MHT indicate that this microorganism represents a novel species of the genus Megasphaera, for which the name Megasphaera hexanoica sp. nov. is herein proposed. The type strain of this species is MHT (=KCCM 43214T=JCM 31403T).

  6. Cloning and expression of cDNA coding for bouganin.

    PubMed

    den Hartog, Marcel T; Lubelli, Chiara; Boon, Louis; Heerkens, Sijmie; Ortiz Buijsse, Antonio P; de Boer, Mark; Stirpe, Fiorenzo

    2002-03-01

    Bouganin is a ribosome-inactivating protein that recently was isolated from Bougainvillea spectabilis Willd. In this work, the cloning and expression of the cDNA encoding for bouganin is described. From the cDNA, the amino-acid sequence was deduced, which correlated with the primary sequence data obtained by amino-acid sequencing on the native protein. Bouganin is synthesized as a pro-peptide consisting of 305 amino acids, the first 26 of which act as a leader signal while the 29 C-terminal amino acids are cleaved during processing of the molecule. The mature protein consists of 250 amino acids. Using the cDNA sequence encoding the mature protein of 250 amino acids, a recombinant protein was expressed, purified and characterized. The recombinant molecule had similar activity in a cell-free protein synthesis assay and had comparable toxicity on living cells as compared to the isolated native bouganin.
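
    The residue counts in this abstract can be checked with simple arithmetic: 305 precursor residues minus a 26-residue leader and 29 C-terminal residues leaves 250. The Python sketch below illustrates that trimming on a placeholder string; the function name `mature_region` is hypothetical, and the real bouganin sequence is not reproduced here.

```python
def mature_region(precursor: str, n_signal: int, n_cterm: int) -> str:
    """Return the precursor with n_signal N-terminal and n_cterm C-terminal residues removed."""
    if n_signal < 0 or n_cterm < 0 or n_signal + n_cterm > len(precursor):
        raise ValueError("trim lengths must be non-negative and fit within the precursor")
    return precursor[n_signal:len(precursor) - n_cterm]


if __name__ == "__main__":
    # Placeholder of 305 unknown residues; only the lengths are taken from the abstract.
    placeholder = "X" * 305
    print(len(mature_region(placeholder, n_signal=26, n_cterm=29)))
```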

  7. In silico cloning and B/T cell epitope prediction of triosephosphate isomerase from Echinococcus granulosus.

    PubMed

    Wang, Fen; Ye, Bin

    2016-10-01

    Cystic echinococcosis is a worldwide zoonosis caused by Echinococcus granulosus. Because the methods of diagnosis and treatment for cystic echinococcosis are limited, it is still necessary to screen target proteins for the development of a new anti-hydatidosis vaccine. In this study, the triosephosphate isomerase gene of E. granulosus was cloned in silico. The B cell and T cell epitopes were predicted by bioinformatics methods. The cDNA sequence of EgTIM was composed of 1094 base pairs, with an open reading frame of 753 base pairs. The deduced amino acid sequence was composed of 250 amino acids. Five cross-reactive epitopes, located at 21aa-35aa, 43aa-57aa, 94aa-107aa, 115aa-129aa, and 164aa-183aa, could be expected to serve as candidate epitopes in the development of a vaccine against E. granulosus. These results could provide a basis for gene cloning, recombinant expression, and the design of an anti-hydatidosis vaccine.
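
    The reported lengths agree with each other if the 753-base-pair open reading frame is assumed to include its stop codon: 753 / 3 = 251 codons, one of which terminates translation, leaving 250 amino acids. A minimal Python sketch of that check follows; the function name `protein_length_from_orf` and the stop-codon assumption are illustrative and are not stated in the abstract.

```python
def protein_length_from_orf(orf_bp: int, includes_stop: bool = True) -> int:
    """Return the number of amino acids encoded by an ORF of orf_bp base pairs."""
    if orf_bp <= 0 or orf_bp % 3 != 0:
        raise ValueError("ORF length must be a positive multiple of 3")
    codons = orf_bp // 3
    # A stop codon occupies one codon but adds no residue to the protein.
    return codons - 1 if includes_stop else codons


if __name__ == "__main__":
    print(protein_length_from_orf(753))
```

    The same check applies to any record in this listing that reports both an ORF length and a deduced protein length.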

  8. The emergence and evolution of life in a "fatty acid world" based on quantum mechanics.

    PubMed

    Tamulis, Arvydas; Grigalavicius, Mantas

    2011-02-01

    Quantum mechanical based electron correlation interactions among molecules are the source of the weak hydrogen and Van der Waals bonds that are critical to the self-assembly of artificial fatty acid micelles. Life on Earth or elsewhere could have emerged in the form of self-reproducing photoactive fatty acid micelles, which gradually evolved into nucleotide-containing micelles due to the enhanced ability of nucleotide-coupled sensitizer molecules to absorb visible light. Comparison of the calculated absorption spectra of micelles with and without nucleotides confirmed this idea and supports the idea of the emergence and evolution of nucleotides in minimal cells of a so-called Fatty Acid World. Furthermore, the nucleotide-caused wavelength shift and broadening of the absorption pattern potentially gives these molecules an additional valuable role, other than a purely genetic one in the early stages of the development of life. From the information theory point of view, the nucleotide sequences in such micelles carry positional information providing better electron transport along the nucleotide-sensitizer chain and, in addition, providing complementary copies of that information for the next generation. Nucleotide sequences, which in the first period of evolution of fatty acid molecules were useful just for better absorbance of the light in the longer wavelength region, later in the PNA or RNA World, took on the role of genetic information storage.

  9. Method for altering antibody light chain interactions

    DOEpatents

    Stevens, Fred J.; Stevens, Priscilla Wilkins; Raffen, Rosemarie; Schiffer, Marianne

    2002-01-01

    A method for recombinant antibody subunit dimerization including modifying at least one codon of a nucleic acid sequence to replace an amino acid occurring naturally in the antibody with a charged amino acid at a position in the interface segment of the light polypeptide variable region, the charged amino acid having a first polarity; and modifying at least one codon of the nucleic acid sequence to replace an amino acid occurring naturally in the antibody with a charged amino acid at a position in an interface segment of the heavy polypeptide variable region corresponding to a position in the light polypeptide variable region, the charged amino acid having a second polarity opposite the first polarity. Nucleic acid sequences which code for novel light chain proteins, the latter of which are used in conjunction with the inventive method, are also provided.

  10. 37 CFR 1.822 - Symbols and format to be used for nucleotide and/or amino acid sequence data.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... in WIPO Standard ST.25 (1998), Appendix 2, Tables 1 and 3. This incorporation by reference was... ST.25 (1998), Appendix 2, Tables 1 and 3, shall be listed in a given sequence as “n” or “Xaa... acids. (1) The amino acids in a protein or peptide sequence shall be listed using the three-letter...

  11. 37 CFR 1.822 - Symbols and format to be used for nucleotide and/or amino acid sequence data.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... in WIPO Standard ST.25 (1998), Appendix 2, Tables 1 and 3. This incorporation by reference was... ST.25 (1998), Appendix 2, Tables 1 and 3, shall be listed in a given sequence as “n” or “Xaa... acids. (1) The amino acids in a protein or peptide sequence shall be listed using the three-letter...

  12. 37 CFR 1.822 - Symbols and format to be used for nucleotide and/or amino acid sequence data.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... in WIPO Standard ST.25 (1998), Appendix 2, Tables 1 and 3. This incorporation by reference was... ST.25 (1998), Appendix 2, Tables 1 and 3, shall be listed in a given sequence as “n” or “Xaa... acids. (1) The amino acids in a protein or peptide sequence shall be listed using the three-letter...

  13. Defining Electron Bifurcation in the Electron-Transferring Flavoprotein Family.

    PubMed

    Garcia Costas, Amaya M; Poudel, Saroj; Miller, Anne-Frances; Schut, Gerrit J; Ledbetter, Rhesa N; Fixen, Kathryn R; Seefeldt, Lance C; Adams, Michael W W; Harwood, Caroline S; Boyd, Eric S; Peters, John W

    2017-11-01

    Electron bifurcation is the coupling of exergonic and endergonic redox reactions to simultaneously generate (or utilize) low- and high-potential electrons. It is the third recognized form of energy conservation in biology and was recently described for select electron-transferring flavoproteins (Etfs). Etfs are flavin-containing heterodimers best known for donating electrons derived from fatty acid and amino acid oxidation to an electron transfer respiratory chain via Etf-quinone oxidoreductase. Canonical examples contain a flavin adenine dinucleotide (FAD) that is involved in electron transfer, as well as a non-redox-active AMP. However, Etfs demonstrated to bifurcate electrons contain a second FAD in place of the AMP. To expand our understanding of the functional variety and metabolic significance of Etfs and to identify amino acid sequence motifs that potentially enable electron bifurcation, we compiled 1,314 Etf protein sequences from genome sequence databases and subjected them to informatic and structural analyses. Etfs were identified in diverse archaea and bacteria, and they clustered into five distinct well-supported groups, based on their amino acid sequences. Gene neighborhood analyses indicated that these Etf group designations largely correspond to putative differences in functionality. Etfs with the demonstrated ability to bifurcate were found to form one group, suggesting that distinct conserved amino acid sequence motifs enable this capability. Indeed, structural modeling and sequence alignments revealed that identifying residues occur in the NADH- and FAD-binding regions of bifurcating Etfs. Collectively, a new classification scheme for Etf proteins that delineates putative bifurcating versus nonbifurcating members is presented and suggests that Etf-mediated bifurcation is associated with surprisingly diverse enzymes. 
IMPORTANCE Electron bifurcation has recently been recognized as an electron transfer mechanism used by microorganisms to maximize energy conservation. Bifurcating enzymes couple thermodynamically unfavorable reactions with thermodynamically favorable reactions in an overall spontaneous process. Here we show that the electron-transferring flavoprotein (Etf) enzyme family exhibits far greater diversity than previously recognized, and we provide a phylogenetic analysis that clearly delineates bifurcating versus nonbifurcating members of this family. Structural modeling of proteins within these groups reveals key differences between the bifurcating and nonbifurcating Etfs. Copyright © 2017 American Society for Microbiology.

  14. Defining Electron Bifurcation in the Electron-Transferring Flavoprotein Family

    PubMed Central

    Garcia Costas, Amaya M.; Poudel, Saroj; Miller, Anne-Frances; Schut, Gerrit J.; Ledbetter, Rhesa N.; Seefeldt, Lance C.; Adams, Michael W. W.

    2017-01-01

    ABSTRACT Electron bifurcation is the coupling of exergonic and endergonic redox reactions to simultaneously generate (or utilize) low- and high-potential electrons. It is the third recognized form of energy conservation in biology and was recently described for select electron-transferring flavoproteins (Etfs). Etfs are flavin-containing heterodimers best known for donating electrons derived from fatty acid and amino acid oxidation to an electron transfer respiratory chain via Etf-quinone oxidoreductase. Canonical examples contain a flavin adenine dinucleotide (FAD) that is involved in electron transfer, as well as a non-redox-active AMP. However, Etfs demonstrated to bifurcate electrons contain a second FAD in place of the AMP. To expand our understanding of the functional variety and metabolic significance of Etfs and to identify amino acid sequence motifs that potentially enable electron bifurcation, we compiled 1,314 Etf protein sequences from genome sequence databases and subjected them to informatic and structural analyses. Etfs were identified in diverse archaea and bacteria, and they clustered into five distinct well-supported groups, based on their amino acid sequences. Gene neighborhood analyses indicated that these Etf group designations largely correspond to putative differences in functionality. Etfs with the demonstrated ability to bifurcate were found to form one group, suggesting that distinct conserved amino acid sequence motifs enable this capability. Indeed, structural modeling and sequence alignments revealed that identifying residues occur in the NADH- and FAD-binding regions of bifurcating Etfs. Collectively, a new classification scheme for Etf proteins that delineates putative bifurcating versus nonbifurcating members is presented and suggests that Etf-mediated bifurcation is associated with surprisingly diverse enzymes. 
IMPORTANCE Electron bifurcation has recently been recognized as an electron transfer mechanism used by microorganisms to maximize energy conservation. Bifurcating enzymes couple thermodynamically unfavorable reactions with thermodynamically favorable reactions in an overall spontaneous process. Here we show that the electron-transferring flavoprotein (Etf) enzyme family exhibits far greater diversity than previously recognized, and we provide a phylogenetic analysis that clearly delineates bifurcating versus nonbifurcating members of this family. Structural modeling of proteins within these groups reveals key differences between the bifurcating and nonbifurcating Etfs. PMID:28808132

  15. Use of CYP52A2A promoter to increase gene expression in yeast

    DOEpatents

    Craft, David L.; Wilson, C. Ron; Eirich, Dudley; Zhang, Yeyan

    2004-01-06

    A nucleic acid sequence including a CYP promoter operably linked to nucleic acid encoding a heterologous protein is provided to increase transcription of the nucleic acid. Expression vectors and host cells containing the nucleic acid sequence are also provided. The methods and compositions described herein are especially useful in the production of polycarboxylic acids by yeast cells.

  16. Biodegradation and Osteosarcoma Cell Cultivation on Poly(aspartic acid) Based Hydrogels.

    PubMed

    Juriga, Dávid; Nagy, Krisztina; Jedlovszky-Hajdú, Angéla; Perczel-Kovách, Katalin; Chen, Yong Mei; Varga, Gábor; Zrínyi, Miklós

    2016-09-14

    Development of novel biodegradable and biocompatible scaffold materials with optimal characteristics is important for both preclinical and clinical applications. The aim of the present study was to analyze the biodegradability of poly(aspartic acid)-based hydrogels, and to test their usability as scaffolds for MG-63 osteoblast-like cells. Poly(aspartic acid) was fabricated from poly(succinimide) and hydrogels were prepared using natural amines as cross-linkers (diaminobutane and cystamine). Disulfide bridges were cleaved to thiol groups and the polymer backbone was further modified with RGD sequence. Biodegradability of the hydrogels was evaluated by experiments on the base of enzymes and cell culture medium. Poly(aspartic acid) hydrogels possessing only disulfide bridges as cross-links proved to be degradable by collagenase I. The MG-63 cells showed healthy, fibroblast-like morphology on the double cross-linked and RGD modified hydrogels. Thiolated poly(aspartic acid) based hydrogels provide ideal conditions for adhesion, survival, proliferation, and migration of osteoblast-like cells. The highest viability was found on the thiolated PASP gels while the RGD motif had influence on compacted cluster formation of the cells. These biodegradable and biocompatible poly(aspartic acid)-based hydrogels are promising scaffolds for cell cultivation.

  17. Molecular cloning and expression of rat liver bile acid CoA ligase.

    PubMed

    Falany, Charles N; Xie, Xiaowei; Wheeler, James B; Wang, Jin; Smith, Michelle; He, Dongning; Barnes, Stephen

    2002-12-01

    Bile acid CoA ligase (BAL) is responsible for catalyzing the first step in the conjugation of bile acids with amino acids. Sequencing of putative rat liver BAL cDNAs identified a cDNA (rBAL-1) possessing a 51 nucleotide 5'-untranslated region, an open reading frame of 2,070 bases encoding a 690 aa protein with a molecular mass of 75,960 Da, and a 138 nucleotide 3'-nontranslated region followed by a poly(A) tail. Identity of the cDNA was established by: 1) the rBAL-1 open reading frame encoded peptides obtained by chemical sequencing of the purified rBAL protein; 2) expressed rBAL-1 protein comigrated with purified rBAL during SDS-polyacrylamide gel electrophoresis; and 3) rBAL-1 expressed in insect Sf9 cells had enzymatic properties that were comparable to the enzyme isolated from rat liver. Evidence for a relationship between fatty acid and bile acid metabolism is suggested by specific inhibition of rBAL-1 by cis-unsaturated fatty acids and its high homology to a human very long chain fatty acid CoA ligase. In summary, these results indicate that the cDNA for rat liver BAL has been isolated and expression of the rBAL cDNA in insect Sf9 cells results in a catalytically active enzyme capable of utilizing several different bile acids as substrates.

  18. Molecular and phylogenetic characterization of bovine coronavirus virus isolated from dairy cattle in Central Region, Thailand.

    PubMed

    Singasa, Kanokwan; Songserm, Taweesak; Lertwatcharasarakul, Preeda; Arunvipas, Pipat

    2017-10-01

    Bovine coronavirus (BCoV) is involved mainly in enteric infections in cattle. This study reports the first molecular detection of BCoV in a diarrhea outbreak in dairy cows in the Central Region, Thailand. BCoV was molecularly detected from bloody diarrheic cattle feces by using nested PCR. On agarose gel electrophoresis, three of the 25 diarrheic fecal samples yielded the desired amplicons of 488 base pairs, and sequencing confirmed that they contained BCoV. The sequence alignment indicated that the nucleotide and amino acid sequences of the three TWD isolates from Thailand were highly homologous to each other (the amino acid at position 39 of TWD1 and TWD3 was proline, but that of TWD2 was serine) and closely related to the OK-0514-3 strain (virulent respiratory strain; RBCoV). The amino acid sequence identities among TWD1, TWD2, TWD3, and the OK-0514-3 strain were 96.0 to 96.6%, with changes at T3I, H65N, D87G, H127Y, and Q136R. In addition, the phylogenetic tree of the hypervariable region of the S1 subunit of the BCoV spike glycoprotein gene, built using the 54 sequences generated, was composed of three major clades and showed that, by evolutionary distance, TWD1, TWD2, and TWD3 grouped together and were most similar to the OK-0514-3 strain (98.2 to 98.5% similarity). Further study will develop an ELISA assay for serologic detection of winter dysentery disease.

  19. Molecular cloning and characterization of beluga whale (Delphinapterus leucas) interleukin-1beta and tumor necrosis factor-alpha.

    PubMed Central

    Denis, F; Archambault, D

    2001-01-01

    Interleukin-1beta (IL-1beta) and tumor necrosis factor-alpha (TNF-alpha) are cytokines produced primarily by monocytes and macrophages with regulatory effects in inflammation and multiple aspects of the immune response. As yet, no molecular data have been reported for IL-1beta and TNF-alpha of the beluga whale. In this study, we cloned and determined the entire cDNA sequence encoding beluga whale IL-1beta and TNF-alpha. The genetic relationship of the cytokine sequences was then analyzed with those from several mammalian species, including the human and the pig. The homology of beluga whale IL-1beta nucleic acid and deduced amino acid sequences with those from these mammalian species ranged from 74.6 to 86.0% and 62.7 to 77.1%, respectively, whereas that of TNF-alpha varied from 79.3 to 90.8% and 75.3 to 87.7%, respectively. Phylogenetic analyses based on deduced amino acid sequences showed that the beluga whale IL-1beta and TNF-alpha were most closely related to those of the ruminant species (cattle, sheep, and deer). The beluga whale IL-1beta- and TNF-alpha-encoding sequences were thereafter successfully expressed in Escherichia coli as fusion proteins by using procaryotic expression vectors. The fusion proteins were used to produce beluga whale IL-1beta- and TNF-alpha-specific rabbit antisera. PMID:11768130

  20. Identification of Clinical Coryneform Bacterial Isolates: Comparison of Biochemical Methods and Sequence Analysis of 16S rRNA and rpoB Genes

    PubMed Central

    Adderson, Elisabeth E.; Boudreaux, Jan W.; Cummings, Jessica R.; Pounds, Stanley; Wilson, Deborah A.; Procop, Gary W.; Hayden, Randall T.

    2008-01-01

    We compared the relative levels of effectiveness of three commercial identification kits and three nucleic acid amplification tests for the identification of coryneform bacteria by testing 50 diverse isolates, including 12 well-characterized control strains and 38 organisms obtained from pediatric oncology patients at our institution. Between 33.3 and 75.0% of control strains were correctly identified to the species level by phenotypic systems or nucleic acid amplification assays. The most sensitive tests were the API Coryne system and amplification and sequencing of the 16S rRNA gene using primers optimized for coryneform bacteria, which correctly identified 9 of 12 control isolates to the species level, and all strains with a high-confidence call were correctly identified. Organisms not correctly identified were species not included in the test kit databases or not producing a pattern of reactions included in kit databases or which could not be differentiated among several genospecies based on reaction patterns. Nucleic acid amplification assays had limited abilities to identify some bacteria to the species level, and comparison of sequence homologies was complicated by the inclusion of allele sequences obtained from uncultivated and uncharacterized strains in databases. The utility of rpoB genotyping was limited by the small number of representative gene sequences that are currently available for comparison. The correlation between identifications produced by different classification systems was poor, particularly for clinical isolates. PMID:18160450

  1. Isolation and characterization of full-length putative alcohol dehydrogenase genes from polygonum minus

    NASA Astrophysics Data System (ADS)

    Hamid, Nur Athirah Abd; Ismail, Ismanizan

    2013-11-01

    Polygonum minus, locally known as Kesum, is an aromatic herb that is high in secondary metabolite content. Alcohol dehydrogenase (ADH) is an important enzyme that catalyzes the reversible oxidation of alcohol to aldehyde in the presence of NAD(P)(H) as a co-factor. The main focus of this research is to identify the ADH gene. Total RNA was extracted from leaves of P. minus that had been treated with 150 μM jasmonic acid. The full-length cDNA sequence of ADH was isolated via rapid amplification of cDNA ends (RACE). Subsequently, in silico analysis was conducted on the full-length cDNA sequence, and PCR was done on genomic DNA to determine the exon and intron organization. Two sequences of ADH, designated PmADH1 and PmADH2, were successfully isolated. Both sequences have an ORF of 801 bp, which encodes 266 aa residues. Nucleotide sequence comparison of PmADH1 and PmADH2 indicated that the sequences are highly similar in the ORF region but divergent in the 3' untranslated regions (UTR). The amino acid sequences differ at residue 107: PmADH1 contains a Gly (G) residue while PmADH2 contains a Cys (C) residue. The intron-exon organization of both sequences is also the same, with 3 introns and 4 exons. Based on in silico analysis, both sequences contain the "classical" short chain alcohol dehydrogenases/reductases ((c) SDRs) conserved domain. The results suggest that both sequences are members of the short chain alcohol dehydrogenase family.

  2. Hydroxamic acids as weak base indicators: protonation in strong acid media.

    PubMed

    García, B; Ibeas, S; Hoyuelos, F J; Leal, J M; Secco, F; Venturini, M

    2001-11-30

    The protonation equilibria of N-phenylbenzohydroxamic, benzohydroxamic, salicylhydroxamic, and N-p-tolylcinnamohydroxamic acids have been studied at 25 degrees C in concentrated sulfuric, hydrochloric, and perchloric acid media; the UV-vis spectral measurements were analyzed using the Hammett equation and the Bunnett-Olsen and excess acidity methods. The medium effects observed in the UV spectral curves were corrected with the Cox-Yates and vector analysis methods. The H(A) acidity function based on benzamides provided the best results. The range of variation of the solvation coefficient m is similar to that of amides, this indicating similar solvation requirements for amides and hydroxamic acids. For the same substrate, the observed variations of pK(BH)(+) with the mineral acid used was justified by formation of solvent-separated ion pairs; for the same mineral acid, the observed changes in pK(BH)(+) can be explained by the solvation of BH(+). The change of the pK(BH)(+) values was in reasonably good agreement with the sequence of the catalytic efficiency of the mineral acids used, HCl > H(2)SO(4) > HClO(4).

  3. What can we learn about lyssavirus genomes using 454 sequencing?

    PubMed

    Höper, Dirk; Finke, Stefan; Freuling, Conrad M; Hoffmann, Bernd; Beer, Martin

    2012-01-01

    The main task of individual project number four, "Whole genome sequencing, virus-host adaptation, and molecular epidemiological analyses of lyssaviruses", within the network "Lyssaviruses--a potential re-emerging public health threat", is to provide high quality complete genome sequences from lyssaviruses. These sequences are analysed in depth with regard to the diversity of the viral populations as to both quasi-species and so-called defective interfering RNAs. Moreover, the sequence data will facilitate further epidemiological analyses, will provide insight into the evolution of lyssaviruses and will be the basis for the design of novel nucleic acid based diagnostics. The first results presented here indicate that not only can high quality full-length lyssavirus genome sequences be generated, but efficient analysis of the viral population also becomes feasible.

  4. Novel poly-uridine insertion in the 3'UTR and E2 amino acid substitutions in a low virulent classical swine fever virus.

    PubMed

    Coronado, Liani; Liniger, Matthias; Muñoz-González, Sara; Postel, Alexander; Pérez, Lester Josue; Pérez-Simó, Marta; Perera, Carmen Laura; Frías-Lepoureau, Maria Teresa; Rosell, Rosa; Grundhoff, Adam; Indenbirken, Daniela; Alawi, Malik; Fischer, Nicole; Becher, Paul; Ruggli, Nicolas; Ganges, Llilianne

    2017-03-01

    In this study, we compared the virulence in weaner pigs of the Pinar del Rio isolate and the virulent Margarita strain. The latter caused the Cuban classical swine fever (CSF) outbreak of 1993. Our results showed that the Pinar del Rio virus isolated during an endemic phase is clearly of low virulence. We analysed the complete nucleotide sequence of the Pinar del Rio virus isolated after persistence in newborn piglets, as well as the genome sequence of the inoculum. The consensus genome sequence of the Pinar del Rio virus remained completely unchanged after 28 days of persistent infection in swine. More importantly, a unique poly-uridine tract was discovered in the 3'UTR of the Pinar del Rio virus, which was not found in the Margarita virus or any other known CSFV sequences. Based on RNA secondary structure prediction, the poly-uridine tract results in a long single-stranded intervening sequence (SS) between the stem-loops I and II of the 3'UTR, without major changes in the stem-loop structures when compared to the Margarita virus. The possible implications of this novel insertion on persistence and attenuation remain to be investigated. In addition, comparison of the amino acid sequence of the viral proteins Erns, E1, E2 and p7 of the Margarita and Pinar del Rio viruses showed that all non-conservative amino acid substitutions acquired by the Pinar del Rio isolate clustered in E2, with two of them being located within the B/C domain. Immunisation and cross-neutralisation experiments in pigs and rabbits suggest differences between these two viruses, which may be attributable to the amino acid differences observed in E2. Altogether, these data provide fresh insights into viral molecular features which might be associated with the attenuation and adaptation of CSFV for persistence in the field. Copyright © 2017 Elsevier B.V. All rights reserved.

  5. Optimizing the specificity of nucleic acid hybridization.

    PubMed

    Zhang, David Yu; Chen, Sherry Xi; Yin, Peng

    2012-01-22

    The specific hybridization of complementary sequences is an essential property of nucleic acids, enabling diverse biological and biotechnological reactions and functions. However, the specificity of nucleic acid hybridization is compromised for long strands, except near the melting temperature. Here, we analytically derived the thermodynamic properties of a hybridization probe that would enable near-optimal single-base discrimination and perform robustly across diverse temperature, salt and concentration conditions. We rationally designed 'toehold exchange' probes that approximate these properties, and comprehensively tested them against five different DNA targets and 55 spurious analogues with energetically representative single-base changes (replacements, deletions and insertions). These probes produced discrimination factors between 3 and 100+ (median, 26). Without retuning, our probes function robustly from 10 °C to 37 °C, from 1 mM Mg(2+) to 47 mM Mg(2+), and with nucleic acid concentrations from 1 nM to 5 µM. Experiments with RNA also showed effective single-base change discrimination.
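
    As background for the discrimination factors quoted above: in a simple two-state equilibrium model, the ratio of the equilibrium constants for binding the intended target and a single-base variant is exp(ΔΔG°/RT), where ΔΔG° is the free-energy penalty of the mismatch. The Python sketch below evaluates that ratio. It is a textbook relation under stated assumptions, not the authors' probe-design method, and the ΔΔG° value used is an arbitrary illustration rather than a number from the paper.

```python
import math

# Gas constant in kcal/(mol*K).
R_KCAL_PER_MOL_K = 1.987204e-3


def equilibrium_constant_ratio(ddg_kcal_per_mol: float, temp_celsius: float) -> float:
    """Ratio K_target / K_variant for a mismatch penalty ddg (positive = variant binds less well)."""
    temp_kelvin = temp_celsius + 273.15
    if temp_kelvin <= 0:
        raise ValueError("temperature must be above absolute zero")
    return math.exp(ddg_kcal_per_mol / (R_KCAL_PER_MOL_K * temp_kelvin))


if __name__ == "__main__":
    # Hypothetical 2 kcal/mol penalty, evaluated across the temperature range in the abstract.
    for temp in (10.0, 25.0, 37.0):
        print(temp, round(equilibrium_constant_ratio(2.0, temp), 1))
```

    For a fixed ΔΔG° the ratio falls as temperature rises, which is one reason robustness across temperatures is not automatic.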

  6. Streptococcal phosphoenolpyruvate-sugar phosphotransferase system: amino acid sequence and site of ATP-dependent phosphorylation of HPr

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Deutscher, J.; Pevec, B.; Beyreuther, K.

    1986-10-21

    The amino acid sequence of histidine-containing protein (HPr) from Streptococcus faecalis has been determined by direct Edman degradation of intact HPr and by amino acid sequence analysis of tryptic peptides, V8 proteolytic peptides, thermolytic peptides, and cyanogen bromide cleavage products. HPr from S. faecalis was found to contain 89 amino acid residues, corresponding to a molecular weight of 9438. The amino acid sequence of HPr from S. faecalis shows extended homology to the primary structure of HPr proteins from other bacteria. Besides the phosphoenolpyruvate-dependent phosphorylation of a histidyl residue in HPr, catalyzed by enzyme I of the bacterial phosphotransferase system, HPr was also found to be phosphorylated at a seryl residue in an ATP-dependent protein kinase catalyzed reaction. The site of ATP-dependent phosphorylation in HPr of S. faecalis has now been determined. (32P)P-Ser-HPr was digested with three different proteases, and in each case, a single labeled peptide was isolated. Following digestion with subtilisin, they obtained a peptide with the sequence -(P)Ser-Ile-Met-. Using chymotrypsin, they isolated a peptide with the sequence -Ser-Val-Asn-Leu-Lys-(P)Ser-Ile-Met-Gly-Val-Met-. The longest labeled peptide was obtained with V8 staphylococcal protease. According to amino acid analysis, this peptide contained 36 out of the 89 amino acid residues of HPr. The following sequence of 12 amino acid residues of the V8 peptide was determined: -Tyr-Lys-Gly-Lys-Ser-Val-Asn-Leu-Lys-(P)Ser-Ile-Met-. Thus, the site of ATP-dependent phosphorylation was determined to be Ser-46 within the primary structure of HPr.

  7. Sequence and pattern of expression of a bovine homologue of a human mitochondrial transport protein associated with Grave's disease.

    PubMed

    Fiermonte, G; Runswick, M J; Walker, J E; Palmieri, F

    1992-01-01

    A human cDNA has been isolated previously from a thyroid library with the aid of serum from a patient with Grave's disease. It encodes a protein belonging to the mitochondrial metabolite carrier family, referred to as the Grave's disease carrier protein (GDC). Using primers based on this sequence, overlapping cDNAs encoding the bovine homologue of the GDC have been isolated from total bovine heart poly(A)+ cDNA. The bovine protein is 18 amino acids shorter than the published human sequence, but if a frame shift requiring the removal of one nucleotide is introduced into the human cDNA sequence, the human and bovine proteins become identical in their C-terminal regions, and 308 out of 330 amino acids are conserved over their entire sequences. The bovine cDNA has been used to investigate the expression of the GDC in various bovine tissues. In the tissues that were examined, the GDC is most strongly expressed in the thyroid, but substantial amounts of its mRNA were also detected in liver, lung and kidney, and lesser amounts in heart and skeletal muscle.
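
    The conservation figure above, 308 of 330 residues, corresponds to roughly 93.3% identity. The Python sketch below shows one common way to compute percent identity from two pre-aligned sequences, dividing matches by the alignment length. The function name and the toy sequences are hypothetical, and other denominators (for example, the shorter sequence length) are also in use.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns holding the same residue in both sequences (gaps never match)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment must not be empty")
    matches = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    return 100.0 * matches / len(aligned_a)


if __name__ == "__main__":
    # Toy alignment, not the GDC sequences.
    print(percent_identity("MKTAY", "MKSAY"))
    # The count reported in the abstract, expressed as a percentage.
    print(round(100.0 * 308 / 330, 1))
```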

  8. PLAAC: a web and command-line application to identify proteins with prion-like amino acid composition.

    PubMed

    Lancaster, Alex K; Nutter-Upham, Andrew; Lindquist, Susan; King, Oliver D

    2014-09-01

    Prions are self-templating protein aggregates that stably perpetuate distinct biological states and are of keen interest to researchers in both evolutionary and biomedical science. The best understood prions are from yeast and have a prion-forming domain with strongly biased amino acid composition, most notably enriched for Q or N. PLAAC is a web application that scans protein sequences for domains with Prion-Like Amino Acid Composition. Users can upload sequence files, or paste sequences directly into a textbox. PLAAC ranks the input sequences by several summary scores and allows scores along sequences to be visualized. Text output files can be downloaded for further analyses, and visualizations saved in PDF and PNG formats. The web application is available at http://plaac.wi.mit.edu/. The Ruby-based web framework and the command-line software (implemented in Java, with visualization routines in R) are available at http://github.com/whitehead/plaac under the MIT license. All software can be run under OS X, Windows and Unix. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
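
    To make the idea of a Q/N-enriched domain concrete, the Python sketch below slides a fixed-width window along a protein sequence and reports the window with the highest fraction of Q and N residues. This is a deliberately simple stand-in and is not PLAAC's scoring method, which the abstract does not describe; the function name, the default window width, and the toy sequence are all assumptions made for illustration.

```python
from typing import Tuple


def richest_qn_window(sequence: str, window: int = 60) -> Tuple[int, float]:
    """Return (0-based start, Q/N fraction) of the first window with the most Q and N residues."""
    seq = sequence.upper()
    if window <= 0:
        raise ValueError("window must be positive")
    if len(seq) < window:
        raise ValueError("sequence is shorter than the window")
    # Count Q/N in the first window, then update the count incrementally.
    count = sum(1 for aa in seq[:window] if aa in "QN")
    best_start, best_count = 0, count
    for start in range(1, len(seq) - window + 1):
        if seq[start - 1] in "QN":           # residue leaving the window
            count -= 1
        if seq[start + window - 1] in "QN":  # residue entering the window
            count += 1
        if count > best_count:
            best_start, best_count = start, count
    return best_start, best_count / window


if __name__ == "__main__":
    toy = "MKTAYIAK" + "QNQNQNQNQN" + "LLVVAAGG"  # toy sequence, not a real protein
    print(richest_qn_window(toy, window=10))
```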

  9. Methods and compositions for regulating gene expression in plant cells

    NASA Technical Reports Server (NTRS)

    Dai, Shunhong (Inventor); Beachy, Roger N. (Inventor); Luis, Maria Isabel Ordiz (Inventor)

    2010-01-01

    Novel chimeric plant promoter sequences are provided, together with plant gene expression cassettes comprising such sequences. In certain preferred embodiments, the chimeric plant promoters comprise the BoxII cis element and/or derivatives thereof. In addition, novel transcription factors are provided, together with nucleic acid sequences encoding such transcription factors and plant gene expression cassettes comprising such nucleic acid sequences. In certain preferred embodiments, the novel transcription factors comprise the acidic domain, or fragments thereof, of the RF2a transcription factor. Methods for using the chimeric plant promoter sequences and novel transcription factors in regulating the expression of at least one gene of interest are provided, together with transgenic plants comprising such chimeric plant promoter sequences and novel transcription factors.

  10. The complete amino acid sequence of human skeletal-muscle fructose-bisphosphate aldolase.

    PubMed Central

    Freemont, P S; Dunbar, B; Fothergill-Gilmore, L A

    1988-01-01

    The complete amino acid sequence of human skeletal-muscle fructose-bisphosphate aldolase, comprising 363 residues, was determined. The sequence was deduced by automated sequencing of CNBr-cleavage, o-iodosobenzoic acid-cleavage, trypsin-digest and staphylococcal-proteinase-digest fragments. Comparison of the sequence with other class I aldolase sequences shows that the mammalian muscle isoenzyme is one of the most highly conserved enzymes known, with only about 2% of the residues changing per 100 million years. Non-mammalian aldolases appear to be evolving at the same rate as other glycolytic enzymes, with about 4% of the residues changing per 100 million years. Secondary-structure predictions are analysed in an accompanying paper [Sawyer, Fothergill-Gilmore & Freemont (1988) Biochem. J. 249, 789-793]. PMID:3355497

  11. A TWO-YEAR DOSE-RESPONSE STUDY OF LESION SEQUENCES DURING HEPATOCELLULAR CARCINOGENESIS IN THE MALE B6C3F1 MOUSE GIVEN THE DRINKING WATER CHEMICAL DICHLOROACETIC ACID

    EPA Science Inventory

    ABSTRACT

    Dichloroacetic acid (DCA) is carcinogenic to the B6C3F1 mouse and the F344 rat. Given the carcinogenic potential of DCA in rodent liver, and the known concentrations of this compound in drinking water, reliable biologically-based models to reduce the uncertai...

  12. GibbsCluster: unsupervised clustering and alignment of peptide sequences.

    PubMed

    Andreatta, Massimo; Alvarez, Bruno; Nielsen, Morten

    2017-07-03

    Receptor interactions with short linear peptide fragments (ligands) are at the base of many biological signaling processes. Conserved and information-rich amino acid patterns, commonly called sequence motifs, shape and regulate these interactions. Because of the properties of a receptor-ligand system or of the assay used to interrogate it, experimental data often contain multiple sequence motifs. GibbsCluster is a powerful tool for unsupervised motif discovery because it can simultaneously cluster and align peptide data. The GibbsCluster 2.0 presented here is an improved version incorporating insertions and deletions accounting for variations in motif length in the peptide input. In basic terms, the program takes as input a set of peptide sequences and clusters them into meaningful groups. It returns the optimal number of clusters it identified, together with the sequence alignment and sequence motif characterizing each cluster. Several parameters are available to customize cluster analysis, including adjustable penalties for small clusters and overlapping groups and a trash cluster to remove outliers. As an example application, we used the server to deconvolute multiple specificities in large-scale peptidome data generated by mass spectrometry. The server is available at http://www.cbs.dtu.dk/services/GibbsCluster-2.0. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  13. The complete genome sequence of a south Indian isolate of Rice tungro spherical virus reveals evidence of genetic recombination between distinct isolates.

    PubMed

    Sailaja, B; Anjum, Najreen; Patil, Yogesh K; Agarwal, Surekha; Malathi, P; Krishnaveni, D; Balachandran, S M; Viraktamath, B C; Mangrauthia, Satendra K

    2013-12-01

    In this study, the complete genome of a south Indian isolate of Rice tungro spherical virus (RTSV) from Andhra Pradesh (AP) was sequenced, and the predicted amino acid sequence was analysed. The RTSV RNA genome consists of 12,171 nt without the poly(A) tail, encoding a putative typical polyprotein of 3,470 amino acids. Furthermore, cleavage sites and sequence motifs of the polyprotein were predicted. Multiple alignment with other RTSV isolates showed a nucleotide sequence identity of 95% to east Indian isolates and 90% to Philippines isolates. A phylogenetic tree based on the complete genome sequence showed that Indian isolates clustered together, while the Vt6 and PhilA isolates of the Philippines formed two separate clusters. Twelve recombination events were detected in the RNA genome of RTSV using the Recombination Detection Program version 3. Recombination analysis suggested a significant role of the 5' end and central region of the genome in virus evolution. Further, AP and Odisha isolates appeared as important RTSV isolates involved in the diversification of this virus in India through recombination. The addition of the complete genome of the first south Indian isolate provided an opportunity to establish the molecular evolution of RTSV through recombination analysis and phylogenetic relationships.

  14. Cloning and sequencing of the allophycocyanin genes from Spirulina maxima (Cyanophyta)

    NASA Astrophysics Data System (ADS)

    Qin, Song; Hiroyuki, Kojima; Yoshikazu, Kawata; Shin-Ichi, Yano; Zeng, Cheng-Kui

    1998-03-01

    The genes coding for the α- and β-subunits of allophycocyanin (apcA and apcB) from the cyanophyte Spirulina maxima were cloned and sequenced. The results revealed 44.4% nucleotide sequence similarity and 30.4% similarity of the deduced amino acid sequences between them. The amino acid sequence identities between S. maxima and S. platensis are 99.4% for the α subunit and 100% for the β subunit.
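
    Percentages like these are usually computed from a pairwise alignment. The abstract does not say how the authors aligned the sequences or counted gaps, so the sketch below is only one common convention: it takes two already-aligned, equal-length strings and skips any column containing a gap. The function name and the gap rule are assumptions.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity between two pre-aligned sequences of equal length.

    Assumption: columns where either sequence has a gap are ignored.
    Other conventions (e.g. dividing by the full alignment length) give
    different numbers for the same alignment.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap or y == gap:
            continue  # skip gapped columns
        compared += 1
        if x.upper() == y.upper():
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared
```

    With this convention, "ATGC" against "ATGA" gives 75.0, and a gapped pair such as "AT-C" against "ATGC" gives 100.0 because the gapped column is not counted.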

  15. Identification of an additional member of the protein-tyrosine-phosphatase family: evidence for alternative splicing in the tyrosine phosphatase domain.

    PubMed Central

    Matthews, R J; Cahir, E D; Thomas, M L

    1990-01-01

    Protein-tyrosine-phosphatases (protein-tyrosine-phosphate phosphohydrolase, EC 3.1.3.48) have been implicated in the regulation of cell growth; however, to date few tyrosine phosphatases have been characterized. To identify additional family members, the cDNA for the human tyrosine phosphatase leukocyte common antigen (LCA; CD45) was used to screen, under low stringency, a mouse pre-B-cell cDNA library. Two cDNA clones were isolated and sequence analysis predicts a protein sequence of 793 amino acids. We have named the molecule LRP (LCA-related phosphatase). RNA transfer analysis indicates that the cDNAs were derived from a 3.2-kilobase mRNA. The LRP mRNA is transcribed in a wide variety of tissues. The predicted protein structure can be divided into the following structural features: a short 19-amino acid leader sequence, an exterior domain of 123 amino acids that is predicted to be highly glycosylated, a 24-amino acid membrane-spanning region, and a 627-amino acid cytoplasmic region. The cytoplasmic region contains two approximately 260-amino acid domains, each with homology to the tyrosine phosphatase family. One of the cDNA clones differed in that it had a 108-base-pair insertion that, while preserving the reading frame, would disrupt the first protein-tyrosine-phosphatase domain. Analysis of genomic DNA indicates that the insertion is due to an alternatively spliced exon. LRP appears to be evolutionarily conserved as a putative homologue has been identified in the invertebrate Styela plicata. PMID:2162042
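
    The figures in this abstract can be checked with simple arithmetic: the four described regions should add up to the predicted 793 residues, and an insertion preserves the reading frame only if its length is a multiple of three. The short sketch below does both checks; the dictionary keys and function name are illustrative labels, not terms from the paper.

```python
# Region lengths (in amino acids) as stated in the abstract.
regions = {
    "leader": 19,
    "exterior": 123,
    "membrane_spanning": 24,
    "cytoplasmic": 627,
}

# Sum of the regions; the abstract predicts 793 amino acids in total.
total_length = sum(regions.values())


def preserves_reading_frame(insert_bp):
    """An insertion keeps the frame only if its length is a multiple of 3."""
    return insert_bp % 3 == 0


# A 108-bp in-frame insertion corresponds to 108 / 3 additional codons.
extra_codons = 108 // 3
```

    Here the regions sum to 793, and the 108-base-pair insertion is in frame and adds 36 codons.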

  16. Biogeography of sulfur-oxidizing Acidithiobacillus populations in extremely acidic cave biofilms

    PubMed Central

    Jones, Daniel S; Schaperdoth, Irene; Macalady, Jennifer L

    2016-01-01

    Extremely acidic (pH 0–1.5) Acidithiobacillus-dominated biofilms known as snottites are found in sulfide-rich caves around the world. Given the extreme geochemistry and subsurface location of the biofilms, we hypothesized that snottite Acidithiobacillus populations would be genetically isolated. We therefore investigated biogeographic relationships among snottite Acidithiobacillus spp. separated by geographic distances ranging from meters to 1000s of kilometers. We determined genetic relationships among the populations using techniques with three levels of resolution: (i) 16S rRNA gene sequencing, (ii) 16S–23S intergenic transcribed spacer (ITS) region sequencing and (iii) multilocus sequence typing (MLST). We also used metagenomics to compare functional gene characteristics of select populations. Based on 16S rRNA genes, snottites in Italy and Mexico are dominated by different sulfur-oxidizing Acidithiobacillus spp. Based on ITS sequences, Acidithiobacillus thiooxidans strains from different cave systems in Italy are genetically distinct. Based on MLST of isolates from Italy, genetic distance is positively correlated with geographic distance both among and within caves. However, metagenomics revealed that At. thiooxidans populations from different cave systems in Italy have different sulfur oxidation pathways and potentially other significant differences in metabolic capabilities. In light of those genomic differences, we argue that the observed correlation between genetic and geographic distance among snottite Acidithiobacillus populations is partially explained by an evolutionary model in which separate cave systems were stochastically colonized by different ancestral surface populations, which then continued to diverge and adapt in situ. PMID:27187796

  17. PhAST: pharmacophore alignment search tool.

    PubMed

    Hähnke, Volker; Hofmann, Bettina; Grgat, Tomislav; Proschak, Ewgenij; Steinhilber, Dieter; Schneider, Gisbert

    2009-04-15

    We present a ligand-based virtual screening technique (PhAST) for rapid hit and lead structure searching in large compound databases. Molecules are represented as strings encoding the distribution of pharmacophoric features on the molecular graph. In contrast to other text-based methods using SMILES strings, we introduce a new form of text representation that describes the pharmacophore of molecules. This string representation opens the opportunity for revealing functional similarity between molecules by sequence alignment techniques in analogy to homology searching in protein or nucleic acid sequence databases. We favorably compared PhAST with other current ligand-based virtual screening methods in a retrospective analysis using the BEDROC metric. In a prospective application, PhAST identified two novel inhibitors of 5-lipoxygenase product formation with minimal experimental effort. This outcome demonstrates the applicability of PhAST to drug discovery projects and provides an innovative concept of sequence-based compound screening with substantial scaffold hopping potential. 2008 Wiley Periodicals, Inc.

  18. High-performance liquid chromatography study of the enantiomer separation of chrysanthemic acid and its analogous compounds on a terguride-based stationary phase.

    PubMed

    Dondi, M; Flieger, M; Olsovska, J; Polcaro, C M; Sinibaldi, M

    1999-10-29

    The direct enantioseparation of chrysanthemic acid [2,2-dimethyl-3-(2-methylpropenyl)-cyclopropanecarboxylic acid] and its halogen-substituted analogues was systematically studied by HPLC using a terguride-based chiral stationary phase in combination with a UV diode array and chiroptical detectors. Isomers with (1R) configuration always eluted before those with (1S) configuration. The elution sequence of cis- and trans-isomers was strongly affected by mobile phase pH, whereas the enantioselectivity remained the same. Conditions for the separation of all the enantiomers were also examined. This method was used to monitor the hydrolytic degradation products of Cyfluthrin (Baythroid) in soil under laboratory conditions.

  19. Direct Comparison of Amino Acid and Salt Interactions with Double-Stranded and Single-Stranded DNA from Explicit-Solvent Molecular Dynamics Simulations.

    PubMed

    Andrews, Casey T; Campbell, Brady A; Elcock, Adrian H

    2017-04-11

    Given the ubiquitous nature of protein-DNA interactions, it is important to understand the interaction thermodynamics of individual amino acid side chains for DNA. One way to assess these preferences is to perform molecular dynamics (MD) simulations. Here we report MD simulations of 20 amino acid side chain analogs interacting simultaneously with both a 70-base-pair double-stranded DNA and with a 70-nucleotide single-stranded DNA. The relative preferences of the amino acid side chains for dsDNA and ssDNA match well with values deduced from crystallographic analyses of protein-DNA complexes. The estimated apparent free energies of interaction for ssDNA, on the other hand, correlate well with previous simulation values reported for interactions with isolated nucleobases, and with experimental values reported for interactions with guanosine. Comparisons of the interactions with dsDNA and ssDNA indicate that, with the exception of the positively charged side chains, all types of amino acid side chain interact more favorably with ssDNA, with intercalation of aromatic and aliphatic side chains being especially notable. Analysis of the data on a base-by-base basis indicates that positively charged side chains, as well as sodium ions, preferentially bind to cytosine in ssDNA, and that negatively charged side chains, and chloride ions, preferentially bind to guanine in ssDNA. These latter observations provide a novel explanation for the lower salt dependence of DNA duplex stability in GC-rich sequences relative to AT-rich sequences.

  20. Use of linalool synthase in genetic engineering of scent production

    DOEpatents

    Pichersky, E.

    1998-12-15

    A purified S-linalool synthase polypeptide from Clarkia breweri is disclosed as is the recombinant polypeptide and nucleic acid sequences encoding the polypeptide. Also disclosed are antibodies immunoreactive with the purified peptide and with recombinant versions of the polypeptide. Methods of using the nucleic acid sequences, as well as methods of enhancing the smell and the flavor of plants expressing the nucleic acid sequences are also disclosed. 5 figs.
