acid sequences allowed: Topics by Science.gov

Sample records for acid sequences allowed

A robust and cost-effective approach to sequence and analyze complete genomes of small RNA viruses

USDA-ARS's Scientific Manuscript database

Background: Next-generation sequencing (NGS) allows ultra-deep sequencing of nucleic acids. The use of sequence-independent amplification of viral nucleic acids without utilization of target-specific primers provides advantages over traditional sequencing methods and allows detection of unsuspected ...
Seq2Logo: a method for construction and visualization of amino acid binding motifs and sequence profiles including sequence weighting, pseudo counts and two-sided representation of amino acid enrichment and depletion

PubMed Central

Thomsen, Martin Christen Frølund; Nielsen, Morten

2012-01-01

Seq2Logo is a web-based sequence logo generator. Sequence logos are a graphical representation of the information content stored in a multiple sequence alignment (MSA) and provide a compact and highly intuitive representation of the position-specific amino acid composition of binding motifs, active sites, etc. in biological sequences. Accurate generation of sequence logos is often compromised by sequence redundancy and a low number of observations. Moreover, most methods available for sequence logo generation focus on displaying the position-specific enrichment of amino acids, discarding the equally valuable information related to amino acid depletion. Seq2Logo aims to resolve these issues by allowing the user to include sequence weighting to correct for data redundancy, pseudo counts to correct for a low number of observations, and different logotype representations, each capturing different aspects of amino acid enrichment and depletion. Besides allowing input in the format of peptides and MSA, Seq2Logo accepts input as Blast sequence profiles, providing easy access for non-expert end-users to characterize and identify functionally conserved/variable amino acids in any given protein of interest. The output from the server is a sequence logo and a PSSM. Seq2Logo is available at http://www.cbs.dtu.dk/biotools/Seq2Logo (14 May 2012, date last accessed). PMID:22638583
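The information content that a sequence logo displays can be illustrated with a short Python sketch. It computes the plain Shannon information content per alignment column and deliberately leaves out the sequence weighting, pseudo counts and enrichment/depletion logotypes described in the record. The function names and the toy peptides are hypothetical and are not part of Seq2Logo.

```python
import math
from collections import Counter


def column_information(column, alphabet_size=20):
    """Shannon information content (bits) of one alignment column.

    Gaps ('-') are ignored. No pseudo counts, sequence weighting or
    small-sample correction are applied (a simplifying assumption).
    """
    residues = [c for c in column if c != "-"]
    if not residues:
        return 0.0
    counts = Counter(residues)
    total = len(residues)
    entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
    return math.log2(alphabet_size) - entropy


def logo_heights(alignment, alphabet_size=20):
    """Per-position information content for equal-length aligned sequences."""
    return [column_information(col, alphabet_size) for col in zip(*alignment)]


# Made-up toy peptides, used only to exercise the functions.
peptides = ["ACDK", "ACEK", "ACDR", "ACEK"]
print([round(h, 3) for h in logo_heights(peptides)])
```

A fully conserved column reaches the maximum of log2(20) bits for a 20-letter alphabet, while a column split evenly between two residues loses exactly one bit.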
Dipeptide Sequence Determination: Analyzing Phenylthiohydantoin Amino Acids by HPLC

NASA Astrophysics Data System (ADS)

Barton, Janice S.; Tang, Chung-Fei; Reed, Steven S.

2000-02-01

Amino acid composition and sequence determination, important techniques for characterizing peptides and proteins, are essential for predicting conformation and studying sequence alignment. This experiment presents improved, fundamental methods of sequence analysis for an upper-division biochemistry laboratory. Working in pairs, students use the Edman reagent to prepare phenylthiohydantoin derivatives of amino acids for determination of the sequence of an unknown dipeptide. With a single HPLC technique, students identify both the N-terminal amino acid and the composition of the dipeptide. This method yields good precision of retention times and allows use of a broad range of amino acids as components of the dipeptide. Students learn fundamental principles and techniques of sequence analysis and HPLC.
37 CFR 1.824 - Form and format for nucleotide and/or amino acid sequence submissions in computer readable form.

Code of Federal Regulations, 2010 CFR

2010-07-01

... 37 Patents, Trademarks, and Copyrights 1 2010-07-01 2010-07-01 false Form and format for... And/or Amino Acid Sequences § 1.824 Form and format for nucleotide and/or amino acid sequence... Code for Information Interchange (ASCII) text. No other formats shall be allowed. (3) The computer...
NullSeq: A Tool for Generating Random Coding Sequences with Desired Amino Acid and GC Contents.

PubMed

Liu, Sophia S; Hockenberry, Adam J; Lancichinetti, Andrea; Jewett, Michael C; Amaral, Luís A N

2016-11-01

The existence of over- and under-represented sequence motifs in genomes provides evidence of selective evolutionary pressures on biological mechanisms such as transcription, translation, ligand-substrate binding, and host immunity. In order to accurately identify motifs and other genome-scale patterns of interest, it is essential to be able to generate accurate null models that are appropriate for the sequences under study. While many tools have been developed to create random nucleotide sequences, protein coding sequences are subject to a unique set of constraints that complicates the process of generating appropriate null models. There are currently no tools available that allow users to create random coding sequences with specified amino acid composition and GC content for the purpose of hypothesis testing. Using the principle of maximum entropy, we developed a method that generates unbiased random sequences with pre-specified amino acid and GC content, which we have developed into a python package. Our method is the simplest way to obtain maximally unbiased random sequences that are subject to GC usage and primary amino acid sequence constraints. Furthermore, this approach can easily be expanded to create unbiased random sequences that incorporate more complicated constraints such as individual nucleotide usage or even di-nucleotide frequencies. The ability to generate correctly specified null models will allow researchers to accurately identify sequence motifs which will lead to a better understanding of biological processes as well as more effective engineering of biological systems.
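The general idea of drawing random coding sequences under a GC constraint can be sketched in Python. This is not the NullSeq package or its interface: the sketch fixes the amino acid sequence (rather than only its composition) and draws each synonymous codon with a weight of exp(lam * GC count), the exponential form that a maximum-entropy distribution takes under a constraint on expected GC content. The parameter `lam` and all function names are hypothetical.

```python
import math
import random

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TO_AA = dict(zip(CODONS, AMINO))

# Map each amino acid (and '*' for stop) to its synonymous codons.
SYNONYMS = {}
for codon, aa in CODON_TO_AA.items():
    SYNONYMS.setdefault(aa, []).append(codon)


def gc_count(seq):
    """Number of G or C bases in a nucleotide string."""
    return sum(base in "GC" for base in seq)


def random_cds(protein, lam=0.0, rng=random):
    """Sample one coding sequence that translates to `protein`.

    lam = 0 gives uniform synonymous codon choice; lam > 0 favours
    GC-rich codons and lam < 0 favours AT-rich codons.
    """
    chosen = []
    for aa in protein:
        options = SYNONYMS[aa]
        weights = [math.exp(lam * gc_count(c)) for c in options]
        chosen.append(rng.choices(options, weights=weights, k=1)[0])
    return "".join(chosen)


def translate(cds):
    """Translate a coding sequence whose length is a multiple of three."""
    return "".join(CODON_TO_AA[cds[i:i + 3]] for i in range(0, len(cds), 3))
```

In practice `lam` would be tuned, for example by bisection, until the expected GC content matches the target; that step is omitted here.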
Using Maximum Entropy to Find Patterns in Genomes

NASA Astrophysics Data System (ADS)

Liu, Sophia; Hockenberry, Adam; Lancichinetti, Andrea; Jewett, Michael; Amaral, Luis

The existence of over- and under-represented sequence motifs in genomes provides evidence of selective evolutionary pressures on biological mechanisms such as transcription, translation, ligand-substrate binding, and host immunity. To accurately identify motifs and other genome-scale patterns of interest, it is essential to be able to generate accurate null models that are appropriate for the sequences under study. There are currently no tools available that allow users to create random coding sequences with specified amino acid composition and GC content. Using the principle of maximum entropy, we developed a method that generates unbiased random sequences with pre-specified amino acid and GC content. Our method is the simplest way to obtain maximally unbiased random sequences that are subject to GC usage and primary amino acid sequence constraints. This approach can also easily be expanded to create unbiased random sequences that incorporate more complicated constraints such as individual nucleotide usage or even di-nucleotide frequencies. The ability to generate correctly specified null models will allow researchers to accurately identify sequence motifs which will lead to a better understanding of biological processes. National Institute of General Medical Science, Northwestern University Presidential Fellowship, National Science Foundation, David and Lucile Packard Foundation, Camille Dreyfus Teacher Scholar Award.
Ultra high-throughput nucleic acid sequencing as a tool for virus discovery in the turkey gut.

USDA-ARS's Scientific Manuscript database

Recently, the use of the next generation of nucleic acid sequencing technology (i.e., 454 pyrosequencing, as developed by Roche/454 Life Sciences) has allowed an in-depth look at the uncultivated microorganisms present in complex environmental samples, including samples with agricultural importance....
WEB-server for search of a periodicity in amino acid and nucleotide sequences

NASA Astrophysics Data System (ADS)

Frenkel, F. E.; Skryabin, K. G.; Korotkov, E. V.

2017-12-01

A new web server (http://victoria.biengi.ac.ru/splinter/login.php) was designed and developed to search for periodicity in nucleotide and amino acid sequences. The web server is based on a new mathematical method for finding multiple alignments, which relies on optimization of position weight matrices and on two-dimensional dynamic programming. This approach allows the construction of multiple alignments of weakly similar amino acid and nucleotide sequences that have accumulated more than 1.5 substitutions per amino acid or nucleotide, without performing pairwise sequence comparisons. The article examines the principles of the web server operation and two examples of studying amino acid and nucleotide sequences, as well as the information that can be obtained using the web server.
CDSbank: taxonomy-aware extraction, selection, renaming and formatting of protein-coding DNA or amino acid sequences.

PubMed

Hazes, Bart

2014-02-28

Protein-coding DNA sequences and their corresponding amino acid sequences are routinely used to study relationships between sequence, structure, function, and evolution. The rapidly growing size of sequence databases increases the power of such comparative analyses but it makes it more challenging to prepare high quality sequence data sets with control over redundancy, quality, completeness, formatting, and labeling. Software tools for some individual steps in this process exist but manual intervention remains a common and time consuming necessity. CDSbank is a database that stores both the protein-coding DNA sequence (CDS) and amino acid sequence for each protein annotated in Genbank. CDSbank also stores Genbank feature annotation, a flag to indicate incomplete 5' and 3' ends, full taxonomic data, and a heuristic to rank the scientific interest of each species. This rich information allows fully automated data set preparation with a level of sophistication that aims to meet or exceed manual processing. Defaults ensure ease of use for typical scenarios while allowing great flexibility when needed. Access is via a free web server at http://hazeslab.med.ualberta.ca/CDSbank/. CDSbank presents a user-friendly web server to download, filter, format, and name large sequence data sets. Common usage scenarios can be accessed via pre-programmed default choices, while optional sections give full control over the processing pipeline. Particular strengths are: extract protein-coding DNA sequences just as easily as amino acid sequences, full access to taxonomy for labeling and filtering, awareness of incomplete sequences, and the ability to take one protein sequence and extract all synonymous CDS or identical protein sequences in other species. Finally, CDSbank can also create labeled property files to, for instance, annotate or re-label phylogenetic trees.
DNA tetrominoes: the construction of DNA nanostructures using self-organised heterogeneous deoxyribonucleic acids shapes.

PubMed

Ong, Hui San; Rahim, Mohd Syafiq; Firdaus-Raih, Mohd; Ramlan, Effirul Ikhwan

2015-01-01

The unique programmability of nucleic acids offers an alternative way of constructing excitable and functional nanostructures. This work introduces an autonomous protocol to construct DNA Tetris shapes (L-Shape, B-Shape, T-Shape and I-Shape) using modular DNA blocks. The protocol exploits the large number of sequence combinations available from the nucleic acid alphabet, thus allowing for diversity in designing various DNA nanostructures. Instead of a deterministic set of sequences corresponding to a particular design, the protocol promotes a large pool of DNA shapes that can assemble to conform to any desired structure. By utilising evolutionary programming in the design stage, DNA blocks are subjected to processes such as sequence insertion, deletion and base shifting in order to enrich the diversity of the resulting shapes based on a set of cascading filters. The optimisation algorithm allows mutation to be applied indefinitely to the candidate sequences until they comply with all four fitness criteria. Generated candidates from the protocol are in agreement with the filter cascades and thermodynamic simulation. Further validation using gel electrophoresis indicated the formation of the designed shapes, supporting the plausibility of constructing DNA nanostructures in a more hierarchical, modular, and interchangeable manner.
Method for analyzing nucleic acids by means of a substrate having a microchannel structure containing immobilized nucleic acid probes

DOEpatents

Ramsey, J. Michael; Foote, Robert S.

2003-12-09

A method and apparatus for analyzing nucleic acids includes immobilizing nucleic probes at specific sites within a microchannel structure and moving target nucleic acids into proximity to the probes in order to allow hybridization and fluorescence detection of specific target sequences.
Method for analyzing nucleic acids by means of a substrate having a microchannel structure containing immobilized nucleic acid probes

DOEpatents

Ramsey, J. Michael; Foote, Robert S.

2002-01-01

A method and apparatus for analyzing nucleic acids includes immobilizing nucleic probes at specific sites within a microchannel structure and moving target nucleic acids into proximity to the probes in order to allow hybridization and fluorescence detection of specific target sequences.
A Novel Cylindrical Representation for Characterizing Intrinsic Properties of Protein Sequences.

PubMed

Yu, Jia-Feng; Dou, Xiang-Hua; Wang, Hong-Bo; Sun, Xiao; Zhao, Hui-Ying; Wang, Ji-Hua

2015-06-22

The composition and sequence order of amino acid residues are the two most important characteristics to describe a protein sequence. Graphical representations facilitate visualization of biological sequences and produce biologically useful numerical descriptors. In this paper, we propose a novel cylindrical representation by placing the 20 amino acid residue types in a circle and sequence positions along the z axis. This representation allows visualization of the composition and sequence order of amino acids at the same time. Ten numerical descriptors and one weighted numerical descriptor have been developed to quantitatively describe intrinsic properties of protein sequences on the basis of the cylindrical model. Their applications to similarity/dissimilarity analysis of nine ND5 proteins indicated that these numerical descriptors are more effective than several classical numerical matrices. Thus, the cylindrical representation obtained here provides a new useful tool for visualizing and characterizing protein sequences. An online server is available at http://biophy.dzu.edu.cn:8080/CNumD/input.jsp .
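The geometric idea (residue types on a circle, sequence positions along the z axis) can be sketched in Python as follows. The record does not say in which order the 20 residue types are placed on the circle, so the alphabetical order, the unit radius and the function name below are assumptions made only for illustration. The paper's numerical descriptors are not reproduced.

```python
import math

# Assumed ordering of the 20 residue types around the circle (alphabetical).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def cylinder_coordinates(sequence, radius=1.0):
    """Map each residue to (x, y, z) on a cylinder.

    The angle encodes the residue type and z encodes the 1-based
    sequence position.
    """
    step = 2 * math.pi / len(AMINO_ACIDS)
    points = []
    for z, residue in enumerate(sequence, start=1):
        angle = step * AMINO_ACIDS.index(residue)
        points.append((radius * math.cos(angle), radius * math.sin(angle), float(z)))
    return points


for point in cylinder_coordinates("ACD"):
    print(tuple(round(v, 3) for v in point))
```

Plotting these points in three dimensions shows composition (which angles are occupied) and sequence order (the z coordinate) at the same time.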
Quantitative thermodynamic predication of interactions between nucleic acid and non-nucleic acid species using Microsoft excel.

PubMed

Zou, Jiaqi; Li, Na

2013-09-01

Proper design of nucleic acid sequences is crucial for many applications. We have previously established a thermodynamics-based quantitative model to help design aptamer-based nucleic acid probes by predicting equilibrium concentrations of all interacting species. To facilitate customization of this thermodynamic model for different applications, here we present a generic and easy-to-use platform to implement the algorithm of the model with Microsoft® Excel formulas and VBA (Visual Basic for Applications) macros. Two Excel spreadsheets have been developed: one for applications involving only nucleic acid species, the other for applications involving both nucleic acid and non-nucleic acid species. The spreadsheets take the nucleic acid sequences and the initial concentrations of all species as input, guide the user to retrieve the necessary thermodynamic constants, and finally calculate equilibrium concentrations for all species in various bound and unbound conformations. The validity of both spreadsheets has been verified by comparing the modeling results with the experimental results on nucleic acid sequences reported in the literature. The Excel-based platform described here will allow biomedical researchers to rationalize the sequence design of nucleic acid probes using thermodynamics-based modeling even without relevant theoretical and computational skills. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
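What "calculating equilibrium concentrations" means can be shown with the simplest possible case, a single 1:1 binding equilibrium A + B ⇌ AB with dissociation constant Kd = [A][B]/[AB]. The Python sketch below is not the authors' model or their spreadsheets, which handle many species and conformations. The function names are hypothetical. Substituting [A] = A_total − x and [B] = B_total − x with x = [AB] gives the quadratic x² − (A_total + B_total + Kd)·x + A_total·B_total = 0, whose smaller root is the physical solution.

```python
import math


def complex_concentration(a_total, b_total, kd):
    """Equilibrium [AB] for A + B <=> AB (all inputs in the same units)."""
    s = a_total + b_total + kd
    # Smaller root of x^2 - s*x + a_total*b_total = 0.
    return (s - math.sqrt(s * s - 4.0 * a_total * b_total)) / 2.0


def free_concentrations(a_total, b_total, kd):
    """Return ([A], [B], [AB]) at equilibrium."""
    ab = complex_concentration(a_total, b_total, kd)
    return a_total - ab, b_total - ab, ab


a, b, ab = free_concentrations(2.0, 2.0, 1.0)
print(a, b, ab)
```

With both totals equal to 2 and Kd equal to 1 (arbitrary but consistent units), the result satisfies [A][B]/[AB] = Kd, which is a quick sanity check on the formula.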
Phylogenetic analysis of β-xylanase SRXL1 of Sporisorium reilianum and its relationship with families (GH10 and GH11) of Ascomycetes and Basidiomycetes

PubMed Central

Álvarez-Cervantes, Jorge; Díaz-Godínez, Gerardo; Mercado-Flores, Yuridia; Gupta, Vijai Kumar; Anducho-Reyes, Miguel Angel

2016-01-01

In this paper, the amino acid sequence of the β-xylanase SRXL1 of Sporisorium reilianum, a pathogenic fungus of maize, was used as a model protein to find its phylogenetic relationship with other xylanases of Ascomycetes and Basidiomycetes; the information obtained allowed a hypothesis of monophyly and of biological role to be established. A total of 84 amino acid sequences of β-xylanase obtained from the GenBank database were used. Analysis of higher-level groupings in the Pfam database showed that the proteins under study were classified into the GH10 and GH11 families, based on the regions of highly conserved amino acids, 233–318 and 180–193 respectively, where glutamate residues are responsible for the catalysis. PMID:27040368
Nucleic Acid Detection Methods

DOEpatents

Smith, Cassandra L.; Yaar, Ron; Szafranski, Przemyslaw; Cantor, Charles R.

1998-05-19

The invention relates to methods for rapidly determining the sequence and/or length of a target sequence. The target sequence may be a series of known or unknown repeat sequences which are hybridized to an array of probes. The hybridized array is digested with a single-strand nuclease and free 3'-hydroxyl groups are extended with a nucleic acid polymerase. Nuclease-cleaved heteroduplexes can be easily distinguished from nuclease-uncleaved heteroduplexes by differential labeling. Probes and target can be differentially labeled with detectable labels. Matched target can be detected by cleaving resulting loops from the hybridized target and creating free 3'-hydroxyl groups. These groups are recognized and extended by polymerases added into the reaction system, which also adds or releases one label into solution. The resulting products can be analyzed using either solid-phase or solution methods. These methods can be used to detect characteristic nucleic acid sequences, to determine target sequence and to screen for genetic defects and disorders. Assays can be conducted on solid surfaces, allowing for multiple reactions to be conducted in parallel and, if desired, automated.
Poliovirus replication proteins: RNA sequence encoding P3-1b and the sites of proteolytic processing

DOE Office of Scientific and Technical Information (OSTI.GOV)

Semler, B.L.; Anderson, C.W.; Kitamura, N.

1981-06-01

A partial amino-terminal amino acid sequence of each of the major proteins encoded by the replicase region of the poliovirus genome has been determined. A comparison of this sequence information with the amino acid sequence predicted from the RNA sequence that has been determined for the 3' region of the poliovirus genome has allowed us to locate precisely the proteolytic cleavage sites at which the initial polyprotein is processed to create the poliovirus products P3-1b (NCVP1b), P3-2 (NCVP2), P3-4b (NCVP4b), and P3-7c (NCVP7c). For each of these products, as well as for the small genome-linked protein VPg, proteolytic cleavage occurs between a glutamine and a glycine residue to create the amino terminus of each protein. This result suggests that a single proteinase may be responsible for all of these cleavages. The sequence data also allow the precise positioning of the genome-linked protein VPg within the precursor P3-1b just proximal to the amino terminus of polypeptide P3-2.
Sequence Alignment to Predict Across Species Susceptibility ...

EPA Pesticide Factsheets

Conservation of a molecular target across species can be used as a line-of-evidence to predict the likelihood of chemical susceptibility. The web-based Sequence Alignment to Predict Across Species Susceptibility (SeqAPASS) tool was developed to simplify, streamline, and quantitatively assess protein sequence/structural similarity across taxonomic groups as a means to predict relative intrinsic susceptibility. The intent of the tool is to allow for evaluation of any potential protein target, so it is amenable to variable degrees of protein characterization, depending on available information about the chemical/protein interaction and the molecular target itself. To allow for flexibility in the analysis, a layered strategy was adopted for the tool. The first level of the SeqAPASS analysis compares primary amino acid sequences to a query sequence, calculating a metric for sequence similarity (including detection of candidate orthologs), the second level evaluates sequence similarity within selected domains (e.g., ligand-binding domain, DNA binding domain), and the third level of analysis compares individual amino acid residue positions identified as being of importance for protein conformation and/or ligand binding upon chemical perturbation. Each level of the SeqAPASS analysis provides increasing evidence to apply toward rapid, screening-level assessments of probable cross species susceptibility. Such analyses can support prioritization of chemicals for further ev
A Generative Angular Model of Protein Structure Evolution

PubMed Central

Golden, Michael; García-Portugués, Eduardo; Sørensen, Michael; Mardia, Kanti V.; Hamelryck, Thomas; Hein, Jotun

2017-01-01

Recently described stochastic models of protein evolution have demonstrated that the inclusion of structural information in addition to amino acid sequences leads to a more reliable estimation of evolutionary parameters. We present a generative, evolutionary model of protein structure and sequence that is valid on a local length scale. The model concerns the local dependencies between sequence and structure evolution in a pair of homologous proteins. The evolutionary trajectory between the two structures in the protein pair is treated as a random walk in dihedral angle space, which is modeled using a novel angular diffusion process on the two-dimensional torus. Coupling sequence and structure evolution in our model allows for modeling both “smooth” conformational changes and “catastrophic” conformational jumps, conditioned on the amino acid changes. The model has interpretable parameters and is comparatively more realistic than previous stochastic models, providing new insights into the relationship between sequence and structure evolution. For example, using the trained model we were able to identify an apparent sequence–structure evolutionary motif present in a large number of homologous protein pairs. The generative nature of our model enables us to evaluate its validity and its ability to simulate aspects of protein evolution conditioned on an amino acid sequence, a related amino acid sequence, a related structure or any combination thereof. PMID:28453724
Conformational Entropy of Intrinsically Disordered Proteins from Amino Acid Triads

PubMed Central

Baruah, Anupaul; Rani, Pooja; Biswas, Parbati

2015-01-01

This work quantitatively characterizes intrinsic disorder in proteins in terms of sequence composition and backbone conformational entropy. Analysis of the normalized relative composition of the amino acid triads highlights a distinct boundary between globular and disordered proteins. The conformational entropy is calculated from the dihedral angles of the middle amino acid in the amino acid triad for the conformational ensemble of the globular, partially and completely disordered proteins relative to the non-redundant database. Both Monte Carlo (MC) and Molecular Dynamics (MD) simulations are used to characterize the conformational ensemble of the representative proteins of each group. The results show that the globular proteins span approximately half of the allowed conformational states in the Ramachandran space, while the amino acid triads in disordered proteins sample the entire range of the allowed dihedral angle space following Flory’s isolated-pair hypothesis. Therefore, only the sequence information in terms of the relative amino acid triad composition may be sufficient to predict protein disorder and the backbone conformational entropy, even in the absence of well-defined structure. The predicted entropies are found to agree with those calculated using mutual information expansion and the histogram method. PMID:26138206
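The sequence-side quantity in this record, the composition of amino acid triads, is easy to illustrate. The Python sketch below counts overlapping triads in one sequence and returns their relative frequencies. The normalization against a non-redundant reference database and the entropy calculation described in the abstract are not reproduced, and the function name is hypothetical.

```python
from collections import Counter


def triad_composition(sequence):
    """Relative frequency of each overlapping amino acid triad.

    Returns an empty dict for sequences shorter than three residues.
    """
    triads = [sequence[i:i + 3] for i in range(len(sequence) - 2)]
    total = len(triads)
    counts = Counter(triads)
    return {triad: n / total for triad, n in counts.items()}


# Made-up toy sequence, used only to show the output shape.
print(triad_composition("ACDACD"))
```

Dividing each frequency by the corresponding frequency in a reference set would give a relative composition of the kind the abstract mentions; the choice of reference set is left open here.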

Identification of multiple mRNA and DNA sequences from small tissue samples isolated by laser-assisted microdissection.

PubMed

Bernsen, M R; Dijkman, H B; de Vries, E; Figdor, C G; Ruiter, D J; Adema, G J; van Muijen, G N

1998-10-01

Molecular analysis of small tissue samples has become increasingly important in biomedical studies. Using a laser dissection microscope and modified nucleic acid isolation protocols, we demonstrate that multiple mRNA as well as DNA sequences can be identified from a single-cell sample. In addition, we show that the specificity of procurement of tissue samples is not compromised by smear contamination resulting from scraping of the microtome knife during sectioning of lesions. The procedures described herein thus allow for efficient RT-PCR or PCR analysis of multiple nucleic acid sequences from small tissue samples obtained by laser-assisted microdissection.
Nucleic acid detection methods

DOEpatents

Smith, C.L.; Yaar, R.; Szafranski, P.; Cantor, C.R.

1998-05-19

The invention relates to methods for rapidly determining the sequence and/or length of a target sequence. The target sequence may be a series of known or unknown repeat sequences which are hybridized to an array of probes. The hybridized array is digested with a single-strand nuclease and free 3'-hydroxyl groups are extended with a nucleic acid polymerase. Nuclease-cleaved heteroduplexes can be easily distinguished from nuclease-uncleaved heteroduplexes by differential labeling. Probes and target can be differentially labeled with detectable labels. Matched target can be detected by cleaving resulting loops from the hybridized target and creating free 3'-hydroxyl groups. These groups are recognized and extended by polymerases added into the reaction system, which also adds or releases one label into solution. The resulting products can be analyzed using either solid-phase or solution methods. These methods can be used to detect characteristic nucleic acid sequences, to determine target sequence and to screen for genetic defects and disorders. Assays can be conducted on solid surfaces, allowing for multiple reactions to be conducted in parallel and, if desired, automated. 18 figs.
DNA–DNA kissing complexes as a new tool for the assembly of DNA nanostructures

PubMed Central

Barth, Anna; Kobbe, Daniela; Focke, Manfred

2016-01-01

Kissing-loop annealing of nucleic acids occurs in nature in several viruses and in prokaryotic replication, among other circumstances. Nucleobases of two nucleic acid strands (loops) interact with each other, although the two strands cannot wrap around each other completely because of the adjacent double-stranded regions (stems). In this study, we exploited DNA kissing-loop interaction for nanotechnological application. We functionalized the vertices of DNA tetrahedrons with DNA stem-loop sequences. The complementary loop sequence design allowed the hybridization of different tetrahedrons via kissing-loop interaction, which might be further exploited for nanotechnology applications like cargo transport and logical elements. Importantly, we were able to manipulate the stability of those kissing-loop complexes based on the choice and concentration of cations, the temperature and the number of complementary loops per tetrahedron either at the same or at different vertices. Moreover, variations in loop sequences allowed the characterization of necessary sequences within the loop as well as additional stability control of the kissing complexes. Therefore, the properties of the presented nanostructures make them an important tool for DNA nanotechnology. PMID:26773051
Homology analyses of the protein sequences of fatty acid synthases from chicken liver, rat mammary gland, and yeast

DOE Office of Scientific and Technical Information (OSTI.GOV)

Chang, Soo-Ik; Hammes, G.G.

1989-11-01

Homology analyses of the protein sequences of chicken liver and rat mammary gland fatty acid synthases were carried out. The amino acid sequences of the chicken and rat enzymes are 67% identical. If conservative substitutions are allowed, 78% of the amino acids are matched. A region of low homology exists between the functional domains, in particular around amino acid residues 1059-1264 of the chicken enzyme. Homologies between the active sites of chicken and rat and of chicken and yeast enzymes have been analyzed by an alignment method. A high degree of homology exists between the active sites of the chicken and rat enzymes. However, the chicken and yeast enzymes show a lower degree of homology. The NADPH-binding dinucleotide folds of the β-ketoacyl reductase and the enoyl reductase sites were identified by comparison with a known consensus sequence for the NADP- and FAD-binding dinucleotide folds. The active sites of all of the enzymes are primarily in hydrophobic regions of the protein. This study suggests that the genes for the functional domains of fatty acid synthase were originally separated, and these genes were connected to each other by using different connecting nucleotide sequences in different species. An alternative explanation for the differences in rat and chicken is a common ancestry and mutations in the joining regions during evolution.
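The distinction between percent identity and percent match "if conservative substitutions are allowed" can be made concrete with a small Python sketch. The record does not say which substitutions were treated as conservative, so the residue groups below are an illustrative assumption. The convention of skipping gapped columns and the function name are also assumptions.

```python
# Illustrative grouping of residues treated as conservative substitutions
# (an assumption; the source does not specify its grouping).
CONSERVATIVE_GROUPS = ["ILVM", "FWY", "KRH", "DE", "NQ", "ST", "AG"]
GROUP_OF = {aa: i for i, group in enumerate(CONSERVATIVE_GROUPS) for aa in group}


def identity_and_similarity(seq_a, seq_b):
    """Percent identity and percent similarity of two aligned sequences.

    Columns containing a gap ('-') in either sequence are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = identical = similar = 0
    for a, b in zip(seq_a, seq_b):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a == b:
            identical += 1
            similar += 1
        elif a in GROUP_OF and GROUP_OF[a] == GROUP_OF.get(b):
            similar += 1
    if compared == 0:
        return 0.0, 0.0
    return 100.0 * identical / compared, 100.0 * similar / compared


print(identity_and_similarity("ACDK", "ACER"))
```

Similarity is always at least as large as identity, as with the 67% and 78% figures in the record. The exact values depend on the grouping and on how gaps enter the denominator.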
Sequence similarity is more relevant than species specificity in probabilistic backtranslation.

PubMed

Ferro, Alfredo; Giugno, Rosalba; Pigola, Giuseppe; Pulvirenti, Alfredo; Di Pietro, Cinzia; Purrello, Michele; Ragusa, Marco

2007-02-21

Backtranslation is the process of decoding a sequence of amino acids into the corresponding codons. All synthetic gene design systems include a backtranslation module. The degeneracy of the genetic code makes backtranslation potentially ambiguous since most amino acids are encoded by multiple codons. The common approach to overcome this difficulty is based on imitation of codon usage within the target species. This paper describes EasyBack, a new parameter-free, fully-automated software for backtranslation using Hidden Markov Models. EasyBack is not based on imitation of codon usage within the target species, but instead uses a sequence-similarity criterion. The model is trained with a set of proteins with known cDNA coding sequences, constructed from the input protein by querying the NCBI databases with BLAST. Unlike existing software, the proposed method allows the quality of prediction to be estimated. When tested on a group of proteins that show different degrees of sequence conservation, EasyBack outperforms other published methods in terms of precision. The prediction quality of a protein backtranslation method is markedly increased by replacing the criterion of most used codon in the same species with a Hidden Markov Model trained with a set of most similar sequences from all species. Moreover, the proposed method allows the quality of prediction to be estimated probabilistically.
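The ambiguity that makes backtranslation hard can be quantified directly from the standard genetic code. The Python sketch below counts how many distinct coding sequences translate to a given protein. It illustrates the degeneracy only and has nothing to do with EasyBack's Hidden Markov Model. The function name is hypothetical.

```python
import math
from collections import Counter

# Amino acids (and '*' for stop) of the 64 standard-code codons in TCAG order.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS_PER_AA = Counter(AMINO)


def backtranslation_count(protein):
    """Number of distinct codon sequences encoding `protein`."""
    for aa in protein:
        if aa not in CODONS_PER_AA:
            raise ValueError(f"unknown residue: {aa!r}")
    return math.prod(CODONS_PER_AA[aa] for aa in protein)


for peptide in ["MW", "ML", "LSR"]:
    print(peptide, backtranslation_count(peptide))
```

Methionine and tryptophan have one codon each, while leucine, serine and arginine have six, so the count grows multiplicatively with sequence length. This is why a selection criterion such as codon usage or sequence similarity is needed.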
Amino Acid Properties Conserved in Molecular Evolution

PubMed Central

Rudnicki, Witold R.; Mroczek, Teresa; Cudek, Paweł

2014-01-01

That amino acid properties are responsible for the way protein molecules evolve is natural and is also reasonably well supported both by the structure of the genetic code and, to a large extent, by the experimental measures of the amino acid similarity. Nevertheless, there remains a significant gap between observed similarity matrices and their reconstructions from amino acid properties. Therefore, we introduce a simple theoretical model of amino acid similarity matrices, which allows splitting the matrix into two parts – one that depends only on mutabilities of amino acids and another that depends on pairwise similarities between them. Then the new synthetic amino acid properties are derived from the pairwise similarities and used to reconstruct similarity matrices covering a wide range of information entropies. Our model allows us to explain up to 94% of the variability in the BLOSUM family of the amino acids similarity matrices in terms of amino acid properties. The new properties derived from amino acid similarity matrices correlate highly with properties known to be important for molecular evolution such as hydrophobicity, size, shape and charge of amino acids. This result closes the gap in our understanding of the influence of amino acids on evolution at the molecular level. The method was applied to a single family of similarity matrices often used in general sequence homology searches, but it is general and can also be used for more specific matrices. The new synthetic properties can be used in analyses of protein sequences in various biological applications. PMID:24967708
Cloning and characterization of the gene encoding IMP dehydrogenase from Arabidopsis thaliana.

PubMed

Collart, F R; Osipiuk, J; Trent, J; Olsen, G J; Huberman, E

1996-10-03

We have cloned and characterized the gene encoding inosine monophosphate dehydrogenase (IMPDH) from Arabidopsis thaliana (At). The transcription unit of the At gene spans approximately 1900 bp and specifies a protein of 503 amino acids with a calculated relative molecular mass (M(r)) of 54,190. The gene is comprised of a minimum of four introns and five exons with all donor and acceptor splice sequences conforming to previously proposed consensus sequences. The deduced IMPDH amino-acid sequence from At shows a remarkable similarity to other eukaryotic IMPDH sequences, with a 48% identity to human Type II enzyme. Allowing for conservative substitutions, the enzyme is 69% similar to human Type II IMPDH. The putative active-site sequence of At IMPDH conforms to the IMP dehydrogenase/guanosine monophosphate reductase motif and contains an essential active-site cysteine residue.
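The distinction between percent identity and similarity "allowing for conservative substitutions" can be illustrated with a small Python sketch. The residue grouping below is an assumption chosen for illustration and is not the scheme used in the paper. Skipping gapped columns in the denominator is also just one of several conventions.

```python
# Hypothetical grouping of residues treated as conservative substitutions.
CONSERVATIVE_GROUPS = [
    set("ILVM"),  # aliphatic / hydrophobic
    set("FWY"),   # aromatic
    set("KRH"),   # basic
    set("DE"),    # acidic
    set("NQ"),    # amides
    set("ST"),    # small hydroxyl
    set("AG"),    # small
]


def identity_and_similarity(a: str, b: str) -> tuple[float, float]:
    """Return (% identity, % similarity) for two aligned sequences.

    Both strings must come from the same pairwise alignment, with '-' for
    gaps. Columns containing a gap are skipped (an assumed convention).
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = identical = similar = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x == y:
            identical += 1
            similar += 1  # identical residues also count as similar
        elif any(x in g and y in g for g in CONSERVATIVE_GROUPS):
            similar += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100 * identical / compared, 100 * similar / compared
```

Similarity is always at least as large as identity under this definition, which matches the pattern of the two figures reported in the abstract.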
Design and construction of 2A peptide-linked multicistronic vectors.

PubMed

Szymczak-Workman, Andrea L; Vignali, Kate M; Vignali, Dario A A

2012-02-01

The need for reliable, multicistronic vectors for multigene delivery is at the forefront of biomedical technology. This article describes the design and construction of 2A peptide-linked multicistronic vectors, which can be used to express multiple proteins from a single open reading frame (ORF). The small 2A peptide sequences, when cloned between genes, allow for efficient, stoichiometric production of discrete protein products within a single vector through a novel "cleavage" event within the 2A peptide sequence. Expression of more than two genes using conventional approaches has several limitations, most notably imbalanced protein expression and large size. The use of 2A peptide sequences alleviates these concerns. They are small (18-22 amino acids) and have divergent amino-terminal sequences, which minimizes the chance for homologous recombination and allows for multiple, different 2A peptide sequences to be used within a single vector. Importantly, separation of genes placed between 2A peptide sequences is nearly 100%, which allows for stoichiometric and concordant expression of the genes, regardless of the order of placement within the vector.
Nucleic and Amino Acid Sequences Support Structure-Based Viral Classification.

PubMed

Sinclair, Robert M; Ravantti, Janne J; Bamford, Dennis H

2017-04-15

Viral capsids ensure viral genome integrity by protecting the enclosed nucleic acids. Interactions between the genome and capsid and between individual capsid proteins (i.e., capsid architecture) are intimate and are expected to be characterized by strong evolutionary conservation. For this reason, a capsid structure-based viral classification has been proposed as a way to bring order to the viral universe. The seeming lack of sufficient sequence similarity to reproduce this classification has made it difficult to reject structural convergence as the basis for the classification. We reinvestigate whether the structure-based classification for viral coat proteins making icosahedral virus capsids is in fact supported by previously undetected sequence similarity. Since codon choices can influence nascent protein folding cotranslationally, we searched for both amino acid and nucleotide sequence similarity. To demonstrate the sensitivity of the approach, we identify a candidate gene for the pandoravirus capsid protein. We show that the structure-based classification is strongly supported by amino acid and also nucleotide sequence similarities, suggesting that the similarities are due to common descent. The correspondence between structure-based and sequence-based analyses of the same proteins shown here allows them to be used in future analyses of the relationship between linear sequence information and macromolecular function, as well as between linear sequence and protein folds. IMPORTANCE Viral capsids protect nucleic acid genomes, which in turn encode capsid proteins. This tight coupling of protein shell and nucleic acids, together with strong functional constraints on capsid protein folding and architecture, leads to the hypothesis that capsid protein-coding nucleotide sequences may retain signatures of ancient viral evolution. We have been able to show that this is indeed the case, using the major capsid proteins of viruses forming icosahedral capsids.
Importantly, we detected similarity at the nucleotide level between capsid protein-coding regions from viruses infecting cells belonging to all three domains of life, reproducing a previously established structure-based classification of icosahedral viral capsids. Copyright © 2017 Sinclair et al.
Nucleic and Amino Acid Sequences Support Structure-Based Viral Classification

PubMed Central

Sinclair, Robert M.; Ravantti, Janne J.

2017-01-01

ABSTRACT Viral capsids ensure viral genome integrity by protecting the enclosed nucleic acids. Interactions between the genome and capsid and between individual capsid proteins (i.e., capsid architecture) are intimate and are expected to be characterized by strong evolutionary conservation. For this reason, a capsid structure-based viral classification has been proposed as a way to bring order to the viral universe. The seeming lack of sufficient sequence similarity to reproduce this classification has made it difficult to reject structural convergence as the basis for the classification. We reinvestigate whether the structure-based classification for viral coat proteins making icosahedral virus capsids is in fact supported by previously undetected sequence similarity. Since codon choices can influence nascent protein folding cotranslationally, we searched for both amino acid and nucleotide sequence similarity. To demonstrate the sensitivity of the approach, we identify a candidate gene for the pandoravirus capsid protein. We show that the structure-based classification is strongly supported by amino acid and also nucleotide sequence similarities, suggesting that the similarities are due to common descent. The correspondence between structure-based and sequence-based analyses of the same proteins shown here allows them to be used in future analyses of the relationship between linear sequence information and macromolecular function, as well as between linear sequence and protein folds. IMPORTANCE Viral capsids protect nucleic acid genomes, which in turn encode capsid proteins. This tight coupling of protein shell and nucleic acids, together with strong functional constraints on capsid protein folding and architecture, leads to the hypothesis that capsid protein-coding nucleotide sequences may retain signatures of ancient viral evolution. We have been able to show that this is indeed the case, using the major capsid proteins of viruses forming icosahedral capsids.
Importantly, we detected similarity at the nucleotide level between capsid protein-coding regions from viruses infecting cells belonging to all three domains of life, reproducing a previously established structure-based classification of icosahedral viral capsids. PMID:28122979
Generation of 2A-linked multicistronic cassettes by recombinant PCR.

PubMed

Szymczak-Workman, Andrea L; Vignali, Kate M; Vignali, Dario A A

2012-02-01

The need for reliable, multicistronic vectors for multigene delivery is at the forefront of biomedical technology. It is now possible to express multiple proteins from a single open reading frame (ORF) using 2A peptide-linked multicistronic vectors. These small sequences, when cloned between genes, allow for efficient, stoichiometric production of discrete protein products within a single vector through a novel "cleavage" event within the 2A peptide sequence. Expression of more than two genes using conventional approaches has several limitations, most notably imbalanced protein expression and large size. The use of 2A peptide sequences alleviates these concerns. They are small (18-22 amino acids) and have divergent amino-terminal sequences, which minimizes the chance for homologous recombination and allows for multiple, different 2A peptide sequences to be used within a single vector. Importantly, separation of genes placed between 2A peptide sequences is nearly 100%, which allows for stoichiometric and concordant expression of the genes, regardless of the order of placement within the vector. This protocol describes the use of recombinant polymerase chain reaction (PCR) to connect multiple 2A-linked protein sequences. The final construct is subcloned into an expression vector.
Synthetic signal sequences that enable efficient secretory protein production in the yeast Kluyveromyces marxianus.

PubMed

Yarimizu, Tohru; Nakamura, Mikiko; Hoshida, Hisashi; Akada, Rinji

2015-02-14

Targeting of cellular proteins to the extracellular environment is directed by a secretory signal sequence located at the N-terminus of a secretory protein. These signal sequences usually contain an N-terminal basic amino acid followed by a stretch containing hydrophobic residues, although no consensus signal sequence has been identified. In this study, simple modeling of signal sequences was attempted using Gaussia princeps secretory luciferase (GLuc) in the yeast Kluyveromyces marxianus, which allowed comprehensive recombinant gene construction to substitute synthetic signal sequences. Mutational analysis of the GLuc signal sequence revealed that the GLuc hydrophobic peptide length was the lower limit for effective secretion and that the N-terminal basic residue was indispensable. Deletion of the 16th Glu caused enhanced levels of secreted protein, suggesting that this hydrophilic residue defined the boundary of a hydrophobic peptide stretch. Consequently, we redesigned this domain as a repeat of a single hydrophobic amino acid between the N-terminal Lys and C-terminal Glu. Stretches consisting of Phe, Leu, Ile, or Met were effective for secretion but the number of residues affected secretory activity. A stretch containing sixteen consecutive methionine residues (M16) showed the highest activity; the M16 sequence was therefore utilized for the secretory production of human leukemia inhibitory factor protein in yeast, resulting in enhanced secreted protein yield. We present a new concept for the provision of secretory signal sequence ability in the yeast K. marxianus, determined by the number of repeats of a single hydrophobic residue located between N-terminal basic and C-terminal acidic amino acid boundaries.
Draft genome sequence of the extremely acidophilic biomining bacterium Acidithiobacillus thiooxidans ATCC 19377 provides insights into the evolution of the Acidithiobacillus genus.

PubMed

Valdes, Jorge; Ossandon, Francisco; Quatrini, Raquel; Dopson, Mark; Holmes, David S

2011-12-01

Acidithiobacillus thiooxidans is a mesophilic, extremely acidophilic, chemolithoautotrophic gammaproteobacterium that derives energy from the oxidation of sulfur and inorganic sulfur compounds. Here we present the draft genome sequence of A. thiooxidans ATCC 19377, which has allowed the identification of genes for survival and colonization of extremely acidic environments.
Acid–base bifunctional shell cross-linked micelle nanoreactor for one-pot tandem reaction

DOE Office of Scientific and Technical Information (OSTI.GOV)

Lee, Li-Chen; Lu, Jie; Weck, Marcus

Shell cross-linked micelles (SCMs) containing acid sites in the shell and base sites in the core are prepared from amphiphilic poly(2-oxazoline) triblock copolymers. These materials are utilized as two-chamber nanoreactors for a prototypical acid-base bifunctional tandem deacetalization-nitroaldol reaction. Furthermore, the acid and base sites are localized in different regions of the micelle, allowing the two steps in the reaction sequence to largely proceed in separate compartments, akin to the compartmentalization that occurs in biological systems.
From a marine neuropeptide to antimicrobial pseudopeptides containing aza-β(3)-amino acids: structure and activity

PubMed Central

Laurencin, Mathieu; Legrand, Baptiste; Duval, Emilie; Henry, Joël; Baudy-Floc'H, Michèle; Zatylny-Gaudin, Céline; Bondon, Arnaud

2012-01-01

Incorporation of aza-β3-amino acids into an endogenous neuropeptide from mollusks (ALSGDAFLRF-NH2) with weak antimicrobial activity allows us to design new antimicrobial peptide (AMP) sequences. We find that, depending on the nature of the substitution, these could result either in inactive pseudopeptides or in a drastic enhancement of the antimicrobial activity without high cytotoxicity. Structural studies performed by NMR and circular dichroism on the pseudopeptides show the impact of aza-β3-amino acids on the peptide structures. We obtain the first three-dimensional structures of pseudopeptides containing aza-β3-amino acids in aqueous micellar SDS and demonstrate that a hydrazino turn can be formed in aqueous solution. Overall, these results demonstrate the ability to modulate AMP activities through structural modifications induced by the nature and the position of these amino acid analogs in the peptide sequences. PMID:22320306
Genotype-specific signal generation based on digestion of 3-way DNA junctions: application to KRAS variation detection.

PubMed

Amicarelli, Giulia; Adlerstein, Daniel; Shehi, Erlet; Wang, Fengfei; Makrigiorgos, G Mike

2006-10-01

Genotyping methods that reveal single-nucleotide differences are useful for a wide range of applications. We used digestion of 3-way DNA junctions in a novel technology, OneCutEventAmplificatioN (OCEAN) that allows sequence-specific signal generation and amplification. We combined OCEAN with peptide-nucleic-acid (PNA)-based variant enrichment to detect and simultaneously genotype v-Ki-ras2 Kirsten rat sarcoma viral oncogene homolog (KRAS) codon 12 sequence variants in human tissue specimens. We analyzed KRAS codon 12 sequence variants in 106 lung cancer surgical specimens. We conducted a PNA-PCR reaction that suppresses wild-type KRAS amplification and genotyped the product with a set of OCEAN reactions carried out in fluorescence microplate format. The isothermal OCEAN assay enabled a 3-way DNA junction to form between the specific target nucleic acid, a fluorescently labeled "amplifier", and an "anchor". The amplifier-anchor contact contains the recognition site for a restriction enzyme. Digestion produces a cleaved amplifier and generation of a fluorescent signal. The cleaved amplifier dissociates from the 3-way DNA junction, allowing a new amplifier to bind and propagate the reaction. The system detected and genotyped KRAS sequence variants down to approximately 0.3% variant-to-wild-type alleles. PNA-PCR/OCEAN had a concordance rate with PNA-PCR/sequencing of 93% to 98%, depending on the exact implementation. Concordance rate with restriction endonuclease-mediated selective-PCR/sequencing was 89%. OCEAN is a practical and low-cost novel technology for sequence-specific signal generation. Reliable analysis of KRAS sequence alterations in human specimens circumvents the requirement for sequencing. Application is expected in genotyping KRAS codon 12 sequence variants in surgical specimens or in bodily fluids, as well as single-base variations and sequence alterations in other genes.
Depletion of Unwanted Nucleic Acid Templates by Selective Cleavage: LNAzymes, Catalytically Active Oligonucleotides Containing Locked Nucleic Acids, Open a New Window for Detecting Rare Microbial Community Members

PubMed Central

Dolinšek, Jan; Dorninger, Christiane; Lagkouvardos, Ilias; Wagner, Michael

2013-01-01

Many studies of molecular microbial ecology rely on the characterization of microbial communities by PCR amplification, cloning, sequencing, and phylogenetic analysis of genes encoding rRNAs or functional marker enzymes. However, if the established clone libraries are dominated by one or a few sequence types, the cloned diversity is difficult to analyze by random clone sequencing. Here we present a novel approach to deplete unwanted sequence types from complex nucleic acid mixtures prior to cloning and downstream analyses. It employs catalytically active oligonucleotides containing locked nucleic acids (LNAzymes) for the specific cleavage of selected RNA targets. When combined with in vitro transcription and reverse transcriptase PCR, this LNAzyme-based technique can be used with DNA or RNA extracts from microbial communities. The simultaneous application of more than one specific LNAzyme allows the concurrent depletion of different sequence types from the same nucleic acid preparation. This new method was evaluated with defined mixtures of cloned 16S rRNA genes and then used to identify accompanying bacteria in an enrichment culture dominated by the nitrite oxidizer “Candidatus Nitrospira defluvii.” In silico analysis revealed that the majority of publicly deposited rRNA-targeted oligonucleotide probes may be used as specific LNAzymes with no or only minor sequence modifications. This efficient and cost-effective approach will greatly facilitate tasks such as the identification of microbial symbionts in nucleic acid preparations dominated by plastid or mitochondrial rRNA genes from eukaryotic hosts, the detection of contaminants in microbial cultures, and the analysis of rare organisms in microbial communities of highly uneven composition. PMID:23263968
Three-dimensional structural modelling and calculation of electrostatic potentials of HLA Bw4 and Bw6 epitopes to explain the molecular basis for alloantibody binding: toward predicting HLA antigenicity and immunogenicity.

PubMed

Mallon, Dermot H; Bradley, J Andrew; Winn, Peter J; Taylor, Craig J; Kosmoliaptsis, Vasilis

2015-02-01

We have previously shown that qualitative assessment of surface electrostatic potential of HLA class I molecules helps explain serological patterns of alloantibody binding. We have now used a novel computational approach to quantitate differences in surface electrostatic potential of HLA B-cell epitopes and applied this to explain HLA Bw4 and Bw6 antigenicity. Protein structure models of HLA class I alleles expressing either the Bw4 or Bw6 epitope (defined by sequence motifs at positions 77 to 83) were generated using comparative structure prediction. The electrostatic potential in 3-dimensional space encompassing the Bw4/Bw6 epitope was computed by solving the Poisson-Boltzmann equation and quantitatively compared in a pairwise, all-versus-all fashion to produce distance matrices that cluster epitopes with similar electrostatics properties. Quantitative comparison of surface electrostatic potential at the carboxyl terminal of the α1-helix of HLA class I alleles, corresponding to amino acid sequence motif 77 to 83, produced clustering of HLA molecules in 3 principal groups according to Bw4 or Bw6 epitope expression. Remarkably, quantitative differences in electrostatic potential reflected known patterns of serological reactivity better than Bw4/Bw6 amino acid sequence motifs. Quantitative assessment of epitope electrostatic potential allowed the impact of known amino acid substitutions (HLA-B*07:02 R79G, R82L, G83R) that are critical for antibody binding to be predicted. We describe a novel approach for quantitating differences in HLA B-cell epitope electrostatic potential. Proof of principle is provided that this approach enables better assessment of HLA epitope antigenicity than amino acid sequence data alone, and it may allow prediction of HLA immunogenicity.
Cloning of cellobiose phosphoenolpyruvate-dependent phosphotransferase genes: Functional expression in recombinant Escherichia coli and identification of a putative binding region for disaccharides

DOE Office of Scientific and Technical Information (OSTI.GOV)

Lai, Xiaokuang; Davis, F.C.; Ingram, L.O.

1997-02-01

Genomic libraries from nine cellobiose-metabolizing bacteria were screened for cellobiose utilization. Positive clones were recovered from six libraries, all of which encode phosphoenolpyruvate:carbohydrate phosphotransferase system (PTS) proteins. Clones from Bacillus subtilis, Butyrivibrio fibrisolvens, and Klebsiella oxytoca allowed the growth of recombinant Escherichia coli in cellobiose-M9 minimal medium. The K. oxytoca clone, pLOI1906, exhibited an unusually broad substrate range (cellobiose, arbutin, salicin, and methylumbelliferyl derivatives of glucose, cellobiose, mannose, and xylose) and was sequenced. The insert in this plasmid encoded the carboxy-terminal region of a putative regulatory protein, cellobiose permease (single polypeptide), and phospho-β-glucosidase, which appear to form an operon (casRAB). Subclones allowed both casA and casB to be expressed independently, as evidenced by in vitro complementation. An analysis of the translated sequences from the EIIC domains of cellobiose, aryl-β-glucoside, and other disaccharide permeases allowed the identification of a 50-amino-acid conserved region. A disaccharide consensus sequence is proposed for the most conserved segment (13 amino acids), which may represent part of the EIIC active site for binding and phosphorylation. 63 refs., 4 figs., 4 tabs.
Increasing Sequence Diversity with Flexible Backbone Protein Design: The Complete Redesign of a Protein Hydrophobic Core

DOE Office of Scientific and Technical Information (OSTI.GOV)

Murphy, Grant S.; Mills, Jeffrey L.; Miley, Michael J.

2015-10-15

Protein design tests our understanding of protein stability and structure. Successful design methods should allow the exploration of sequence space not found in nature. However, when redesigning naturally occurring protein structures, most fixed backbone design algorithms return amino acid sequences that share strong sequence identity with wild-type sequences, especially in the protein core. This behavior places a restriction on functional space that can be explored and is not consistent with observations from nature, where sequences of low identity have similar structures. Here, we allow backbone flexibility during design to mutate every position in the core (38 residues) of a four-helix bundle protein. Only small perturbations to the backbone, 1–2 Å, were needed to entirely mutate the core. The redesigned protein, DRNN, is exceptionally stable (melting point >140 °C). An NMR and X-ray crystal structure show that the side chains and backbone were accurately modeled (all-atom RMSD = 1.3 Å).

Biopolymers Containing Unnatural Amino Acids

DOE Office of Scientific and Technical Information (OSTI.GOV)

Schultz, Peter

Although the main chain structure of polymers has a profound effect on their materials properties, the side groups can also have dramatic effects on their properties including conductivity, liquid crystallinity, hydrophobicity, elasticity and biodegradability. Unfortunately, control over the side chain structure of polymers remains a challenge – it is difficult to control the sequence of chain elongation when mixtures of monomers are polymerized, and postpolymerization side chain modification is made difficult by polymer effects on side chain reactivity. In contrast, the mRNA templated synthesis of polypeptides on the ribosome affords absolute control over the primary sequence of the twenty amino acid monomers. Moreover, the length of the biopolymer is precisely controlled as are sites of crosslinking. However, whereas synthetic polymers can be synthesized from monomers with a wide range of chemically defined structures, ribosomal biosynthesis is largely limited to the 20 canonical amino acids. For many applications in material sciences, additional building blocks would be desirable, for example, amino acids containing metallocene, photoactive, and halogenated side chains. To overcome this natural constraint we have developed a method that allows unnatural amino acids, beyond the common twenty, to be genetically encoded in response to nonsense or frameshift codons in bacteria, yeast and mammalian cells with high fidelity and good yields. Here we have developed methods that allow identical or distinct noncanonical amino acids to be incorporated at multiple sites in a polypeptide chain, potentially leading to a new class of templated biopolymers. We have also developed improved methods for genetically encoding unnatural amino acids. In addition, we have genetically encoded new amino acids with novel physical and chemical properties that allow selective modification of proteins with synthetic agents.
Finally, we have evolved new metal-ion binding sites in proteins using a novel metal-ion binding amino acid, which may facilitate our ability to generate new protein based sensors and catalysts.
fCCAC: functional canonical correlation analysis to evaluate covariance between nucleic acid sequencing datasets.

PubMed

Madrigal, Pedro

2017-03-01

Computational evaluation of variability across DNA or RNA sequencing datasets is a crucial step in genomic science, as it allows both the reproducibility of biological or technical replicates to be evaluated and different datasets to be compared to identify their potential correlations. Here we present fCCAC, an application of functional canonical correlation analysis to assess covariance of nucleic acid sequencing datasets such as chromatin immunoprecipitation followed by deep sequencing (ChIP-seq). We show how this method differs from other measures of correlation, and exemplify how it can reveal shared covariance between histone modifications and DNA binding proteins, such as the relationship between the H3K4me3 chromatin mark and its epigenetic writers and readers. An R/Bioconductor package is available at http://bioconductor.org/packages/fCCAC/. Contact: pmb59@cam.ac.uk. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.
Stable isotope probing to study functional components of complex microbial ecosystems.

PubMed

Mazard, Sophie; Schäfer, Hendrik

2014-01-01

This protocol presents a method of dissecting the DNA or RNA of key organisms involved in a specific biochemical process within a complex ecosystem. Stable isotope probing (SIP) allows the labelling and separation of nucleic acids from community members that are involved in important biochemical transformations, yet are often not the most numerically abundant members of a community. This pure culture-independent technique circumvents limitations of traditional microbial isolation techniques or data mining from large-scale whole-community metagenomic studies to tease out the identities and genomic repertoires of microorganisms participating in biological nutrient cycles. SIP experiments can be applied to virtually any ecosystem and biochemical pathway under investigation provided a suitable stable isotope substrate is available. This versatile methodology allows a wide range of analyses to be performed, from fatty-acid analyses to community structure and ecology studies and targeted metagenomics involving nucleic acid sequencing. SIP experiments provide an effective alternative to large-scale whole-community metagenomic studies by specifically targeting the organisms or biochemical transformations of interest, thereby reducing the sequencing effort and time-consuming bioinformatics analyses of large datasets.
A technique for extracting Radiolaria from radiolarian cherts.

NASA Technical Reports Server (NTRS)

Pessagno, E. A., Jr.; Newport, R. L.

1972-01-01

Differential solution of Mesozoic radiolarian cherts with hydrofluoric acid has yielded well-preserved, matrix-free Radiolaria. This technique allows the full utilization of Radiolaria in interpreting the stratigraphy of ophiolite sequences and of other successions where cherts are prevalent.
Protein Design Using Unnatural Amino Acids

NASA Astrophysics Data System (ADS)

Bilgiçer, Basar; Kumar, Krishna

2003-11-01

With the increasing availability of whole organism genome sequences, understanding protein structure and function is of capital importance. Recent developments in the methodology of incorporation of unnatural amino acids into proteins allow the exploration of proteins at a very detailed level. Furthermore, de novo design of novel protein structures and function is feasible with unprecedented sophistication. Using examples from the literature, this article describes the available methods for unnatural amino acid incorporation and highlights some recent applications including the design of hyperstable protein folds.
Acid–base bifunctional shell cross-linked micelle nanoreactor for one-pot tandem reaction

DOE PAGES

Lee, Li-Chen; Lu, Jie; Weck, Marcus; ...

2015-12-29

Shell cross-linked micelles (SCMs) containing acid sites in the shell and base sites in the core are prepared from amphiphilic poly(2-oxazoline) triblock copolymers. These materials are utilized as two-chamber nanoreactors for a prototypical acid-base bifunctional tandem deacetalization-nitroaldol reaction. Furthermore, the acid and base sites are localized in different regions of the micelle, allowing the two steps in the reaction sequence to largely proceed in separate compartments, akin to the compartmentalization that occurs in biological systems.
Viewing multiple sequence alignments with the JavaScript Sequence Alignment Viewer (JSAV)

PubMed Central

Martin, Andrew C. R.

2014-01-01

The JavaScript Sequence Alignment Viewer (JSAV) is designed as a simple-to-use JavaScript component for displaying sequence alignments on web pages. The display of sequences is highly configurable with options to allow alternative coloring schemes, sorting of sequences and ’dotifying’ repeated amino acids. An option is also available to submit selected sequences to another web site, or to other JavaScript code. JSAV is implemented purely in JavaScript making use of the JQuery and JQuery-UI libraries. It does not use any HTML5-specific options to help with browser compatibility. The code is documented using JSDOC and is available from http://www.bioinf.org.uk/software/jsav/. PMID:25653836
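The 'dotifying' option mentioned above can be sketched in a few lines. The version below is written in Python for illustration and is not JSAV's own JavaScript code. It assumes that "repeated" means "identical to the residue in the first sequence at the same alignment column", and the function name is hypothetical.

```python
def dotify(alignment: list[str]) -> list[str]:
    """Replace residues that repeat the first (reference) sequence with '.'.

    Assumption: every row has the same length, and the first row is the
    reference that is always shown in full. Gap characters ('-') are kept
    as they are so that indels stay visible.
    """
    if not alignment:
        return []
    reference = alignment[0]
    result = [reference]
    for row in alignment[1:]:
        if len(row) != len(reference):
            raise ValueError("all aligned sequences must have equal length")
        result.append(
            "".join(
                "." if (res == ref and res != "-") else res
                for ref, res in zip(reference, row)
            )
        )
    return result


if __name__ == "__main__":
    for line in dotify(["ACDEF", "ACDQF", "AC-EF"]):
        print(line)
```

Displaying only the differences makes substitutions and gaps stand out in highly conserved alignments.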
Viewing multiple sequence alignments with the JavaScript Sequence Alignment Viewer (JSAV).

PubMed

Martin, Andrew C R

2014-01-01

The JavaScript Sequence Alignment Viewer (JSAV) is designed as a simple-to-use JavaScript component for displaying sequence alignments on web pages. The display of sequences is highly configurable with options to allow alternative coloring schemes, sorting of sequences and 'dotifying' repeated amino acids. An option is also available to submit selected sequences to another web site, or to other JavaScript code. JSAV is implemented purely in JavaScript making use of the JQuery and JQuery-UI libraries. It does not use any HTML5-specific options to help with browser compatibility. The code is documented using JSDOC and is available from http://www.bioinf.org.uk/software/jsav/.
Biopolymers Containing Unnatural Building Blocks

DOE Office of Scientific and Technical Information (OSTI.GOV)

Schultz, Peter G.

2013-06-30

Although the main chain structure of polymers has a profound effect on their materials properties, the side groups can also have dramatic effects on their properties including conductivity, liquid crystallinity, hydrophobicity, elasticity and biodegradability. Unfortunately, control over the side chain structure of polymers remains a challenge – it is difficult to control the sequence of chain elongation when mixtures of monomers are polymerized, and postpolymerization side chain modification is made difficult by polymer effects on side chain reactivity. In contrast, the mRNA templated synthesis of polypeptides on the ribosome affords absolute control over the primary sequence of the twenty amino acid monomers. Moreover, the length of the biopolymer is precisely controlled as are sites of crosslinking. However, whereas synthetic polymers can be synthesized from monomers with a wide range of chemically defined structures, ribosomal biosynthesis is largely limited to the 20 canonical amino acids. For many applications in material sciences, additional building blocks would be desirable, for example, amino acids containing metallocene, photoactive, and halogenated side chains. To overcome this natural constraint we have developed a method that allows unnatural amino acids, beyond the common twenty, to be genetically encoded in response to nonsense or frameshift codons in bacteria, yeast and mammalian cells with high fidelity and good yields. Here we have developed methods that allow identical or distinct noncanonical amino acids to be incorporated at multiple sites in a polypeptide chain, potentially leading to a new class of templated biopolymers. We have also developed improved methods for genetically encoding unnatural amino acids. In addition, we have genetically encoded new amino acids with novel physical and chemical properties that allow selective modification of proteins with synthetic agents.
Finally, we have evolved new metal-ion binding sites in proteins using a novel metal-ion binding amino acid, which may facilitate our ability to generate new protein based sensors and catalysts.
GGRNA: an ultrafast, transcript-oriented search engine for genes and transcripts

PubMed Central

Naito, Yuki; Bono, Hidemasa

2012-01-01

GGRNA (http://GGRNA.dbcls.jp/) is a Google-like, ultrafast search engine for genes and transcripts. The web server accepts arbitrary words and phrases, such as gene names, IDs, gene descriptions, annotations of gene and even nucleotide/amino acid sequences through one simple search box, and quickly returns relevant RefSeq transcripts. A typical search takes just a few seconds, which dramatically enhances the usability of routine searching. In particular, GGRNA can search sequences as short as 10 nt or 4 amino acids, which cannot be handled easily by popular sequence analysis tools. Nucleotide sequences can be searched allowing up to three mismatches, or the query sequences may contain degenerate nucleotide codes (e.g. N, R, Y, S). Furthermore, Gene Ontology annotations, Enzyme Commission numbers and probe sequences of catalog microarrays are also incorporated into GGRNA, which may help users to conduct searches by various types of keywords. GGRNA web server will provide a simple and powerful interface for finding genes and transcripts for a wide range of users. All services at GGRNA are provided free of charge to all users. PMID:22641850
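The kind of matching described in this abstract (short nucleotide queries, a bounded number of mismatches, degenerate codes such as N, R, Y, S) can be illustrated with a naive sliding-window scan. This is only a sketch for intuition: the abstract does not describe how GGRNA is implemented, the function name `find_matches` is hypothetical, and the sketch combines mismatch tolerance and degenerate codes in one call, whereas the abstract presents them as alternatives.

```python
# IUPAC nucleotide codes mapped to the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T", "U": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

def find_matches(query, target, max_mismatches=3):
    """Return (position, mismatches) for every window of `target` that
    matches `query` with at most `max_mismatches` mismatches.

    Hypothetical helper: a brute-force scan, not GGRNA's own algorithm.
    """
    query = query.upper()
    target = target.upper().replace("U", "T")
    hits = []
    for start in range(len(target) - len(query) + 1):
        mismatches = 0
        for q, t in zip(query, target[start:start + len(query)]):
            # A position matches when the target base is one of the
            # bases allowed by the (possibly degenerate) query code.
            if t not in IUPAC[q]:
                mismatches += 1
                if mismatches > max_mismatches:
                    break
        else:
            # The inner loop finished without exceeding the limit.
            hits.append((start, mismatches))
    return hits

if __name__ == "__main__":
    print(find_matches("ACGN", "TTACGTAA", max_mismatches=0))
```

The scan costs time proportional to the query length times the target length, so it is suitable only for small examples; a production search engine would need an index.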
GGRNA: an ultrafast, transcript-oriented search engine for genes and transcripts.

PubMed

Naito, Yuki; Bono, Hidemasa

2012-07-01

GGRNA (http://GGRNA.dbcls.jp/) is a Google-like, ultrafast search engine for genes and transcripts. The web server accepts arbitrary words and phrases, such as gene names, IDs, gene descriptions, annotations of gene and even nucleotide/amino acid sequences through one simple search box, and quickly returns relevant RefSeq transcripts. A typical search takes just a few seconds, which dramatically enhances the usability of routine searching. In particular, GGRNA can search sequences as short as 10 nt or 4 amino acids, which cannot be handled easily by popular sequence analysis tools. Nucleotide sequences can be searched allowing up to three mismatches, or the query sequences may contain degenerate nucleotide codes (e.g. N, R, Y, S). Furthermore, Gene Ontology annotations, Enzyme Commission numbers and probe sequences of catalog microarrays are also incorporated into GGRNA, which may help users to conduct searches by various types of keywords. GGRNA web server will provide a simple and powerful interface for finding genes and transcripts for a wide range of users. All services at GGRNA are provided free of charge to all users.
Integrated databanks access and sequence/structure analysis services at the PBIL.

PubMed

Perrière, Guy; Combet, Christophe; Penel, Simon; Blanchet, Christophe; Thioulouse, Jean; Geourjon, Christophe; Grassot, Julien; Charavay, Céline; Gouy, Manolo; Duret, Laurent; Deléage, Gilbert

2003-07-01

The World Wide Web server of the PBIL (Pôle Bioinformatique Lyonnais) provides on-line access to sequence databanks and to many tools for nucleic acid and protein sequence analysis. This server allows users to query nucleotide sequence banks in the EMBL and GenBank formats and protein sequence banks in the SWISS-PROT and PIR formats. The query engine on which our data bank access is based is the ACNUC system. It makes it possible to build complex queries to access functional zones of biological interest and to retrieve large sequence sets. Of special interest are the unique features provided by this system to query the data banks of gene families developed at the PBIL. The server also provides access to a wide range of sequence analysis methods: similarity search programs, multiple alignments, protein structure prediction and multivariate statistics. An original feature of this server is the integration of these two aspects: sequence retrieval and sequence analysis. Indeed, thanks to the introduction of re-usable lists, it is possible to process large sets of data. The PBIL server can be reached at: http://pbil.univ-lyon1.fr.
Discovery of Escherichia coli CRISPR sequences in an undergraduate laboratory.

PubMed

Militello, Kevin T; Lazatin, Justine C

2017-05-01

Clustered regularly interspaced short palindromic repeats (CRISPRs) represent a novel type of adaptive immune system found in eubacteria and archaebacteria. CRISPRs have recently generated a lot of attention due to their unique ability to catalog foreign nucleic acids, their ability to destroy foreign nucleic acids in a mechanism that shares some similarity to RNA interference, and the ability to utilize reconstituted CRISPR systems for genome editing in numerous organisms. In order to introduce CRISPR biology into an undergraduate upper-level laboratory, a five-week set of exercises was designed to allow students to examine the CRISPR status of uncharacterized Escherichia coli strains and to allow the discovery of new repeats and spacers. Students started the project by isolating genomic DNA from E. coli and amplifying the iap CRISPR locus using the polymerase chain reaction (PCR). The PCR products were analyzed by Sanger DNA sequencing, and the sequences were examined for the presence of CRISPR repeat sequences. The regions between the repeats, the spacers, were extracted and analyzed with BLASTN searches. Overall, CRISPR loci were sequenced from several previously uncharacterized E. coli strains and one E. coli K-12 strain. Sanger DNA sequencing resulted in the discovery of 36 spacer sequences and their corresponding surrounding repeat sequences. Five of the spacers were homologous to foreign (non-E. coli) DNA. Assessment of the laboratory indicates that improvements were made in the ability of students to answer questions relating to the structure and function of CRISPRs. Future directions of the laboratory are presented and discussed. © 2016 by The International Union of Biochemistry and Molecular Biology, 45(3):262-269, 2017.
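The spacer-extraction step in this abstract (locate the repeats in a sequenced locus, then take the regions between them) can be sketched as follows. The function name `extract_spacers` and the toy sequences are hypothetical, and the sketch assumes every repeat copy matches exactly; the abstract does not say how the students located repeats, and real CRISPR repeats can vary slightly between copies.

```python
def extract_spacers(sequence, repeat):
    """Return the segments lying between consecutive exact copies of `repeat`.

    Hypothetical helper: assumes exact, non-overlapping repeat copies.
    """
    sequence, repeat = sequence.upper(), repeat.upper()

    # Collect the start position of every non-overlapping repeat copy.
    positions = []
    start = sequence.find(repeat)
    while start != -1:
        positions.append(start)
        start = sequence.find(repeat, start + len(repeat))

    # A spacer runs from the end of one repeat to the start of the next.
    spacers = []
    for left, right in zip(positions, positions[1:]):
        spacers.append(sequence[left + len(repeat):right])
    return spacers

if __name__ == "__main__":
    toy_repeat = "GAGTTCCCC"  # made-up repeat, for illustration only
    toy_locus = "AA" + toy_repeat + "ACGTAC" + toy_repeat + "TTGCA" + toy_repeat + "CC"
    for spacer in extract_spacers(toy_locus, toy_repeat):
        print(spacer)
```

Each returned spacer could then be submitted to a BLASTN search, as the abstract describes; that step is not shown here.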
Sequence-independent construction of ordered combinatorial libraries with predefined crossover points.

PubMed

Jézéquel, Laetitia; Loeper, Jacqueline; Pompon, Denis

2008-11-01

Combinatorial libraries coding for mosaic enzymes with predefined crossover points constitute useful tools to address and model structure-function relationships and for functional optimization of enzymes based on multivariate statistics. The presented method, called sequence-independent generation of a chimera-ordered library (SIGNAL), allows easy shuffling of any predefined amino acid segment between two or more proteins. This method is particularly well adapted to the exchange of protein structural modules. The procedure could also be well suited to generate ordered combinatorial libraries independent of sequence similarities in a robotized manner. Sequence segments to be recombined are first extracted by PCR from a single-stranded template coding for an enzyme of interest using a biotin-avidin-based method. This technique allows the reduction of parental template contamination in the final library. Specific PCR primers allow amplification of two complementary mosaic DNA fragments, overlapping in the region to be exchanged. Fragments are finally reassembled using a fusion PCR. The process is illustrated via the construction of a set of mosaic CYP2B enzymes using this highly modular approach.
Selecting Fully-Modified XNA Aptamers Using Synthetic Genetics.

PubMed

Taylor, Alexander I; Holliger, Philipp

2018-06-01

This unit describes the application of "synthetic genetics," i.e., the replication of xeno nucleic acids (XNAs), artificial analogs of DNA and RNA bearing alternative backbone or sugar congeners, to the directed evolution of synthetic oligonucleotide ligands (XNA aptamers) specific for target proteins or nucleic acid motifs, using a cross-chemistry selective exponential enrichment (X-SELEX) approach. Protocols are described for synthesis of diverse-sequence XNA repertoires (typically 10^14 molecules) using DNA templates, isolation and panning for functional XNA sequences using targets immobilized on solid phase or gel shift induced by target binding in solution, and XNA reverse transcription to allow cDNA amplification or sequencing. The method may be generally applied to select fully-modified XNA aptamers specific for a wide range of target molecules. © 2018 by John Wiley & Sons, Inc.
Y chromosome specific nucleic acid probe and method for determining the Y chromosome in situ

DOEpatents

Gray, Joe W.; Weier, Heinz-Ulrich

1998-01-01

A method for producing a Y chromosome specific probe selected from highly repeating sequences on that chromosome is described. There is little or no nonspecific binding to autosomal and X chromosomes, and a very large signal is provided. Inventive primers allowing the use of PCR for both sample amplification and probe production are described, as is their use in producing large DNA chromosome painting sequences.
Y chromosome specific nucleic acid probe and method for identifying the Y chromosome in SITU

DOEpatents

Gray, Joe W.; Weier, Heinz-Ulrich

1999-01-01

A method for producing a Y chromosome specific probe selected from highly repeating sequences on that chromosome is described. There is little or no nonspecific binding to autosomal and X chromosomes, and a very large signal is provided. Inventive primers allowing the use of PCR for both sample amplification and probe production are described, as is their use in producing large DNA chromosome painting sequences.
Y chromosome specific nucleic acid probe and method for determining the Y chromosome in situ

DOEpatents

Gray, Joe W.; Weier, Heinz-Ulrich

2001-01-01

A method for producing a Y chromosome specific probe selected from highly repeating sequences on that chromosome is described. There is little or no nonspecific binding to autosomal and X chromosomes, and a very large signal is provided. Inventive primers allowing the use of PCR for both sample amplification and probe production are described, as is their use in producing large DNA chromosome painting sequences.
Y chromosome specific nucleic acid probe and method for determining the Y chromosome in situ

DOEpatents

Gray, J.W.; Weier, H.U.

1998-11-24

A method for producing a Y chromosome specific probe selected from highly repeating sequences on that chromosome is described. There is little or no nonspecific binding to autosomal and X chromosomes, and a very large signal is provided. Inventive primers allowing the use of PCR for both sample amplification and probe production are described, as is their use in producing large DNA chromosome painting sequences. 9 figs.
Y chromosome specific nucleic acid probe and method for identifying the Y chromosome in SITU

DOEpatents

Gray, J.W.; Weier, H.U.

1999-03-30

A method for producing a Y chromosome specific probe selected from highly repeating sequences on that chromosome is described. There is little or no nonspecific binding to autosomal and X chromosomes, and a very large signal is provided. Inventive primers allowing the use of PCR for both sample amplification and probe production are described, as is their use in producing large DNA chromosome painting sequences. 9 figs.

GAWK, a novel human pituitary polypeptide: isolation, immunocytochemical localization and complete amino acid sequence.

PubMed

Benjannet, S; Leduc, R; Lazure, C; Seidah, N G; Marcinkiewicz, M; Chrétien, M

1985-01-16

During the course of reverse-phase high pressure liquid chromatography (RP-HPLC) purification of a postulated big ACTH (1) from human pituitary gland extracts, a highly purified peptide bearing no resemblance to any known polypeptide was isolated. The complete sequence of this 74 amino acid polypeptide, called GAWK, has been determined. Search on a computer data bank on the possible homology to any known protein or fragment, using a mutation data matrix, failed to reveal any homology greater than 30%. An antibody produced against a synthetic fragment allowed us to detect several immunoreactive forms. The antisera also enabled us to localize the polypeptide, by immunocytochemistry, in the anterior lobe of the pituitary gland.
Complete nucleotide sequences of the coat protein messenger RNAs of brome mosaic virus and cowpea chlorotic mottle virus.

PubMed Central

Dasgupta, R; Kaesberg, P

1982-01-01

The nucleotide sequences of the subgenomic coat protein messengers (RNA4's) of two related bromoviruses, brome mosaic virus (BMV) and cowpea chlorotic mottle virus (CCMV), have been determined by direct RNA and cDNA sequencing without cloning. BMV RNA4 is 876 b long including a 5' noncoding region of nine nucleotides and a 3' noncoding region of 300 nucleotides. CCMV RNA4 is 824 b long, including a 5' noncoding region of 10 nucleotides and a 3' noncoding region of 244 nucleotides. The encoded coat proteins are similar in length (188 amino acids for BMV and 189 amino acids for CCMV) and display about 70% homology in their amino acid sequences. Length difference between the two RNAs is due mostly to a single deletion, in CCMV with respect to BMV, of about 57 b immediately following the coding region. Allowing for this deletion the RNAs are [...] indicate that mutations leading to divergence were constrained in the coding region primarily by the requirement of maintaining a favorable coat protein structure and in the 3' noncoding region primarily by the requirement of maintaining a favorable RNA spatial configuration. PMID:6895941
Genomic Enzymology: Web Tools for Leveraging Protein Family Sequence-Function Space and Genome Context to Discover Novel Functions.

PubMed

Gerlt, John A

2017-08-22

The exponentially increasing number of protein and nucleic acid sequences provides opportunities to discover novel enzymes, metabolic pathways, and metabolites/natural products, thereby adding to our knowledge of biochemistry and biology. The challenge has evolved from generating sequence information to mining the databases to integrating and leveraging the available information, i.e., the availability of "genomic enzymology" web tools. Web tools that allow identification of biosynthetic gene clusters are widely used by the natural products/synthetic biology community, thereby facilitating the discovery of novel natural products and the enzymes responsible for their biosynthesis. However, many novel enzymes with interesting mechanisms participate in uncharacterized small-molecule metabolic pathways; their discovery and functional characterization also can be accomplished by leveraging information in protein and nucleic acid databases. This Perspective focuses on two genomic enzymology web tools that assist the discovery of novel metabolic pathways: (1) Enzyme Function Initiative-Enzyme Similarity Tool (EFI-EST) for generating sequence similarity networks to visualize and analyze sequence-function space in protein families and (2) Enzyme Function Initiative-Genome Neighborhood Tool (EFI-GNT) for generating genome neighborhood networks to visualize and analyze the genome context in microbial and fungal genomes. Both tools have been adapted to other applications to facilitate target selection for enzyme discovery and functional characterization. As the natural products community has demonstrated, the enzymology community needs to embrace the essential role of web tools that allow the protein and genome sequence databases to be leveraged for novel insights into enzymological problems.
Integrating mRNA and protein sequencing enables the detection and quantitative profiling of natural protein sequence variants of Populus trichocarpa

DOE Office of Scientific and Technical Information (OSTI.GOV)

Abraham, Paul E.; Wang, Xiaojing; Ranjan, Priya

The availability of next-generation sequencing technologies has rapidly transformed our ability to link genotypes to phenotypes, and as such, promises to facilitate the dissection of genetic contribution to complex traits. Although discoveries of genetic associations will further our understanding of biology, once candidate variants have been identified, investigators are faced with the challenge of characterizing the functional effects on proteins encoded by such genes. Here we show how next-generation RNA sequencing data can be exploited to construct genotype-specific protein sequence databases, which provide a clearer picture of the molecular toolbox underlying cellular and organismal processes and their variation in a natural population. For this study, we used two individual genotypes (DENA-17-3 and VNDL-27-4) from a recent genome wide association (GWA) study of Populus trichocarpa, an obligate outcrosser that exhibits tremendous phenotypic variation across the natural population. This strategy allowed us to comprehensively catalogue proteins containing single amino acid polymorphisms (SAAPs) and insertions and deletions (INDELS). Based on large-scale identification of SAAPs, we profiled the frequency of 128 types of naturally occurring amino acid substitutions, with a subset of SAAPs occurring in regions of the genome having strong polymorphism patterns consistent with recent positive and/or divergent selection. In addition, we were able to explore the diploid landscape of Populus at the proteome-level, allowing the characterization of heterozygous variants.
Integrating mRNA and protein sequencing enables the detection and quantitative profiling of natural protein sequence variants of Populus trichocarpa

DOE PAGES

Abraham, Paul E.; Wang, Xiaojing; Ranjan, Priya; ...

2015-10-20

The availability of next-generation sequencing technologies has rapidly transformed our ability to link genotypes to phenotypes, and as such, promises to facilitate the dissection of genetic contribution to complex traits. Although discoveries of genetic associations will further our understanding of biology, once candidate variants have been identified, investigators are faced with the challenge of characterizing the functional effects on proteins encoded by such genes. Here we show how next-generation RNA sequencing data can be exploited to construct genotype-specific protein sequence databases, which provide a clearer picture of the molecular toolbox underlying cellular and organismal processes and their variation in a natural population. For this study, we used two individual genotypes (DENA-17-3 and VNDL-27-4) from a recent genome wide association (GWA) study of Populus trichocarpa, an obligate outcrosser that exhibits tremendous phenotypic variation across the natural population. This strategy allowed us to comprehensively catalogue proteins containing single amino acid polymorphisms (SAAPs) and insertions and deletions (INDELS). Based on large-scale identification of SAAPs, we profiled the frequency of 128 types of naturally occurring amino acid substitutions, with a subset of SAAPs occurring in regions of the genome having strong polymorphism patterns consistent with recent positive and/or divergent selection. In addition, we were able to explore the diploid landscape of Populus at the proteome-level, allowing the characterization of heterozygous variants.
Detection of viral infection and gene expression in clinical tissue specimens using branched DNA (bDNA) in situ hybridization.

PubMed

Kenny, Daryn; Shen, Lu-Ping; Kolberg, Janice A

2002-09-01

In situ hybridization (ISH) methods for detection of nucleic acid sequences have proved especially powerful for revealing genetic markers and gene expression in a morphological context. Although target and signal amplification technologies have enabled researchers to detect relatively low-abundance molecules in cell extracts, the sensitive detection of nucleic acid sequences in tissue specimens has proved more challenging. We recently reported the development of a branched DNA (bDNA) ISH method for detection of DNA and mRNA in whole cells. Based on bDNA signal amplification technology, bDNA ISH is highly sensitive and can detect one or two copies of DNA per cell. In this study we evaluated bDNA ISH for detection of nucleic acid sequences in tissue specimens. Using normal and human papillomavirus (HPV)-infected cervical biopsy specimens, we explored the cell type-specific distribution of HPV DNA and mRNA by bDNA ISH. We found that bDNA ISH allowed rapid, sensitive detection of nucleic acids with high specificity while preserving tissue morphology. As an adjunct to conventional histopathology, bDNA ISH may improve diagnostic accuracy and prognosis for viral and neoplastic diseases.
Cloning and sequencing of pyruvate decarboxylase (PDC) genes from bacteria and uses therefor

DOEpatents

Maupin-Furlow, Julie A [Gainesville, FL]; Talarico, Lee Ann [Gainesville, FL]; Raj, Krishnan Chandra [Tamil Nadu, IN]; Ingram, Lonnie O [Gainesville, FL]

2008-02-05

The invention provides isolated nucleic acid molecules which encode pyruvate decarboxylase enzymes having improved decarboxylase activity, substrate affinity, thermostability, and activity at different pH. The nucleic acids of the invention also have a codon usage which allows for high expression in a variety of host cells. Accordingly, the invention provides recombinant expression vectors containing such nucleic acid molecules, recombinant host cells comprising the expression vectors, host cells further comprising other ethanologenic enzymes, and methods for producing useful substances, e.g., acetaldehyde and ethanol, using such host cells.
Visualization of nucleic acids with synthetic exciton-controlled fluorescent oligonucleotide probes.

PubMed

Wang, Dan Ohtan; Okamoto, Akimitsu

2015-01-01

Probes engineered to adopt new photochemical properties upon recognition of target nucleic acids offer powerful tools for DNA and RNA visualization technologies. Herein, we describe a rapid and effective method for visualizing nucleic acids in both fixed and living cells with hybridization-sensitive fluorescent oligonucleotide probes. These probes are efficiently quenched in an aqueous environment due to the homodimeric, excitonic interactions between fluorophores but become highly fluorescent upon hybridization to DNA or RNA with complementary sequences. The fast hybridization kinetics and quick fluorescence activation of the new probes make it possible to simplify conventional fluorescent in situ hybridization protocols and reduce the time needed to process samples. Furthermore, the hybridization-sensitive fluorescence emission of the probes allows the dynamic behavior of RNA in living cells to be monitored.
An In Vitro Translation, Selection, and Amplification System for Peptide Nucleic Acids

PubMed Central

Brudno, Yevgeny; Birnbaum, Michael E.; Kleiner, Ralph E.; Liu, David R.

2009-01-01

Methods to evolve synthetic, rather than biological, polymers could significantly expand the functional potential of polymers that emerge from in vitro evolution. Requirements for synthetic polymer evolution include: (i) sequence-specific polymerization of synthetic building blocks on an amplifiable template; (ii) display of the newly translated polymer strand in a manner that allows it to adopt folded structures; (iii) selection of synthetic polymer libraries for desired binding or catalytic properties; and (iv) amplification of template sequences surviving selection in a manner that allows subsequent translation. Here we report the development of such a system for peptide nucleic acids (PNAs) using a set of twelve PNA pentamer building blocks. We validated the system by performing six iterated cycles of translation, selection, and amplification on a library of 4.3 × 10^8 PNA-encoding DNA templates and observed >1,000,000-fold overall enrichment of a template encoding a biotinylated (streptavidin-binding) PNA. These results collectively provide an experimental foundation for PNA evolution in the laboratory. PMID:20081830
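The figures quoted in this abstract invite a quick arithmetic check. The sketch below uses only numbers stated above (twelve building blocks, a library of about 4.3 × 10^8 templates, six cycles, more than 1,000,000-fold enrichment). Two readings in it are inferences rather than statements from the abstract: that the library size reflects eight positions each drawn from the twelve building blocks, and that enrichment was spread evenly over the rounds.

```python
import math

building_blocks = 12        # PNA pentamer building blocks (from the abstract)
library_size = 4.3e8        # templates in the library (from the abstract)
rounds = 6                  # translation/selection/amplification cycles
overall_enrichment = 1e6    # reported lower bound on overall enrichment

# Assumption: the library is combinatorial, so its size is
# building_blocks ** positions. Solve for the number of positions.
positions = math.log(library_size) / math.log(building_blocks)

# Assumption: enrichment is equal in every round, so the per-round
# factor is the geometric mean of the overall figure.
per_round = overall_enrichment ** (1 / rounds)

print(f"12**8 = {12**8:,}")
print(f"implied number of positions: {positions:.2f}")
print(f"mean enrichment per round: at least {per_round:.1f}-fold")
```

Since 12**8 is 429,981,696, the quoted library size is consistent with eight variable positions, and the overall figure corresponds to roughly a tenfold enrichment per cycle on average.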
In vitro selection of functional nucleic acids

NASA Technical Reports Server (NTRS)

Wilson, D. S.; Szostak, J. W.

1999-01-01

In vitro selection allows rare functional RNA or DNA molecules to be isolated from pools of over 10^15 different sequences. This approach has been used to identify RNA and DNA ligands for numerous small molecules, and recent three-dimensional structure solutions have revealed the basis for ligand recognition in several cases. By selecting high-affinity and -specificity nucleic acid ligands for proteins, promising new therapeutic and diagnostic reagents have been identified. Selection experiments have also been carried out to identify ribozymes that catalyze a variety of chemical transformations, including RNA cleavage, ligation, and synthesis, as well as alkylation and acyl-transfer reactions and N-glycosidic and peptide bond formation. The existence of such RNA enzymes supports the notion that ribozymes could have directed a primitive metabolism before the evolution of protein synthesis. New in vitro protein selection techniques should allow for a direct comparison of the frequency of ligand binding and catalytic structures in pools of random sequence polynucleotides versus polypeptides.
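To put the pool size quoted in this abstract in perspective, the short sketch below asks how long a fully random nucleotide region can be before the number of possible sequences exceeds the number of molecules in the pool. It is a back-of-the-envelope illustration, not something the abstract computes; the function name is hypothetical, and it assumes a four-letter alphabet and an idealized pool in which every molecule is distinct.

```python
def longest_fully_covered_length(pool_size, alphabet_size=4):
    """Largest n for which alphabet_size**n does not exceed pool_size.

    Hypothetical helper: idealized, assumes no duplicate molecules.
    """
    n = 0
    while alphabet_size ** (n + 1) <= pool_size:
        n += 1
    return n

pool_size = 10**15  # order of magnitude quoted in the abstract
n = longest_fully_covered_length(pool_size)

print(f"longest fully covered random region: {n} nt")
print(f"4**{n} = {4**n:,}")
print(f"4**{n + 1} = {4**(n + 1):,}")
```

Beyond that length a pool of this size can sample only a fraction of sequence space, which is why selections from longer random regions recover rare functional molecules rather than surveying every possibility.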
alpha-Crystallin A sequences of Alligator mississippiensis and the lizard Tupinambis teguixin: molecular evolution and reptilian phylogeny.

PubMed

de Jong, W W; Zweers, A; Versteeg, M; Dessauer, H C; Goodman, M

1985-11-01

The amino acid sequences of the eye lens protein alpha-crystallin A from many mammalian and avian species, two frog species, and a dogfish have provided detailed information about the molecular evolution of this protein and allowed some useful inferences about phylogenetic relationships among these species. We now have isolated and sequenced the alpha-crystallins of the American alligator and the common tegu lizard. The reptilian alpha A chains appear to have evolved as slowly as those of other vertebrates, i.e., at two to three amino acid replacements per 100 residues in 100 Myr. The lack of charged replacements and the general types and distribution of replacements also are similar to those in other vertebrate alpha A chains. Maximum-parsimony analyses of the total data set of 67 vertebrate alpha A sequences support the monophyletic origin of alligator, tegu, and birds and favor the grouping of crocodilians and birds as surviving sister groups in the subclass Archosauria.
Characterization of HIV Type 1 Envelope Sequence Among Viral Isolates Circulating in the Northern Region of Colombia, South America

PubMed Central

Villarreal, José-Luis; Gutiérrez, Jaime; Palacio, Lucy; Peñuela, Martha; Hernández, Robin; Lemay, Guy

2012-01-01

To characterize human immunodeficiency virus (HIV-1) strains circulating in the Northern region of Colombia in South America, sequences of the viral envelope C2V3C3 region were obtained from patients with different high-risk practices. Close to 60% of the sequences were predicted to belong to macrophage-tropic viruses, according to the positions of acidic amino acids and putative N-linked glycosylation sites. This is in agreement with the fact that most of the patients were recently diagnosed individuals. Phylogenetic analysis then allowed assignment of all 35 samples to subtype B viruses. This same subtype was found in previous studies carried out in other Colombian regions. This study thus expands previous analyses with previously missing data from the Northern region of the country. The number and the length of the sequences examined also help to provide a clearer picture of the prevailing situation of the present HIV epidemics in this country. PMID:22482735
Peptides derivatized with bicyclic quaternary ammonium ionization tags. Sequencing via tandem mass spectrometry.

PubMed

Setner, Bartosz; Rudowska, Magdalena; Klem, Ewelina; Cebrat, Marek; Szewczuk, Zbigniew

2014-10-01

Improving the sensitivity of detection and fragmentation of peptides to provide reliable sequencing of peptides is an important goal of mass spectrometric analysis. Peptides derivatized by bicyclic quaternary ammonium ionization tags, 1-azabicyclo[2.2.2]octane (ABCO) or 1,4-diazabicyclo[2.2.2]octane (DABCO), are characterized by an increased detection sensitivity in electrospray ionization mass spectrometry (ESI-MS) and longer retention times on reverse-phase (RP) chromatography columns. The improvement of the detection limit was observed even for peptides dissolved in 10 mM NaCl. Collision-induced dissociation tandem mass spectrometry of quaternary ammonium salt derivatives of peptides showed dominant a- and b-type ions, allowing facile sequencing of peptides. The bicyclic ionization tags are stable in collision-induced dissociation experiments, and the resulting fragmentation pattern is not significantly influenced by either acidic or basic amino acid residues in the peptide sequence. The results obtained indicate the general usefulness of the bicyclic quaternary ammonium ionization tags for ESI-MS/MS sequencing of peptides. Copyright © 2014 John Wiley & Sons, Ltd.
Identification of random nucleic acid sequence aberrations using dual capture probes which hybridize to different chromosome regions

DOEpatents

Lucas, J.N.; Straume, T.; Bogen, K.T.

1998-03-24

A method is provided for detecting nucleic acid sequence aberrations using two immobilization steps. According to the method, a nucleic acid sequence aberration is detected by detecting nucleic acid sequences having both a first nucleic acid sequence type (e.g., from a first chromosome) and a second nucleic acid sequence type (e.g., from a second chromosome), the presence of the first and the second nucleic acid sequence type on the same nucleic acid sequence indicating the presence of a nucleic acid sequence aberration. In the method, immobilization of a first hybridization probe is used to isolate a first set of nucleic acids in the sample which contain the first nucleic acid sequence type. Immobilization of a second hybridization probe is then used to isolate a second set of nucleic acids from within the first set of nucleic acids which contain the second nucleic acid sequence type. The second set of nucleic acids are then detected, their presence indicating the presence of a nucleic acid sequence aberration. 14 figs.
Identification of random nucleic acid sequence aberrations using dual capture probes which hybridize to different chromosome regions

DOEpatents

Lucas, Joe N.; Straume, Tore; Bogen, Kenneth T.

1998-01-01

A method is provided for detecting nucleic acid sequence aberrations using two immobilization steps. According to the method, a nucleic acid sequence aberration is detected by detecting nucleic acid sequences having both a first nucleic acid sequence type (e.g., from a first chromosome) and a second nucleic acid sequence type (e.g., from a second chromosome), the presence of the first and the second nucleic acid sequence type on the same nucleic acid sequence indicating the presence of a nucleic acid sequence aberration. In the method, immobilization of a first hybridization probe is used to isolate a first set of nucleic acids in the sample which contain the first nucleic acid sequence type. Immobilization of a second hybridization probe is then used to isolate a second set of nucleic acids from within the first set of nucleic acids which contain the second nucleic acid sequence type. The second set of nucleic acids are then detected, their presence indicating the presence of a nucleic acid sequence aberration.
Method for identifying and quantifying nucleic acid sequence aberrations

DOEpatents

Lucas, Joe N.; Straume, Tore; Bogen, Kenneth T.

1998-01-01

A method for detecting nucleic acid sequence aberrations by detecting nucleic acid sequences having both a first and a second nucleic acid sequence type, the presence of the first and second sequence type on the same nucleic acid sequence indicating the presence of a nucleic acid sequence aberration. The method uses a first hybridization probe which includes a nucleic acid sequence that is complementary to a first sequence type and a first complexing agent capable of attaching to a second complexing agent and a second hybridization probe which includes a nucleic acid sequence that selectively hybridizes to the second nucleic acid sequence type over the first sequence type and includes a detectable marker for detecting the second hybridization probe.
Method for identifying and quantifying nucleic acid sequence aberrations

DOEpatents

Lucas, J.N.; Straume, T.; Bogen, K.T.

1998-07-21

A method is disclosed for detecting nucleic acid sequence aberrations by detecting nucleic acid sequences having both a first and a second nucleic acid sequence type, the presence of the first and second sequence type on the same nucleic acid sequence indicating the presence of a nucleic acid sequence aberration. The method uses a first hybridization probe which includes a nucleic acid sequence that is complementary to a first sequence type and a first complexing agent capable of attaching to a second complexing agent and a second hybridization probe which includes a nucleic acid sequence that selectively hybridizes to the second nucleic acid sequence type over the first sequence type and includes a detectable marker for detecting the second hybridization probe. 11 figs.
TmiRUSite and TmiROSite scripts: searching for mRNA fragments with miRNA binding sites with encoded amino acid residues.

PubMed

Berillo, Olga; Régnier, Mireille; Ivashchenko, Anatoly

2014-01-01

microRNAs are small RNA molecules that inhibit the translation of target genes. microRNA binding sites are located in the untranslated regions as well as in the coding domains. We describe TmiRUSite and TmiROSite scripts developed using Python as tools for the extraction of nucleotide sequences for miRNA binding sites with their encoded amino acid residue sequences. The scripts allow for retrieving additional flanking sequences to the left and right of the binding site. The scripts present all received data in table formats that are easy to analyse further. The predicted data finds utility in molecular and evolutionary biology studies. They find use in studying miRNA binding sites in animals and plants. TmiRUSite and TmiROSite scripts are available for free from authors upon request and at https://sites.google.com/site/malaheenee/downloads for download.
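The extraction step described in this abstract can be illustrated with a minimal Python sketch. This is not the TmiRUSite or TmiROSite code; the function name `site_with_context`, its parameters, and the 0-based, end-exclusive coordinate convention are assumptions made for illustration. It returns the flanks around a binding site that lies inside a coding region, together with the residues encoded by the codons the site overlaps.

```python
# Standard genetic code, built from the canonical TCAG codon ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def site_with_context(mrna, cds_start, site_start, site_end, flank=3):
    """Return (left flank, site, right flank, encoded residues).

    Hypothetical helper: coordinates are 0-based and end-exclusive, and the
    site is assumed to lie inside the coding sequence starting at cds_start.
    """
    seq = mrna.upper().replace("U", "T")
    left = seq[max(0, site_start - flank):site_start]
    site = seq[site_start:site_end]
    right = seq[site_end:site_end + flank]
    # Widen the site to whole codons in the reading frame of the CDS.
    first = site_start - (site_start - cds_start) % 3
    last = site_end + (-(site_end - cds_start)) % 3
    codons = [seq[i:i + 3] for i in range(first, last, 3)]
    residues = "".join(CODON_TABLE.get(c, "X") for c in codons if len(c) == 3)
    return left, site, right, residues


print(site_with_context("CCATGGCTAAATTTTGAGG", cds_start=2, site_start=6, site_end=12))
```

Widening to whole codons means a site that starts or ends mid-codon still reports every residue it touches.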
Mammalian evolution: timing and implications from using the LogDeterminant transform for proteins of differing amino acid composition.

PubMed

Penny, D; Hasegawa, M; Waddell, P J; Hendy, M D

1999-03-01

We explore the tree of mammalian mtDNA sequences, using particularly the LogDet transform on amino acid sequences, the distance Hadamard transform, and the Closest Tree selection criterion. The amino acid compositions of different species show significant differences, even within mammals. After compensating for these differences, nearest-neighbor bootstrap results suggest that the tree is locally stable, though a few groups show slightly greater rearrangements when a large proportion of the constant sites are removed. Many parts of the trees we obtain agree with those on published protein ML trees. Interesting results include a preference for rodent monophyly. A few alternative signals to those on the optimal tree were detected using the distance Hadamard transform (with results expressed as a Lento plot). One rearrangement suggested was the interchange of the position of primates and rodents on the optimal tree. The basic stability of the tree, combined with two calibration points (whale/cow and horse/rhinoceros), together with a distant secondary calibration from the mammal/bird divergence, allows inferences of the times of divergence of putative clades. Allowing for sampling variances due to finite sequence length, most major divergences amongst lineages leading to modern orders appear to occur well before the Cretaceous/Tertiary (K/T) boundary. Implications arising from these early divergences are discussed, particularly the possibility of competition between the small dinosaurs and the new mammal clades.
Dynamic peptide libraries for the discovery of supramolecular nanomaterials

NASA Astrophysics Data System (ADS)

Pappas, Charalampos G.; Shafi, Ramim; Sasselli, Ivan R.; Siccardi, Henry; Wang, Tong; Narang, Vishal; Abzalimov, Rinat; Wijerathne, Nadeesha; Ulijn, Rein V.

2016-11-01

Sequence-specific polymers, such as oligonucleotides and peptides, can be used as building blocks for functional supramolecular nanomaterials. The design and selection of suitable self-assembling sequences is, however, challenging because of the vast combinatorial space available. Here we report a methodology that allows the peptide sequence space to be searched for self-assembling structures. In this approach, unprotected homo- and heterodipeptides (including aromatic, aliphatic, polar and charged amino acids) are subjected to continuous enzymatic condensation, hydrolysis and sequence exchange to create a dynamic combinatorial peptide library. The free-energy change associated with the assembly process itself gives rise to selective amplification of self-assembling candidates. By changing the environmental conditions during the selection process, different sequences and consequent nanoscale morphologies are selected.

Dynamic peptide libraries for the discovery of supramolecular nanomaterials.

PubMed

Pappas, Charalampos G; Shafi, Ramim; Sasselli, Ivan R; Siccardi, Henry; Wang, Tong; Narang, Vishal; Abzalimov, Rinat; Wijerathne, Nadeesha; Ulijn, Rein V

2016-11-01

Sequence-specific polymers, such as oligonucleotides and peptides, can be used as building blocks for functional supramolecular nanomaterials. The design and selection of suitable self-assembling sequences is, however, challenging because of the vast combinatorial space available. Here we report a methodology that allows the peptide sequence space to be searched for self-assembling structures. In this approach, unprotected homo- and heterodipeptides (including aromatic, aliphatic, polar and charged amino acids) are subjected to continuous enzymatic condensation, hydrolysis and sequence exchange to create a dynamic combinatorial peptide library. The free-energy change associated with the assembly process itself gives rise to selective amplification of self-assembling candidates. By changing the environmental conditions during the selection process, different sequences and consequent nanoscale morphologies are selected.
Computational structural analysis of an anti-l-amino acid antibody and inversion of its stereoselectivity

PubMed Central

Ranieri, Daniel I.; Hofstetter, Heike; Hofstetter, Oliver

2009-01-01

The binding site of a monoclonal anti-l-amino acid antibody was modeled using the program SWISS-MODEL. Docking experiments with the enantiomers of phenylalanine revealed that the antibody interacts with l-phenylalanine via hydrogen bonds and hydrophobic contacts, whereas the d-enantiomer is rejected due to steric hindrance. Comparison of the sequences of this antibody and an anti-d-amino acid antibody indicates that both immunoglobulins derived from the same germline progenitor. Substitution of four amino acid residues, three in the framework and one in the complementarity determining regions, allowed in silico conversion of the anti-l-amino acid antibody into an antibody that stereoselectively binds d-phenylalanine. PMID:19472280
Ultrahigh-resolution Fourier transform ion cyclotron resonance mass spectrometry and tandem mass spectrometry for peptide de novo amino acid sequencing for a seven-protein mixture by paired single-residue transposed Lys-N and Lys-C digestion.

PubMed

Guan, Xiaoyan; Brownstein, Naomi C; Young, Nicolas L; Marshall, Alan G

2017-01-30

Bottom-up tandem mass spectrometry (MS/MS) is regularly used in proteomics to identify proteins from a sequence database. De novo sequencing is also available for sequencing peptides with relatively short sequence lengths. We recently showed that paired Lys-C and Lys-N proteases produce peptides of identical mass and similar retention time, but different tandem mass spectra. Such parallel experiments provide complementary information, and allow for up to 100% MS/MS sequence coverage. Here, we report digestion by paired Lys-C and Lys-N proteases of a seven-protein mixture: human hemoglobin alpha, bovine carbonic anhydrase 2, horse skeletal muscle myoglobin, hen egg white lysozyme, bovine pancreatic ribonuclease, bovine rhodanese, and bovine serum albumin, followed by reversed-phase nanoflow liquid chromatography, collision-induced dissociation, and 14.5 T Fourier transform ion cyclotron resonance mass spectrometry. Matched pairs of product peptide ions of equal precursor mass and similar retention times from each digestion are compared, leveraging single-residue transposed information with independent interferences to confidently identify fragment ion types, residues, and peptides. Selected pairs of product ion mass spectra for de novo sequenced protein segments from each member of the mixture are presented. Pairs of the transposed product ions as well as complementary information from the parallel experiments allow for both high MS/MS coverage for long peptide sequences and high confidence in the amino acid identification. Moreover, the parallel experiments in the de novo sequencing reduce false-positive matches of product ions from the single-residue transposed peptides from the same segment, and thereby further improve the confidence in protein identification. Copyright © 2016 John Wiley & Sons, Ltd.
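The "single-residue transposed" peptide pairs mentioned above can be made concrete with a small in silico digestion in Python. This is a simplified sketch, not the authors' pipeline: it assumes complete cleavage with no missed sites, and all function names are hypothetical. Lys-C is modelled as cutting on the C-terminal side of every lysine and Lys-N on the N-terminal side, so interior peptides from the two digests differ only in which end carries the lysine and therefore share the same residue composition and mass.

```python
def digest_lys_c(protein):
    """Cleave after every K (C-terminal side); idealised complete digestion."""
    peptides, start = [], 0
    for i, aa in enumerate(protein):
        if aa == "K":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        peptides.append(protein[start:])
    return peptides


def digest_lys_n(protein):
    """Cleave before every K (N-terminal side); idealised complete digestion."""
    peptides, start = [], 0
    for i, aa in enumerate(protein):
        if aa == "K" and i > start:
            peptides.append(protein[start:i])
            start = i
    if protein:
        peptides.append(protein[start:])
    return peptides


def transposed_pairs(protein):
    """Pair each Lys-C peptide 'xyzK' with the Lys-N peptide 'Kxyz', if present."""
    lys_n = set(digest_lys_n(protein))
    pairs = []
    for pep in digest_lys_c(protein):
        if pep.endswith("K") and len(pep) > 1:
            partner = "K" + pep[:-1]
            if partner in lys_n:
                pairs.append((pep, partner))
    return pairs


print(transposed_pairs("AGKLMKPQR"))
```

Only peptides flanked by lysines on both sides of the original sequence form such pairs; the N-terminal and C-terminal peptides of the protein generally do not.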
ACTG: novel peptide mapping onto gene models.

PubMed

Choi, Seunghyuk; Kim, Hyunwoo; Paek, Eunok

2017-04-15

In many proteogenomic applications, mapping peptide sequences onto genome sequences can be very useful, because it allows us to understand origins of the gene products. Existing software tools either take the genomic position of a peptide start site as an input or assume that the peptide sequence exactly matches the coding sequence of a given gene model. In case of novel peptides resulting from genomic variations, especially structural variations such as alternative splicing, these existing tools cannot be directly applied unless users supply information about the variant, either its genomic position or its transcription model. Mapping potentially novel peptides to genome sequences, while allowing certain genomic variations, requires introducing novel gene models when aligning peptide sequences to gene structures. We have developed a new tool called ACTG (Amino aCids To Genome), which maps peptides to genome, assuming all possible single exon skipping, junction variation allowing three edit distances from the original splice sites, exon extension and frame shift. In addition, it can also consider SNVs (single nucleotide variations) during mapping phase if a user provides the VCF (variant call format) file as an input. Available at http://prix.hanyang.ac.kr/ACTG/search.jsp . eunokpaek@hanyang.ac.kr. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com
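The idea of mapping a peptide while allowing single exon skipping can be sketched as follows in Python. This is an illustration only, not ACTG itself: it assumes exons are given as forward-strand strings in transcript order, it covers only the single-exon-skipping case (no junction variation, exon extension or SNVs), and the names `candidate_models` and `map_peptide` are hypothetical.

```python
# Standard genetic code, built from the canonical TCAG codon ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(nt):
    """Translate whole codons only; unknown codons become 'X'."""
    return "".join(CODON_TABLE.get(nt[i:i + 3], "X") for i in range(0, len(nt) - 2, 3))


def candidate_models(exons):
    """Yield the canonical transcript plus every single internal exon skip."""
    yield "canonical", "".join(exons)
    for k in range(1, len(exons) - 1):
        yield f"skip_exon_{k + 1}", "".join(exons[:k] + exons[k + 1:])


def map_peptide(peptide, exons):
    """Return (model name, reading frame) for every model that encodes the peptide."""
    hits = []
    for name, transcript in candidate_models(exons):
        for frame in range(3):
            if peptide in translate(transcript[frame:]):
                hits.append((name, frame))
    return hits


exons = ["ATGGCT", "AAACCC", "GGGTTTTGA"]
print(map_peptide("MAGF", exons))   # peptide spanning the exon 1 / exon 3 junction
print(map_peptide("MAKP", exons))   # peptide explained by the canonical model
```

A peptide that maps only to a skip model is a candidate novel junction peptide; in a real setting each hit would still need genomic coordinates and scoring.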
Importance of databases of nucleic acids for bioinformatic analysis focused to genomics

NASA Astrophysics Data System (ADS)

Jimenez-Gutierrez, L. R.; Barrios-Hernández, C. J.; Pedraza-Ferreira, G. R.; Vera-Cala, L.; Martinez-Perez, F.

2016-08-01

Recently, bioinformatics has become a new field of science, indispensable in the analysis of the millions of nucleic acid sequences currently deposited in international databases (public or private); these databases contain information on genes, RNA, ORFs, proteins and intergenic regions, including entire genomes from some species. The analysis of this information requires computer programs, which have been renewed through the use of new mathematical methods and the introduction of artificial intelligence, in addition to the constant creation of supercomputing units built to withstand the heavy workload of sequence analysis. However, innovation is still needed on platforms that allow faster and more effective genomic analyses, with a technological understanding of all biological processes.
Dictionary-driven protein annotation.

PubMed

Rigoutsos, Isidore; Huynh, Tien; Floratos, Aris; Parida, Laxmi; Platt, Daniel

2002-09-01

Computational methods seeking to automatically determine the properties (functional, structural, physicochemical, etc.) of a protein directly from the sequence have long been the focus of numerous research groups. With the advent of advanced sequencing methods and systems, the number of amino acid sequences that are being deposited in the public databases has been increasing steadily. This has in turn generated a renewed demand for automated approaches that can annotate individual sequences and complete genomes quickly, exhaustively and objectively. In this paper, we present one such approach that is centered around and exploits the Bio-Dictionary, a collection of amino acid patterns that completely covers the natural sequence space and can capture functional and structural signals that have been reused during evolution, within and across protein families. Our annotation approach also makes use of a weighted, position-specific scoring scheme that is unaffected by the over-representation of well-conserved proteins and protein fragments in the databases used. For a given query sequence, the method permits one to determine, in a single pass, the following: local and global similarities between the query and any protein already present in a public database; the likeness of the query to all available archaeal/ bacterial/eukaryotic/viral sequences in the database as a function of amino acid position within the query; the character of secondary structure of the query as a function of amino acid position within the query; the cytoplasmic, transmembrane or extracellular behavior of the query; the nature and position of binding domains, active sites, post-translationally modified sites, signal peptides, etc. In terms of performance, the proposed method is exhaustive, objective and allows for the rapid annotation of individual sequences and full genomes. 
Annotation examples are presented and discussed in Results, including individual queries and complete genomes that were released publicly after we built the Bio-Dictionary that is used in our experiments. Finally, we have computed the annotations of more than 70 complete genomes and made them available on the World Wide Web at http://cbcsrv.watson.ibm.com/Annotations/.
Microbial and viral-like rhodopsins present in coastal marine sediments from four polar and subpolar regions

DOE Office of Scientific and Technical Information (OSTI.GOV)

López, José L.; Golemba, Marcelo; Hernández, Edgardo

Rhodopsins are broadly distributed. In this work, we analyzed 23 metagenomes corresponding to marine sediment samples from four regions that share cold climate conditions (Norway; Sweden; Argentina and Antarctica). In order to investigate the gene evolution of viral rhodopsins, an initial set of 6224 bacterial rhodopsin sequences according to COG5524 were retrieved from the 23 metagenomes. After selection by the presence of transmembrane domains and alignment, 123 viral (51) and non-viral (72) sequences (>50 amino acids) were finally included in further analysis. Viral rhodopsin genes were homologs of Phaeocystis globosa virus and Organic lake Phycodnavirus. Non-viral microbial rhodopsin genes were ascribed to Bacteroidetes, Planctomycetes, Firmicutes, Actinobacteria, Cyanobacteria, Proteobacteria, Deinococcus-Thermus and Cryptophyta and Fungi. A rescreening using Blastp, using as queries the viral sequences previously described, retrieved 30 sequences (>100 amino acids). Phylogeographic analysis revealed a geographical clustering of the sequences affiliated to the viral group. This clustering was not observed for the microbial non-viral sequences. The phylogenetic reconstruction allowed us to propose the existence of a putative ancestor of viral rhodopsin genes related to Actinobacteria and Chloroflexi. This is the first report about the existence of a phylogeographic association of the viral rhodopsin sequences from marine sediments.
Identification of single amino acid substitutions (SAAS) in neuraminidase from influenza A virus (H1N1) via mass spectrometry analysis coupled with de novo peptide sequencing.

PubMed

Peng, Qisheng; Wang, Zijian; Wu, Donglin; Li, Xiaoou; Liu, Xiaofeng; Sun, Wanchun; Liu, Ning

2016-08-01

Amino acid substitutions in the neuraminidase of the influenza virus are the main cause of the emergence of resistance to zanamivir or oseltamivir during seasonal influenza treatment; they are the result of non-synonymous mutations in the viral genome that can be successfully detected by polymerase chain reaction (PCR)-based approaches. There is always an urgent need to detect variation in amino acid sequences directly at the protein level. Mass spectrometry coupled with de novo sequencing has been explored as an alternative and straightforward strategy for detecting amino acid substitutions; this approach is the primary focus of the present study. Influenza virus (A/Puerto Rico/8/1934 H1N1) propagated in embryonated chicken eggs was purified by ultracentrifugation, followed by PNGase F treatment. The deglycosylated virion was lysed and separated by sodium dodecyl sulfate polyacrylamide gel electrophoresis (SDS-PAGE). The gel band corresponding to neuraminidase was picked up and subjected to liquid chromatography tandem mass spectrometry (LC-MS/MS) analysis. LC-MS/MS analyses, coupled with manual de novo sequencing, allowed the determination of three amino acid substitutions: R346K, S349N, and S370I/L, in the neuraminidase from the influenza virus (A/Puerto Rico/8/1934 H1N1), which were located in three mutated peptides of the neuraminidase: YGNGVWIGK, TKNHSSR, and PNGWTETDI/LK, respectively. We found that the amino acid substitutions in the proteins of RNA viruses (including influenza A virus) resulting from non-synonymous gene mutations can indeed be directly analyzed via mass spectrometry, and that manual interpretation of the MS/MS data may be beneficial. Copyright © 2016 John Wiley & Sons, Ltd.
From Ramachandran Maps to Tertiary Structures of Proteins.

PubMed

DasGupta, Debarati; Kaushik, Rahul; Jayaram, B

2015-08-27

Sequence to structure of proteins is an unsolved problem. A possible coarse grained resolution to this entails specification of all the torsional (Φ, Ψ) angles along the backbone of the polypeptide chain. The Ramachandran map quite elegantly depicts the allowed conformational (Φ, Ψ) space of proteins which is still very large for the purposes of accurate structure generation. We have divided the allowed (Φ, Ψ) space in Ramachandran maps into 27 distinct conformations sufficient to regenerate a structure to within 5 Å from the native, at least for small proteins, thus reducing the structure prediction problem to a specification of an alphanumeric string, i.e., the amino acid sequence together with one of the 27 conformations preferred by each amino acid residue. This still theoretically results in 27^n conformations for a protein comprising "n" amino acids. We then investigated the spatial correlations at the two-residue (dipeptide) and three-residue (tripeptide) levels in what may be described as higher order Ramachandran maps, with the premise that the allowed conformational space starts to shrink as we introduce neighborhood effects. We found, for instance, for a tripeptide which potentially can exist in any of the 27^3 "allowed" conformations, three-fourths of these conformations are redundant to the 95% confidence level, suggesting sequence context dependent preferred conformations. We then created a look-up table of preferred conformations at the tripeptide level and correlated them with energetically favorable conformations. We found in particular that Boltzmann probabilities calculated from van der Waals energies for each conformation of tripeptides correlate well with the observed populations in the structural database (the average correlation coefficient is ∼0.8).
An alpha-numeric string and hence the tertiary structure can be generated for any sequence from the look-up table within minutes on a single processor and to a higher level of accuracy if secondary structure can be specified. We tested the methodology on 100 small proteins, and in 90% of the cases, a structure within 5 Å is recovered. We thus believe that the method presented here provides the missing link between Ramachandran maps and tertiary structures of proteins. A Web server to convert a tertiary structure to an alphanumeric string and to predict the tertiary structure from the sequence of a protein using the above methodology is created and made freely accessible at http://www.scfbio-iitd.res.in/software/proteomics/rm2ts.jsp.
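Two quantities in this abstract are easy to make concrete: the size of the conformational string space (27 states per residue) and the Boltzmann probabilities computed from conformation energies. The Python sketch below is generic statistical mechanics, not the authors' code; the energies are made-up illustrative values, and the default `kT` of about 0.593 kcal/mol (roughly 298 K) is an assumption.

```python
import math


def conformation_count(n_residues, states=27):
    """Number of alphanumeric conformation strings for a chain of n residues."""
    return states ** n_residues


def boltzmann_probabilities(energies, kT=0.593):
    """p_i = exp(-E_i / kT) / Z, with energies in kcal/mol.

    kT defaults to an assumed value near 298 K. Energies are shifted by the
    minimum before exponentiation for numerical stability.
    """
    e_min = min(energies)
    weights = [math.exp(-(e - e_min) / kT) for e in energies]
    z = sum(weights)
    return [w / z for w in weights]


print(conformation_count(3))                       # tripeptide string space
print(boltzmann_probabilities([-2.0, -1.0, 0.0]))  # hypothetical energies
```

The lowest-energy conformation receives the largest probability, which is the quantity the abstract compares against observed populations in the structural database.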
Short peptides allowing preferential detection of Candida albicans hyphae.

PubMed

Kaba, Hani E J; Pölderl, Antonia; Bilitewski, Ursula

2015-09-01

Whereas the detection of pathogens via recognition of surface structures by specific antibodies and various types of antibody mimics is frequently described, the applicability of short linear peptides as sensor molecules or diagnostic tools is less well-known. We selected peptides which were previously reported to bind to recombinant S. cerevisiae cells, expressing members of the C. albicans Agglutinin-Like-Sequence (ALS) cell wall protein family. We slightly modified amino acid sequences to evaluate peptide sequence properties influencing binding to C. albicans cells. Among the selected peptides, decamer peptides with an "AP"-N-terminus were superior to shorter peptides. The new decamer peptide FBP4 stained viable C. albicans cells more efficiently in their mature hyphal form than in their yeast form. Moreover, it allowed distinction of C. albicans from other related Candida spp. and could thus be the basis for the development of a useful tool for the diagnosis of invasive candidiasis.
Method for isolating chromosomal DNA in preparation for hybridization in suspension

DOEpatents

Lucas, Joe N.

2000-01-01

A method is provided for detecting nucleic acid sequence aberrations using two immobilization steps. According to the method, a nucleic acid sequence aberration is detected by detecting nucleic acid sequences having both a first nucleic acid sequence type (e.g., from a first chromosome) and a second nucleic acid sequence type (e.g., from a second chromosome), the presence of the first and the second nucleic acid sequence type on the same nucleic acid sequence indicating the presence of a nucleic acid sequence aberration. In the method, immobilization of a first hybridization probe is used to isolate a first set of nucleic acids in the sample which contain the first nucleic acid sequence type. Immobilization of a second hybridization probe is then used to isolate a second set of nucleic acids from within the first set of nucleic acids which contain the second nucleic acid sequence type. The second set of nucleic acids are then detected, their presence indicating the presence of a nucleic acid sequence aberration. Chromosomal DNA in a sample containing cell debris is prepared for hybridization in suspension by treating the mixture with RNase. The treated DNA can also be fixed prior to hybridization.
RNA-seq based transcriptomic analysis uncovers α-linolenic acid and jasmonic acid biosynthesis pathways respond to cold acclimation in Camellia japonica

PubMed Central

Li, Qingyuan; Lei, Sheng; Du, Kebing; Li, Lizhi; Pang, Xufeng; Wang, Zhanchang; Wei, Ming; Fu, Shao; Hu, Limin; Xu, Lin

2016-01-01

Camellia is a well-known ornamental flower native to Southeast of Asia, including regions such as Japan, Korea and South China. However, most species in the genus Camellia are cold sensitive. To elucidate the cold stress responses in camellia plants, we carried out deep transcriptome sequencing of ‘Jiangxue’, a cold-tolerant cultivar of Camellia japonica, and approximately 1,006 million clean reads were generated using Illumina sequencing technology. The assembly of the clean reads produced 367,620 transcripts, including 207,592 unigenes. Overall, 28,038 differentially expressed genes were identified during cold acclimation. Detailed elucidation of responses of transcription factors, protein kinases and plant hormone signalling-related genes described the interplay of signal that allowed the plant to fine-tune cold stress responses. On the basis of global gene regulation of unsaturated fatty acid biosynthesis- and jasmonic acid biosynthesis-related genes, unsaturated fatty acid biosynthesis and jasmonic acid biosynthesis pathways were deduced to be involved in the low temperature responses in C. japonica. These results were supported by the determination of the fatty acid composition and jasmonic acid content. Our results provide insights into the genetic and molecular basis of the responses to cold acclimation in camellia plants. PMID:27819341
Optimization and validation of sample preparation for metagenomic sequencing of viruses in clinical samples.

PubMed

Lewandowska, Dagmara W; Zagordi, Osvaldo; Geissberger, Fabienne-Desirée; Kufner, Verena; Schmutz, Stefan; Böni, Jürg; Metzner, Karin J; Trkola, Alexandra; Huber, Michael

2017-08-08

Sequence-specific PCR is the most common approach for virus identification in diagnostic laboratories. However, as specific PCR only detects pre-defined targets, novel virus strains or viruses not included in routine test panels will be missed. Recently, advances in high-throughput sequencing allow for virus-sequence-independent identification of entire virus populations in clinical samples, yet standardized protocols are needed to allow broad application in clinical diagnostics. Here, we describe a comprehensive sample preparation protocol for high-throughput metagenomic virus sequencing using random amplification of total nucleic acids from clinical samples. In order to optimize metagenomic sequencing for application in virus diagnostics, we tested different enrichment and amplification procedures on plasma samples spiked with RNA and DNA viruses. A protocol including filtration, nuclease digestion, and random amplification of RNA and DNA in separate reactions provided the best results, allowing reliable recovery of viral genomes and a good correlation of the relative number of sequencing reads with the virus input. We further validated our method by sequencing a multiplexed viral pathogen reagent containing a range of human viruses from different virus families. Our method proved successful in detecting the majority of the included viruses with high read numbers and compared well to other protocols in the field validated against the same reference reagent. Our sequencing protocol does work not only with plasma but also with other clinical samples such as urine and throat swabs. The workflow for virus metagenomic sequencing that we established proved successful in detecting a variety of viruses in different clinical samples. Our protocol supplements existing virus-specific detection strategies providing opportunities to identify atypical and novel viruses commonly not accounted for in routine diagnostic panels.
Construction of Synthetic Immunogens in View of Developing Orally-Active Anti-Enterotoxigenic E. coli Vaccines

DTIC Science & Technology

1989-05-31

toxin lj-chain Adjuvant materials MOP-Lys - aminocaproic murabutide 6-O-succinyl murabutide Experimental Methods and Results 5 HPLC Analysis Dosage of...containing other E.coli antigens as suggested by Ahren and Svennerholm. (49). The amino acid sequence of CFA1 is now available (50) as well as the...MOP. The method of Reissig (56) has been used. It allows to evaluate specifically the N-acetyl group substituted in the 2-position of the muramic acid
Rational design of new materials using recombinant structural proteins: Current state and future challenges.

PubMed

Sutherland, Tara D; Huson, Mickey G; Rapson, Trevor D

2018-01-01

Sequence-definable polymers are seen as a prerequisite for design of future materials, with many polymer scientists regarding such polymers as the holy grail of polymer science. Recombinant proteins are sequence-defined polymers. Proteins are dictated by DNA templates and therefore the sequence of amino acids in a protein is defined, and molecular biology provides tools that allow redesign of the DNA as required. Despite this advantage, proteins are underrepresented in materials science. In this publication we investigate the advantages and limitations of using proteins as templates for rational design of new materials. Crown Copyright © 2017. Published by Elsevier Inc. All rights reserved.
Rapid motif compliance scoring with match weight sets.

PubMed

Venezia, D; O'Hara, P J

1993-02-01

Most current implementations of motif matching in biological sequences have sacrificed the generality of weight matrix scoring for shorter runtimes. The program MOTIF incorporates a weight matrix and a rapid, backtracking tree-search algorithm to score motif compliance with greatly enhanced performance while placing no constraints on the motif. In addition, any positions within a motif can be marked as 'inviolate', thereby requiring an exact match. MOTIF allows a choice of regular expression formats and can use both motif and sequence libraries as either targets or queries. Nucleic acid sequences can optionally be translated by MOTIF in any frame(s) and used against peptide motifs.
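Weight-matrix scoring with "inviolate" positions can be illustrated by a short Python sketch. This is not the MOTIF program: it uses a plain sliding-window scan rather than the backtracking tree search the abstract describes, and it assumes that an inviolate position requires a residue with a positive weight in that column. The names `score_window` and `scan` and the example weights are hypothetical.

```python
def score_window(window, weights, inviolate):
    """Sum per-position weights; return None if an inviolate position fails.

    weights is a list of dicts (one per motif position) mapping residue to
    weight; residues missing from a column score 0.
    """
    total = 0.0
    for pos, (residue, column) in enumerate(zip(window, weights)):
        w = column.get(residue, 0.0)
        if pos in inviolate and w <= 0.0:
            return None  # assumed meaning of 'inviolate': must match the column
        total += w
    return total


def scan(sequence, weights, inviolate=frozenset(), threshold=0.0):
    """Return (start, score) for every window scoring at or above threshold."""
    width = len(weights)
    hits = []
    for start in range(len(sequence) - width + 1):
        score = score_window(sequence[start:start + width], weights, inviolate)
        if score is not None and score >= threshold:
            hits.append((start, score))
    return hits


motif = [{"G": 2.0}, {"A": 1.0, "S": 0.5}, {"K": 2.0}]
print(scan("GAKTSKGSK", motif, inviolate={0}, threshold=3.0))
```

Keeping the inviolate check separate from the threshold lets a motif tolerate weak matches at flexible positions while still rejecting any window that breaks a required position.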
Local backbone structure prediction of proteins

PubMed Central

De Brevern, Alexandre G.; Benros, Cristina; Gautier, Romain; Valadié, Hélène; Hazout, Serge; Etchebest, Catherine

2004-01-01

Summary A statistical analysis of the PDB structures has led us to define a new set of small 3D structural prototypes called Protein Blocks (PBs). This structural alphabet includes 16 PBs, each one is defined by the (φ, Ψ) dihedral angles of 5 consecutive residues. The amino acid distributions observed in sequence windows encompassing these PBs are used to predict by a Bayesian approach the local 3D structure of proteins from the sole knowledge of their sequences. LocPred is a software which allows the users to submit a protein sequence and performs a prediction in terms of PBs. The prediction results are given both textually and graphically. PMID:15724288
Occurrence probability of structured motifs in random sequences.

PubMed

Robin, S; Daudin, J-J; Richard, H; Sagot, M-F; Schbath, S

2002-01-01

The problem of extracting from a set of nucleic acid sequences motifs which may have biological function is more and more important. In this paper, we are interested in particular motifs that may be implicated in the transcription process. These motifs, called structured motifs, are composed of two ordered parts separated by a variable distance and allowing for substitutions. In order to assess their statistical significance, we propose approximations of the probability of occurrences of such a structured motif in a given sequence. An application of our method to evaluate candidate promoters in E. coli and B. subtilis is presented. Simulations show the goodness of the approximations.
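A structured motif as defined above (two ordered boxes, a variable spacer, substitutions allowed) can be located with a simple Python search. The sketch below only finds occurrences; it does not implement the probability approximations that are the subject of the paper. The function names are hypothetical, and the example boxes TTGACA and TATAAT with a 16 to 18 nt spacer are the familiar E. coli promoter consensus used purely as an illustration.

```python
def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def find_structured_motif(seq, box1, box2, min_gap, max_gap, max_subs=0):
    """Return (start of box1, gap length) for every occurrence of box1-gap-box2.

    Each box may differ from its pattern by at most max_subs substitutions.
    """
    hits = []
    for i in range(len(seq) - len(box1) + 1):
        if hamming(seq[i:i + len(box1)], box1) > max_subs:
            continue
        for gap in range(min_gap, max_gap + 1):
            j = i + len(box1) + gap
            candidate = seq[j:j + len(box2)]
            if len(candidate) == len(box2) and hamming(candidate, box2) <= max_subs:
                hits.append((i, gap))
    return hits


seq = "CC" + "TTGACA" + "ACGTACGTACGTACGTA" + "TATAAT" + "GG"
print(find_structured_motif(seq, "TTGACA", "TATAAT", 16, 18))
```

Counting such hits in real sequences and comparing the count with the expected number under a random-sequence model is the step where the paper's approximations would come in.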
PLAAC: a web and command-line application to identify proteins with prion-like amino acid composition.

PubMed

Lancaster, Alex K; Nutter-Upham, Andrew; Lindquist, Susan; King, Oliver D

2014-09-01

Prions are self-templating protein aggregates that stably perpetuate distinct biological states and are of keen interest to researchers in both evolutionary and biomedical science. The best understood prions are from yeast and have a prion-forming domain with strongly biased amino acid composition, most notably enriched for Q or N. PLAAC is a web application that scans protein sequences for domains with Prion-Like Amino Acid Composition. Users can upload sequence files, or paste sequences directly into a textbox. PLAAC ranks the input sequences by several summary scores and allows scores along sequences to be visualized. Text output files can be downloaded for further analyses, and visualizations saved in PDF and PNG formats. http://plaac.wi.mit.edu/. The Ruby-based web framework and the command-line software (implemented in Java, with visualization routines in R) are available at http://github.com/whitehead/plaac under the MIT license. All software can be run under OS X, Windows and Unix. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
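The notion of a domain "enriched for Q or N" can be illustrated with a deliberately crude sliding-window scan in Python. This is not how PLAAC scores sequences and does not use its code or command-line interface; it is a simplified stand-in that only reports the fraction of Q and N residues per window. The function names, the window length of 10 and the example sequence are assumptions.

```python
def qn_profile(protein, window=10):
    """Fraction of Q/N residues in each window along the sequence."""
    protein = protein.upper()
    return [
        sum(aa in "QN" for aa in protein[i:i + window]) / window
        for i in range(len(protein) - window + 1)
    ]


def best_qn_window(protein, window=10):
    """Return (start, Q/N fraction) of the most enriched window, or None."""
    profile = qn_profile(protein, window)
    if not profile:
        return None  # sequence shorter than the window
    start = max(range(len(profile)), key=profile.__getitem__)
    return start, profile[start]


print(best_qn_window("MKTQNQNQQNNQQLEVAKD"))
```

A per-position profile like this is also the kind of data that could be plotted along the sequence, in the spirit of the visualizations the abstract mentions.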
Transposon Tn10 contains two structural genes with opposite polarity between tetA and IS10R.

PubMed Central

Schollmeier, K; Hillen, W

1984-01-01

The nucleotide sequence of the central part of Tn10 has been determined from the rightmost HindIII site to IS10R. This sequence contains two open reading frames with opposite polarity. The in vivo transcription start points in this sequence have been determined by S1 mapping. These results define one minor and two major promoters. The transcription starts of the two major promoters are only 18 base pairs apart, and the transcripts show different polarity and overlap by 18 base pairs. The nucleotide sequence reveals two regions with palindromic symmetry which may serve as operators. Their possible involvement in the regulation of transcription of both genes is discussed. Taken together these results allow for a maximal coding capacity of 138 amino acids directed toward IS10R and 197 amino acids directed toward tetA. The possible function of these gene products is discussed. The accompanying article (Braus et al., J. Bacteriol. 160:504-509, 1984) presents evidence that these genes are expressed. PMID:6094471

Principles of protein folding--a perspective from simple exact models.

PubMed Central

Dill, K. A.; Bromberg, S.; Yue, K.; Fiebig, K. M.; Yee, D. P.; Thomas, P. D.; Chan, H. S.

1995-01-01

General principles of protein structure, stability, and folding kinetics have recently been explored in computer simulations of simple exact lattice models. These models represent protein chains at a rudimentary level, but they involve few parameters, approximations, or implicit biases, and they allow complete explorations of conformational and sequence spaces. Such simulations have resulted in testable predictions that are sometimes unanticipated: The folding code is mainly binary and delocalized throughout the amino acid sequence. The secondary and tertiary structures of a protein are specified mainly by the sequence of polar and nonpolar monomers. More specific interactions may refine the structure, rather than dominate the folding code. Simple exact models can account for the properties that characterize protein folding: two-state cooperativity, secondary and tertiary structures, and multistage folding kinetics--fast hydrophobic collapse followed by slower annealing. These studies suggest the possibility of creating "foldable" chain molecules other than proteins. The encoding of a unique compact chain conformation may not require amino acids; it may require only the ability to synthesize specific monomer sequences in which at least one monomer type is solvent-averse. PMID:7613459
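A common example of a simple exact lattice model with a binary (polar/nonpolar) code is the two-dimensional HP model. The sketch below scores one conformation under the usual convention of -1 per contact between H monomers that are lattice neighbours but not chain neighbours. The function name, the coordinate representation, and the energy convention are assumptions made for illustration and are not taken from the abstract.

```python
from typing import List, Tuple

Coord = Tuple[int, int]


def hp_energy(sequence: str, coords: List[Coord]) -> int:
    """Energy of an HP chain on a square lattice: -1 per non-bonded H-H contact."""
    if len(sequence) != len(coords):
        raise ValueError("sequence and coords must have the same length")
    if len(set(coords)) != len(coords):
        raise ValueError("conformation is not self-avoiding")
    for (x1, y1), (x2, y2) in zip(coords, coords[1:]):
        if abs(x1 - x2) + abs(y1 - y2) != 1:
            raise ValueError("consecutive monomers must be lattice neighbours")

    position = {c: i for i, c in enumerate(coords)}
    contacts = 0
    for i, (x, y) in enumerate(coords):
        if sequence[i] != "H":
            continue
        # Look only right and up so that each lattice edge is counted once.
        for neighbour in ((x + 1, y), (x, y + 1)):
            j = position.get(neighbour)
            if j is not None and abs(i - j) > 1 and sequence[j] == "H":
                contacts += 1
    return -contacts


# A four-monomer chain folded into a square brings its two H ends together.
folded = hp_energy("HPPH", [(0, 0), (1, 0), (1, 1), (0, 1)])
extended = hp_energy("HPPH", [(0, 0), (1, 0), (2, 0), (3, 0)])
```

Because short chains have few self-avoiding conformations, every conformation can be enumerated and scored this way, which is what makes complete exploration of conformational space feasible in such models.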
Conservation of Three-Dimensional Helix-Loop-Helix Structure through the Vertebrate Lineage Reopens the Cold Case of Gonadotropin-Releasing Hormone-Associated Peptide

PubMed Central

Pérez Sirkin, Daniela I.; Lafont, Anne-Gaëlle; Kamech, Nédia; Somoza, Gustavo M.; Vissio, Paula G.; Dufour, Sylvie

2017-01-01

GnRH-associated peptide (GAP) is the C-terminal portion of the gonadotropin-releasing hormone (GnRH) preprohormone. Although it was reported in mammals that GAP may act as a prolactin-inhibiting factor and can be co-secreted with GnRH into the hypophyseal portal blood, GAP has been practically out of the research circuit for about 20 years. Comparative studies highlighted the low conservation of GAP primary amino acid sequences among vertebrates, which contributed to the view that this peptide only participates in the folding or carrying process of GnRH. Considering that the three-dimensional (3D) structure of a protein may define its function, the aim of this study was to evaluate whether GAP sequences and 3D structures are conserved in the vertebrate lineage. GAP sequences from various vertebrates were retrieved from databases. Analysis of primary amino acid sequence identity and similarity, molecular phylogeny, and prediction of 3D structures were performed. Amino acid sequence comparison and phylogeny analyses confirmed the large variation of GAP sequences throughout vertebrate radiation. In contrast, prediction of the 3D structure revealed a striking conservation of the 3D structure of GAP1 (GAP associated with the hypophysiotropic type 1 GnRH), despite low amino acid sequence conservation. This GAP1 peptide presented a typical helix-loop-helix (HLH) structure in all the vertebrate species analyzed. This HLH structure could also be predicted for GAP2 in some but not all vertebrate species and in none of the GAP3 peptides analyzed. These results allowed us to infer that selective pressures have maintained GAP1 HLH structure throughout the vertebrate lineage. The conservation of the HLH motif, known to confer biological activity to various proteins, suggests that GAP1 peptides may exert some hypophysiotropic biological functions across vertebrate radiation. PMID:28878737
Structure and biochemical functions of four simian virus 40 truncated large-T antigens.

PubMed Central

Chaudry, F; Harvey, R; Smith, A E

1982-01-01

The structure of four abnormal T antigens which are present in different simian virus 40 (SV40)-transformed mouse cell lines was studied by tryptic peptide mapping, partial proteolysis fingerprinting, immunoprecipitation with monoclonal antibodies, and in vitro translation. The results obtained allowed us to deduce that these proteins, which have apparent molecular weights of 15,000, 22,000, 33,000 and 45,000, are truncated forms of large-T antigen extending to different amounts into the amino acid sequences unique to large-T. The proteins are all phosphorylated, probably at a site between amino acids 106 and 123. The mRNAs coding for the proteins probably contain the normal large-T splice but are shorter than the normal transcripts of the SV40 early region. The truncated large-Ts were tested for the ability to bind to double-stranded DNA-cellulose. This showed that the 33,000- and 45,000-molecular-weight polypeptides contained sequences sufficient for binding under the conditions used, whereas the 15,000- and 22,000-molecular-weight forms did not. Together with published data, this allows the tentative mapping of a region of SV40 large-T between amino acids 109 and 272 that is necessary and may be sufficient for the binding to double-stranded DNA-cellulose in vitro. None of the truncated large-T species formed a stable complex with the host cell protein referred to as nonviral T-antigen or p53, suggesting that the carboxy-terminal sequences of large-T are necessary for complex formation. PMID:6292504
Using the QCM Biosensor-Based T7 Phage Display Combined with Bioinformatics Analysis for Target Identification of Bioactive Small Molecule.

PubMed

Takakusagi, Yoichi; Takakusagi, Kaori; Sugawara, Fumio; Sakaguchi, Kengo

2018-01-01

Identification of target proteins that directly bind to a bioactive small molecule is of great interest for clarifying the mode of action of the small molecule as well as for elucidating biological phenomena at the molecular level. Of the experimental technologies available, T7 phage display allows comprehensive screening of small molecule-recognizing amino acid sequences from the peptide libraries displayed on the T7 phage capsid. Here, we describe a T7 phage display strategy that is combined with a quartz-crystal microbalance (QCM) biosensor as the affinity selection platform and with bioinformatics analysis of small molecule-recognizing short peptides. This method dramatically enhances the efficacy and throughput of the screening for small molecule-recognizing amino acid sequences without repeated rounds of selection. Subsequent execution of bioinformatics programs allows combinatorial and comprehensive discovery of the target proteins of small molecules together with their binding sites, regardless of the protein sample insolubility, instability, or inaccessibility of the fixed small molecules to internally located binding sites on larger target proteins that arise when conventional proteomics approaches are used.
Molecular cloning and nucleotide sequence of the alpha and beta subunits of allophycocyanin from the cyanelle genome of Cyanophora paradoxa.

PubMed Central

Bryant, D A; de Lorimier, R; Lambert, D H; Dubbs, J M; Stirewalt, V L; Stevens, S E; Porter, R D; Tam, J; Jay, E

1985-01-01

The genes for the alpha- and beta-subunit apoproteins of allophycocyanin (AP) were isolated from the cyanelle genome of Cyanophora paradoxa and subjected to nucleotide sequence analysis. The AP beta-subunit apoprotein gene was localized to a 7.8-kilobase-pair Pst I restriction fragment from cyanelle DNA by hybridization with a tetradecameric oligonucleotide probe. Sequence analysis using that oligonucleotide and its complement as primers for the dideoxy chain-termination sequencing method confirmed the presence of both AP alpha- and beta-subunit genes on this restriction fragment. Additional oligonucleotide primers were synthesized as sequencing progressed and were used to determine rapidly the nucleotide sequence of a 1336-base-pair region of this cloned fragment. This strategy allowed the sequencing to be completed without a detailed restriction map and without extensive and time-consuming subcloning. The sequenced region contains two open reading frames whose deduced amino acid sequences are 81-85% homologous to cyanobacterial and red algal AP subunits whose amino acid sequences have been determined. The two open reading frames are in the same orientation and are separated by 39 base pairs. AP alpha is 5' to AP beta and both coding sequences are preceded by a polypurine, Shine-Dalgarno-type sequence. Sequences upstream from AP alpha closely resemble the Escherichia coli consensus promoter sequences and also show considerable homology to promoter sequences for several chloroplast-encoded psbA genes. A 56-base-pair palindromic sequence downstream from the AP beta gene could play a role in the termination of transcription or translation. The allophycocyanin apoprotein subunit genes are located on the large single-copy region of the cyanelle genome. PMID:2987916
Reductionist Approach in Peptide-Based Nanotechnology.

PubMed

Gazit, Ehud

2018-06-20

The formation of ordered nanostructures by molecular self-assembly of proteins and peptides represents one of the principal directions in nanotechnology. Indeed, polyamides provide superior features as materials with diverse physical properties. A reductionist approach allowed the identification of extremely short peptide sequences, as short as dipeptides, which could form well-ordered amyloid-like β-sheet-rich assemblies comparable to supramolecular structures made of much larger proteins. Some of the peptide assemblies show remarkable mechanical, optical, and electrical characteristics. Another direction of reductionism utilized a natural noncoded amino acid, α-aminoisobutyric acid, to form short superhelical assemblies. The use of this exceptional helix inducer motif allowed the fabrication of single heptad repeats used in various biointerfaces, including their use as surfactants and DNA-binding agents. Two additional directions of the reductionist approach include the use of peptide nucleic acids (PNAs) and coassembly techniques. The diversified accomplishments of the reductionist approach, as well as the exciting future advances it bears, are discussed.
Characterization of the genetic elements required for site-specific integration of plasmid pSE211 in Saccharopolyspora erythraea.

PubMed Central

Brown, D P; Idler, K B; Katz, L

1990-01-01

The 18.1-kilobase plasmid pSE211 integrates into the chromosome of Saccharopolyspora erythraea at a specific attB site. Restriction analysis of the integrated plasmid, pSE211int, and adjacent chromosomal sequences allowed identification of attP, the plasmid attachment site. Nucleotide sequencing of attP, attB, attL, and attR revealed a 57-base-pair sequence common to all sites with no duplications of adjacent plasmid or chromosomal sequences in the integrated state, indicating that integration takes place through conservative, reciprocal strand exchange. An analysis of the sequences indicated the presence of a putative gene for Phe-tRNA at attB which is preserved at attL after integration has occurred. A comparison of the attB site for a number of actinomycete plasmids is presented. Integration at attB was also observed when a 2.4-kilobase segment of pSE211 containing attP and the adjacent plasmid sequence was used to transform a pSE211- host. Nucleotide sequencing of this segment revealed the presence of two complete open reading frames (ORFs) and a segment of a third ORF. The ORF adjacent to attP encodes a putative polypeptide 437 amino acids in length that shows similarity, at its C-terminal domain, to sequences of site-specific recombinases of the integrase family. The adjacent ORF encodes a putative 98-amino-acid basic polypeptide that contains a helix-turn-helix motif at its N terminus which corresponds to domains in the Xis proteins of a number of bacteriophages. A proposal for the function of this polypeptide is presented. The deduced amino acid sequence of the third ORF did not reveal similarities to polypeptide sequences in the current data banks. PMID:2180909
Nucleic acid arrays and methods of synthesis

DOEpatents

Sabanayagam, Chandran R.; Sano, Takeshi; Misasi, John; Hatch, Anson; Cantor, Charles

2001-01-01

The present invention generally relates to high density nucleic acid arrays and methods of synthesizing nucleic acid sequences on a solid surface. Specifically, the present invention contemplates the use of stabilized nucleic acid primer sequences immobilized on solid surfaces, and circular nucleic acid sequence templates combined with the use of isothermal rolling circle amplification to thereby increase nucleic acid sequence concentrations in a sample or on an array of nucleic acid sequences.
Molecular Recognition and Structural Influences on Function in Bio-nanosystems of Nucleic Acids and Proteins

NASA Astrophysics Data System (ADS)

Sethaphong, Latsavongsakda

This work examines smart material properties of rational self-assembly and molecular recognition found in nano-biosystems. Exploiting the sequence and structural information encoded within nucleic acids and proteins will permit programmed synthesis of nanomaterials and help create molecular machines that may carry out new roles involving chemical catalysis and bioenergy. Responsive to different ionic environments through self-reorganization, nucleic acids (NAs) are nature's signature smart material; organisms such as viruses and bacteria use features of NAs to react to their environment and orchestrate their lifecycle. Furthermore, nucleic acid systems (both RNA and DNA) are currently exploited as scaffolds; recent applications have been showcased to build bioelectronics and biotemplated nanostructures via directed assembly of multidimensional nanoelectronic devices [1]. Since the most stable and rudimentary structure of nucleic acids is the helical duplex, these were modeled in order to examine the influence of the microenvironment, sequence, and cation-dependent perturbations of their canonical forms. Due to their negatively charged phosphate backbone, NAs rely on counterions to overcome the inherent repulsive forces that arise from the assembly of two complementary strands. As a realistic model system, we chose the HIV-TAR helix (PDB ID: 397D) to study the effect of specific sequence motifs on cation sequestration. At physiologically relevant concentrations of sodium and potassium ions, we observed sequence-based effects where purine stretches were adept in retaining high-residency cations. The transitional space between adenine and guanosine nucleotides (ApG step) in a sequence proved the most favorable. This work was the first to directly show these subtle interactions of sequence-based cationic sequestration and may be useful for controlling metallization of nucleic acids in conductive nanowires.
Extending the study further, we explored the degree to which the structure of NA duplexes alone interacted with cations distinct from a specific sequence. Under physiologically relevant conditions, a duplex of RNA polyguanine-polycytidine was highly responsive and able to sequester cations to the middle of the purine stretches. The least responsive structure was a DNA polyadenine-polythymine duplex. A random sequence DNA duplex contorted into an RNA-like helix resulted in cationic dynamics similar to RNA systems. These studies showed that cation diffusive binding events in nucleic acid duplex structures are sequence specific and heavily influenced by structural aspects of the helical forms, which account for much of the differences observed. Although structural information in nucleic acids is encoded within their sequence, linking amino acid sequence to protein structure is murkier; the structural information within proteins is encoded by the folding process itself: a complex phenomenon driven toward the equilibrium state of the active conformation. Upwards of two thirds of a protein's sequence can be substituted with similar amino acids without significantly perturbing its function; conserved residues of about 10% seem to be vital; since evolutionary selection pressure in proteins operates three-dimensionally, a linear sequence is only partially informative. We explored this problem by folding de novo the cytosolic portion of the membrane protein, cellulose synthase, CESA1 from upland cotton, Gossypium hirsutum (Ghcesa1). The cytoplasmic region was generated by homology modeling and refined with molecular dynamics. These mutations impair local structural flexibility which likely results in cellulose that is produced at a lower rate and is less crystalline. Additional modeling of fragments of cellulose synthases from the model plant, Arabidopsis thaliana, offered novel insights into the function of conserved cytosolic domains within plant cellulose synthases.
Transport mechanisms related to the transmembrane region revealed significant differences between plants and a bacterial complex. These studies generated possible mutations that may allow for the creation of new synthases and identified other avenues of research in order to develop technologies that may alter the crystallinity and other useful properties of cellulose. 1. Karplus, K., SAM-T08, HMM-based protein structure prediction. Nucleic Acids Research, 2009. 37: p. W492-W497.
Two Perspectives on the Origin of the Standard Genetic Code

NASA Astrophysics Data System (ADS)

Sengupta, Supratim; Aggarwal, Neha; Bandhu, Ashutosh Vishwa

2014-12-01

The origin of a genetic code made it possible to create ordered sequences of amino acids. In this article we provide two perspectives on code origin by carrying out simulations of code-sequence coevolution in finite populations with the aim of examining how the standard genetic code may have evolved from more primitive code(s) encoding a small number of amino acids. We determine the efficacy of the physico-chemical hypothesis of code origin in the absence and presence of horizontal gene transfer (HGT) by allowing a diverse collection of code-sequence sets to compete with each other. We find that in the absence of horizontal gene transfer, natural selection between competing codes distinguished by differences in the degree of physico-chemical optimization is unable to explain the structure of the standard genetic code. However, for certain probabilities of the horizontal transfer events, a universal code emerges having a structure that is consistent with the standard genetic code.
Physical Model of the Genotype-to-Phenotype Map of Proteins

NASA Astrophysics Data System (ADS)

Tlusty, Tsvi; Libchaber, Albert; Eckmann, Jean-Pierre

2017-04-01

How DNA is mapped to functional proteins is a basic question of living matter. We introduce and study a physical model of protein evolution which suggests a mechanical basis for this map. Many proteins rely on large-scale motion to function. We therefore treat protein as learning amorphous matter that evolves towards such a mechanical function: Genes are binary sequences that encode the connectivity of the amino acid network that makes a protein. The gene is evolved until the network forms a shear band across the protein, which allows for long-range, soft modes required for protein function. The evolution reduces the high-dimensional sequence space to a low-dimensional space of mechanical modes, in accord with the observed dimensional reduction between genotype and phenotype of proteins. Spectral analysis of the space of 10^6 solutions shows a strong correspondence between localization around the shear band of both mechanical modes and the sequence structure. Specifically, our model shows how mutations are correlated among amino acids whose interactions determine the functional mode.
NNAlign: a platform to construct and evaluate artificial neural network models of receptor-ligand interactions.

PubMed

Nielsen, Morten; Andreatta, Massimo

2017-07-03

Peptides are extensively used to characterize functional or (linear) structural aspects of receptor-ligand interactions in biological systems, e.g. SH2, SH3, PDZ peptide-recognition domains, the MHC membrane receptors and enzymes such as kinases and phosphatases. NNAlign is a method for the identification of such linear motifs in biological sequences. The algorithm aligns the amino acid or nucleotide sequences provided as training set, and generates a model of the sequence motif detected in the data. The webserver allows setting up cross-validation experiments to estimate the performance of the model, as well as evaluations on independent data. Many features of the training sequences can be encoded as input, and the network architecture is highly customizable. The results returned by the server include a graphical representation of the motif identified by the method, performance values and a downloadable model that can be applied to scan protein sequences for occurrence of the motif. While its performance for the characterization of peptide-MHC interactions is widely documented, we extended NNAlign to be applicable to other receptor-ligand systems as well. Version 2.0 supports alignments with insertions and deletions, encoding of receptor pseudo-sequences, and custom alphabets for the training sequences. The server is available at http://www.cbs.dtu.dk/services/NNAlign-2.0. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
Purification, characterization and sequence analysis of Omp50, a new porin isolated from Campylobacter jejuni.

PubMed Central

Bolla, J M; Dé, E; Dorez, A; Pagès, J M

2000-01-01

A novel pore-forming protein identified in Campylobacter was purified by ion-exchange chromatography and named Omp50 according to both its molecular mass and its outer membrane localization. We observed a pore-forming ability of Omp50 after re-incorporation into artificial membranes. The protein induced cation-selective channels with major conductance values of 50-60 pS in 1 M NaCl. N-terminal sequencing allowed us to identify the predicted coding sequence Cj1170c from the Campylobacter jejuni genome database as the corresponding gene in the NCTC 11168 genome sequence. The gene, designated omp50, consists of a 1425 bp open reading frame encoding a deduced 453-amino acid protein with a calculated pI of 5.81 and a molecular mass of 51169.2 Da. The protein possessed a 20-amino acid leader sequence. No significant similarity was found between Omp50 and porin protein sequences already determined. Moreover, the protein showed only weak sequence identity with the major outer-membrane protein (MOMP) of Campylobacter, correlating with the absence of antigenic cross-reactivity between these two proteins. Omp50 is expressed in C. jejuni and Campylobacter lari but not in Campylobacter coli. The gene, however, was detected in all three species by PCR. According to its conformation and functional properties, the protein would belong to the family of outer-membrane monomeric porins. PMID:11104668
Bioinformatic prediction and in vivo validation of residue-residue interactions in human proteins

NASA Astrophysics Data System (ADS)

Jordan, Daniel; Davis, Erica; Katsanis, Nicholas; Sunyaev, Shamil

2014-03-01

Identifying residue-residue interactions in protein molecules is important for understanding both protein structure and function in the context of evolutionary dynamics and medical genetics. Such interactions can be difficult to predict using existing empirical or physical potentials, especially when residues are far from each other in sequence space. Using a multiple sequence alignment of 46 diverse vertebrate species we explore the space of allowed sequences for orthologous protein families. Amino acid changes that are known to damage protein function allow us to identify specific changes that are likely to have interacting partners. We fit the parameters of the continuous-time Markov process used in the alignment to conclude that these interactions are primarily pairwise, rather than higher order. Candidates for sites under pairwise epistasis are predicted, which can then be tested by experiment. We report the results of an initial round of in vivo experiments in a zebrafish model that verify the presence of multiple pairwise interactions predicted by our model. These experimentally validated interactions are novel, distant in sequence, and are not readily explained by known biochemical or biophysical features.
Back to the future: Rational maps for exploring acetylcholine receptor space and time.

PubMed

Tessier, Christian J G; Emlaw, Johnathon R; Cao, Zhuo Qian; Pérez-Areales, F Javier; Salameh, Jean-Paul J; Prinston, Jethro E; McNulty, Melissa S; daCosta, Corrie J B

2017-11-01

Global functions of nicotinic acetylcholine receptors, such as subunit cooperativity and compatibility, likely emerge from a network of amino acid residues distributed across the entire pentameric complex. Identification of such networks has stymied traditional approaches to acetylcholine receptor structure and function, likely due to the cryptic interdependency of their underlying amino acid residues. An emerging evolutionary biochemistry approach, which traces the evolutionary history of acetylcholine receptor subunits, allows for rational mapping of acetylcholine receptor sequence space, and offers new hope for uncovering the amino acid origins of these enigmatic properties. Copyright © 2017 Elsevier B.V. All rights reserved.
TALEN-mediated targeted mutagenesis of fatty acid desaturase 2 (FAD2) in peanut (Arachis hypogaea L.) promotes the accumulation of oleic acid.

PubMed

Wen, Shijie; Liu, Hao; Li, Xingyu; Chen, Xiaoping; Hong, Yanbin; Li, Haifen; Lu, Qing; Liang, Xuanqiang

2018-05-01

First creation of high oleic acid peanut varieties by transcription activator-like effector nuclease (TALEN)-mediated targeted mutagenesis of fatty acid desaturase 2 (FAD2). Transcription activator-like effector nucleases (TALENs), which allow the precise editing of DNA, have already been developed and applied for genome engineering in diverse organisms. However, they are scarcely used in higher plant study and crop improvement, especially in allopolyploid plants. In the present study, we aimed to create targeted mutagenesis by TALENs in peanut. Targeted mutations in the conserved coding sequence of Arachis hypogaea fatty acid desaturase 2 (AhFAD2) were created by TALENs. Stable AhFAD2 mutations were identified by DNA sequencing in up to 9.52% and 4.11% of the regenerated plants at the two targeted sites, respectively. Mutation frequencies among AhFAD2 mutant lines were significantly correlated with oleic acid accumulation. Genetically stable individuals of positive mutant lines displayed a 0.5- to 2-fold increase in oleic acid content compared with non-transgenic controls. This finding suggests that TALEN-mediated targeted mutagenesis can increase the oleic acid content of edible peanut oil. Furthermore, this is the first report of a genome editing event in peanut, and the high-oleic mutants obtained could serve peanut breeding projects.
PCR detection of thermophilic spore-forming bacteria involved in canned food spoilage.

PubMed

Prevost, S; Andre, S; Remize, F

2010-12-01

Thermophilic bacteria that form highly heat-resistant spores constitute an important group of spoilage bacteria of low-acid canned food. A PCR assay was developed in order to rapidly trace these bacteria. Three PCR primer pairs were designed from rRNA gene sequences. These primers were evaluated for the specificity and the sensitivity of detection. Two primer pairs allowed detection at the species level of Geobacillus stearothermophilus and Moorella thermoacetica/thermoautotrophica. The other pair allowed group-specific detection of anaerobic thermophilic bacteria of the genera Thermoanaerobacterium, Thermoanaerobacter, Caldanaerobius and Caldanaerobacter. After a single enrichment step, these PCR assays allowed the detection of 28 thermophiles from 34 cans of spoiled low-acid food. In addition, 13 ingredients were screened for the presence of these bacteria. This PCR assay serves as a detection method for strains able to spoil low-acid canned food treated at 55°C. It will lead to better reactivity in the canning industry. Raw materials and ingredients might be qualified not only for quantitative spore contamination, but also for qualitative contamination by highly heat-resistant spores.
Rapid identification of acetic acid bacteria using MALDI-TOF mass spectrometry fingerprinting.

PubMed

Andrés-Barrao, Cristina; Benagli, Cinzia; Chappuis, Malou; Ortega Pérez, Ruben; Tonolla, Mauro; Barja, François

2013-03-01

Acetic acid bacteria (AAB) are widespread microorganisms characterized by their ability to transform alcohols and sugar-alcohols into their corresponding organic acids. The suitability of matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS) for the identification of cultured AAB involved in the industrial production of vinegar was evaluated on 64 reference strains from the genera Acetobacter, Gluconacetobacter and Gluconobacter. Analysis of MS spectra obtained from single colonies of these strains confirmed their basic classification based on comparative 16S rRNA gene sequence analysis. MALDI-TOF analyses of isolates from vinegar cross-checked by comparative sequence analysis of 16S rRNA gene fragments allowed AAB to be identified, and it was possible to differentiate them from mixed cultures and non-AAB. The results showed that MALDI-TOF MS analysis was a rapid and reliable method for the clustering and identification of AAB species. Copyright © 2012 Elsevier GmbH. All rights reserved.
Experimental and Theoretical Studies on the Nazarov Cyclization/Wagner-Meerwein Rearrangement Sequence

PubMed Central

Lebœuf, David; Ciesielski, Jennifer

2012-01-01

Highly functionalized cyclopentenones can be generated stereospecifically by a chemoselective copper(II)-mediated Nazarov/Wagner-Meerwein rearrangement sequence of divinyl ketones. A detailed investigation of this sequence is described including a study of substrate scope and limitations. After the initial 4π electrocyclization, this reaction proceeds via two different sequential [1,2]-shifts, with selectivity that depends upon either migratory ability or the steric bulkiness of the substituents at C1 and C5. This methodology allows the creation of vicinal stereogenic centers, including adjacent quaternary centers. This sequence can also be achieved by using a catalytic amount of copper(II) in combination with NaBAr4f, a weak Lewis acid. During the study of the scope of the reaction, a partial or complete E/Z isomerization of the enone moiety was observed in some cases prior to the cyclization, which resulted in a mixture of diastereomeric products. Use of a Cu(II)-bisoxazoline complex prevented the isomerization, allowing high diastereoselectivity to be obtained in all substrate types. In addition, the reaction sequence was studied by DFT computations at the UB3LYP/6-31G(d,p) level, which are consistent with the proposed sequences observed, including E/Z isomerizations and chemoselective Wagner-Meerwein shifts. PMID:22471833

Using One's Hands for Naming Optical Isomers and Other Stereochemical Positions.

ERIC Educational Resources Information Center

Mezl, Vasek A.

1996-01-01

Presents a method that allows students to use their hands to obtain the stereochemistry of chiral centers without redrawing the structure. Discusses the use of the model in: determining the configurations of amino acids, determining if sugars are D or L isomers, the sequence rule procedure, prochirality, naming the sides of trigonal carbons, and…
Molecular Design of Performance Proteins With Repetitive Sequences

NASA Astrophysics Data System (ADS)

Vendrely, Charlotte; Ackerschott, Christian; Römer, Lin; Scheibel, Thomas

Most performance proteins responsible for the mechanical stability of cells and organisms reveal highly repetitive sequences. Mimicking such performance proteins is of high interest for the design of nanostructured biomaterials. In this article, flagelliform silk is introduced as an example to describe a general principle for designing genes of repetitive performance proteins for recombinant expression in Escherichia coli. In the first step, repeating amino acid sequence motifs are back-translated into DNA cassettes, which can in a second step be seamlessly ligated, yielding a designed gene. Recombinant expression thereof leads to proteins mimicking the natural ones. The recombinant proteins can be assembled into nanostructured materials in a controlled manner, allowing their use in several applications.
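The two design steps described above (back-translating a repeating motif into a DNA cassette, then joining cassettes head to tail) can be sketched roughly as follows. This is a minimal illustration, not the authors' procedure: the motif string, the codon choices and the function names are all assumptions made for the example, and a real design would rely on a measured E. coli codon-usage table and on the cloning strategy described in the article.

```python
# Hypothetical codon choices: one valid codon per residue, picked only
# for illustration (not taken from the article).
CODON_FOR = {
    "G": "GGT",  # glycine
    "P": "CCG",  # proline
    "S": "AGC",  # serine
    "A": "GCG",  # alanine
    "Y": "TAT",  # tyrosine
}


def back_translate(motif: str) -> str:
    """Return one DNA sequence that encodes the given amino acid motif."""
    return "".join(CODON_FOR[aa] for aa in motif)


def build_gene(motif: str, repeats: int) -> str:
    """Join identical cassettes head to tail, mimicking seamless ligation."""
    cassette = back_translate(motif)
    return cassette * repeats


# Hypothetical glycine/proline-rich motif, repeated three times.
gene = build_gene("GPGGA", repeats=3)
print(len(gene))
```

Because every cassette is a whole number of codons, concatenation keeps the reading frame intact, which is the property that seamless ligation is meant to preserve.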
Dictionary-driven protein annotation

PubMed Central

Rigoutsos, Isidore; Huynh, Tien; Floratos, Aris; Parida, Laxmi; Platt, Daniel

2002-01-01

Computational methods seeking to automatically determine the properties (functional, structural, physicochemical, etc.) of a protein directly from the sequence have long been the focus of numerous research groups. With the advent of advanced sequencing methods and systems, the number of amino acid sequences that are being deposited in the public databases has been increasing steadily. This has in turn generated a renewed demand for automated approaches that can annotate individual sequences and complete genomes quickly, exhaustively and objectively. In this paper, we present one such approach that is centered around and exploits the Bio-Dictionary, a collection of amino acid patterns that completely covers the natural sequence space and can capture functional and structural signals that have been reused during evolution, within and across protein families. Our annotation approach also makes use of a weighted, position-specific scoring scheme that is unaffected by the over-representation of well-conserved proteins and protein fragments in the databases used. For a given query sequence, the method permits one to determine, in a single pass, the following: local and global similarities between the query and any protein already present in a public database; the likeness of the query to all available archaeal/bacterial/eukaryotic/viral sequences in the database as a function of amino acid position within the query; the character of secondary structure of the query as a function of amino acid position within the query; the cytoplasmic, transmembrane or extracellular behavior of the query; the nature and position of binding domains, active sites, post-translationally modified sites, signal peptides, etc. In terms of performance, the proposed method is exhaustive, objective and allows for the rapid annotation of individual sequences and full genomes. 
Annotation examples are presented and discussed in Results, including individual queries and complete genomes that were released publicly after we built the Bio-Dictionary that is used in our experiments. Finally, we have computed the annotations of more than 70 complete genomes and made them available on the World Wide Web at http://cbcsrv.watson.ibm.com/Annotations/. PMID:12202776
Identification and expression analysis of two pro-inflammatory cytokines, TNF-α and IL-8, in cobia (Rachycentron canadum L.) in response to Streptococcus dysgalactiae infection.

PubMed

Nguyen, Thuy Thi Thu; Nguyen, Hai Trong; Wang, Pei-Chyi; Chen, Shih-Chu

2017-08-01

Tumor necrosis factor-alpha (TNF-α) and interleukin-8 (IL-8/CXCL8) play pivotal roles in mediating inflammatory responses to invading pathogens. In this study, we identified and analyzed expressions of cobia TNF-α and IL-8 during Streptococcus dysgalactiae infection. The cloned cDNA transcript of cobia TNF-α comprised 1281 base pairs (bp), with a 774-bp open reading frame (ORF) encoding 257 amino acids. The deduced amino acid sequence of cobia TNF-α showed a close relationship (84% similarity) with TNF-α of yellowtail amberjack. The cloned IL-8 cDNA sequence was 828 bp long, including a 300-bp ORF encoding 99 amino acids. The deduced amino acid sequence of cobia IL-8 shared 90% identity with IL-8 of striped trumpeter. Cobia challenged with a virulent S. dysgalactiae strain displayed an early significant up-regulation of TNF-α and IL-8 in head kidney, liver, and spleen. Notably, IL-8 expression level increased dramatically in the liver at the severe stage of infection (72 h). In conclusion, a better understanding of TNF-α and IL-8 allows more detailed investigation of immune responses in cobia and supports further study on controlling the infectious disease caused by S. dysgalactiae. Copyright © 2017 Elsevier Ltd. All rights reserved.
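The ORF lengths and residue counts quoted above are consistent if the ORF length is taken to include the stop codon, which is an assumption here rather than something the abstract states. A small check under that assumption (the function name is hypothetical):

```python
def residues_encoded(orf_bp: int) -> int:
    """Number of amino acids encoded by an ORF whose length includes the stop codon."""
    if orf_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    # Each codon is 3 bp; the final codon is a stop and encodes no residue.
    return orf_bp // 3 - 1


print(residues_encoded(774))  # TNF-alpha ORF
print(residues_encoded(300))  # IL-8 ORF
```

For the two ORFs reported, this gives 257 and 99 residues, matching the figures in the abstract.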
Genome sequence of the highly weak-acid-tolerant Zygosaccharomyces bailii IST302, amenable to genetic manipulations and physiological studies.

PubMed

Palma, Margarida; Münsterkötter, Martin; Peça, João; Güldener, Ulrich; Sá-Correia, Isabel

2017-06-01

Zygosaccharomyces bailii is one of the most problematic spoilage yeast species found in the food and beverage industry, particularly in acidic products, due to its exceptional resistance to weak acid stress. This article describes the annotation of the genome sequence of Z. bailii IST302, a strain recently proven to be amenable to genetic manipulations and physiological studies. The work was based on the annotated genomes of strain ISA1307, an interspecies hybrid between Z. bailii and a closely related species, and the Z. bailii reference strain CLIB 213T. The resulting genome sequence of Z. bailii IST302 is distributed through 105 scaffolds, comprising a total of 5142 genes and a size of 10.8 Mb. Contrasting with CLIB 213T, strain IST302 does not form cell aggregates, allowing its manipulation in the laboratory for genetic and physiological studies. Comparative cell cycle analysis with the haploid and diploid Saccharomyces cerevisiae strains BY4741 and BY4743, respectively, suggests that Z. bailii IST302 is haploid. This is an additional trait that makes this strain attractive for the functional analysis of non-essential genes envisaging the elucidation of mechanisms underlying its high tolerance to weak acid food preservatives, or the investigation and exploitation of the potential of this resilient yeast species as a cell factory. © FEMS 2017.
The catalytic chain of human complement subcomponent C1r. Purification and N-terminal amino acid sequences of the major cyanogen bromide-cleavage fragments.

PubMed

Arlaud, G J; Gagnon, J; Porter, R R

1982-01-01

1. The a- and b-chains of reduced and alkylated human complement subcomponent C1r were separated by high-pressure gel-permeation chromatography and isolated in good yield and in pure form. 2. CNBr cleavage of C1r b-chain yielded eight major peptides, which were purified by gel filtration and high-pressure reversed-phase chromatography. As determined from the sum of their amino acid compositions, these peptides accounted for a minimum molecular weight of 28 000, close to the value 29 100 calculated from the whole b-chain. 3. N-Terminal sequence determinations of C1r b-chain and its CNBr-cleavage peptides allowed the identification of about two-thirds of the amino acids of C1r b-chain. From our results, and on the basis of homology with other serine proteinases, an alignment of the eight CNBr-cleavage peptides from C1r b-chain is proposed. 4. The residues forming the 'charge-relay' system of the active site of serine proteinases (His-57, Asp-102 and Ser-195 in the chymotrypsinogen numbering) are found in the corresponding regions of C1r b-chain, and the amino acid sequence around these residues has been determined. 5. The N-terminal sequence of C1r b-chain has been extended to residue 60 and reveals that C1r b-chain lacks the 'histidine loop', a disulphide bond that is present in all other known serine proteinases.
37 CFR 1.822 - Symbols and format to be used for nucleotide and/or amino acid sequence data.

Code of Federal Regulations, 2011 CFR

2011-07-01

... for nucleotide and/or amino acid sequence data. 1.822 Section 1.822 Patents, Trademarks, and... Amino Acid Sequences § 1.822 Symbols and format to be used for nucleotide and/or amino acid sequence data. (a) The symbols and format to be used for nucleotide and/or amino acid sequence data shall...
Optimized invertase expression and secretion cassette for improving Yarrowia lipolytica growth on sucrose for industrial applications.

PubMed

Lazar, Zbigniew; Rossignol, Tristan; Verbeke, Jonathan; Crutz-Le Coq, Anne-Marie; Nicaud, Jean-Marc; Robak, Małgorzata

2013-11-01

Yarrowia lipolytica requires the expression of a heterologous invertase to grow on a sucrose-based substrate. This work reports the construction of an optimized invertase expression cassette composed of the Saccharomyces cerevisiae Suc2p secretion signal sequence followed by the SUC2 sequence and under the control of the strong Y. lipolytica pTEF promoter. This new construction allows a fast and optimal cleavage of sucrose into glucose and fructose and allows cells to reach the maximum growth rate. Contrary to pre-existing constructions, the expression of SUC2 is not sensitive to medium composition in this context. The strain JMY2593, expressing this new cassette with an optimized secretion signal sequence and a strong promoter, produces 4,519 U/l of extracellular invertase in bioreactor experiments compared to 597 U/l in a strain expressing the former invertase construction. The expression of this cassette strongly improved production of invertase and is suitable for simultaneous high-level production of citric acid from sucrose-based media.
Investigation of mRNA quadruplex formation in Escherichia coli.

PubMed

Wieland, Markus; Hartig, Jörg S

2009-01-01

The protocol presented here allows for the investigation of the formation of unusual nucleic acid structures in the 5'-untranslated region (UTR) of bacteria by correlating gene expression levels to the in vitro stability of the respective structure. In particular, we describe the introduction of G-quadruplex forming sequences close to the ribosome-binding site (RBS) on the mRNA of a reporter gene and the subsequent read-out of the expression levels. Insertion of a stable secondary structure results in the cloaking of the RBS and eventually reduced gene expression levels. The structures and stability of the introduced sequences are further characterized by circular dichroism (CD) spectroscopy and thermal melting experiments. The extent of inhibition is then correlated to the stability of the respective quadruplex structure, allowing judgement of whether factors other than thermodynamic stability affect the formation of a given quadruplex sequence in vivo. Measuring gene expression levels takes 2 days including cloning; CD experiments take 5 hours per experiment.
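Before inserting a candidate sequence near the RBS, it can be useful to check that it matches a G-quadruplex-like pattern at all. The sketch below uses a widely used heuristic (four runs of three or more guanines separated by loops of one to seven nucleotides); this heuristic, the function name and the example sequence are assumptions for illustration and are not taken from the protocol, which characterizes structures experimentally by CD and thermal melting.

```python
import re

# Heuristic pattern: four G-runs (>= 3 G each) separated by 1-7 nt loops.
G4_PATTERN = re.compile(r"G{3,}(?:[ACGU]{1,7}G{3,}){3}")


def find_putative_quadruplexes(rna: str) -> list[tuple[int, str]]:
    """Return (0-based start, matched sequence) for each non-overlapping hit."""
    # Normalise to an upper-case RNA alphabet so DNA input also works.
    rna = rna.upper().replace("T", "U")
    return [(m.start(), m.group()) for m in G4_PATTERN.finditer(rna)]


# Hypothetical 5'-UTR fragment containing one candidate motif.
print(find_putative_quadruplexes("AUGGGAGGGUGGGCGGGAA"))
```

A pattern match only flags a candidate; whether the sequence actually folds, and how stable it is, still has to be measured.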
Genomic Enzymology: Web Tools for Leveraging Protein Family Sequence–Function Space and Genome Context to Discover Novel Functions

PubMed Central

2017-01-01

The exponentially increasing number of protein and nucleic acid sequences provides opportunities to discover novel enzymes, metabolic pathways, and metabolites/natural products, thereby adding to our knowledge of biochemistry and biology. The challenge has evolved from generating sequence information to mining the databases to integrating and leveraging the available information, i.e., the availability of “genomic enzymology” web tools. Web tools that allow identification of biosynthetic gene clusters are widely used by the natural products/synthetic biology community, thereby facilitating the discovery of novel natural products and the enzymes responsible for their biosynthesis. However, many novel enzymes with interesting mechanisms participate in uncharacterized small-molecule metabolic pathways; their discovery and functional characterization also can be accomplished by leveraging information in protein and nucleic acid databases. This Perspective focuses on two genomic enzymology web tools that assist the discovery of novel metabolic pathways: (1) Enzyme Function Initiative-Enzyme Similarity Tool (EFI-EST) for generating sequence similarity networks to visualize and analyze sequence–function space in protein families and (2) Enzyme Function Initiative-Genome Neighborhood Tool (EFI-GNT) for generating genome neighborhood networks to visualize and analyze the genome context in microbial and fungal genomes. Both tools have been adapted to other applications to facilitate target selection for enzyme discovery and functional characterization. As the natural products community has demonstrated, the enzymology community needs to embrace the essential role of web tools that allow the protein and genome sequence databases to be leveraged for novel insights into enzymological problems. PMID:28826221
Integrating mRNA and Protein Sequencing Enables the Detection and Quantitative Profiling of Natural Protein Sequence Variants of Populus trichocarpa.

PubMed

Abraham, Paul E; Wang, Xiaojing; Ranjan, Priya; Nookaew, Intawat; Zhang, Bing; Tuskan, Gerald A; Hettich, Robert L

2015-12-04

Next-generation sequencing has transformed the ability to link genotypes to phenotypes and facilitates the dissection of genetic contribution to complex traits. However, it is challenging to link genetic variants with the perturbed functional effects on proteins encoded by such genes. Here we show how RNA sequencing can be exploited to construct genotype-specific protein sequence databases to assess natural variation in proteins, providing information about the molecular toolbox driving cellular processes. For this study, we used two natural genotypes selected from a recent genome-wide association study of Populus trichocarpa, an obligate outcrosser with tremendous phenotypic variation across the natural population. This strategy allowed us to comprehensively catalogue proteins containing single amino acid polymorphisms (SAAPs), as well as insertions and deletions. We profiled the frequency of 128 types of naturally occurring amino acid substitutions, including both expected (neutral) and unexpected (non-neutral) SAAPs, with a subset occurring in regions of the genome having strong polymorphism patterns consistent with recent positive and/or divergent selection. By zeroing in on the molecular signatures of these important regions that might have previously been uncharacterized, we now provide a high-resolution molecular inventory that should improve accessibility and subsequent identification of natural protein variants in future genotype-to-phenotype studies.
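Once genotype-specific protein sequences are available, listing single amino acid polymorphisms between two variants of the same protein is a position-by-position comparison. The sketch below assumes the two sequences are already aligned and of equal length (no insertions or deletions); the function name and the example sequences are hypothetical and are not part of the study's pipeline.

```python
def find_saaps(reference: str, variant: str) -> list[tuple[int, str, str]]:
    """Return (1-based position, reference residue, variant residue) for each mismatch."""
    if len(reference) != len(variant):
        raise ValueError("sequences must be pre-aligned to equal length")
    return [
        (pos, ref_aa, var_aa)
        for pos, (ref_aa, var_aa) in enumerate(zip(reference, variant), start=1)
        if ref_aa != var_aa
    ]


# Hypothetical protein fragments from two genotypes.
print(find_saaps("MKTAYL", "MKSAYV"))
```

Handling the insertions and deletions that the study also catalogues would require a proper pairwise alignment step before this comparison.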
Solid phase sequencing of double-stranded nucleic acids

DOEpatents

Fu, Dong-Jing; Cantor, Charles R.; Koster, Hubert; Smith, Cassandra L.

2002-01-01

This invention relates to methods for detecting and sequencing of target double-stranded nucleic acid sequences, to nucleic acid probes and arrays of probes useful in these methods, and to kits and systems which contain these probes. Useful methods involve hybridizing the nucleic acids or nucleic acids which represent complementary or homologous sequences of the target to an array of nucleic acid probes. These probes comprise a single-stranded portion, an optional double-stranded portion and a variable sequence within the single-stranded portion. The molecular weights of the hybridized nucleic acids of the set can be determined by mass spectroscopy, and the sequence of the target determined from the molecular weights of the fragments. Nucleic acids whose sequences can be determined include nucleic acids in biological samples such as patient biopsies and environmental samples. Probes may be fixed to a solid support such as a hybridization chip to facilitate automated determination of molecular weights and identification of the target sequence.
Protein structure and the sequential structure of mRNA: alpha-helix and beta-sheet signals at the nucleotide level.

PubMed

Brunak, S; Engelbrecht, J

1996-06-01

A direct comparison of experimentally determined protein structures and their corresponding protein coding mRNA sequences has been performed. We examine whether real world data support the hypothesis that clusters of rare codons correlate with the location of structural units in the resulting protein. The degeneracy of the genetic code allows for a biased selection of codons which may control the translational rate of the ribosome, and may thus in vivo have a catalyzing effect on the folding of the polypeptide chain. A complete search for GenBank nucleotide sequences coding for structural entries in the Brookhaven Protein Data Bank produced 719 protein chains with matching mRNA sequence, amino acid sequence, and secondary structure assignment. By neural network analysis, we found strong signals in mRNA sequence regions surrounding helices and sheets. These signals do not originate from the clustering of rare codons, but from the similarity of codons coding for very abundant amino acid residues at the N- and C-termini of helices and sheets. No correlation between the positioning of rare codons and the location of structural units was found. The mRNA signals were also compared with conserved nucleotide features of 16S-like ribosomal RNA sequences and related to mechanisms for maintaining the correct reading frame by the ribosome.
Structure-related statistical singularities along protein sequences: a correlation study.

PubMed

Colafranceschi, Mauro; Colosimo, Alfredo; Zbilut, Joseph P; Uversky, Vladimir N; Giuliani, Alessandro

2005-01-01

A data set composed of 1141 proteins representative of all eukaryotic protein sequences in the Swiss-Prot Protein Knowledge base was coded by seven physicochemical properties of amino acid residues. The resulting numerical profiles were submitted to correlation analysis after the application of a linear (simple mean) and a nonlinear (Recurrence Quantification Analysis, RQA) filter. The main RQA variables, Recurrence and Determinism, were subsequently analyzed by Principal Component Analysis. The RQA descriptors showed that (i) within protein sequences is embedded specific information neither present in the codes nor in the amino acid composition and (ii) the most sensitive code for detecting ordered recurrent (deterministic) patterns of residues in protein sequences is the Miyazawa-Jernigan hydrophobicity scale. The most deterministic proteins in terms of autocorrelation properties of primary structures were found (i) to be involved in protein-protein and protein-DNA interactions and (ii) to display a significantly higher proportion of structural disorder with respect to the average data set. A study of the scaling behavior of the average determinism with the setting parameters of RQA (embedding dimension and radius) allows for the identification of patterns of minimal length (six residues) as possible markers of zones specifically prone to inter- and intramolecular interactions.
Striking similarities in amino acid sequence among nonstructural proteins encoded by RNA viruses that have dissimilar genomic organization.

PubMed Central

Haseloff, J; Goelet, P; Zimmern, D; Ahlquist, P; Dasgupta, R; Kaesberg, P

1984-01-01

The plant viruses alfalfa mosaic virus (AMV) and brome mosaic virus (BMV) each divide their genetic information among three RNAs while tobacco mosaic virus (TMV) contains a single genomic RNA. Amino acid sequence comparisons suggest that the single proteins encoded by AMV RNA 1 and BMV RNA 1 and by AMV RNA 2 and BMV RNA 2 are related to the NH2-terminal two-thirds and the COOH-terminal one-third, respectively, of the largest protein encoded by TMV. Separating these two domains in the TMV RNA sequence is an amber termination codon, whose partial suppression allows translation of the downstream domain. Many of the residues that the TMV read-through domain and the segmented plant viruses have in common are also conserved in a read-through domain found in the nonstructural polyprotein of the animal alphaviruses Sindbis and Middelburg. We suggest that, despite substantial differences in gene organization and expression, all of these viruses use related proteins for common functions in RNA replication. Reassortment of functional modules of coding and regulatory sequence from preexisting viral or cellular sources, perhaps via RNA recombination, may be an important mechanism in RNA virus evolution. PMID:6611550
Characterization of a dam Mutant of Serratia marcescens and Nucleotide Sequence of the dam Region

PubMed Central

Ostendorf, Tammo; Cherepanov, Peter; de Vries, Johann; Wackernagel, Wilfried

1999-01-01

The DNA of Serratia marcescens has N6-adenine methylation in GATC sequences. Among 2-aminopurine-sensitive mutants isolated from S. marcescens Sr41, one was identified which lacked GATC methylation. The mutant showed up to 30-fold increased spontaneous mutability and enhanced mutability after treatment with 2-aminopurine, ethyl methanesulfonate, or UV light. The gene (dam) coding for the adenine methyltransferase (Dam enzyme) of S. marcescens was identified on a gene bank plasmid which alleviated the 2-aminopurine sensitivity and the higher mutability of a dam-13::Tn9 mutant of Escherichia coli. Nucleotide sequencing revealed that the deduced amino acid sequence of Dam (270 amino acids; molecular mass, 31.3 kDa) has 72% identity to the Dam enzyme of E. coli. The dam gene is located between flanking genes which are similar to those found to the sides of the E. coli dam gene. The results of complementation studies indicated that like Dam of E. coli and unlike Dam of Vibrio cholerae, the Dam enzyme of S. marcescens plays an important role in mutation avoidance by allowing the mismatch repair enzymes to discriminate between the parental and newly synthesized strands during correction of replication errors. PMID:10383952
A complete Neandertal mitochondrial genome sequence determined by high-throughput sequencing

PubMed Central

Green, Richard E.; Malaspinas, Anna-Sapfo; Krause, Johannes; Briggs, Adrian W.; Johnson, Philip L. F.; Uhler, Caroline; Meyer, Matthias; Good, Jeffrey M.; Maricic, Tomislav; Stenzel, Udo; Prüfer, Kay; Siebauer, Michael; Burbano, Hernán A.; Ronan, Michael; Rothberg, Jonathan M.; Egholm, Michael; Rudan, Pavao; Brajković, Dejana; Kućan, Željko; Gušić, Ivan; Wikström, Mårten; Laakkonen, Liisa; Kelso, Janet; Slatkin, Montgomery; Pääbo, Svante

2008-01-01

A complete mitochondrial (mt) genome sequence was reconstructed from a 38,000-year-old Neandertal individual using 8,341 mtDNA sequences identified among 4.8 Gb of DNA generated from ~0.3 grams of bone. Analysis of the assembled sequence unequivocally establishes that the Neandertal mtDNA falls outside the variation of extant human mtDNAs and allows an estimate of the divergence date between the two mtDNA lineages of 660,000±140,000 years. Of the 13 proteins encoded in the mtDNA, subunit 2 of cytochrome c oxidase of the mitochondrial electron transport chain has experienced the largest number of amino acid substitutions in human ancestors since the separation from Neandertals. There is evidence that purifying selection in the Neandertal mtDNA was reduced compared to other primate lineages suggesting that the effective population size of Neandertals was small. PMID:18692465
The Bologna Annotation Resource (BAR 3.0): improving protein functional annotation.

PubMed

Profiti, Giuseppe; Martelli, Pier Luigi; Casadio, Rita

2017-07-03

BAR 3.0 updates our server BAR (Bologna Annotation Resource) for predicting protein structural and functional features from sequence. We increase data volume, query capabilities and information conveyed to the user. The core of BAR 3.0 is a graph-based clustering procedure of UniProtKB sequences, following strict pairwise similarity criteria (sequence identity ≥40% with alignment coverage ≥90%). Each cluster contains the available annotation downloaded from UniProtKB, GO, PFAM and PDB. After statistical validation, GO terms and PFAM domains are cluster-specific and annotate new sequences entering the cluster after satisfying similarity constraints. BAR 3.0 includes 28 869 663 sequences in 1 361 773 clusters, of which 22.2% (22 241 661 sequences) and 47.4% (24 555 055 sequences) have at least one validated GO term and one PFAM domain, respectively. 1.4% of the clusters (36% of all sequences) include PDB structures and the cluster is associated with a hidden Markov model that allows building template-target alignment suitable for structural modeling. A further 3 399 026 sequences are singletons. BAR 3.0 offers an improved search interface, allowing queries by UniProtKB-accession, Fasta sequence, GO-term, PFAM-domain, organism, PDB and ligand/s. When evaluated on the CAFA2 targets, BAR 3.0 largely outperforms our previous version and scores among state-of-the-art methods. BAR 3.0 is publicly available and accessible at http://bar.biocomp.unibo.it/bar3. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
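The pairwise criteria quoted above (identity ≥40%, coverage ≥90%) can be turned into clusters by linking every pair that passes both thresholds and taking the connected components of the resulting graph. The sketch below does this with a small union-find structure. Treating clusters as connected components is an assumption made for illustration; the abstract does not spell out the exact graph procedure, and the sequence identifiers and pairwise values are hypothetical.

```python
from collections import defaultdict


def cluster_sequences(ids, pairs, min_identity=40.0, min_coverage=90.0):
    """Group ids into connected components of the thresholded similarity graph.

    pairs: iterable of (id_a, id_b, percent_identity, percent_coverage).
    """
    parent = {seq_id: seq_id for seq_id in ids}

    def find(x):
        # Follow parent links to the root, compressing the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b, identity, coverage in pairs:
        if identity >= min_identity and coverage >= min_coverage:
            parent[find(a)] = find(b)  # merge the two components

    groups = defaultdict(list)
    for seq_id in ids:
        groups[find(seq_id)].append(seq_id)
    return sorted(sorted(members) for members in groups.values())


# Hypothetical pairwise comparisons: C-D fails the coverage threshold.
pairs = [("A", "B", 55.0, 95.0), ("B", "C", 42.0, 91.0), ("C", "D", 80.0, 50.0)]
print(cluster_sequences(["A", "B", "C", "D"], pairs))
```

In this toy input, D ends up as a singleton because its only link fails the coverage criterion, which mirrors how sequences with no qualifying neighbour remain unclustered.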
Multiplex, Rapid, and Sensitive Isothermal Detection of Nucleic-Acid Sequence by Endonuclease Restriction-Mediated Real-Time Multiple Cross Displacement Amplification.

PubMed

Wang, Yi; Wang, Yan; Zhang, Lu; Liu, Dongxin; Luo, Lijuan; Li, Hua; Cao, Xiaolong; Liu, Kai; Xu, Jianguo; Ye, Changyun

2016-01-01

We have devised a novel isothermal amplification technology, termed endonuclease restriction-mediated real-time multiple cross displacement amplification (ET-MCDA), which facilitated multiplex, rapid, specific and sensitive detection of nucleic-acid sequences at a constant temperature. The ET-MCDA integrated multiple cross displacement amplification strategy, restriction endonuclease cleavage and real-time fluorescence detection technique. In the ET-MCDA system, the functional cross primer E-CP1 or E-CP2 was constructed by adding a short sequence at the 5' end of CP1 or CP2, respectively, and the new E-CP1 or E-CP2 primer was labeled at the 5' end with a fluorophore and in the middle with a dark quencher. The restriction endonuclease Nb.BsrDI specifically recognized the short sequence and digested the newly synthesized double-stranded terminal sequences (5' end short sequences and their complementary sequences), which released the quenching, resulting in a gain of fluorescence signal. Thus, the ET-MCDA allowed real-time detection of single or multiple targets in only a single reaction, and the positive results were observed in as little as 12 min, detecting down to 3.125 fg of genomic DNA per tube. Moreover, the analytical specificity and the practical application of the ET-MCDA were also successfully evaluated in this study. Here, we provided the details on the novel ET-MCDA technique and expounded the basic ET-MCDA amplification mechanism.
Electron Transfer Dissociation with Supplemental Activation to Differentiate Aspartic and Isoaspartic Residues in Doubly Charged Peptide Cations

PubMed Central

Chan, Wai Yi Kelly; Chan, T. W. Dominic; O’Connor, Peter B.

2011-01-01

Electron-transfer dissociation (ETD) with supplemental activation of the doubly charged deamidated tryptic digested peptide ions allows differentiation of isoaspartic acid and aspartic acid residues using c + 57 or z• − 57 peaks. The diagnostic peak clearly localizes and characterizes the isoaspartic acid residue. Supplemental activation in ETD of the doubly charged peptide ions involves resonant excitation of the charge reduced precursor radical cations and leads to further dissociation, including extra backbone cleavages and secondary fragmentation. Supplemental activation is essential to obtain a high quality ETD spectrum (especially for doubly charged peptide ions) with sequence information. Unfortunately, the low resolution of the ion trap mass spectrometer makes detection of the diagnostic peak for the aspartic acid residue difficult due to interference with side-chain loss from arginine and glutamic acid residues. PMID:20304674

Solid phase sequencing of biopolymers

DOEpatents

Cantor, Charles; Koster, Hubert

2010-09-28

This invention relates to methods for detecting and sequencing target nucleic acid sequences, to mass modified nucleic acid probes and arrays of probes useful in these methods, and to kits and systems which contain these probes. Useful methods involve hybridizing the nucleic acids or nucleic acids which represent complementary or homologous sequences of the target to an array of nucleic acid probes. These probes comprise a single-stranded portion, an optional double-stranded portion and a variable sequence within the single-stranded portion. The molecular weights of the hybridized nucleic acids of the set can be determined by mass spectroscopy, and the sequence of the target determined from the molecular weights of the fragments. Nucleic acids whose sequences can be determined include DNA or RNA in biological samples such as patient biopsies and environmental samples. Probes may be fixed to a solid support such as a hybridization chip to facilitate automated molecular weight analysis and identification of the target sequence.
Import of honeybee prepromelittin into the endoplasmic reticulum: structural basis for independence of SRP and docking protein.

PubMed Central

Müller, G; Zimmermann, R

1987-01-01

Honeybee prepromelittin is correctly processed and imported by dog pancreas microsomes. Insertion of prepromelittin into microsomal membranes, as assayed by signal sequence removal, does not depend on signal recognition particle (SRP) and docking protein. We addressed the question as to how prepromelittin bypasses the SRP/docking protein system. Hybrid proteins between prepromelittin, or carboxy-terminally truncated derivatives, and the cytoplasmic protein dihydrofolate reductase from mouse were constructed. These hybrid proteins were analysed for membrane insertion and sequestration into microsomes. The results suggest the following: (i) The signal sequence of prepromelittin is capable of interacting with the SRP/docking protein system, but this interaction is not mandatory for membrane insertion; this is related to the small size of prepromelittin. (ii) In prepromelittin a cluster of negatively charged amino acids must be balanced by a cluster of positively charged amino acids in order to allow membrane insertion. (iii) In general, a signal sequence can be sufficient to mediate membrane insertion independently of SRP and docking protein in the case of short precursor proteins; however, the presence and distribution of charged amino acids within the mature part of these precursors can play distinct roles. PMID:2820722
Mapping the primary structure of copper/topaquinone-containing methylamine oxidase from Aspergillus niger.

PubMed

Lenobel, R; Sebela, M; Frébort, I

2005-01-01

The amino acid sequence of methylamine oxidase (MeAO) from the fungus Aspergillus niger was analyzed using mass spectrometry (MS). First, MeAO was characterized by an accurate molar mass of 72.4 kDa of the monomer measured using MALDI-TOF-MS and by a pI value of 5.8 determined by isoelectric focusing. MALDI-TOF-MS revealed a clear peptide mass fingerprint after tryptic digestion, which did not provide any relevant hit when searched against a nonredundant protein database and was different from that of A. niger amine oxidase AO-I. Tandem mass spectrometry with electrospray ionization coupled to liquid chromatography allowed unambiguous reading of six peptide sequences (11-19 amino acids) and seven sequence tags (4-15 amino acids), which were used for MS BLAST homology searching. MeAO was found to be largely homologous to a hypothetical protein AN7641.2 (EMBL/GenBank protein-accession code EAA61827) from Aspergillus nidulans FGSC A4 with a theoretical molar mass of 76.46 kDa and pI 6.14, which belongs to the superfamily of copper amine oxidases. The protein AN7641.2 shows only low homology to the amine oxidase AO-I (32% identity, 49% similarity).
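The identity figure quoted in the record above is the kind of number obtained from a pairwise alignment. The following is a minimal sketch, not the procedure used by the authors: it assumes two already-aligned strings with `-` as the gap character and counts identical residues over gap-free columns. The function name and the toy sequences are hypothetical, and other conventions for the denominator (for example, full alignment length) give different percentages.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of identical residues over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Keep only columns in which both sequences carry a residue.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != "-" and b != "-"]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


# Toy alignment (hypothetical sequences, not taken from the record).
print(percent_identity("ACDEFG", "ACDQF-"))
```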
Complete sequence of HLA-B27 cDNA identified through the characterization of structural markers unique to the HLA-A, -B, and -C allelic series

DOE Office of Scientific and Technical Information (OSTI.GOV)

Szoets, H.; Reithmueller, G.; Weiss, E.

1986-03-01

Antigen HLA-B27 is a high-risk genetic factor with respect to a group of rheumatoid disorders, especially ankylosing spondylitis. A cDNA library was constructed from an autozygous B-cell line expressing HLA-B27, HLA-Cw1, and the previously cloned HLA-A2 antigen. Clones detected with an HLA probe were isolated and sorted into homology groups by differential hybridization and restriction maps. Nucleotide sequencing allowed the unambiguous assignment of cDNAs to HLA-A, -B, and -C loci. The HLA-B27 mRNA has the structure features and the codon variability typical of an HLA class I transcript but it specifies two uncommon amino acid replacements: a cysteine in position 67 and a serine in position 131. The latter substitution may have functional consequences, because it occurs in a conserved region and at a position invariably occupied by a species-specific arginine in humans and lysine in mice. The availability of the complete sequence of HLA-B27 and of the partial sequence of HLA-Cw1 allows the recognition of locus-specific sequence markers, particularly, but not exclusively, in the transmembrane and cytoplasmic domains.
An improved divergent synthesis of comb-type branched oligodeoxyribonucleotides (bDNA) containing multiple secondary sequences.

PubMed

Horn, T; Chang, C A; Urdea, M S

1997-12-01

The divergent synthesis of branched DNA (bDNA) comb structures is described. This new type of bDNA contains one unique oligonucleotide, the primary sequence, covalently attached through a comb-like branch network to many identical copies of a different oligonucleotide, the secondary sequence. The bDNA comb structures were assembled on a solid support and several synthesis parameters were investigated and optimized. The bDNA comb molecules were characterized by polyacrylamide gel electrophoretic methods and by controlled cleavage at periodate-cleavable moieties incorporated during synthesis. The developed chemistry allows synthesis of bDNA comb molecules containing multiple secondary sequences. In the accompanying article we describe the synthesis and characterization of large bDNA combs containing all four deoxynucleotides for use as signal amplifiers in nucleic acid quantification assays.
An improved divergent synthesis of comb-type branched oligodeoxyribonucleotides (bDNA) containing multiple secondary sequences.

PubMed Central

Horn, T; Chang, C A; Urdea, M S

1997-01-01

The divergent synthesis of branched DNA (bDNA) comb structures is described. This new type of bDNA contains one unique oligonucleotide, the primary sequence, covalently attached through a comb-like branch network to many identical copies of a different oligonucleotide, the secondary sequence. The bDNA comb structures were assembled on a solid support and several synthesis parameters were investigated and optimized. The bDNA comb molecules were characterized by polyacrylamide gel electrophoretic methods and by controlled cleavage at periodate-cleavable moieties incorporated during synthesis. The developed chemistry allows synthesis of bDNA comb molecules containing multiple secondary sequences. In the accompanying article we describe the synthesis and characterization of large bDNA combs containing all four deoxynucleotides for use as signal amplifiers in nucleic acid quantification assays. PMID:9365265
Droplet Microfluidic Device Fabrication and Use for Isothermal Amplification and Detection of MicroRNA.

PubMed

Giuffrida, Maria Chiara; D'Agata, Roberta; Spoto, Giuseppe

2017-01-01

Droplet microfluidics combined with isothermal circular strand displacement polymerization (ICSDP) represents a powerful new technique to detect both single-stranded DNA and microRNA sequences. The method described here helps to overcome some drawbacks of the recently introduced droplet polymerase chain reaction (PCR) amplification when implemented in microfluidic devices. The method also allows detection in nanoliter droplets of nucleic acid sequence solutions, with particular attention to microRNA sequences, which are detected at the picomolar level. The integration of the ICSDP amplification protocol in droplet microfluidic devices reduces the time of analysis and the amount of sample required. In addition, parallel analyses can be designed for integration in portable devices.
Thiamine pyrophosphokinase deficiency causes a Leigh Disease like phenotype in a sibling pair: identification through whole exome sequencing and management strategies.

PubMed

Fraser, Jamie L; Vanderver, Adeline; Yang, Sandra; Chang, Taeun; Cramp, Laura; Vezina, Gilbert; Lichter-Konecki, Uta; Cusmano-Ozog, Kristina P; Smpokou, Patroula; Chapman, Kimberly A; Zand, Dina J

2014-01-01

We present a sibling pair with Leigh-like disease, progressive hypotonia, regression, and chronic encephalopathy. Whole exome sequencing in the younger sibling demonstrated a homozygous thiamine pyrophosphokinase (TPK) mutation. Initiation of high dose thiamine, niacin, biotin, α-lipoic acid and ketogenic diet in this child demonstrated improvement in neurologic function and re-attainment of previously lost milestones. The diagnosis of TPK deficiency was difficult due to inconsistent biochemical and diagnostic parameters, rapidity of clinical demise and would not have been made in a timely manner without the use of whole exome sequencing. Molecular diagnosis allowed for attempt at dietary modification with cofactor supplementation which resulted in an improved clinical course.
The primary structure of rat liver ribosomal protein L37. Homology with yeast and bacterial ribosomal proteins.

PubMed

Lin, A; McNally, J; Wool, I G

1983-09-10

The covalent structure of the rat liver 60 S ribosomal subunit protein L37 was determined. Twenty-four tryptic peptides were purified and the sequence of each was established; they accounted for all 111 residues of L37. The sequence of the first 30 residues of L37, obtained previously by automated Edman degradation of the intact protein, provided the alignment of the first 9 tryptic peptides. Three peptides (CN1, CN2, and CN3) were produced by cleavage of protein L37 with cyanogen bromide. The sequence of CN1 (65 residues) was established from the sequence of secondary peptides resulting from cleavage with trypsin and chymotrypsin. The sequence of CN1 in turn served to order tryptic peptides 1 through 14. The sequence of CN2 (15 residues) was determined entirely by a micromanual procedure and allowed the alignment of tryptic peptides 14 through 18. The sequence of the NH2-terminal 28 amino acids of CN3 (31 residues) was determined; in addition the complete sequences of the secondary tryptic and chymotryptic peptides were done. The sequence of CN3 provided the order of tryptic peptides 18 through 24. Thus the sequence of the three cyanogen bromide peptides also accounted for the 111 residues of protein L37. The carboxyl-terminal amino acids were identified after carboxypeptidase A treatment. There is a disulfide bridge between half-cystinyl residues at positions 40 and 69. Rat liver ribosomal protein L37 is homologous with yeast YP55 and with Escherichia coli L34. Moreover, there is a segment of 17 residues in rat L37 that occurs, albeit with modifications, in yeast YP55 and in E. coli S4, L20, and L34.
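The overlap strategy in the record above relies on two cleavage specificities: trypsin, which cuts after lysine or arginine, and cyanogen bromide, which cuts after methionine. A minimal in-silico sketch of both is given below. It assumes the common simplification that trypsin does not cut before proline; the helper name `cleave` and the toy sequence are hypothetical, and the sequence is not that of L37.

```python
def cleave(sequence: str, after: str, blocked_by_next: str = "") -> list[str]:
    """Split a one-letter protein sequence C-terminal to any residue in `after`,
    unless the following residue is listed in `blocked_by_next`."""
    peptides, start = [], 0
    # The last residue is skipped: cutting after it would only yield an empty peptide.
    for i, residue in enumerate(sequence[:-1]):
        if residue in after and sequence[i + 1] not in blocked_by_next:
            peptides.append(sequence[start:i + 1])
            start = i + 1
    peptides.append(sequence[start:])
    return peptides


toy = "MAKRPLMGKW"  # hypothetical sequence for illustration only

tryptic = cleave(toy, after="KR", blocked_by_next="P")  # simplified trypsin rule
cnbr = cleave(toy, after="M")                           # cyanogen bromide rule
print(tryptic)
print(cnbr)
```

Because the two peptide sets are cut at different positions, each cyanogen bromide peptide spans several tryptic peptides, which is what lets the larger fragments fix the order of the smaller ones.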
A new high molecular weight immunoglobulin class from the carcharhine shark: implications for the properties of the primordial immunoglobulin.

PubMed

Berstein, R M; Schluter, S F; Shen, S; Marchalonis, J J

1996-04-16

All immunoglobulins and T-cell receptors throughout phylogeny share regions of highly conserved amino acid sequence. To identify possible primitive immunoglobulins and immunoglobulin-like molecules, we utilized 3' RACE (rapid amplification of cDNA ends) and a highly conserved constant region consensus amino acid sequence to isolate a new immunoglobulin class from the sandbar shark Carcharhinus plumbeus. The immunoglobulin, termed IgW, in its secreted form consists of 782 amino acids and is expressed in both the thymus and the spleen. The molecule overall most closely resembles mu chains of the skate and human and a new putative antigen binding molecule isolated from the nurse shark (NAR). The full-length IgW chain has a variable region resembling human and shark heavy-chain (VH) sequences and a novel joining segment containing the WGXGT motif characteristic of H chains. However, unlike any other H-chain-type molecule, it contains six constant (C) domains. The first C domain contains the cysteine residue characteristic of C mu1 that would allow dimerization with a light (L) chain. The fourth and sixth domains also contain comparable cysteines that would enable dimerization with other H chains or homodimerization. Comparison of the sequences of IgW V and C domains shows homology greater than that found in comparisons among VH and C mu or VL or CL, thereby suggesting that IgW may retain features of the primordial immunoglobulin in evolution.
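The WGXGT motif mentioned in the record above (X being any residue) can be located in a one-letter protein sequence with a regular expression. This is a minimal sketch; the function name is hypothetical and the example string is an illustrative joining-segment-like fragment, not the IgW sequence.

```python
import re

# W, G, any single residue, G, T. The lookahead lets overlapping hits be reported too.
WGXGT = re.compile(r"(?=WG[A-Z]GT)")


def find_wgxgt(sequence: str) -> list[int]:
    """Return the 0-based start positions of every WGXGT occurrence."""
    return [match.start() for match in WGXGT.finditer(sequence.upper())]


example = "AAYFDYWGQGTLVTVSS"  # illustrative fragment, assumed for this sketch
print(find_wgxgt(example))
```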
Continuously tunable nucleic acid hybridization probes.

PubMed

Wu, Lucia R; Wang, Juexiao Sherry; Fang, John Z; Evans, Emily R; Pinto, Alessandro; Pekker, Irena; Boykin, Richard; Ngouenet, Celine; Webster, Philippa J; Beechem, Joseph; Zhang, David Yu

2015-12-01

In silico-designed nucleic acid probes and primers often do not achieve favorable specificity and sensitivity tradeoffs on the first try, and iterative empirical sequence-based optimization is needed, particularly in multiplexed assays. We present a novel, on-the-fly method of tuning probe affinity and selectivity by adjusting the stoichiometry of auxiliary species, which allows for independent and decoupled adjustment of the hybridization yield for different probes in multiplexed assays. Using this method, we achieved near-continuous tuning of probe effective free energy. To demonstrate our approach, we enforced uniform capture efficiency of 31 DNA molecules (GC content, 0-100%), maximized the signal difference for 11 pairs of single-nucleotide variants and performed tunable hybrid capture of mRNA from total RNA. Using the Nanostring nCounter platform, we applied stoichiometric tuning to simultaneously adjust yields for a 24-plex assay, and we show multiplexed quantitation of RNA sequences and variants from formalin-fixed, paraffin-embedded samples.
Integrating DNA strand displacement circuitry to the nonlinear hybridization chain reaction.

PubMed

Zhang, Zhuo; Fan, Tsz Wing; Hsing, I-Ming

2017-02-23

Programmable and modular attributes of DNA molecules allow one to develop versatile sensing platforms that can be operated isothermally and enzyme-free. In this work, we present an approach to integrate upstream DNA strand displacement circuits that can be turned on by a sequence-specific microRNA analyte with a downstream nonlinear hybridization chain reaction for a cascading hyperbranched nucleic acid assembly. This system provides a two-step amplification strategy for highly sensitive detection of the miRNA analyte, conducive for multiplexed detection. Multiple miRNA analytes were tested with our integrated circuitry using the same downstream signal amplification setting, showing the decoupling of nonlinear self-assembly with the analyte sequence. Compared with the reported methods, our signal amplification approach provides an additional control module for higher-order DNA self-assembly and could be developed into a promising platform for the detection of critical nucleic-acid based biomarkers.
Molecular cloning of a cDNA encoding the precursor of adenoregulin from frog skin. Relationships with the vertebrate defensive peptides, dermaseptins.

PubMed

Amiche, M; Ducancel, F; Lajeunesse, E; Boulain, J C; Ménez, A; Nicolas, P

1993-03-31

Adenoregulin has recently been isolated from Phyllomedusa skin as a 33-amino-acid peptide which enhanced binding of agonists to the A1 adenosine receptor. In order to study the structure of the precursor of adenoregulin we constructed a cDNA library from mRNAs extracted from the skin of Phyllomedusa bicolor. We detected the complete nucleotide sequence of a cDNA encoding the adenoregulin biosynthetic precursor. The deduced sequence of the precursor is 81 amino acids long, exhibits a putative signal sequence at the NH2 terminus and contains a single copy of the biologically active peptide at the COOH terminus. Structural and conformational homologies that are observed between adenoregulin and the dermaseptins, antimicrobial peptides exhibiting strong membranolytic activities against various pathogenic agents, suggest that adenoregulin is an additional member of the growing family of cytotropic antimicrobial peptides that allow vertebrate animals to defend themselves against microorganisms. As such, the adenosine receptor regulating activity of adenoregulin could be due to its ability to interact with and disrupt membrane lipid bilayers.
The Thiamine-Pyrophosphate-Motif

NASA Technical Reports Server (NTRS)

Ciszak, Ewa; Dominiak, Paulina

2004-01-01

Thiamin pyrophosphate (TPP), a derivative of vitamin B1, is a cofactor for enzymes performing catalysis in pathways of energy production including the well known decarboxylation of α-keto acid dehydrogenases followed by transketolation. TPP-dependent enzymes constitute a structurally and functionally diverse group exhibiting multimeric subunit organization, multiple domains and two chemically equivalent catalytic centers. Annotation of functional TPP-dependent enzymes, therefore, has not been trivial due to low sequence similarity related to this complex organization. Our approach to analysis of structures of known TPP-dependent enzymes reveals for the first time features common to this group, which we have termed the TPP-motif. The TPP-motif consists of specific spatial arrangements of structural elements and their specific contacts to provide for a flip-flop, or alternate site, enzymatic mechanism of action. Analysis of structural elements entrained in the flip-flop action displayed by TPP-dependent enzymes reveals a novel definition of the common amino acid sequences. These sequences allow for annotation of TPP-dependent enzymes, thus advancing functional proteomics. Further details of three-dimensional structures of TPP-dependent enzymes will be discussed.
Controlling the Surface Chemistry of Graphite by Engineered Self-Assembled Peptides

PubMed Central

Khatayevich, Dmitriy; So, Christopher R.; Hayamizu, Yuhei; Gresswell, Carolyn; Sarikaya, Mehmet

2012-01-01

The systematic control over surface chemistry is a long-standing challenge in biomedical and nanotechnological applications for graphitic materials. As a novel approach, we utilize graphite-binding dodecapeptides that self-assemble into dense domains to form monolayer thick long-range ordered films on graphite. Specifically, the peptides are rationally designed through their amino acid sequences to predictably display hydrophilic and hydrophobic characteristics while maintaining their self-assembly capabilities on the solid substrate. The peptides are observed to maintain a high tolerance for sequence modification, allowing the control over surface chemistry via their amino acid sequence. Furthermore, through a single step co-assembly of two different designed peptides, we predictably and precisely tune the wettability of the resulting functionalized graphite surfaces from 44 to 83 degrees. The modular molecular structures and predictable behavior of short peptides demonstrated here give rise to a novel platform for functionalizing graphitic materials that offers numerous advantages, including non-invasive modification of the substrate, bio-compatible processing in an aqueous environment, and simple fusion with other functional biological molecules. PMID:22428620
Detection of nucleic acid sequences by invader-directed cleavage

DOEpatents

Brow, Mary Ann D.; Hall, Jeff Steven Grotelueschen; Lyamichev, Victor; Olive, David Michael; Prudent, James Robert

1999-01-01

The present invention relates to means for the detection and characterization of nucleic acid sequences, as well as variations in nucleic acid sequences. The present invention also relates to methods for forming a nucleic acid cleavage structure on a target sequence and cleaving the nucleic acid cleavage structure in a site-specific manner. The 5' nuclease activity of a variety of enzymes is used to cleave the target-dependent cleavage structure, thereby indicating the presence of specific nucleic acid sequences or specific variations thereof. The present invention further relates to methods and devices for the separation of nucleic acid molecules based on charge.
37 CFR 1.823 - Requirements for nucleotide and/or amino acid sequences as part of the application.

Code of Federal Regulations, 2011 CFR

2011-07-01

... and/or amino acid sequences as part of the application. 1.823 Section 1.823 Patents, Trademarks, and... Amino Acid Sequences § 1.823 Requirements for nucleotide and/or amino acid sequences as part of the... incorporation-by-reference of the Sequence Listing as required by § 1.52(e)(5). The presentation of the...
37 CFR 1.823 - Requirements for nucleotide and/or amino acid sequences as part of the application.

Code of Federal Regulations, 2013 CFR

2013-07-01

... and/or amino acid sequences as part of the application. 1.823 Section 1.823 Patents, Trademarks, and... Amino Acid Sequences § 1.823 Requirements for nucleotide and/or amino acid sequences as part of the... incorporation-by-reference of the Sequence Listing as required by § 1.52(e)(5). The presentation of the...
37 CFR 1.823 - Requirements for nucleotide and/or amino acid sequences as part of the application.

Code of Federal Regulations, 2012 CFR

2012-07-01

... and/or amino acid sequences as part of the application. 1.823 Section 1.823 Patents, Trademarks, and... Amino Acid Sequences § 1.823 Requirements for nucleotide and/or amino acid sequences as part of the... incorporation-by-reference of the Sequence Listing as required by § 1.52(e)(5). The presentation of the...
37 CFR 1.823 - Requirements for nucleotide and/or amino acid sequences as part of the application.

Code of Federal Regulations, 2010 CFR

2010-07-01

... and/or amino acid sequences as part of the application. 1.823 Section 1.823 Patents, Trademarks, and... Amino Acid Sequences § 1.823 Requirements for nucleotide and/or amino acid sequences as part of the... incorporation-by-reference of the Sequence Listing as required by § 1.52(e)(5). The presentation of the...

37 CFR 1.823 - Requirements for nucleotide and/or amino acid sequences as part of the application.

Code of Federal Regulations, 2014 CFR

2014-07-01

... and/or amino acid sequences as part of the application. 1.823 Section 1.823 Patents, Trademarks, and... Amino Acid Sequences § 1.823 Requirements for nucleotide and/or amino acid sequences as part of the... incorporation-by-reference of the Sequence Listing as required by § 1.52(e)(5). The presentation of the...
A recombinant isoform of the Ole e 7 olive pollen allergen assembled by de novo mass spectrometry retains the allergenic ability of the natural allergen.

PubMed

Oeo-Santos, Carmen; Mas, Salvador; Benedé, Sara; López-Lucendo, María; Quiralte, Joaquín; Blanca, Miguel; Mayorga, Cristobalina; Villalba, Mayte; Barderas, Rodrigo

2018-06-05

The allergenic non-specific lipid transfer protein Ole e 7 from olive pollen is a major allergen associated with severe symptoms in areas with high olive pollen levels. Despite its clinical importance, its cloning and recombinant production have not been possible by classical approaches. This study aimed at determining its complete amino acid sequence by mass spectrometry-based proteomics for its subsequent expression and characterization. To this end, the natural protein was separated on 2D gels and digested in-gel with trypsin, and CID and HCD fragmentation spectra obtained by nLC-MS/MS were analyzed using PEAKS software. Thirteen of the 457 de novo sequenced peptides obtained allowed assembly of its full-length amino acid sequence. Then, Ole e 7-encoding cDNA was synthesized and cloned in the pPICZαA vector for its expression in Pichia pastoris yeast. The analyses by circular dichroism, and by WB, ELISA and cell-based tests using sera and blood from olive pollen-sensitized patients, showed that rOle e 7 mostly retained the structural, allergenic and antigenic properties of the natural allergen. In summary, the rOle e 7 allergen assembled by de novo peptide sequencing by MS behaved immunologically similarly to the natural allergen, which is isolated from pollen only in scarce amounts. Olive pollen is an important cause of allergy. The non-specific lipid binding protein Ole e 7 is a major allergen with a high incidence and a phenotype associated with severe clinical symptoms. Despite its relevance, its cloning and recombinant expression have not been possible by classical techniques. Here, we have inferred the primary amino acid sequence of Ole e 7 by mass spectrometry. We separated Ole e 7 isolated from pollen by 2DE. After in-gel digestion with trypsin and direct analysis by nLC-MS/MS in an LTQ-Orbitrap Velos, we obtained the complete repertoire of de novo sequenced peptides that allowed the assembly of the primary sequence of Ole e 7. After its protein expression, purification to homogeneity, and structural and immunological characterization using sera from olive pollen allergic patients and cell-based assays, we observed that the recombinant allergen retained the antigenic and allergenic properties of the natural allergen. Collectively, we show that the recombinant protein assembled by proteomics would be suitable for a better in vitro diagnosis of olive pollen allergic patients. Copyright © 2018. Published by Elsevier B.V.
Blends of cysteine-containing proteins

NASA Astrophysics Data System (ADS)

Barone, Justin

2005-03-01

Many agricultural wastes are made of proteins such as keratin, lactalbumin, gluten, and albumin. These proteins contain the amino acid cysteine. Cysteine allows for the formation of inter- and intramolecular sulfur-sulfur bonds. Correlations are made between the properties of films made from the proteins and the amino acid sequence. Blends of cysteine-containing proteins show possible synergies in physical properties at intermediate concentrations. FT-IR spectroscopy shows increased hydrogen bonding at intermediate concentrations, suggesting that this contributes to the improved physical properties. DSC shows limited miscibility and the formation of new crystalline phases in the blends, suggesting that this too contributes.
Pyrin gene and mutants thereof, which cause familial Mediterranean fever

DOEpatents

Kastner, Daniel L [Bethesda, MD; Aksentijevichh, Ivona [Bethesda, MD; Centola, Michael [Tacoma Park, MD; Deng, Zuoming [Gaithersburg, MD; Sood, Ramen [Rockville, MD; Collins, Francis S [Rockville, MD; Blake, Trevor [Laytonsville, MD; Liu, P Paul [Ellicott City, MD; Fischel-Ghodsian, Nathan [Los Angeles, CA; Gumucio, Deborah L [Ann Arbor, MI; Richards, Robert I [North Adelaide, AU; Ricke, Darrell O [San Diego, CA; Doggett, Norman A [Santa Cruz, NM; Pras, Mordechai [Tel-Hashomer, IL

2003-09-30

The invention provides the nucleic acid sequence encoding the protein associated with familial Mediterranean fever (FMF). The cDNA sequence is designated as MEFV. The invention is also directed towards fragments of the DNA sequence, as well as the corresponding sequence for the RNA transcript and fragments thereof. Another aspect of the invention provides the amino acid sequence for a protein (pyrin) associated with FMF. The invention is directed towards both the full length amino acid sequence, fusion proteins containing the amino acid sequence and fragments thereof. The invention is also directed towards mutants of the nucleic acid and amino acid sequences associated with FMF. In particular, the invention discloses three missense mutations, clustered within about 40 to 50 amino acids, in the highly conserved rfp (B30.2) domain at the C-terminal of the protein. These mutants include M680I, M694V, K695R, and V726A. Additionally, the invention includes methods for diagnosing a patient at risk for having FMF and kits therefor.
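The mutant labels in the record above follow the usual one-letter missense notation: original residue, position, replacement residue. The sketch below parses such labels and checks how many residues the listed positions span. It reads the first label as M680I (a label ending in a digit does not fit the notation), and the function name is hypothetical.

```python
import re

MISSENSE = re.compile(r"([A-Z])(\d+)([A-Z])")


def parse_missense(label: str) -> tuple[str, int, str]:
    """Split a label such as 'M694V' into (original residue, position, new residue)."""
    match = MISSENSE.fullmatch(label)
    if match is None:
        raise ValueError(f"not a one-letter missense label: {label!r}")
    return match.group(1), int(match.group(2)), match.group(3)


labels = ["M680I", "M694V", "K695R", "V726A"]
positions = [parse_missense(label)[1] for label in labels]

# Number of residues from the first to the last mutated position, inclusive.
span = max(positions) - min(positions) + 1
print(positions, span)
```

Under this reading the positions run from 680 to 726, which is consistent with the stated clustering within about 40 to 50 amino acids.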
Functional Screening of Metagenome and Genome Libraries for Detection of Novel Flavonoid-Modifying Enzymes

PubMed Central

Rabausch, U.; Juergensen, J.; Ilmberger, N.; Böhnke, S.; Fischer, S.; Schubach, B.; Schulte, M.

2013-01-01

The functional detection of novel enzymes other than hydrolases from metagenomes is limited since only a very few reliable screening procedures are available that allow the rapid screening of large clone libraries. For the discovery of flavonoid-modifying enzymes in genome and metagenome clone libraries, we have developed a new screening system based on high-performance thin-layer chromatography (HPTLC). This metagenome extract thin-layer chromatography analysis (META) allows the rapid detection of glycosyltransferase (GT) and also other flavonoid-modifying activities. The developed screening method is highly sensitive, and an amount of 4 ng of modified flavonoid molecules can be detected. This novel technology was validated against a control library of 1,920 fosmid clones generated from a single Bacillus cereus isolate and then used to analyze more than 38,000 clones derived from two different metagenomic preparations. Thereby we identified two novel UDP glycosyltransferase (UGT) genes. The metagenome-derived gtfC gene encoded a 52-kDa protein, and the deduced amino acid sequence was weakly similar to sequences of putative UGTs from Fibrisoma and Dyadobacter. GtfC mediated the transfer of different hexose moieties and exhibited high activities on flavones, flavonols, flavanones, and stilbenes and also accepted isoflavones and chalcones. From the control library we identified a novel macroside glycosyltransferase (MGT) with a calculated molecular mass of 46 kDa. The deduced amino acid sequence was highly similar to sequences of MGTs from Bacillus thuringiensis. Recombinant MgtB transferred the sugar residue from UDP-glucose effectively to flavones, flavonols, isoflavones, and flavanones. Moreover, MgtB exhibited high activity on larger flavonoid molecules such as tiliroside. PMID:23686272
Mapping a nucleolar targeting sequence of an RNA binding nucleolar protein, Nop25

DOE Office of Scientific and Technical Information (OSTI.GOV)

Fujiwara, Takashi; Suzuki, Shunji; Kanno, Motoko

2006-06-10

Nop25 is a putative RNA binding nucleolar protein associated with rRNA transcription. The present study was undertaken to determine the mechanism of Nop25 localization in the nucleolus. Deletion experiments of Nop25 amino acid sequence showed Nop25 to contain a nuclear targeting sequence in the N-terminal and a nucleolar targeting sequence in the C-terminal. By expressing derivative peptides from the C-terminal as GFP-fusion proteins in the cells, a lysine and arginine residue-enriched peptide (KRKHPRRAQDSTKKPPSATRTSKTQRRRR) allowed a GFP-fusion protein to be transported and fully retained in the nucleolus. When the peptide was fused with cMyc epitope and expressed in the cells, a cMyc epitope was then detected in the nucleolus. Nop25 did not localize in the nucleolus by deletion of the peptide from Nop25. Furthermore, deletion of a subdomain (KRKHPRRAQ) in the peptide or amino acid substitution of lysine and arginine residues in the subdomain resulted in the loss of Nop25 nucleolar localization. These results suggest that the lysine and arginine residue-enriched peptide is the most prominent nucleolar targeting sequence of Nop25 and that the long stretch of basic residues might play an important role in the nucleolar localization of Nop25. Although Nop25 contained putative SUMOylation, phosphorylation and glycosylation sites, the amino acid substitution in these sites had no effect on the nucleolar localization, thus suggesting that these post-translational modifications did not contribute to the localization of Nop25 in the nucleolus. The treatment of the cells, which expressed a GFP-fusion protein with a nucleolar targeting sequence of Nop25, with RNase A resulted in a complete dislocation of the protein from the nucleolus. These data suggested that the nucleolar targeting sequence might therefore play an important role in the binding of Nop25 to RNA molecules and that the RNA binding of Nop25 might be essential for the nucleolar localization of Nop25.
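The description of the targeting peptide as lysine- and arginine-enriched can be checked directly from the two sequences quoted in the record above. The sketch below is only a composition count, not an analysis from the study; the function name is hypothetical.

```python
def basic_residue_stats(peptide: str) -> tuple[int, int, float]:
    """Return (number of K/R residues, peptide length, K/R fraction)."""
    basic = sum(1 for residue in peptide if residue in "KR")
    return basic, len(peptide), basic / len(peptide)


# Sequences as quoted in the abstract.
targeting_peptide = "KRKHPRRAQDSTKKPPSATRTSKTQRRRR"
subdomain = "KRKHPRRAQ"

for name, peptide in [("targeting peptide", targeting_peptide), ("subdomain", subdomain)]:
    basic, length, fraction = basic_residue_stats(peptide)
    print(f"{name}: {basic} of {length} residues are K or R ({fraction:.0%})")
```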
ORENZA: a web resource for studying ORphan ENZyme activities

PubMed Central

Lespinet, Olivier; Labedan, Bernard

2006-01-01

Background: Despite the current availability of several hundreds of thousands of amino acid sequences, more than 36% of the enzyme activities (EC numbers) defined by the Nomenclature Committee of the International Union of Biochemistry and Molecular Biology (NC-IUBMB) are not associated with any amino acid sequence in major public databases. This wide gap separating knowledge of biochemical function and sequence information is found for nearly all classes of enzymes. Thus, there is an urgent need to explore these sequence-less EC numbers, in order to progressively close this gap. Description: We designed ORENZA, a PostgreSQL database of ORphan ENZyme Activities, to collate information about the EC numbers defined by the NC-IUBMB with specific emphasis on orphan enzyme activities. Complete lists of all EC numbers and of orphan EC numbers are available and will be periodically updated. ORENZA allows one to browse the complete list of EC numbers or the subset associated with orphan enzymes or to query a specific EC number, an enzyme name or a species name for those interested in particular organisms. It is possible to search ORENZA for the different biochemical properties of the defined enzymes, the metabolic pathways in which they participate, the taxonomic data of the organisms whose genomes encode them, and many other features. The association of an enzyme activity with an amino acid sequence is clearly underlined, making it easy to identify at once the orphan enzyme activities. Interactive publishing of suggestions by the community would provide expert evidence for re-annotation of orphan EC numbers in public databases. Conclusion: ORENZA is a Web resource designed to progressively bridge the unwanted gap between function (enzyme activities) and sequence (dataset present in public databases). ORENZA should increase interactions between communities of biochemists and of genomicists. This is expected to reduce the number of orphan enzyme activities by allocating gene sequences to the relevant enzymes. PMID:17026747
Synthetic biology for the directed evolution of protein biocatalysts: navigating sequence space intelligently

PubMed Central

Currin, Andrew; Swainston, Neil; Day, Philip J.

2015-01-01

The amino acid sequence of a protein affects both its structure and its function. Thus, the ability to modify the sequence, and hence the structure and activity, of individual proteins in a systematic way, opens up many opportunities, both scientifically and (as we focus on here) for exploitation in biocatalysis. Modern methods of synthetic biology, whereby increasingly large sequences of DNA can be synthesised de novo, allow an unprecedented ability to engineer proteins with novel functions. However, the number of possible proteins is far too large to test individually, so we need means for navigating the ‘search space’ of possible protein sequences efficiently and reliably in order to find desirable activities and other properties. Enzymologists distinguish binding (Kd) and catalytic (kcat) steps. In a similar way, judicious strategies have blended design (for binding, specificity and active site modelling) with the more empirical methods of classical directed evolution (DE) for improving kcat (where natural evolution rarely seeks the highest values), especially with regard to residues distant from the active site and where the functional linkages underpinning enzyme dynamics are both unknown and hard to predict. Epistasis (where the ‘best’ amino acid at one site depends on that or those at others) is a notable feature of directed evolution. The aim of this review is to highlight some of the approaches that are being developed to allow us to use directed evolution to improve enzyme properties, often dramatically. We note that directed evolution differs in a number of ways from natural evolution, including in particular the available mechanisms and the likely selection pressures. Thus, we stress the opportunities afforded by techniques that enable one to map sequence to (structure and) activity in silico, as an effective means of modelling and exploring protein landscapes. 
Because known landscapes may be assessed and reasoned about as a whole, simultaneously, this offers opportunities for protein improvement not readily available to natural evolution on rapid timescales. Intelligent landscape navigation, informed by sequence-activity relationships and coupled to the emerging methods of synthetic biology, offers scope for the development of novel biocatalysts that are both highly active and robust. PMID:25503938
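The claim that the number of possible proteins is far too large to test individually can be made concrete with a little arithmetic. The sketch below assumes the 20 canonical amino acids; the function names and the example length of 300 residues are illustrative and do not come from the review.

```python
from math import comb

AMINO_ACIDS = 20  # canonical amino acids (assumption for this sketch)


def sequence_space_size(length: int) -> int:
    """Number of distinct protein sequences of the given length."""
    return AMINO_ACIDS ** length


def variants_with_k_substitutions(length: int, k: int) -> int:
    """Number of variants differing from a parent at exactly k positions.

    Choose k of the positions, then replace each with one of the
    19 residues that differ from the parent residue.
    """
    return comb(length, k) * (AMINO_ACIDS - 1) ** k


if __name__ == "__main__":
    n = 300  # hypothetical protein length
    print(len(str(sequence_space_size(n))), "decimal digits in 20**300")
    for k in (1, 2, 3):
        print(k, "substitutions:", variants_with_k_substitutions(n, k))
```

Even the single and double mutants of one modest protein already number in the millions, which is one way to see why the review argues for guided rather than exhaustive exploration.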
Palladium-catalyzed domino C,N-coupling/carbonylation/Suzuki coupling reaction: an efficient synthesis of 2-aroyl-/heteroaroylindoles.

PubMed

Arthuis, Martin; Pontikis, Renée; Florent, Jean-Claude

2009-10-15

A convenient one-pot synthesis of 2-aroylindoles using a domino palladium-catalyzed C,N-coupling/carbonylation/C,C-coupling sequence is described. The reaction involved easily prepared 2-gem-dibromovinylanilines and boronic acids under carbon monoxide. Optimized reaction conditions allowed the construction of a wide variety of highly functionalized 2-aroyl-/heteroaroylindoles in satisfactory yields.
77 FR 65537 - Requirements for Patent Applications Containing Nucleotide Sequence and/or Amino Acid Sequence...

Federal Register 2010, 2011, 2012, 2013, 2014

2012-10-29

... DEPARTMENT OF COMMERCE Patent and Trademark Office Requirements for Patent Applications Containing Nucleotide Sequence and/or Amino Acid Sequence Disclosures ACTION: Proposed collection; comment request... Patent applications that contain nucleotide and/or amino acid sequence disclosures must include a copy of...
Cleavage of nucleic acids

DOEpatents

Prudent, James R.; Hall, Jeff G.; Lyamichev, Victor I.; Brow, Mary Ann D.; Dahlberg, James E.

2007-12-11

The present invention relates to means for the detection and characterization of nucleic acid sequences, as well as variations in nucleic acid sequences. The present invention also relates to methods for forming a nucleic acid cleavage structure on a target sequence and cleaving the nucleic acid cleavage structure in a site-specific manner. The structure-specific nuclease activity of a variety of enzymes is used to cleave the target-dependent cleavage structure, thereby indicating the presence of specific nucleic acid sequences or specific variations thereof.
Invasive cleavage of nucleic acids

DOEpatents

Prudent, James R.; Hall, Jeff G.; Lyamichev, Victor I.; Brow, Mary Ann D.; Dahlberg, James E.

1999-01-01

The present invention relates to means for the detection and characterization of nucleic acid sequences, as well as variations in nucleic acid sequences. The present invention also relates to methods for forming a nucleic acid cleavage structure on a target sequence and cleaving the nucleic acid cleavage structure in a site-specific manner. The structure-specific nuclease activity of a variety of enzymes is used to cleave the target-dependent cleavage structure, thereby indicating the presence of specific nucleic acid sequences or specific variations thereof.
Invasive cleavage of nucleic acids

DOEpatents

Prudent, James R.; Hall, Jeff G.; Lyamichev, Victor I.; Brow, Mary Ann D.; Dahlberg, James E.

2002-01-01

The present invention relates to means for the detection and characterization of nucleic acid sequences, as well as variations in nucleic acid sequences. The present invention also relates to methods for forming a nucleic acid cleavage structure on a target sequence and cleaving the nucleic acid cleavage structure in a site-specific manner. The structure-specific nuclease activity of a variety of enzymes is used to cleave the target-dependent cleavage structure, thereby indicating the presence of specific nucleic acid sequences or specific variations thereof.
Cleavage of nucleic acids

DOEpatents

Prudent, James R.; Hall, Jeff G.; Lyamichev, Victor I.; Brow, Mary Ann D.; Dahlberg, James E.

2010-11-09

The present invention relates to means for the detection and characterization of nucleic acid sequences, as well as variations in nucleic acid sequences. The present invention also relates to methods for forming a nucleic acid cleavage structure on a target sequence and cleaving the nucleic acid cleavage structure in a site-specific manner. The structure-specific nuclease activity of a variety of enzymes is used to cleave the target-dependent cleavage structure, thereby indicating the presence of specific nucleic acid sequences or specific variations thereof.
Cleavage of nucleic acids

DOEpatents

Prudent, James R.; Hall, Jeff G.; Lyamichev, Victor I.; Brow, Mary Ann D.; Dahlberg, James E.

2000-01-01

The present invention relates to means for the detection and characterization of nucleic acid sequences, as well as variations in nucleic acid sequences. The present invention also relates to methods for forming a nucleic acid cleavage structure on a target sequence and cleaving the nucleic acid cleavage structure in a site-specific manner. The structure-specific nuclease activity of a variety of enzymes is used to cleave the target-dependent cleavage structure, thereby indicating the presence of specific nucleic acid sequences or specific variations thereof.
Nucleic acid detection assays

DOEpatents

Prudent, James R.; Hall, Jeff G.; Lyamichev, Victor I.; Brow, Mary Ann; Dahlberg, James E.

2005-04-05

The present invention relates to means for the detection and characterization of nucleic acid sequences, as well as variations in nucleic acid sequences. The present invention also relates to methods for forming a nucleic acid cleavage structure on a target sequence and cleaving the nucleic acid cleavage structure in a site-specific manner. The structure-specific nuclease activity of a variety of enzymes is used to cleave the target-dependent cleavage structure, thereby indicating the presence of specific nucleic acid sequences or specific variations thereof.
Contribution to the Prediction of the Fold Code: Application to Immunoglobulin and Flavodoxin Cases

PubMed Central

Banach, Mateusz; Prudhomme, Nicolas; Carpentier, Mathilde; Duprat, Elodie; Papandreou, Nikolaos; Kalinowska, Barbara; Chomilier, Jacques; Roterman, Irena

2015-01-01

Background Formation of the folding nucleus of globular proteins starts with the mutual interaction of a group of hydrophobic amino acids whose close contacts allow subsequent formation and stability of the 3D structure. These early steps can be predicted by simulation of the folding process through a Monte Carlo (MC) coarse grain model in a discrete space. We previously defined MIRs (Most Interacting Residues) as the set of residues presenting a large number of non-covalent neighbour interactions during such simulation. MIRs are good candidates to define the minimal number of residues giving rise to a given fold instead of another one, although their proportion is rather high, typically 15-20% of the sequence. Having in mind experiments with two sequences of very high levels of sequence identity (up to 90%) but different folds, we combined the MIR method, which takes sequence as single input, with the “fuzzy oil drop” (FOD) model that requires a 3D structure, in order to estimate the residues coding for the fold. FOD assumes that a globular protein follows an idealised 3D Gaussian distribution of hydrophobicity density, with the maximum in the centre and minima at the surface of the “drop”. If the actual local density of hydrophobicity around a given amino acid is as high as the ideal one, then this amino acid is assigned to the core of the globular protein, and it is assumed to follow the FOD model. Therefore one obtains a distribution of the amino acids of a protein according to their agreement or disagreement with the FOD model. Results We compared and combined MIR and FOD methods to define the minimal nucleus, or keystone, of two populated folds: immunoglobulin-like (Ig) and flavodoxins (Flav). The combination of these two approaches defines some positions both predicted as a MIR and assigned as accordant with the FOD model. It is shown here that for these two folds, the intersection of the predicted sets of residues significantly differs from random selection. 
It reduces the number of selected residues by each individual method and allows a reasonable agreement with experimentally determined key residues coding for the particular fold. In addition, the intersection of the two methods significantly increases the specificity of the prediction, providing a robust set of residues that constitute the folding nucleus. PMID:25915049
Multiple-strand displacement and identification of single nucleotide polymorphisms as markers of genotypic variation of Pasteuria penetrans biotypes infecting root-knot nematodes.

PubMed

Nong, Guang; Chow, Virginia; Schmidt, Liesbeth M; Dickson, Don W; Preston, James F

2007-08-01

Pasteuria species are endospore-forming obligate bacterial parasites of soil-inhabiting nematodes and water-inhabiting cladocerans, e.g. water fleas, and are closely related to Bacillus spp. by 16S rRNA gene sequence. As naturally occurring bacteria, biotypes of Pasteuria penetrans are attractive candidates for the biocontrol of various Meloidogyne spp. (root-knot nematodes). Failure to culture these bacteria outside their hosts has prevented isolation of genomic DNA in quantities sufficient for identification of genes associated with host recognition and virulence. We have applied multiple-strand displacement amplification (MDA) to generate DNA for comparative genomics of biotypes exhibiting different host preferences. Using the genome of Bacillus subtilis as a paradigm, MDA allowed quantitative detection and sequencing of 12 marker genes from 2000 cells. Meloidogyne spp. infected with P. penetrans P20 or B4 contained single nucleotide polymorphisms (SNPs) in the spoIIAB gene that did not change the amino acid sequence, or that substituted amino acids with similar chemical properties. Individual nematodes infected with P. penetrans P20 or B4 contained SNPs in the spoIIAB gene sequenced in MDA-generated products. Detection of SNPs in the spoIIAB gene in a nematode indicates infection by more than one genotype, supporting the need to sequence genomes of Pasteuria spp. derived from single spore isolates.
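The distinction drawn above between SNPs that leave the amino acid unchanged and SNPs that substitute a chemically similar amino acid can be sketched as a codon-level classifier. This is a minimal illustration, not the authors' pipeline: it uses the standard genetic code, and the similarity groups in `SIMILAR_GROUPS` are a coarse assumption chosen for this example.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Coarse physicochemical groups (illustrative assumption, not from the source).
SIMILAR_GROUPS = [set("DE"), set("KRH"), set("STNQ"), set("AVLIM"), set("FWY")]


def classify_codon_change(ref_codon: str, alt_codon: str) -> str:
    """Classify a codon change as synonymous, conservative or nonconservative."""
    ref_aa = CODON_TABLE[ref_codon.upper()]
    alt_aa = CODON_TABLE[alt_codon.upper()]
    if ref_aa == alt_aa:
        return "synonymous"
    if any(ref_aa in group and alt_aa in group for group in SIMILAR_GROUPS):
        return "conservative"
    return "nonconservative"


if __name__ == "__main__":
    print(classify_codon_change("GAA", "GAG"))  # Glu -> Glu
    print(classify_codon_change("GAT", "GAA"))  # Asp -> Glu
    print(classify_codon_change("GAA", "AAA"))  # Glu -> Lys
```

A real analysis would first place each SNP in the correct reading frame of the spoIIAB gene; the sketch assumes the affected codons have already been extracted.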
Peptides whose uptake by cells is controllable

DOEpatents

Jiang, Tao [San Diego, CA]; Tsien, Roger Y [La Jolla, CA]

2012-02-07

A generic structure for the peptides of the present invention includes A-X-B-C, where C is a cargo moiety, the B portion includes basic amino acids, X is a cleavable linker sequence, and the A portion includes acidic amino acids. The intact structure is not significantly taken up by cells; however, upon extracellular cleavage of X, the B-C portion is taken up, delivering the cargo to targeted cells. Cargo may be, for example, a contrast agent for diagnostic imaging, a chemotherapeutic drug, or a radiation-sensitizer for therapy. Cleavage of X allows separation of A from B, unmasking the normal ability of the basic amino acids in B to drag cargo C into cells near the cleavage event. X is cleaved extracellularly, preferably under physiological conditions. D-amino acids are preferred for the A and B portions, to minimize immunogenicity and nonspecific cleavage by background peptidases or proteases.
Peptides whose uptake by cells is controllable

DOEpatents

Jiang, Tao [San Diego, CA]; Tsien, Roger Y [La Jolla, CA]

2008-10-07

A generic structure for the peptides of the present invention includes A-X-B-C, where C is a cargo moiety, the B portion includes basic amino acids, X is a cleavable linker sequence, and the A portion includes acidic amino acids. The intact structure is not significantly taken up by cells; however, upon extracellular cleavage of X, the B-C portion is taken up, delivering the cargo to targeted cells. Cargo may be, for example, a contrast agent for diagnostic imaging, a chemotherapeutic drug, or a radiation-sensitizer for therapy. Cleavage of X allows separation of A from B, unmasking the normal ability of the basic amino acids in B to drag cargo C into cells near the cleavage event. X is cleaved extracellularly, preferably under physiological conditions. D-amino acids are preferred for the A and B portions, to minimize immunogenicity and nonspecific cleavage by background peptidases or proteases.

Peptides whose uptake by cells is controllable

DOEpatents

Jiang, Tao; Tsien, Roger Y

2014-02-04

A generic structure for the peptides of the present invention includes A-X-B-C, where C is a cargo moiety, the B portion includes basic amino acids, X is a cleavable linker sequence, and the A portion includes acidic amino acids. The intact structure is not significantly taken up by cells; however, upon extracellular cleavage of X, the B-C portion is taken up, delivering the cargo to targeted cells. Cargo may be, for example, a contrast agent for diagnostic imaging, a chemotherapeutic drug, or a radiation-sensitizer for therapy. Cleavage of X allows separation of A from B, unmasking the normal ability of the basic amino acids in B to drag cargo C into cells near the cleavage event. X is cleaved extracellularly, preferably under physiological conditions. D-amino acids are preferred for the A and B portions, to minimize immunogenicity and nonspecific cleavage by background peptidases or proteases.
Functionally Convergent B Cell Receptor Sequences in Transgenic Rats Expressing a Human B Cell Repertoire in Response to Tetanus Toxoid and Measles Antigens.

PubMed

Bürckert, Jean-Philippe; Dubois, Axel R S X; Faison, William J; Farinelle, Sophie; Charpentier, Emilie; Sinner, Regina; Wienecke-Baldacchino, Anke; Muller, Claude P

2017-01-01

The identification and tracking of antigen-specific immunoglobulin (Ig) sequences within total Ig repertoires is central to high-throughput sequencing (HTS) studies of infections or vaccinations. In this context, public Ig sequences shared by different individuals exposed to the same antigen could be valuable markers for tracing back infections, measuring vaccine immunogenicity, and perhaps ultimately allow the reconstruction of the immunological history of an individual. Here, we immunized groups of transgenic rats expressing human Ig against tetanus toxoid (TT), Modified Vaccinia virus Ankara (MVA), measles virus hemagglutinin and fusion proteins expressed on MVA, and the environmental carcinogen benzo[a]pyrene, coupled to TT. We showed that these antigens impose a selective pressure causing the Ig heavy chain (IgH) repertoires of the rats to converge toward the expression of antibodies with highly similar IgH CDR3 amino acid sequences. We present a computational approach, similar to differential gene expression analysis, that selects for clusters of CDR3s with 80% similarity, significantly overrepresented within the different groups of immunized rats. These IgH clusters represent antigen-induced IgH signatures exhibiting stereotypic amino acid patterns including previously described TT- and measles-specific IgH sequences. Our data suggest that with the presented methodology, transgenic Ig rats can be utilized as a model to identify antigen-induced, human IgH signatures to a variety of different antigens.
Programming the Assembly of Unnatural Materials with Nucleic Acids

NASA Astrophysics Data System (ADS)

Mirkin, Chad

Nature directs the assembly of enormously complex and highly functional materials through an encoded class of biomolecules, nucleic acids. The establishment of a similarly programmable code for the construction of synthetic, unnatural materials would allow researchers to impart functionality by precisely positioning all material components. Although it is exceedingly difficult to control the complex interactions between atomic and molecular species in such a manner, interactions between nanoscale components can be directed through the ligands attached to their surface. Our group has shown that nucleic acids can be used as highly programmable surface ligands to control the spacing and symmetry of nanoparticle building blocks in structurally sophisticated and functional materials. These nucleic acids function as programmable "bonds" between nanoparticle "atoms," analogous to a nanoscale genetic code for assembling materials. The sequence and length tunability of nucleic acid bonds has allowed us to define a powerful set of design rules for the construction of nanoparticle superlattices with more than 30 unique lattice symmetries, tunable defect structures and interparticle spacings, and several well-defined crystal habits. Further, the nature of the nucleic acid bond enables an additional level of structural control: temporal regulation of dynamic material response to external biomolecular and chemical stimuli. This control allows for the reversible transformation between thermodynamic states with different crystal symmetries, particle stoichiometries, thermal stabilities, and interparticle spacings on demand. Notably, our unique genetic approach affords functional nanoparticle architectures that, among many other applications, can be used to systematically explore and manipulate optoelectronic material properties, such as tunable interparticle plasmonic interactions, microstructure-directed energy emission, and coupled plasmonic and photonic modes.
Linkage-specific sialic acid derivatization for MALDI-TOF-MS profiling of IgG glycopeptides.

PubMed

de Haan, Noortje; Reiding, Karli R; Haberger, Markus; Reusch, Dietmar; Falck, David; Wuhrer, Manfred

2015-08-18

Glycosylation is a common co- and post-translational protein modification, having a large influence on protein properties like conformation and solubility. Furthermore, glycosylation is an important determinant of efficacy and clearance of biopharmaceuticals such as immunoglobulin G (IgG). Matrix-assisted laser desorption/ionization (MALDI)-time-of-flight (TOF)-mass spectrometry (MS) shows potential for the site-specific glycosylation analysis of IgG at the glycopeptide level. With this approach, however, important information about glycopeptide sialylation is not duly covered because of in-source and metastable decay of the sialylated species. Here, we present a highly repeatable sialic acid derivatization method to allow subclass-specific MALDI-TOF-MS analysis of tryptic IgG glycopeptides. The method, employing dimethylamidation with the carboxylic acid activator 1-ethyl-3-(3-(dimethylamino)propyl)carbodiimide (EDC) and the catalyst 1-hydroxybenzotriazole (HOBt), results in different masses for the functionally divergent α2,3- and α2,6-linked sialic acids. Respective lactonization and dimethylamidation lead to their direct discrimination in MS and, importantly, both glycan and peptide moieties reacted in a controlled manner. In addition, stabilization allowed the acquisition of fragmentation spectra informative with respect to glycosylation and peptide sequence. This was in contrast to fragmentation spectra of underivatized samples, which were dominated by sialic acid loss. The method allowed the facile discrimination and relative quantitation of IgG Fc sialylation in therapeutic IgG samples. The method has considerable potential for future site- and sialic acid linkage-specific glycosylation profiling of therapeutic antibodies, as well as for subclass-specific biomarker discovery in clinical IgG samples derived from plasma.
Biochemical and molecular characterization of the venom from the Cuban scorpion Rhopalurus junceus.

PubMed

García-Gómez, B I; Coronas, F I V; Restano-Cassulini, R; Rodríguez, R R; Possani, L D

2011-07-01

This communication describes the first general biochemical, molecular and functional characterization of the venom from the Cuban blue scorpion Rhopalurus junceus, which is often used as a natural product for anti-cancer therapy in Cuba. The soluble venom of this arachnid is not toxic to mice, injected intraperitoneally at doses up to 200 μg/20 g body weight, but it is deadly to insects at doses of 10 μg per animal. The venom causes typical alpha and beta-effects on Na+ channels, when assayed using patch-clamp techniques in neuroblastoma cells in vitro. It also affects K+ currents conducted by ERG (ether-a-go-go related gene) channels. The soluble venom was shown to display phospholipase, hyaluronidase and anti-microbial activities. High performance liquid chromatography of the soluble venom can separate at least 50 components, among which are peptides lethal to crickets. Four such peptides were isolated to homogeneity and their molecular masses and N-terminal amino acid sequences were determined. The major component (RjAa12f) was fully sequenced by Edman degradation. It contains 64 amino acid residues and four disulfide bridges, similar to other known scorpion toxins. A cDNA library prepared from the venomous glands of one scorpion allowed cloning 18 genes that code for peptides of the venom, including RjA12f and eleven other closely related genes. Sequence analyses and phylogenetic reconstruction of the amino acid sequences deduced from the cloned genes showed that this scorpion contains sodium channel like toxin sequences clearly segregated into two monophyletic clusters. Considering the complex set of effects on Na+ currents verified here, this venom certainly warrants further investigation.
Genome-Wide Association Study of Genetic Control of Seed Fatty Acid Biosynthesis in Brassica napus

PubMed Central

Gacek, Katarzyna; Bayer, Philipp E.; Bartkowiak-Broda, Iwona; Szala, Laurencja; Bocianowski, Jan; Edwards, David; Batley, Jacqueline

2017-01-01

Fatty acids and their composition in seeds determine oil value for nutritional or industrial purposes and also affect seed germination as well as seedling establishment. To better understand the genetic basis of seed fatty acid biosynthesis in oilseed rape (Brassica napus L.) we applied a genome-wide association study, using 91,205 single nucleotide polymorphisms (SNPs) characterized across a mapping population with high-resolution skim genotyping by sequencing (SkimGBS). We identified a cluster of loci on chromosome A05 associated with oleic and linoleic seed fatty acids. The delineated genomic region contained orthologs of the Arabidopsis thaliana genes known to play a role in regulation of seed fatty acid biosynthesis such as Fatty acyl-ACP thioesterase B (FATB) and Fatty Acid Desaturase (FAD5). This approach allowed us to identify potential functional genes regulating fatty acid composition in this important oil producing crop and demonstrates that this approach can be used as a powerful tool for dissecting complex traits for B. napus improvement programs. PMID:28163710
Method for nucleic acid hybridization using single-stranded DNA binding protein

DOEpatents

Tabor, Stanley; Richardson, Charles C.

1996-01-01

Method of nucleic acid hybridization for detecting the presence of a specific nucleic acid sequence in a population of different nucleic acid sequences using a nucleic acid probe. The nucleic acid probe hybridizes with the specific nucleic acid sequence but not with other nucleic acid sequences in the population. The method includes contacting a sample (potentially including the nucleic acid sequence) with the nucleic acid probe under hybridizing conditions in the presence of a single-stranded DNA binding protein provided in an amount which stimulates renaturation of a dilute solution (i.e., one in which the t1/2 of renaturation is longer than 3 weeks) of single-stranded DNA greater than 500 fold (i.e., to a t1/2 less than 60 min, preferably less than 5 min, and most preferably about 1 min.) in the absence of nucleotide triphosphates.
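The two figures quoted in this abstract, a baseline renaturation half-time of about 3 weeks and a 500-fold stimulation, can be checked against the stated target of roughly 60 minutes with a short calculation. The sketch assumes that an n-fold stimulation simply divides the half-time by n; the function name is illustrative.

```python
MINUTES_PER_WEEK = 7 * 24 * 60  # 10080 minutes


def stimulated_half_time(baseline_minutes: float, fold: float) -> float:
    """Half-time after an n-fold rate stimulation (assumes half-time scales as 1/n)."""
    return baseline_minutes / fold


if __name__ == "__main__":
    baseline = 3 * MINUTES_PER_WEEK  # 30240 minutes for a 3-week half-time
    print(stimulated_half_time(baseline, 500))  # about 60 minutes
    # Fold stimulation needed to reach a 1-minute half-time from the same baseline
    print(baseline / 1.0)
```

Under that assumption, the boundary values of 3 weeks and 500-fold land at about an hour, and the "about 1 min" case corresponds to a stimulation on the order of 30,000-fold.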
Sequence quality analysis tool for HIV type 1 protease and reverse transcriptase.

PubMed

Delong, Allison K; Wu, Mingham; Bennett, Diane; Parkin, Neil; Wu, Zhijin; Hogan, Joseph W; Kantor, Rami

2012-08-01

Access to antiretroviral therapy is increasing globally and drug resistance evolution is anticipated. Currently, protease (PR) and reverse transcriptase (RT) sequence generation is increasing, including the use of in-house sequencing assays, and quality assessment prior to sequence analysis is essential. We created a computational HIV PR/RT Sequence Quality Analysis Tool (SQUAT) that runs in the R statistical environment. Sequence quality thresholds are calculated from a large dataset (46,802 PR and 44,432 RT sequences) from the published literature (http://hivdb.Stanford.edu). Nucleic acid sequences are read into SQUAT, identified, aligned, and translated. Nucleic acid sequences are flagged if they have >five 1-2-base insertions; >one 3-base insertion; >one deletion; >six PR or >18 RT ambiguous bases; >three consecutive PR or >four RT nucleic acid mutations; >zero stop codons; >three PR or >six RT ambiguous amino acids; >three consecutive PR or >four RT amino acid mutations; >zero unique amino acids; or <0.5% or >15% genetic distance from another submitted sequence. Thresholds are user modifiable. SQUAT output includes a summary report with detailed comments for troubleshooting of flagged sequences, histograms of pairwise genetic distances, neighbor joining phylogenetic trees, and aligned nucleic and amino acid sequences. SQUAT is a stand-alone, free, web-independent tool to ensure use of high-quality HIV PR/RT sequences in interpretation and reporting of drug resistance, while increasing awareness and expertise and facilitating troubleshooting of potentially problematic sequences.
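The threshold rules listed in this abstract lend themselves to a simple rule-based check. The sketch below is written in Python for illustration only; SQUAT itself runs in R and its actual code and data structures are not shown in the abstract. The `SequenceStats` fields and `flag_reasons` function are hypothetical, and only a subset of the quoted thresholds is covered.

```python
from dataclasses import dataclass
from typing import List

# Region-specific limit quoted in the abstract (PR = protease, RT = reverse transcriptase).
MAX_AMBIGUOUS_BASES = {"PR": 6, "RT": 18}


@dataclass
class SequenceStats:
    """Per-sequence counts, assumed to be computed upstream after alignment."""
    region: str            # "PR" or "RT"
    short_insertions: int  # 1-2-base insertions
    codon_insertions: int  # 3-base insertions
    deletions: int
    ambiguous_bases: int
    stop_codons: int


def flag_reasons(stats: SequenceStats) -> List[str]:
    """Return the reasons a sequence would be flagged; empty if it passes."""
    reasons = []
    if stats.short_insertions > 5:
        reasons.append("more than five 1-2-base insertions")
    if stats.codon_insertions > 1:
        reasons.append("more than one 3-base insertion")
    if stats.deletions > 1:
        reasons.append("more than one deletion")
    if stats.ambiguous_bases > MAX_AMBIGUOUS_BASES[stats.region]:
        reasons.append("too many ambiguous bases")
    if stats.stop_codons > 0:
        reasons.append("stop codon present")
    return reasons


if __name__ == "__main__":
    sample = SequenceStats("PR", 0, 0, 0, 7, 1)
    print(flag_reasons(sample))
```

Keeping the limits in plain data such as `MAX_AMBIGUOUS_BASES` mirrors the abstract's point that thresholds are user modifiable.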
[MALDI-TOF mass spectrometry in the investigation of large high-molecular biological compounds].

PubMed

Porubl'ova, L V; Rebriiev, A V; Hromovyĭ, T Iu; Minia, I I; Obolens'ka, M Iu

2009-01-01

MALDI-TOF (Matrix-Assisted Laser Desorption/Ionization Time-of-Flight) mass spectrometry has become, in the recent years, a tool of choice for analyses of biological polymers. The wide mass range, high accuracy, informativity and sensitivity make it a superior method for analysis of all kinds of high-molecular biological compounds including proteins, nucleic acids and lipids. MALDI-TOF-MS is particularly suitable for the identification of proteins by mass fingerprint or microsequencing. Therefore it has become an important technique of proteomics. Furthermore, the method allows making a detailed analysis of post-translational protein modifications, protein-protein and protein-nucleic acid interactions. Recently, the method was also successfully applied to nucleic acid sequencing as well as screening for mutations.
Emergence of a replicating species from an in vitro RNA evolution reaction

NASA Technical Reports Server (NTRS)

Breaker, R. R.; Joyce, G. F.

1994-01-01

The technique of self-sustained sequence replication allows isothermal amplification of DNA and RNA molecules in vitro. This method relies on the activities of a reverse transcriptase and a DNA-dependent RNA polymerase to amplify specific nucleic acid sequences. We have modified this protocol to allow selective amplification of RNAs that catalyze a particular chemical reaction. During an in vitro RNA evolution experiment employing this modified system, a unique class of "selfish" RNAs emerged and replicated to the exclusion of the intended RNAs. Members of this class of selfish molecules, termed RNA Z, amplify efficiently despite their inability to catalyze the target chemical reaction. Their amplification requires the action of both reverse transcriptase and RNA polymerase and involves the synthesis of both DNA and RNA replication intermediates. The proposed amplification mechanism for RNA Z involves the formation of a DNA hairpin that functions as a template for transcription by RNA polymerase. This arrangement links the two strands of the DNA, resulting in the production of RNA transcripts that contain an embedded RNA polymerase promoter sequence.
Multi-Harmony: detecting functional specificity from sequence alignment

PubMed Central

Brandt, Bernd W.; Feenstra, K. Anton; Heringa, Jaap

2010-01-01

Many protein families contain sub-families with functional specialization, such as binding different ligands or being involved in different protein–protein interactions. A small number of amino acids generally determine functional specificity. The identification of these residues can aid the understanding of protein function and help in finding targets for experimental analysis. Here, we present multi-Harmony, an interactive web server for detecting sub-type-specific sites in proteins starting from a multiple sequence alignment. Combining our Sequence Harmony (SH) and multi-Relief (mR) methods in one web server allows simultaneous analysis and comparison of specificity residues; furthermore, both methods have been significantly improved and extended. SH has been extended to cope with more than two sub-groups. mR has been changed from a sampling implementation to a deterministic one, making it more consistent and user friendly. For both methods Z-scores are reported. The multi-Harmony web server produces a dynamic output page, which includes interactive connections to the Jalview and Jmol applets, thereby allowing interactive analysis of the results. Multi-Harmony is available at http://www.ibi.vu.nl/programs/shmrwww. PMID:20525785
Cloning and sequence analysis of sucrose phosphate synthase gene from varieties of Pennisetum species.

PubMed

Li, H C; Lu, H B; Yang, F Y; Liu, S J; Bai, C J; Zhang, Y W

2015-03-31

Sucrose phosphate synthase (SPS) is an enzyme used by higher plants for sucrose synthesis. In this study, three primer sets were designed on the basis of known SPS sequences from maize (GenBank: NM_001112224.1) and sugarcane (GenBank: JN584485.1), and five novel SPS genes were identified by RT-PCR from the genomes of Pennisetum spp (the hybrid P. americanum x P. purpureum, P. purpureum Schum., P. purpureum Schum. cv. Red, P. purpureum Schum. cv. Taiwan, and P. purpureum Schum. cv. Mott). The cloned sequences showed 99.9% identity and 80-88% similarity to the SPS sequences of other plants. The SPS gene of hybrid Pennisetum had one nucleotide and four amino acid polymorphisms compared to the other four germplasms, and cluster analysis was performed to assess genetic diversity in this species. Additional characterization of the SPS gene product can potentially allow Pennisetum to be exploited as a biofuel source.
Identification of Delta5-fatty acid desaturase from the cellular slime mold Dictyostelium discoideum.

PubMed

Saito, T; Ochiai, H

1999-10-01

cDNA fragments putatively encoding amino acid sequences characteristic of the fatty acid desaturase were obtained using expressed sequence tag (EST) information of the Dictyostelium cDNA project. Using this sequence, we have determined the cDNA sequence and genomic sequence of a desaturase. The cloned cDNA is 1489 nucleotides long and the deduced amino acid sequence comprised 464 amino acid residues containing an N-terminal cytochrome b5 domain. The whole sequence was 38.6% identical to the initially identified Delta5-desaturase of Mortierella alpina. We have confirmed its function as Delta5-desaturase by overexpression mutation in D. discoideum and also the gain of function mutation in the yeast Saccharomyces cerevisiae. Analysis of the lipids from transformed D. discoideum and yeast demonstrated the accumulation of Delta5-desaturated products. This is the first report concerning fatty acid desaturase in cellular slime molds.
History of retinoic acid receptors.

PubMed

Benbrook, Doris M; Chambon, Pierre; Rochette-Egly, Cécile; Asson-Batres, Mary Ann

2014-01-01

The discovery of retinoic acid receptors arose from research into how vitamins are essential for life. Early studies indicated that Vitamin A was metabolized into an active factor, retinoic acid (RA), which regulates RNA and protein expression in cells. Each step forward in our understanding of retinoic acid in human health was accomplished by the development and application of new technologies. Development of cDNA cloning techniques and discovery of nuclear receptors for steroid hormones provided the basis for identification of two classes of retinoic acid receptors, RARs and RXRs, each of which has three isoforms, α, β and γ. DNA manipulation and crystallographic studies revealed that the receptors contain discrete functional domains responsible for binding to DNA, ligands and cofactors. Ligand binding was shown to induce conformational changes in the receptors that cause release of corepressors and recruitment of coactivators to create functional complexes that are bound to consensus promoter DNA sequences called retinoic acid response elements (RAREs) and that cause opening of chromatin and transcription of adjacent genes. Homologous recombination technology allowed the development of mice lacking expression of retinoic acid receptors, individually or in various combinations, which demonstrated that the receptors exhibit vital, but redundant, functions in fetal development and in vision, reproduction, and other functions required for maintenance of adult life. More recent advancements in sequencing and proteomic technologies reveal the complexity of retinoic acid receptor involvement in cellular function through regulation of gene expression and kinase activity. Future directions will require systems biology approaches to decipher how these integrated networks affect human stem cells, health, and disease.
Analysis of Endogenous D-Amino Acid-Containing Peptides in Metazoa

PubMed Central

Bai, Lu; Sheeley, Sarah; Sweedler, Jonathan V.

2010-01-01

Peptides are chiral molecules with their structure determined by the composition and configuration of their amino acid building blocks. The naturally occurring amino acids, except glycine, possess two chiral forms. This allows the formation of multiple peptide diastereomers that have the same sequence. Although living organisms use L-amino acids to make proteins, a group of D-amino acid-containing peptides (DAACPs) has been discovered in animals that have at least one of their residues isomerized to the D-form via an enzyme-catalyzed process. In many cases, the biological functions of these peptides are enhanced due to this structural conversion. These DAACPs are different from those known to occur in bacterial cell wall and antibiotic peptides, the latter of which are synthesized in a ribosome-independent manner. DAACPs have now also been identified in a number of distinct groups throughout the Metazoa. Their serendipitous discovery has often resulted from discrepancies observed in bioassays or in chromatographic behavior between natural peptide fractions and peptides synthesized according to a presumed all-L sequence. Because this L-to-D post-translational modification is subtle and not detectable by most sequence determination approaches, it is reasonable to suspect that many studies have overlooked this change; accordingly, DAACPs may be more prevalent than currently thought. Although diastereomer separation techniques developed with synthetic peptides in recent years have greatly aided in the discovery of natural DAACPs, there is a need for new, more robust methods for naturally complex samples. In this review, a brief history of DAACPs in animals is presented, followed by discussion of a variety of analytical methods that have been used for diastereomeric separation and detection of peptides. PMID:20490347
An OmpA family protein, a target of the GinI/GinR quorum-sensing system in Gluconacetobacter intermedius, controls acetic acid fermentation.

PubMed

Iida, Aya; Ohnishi, Yasuo; Horinouchi, Sueharu

2008-07-01

Via N-acylhomoserine lactones, the GinI/GinR quorum-sensing system in Gluconacetobacter intermedius NCI1051, a gram-negative acetic acid bacterium, represses acetic acid and gluconic acid fermentation. Two-dimensional polyacrylamide gel electrophoretic analysis of protein profiles of strain NCI1051 and ginI and ginR mutants identified a protein that was produced in response to the GinI/GinR regulatory system. Cloning and nucleotide sequencing of the gene encoding this protein revealed that it encoded an OmpA family protein, named GmpA. gmpA was a member of the gene cluster containing three adjacent homologous genes, gmpA to gmpC, the organization of which appeared to be unique to vinegar producers, including "Gluconacetobacter polyoxogenes." In addition, GmpA was unique among the OmpA family proteins in that its N-terminal membrane domain forming eight antiparallel transmembrane beta-strands contained an extra sequence in one of the surface-exposed loops. Transcriptional analysis showed that only gmpA of the three adjacent gmp genes was activated by the GinI/GinR quorum-sensing system. However, gmpA was not controlled directly by GinR but was controlled by an 89-amino-acid protein, GinA, a target of this quorum-sensing system. A gmpA mutant grew more rapidly in the presence of 2% (vol/vol) ethanol and accumulated acetic acid and gluconic acid in greater final yields than strain NCI1051. Thus, GmpA plays a role in repressing oxidative fermentation, including acetic acid fermentation, which is unique to acetic acid bacteria and allows ATP synthesis via ethanol oxidation. Consistent with the involvement of gmpA in oxidative fermentation, its transcription was also enhanced by ethanol and acetic acid.
Composition for nucleic acid sequencing

DOEpatents

Korlach, Jonas [Ithaca, NY; Webb, Watt W [Ithaca, NY; Levene, Michael [Ithaca, NY; Turner, Stephen [Ithaca, NY; Craighead, Harold G [Ithaca, NY; Foquet, Mathieu [Ithaca, NY]

2008-08-26

The present invention is directed to a method of sequencing a target nucleic acid molecule having a plurality of bases. In its principle, the temporal order of base additions during the polymerization reaction is measured on a molecule of nucleic acid, i.e. the activity of a nucleic acid polymerizing enzyme on the template nucleic acid molecule to be sequenced is followed in real time. The sequence is deduced by identifying which base is being incorporated into the growing complementary strand of the target nucleic acid by the catalytic activity of the nucleic acid polymerizing enzyme at each step in the sequence of base additions. A polymerase on the target nucleic acid molecule complex is provided in a position suitable to move along the target nucleic acid molecule and extend the oligonucleotide primer at an active site. A plurality of labelled types of nucleotide analogs are provided proximate to the active site, with each distinguishable type of nucleotide analog being complementary to a different nucleotide in the target nucleic acid sequence. The growing nucleic acid strand is extended by using the polymerase to add a nucleotide analog to the nucleic acid strand at the active site, where the nucleotide analog being added is complementary to the nucleotide of the target nucleic acid at the active site. The nucleotide analog added to the oligonucleotide primer as a result of the polymerizing step is identified. The steps of providing labelled nucleotide analogs, polymerizing the growing nucleic acid strand, and identifying the added nucleotide analog are repeated so that the nucleic acid strand is further extended and the sequence of the target nucleic acid is determined.
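The deduction step described above (identify each incorporated analog in temporal order, then infer the target sequence) can be sketched in a few lines. This is an illustration only, assuming standard Watson-Crick pairing and an error-free list of detected labels already translated to bases; the name `template_from_incorporations` is hypothetical and does not come from the patent.

```python
# Watson-Crick partner of each DNA base.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def template_from_incorporations(events):
    """Infer the template region from analogs incorporated in temporal order.

    `events` lists the bases added to the growing strand, first to last,
    so it spells the growing strand 5'->3'. The template is read 3'->5'
    by the polymerase, so the template written 5'->3' is the reverse
    complement of the growing strand.
    """
    growing_strand = "".join(events)
    return "".join(COMPLEMENT[base] for base in reversed(growing_strand))

# Toy run: four incorporation events observed in this order.
template = template_from_incorporations(["A", "T", "G", "C"])
```

A real instrument would also have to handle missed or spurious incorporation signals, which this sketch ignores.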
Method for sequencing nucleic acid molecules

DOEpatents

Korlach, Jonas; Webb, Watt W.; Levene, Michael; Turner, Stephen; Craighead, Harold G.; Foquet, Mathieu

2006-06-06

The present invention is directed to a method of sequencing a target nucleic acid molecule having a plurality of bases. In its principle, the temporal order of base additions during the polymerization reaction is measured on a molecule of nucleic acid, i.e. the activity of a nucleic acid polymerizing enzyme on the template nucleic acid molecule to be sequenced is followed in real time. The sequence is deduced by identifying which base is being incorporated into the growing complementary strand of the target nucleic acid by the catalytic activity of the nucleic acid polymerizing enzyme at each step in the sequence of base additions. A polymerase on the target nucleic acid molecule complex is provided in a position suitable to move along the target nucleic acid molecule and extend the oligonucleotide primer at an active site. A plurality of labelled types of nucleotide analogs are provided proximate to the active site, with each distinguishable type of nucleotide analog being complementary to a different nucleotide in the target nucleic acid sequence. The growing nucleic acid strand is extended by using the polymerase to add a nucleotide analog to the nucleic acid strand at the active site, where the nucleotide analog being added is complementary to the nucleotide of the target nucleic acid at the active site. The nucleotide analog added to the oligonucleotide primer as a result of the polymerizing step is identified. The steps of providing labelled nucleotide analogs, polymerizing the growing nucleic acid strand, and identifying the added nucleotide analog are repeated so that the nucleic acid strand is further extended and the sequence of the target nucleic acid is determined.
Method for sequencing nucleic acid molecules

DOEpatents

Korlach, Jonas; Webb, Watt W.; Levene, Michael; Turner, Stephen; Craighead, Harold G.; Foquet, Mathieu

2006-05-30

The present invention is directed to a method of sequencing a target nucleic acid molecule having a plurality of bases. In its principle, the temporal order of base additions during the polymerization reaction is measured on a molecule of nucleic acid, i.e. the activity of a nucleic acid polymerizing enzyme on the template nucleic acid molecule to be sequenced is followed in real time. The sequence is deduced by identifying which base is being incorporated into the growing complementary strand of the target nucleic acid by the catalytic activity of the nucleic acid polymerizing enzyme at each step in the sequence of base additions. A polymerase on the target nucleic acid molecule complex is provided in a position suitable to move along the target nucleic acid molecule and extend the oligonucleotide primer at an active site. A plurality of labelled types of nucleotide analogs are provided proximate to the active site, with each distinguishable type of nucleotide analog being complementary to a different nucleotide in the target nucleic acid sequence. The growing nucleic acid strand is extended by using the polymerase to add a nucleotide analog to the nucleic acid strand at the active site, where the nucleotide analog being added is complementary to the nucleotide of the target nucleic acid at the active site. The nucleotide analog added to the oligonucleotide primer as a result of the polymerizing step is identified. The steps of providing labelled nucleotide analogs, polymerizing the growing nucleic acid strand, and identifying the added nucleotide analog are repeated so that the nucleic acid strand is further extended and the sequence of the target nucleic acid is determined.
MBGD update 2015: microbial genome database for flexible ortholog analysis utilizing a diverse set of genomic data.

PubMed

Uchiyama, Ikuo; Mihara, Motohiro; Nishide, Hiroyo; Chiba, Hirokazu

2015-01-01

The microbial genome database for comparative analysis (MBGD) (available at http://mbgd.genome.ad.jp/) is a comprehensive ortholog database for flexible comparative analysis of microbial genomes, where the users are allowed to create an ortholog table among any specified set of organisms. Because of the rapid increase in microbial genome data owing to the next-generation sequencing technology, it becomes increasingly challenging to maintain high-quality orthology relationships while allowing the users to incorporate the latest genomic data available into an analysis. Because many of the recently accumulating genomic data are draft genome sequences for which some complete genome sequences of the same or closely related species are available, MBGD now stores draft genome data and allows the users to incorporate them into a user-specific ortholog database using the MyMBGD functionality. In this function, draft genome data are incorporated into an existing ortholog table created only from the complete genome data in an incremental manner to prevent low-quality draft data from affecting clustering results. In addition, to provide high-quality orthology relationships, the standard ortholog table containing all the representative genomes, which is first created by the rapid classification program DomClust, is now refined using DomRefine, a recently developed program for improving domain-level clustering using multiple sequence alignment information. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.

Assessment of Epstein-Barr virus nucleic acids in gastric but not in breast cancer by next-generation sequencing of pooled Mexican samples

PubMed Central

Fuentes-Pananá, Ezequiel M; Larios-Serrato, Violeta; Méndez-Tenorio, Alfonso; Morales-Sánchez, Abigail; Arias, Carlos F; Torres, Javier

2016-01-01

Gastric (GC) and breast (BrC) cancer are two of the most common and deadly tumours. Different lines of evidence suggest a possible causative role of viral infections for both GC and BrC. Wide genome sequencing (WGS) technologies allow searching for viral agents in tissues of patients with cancer. These technologies have already contributed to establishing virus-cancer associations as well as to the discovery of new tumour viruses. The objective of this study was to document possible associations of viral infection with GC and BrC in Mexican patients. In order to gain an idea of cost-effective conditions for experimental sequencing, we first carried out an in silico simulation of WGS. The next-generation platform Illumina GAIIx was then used to sequence GC and BrC tumour samples. While we did not find viral sequences in tissues from BrC patients, multiple reads matching Epstein-Barr virus (EBV) sequences were found in GC tissues. An end-point polymerase chain reaction confirmed an enrichment of EBV sequences in one of the GC samples sequenced, validating the next-generation sequencing-bioinformatics pipeline. PMID:26910355
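Finding reads that match a viral reference can be illustrated with a deliberately simple k-mer screen. This is not the authors' pipeline, which the abstract does not describe in detail; the function names and the thresholds `k` and `min_shared` are assumptions chosen for the toy example, and production work would normally use a dedicated aligner and consider both strands.

```python
def kmers(seq, k):
    """Set of all overlapping substrings of length k."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def matching_reads(reads, reference, k=8, min_shared=3):
    """Keep reads sharing at least `min_shared` k-mers with the reference."""
    ref_kmers = kmers(reference.upper(), k)
    hits = []
    for read in reads:
        shared = kmers(read.upper(), k) & ref_kmers
        if len(shared) >= min_shared:
            hits.append(read)
    return hits

# Toy data: one read is cut from the reference, the other is unrelated.
reference = "ATGCGTACGTTAGCCTAGGATCCA"
reads = [reference[4:20], "TTTTTTTTTTTTTTTT"]
hits = matching_reads(reads, reference)
```

Here only the read taken from the reference passes the screen; the homopolymer read shares no 8-mer with it.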
Rational design of DNA sequences for nanotechnology, microarrays and molecular computers using Eulerian graphs.

PubMed

Pancoska, Petr; Moravek, Zdenek; Moll, Ute M

2004-01-01

Nucleic acids are molecules of choice for both established and emerging nanoscale technologies. These technologies benefit from large functional densities of 'DNA processing elements' that can be readily manufactured. To achieve the desired functionality, polynucleotide sequences are currently designed by a process that involves tedious and laborious filtering of potential candidates against a series of requirements and parameters. Here, we present a completely novel methodology for the rapid rational design of large sets of DNA sequences. This method allows for the direct implementation of very complex and detailed requirements for the generated sequences, thus avoiding 'brute force' filtering. At the same time, these sequences have narrow distributions of melting temperatures. The molecular part of the design process can be done without computer assistance, using an efficient 'human engineering' approach by drawing a single blueprint graph that represents all generated sequences. Moreover, the method eliminates the necessity for extensive thermodynamic calculations. Melting temperature can be calculated only once (or not at all). In addition, the isostability of the sequences is independent of the selection of a particular set of thermodynamic parameters. Applications are presented for DNA sequence designs for microarrays, universal microarray zip sequences and electron transfer experiments.
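One way to read the blueprint-graph idea is sketched below, under the assumption that graph nodes are nucleotides and each directed edge is a nearest-neighbor (dinucleotide) step. Every Eulerian trail then uses the same multiset of dinucleotides, so any nearest-neighbor stability estimate would agree across all generated sequences. This is an interpretation of the abstract rather than the authors' published procedure, and the names `eulerian_sequences` and `nearest_neighbors` are hypothetical.

```python
from collections import Counter

def eulerian_sequences(edges, start):
    """Enumerate sequences that traverse every dinucleotide edge exactly once."""
    remaining = Counter(edges)
    total = len(edges)
    found = set()

    def extend(seq):
        if len(seq) == total + 1:  # all edges consumed
            found.add(seq)
            return
        for nxt in "ACGT":
            edge = seq[-1] + nxt
            if remaining[edge] > 0:
                remaining[edge] -= 1
                extend(seq + nxt)
                remaining[edge] += 1  # backtrack

    extend(start)
    return sorted(found)

def nearest_neighbors(seq):
    """Multiset of dinucleotide steps in a sequence."""
    return Counter(seq[i:i + 2] for i in range(len(seq) - 1))

# Toy blueprint: five directed edges over the nucleotides A, C, G, T.
blueprint = ["AC", "CG", "GA", "AT", "TA"]
designs = eulerian_sequences(blueprint, "A")
```

For this toy blueprint the two trails starting at A are ACGATA and ATACGA, and both contain exactly the dinucleotides listed in `blueprint`. The backtracking search is exponential in the worst case and is meant only for small illustrative graphs.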
Assessment of Epstein-Barr virus nucleic acids in gastric but not in breast cancer by next-generation sequencing of pooled Mexican samples.

PubMed

Fuentes-Pananá, Ezequiel M; Larios-Serrato, Violeta; Méndez-Tenorio, Alfonso; Morales-Sánchez, Abigail; Arias, Carlos F; Torres, Javier

2016-03-01

Gastric (GC) and breast (BrC) cancer are two of the most common and deadly tumours. Different lines of evidence suggest a possible causative role of viral infections for both GC and BrC. Wide genome sequencing (WGS) technologies allow searching for viral agents in tissues of patients with cancer. These technologies have already contributed to establishing virus-cancer associations as well as to the discovery of new tumour viruses. The objective of this study was to document possible associations of viral infection with GC and BrC in Mexican patients. In order to gain an idea of cost-effective conditions for experimental sequencing, we first carried out an in silico simulation of WGS. The next-generation platform Illumina GAIIx was then used to sequence GC and BrC tumour samples. While we did not find viral sequences in tissues from BrC patients, multiple reads matching Epstein-Barr virus (EBV) sequences were found in GC tissues. An end-point polymerase chain reaction confirmed an enrichment of EBV sequences in one of the GC samples sequenced, validating the next-generation sequencing-bioinformatics pipeline.
SequenceCEROSENE: a computational method and web server to visualize spatial residue neighborhoods at the sequence level.

PubMed

Heinke, Florian; Bittrich, Sebastian; Kaiser, Florian; Labudde, Dirk

2016-01-01

To understand the molecular function of biopolymers, studying their structural characteristics is of central importance. Graphics programs are often utilized to conceive these properties, but with the increasing number of available structures in databases or structure models produced by automated modeling frameworks this process requires assistance from tools that allow automated structure visualization. In this paper a web server and its underlying method for generating graphical sequence representations of molecular structures is presented. The method, called SequenceCEROSENE (color encoding of residues obtained by spatial neighborhood embedding), retrieves the sequence of each amino acid or nucleotide chain in a given structure and produces a color coding for each residue based on three-dimensional structure information. From this, color-highlighted sequences are obtained, where residue coloring represents three-dimensional residue locations in the structure. This color encoding thus provides a one-dimensional representation, from which spatial interactions, proximity and relations between residues or entire chains can be deduced quickly and solely from color similarity. Furthermore, additional heteroatoms and chemical compounds bound to the structure, like ligands or coenzymes, are processed and reported as well. To provide free access to SequenceCEROSENE, a web server has been implemented that allows generating color codings for structures deposited in the Protein Data Bank or structure models uploaded by the user. Besides retrieving visualizations in popular graphic formats, underlying raw data can be downloaded as well. In addition, the server provides user interactivity with generated visualizations and the three-dimensional structure in question.
Color-encoded sequences generated by SequenceCEROSENE can help researchers quickly perceive the general characteristics of a structure of interest (or entire sets of complexes), thus supporting them in the initial phase of structure-based studies. In this respect, the web server can be a valuable tool, as users are allowed to process multiple structures, quickly switch between results, and interact with generated visualizations in an intuitive manner. The SequenceCEROSENE web server is available at https://biosciences.hs-mittweida.de/seqcerosene.
EGVII endoglucanase and nucleic acids encoding the same

DOEpatents

Dunn-Coleman, Nigel; Goedegebuur, Frits; Ward, Michael; Yao, Jian

2014-02-25

The present invention provides a novel endoglucanase nucleic acid sequence, designated egl7, and the corresponding EGVII amino acid sequence. The invention also provides expression vectors and host cells comprising a nucleic acid sequence encoding EGVII, recombinant EGVII proteins and methods for producing the same.
EGVII endoglucanase and nucleic acids encoding the same

DOEpatents

Dunn-Coleman, Nigel; Goedegebuur, Frits; Ward, Michael; Yao, Jian

2006-05-16

The present invention provides a novel endoglucanase nucleic acid sequence, designated egl7, and the corresponding EGVII amino acid sequence. The invention also provides expression vectors and host cells comprising a nucleic acid sequence encoding EGVII, recombinant EGVII proteins and methods for producing the same.
EGVI endoglucanase and nucleic acids encoding the same

DOEpatents

Dunn-Coleman, Nigel [Los Gatos, CA; Goedegebuur, Frits [Vlaardingen, NL; Ward, Michael [San Francisco, CA; Yao, Jian [Sunnyvale, CA]

2008-04-01

The present invention provides a novel endoglucanase nucleic acid sequence, designated egl6, and the corresponding EGVI amino acid sequence. The invention also provides expression vectors and host cells comprising a nucleic acid sequence encoding EGVI, recombinant EGVI proteins and methods for producing the same.
EGVI endoglucanase and nucleic acids encoding the same

DOEpatents

Dunn-Coleman, Nigel; Goedegebuur, Frits; Ward, Michael; Yao, Jian

2010-10-12

The present invention provides a novel endoglucanase nucleic acid sequence, designated egl6, and the corresponding EGVI amino acid sequence. The invention also provides expression vectors and host cells comprising a nucleic acid sequence encoding EGVI, recombinant EGVI proteins and methods for producing the same.
EGVIII endoglucanase and nucleic acids encoding the same

DOEpatents

Dunn-Coleman, Nigel; Goedegebuur, Frits; Ward, Michael; Yao, Jian

2006-05-23

The present invention provides a novel endoglucanase nucleic acid sequence, designated egl8, and the corresponding EGVIII amino acid sequence. The invention also provides expression vectors and host cells comprising a nucleic acid sequence encoding EGVIII, recombinant EGVIII proteins and methods for producing the same.
EGVI endoglucanase and nucleic acids encoding the same

DOEpatents

Dunn-Coleman, Nigel; Goedegebuur, Frits; Ward, Michael; Yao, Jian

2010-10-05

The present invention provides a novel endoglucanase nucleic acid sequence, designated egl6, and the corresponding EGVI amino acid sequence. The invention also provides expression vectors and host cells comprising a nucleic acid sequence encoding EGVI, recombinant EGVI proteins and methods for producing the same.
EGVI endoglucanase and nucleic acids encoding the same

DOEpatents

Dunn-Coleman, Nigel; Goedegebuur, Frits; Ward, Michael; Yao, Jian

2006-06-06

The present invention provides a novel endoglucanase nucleic acid sequence, designated egl6, and the corresponding EGVI amino acid sequence. The invention also provides expression vectors and host cells comprising a nucleic acid sequence encoding EGVI, recombinant EGVI proteins and methods for producing the same.
EGVII endoglucanase and nucleic acids encoding the same

DOEpatents

Dunn-Coleman, Nigel [Los Gatos, CA; Goedegebuur, Frits [Vlaardingen, NL; Ward, Michael [San Francisco, CA; Yao, Jian [Sunnyvale, CA]

2009-05-05

The present invention provides an endoglucanase nucleic acid sequence, designated egl7, and the corresponding EGVII amino acid sequence. The invention also provides expression vectors and host cells comprising a nucleic acid sequence encoding EGVII, recombinant EGVII proteins and methods for producing the same.
EGVII endoglucanase and nucleic acids encoding the same

DOEpatents

Dunn-Coleman, Nigel; Goedegebuur, Frits; Ward, Michael; Yao, Jian

2013-07-16

The present invention provides a novel endoglucanase nucleic acid sequence, designated egl7, and the corresponding EGVII amino acid sequence. The invention also provides expression vectors and host cells comprising a nucleic acid sequence encoding EGVII, recombinant EGVII proteins and methods for producing the same.
EGVII endoglucanase and nucleic acids encoding the same

DOEpatents

Dunn-Coleman, Nigel [Los Gatos, CA; Goedegebuur, Frits [Vlaardingen, NL; Ward, Michael [San Francisco, CA; Yao, Jian [Sunnyvale, CA]

2012-02-14

The present invention provides a novel endoglucanase nucleic acid sequence, designated egl7, and the corresponding EGVII amino acid sequence. The invention also provides expression vectors and host cells comprising a nucleic acid sequence encoding EGVII, recombinant EGVII proteins and methods for producing the same.
EGVII endoglucanase and nucleic acids encoding the same

DOEpatents

Dunn-Coleman, Nigel; Goedegebuur, Frits; Ward, Michael; Yao, Jian

2015-04-14

The present invention provides a novel endoglucanase nucleic acid sequence, designated egl7, and the corresponding EGVII amino acid sequence. The invention also provides expression vectors and host cells comprising a nucleic acid sequence encoding EGVII, recombinant EGVII proteins and methods for producing the same.
Kit for detecting nucleic acid sequences using competitive hybridization probes

DOEpatents

Lucas, Joe N.; Straume, Tore; Bogen, Kenneth T.

2001-01-01

A kit is provided for detecting a target nucleic acid sequence in a sample, the kit comprising: a first hybridization probe which includes a nucleic acid sequence that is sufficiently complementary to selectively hybridize to a first portion of the target sequence, the first hybridization probe including a first complexing agent for forming a binding pair with a second complexing agent; and a second hybridization probe which includes a nucleic acid sequence that is sufficiently complementary to selectively hybridize to a second portion of the target sequence to which the first hybridization probe does not selectively hybridize, the second hybridization probe including a detectable marker; a third hybridization probe which includes a nucleic acid sequence that is sufficiently complementary to selectively hybridize to a first portion of the target sequence, the third hybridization probe including the same detectable marker as the second hybridization probe; and a fourth hybridization probe which includes a nucleic acid sequence that is sufficiently complementary to selectively hybridize to a second portion of the target sequence to which the third hybridization probe does not selectively hybridize, the fourth hybridization probe including the first complexing agent for forming a binding pair with the second complexing agent; wherein the first and second hybridization probes are capable of simultaneously hybridizing to the target sequence and the third and fourth hybridization probes are capable of simultaneously hybridizing to the target sequence, the detectable marker is not present on the first or fourth hybridization probes and the first, second, third, and fourth hybridization probes each include a competitive nucleic acid sequence which is sufficiently complementary to a third portion of the target sequence that the competitive sequences of the first, second, third, and fourth hybridization probes compete with each other to hybridize to the third portion of the target sequence.
Characterization of Austrian koi herpesvirus samples based on the ORF40 region.

PubMed

Marek, A; Schachner, O; Bilic, I; Hess, M

2010-02-17

Using a PCR that amplifies a region of the thymidine kinase (TK) gene, an epidemic spread of koi herpesvirus (KHV) was determined in koi carps in Austria in 2007. A total of 15 virus samples from different locations in Austria were analyzed to determine their genetic relatedness following PCR and nucleic acid sequencing of the open reading frame 40 (ORF40) region of the KHV genome. ORF40-specific PCR amplification products that were obtained from tissue samples shared 100% nucleotide sequence identity with the published sequence of the Japanese strain of KHV. The ORF40 sequence of one isolate from the UK that was included in the present study was 100% identical with the published sequence of an Israeli strain of KHV. This is the first study that used a larger number of samples and a PCR method, which allowed distinguishing all 3 strains of KHV. The present investigation provides information on the epidemiology of KHV infections in Europe and describes a useful molecular tool for epidemiological studies.
New partial sequences of phosphoenolpyruvate carboxylase as molecular phylogenetic markers.

PubMed

Gehrig, H; Heute, V; Kluge, M

2001-08-01

To better understand the evolution of the enzyme phosphoenolpyruvate carboxylase (PEPC) and to test its versatility as a molecular character in phylogenetic and taxonomic studies, we have characterized and compared 70 new partial PEPC nucleotide and amino acid sequences (about 1100 bp of the 3' side of the gene) from 50 plant species (24 species of Bryophyta, 1 of Pteridophyta, and 25 of Spermatophyta). Together with previously published data, the new set of sequences allowed us to construct the most complete phylogenetic tree of PEPC to date, where the PEPC sequences cluster according to both the taxonomic positions of the donor plants and the assumed specific function of the PEPC isoforms. Altogether, the study further strengthens the view that PEPC sequences can provide interesting information for the reconstruction of phylogenetic relations between organisms and metabolic pathways. To avoid confusion in future discussion, we propose a new nomenclature for the denotation of PEPC isoforms. Copyright 2001 Academic Press.
Identification of Novel Growth Regulators in Plant Populations Expressing Random Peptides

PubMed Central

Bao, Zhilong; Clancy, Maureen A.

2017-01-01

The use of chemical genomics approaches allows the identification of small molecules that integrate into biological systems, thereby changing discrete processes that influence growth, development, or metabolism. Libraries of chemicals are applied to living systems, and changes in phenotype are observed, potentially leading to the identification of new growth regulators. This work describes an approach that is the nexus of chemical genomics and synthetic biology. Here, each plant in an extensive population synthesizes a unique small peptide arising from a transgene composed of a randomized nucleic acid sequence core flanked by translational start, stop, and cysteine-encoding (for disulfide cyclization) sequences. Ten and 16 amino acid sequences, bearing a core of six and 12 random amino acids, have been synthesized in Arabidopsis (Arabidopsis thaliana) plants. Populations were screened for phenotypes from the seedling stage through senescence. Dozens of phenotypes were observed in over 2,000 plants analyzed. Ten conspicuous phenotypes were verified through separate transformation and analysis of multiple independent lines. The results indicate that these populations contain sequences that often influence discrete aspects of plant biology. Novel peptides that affect photosynthesis, flowering, and red light response are described. The challenge now is to identify the mechanistic integrations of these peptides into biochemical processes. These populations serve as a new tool to identify small molecules that modulate discrete plant functions that could be produced later in transgenic plants or potentially applied exogenously to impart their effects. These findings could usher in a new generation of agricultural growth regulators, herbicides, or defense compounds. PMID:28807931
Chip-based sequencing nucleic acids

DOEpatents

Beer, Neil Reginald

2014-08-26

A system for fast DNA sequencing by amplification of genetic material within microreactors, denaturing, demulsifying, and then sequencing the material, while retaining it in a PCR/sequencing zone by a magnetic field. One embodiment includes sequencing nucleic acids on a microchip that includes a microchannel flow channel in the microchip. The nucleic acids are isolated and hybridized to magnetic nanoparticles or to magnetic polystyrene-coated beads. Microreactor droplets are formed in the microchannel flow channel. The microreactor droplets containing the nucleic acids and the magnetic nanoparticles are retained in a magnetic trap in the microchannel flow channel and sequenced.

"De-novo" amino acid sequence elucidation of protein G'e by combined "top-down" and "bottom-up" mass spectrometry.

PubMed

Yefremova, Yelena; Al-Majdoub, Mahmoud; Opuni, Kwabena F M; Koy, Cornelia; Cui, Weidong; Yan, Yuetian; Gross, Michael L; Glocker, Michael O

2015-03-01

Mass spectrometric de-novo sequencing was applied to review the amino acid sequence of a commercially available recombinant protein G' with great scientific and economic importance. Substantial deviations from the published amino acid sequence (Uniprot Q54181) were found: 46 additional amino acids at the N-terminus, including a so-called "His-tag", as well as an N-terminal partial α-N-gluconoylation and α-N-phosphogluconoylation, respectively. The unexpected amino acid sequence of the commercial protein G' comprised 241 amino acids and resulted in a molecular mass of 25,998.9 ± 0.2 Da for the unmodified protein. Due to the higher mass that is caused by its extended amino acid sequence compared with the original protein G' (185 amino acids), we named this protein "protein G'e." By means of mass spectrometric peptide mapping, the suggested amino acid sequence, as well as the N-terminal partial α-N-gluconoylations, was confirmed with 100% sequence coverage. After the protein G'e sequence was determined, we were able to identify the expression vector pET-28b from Novagen, with the Xho I restriction enzyme cleavage site, as the most likely vector used for cloning and expressing the recombinant protein G'e in E. coli. A dissociation constant (K(d)) value of 9.4 nM for protein G'e was determined thermophoretically, showing that the N-terminal flanking sequence extension did not cause significant changes in the binding affinity to immunoglobulins.
Comparative characterization of random-sequence proteins consisting of 5, 12, and 20 kinds of amino acids

PubMed Central

Tanaka, Junko; Doi, Nobuhide; Takashima, Hideaki; Yanagawa, Hiroshi

2010-01-01

Screening of functional proteins from a random-sequence library has been used to evolve novel proteins in the field of evolutionary protein engineering. However, random-sequence proteins consisting of the 20 natural amino acids tend to aggregate, and the occurrence rate of functional proteins in a random-sequence library is low. From the viewpoint of the origin of life, it has been proposed that primordial proteins consisted of a limited set of amino acids that could have been abundantly formed early during chemical evolution. We have previously found that members of a random-sequence protein library constructed with five primitive amino acids show high solubility (Doi et al., Protein Eng Des Sel 2005;18:279–284). Although such a library is expected to be appropriate for finding functional proteins, the functionality may be limited, because they have no positively charged amino acid. Here, we constructed three libraries of 120-amino acid, random-sequence proteins using alphabets of 5, 12, and 20 amino acids by preselection using mRNA display (to eliminate sequences containing stop codons and frameshifts) and characterized and compared the structural properties of random-sequence proteins arbitrarily chosen from these libraries. We found that random-sequence proteins constructed with the 12-member alphabet (including five primitive amino acids and positively charged amino acids) have higher solubility than those constructed with the 20-member alphabet, though other biophysical properties are very similar in the two libraries. Thus, a library of moderate complexity constructed from 12 amino acids may be a more appropriate resource for functional screening than one constructed from 20 amino acids. PMID:20162614
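The preselection step mentioned in the abstract above can be motivated with a small calculation. The sketch below is an illustration only, under the assumption (not stated in the abstract) that every codon is drawn uniformly from all 64 triplets; real libraries are usually built with biased or restricted codon mixtures, so the actual figure would differ. The function name is hypothetical.

```python
# Illustrative assumption: each codon is uniformly random over all 64 triplets,
# 3 of which (TAA, TAG, TGA) are stop codons in the standard genetic code.
STOP_CODONS = 3
TOTAL_CODONS = 64


def stop_free_probability(n_codons):
    """Probability that n_codons uniformly random codons contain no stop codon."""
    sense_fraction = (TOTAL_CODONS - STOP_CODONS) / TOTAL_CODONS
    return sense_fraction ** n_codons


if __name__ == "__main__":
    # 120 codons corresponds to the 120-amino acid proteins in the abstract.
    p = stop_free_probability(120)
    print(f"Fraction of stop-free 120-codon sequences: {p:.4f}")
```

Under this simplified model only a small fraction of fully random 120-codon sequences would encode a full-length protein, which is consistent with the need to remove stop-containing sequences before characterization.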
DOE Office of Scientific and Technical Information (OSTI.GOV)

Reiser, Steven E.; Somerville, Chris R.

The present invention relates to bacterial enzymes, in particular to an acyl-CoA reductase and a gene encoding an acyl-CoA reductase, the amino acid and nucleic acid sequences corresponding to the reductase polypeptide and gene, respectively, and to methods of obtaining such enzymes, amino acid sequences and nucleic acid sequences. The invention also relates to the use of such sequences to provide transgenic host cells capable of producing fatty alcohols and fatty aldehydes.
In-Gel Determination of L-Amino Acid Oxidase Activity Based on the Visualization of Prussian Blue-Forming Reaction

PubMed Central

Zhou, Ning; Zhao, Chuntian

2013-01-01

L-amino acid oxidase (LAAO) is attracting increasing attention due to its important functions. Diverse detection methods with their own properties have been developed for characterization of LAAO. In the present study, a simple, rapid, sensitive, cost-effective and reproducible method for quantitative in-gel determination of LAAO activity based on the visualization of Prussian blue-forming reaction is described. Coupled with SDS-PAGE, this Prussian blue agar assay can be directly used to determine the numbers and approximate molecular weights of LAAO in one step, allowing straightforward application for purification and sequence identification of LAAO from diverse samples. PMID:23383337
BGL7 beta-glucosidase and nucleic acids encoding the same

DOEpatents

Dunn-Coleman, Nigel; Ward, Michael

2013-01-29

The present invention provides a novel β-glucosidase nucleic acid sequence, designated bgl7, and the corresponding BGL7 amino acid sequence. The invention also provides expression vectors and host cells comprising a nucleic acid sequence encoding BGL7, recombinant BGL7 proteins and methods for producing the same.
BGL6 beta-glucosidase and nucleic acids encoding the same

DOEpatents

Dunn-Coleman, Nigel; Ward, Michael

2012-10-02

The present invention provides a novel β-glucosidase nucleic acid sequence, designated bgl6, and the corresponding BGL6 amino acid sequence. The invention also provides expression vectors and host cells comprising a nucleic acid sequence encoding BGL6, recombinant BGL6 proteins and methods for producing the same.
BGL5 beta-glucosidase and nucleic acids encoding the same

DOEpatents

Dunn-Coleman, Nigel; Goedegebuur, Frits; Ward, Michael; Yao, Jian

2006-02-28

The present invention provides a novel β-glucosidase nucleic acid sequence, designated bgl5, and the corresponding BGL5 amino acid sequence. The invention also provides expression vectors and host cells comprising a nucleic acid sequence encoding BGL5, recombinant BGL5 proteins and methods for producing the same.
BGL5 beta-glucosidase and nucleic acids encoding the same

DOEpatents

Dunn-Coleman, Nigel [Los Gatos, CA]; Goedegebuur, Frits [Vlaardingen, NL]; Ward, Michael [San Francisco, CA]; Yao, Jian [Sunnyvale, CA]

2008-03-18

The present invention provides a novel β-glucosidase nucleic acid sequence, designated bgl5, and the corresponding BGL5 amino acid sequence. The invention also provides expression vectors and host cells comprising a nucleic acid sequence encoding BGL5, recombinant BGL5 proteins and methods for producing the same.
BGL6 beta-glucosidase and nucleic acids encoding the same

DOE Office of Scientific and Technical Information (OSTI.GOV)

Dunn-Coleman, Nigel; Ward, Michael

The present invention provides a novel β-glucosidase nucleic acid sequence, designated bgl6, and the corresponding BGL6 amino acid sequence. The invention also provides expression vectors and host cells comprising a nucleic acid sequence encoding BGL6, recombinant BGL6 proteins and methods for producing the same.
BGL6 beta-glucosidase and nucleic acids encoding the same

DOEpatents

Dunn-Coleman, Nigel; Ward, Michael

2014-03-04

The present invention provides a novel β-glucosidase nucleic acid sequence, designated bgl6, and the corresponding BGL6 amino acid sequence. The invention also provides expression vectors and host cells comprising a nucleic acid sequence encoding BGL6, recombinant BGL6 proteins and methods for producing the same.
BGL7 beta-glucosidase and nucleic acids encoding the same

DOEpatents

Dunn-Coleman, Nigel; Ward, Michael

2015-04-14

The present invention provides a novel β-glucosidase nucleic acid sequence, designated bgl7, and the corresponding BGL7 amino acid sequence. The invention also provides expression vectors and host cells comprising a nucleic acid sequence encoding BGL7, recombinant BGL7 proteins and methods for producing the same.
BGL7 beta-glucosidase and nucleic acids encoding the same

DOEpatents

Dunn-Coleman, Nigel; Ward, Michael

2014-03-25

The present invention provides a novel β-glucosidase nucleic acid sequence, designated bgl7, and the corresponding BGL7 amino acid sequence. The invention also provides expression vectors and host cells comprising a nucleic acid sequence encoding BGL7, recombinant BGL7 proteins and methods for producing the same.
BGL6 beta-glucosidase and nucleic acids encoding the same

DOEpatents

Dunn-Coleman, Nigel; Ward, Michael

2015-08-11

The present invention provides a novel β-glucosidase nucleic acid sequence, designated bgl6, and the corresponding BGL6 amino acid sequence. The invention also provides expression vectors and host cells comprising a nucleic acid sequence encoding BGL6, recombinant BGL6 proteins and methods for producing the same.
BGL3 beta-glucosidase and nucleic acids encoding the same

DOEpatents

Dunn-Coleman, Nigel; Goedegebuur, Frits; Ward, Michael; Yao, Jian

2007-09-25

The present invention provides a novel β-glucosidase nucleic acid sequence, designated bgl3, and the corresponding BGL3 amino acid sequence. The invention also provides expression vectors and host cells comprising a nucleic acid sequence encoding BGL3, recombinant BGL3 proteins and methods for producing the same.
BGL3 beta-glucosidase and nucleic acids encoding the same

DOEpatents

Dunn-Coleman, Nigel [Los Gatos, CA]; Goedegebuur, Frits [Vlaardingen, NL]; Ward, Michael [San Francisco, CA]; Yao, Jian [Sunnyvale, CA]

2008-04-01

The present invention provides a novel β-glucosidase nucleic acid sequence, designated bgl3, and the corresponding BGL3 amino acid sequence. The invention also provides expression vectors and host cells comprising a nucleic acid sequence encoding BGL3, recombinant BGL3 proteins and methods for producing the same.
BGL4 beta-glucosidase and nucleic acids encoding the same

DOEpatents

Dunn-Coleman, Nigel [Los Gatos, CA]; Goedegebuur, Frits [Vlaardingen, NL]; Ward, Michael [San Francisco, CA]; Yao, Jian [Sunnyvale, CA]

2011-12-06

The present invention provides a novel β-glucosidase nucleic acid sequence, designated bgl4, and the corresponding BGL4 amino acid sequence. The invention also provides expression vectors and host cells comprising a nucleic acid sequence encoding BGL4, recombinant BGL4 proteins and methods for producing the same.
BGL4 beta-glucosidase and nucleic acids encoding the same

DOEpatents

Dunn-Coleman, Nigel; Goedegebuur, Frits; Ward, Michael; Yao, Jian

2006-05-16

The present invention provides a novel β-glucosidase nucleic acid sequence, designated bgl4, and the corresponding BGL4 amino acid sequence. The invention also provides expression vectors and host cells comprising a nucleic acid sequence encoding BGL4, recombinant BGL4 proteins and methods for producing the same.
BGL3 beta-glucosidase and nucleic acids encoding the same

DOEpatents

Dunn-Coleman, Nigel [Los Gatos, CA]; Goedegebuur, Frits [Vlaardingen, NL]; Ward, Michael [San Francisco, CA]; Yao, Jian [Sunnyvale, CA]

2011-06-14

The present invention provides a novel β-glucosidase nucleic acid sequence, designated bgl3, and the corresponding BGL3 amino acid sequence. The invention also provides expression vectors and host cells comprising a nucleic acid sequence encoding BGL3, recombinant BGL3 proteins and methods for producing the same.
BGL6 beta-glucosidase and nucleic acids encoding the same

DOEpatents

Dunn-Coleman, Nigel [Los Gatos, CA]; Ward, Michael [San Francisco, CA]

2009-09-01

The present invention provides a novel β-glucosidase nucleic acid sequence, designated bgl6, and the corresponding BGL6 amino acid sequence. The invention also provides expression vectors and host cells comprising a nucleic acid sequence encoding BGL6, recombinant BGL6 proteins and methods for producing the same.
BGL3 beta-glucosidase and nucleic acids encoding the same

DOEpatents

Dunn-Coleman, Nigel; Goedegebuur, Frits; Ward, Michael; Yao, Jian

2012-10-30

The present invention provides a novel β-glucosidase nucleic acid sequence, designated bgl3, and the corresponding BGL3 amino acid sequence. The invention also provides expression vectors and host cells comprising a nucleic acid sequence encoding BGL3, recombinant BGL3 proteins and methods for producing the same.

BGL4 beta-glucosidase and nucleic acids encoding the same

DOEpatents

Dunn-Coleman, Nigel [Los Gatos, CA]; Goedegebuur, Frits [Vlaardingen, NL]; Ward, Michael [San Francisco, CA]; Yao, Jian [Sunnyvale, CA]

2008-01-22

The present invention provides a novel β-glucosidase nucleic acid sequence, designated bgl4, and the corresponding BGL4 amino acid sequence. The invention also provides expression vectors and host cells comprising a nucleic acid sequence encoding BGL4, recombinant BGL4 proteins and methods for producing the same.
The Biomolecule Sequencer Project: Nanopore Sequencing as a Dual-Use Tool for Crew Health and Astrobiology Investigations

NASA Technical Reports Server (NTRS)

John, K. K.; Botkin, D. S.; Burton, A. S.; Castro-Wallace, S. L.; Chaput, J. D.; Dworkin, J. P.; Lehman, N.; Lupisella, M. L.; Mason, C. E.; Smith, D. J.;

2016-01-01

Human missions to Mars will fundamentally transform how the planet is explored, enabling new scientific discoveries through more sophisticated sample acquisition and processing than can currently be implemented in robotic exploration. The presence of humans also poses new challenges, including ensuring astronaut safety and health and monitoring contamination. Because the capability to transfer materials to Earth will be extremely limited, there is a strong need for in situ diagnostic capabilities. Nucleotide sequencing is a particularly powerful tool because it can be used to: (1) mitigate microbial risks to crew by allowing identification of microbes in water, in air, and on surfaces; (2) identify optimal treatment strategies for infections that arise in crew members; and (3) track how crew members, microbes, and mission-relevant organisms (e.g., farmed plants) respond to conditions on Mars through transcriptomic and genomic changes. Sequencing would also offer benefits for science investigations occurring on the surface of Mars by permitting identification of Earth-derived contamination in samples. If Mars contains indigenous life, and that life is based on nucleic acids or other closely related molecules, sequencing would serve as a critical tool for the characterization of those molecules. Therefore, spaceflight-compatible nucleic acid sequencing would be an important capability for both crew health and astrobiology exploration. Advances in sequencing technology on Earth have been driven largely by needs for higher throughput and read accuracy. Although some reduction in size has been achieved, nearly all commercially available sequencers are not compatible with spaceflight due to size, power, and operational requirements. Exceptions are nanopore-based sequencers that measure changes in current caused by DNA passing through pores; these devices are inherently much smaller and require significantly less power than sequencers using other detection methods. 
Consequently, nanopore-based sequencers could be made flight-ready with only minimal modifications.

Toward a solid-phase nucleic acid hybridization assay within microfluidic channels using immobilized quantum dots as donors in fluorescence resonance energy transfer.

PubMed

Chen, Lu; Algar, W Russ; Tavares, Anthony J; Krull, Ulrich J

2011-01-01

The optical properties and surface area of quantum dots (QDs) have made them an attractive platform for the development of nucleic acid biosensors based on fluorescence resonance energy transfer (FRET). Solid-phase assays based on FRET using mixtures of immobilized QD-oligonucleotide conjugates (QD biosensors) have been developed. The typical challenges associated with solid-phase detection strategies include non-specific adsorption, slow kinetics of hybridization, and sample manipulation. The new work herein has considered the immobilization of QD biosensors onto the surfaces of microfluidic channels in order to address these challenges. Microfluidic flow can be used to dynamically control stringency by adjustment of the potential in an electrokinetic-based microfluidics environment. The shearing force, Joule heating, and the competition between electroosmotic and electrophoretic mobilities allow the optimization of hybridization conditions, convective delivery of target to the channel surface to speed hybridization, amelioration of adsorption, and regeneration of the sensing surface. Microfluidic flow can also be used to deliver (for immobilization) and remove QD biosensors. QDs that were conjugated with two different oligonucleotide sequences were used to demonstrate feasibility. One oligonucleotide sequence on the QD was available as a linker for immobilization via hybridization with complementary oligonucleotides located on a glass surface within a microfluidic channel. A second oligonucleotide sequence on the QD served as a probe to transduce hybridization with target nucleic acid in a sample solution. A Cy3 label on the target was excited by FRET using green-emitting CdSe/ZnS QD donors and provided an analytical signal to explore this detection strategy. 
The immobilized QDs could be removed under denaturing conditions by disrupting the duplex that was used as the surface linker and thus allowed a new layer of QD biosensors to be re-coated within the channel for re-use of the microfluidic chip.
Methods and compositions for efficient nucleic acid sequencing

DOEpatents

Drmanac, Radoje

2006-07-04

Disclosed are novel methods and compositions for rapid and highly efficient nucleic acid sequencing based upon hybridization with two sets of small oligonucleotide probes of known sequences. Extremely large nucleic acid molecules, including chromosomes and non-amplified RNA, may be sequenced without prior cloning or subcloning steps. The methods of the invention also solve various current problems associated with sequencing technology such as, for example, high noise to signal ratios and difficult discrimination, attaching many nucleic acid fragments to a surface, preparing many, longer or more complex probes and labelling more species.
Methods and compositions for efficient nucleic acid sequencing

DOEpatents

Drmanac, Radoje

2002-01-01

Disclosed are novel methods and compositions for rapid and highly efficient nucleic acid sequencing based upon hybridization with two sets of small oligonucleotide probes of known sequences. Extremely large nucleic acid molecules, including chromosomes and non-amplified RNA, may be sequenced without prior cloning or subcloning steps. The methods of the invention also solve various current problems associated with sequencing technology such as, for example, high noise to signal ratios and difficult discrimination, attaching many nucleic acid fragments to a surface, preparing many, longer or more complex probes and labelling more species.
Hybridization and sequencing of nucleic acids using base pair mismatches

DOEpatents

Fodor, Stephen P. A.; Lipshutz, Robert J.; Huang, Xiaohua

2001-01-01

Devices and techniques for hybridization of nucleic acids and for determining the sequence of nucleic acids. Arrays of nucleic acids are formed by techniques, preferably high resolution, light-directed techniques. Positions of hybridization of a target nucleic acid are determined by, e.g., epifluorescence microscopy. Devices and techniques are proposed to determine the sequence of a target nucleic acid more efficiently and more quickly through such synthesis and detection techniques.
Inducible Alkylation of DNA by a Quinone Methide-Peptide Nucleic Acid Conjugate

PubMed Central

Liu, Yang; Rokita, Steven E.

2012-01-01

The reversibility of alkylation by a quinone methide intermediate (QM) avoids the irreversible consumption that plagues most reagents based on covalent chemistry and allows for site-specific reaction that is controlled by the thermodynamics rather than kinetics of target association. This characteristic was originally examined with an oligonucleotide QM conjugate, but broad application depends on alternative derivatives that are compatible with a cellular environment. Now, a peptide nucleic acid (PNA) derivative has been constructed and shown to exhibit an equivalent ability to deliver the reactive QM in a controlled manner. This new conjugate demonstrates high selectivity for a complementary sequence of DNA even when challenged with an alternative sequence containing a single T/T mismatch. Alkylation of non-complementary sequences is only possible when a template strand is present to co-localize the conjugate and its target. For efficient alkylation in this example, a single-stranded region of the target is required adjacent to the QM conjugate. Most importantly, the intrastrand self adducts formed between the PNA and its attached QM remained active and reversible over more than eight days in aqueous solution prior to reaction with a chosen target added subsequently. PMID:22243337
Development of a monoclonal antibody to immuno-cytochemical analysis of the cellular localization of the peripheral benzodiazepine receptor

DOE Office of Scientific and Technical Information (OSTI.GOV)

Dussossoy, D.; Carayon, P.; Feraut, D.

1996-05-01

Based on the amino acid sequence deduced from the cloned human peripheral benzodiazepine receptor (PBR) gene, a monoclonal antibody (Mab 8D7) was produced against the C-terminal fragment of the receptor. Immunoblot experiments, performed against purified PBR, indicated that the antipeptide antibody recognized, under denaturing conditions, the corresponding amino acid sequence of the PBR. When mitochondrial membranes from PBR-transfected yeast or from THP1 and U937 cells were used in immunoblot analysis, a high level of immunoreactivity was observed at 18 kDa, the PBR molecular mass deduced from cDNA, establishing the specificity of the antibody for the receptor. Moreover, binding experiments performed with intact mitochondria demonstrated that the immunogenic sequence was accessible to the antibody, indicating that the C-terminal fragment of the PBR faces the cytosol. Using this Mab we developed a technique which allowed precise quantification of PBR density per cell. Furthermore, cellular localization studies by flow cytometric analysis and confocal microscopy on cell lines displaying different levels of PBR showed that Mab 8D7 was entirely colocalized with an antimitochondria Mab. 34 refs., 7 figs.
PepLine: a software pipeline for high-throughput direct mapping of tandem mass spectrometry data on genomic sequences.

PubMed

Ferro, Myriam; Tardif, Marianne; Reguer, Erwan; Cahuzac, Romain; Bruley, Christophe; Vermat, Thierry; Nugues, Estelle; Vigouroux, Marielle; Vandenbrouck, Yves; Garin, Jérôme; Viari, Alain

2008-05-01

PepLine is a fully automated software pipeline that maps MS/MS fragmentation spectra of tryptic peptides to genomic DNA sequences. The approach is based on Peptide Sequence Tags (PSTs) obtained from partial interpretation of QTOF MS/MS spectra (first module). PSTs are then mapped onto the six-frame translations of genomic sequences (second module), giving hits. Hits are then clustered to detect potential coding regions (third module). Our work aimed at optimizing the algorithms of each component to allow the whole pipeline to proceed in a fully automated manner using raw nucleic acid sequences (i.e., genomes that have not been "reduced" to a database of ORFs or putative exon sequences). The whole pipeline was tested on controlled MS/MS spectra sets from standard proteins and from Arabidopsis thaliana envelope chloroplast samples. Our results demonstrate that PepLine is competitive with protein database search software and fast enough to potentially tackle large data sets and/or large genomes. We also illustrate the potential of this approach for the detection of the intron/exon structure of genes.
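The second module described above, mapping tags onto six-frame translations, can be sketched roughly as follows. This is not PepLine's implementation: it is a minimal illustration that assumes the standard genetic code and treats a tag as a plain amino acid string matched exactly, ignoring the flanking masses that real Peptide Sequence Tags carry. All function names are hypothetical.

```python
from itertools import product

# Standard genetic code in the conventional TCAG ordering ("*" marks a stop codon).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate complete codons only; unknown codons (e.g. with N) become 'X'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translations(dna):
    """Map (strand, offset) to the protein string for each of the six frames."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[(strand, offset)] = translate(seq[offset:])
    return frames


def map_tag(tag, dna):
    """Return (strand, offset, protein_position) for every exact tag match."""
    hits = []
    for (strand, offset), protein in six_frame_translations(dna).items():
        pos = protein.find(tag)
        while pos != -1:
            hits.append((strand, offset, pos))
            pos = protein.find(tag, pos + 1)
    return hits
```

Calling map_tag("MAK", "ATGGCCAAGTGA") reports a single hit on the forward strand in frame 0. A clustering step, such as the third module in the abstract, would then group nearby hits into candidate coding regions.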
Sequence Similarity Presenter: a tool for the graphic display of similarities of long sequences for use in presentations.

PubMed

Fröhlich, K U

1994-04-01

A new method for the presentation of alignments of long sequences is described. The degree of identity for the aligned sequences is averaged for sections of a fixed number of residues. The resulting values are converted to shades of gray, with white corresponding to lack of identity and black corresponding to perfect identity. A sequence alignment is represented as a bar filled with varying shades of gray. The display is compact and allows for a fast and intuitive recognition of the distribution of regions with a high similarity. It is well suited for the presentation of alignments of long sequences, e.g. of protein superfamilies, in plenary lectures. The method is implemented as a HyperCard stack for Apple Macintosh computers. Several options for the modification of the output are available (e.g. background reduction, size of the summation window, consideration of amino acid similarity, inclusion of graphic markers to indicate specific domains). The output is a PostScript file which can be printed, imported as EPS or processed further with Adobe Illustrator.
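The windowed averaging and gray-scale conversion described above can be illustrated with a short sketch. It is not the original HyperCard implementation: the function names, the default window size, the treatment of gap columns as mismatches, and the 0 to 255 gray scale are all assumptions made for illustration.

```python
def window_identity(seq_a, seq_b, window=10):
    """Average identity of two aligned sequences over fixed-size sections.

    Both inputs must be rows of the same alignment (equal length, '-' for gaps).
    Columns where the residues match and are not gaps count as identical.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    values = []
    for start in range(0, len(seq_a), window):
        pairs = list(zip(seq_a[start:start + window], seq_b[start:start + window]))
        matches = sum(1 for a, b in pairs if a == b and a != "-")
        values.append(matches / len(pairs))
    return values


def to_gray(identity):
    """Map identity in [0, 1] to an 8-bit gray level: 255 = white, 0 = black."""
    return round(255 * (1.0 - identity))


def alignment_bar(seq_a, seq_b, window=10):
    """Gray levels for each section, i.e. the shaded bar described in the text."""
    return [to_gray(v) for v in window_identity(seq_a, seq_b, window)]
```

Each returned value corresponds to one segment of the bar, so a long alignment reduces to a short list of gray levels that can be drawn with any graphics library.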
Human jagged polypeptide, encoding nucleic acids and methods of use

DOEpatents

Li, Linheng; Hood, Leroy

2000-01-01

The present invention provides an isolated polypeptide exhibiting substantially the same amino acid sequence as JAGGED, or an active fragment thereof, provided that the polypeptide does not have the amino acid sequence of SEQ ID NO:5 or SEQ ID NO:6. The invention further provides an isolated nucleic acid molecule containing a nucleotide sequence encoding substantially the same amino acid sequence as JAGGED, or an active fragment thereof, provided that the nucleotide sequence does not encode the amino acid sequence of SEQ ID NO:5 or SEQ ID NO:6. Also provided herein is a method of inhibiting differentiation of hematopoietic progenitor cells by contacting the progenitor cells with an isolated JAGGED polypeptide, or active fragment thereof. The invention additionally provides a method of diagnosing Alagille Syndrome in an individual. The method consists of detecting an Alagille Syndrome disease-associated mutation linked to a JAGGED locus.
Structured oligonucleotides for target indexing to allow single-vessel PCR amplification and solid support microarray hybridization.

PubMed

Girard, Laurie D; Boissinot, Karel; Peytavi, Régis; Boissinot, Maurice; Bergeron, Michel G

2015-02-07

The combination of molecular diagnostic technologies is increasingly used to overcome limitations on sensitivity, specificity or multiplexing capabilities, and provide efficient lab-on-chip devices. Two such techniques, PCR amplification and microarray hybridization are used serially to take advantage of the high sensitivity and specificity of the former combined with high multiplexing capacities of the latter. These methods are usually performed in different buffers and reaction chambers. However, these elaborate methods have high complexity and cost related to reagent requirements, liquid storage and the number of reaction chambers to integrate into automated devices. Furthermore, microarray hybridizations have a sequence dependent efficiency not always predictable. In this work, we have developed the concept of a structured oligonucleotide probe which is activated by cleavage from polymerase exonuclease activity. This technology is called SCISSOHR for Structured Cleavage Induced Single-Stranded Oligonucleotide Hybridization Reaction. The SCISSOHR probes enable indexing the target sequence to a tag sequence. The SCISSOHR technology also allows the combination of nucleic acid amplification and microarray hybridization in a single vessel in presence of the PCR buffer only. The SCISSOHR technology uses an amplification probe that is irreversibly modified in presence of the target, releasing a single-stranded DNA tag for microarray hybridization. Each tag is composed of a 3-nucleotide sequence-dependent segment and a unique "target sequence-independent" 14-nucleotide segment allowing for optimal hybridization with minimal cross-hybridization. We evaluated the performance of five (5) PCR buffers to support microarray hybridization, compared to a conventional hybridization buffer. Finally, as a proof of concept, we developed a multiplexed assay for the amplification, detection, and identification of three (3) DNA targets. 
This new technology will facilitate the design of lab-on-chip microfluidic devices, while also reducing consumable costs. In the long term, it will allow the cost-effective automation of highly multiplexed assays for detection and identification of genetic targets.
Systematic Evaluation of the Dependence of Deoxyribozyme Catalysis on Random Region Length

PubMed Central

Velez, Tania E.; Singh, Jaydeep; Xiao, Ying; Allen, Emily C.; Wong, On Yi; Chandra, Madhavaiah; Kwon, Sarah C.; Silverman, Scott K.

2012-01-01

Functional nucleic acids are DNA and RNA aptamers that bind targets, or they are deoxyribozymes and ribozymes that have catalytic activity. These functional DNA and RNA sequences can be identified from random-sequence pools by in vitro selection, which requires choosing the length of the random region. Shorter random regions allow more complete coverage of sequence space but may not permit the structural complexity necessary for binding or catalysis. In contrast, longer random regions are sampled incompletely but may allow adoption of more complicated structures that enable function. In this study, we systematically examined random region length (N20 through N60) for two particular deoxyribozyme catalytic activities, DNA cleavage and tyrosine-RNA nucleopeptide linkage formation. For both activities, we previously identified deoxyribozymes using only N40 regions. In the case of DNA cleavage, here we found that shorter N20 and N30 regions allowed robust catalytic function, either by DNA hydrolysis or by DNA deglycosylation and strand scission via β-elimination, whereas longer N50 and N60 regions did not lead to catalytically active DNA sequences. Follow-up selections with N20, N30, and N40 regions revealed an interesting interplay of metal ion cofactors and random region length. Separately, for Tyr-RNA linkage formation, N30 and N60 regions provided catalytically active sequences, whereas N20 was unsuccessful, and the N40 deoxyribozymes were functionally superior (in terms of rate and yield) to N30 and N60. Collectively, the results indicate that with future in vitro selection experiments for DNA and RNA catalysts, and by extension for aptamers, random region length should be an important experimental variable. PMID:23088677
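The trade-off between random region length and coverage of sequence space can be made concrete with a short calculation. The sketch below assumes a four-letter alphabet, so an N-nucleotide random region has 4^N possible sequences, and uses a hypothetical pool of 10^14 molecules purely for illustration; the abstract does not state a pool size. The coverage figure is an upper bound that treats every molecule in the pool as a distinct sequence.

```python
def sequence_space(n):
    """Number of distinct sequences of length n over the alphabet A, C, G, T."""
    return 4 ** n


def max_coverage(n, pool_size):
    """Upper bound on the fraction of sequence space a pool can sample."""
    return min(1.0, pool_size / sequence_space(n))


if __name__ == "__main__":
    ASSUMED_POOL = 10 ** 14  # hypothetical pool size, not taken from the abstract
    for n in (20, 30, 40, 50, 60):
        print(f"N{n}: {sequence_space(n):.3e} sequences, "
              f"max coverage {max_coverage(n, ASSUMED_POOL):.3e}")
```

Under this assumption an N20 region could be sampled exhaustively, while N40 and longer regions are sampled only very sparsely, which matches the qualitative statement in the abstract.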
Polypeptide having or assisting in carbohydrate material degrading activity and uses thereof

DOEpatents

Schooneveld-Bergmans, Margot Elisabeth Francoise; Heijne, Wilbert Herman Marie; Los, Alrik Pieter

2016-02-16

The invention relates to a polypeptide which comprises the amino acid sequence set out in SEQ ID NO: 2 or an amino acid sequence encoded by the nucleotide sequence of SEQ ID NO: 1, or a variant polypeptide or variant polynucleotide thereof, wherein the variant polypeptide has at least 76% sequence identity with the sequence set out in SEQ ID NO: 2 or the variant polynucleotide encodes a polypeptide that has at least 76% sequence identity with the sequence set out in SEQ ID NO: 2. The invention features the full length coding sequence of the novel gene as well as the amino acid sequence of the full-length functional polypeptide and functional equivalents of the gene or the amino acid sequence. The invention also relates to methods for using the polypeptide in industrial processes. Also included in the invention are cells transformed with a polynucleotide according to the invention suitable for producing these proteins.
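The "at least N% sequence identity" thresholds in this and the following patent records depend on how identity is counted. As a minimal sketch, assuming two sequences that have already been aligned (with "-" marking gaps) and assuming that gapped columns are excluded from the denominator, identity could be computed as below. The abstracts do not state which convention the patents use, so both the function name and the gap handling are assumptions.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of identical residues over columns where neither sequence has a gap.

    Assumption: gapped columns are ignored. Other conventions divide by the
    full alignment length or by the length of the shorter sequence.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns in which both sequences have a residue.
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b)
             if a != gap and b != gap]
    if not pairs:
        return 0.0
    matches = sum(a == b for a, b in pairs)
    return 100.0 * matches / len(pairs)


if __name__ == "__main__":
    # Toy peptides, not sequences from any of the patents.
    print(percent_identity("MKT-LV", "MKSALV"))
```

A variant would then meet a threshold such as 76% if the returned value is at least 76.0 against the reference sequence.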
Polypeptide having beta-glucosidase activity and uses thereof

DOE Office of Scientific and Technical Information (OSTI.GOV)

Schoonneveld-Bergmans, Margot Elisabeth Francoise; Heijne, Wilbert Herman Marie; De Jong, Rene Marcel

The invention relates to a polypeptide comprising the amino acid sequence set out in SEQ ID NO: 2 or an amino acid sequence encoded by the nucleotide sequence of SEQ ID NO: 1, or a variant polypeptide or variant polynucleotide thereof, wherein the variant polypeptide has at least 96% sequence identity with the sequence set out in SEQ ID NO: 2 or the variant polynucleotide encodes a polypeptide that has at least 96% sequence identity with the sequence set out in SEQ ID NO: 2. The invention features the full length coding sequence of the novel gene as well as the amino acid sequence of the full-length functional polypeptide and functional equivalents of the gene or the amino acid sequence. The invention also relates to methods for using the polypeptide in industrial processes. Also included in the invention are cells transformed with a polynucleotide according to the invention suitable for producing these proteins.
Polypeptide having swollenin activity and uses thereof

DOEpatents

Schoonneveld-Bergmans, Margot Elizabeth Francoise; Heijne, Wilbert Herman Marie; Vlasie, Monica D; Damveld, Robbertus Antonius

2015-11-04

The invention relates to a polypeptide comprising the amino acid sequence set out in SEQ ID NO: 2 or an amino acid sequence encoded by the nucleotide sequence of SEQ ID NO: 1, or a variant polypeptide or variant polynucleotide thereof, wherein the variant polypeptide has at least 73% sequence identity with the sequence set out in SEQ ID NO: 2 or the variant polynucleotide encodes a polypeptide that has at least 73% sequence identity with the sequence set out in SEQ ID NO: 2. The invention features the full length coding sequence of the novel gene as well as the amino acid sequence of the full-length functional polypeptide and functional equivalents of the gene or the amino acid sequence. The invention also relates to methods for using the polypeptide in industrial processes. Also included in the invention are cells transformed with a polynucleotide according to the invention suitable for producing these proteins.
Polypeptide having beta-glucosidase activity and uses thereof

DOEpatents

Schooneveld-Bergmans, Margot Elisabeth Francoise; Heijne, Wilbert Herman Marie; De Jong, Rene Marcel; Damveld, Robbertus Antonius

2015-09-01

The invention relates to a polypeptide comprising the amino acid sequence set out in SEQ ID NO: 2 or an amino acid sequence encoded by the nucleotide sequence of SEQ ID NO: 1, or a variant polypeptide or variant polynucleotide thereof, wherein the variant polypeptide has at least 70% sequence identity with the sequence set out in SEQ ID NO: 2 or the variant polynucleotide encodes a polypeptide that has at least 70% sequence identity with the sequence set out in SEQ ID NO: 2. The invention features the full length coding sequence of the novel gene as well as the amino acid sequence of the full-length functional polypeptide and functional equivalents of the gene or the amino acid sequence. The invention also relates to methods for using the polypeptide in industrial processes. Also included in the invention are cells transformed with a polynucleotide according to the invention suitable for producing these proteins.
Polypeptide having cellobiohydrolase activity and uses thereof

DOEpatents

Sagt, Cornelis Maria Jacobus; Schooneveld-Bergmans, Margot Elisabeth Francoise; Roubos, Johannes Andries; Los, Alrik Pieter

2015-09-15

The invention relates to a polypeptide comprising the amino acid sequence set out in SEQ ID NO: 2 or an amino acid sequence encoded by the nucleotide sequence of SEQ ID NO: 1, or a variant polypeptide or variant polynucleotide thereof, wherein the variant polypeptide has at least 93% sequence identity with the sequence set out in SEQ ID NO: 2 or the variant polynucleotide encodes a polypeptide that has at least 93% sequence identity with the sequence set out in SEQ ID NO: 2. The invention features the full length coding sequence of the novel gene as well as the amino acid sequence of the full-length functional polypeptide and functional equivalents of the gene or the amino acid sequence. The invention also relates to methods for using the polypeptide in industrial processes. Also included in the invention are cells transformed with a polynucleotide according to the invention suitable for producing these proteins.
Polypeptide having acetyl xylan esterase activity and uses thereof

DOEpatents

Schoonneveld-Bergmans, Margot Elisabeth Francoise; Heijne, Wilbert Herman Marie; Los, Alrik Pieter

2015-10-20

The invention relates to a polypeptide comprising the amino acid sequence set out in SEQ ID NO: 2 or an amino acid sequence encoded by the nucleotide sequence of SEQ ID NO: 1, or a variant polypeptide or variant polynucleotide thereof, wherein the variant polypeptide has at least 82% sequence identity with the sequence set out in SEQ ID NO: 2 or the variant polynucleotide encodes a polypeptide that has at least 82% sequence identity with the sequence set out in SEQ ID NO: 2. The invention features the full length coding sequence of the novel gene as well as the amino acid sequence of the full-length functional polypeptide and functional equivalents of the gene or the amino acid sequence. The invention also relates to methods for using the polypeptide in industrial processes. Also included in the invention are cells transformed with a polynucleotide according to the invention suitable for producing these proteins.
Polypeptide having carbohydrate degrading activity and uses thereof

DOEpatents

Schooneveld-Bergmans, Margot Elisabeth Francoise; Heijne, Wilbert Herman Marie; Vlasie, Monica Diana; Damveld, Robbertus Antonius

2015-08-18

The invention relates to a polypeptide comprising the amino acid sequence set out in SEQ ID NO: 2 or an amino acid sequence encoded by the nucleotide sequence of SEQ ID NO: 1, or a variant polypeptide or variant polynucleotide thereof, wherein the variant polypeptide has at least 73% sequence identity with the sequence set out in SEQ ID NO: 2 or the variant polynucleotide encodes a polypeptide that has at least 73% sequence identity with the sequence set out in SEQ ID NO: 2. The invention features the full length coding sequence of the novel gene as well as the amino acid sequence of the full-length functional polypeptide and functional equivalents of the gene or the amino acid sequence. The invention also relates to methods for using the polypeptide in industrial processes. Also included in the invention are cells transformed with a polynucleotide according to the invention suitable for producing these proteins.

37 CFR 1.821 - Nucleotide and/or amino acid sequence disclosures in patent applications.

Code of Federal Regulations, 2010 CFR

2010-07-01

... 37 Patents, Trademarks, and Copyrights 1 2010-07-01 2010-07-01 false Nucleotide and/or amino acid... Biotechnology Invention Disclosures Application Disclosures Containing Nucleotide And/or Amino Acid Sequences § 1.821 Nucleotide and/or amino acid sequence disclosures in patent applications. (a) Nucleotide and...
37 CFR 5.31-5.33 - [Reserved]

Code of Federal Regulations, 2011 CFR

2011-07-01

... from abandonment 1.135 Amino Acid Sequences. (See Nucleotide and/or Amino Acid Sequences) Appeal to... Appeals and Interference 41.47 Of rejection of an application 1.104(a) Nucleotide and/or Amino Acid...) Symbols for nucleotide and/or amino acid sequence data 1.822 T Tables in patent applications 1.58 Terminal...
37 CFR 1.821 - Nucleotide and/or amino acid sequence disclosures in patent applications.

Code of Federal Regulations, 2011 CFR

2011-07-01

... 37 Patents, Trademarks, and Copyrights 1 2011-07-01 2011-07-01 false Nucleotide and/or amino acid... Biotechnology Invention Disclosures Application Disclosures Containing Nucleotide And/or Amino Acid Sequences § 1.821 Nucleotide and/or amino acid sequence disclosures in patent applications. (a) Nucleotide and...
Shared strategies for β-lactam catabolism in the soil microbiome.

PubMed

Crofts, Terence S; Wang, Bin; Spivak, Aaron; Gianoulis, Tara A; Forsberg, Kevin J; Gibson, Molly K; Johnsky, Lauren A; Broomall, Stacey M; Rosenzweig, C Nicole; Skowronski, Evan W; Gibbons, Henry S; Sommer, Morten O A; Dantas, Gautam

2018-06-01

The soil microbiome can produce, resist, or degrade antibiotics and even catabolize them. While resistance genes are widely distributed in the soil, there is a dearth of knowledge concerning antibiotic catabolism. Here we describe a pathway for penicillin catabolism in four isolates. Genomic and transcriptomic sequencing revealed β-lactamase, amidase, and phenylacetic acid catabolon upregulation. Knocking out part of the phenylacetic acid catabolon or an apparent penicillin utilization operon (put) resulted in loss of penicillin catabolism in one isolate. A hydrolase from the put operon was found to degrade in vitro benzylpenicilloic acid, the β-lactamase penicillin product. To test the generality of this strategy, an Escherichia coli strain was engineered to co-express a β-lactamase and a penicillin amidase or the put operon, enabling it to grow using penicillin or benzylpenicilloic acid, respectively. Elucidation of additional pathways may allow bioremediation of antibiotic-contaminated soils and discovery of antibiotic-remodeling enzymes with industrial utility.
Characterization of Lactic Acid Bacteria (LAB) isolated from Indonesian shrimp paste (terasi)

NASA Astrophysics Data System (ADS)

Amalia, U.; Sumardianto; Agustini, T. W.

2018-02-01

Shrimp paste is a fermented product that is popular as a taste enhancer in many dishes. It is produced by natural fermentation, which depends on the shrimp itself and the presence of salt. The salt inhibits the growth of undesirable microorganisms and allows salt-tolerant lactic acid bacteria (LAB) to ferment the protein source to lactic acid. The objective of this study was to characterize LAB isolated from Indonesian shrimp paste, or "terasi", at different fermentation times (30, 60 and 90 days). Vitech analysis showed four strains of lactic acid bacteria (named LABS1, LABS2, LABS3 and LABS4) with 95% sequence similarity. On the basis of biochemical characteristics, the four isolates represented Lactobacillus, for which the name Lactobacillus plantarum is proposed. L. plantarum played a role in producing secondary metabolites, which gave the umami flavor of the shrimp paste.
Gene encoding a novel extracellular metalloprotease in Bacillus subtilis.

PubMed Central

Sloma, A; Rudolph, C F; Rufo, G A; Sullivan, B J; Theriault, K A; Ally, D; Pero, J

1990-01-01

The gene for a novel extracellular metalloprotease was cloned, and its nucleotide sequence was determined. The gene (mpr) encodes a primary product of 313 amino acids that has little similarity to other known Bacillus proteases. The amino acid sequence of the mature protease was preceded by a signal sequence of approximately 34 amino acids and a pro sequence of 58 amino acids. Four cysteine residues were found in the deduced amino acid sequence of the mature protein, indicating the possible presence of disulfide bonds. The mpr gene mapped in the cysA-aroI region of the chromosome and was not required for growth or sporulation. PMID:2105291
The DynaMine webserver: predicting protein dynamics from sequence.

PubMed

Cilia, Elisa; Pancsa, Rita; Tompa, Peter; Lenaerts, Tom; Vranken, Wim F

2014-07-01

Protein dynamics are important for understanding protein function. Unfortunately, accurate protein dynamics information is difficult to obtain: here we present the DynaMine webserver, which provides predictions for the fast backbone movements of proteins directly from their amino-acid sequence. DynaMine rapidly produces a profile describing the statistical potential for such movements at residue-level resolution. The predicted values have meaning on an absolute scale and go beyond the traditional binary classification of residues as ordered or disordered, thus allowing for direct dynamics comparisons between protein regions. Through this webserver, we provide molecular biologists with an efficient and easy to use tool for predicting the dynamical characteristics of any protein of interest, even in the absence of experimental observations. The prediction results are visualized and can be directly downloaded. The DynaMine webserver, including instructive examples describing the meaning of the profiles, is available at http://dynamine.ibsquare.be. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
Continuously Tunable Nucleic Acid Hybridization Probes

PubMed Central

Wu, Lucia R.; Wang, J. Sherry; Fang, John Z.; Reiser, Emily; Pinto, Alessandro; Pekker, Irena; Boykin, Richard; Ngouenet, Celine; Webster, Philippa J.; Beechem, Joseph; Zhang, David Yu

2015-01-01

In silico designed nucleic acid probes and primers often fail to achieve favorable specificity and sensitivity tradeoffs on the first try, and iterative empirical sequence-based optimization is needed, particularly in multiplexed assays. Here, we present a novel, on-the-fly method of tuning probe affinity and selectivity via the stoichiometry of auxiliary species, allowing independent and decoupled adjustment of hybridization yield for different probes in multiplexed assays. Using this method, we achieve near-continuous tuning of probe effective free energy (0.03 kcal·mol−1 granularity). As applications, we enforced uniform capture efficiency of 31 DNA molecules (GC content 0% – 100%), maximized signal difference for 11 pairs of single nucleotide variants, and performed tunable hybrid-capture of mRNA from total RNA. Using the Nanostring nCounter platform, we applied stoichiometric tuning to simultaneously adjust yields for a 24-plex assay, and we show multiplexed quantitation of RNA sequences and variants from formalin-fixed, paraffin-embedded samples (FFPE). PMID:26480474
Synthesis of hydroxyphthioceranic acid using a traceless lithiation-borylation-protodeboronation strategy

NASA Astrophysics Data System (ADS)

Rasappan, Ramesh; Aggarwal, Varinder K.

2014-09-01

In planning organic syntheses, disconnections are most often made adjacent to functional groups, which assist in C-C bond formation. For molecules devoid of obvious functional groups this approach presents a problem, and so functionalities must be installed temporarily and then removed. Here we present a traceless strategy for organic synthesis that uses a boronic ester as such a group in a one-pot lithiation-borylation-protodeboronation sequence. To realize this strategy, we developed a methodology for the protodeboronation of alkyl pinacol boronic esters that involves the formation of a boronate complex with a nucleophile followed by oxidation with Mn(OAc)3 in the presence of the hydrogen-atom donor 4-tert-butylcatechol. Iterative lithiation-borylation-protodeboronation allows the coupling of smaller fragments to build up long alkyl chains. We employed this strategy in the synthesis of hydroxyphthioceranic acid, a key component of the cell-wall lipid of the virulent Mycobacterium tuberculosis, in just 14 steps (longest linear sequence) with full stereocontrol.
Evolution of the arginase fold and functional diversity

PubMed Central

Dowling, Daniel P.; Costanzo, Luigi Di; Gennadios, Heather A.; Christianson, David W.

2009-01-01

The large number of protein structures deposited in the Protein Data Bank allows for the identification of novel structural superfamilies based on conservation of fold in addition to conservation of amino acid sequence. Since sequence diverges more rapidly than fold in protein evolution, proteins with little or no significant sequence identity are occasionally observed to adopt similar folds, thereby reflecting unanticipated evolutionary relationships. Here, we review the unique α/β fold first observed in the manganese metalloenzyme rat liver arginase, consisting of a parallel eight-stranded β-sheet surrounded by several helices, and its evolutionary relationship with the zinc-requiring and/or iron-requiring histone deacetylases and acetylpolyamine amidohydrolases. Structural comparisons reveal key features of the core α/β fold that contribute to the divergent metal ion specificity and stoichiometry required for the chemical and biological functions of these enzymes. PMID:18360740
Helicobacter pylori Heat Shock Protein A: Serologic Responses and Genetic Diversity

PubMed Central

Ng, Enders K. W.; Thompson, Stuart A.; Pérez-Pérez, Guillermo I.; Kansau, Imad; van der Ende, Arie; Labigne, Agnès; Sung, Joseph J. Y.; Chung, S. C. Sydney; Blaser, Martin J.

1999-01-01

Helicobacter pylori synthesizes an unusual GroES homolog, heat shock protein A (HspA). The present study was aimed at an assessment of the serological response to HspA in a group of Chinese patients with defined gastroduodenal pathologies and determination of whether diversity is present in the nucleotide sequences encoding HspA in isolates from these patients. Serum samples collected from 154 patients who had an upper gastrointestinal pathology and the presence of H. pylori defined by biopsy were tested for an immunoglobulin G (IgG) serologic response to H. pylori HspA by an enzyme-linked immunosorbent assay. HspA-encoding nucleotide sequences in H. pylori isolates from 14 patients (7 seropositive and 7 seronegative for HspA) were analyzed by PCR and direct sequencing of the PCR products. The sequencing results were compared to those of 48 isolates from other parts of the world. Of the 154 known H. pylori-positive patients, 54 (35.1%) were seropositive for HspA. The A domain (GroES homology) of HspA was highly conserved in the 14 isolates tested. Although the B domain (metal-binding site unique to H. pylori) resembled that in the known major variant, particular amino acid substitutions allowed definition of an HspA variant associated with isolates from East Asia. There were no associations between patient characteristics and HspA seropositivity or amino acid sequences. We confirmed in this study that the clinical outcomes of H. pylori infection are not related to HspA antigenicity or to sequence variation. However, B-domain sequence variation may be a marker for the study of the genetic diversity of H. pylori strains of different geographic origins. PMID:10225839
Adapt or Die on the Highway To Hell: Metagenomic Insights into Altered Genomes of Firmicutes from the Deep Biosphere

NASA Astrophysics Data System (ADS)

Briggs, B. R.; Colwell, F. S.

2014-12-01

The ability of a microbe to persist in low-nutrient environments requires adaptive mechanisms to survive. These microorganisms must reduce metabolic energy and increase catabolic efficiency. For example, Escherichia coli surviving in low-nutrient extended stationary phase have mutations that confer a growth advantage in stationary phase (GASP) phenotype, thus allowing for persistence for years in low-nutrient environments. Because subseafloor environments are characterized by a decrease in energy flux with time of burial, we hypothesize that cells from older (deeper) sediment layers will have more altered genomes compared to sequenced surface relatives and that these differences reflect adaptations to a low-energy flux environment. To test this hypothesis, sediment samples were collected from the Andaman Sea at depths of 21, 40 and 554 meters below seafloor, with ages of 0.34, 0.66, and 8.76 million years, respectively. A single operational taxonomic unit within Firmicutes, based on full-length 16S rDNA, dominated these low-diversity samples. This unique feature allowed for metagenomic sequencing using the Illumina HiSeq to identify nucleotide variations (NVs) between the subsurface Firmicutes and the closest sequenced representative, Bacillus subtilis BEST7613. NVs were present at all depths in genes that code for proteins used in energy-dependent proteolysis, cell division, sporulation, and (similar to the GASP mutants) biosynthetic pathways for amino acids, nucleotides, and fatty acids. Conserved genes such as 16S rDNA did not contain NVs. More NVs were found in genes from deeper depths. These NVs may be beneficial, allowing the microbes to survive for millions of years in the deep biosphere, or they may be latent deleterious gene alterations that are masked by the minimal-growth status of these deep microbes. Either way, these results show that microbes present in the deep biosphere experience environmental forcing that alters the genome.
Thermophilic cellobiohydrolase

DOEpatents

Sapra, Rajat; Park, Joshua I.; Datta, Supratim; Simmons, Blake A.

2017-04-18

The present invention provides for a composition comprising a polypeptide comprising a first amino acid sequence having at least 70% identity with the amino acid sequence of Csac GH5 wherein said first amino acid sequence has a thermostable or thermophilic cellobiohydrolase (CBH) or exoglucanase activity.
Contemporary NMR Studies of Protein Electrostatics.

PubMed

Hass, Mathias A S; Mulder, Frans A A

2015-01-01

Electrostatics play an important role in many aspects of protein chemistry. However, the accurate determination of side chain proton affinity in proteins by experiment and theory remains challenging. In recent years the field of nuclear magnetic resonance spectroscopy has advanced the way that protonation states are measured, allowing researchers to examine electrostatic interactions at an unprecedented level of detail and accuracy. Experiments are now in place that follow pH-dependent (13)C and (15)N chemical shifts as spatially close as possible to the sites of protonation, allowing all titratable amino acid side chains to be probed sequence specifically. The strong and telling response of carefully selected reporter nuclei allows individual titration events to be monitored. At the same time, improved frameworks allow researchers to model multiple coupled protonation equilibria and to identify the underlying pH-dependent contributions to the chemical shifts.
Computer-aided visualization and analysis system for sequence evaluation

DOEpatents

Chee, M.S.

1998-08-18

A computer system for analyzing nucleic acid sequences is provided. The computer system is used to perform multiple methods for determining unknown bases by analyzing the fluorescence intensities of hybridized nucleic acid probes. The results of individual experiments are improved by processing nucleic acid sequences together. Comparative analysis of multiple experiments is also provided by displaying reference sequences in one area and sample sequences in another area on a display device. 27 figs.
Computer-aided visualization and analysis system for sequence evaluation

DOEpatents

Chee, Mark S.; Wang, Chunwei; Jevons, Luis C.; Bernhart, Derek H.; Lipshutz, Robert J.

2004-05-11

A computer system for analyzing nucleic acid sequences is provided. The computer system is used to perform multiple methods for determining unknown bases by analyzing the fluorescence intensities of hybridized nucleic acid probes. The results of individual experiments are improved by processing nucleic acid sequences together. Comparative analysis of multiple experiments is also provided by displaying reference sequences in one area and sample sequences in another area on a display device.
Computer-aided visualization and analysis system for sequence evaluation

DOEpatents

Chee, Mark S.

1998-08-18

A computer system for analyzing nucleic acid sequences is provided. The computer system is used to perform multiple methods for determining unknown bases by analyzing the fluorescence intensities of hybridized nucleic acid probes. The results of individual experiments are improved by processing nucleic acid sequences together. Comparative analysis of multiple experiments is also provided by displaying reference sequences in one area and sample sequences in another area on a display device.
Computer-aided visualization and analysis system for sequence evaluation

DOEpatents

Chee, Mark S.

2003-08-19

A computer system for analyzing nucleic acid sequences is provided. The computer system is used to perform multiple methods for determining unknown bases by analyzing the fluorescence intensities of hybridized nucleic acid probes. The results of individual experiments may be improved by processing nucleic acid sequences together. Comparative analysis of multiple experiments is also provided by displaying reference sequences in one area and sample sequences in another area on a display device.
Cell culture compositions

DOEpatents

Dunn-Coleman, Nigel; Goedegebuur, Frits; Ward, Michael; Yiao, Jian

2014-03-18

The present invention provides a novel endoglucanase nucleic acid sequence, designated egl6 (SEQ ID NO:1 encodes the full length endoglucanase; SEQ ID NO:4 encodes the mature form), and the corresponding endoglucanase VI amino acid sequence ("EGVI"; SEQ ID NO:3 is the signal sequence; SEQ ID NO:2 is the mature sequence). The invention also provides expression vectors and host cells comprising a nucleic acid sequence encoding EGVI, recombinant EGVI proteins and methods for producing the same.
Labeled nucleotide phosphate (NP) probes

DOEpatents

Korlach, Jonas [Ithaca, NY]; Webb, Watt W [Ithaca, NY]; Levene, Michael [Ithaca, NY]; Turner, Stephen [Ithaca, NY]; Craighead, Harold G [Ithaca, NY]; Foquet, Mathieu [Ithaca, NY]

2009-02-03

The present invention is directed to a method of sequencing a target nucleic acid molecule having a plurality of bases. In its principle, the temporal order of base additions during the polymerization reaction is measured on a molecule of nucleic acid, i.e. the activity of a nucleic acid polymerizing enzyme on the template nucleic acid molecule to be sequenced is followed in real time. The sequence is deduced by identifying which base is being incorporated into the growing complementary strand of the target nucleic acid by the catalytic activity of the nucleic acid polymerizing enzyme at each step in the sequence of base additions. A polymerase on the target nucleic acid molecule complex is provided in a position suitable to move along the target nucleic acid molecule and extend the oligonucleotide primer at an active site. A plurality of labelled types of nucleotide analogs are provided proximate to the active site, with each distinguishable type of nucleotide analog being complementary to a different nucleotide in the target nucleic acid sequence. The growing nucleic acid strand is extended by using the polymerase to add a nucleotide analog to the nucleic acid strand at the active site, where the nucleotide analog being added is complementary to the nucleotide of the target nucleic acid at the active site. The nucleotide analog added to the oligonucleotide primer as a result of the polymerizing step is identified. The steps of providing labelled nucleotide analogs, polymerizing the growing nucleic acid strand, and identifying the added nucleotide analog are repeated so that the nucleic acid strand is further extended and the sequence of the target nucleic acid is determined.

SeqAPASS: Sequence alignment to predict across-species ...

EPA Pesticide Factsheets

Efforts to shift the toxicity testing paradigm from whole organism studies to those focused on the initiation of toxicity and relevant pathways have led to increased utilization of in vitro and in silico methods. Hence the emergence of high-throughput screening (HTS) programs, such as U.S. EPA ToxCast, and application of the adverse outcome pathway (AOP) framework for identifying and defining biological key events triggered upon perturbation of molecular initiating events and leading to adverse outcomes occurring at a level of organization relevant for risk assessment [1]. With these recent initiatives to harness the power of “the pathway” in describing and evaluating toxicity comes the need to extrapolate data beyond the model species. Sequence alignment to predict across-species susceptibility (SeqAPASS) is a web-based tool that allows the user to begin to understand how broadly HTS data or AOP constructs may plausibly be extrapolated across species, while describing the relative intrinsic susceptibility of different taxa to chemicals with known modes of action (e.g., pharmaceuticals and pesticides). The tool rapidly and strategically assesses available molecular target information to describe protein sequence similarity at the primary amino acid sequence, conserved domain, and individual amino acid residue levels. This in silico approach to species extrapolation was designed to automate and streamline the relatively complex and time-consuming process of co
Conservation of a pH-sensitive structure in the C-terminal region of spider silk extends across the entire silk gene family.

PubMed

Strickland, Michelle; Tudorica, Victor; Řezáč, Milan; Thomas, Neil R; Goodacre, Sara L

2018-06-01

Spiders produce multiple silks with different physical properties that allow them to occupy a diverse range of ecological niches, including the underwater environment. Despite this functional diversity, past molecular analyses show a high degree of amino acid sequence similarity between C-terminal regions of silk genes that appear to be independent of the physical properties of the resulting silks; instead, this domain is crucial to the formation of silk fibers. Here, we present an analysis of the C-terminal domain of all known types of spider silk and include silk sequences from the spider Argyroneta aquatica, which spins the majority of its silk underwater. Our work indicates that spiders have retained a highly conserved mechanism of silk assembly, despite the extraordinary diversification of species, silk types and applications of silk over 350 million years. Sequence analysis of the silk C-terminal domain across the entire gene family shows the conservation of two uncommon amino acids that are implicated in the formation of a salt bridge, a functional bond essential to protein assembly. This conservation extends to the novel sequences isolated from A. aquatica. This finding is relevant to research regarding the artificial synthesis of spider silk, suggesting that synthesis of all silk types will be possible using a single process.
Droplet digital PCR technology promises new applications and research areas.

PubMed

Manoj, P

2016-01-01

Digital polymerase chain reaction (dPCR) is used to quantify nucleic acids; its applications include the detection and precise quantification of low-level pathogens, rare genetic sequences, copy number variants and rare mutations, and the measurement of relative gene expression. The PCR is performed in a large number of reaction chambers or partitions, and the reaction is carried out in each partition individually. This separation allows more reliable collection and more sensitive measurement of nucleic acid. Results are calculated by counting the partitions containing amplified target sequence (positive droplets) and the partitions in which there is no amplification (negative droplets). The mean number of target sequences is calculated using a Poisson algorithm. Poisson correction compensates for the presence of more than one copy of the target gene in any droplet. The method provides accurate and precise information that is highly reproducible, and it is less susceptible to inhibitors than qPCR. It has been demonstrated in studying variations in gene sequences, such as copy number variants and point mutations, in distinguishing differences between expression of nearly identical alleles, and in assessing clinically relevant genetic variations, and it is routinely used for clonal amplification of samples for NGS methods. dPCR enables more reliable predictors of tumor status and patient prognosis by absolute quantitation using reference normalizations. Rare mitochondrial DNA deletions associated with a range of diseases and disorders as well as aging can be accurately detected with droplet digital PCR.
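The Poisson step mentioned in this abstract can be sketched in a few lines. If targets are assumed to distribute randomly and independently across partitions, the fraction of negative partitions is exp(-λ), so the mean copies per partition is λ = -ln(1 - p), where p is the fraction of positive partitions. The function names and the droplet counts in the usage lines are hypothetical and are not taken from the article.

```python
import math


def mean_copies_per_partition(positive: int, total: int) -> float:
    """Poisson-corrected mean number of target copies per partition.

    Assumes random, independent partitioning, so P(negative) = exp(-lambda)
    and therefore lambda = -ln(1 - positive / total).
    """
    if total <= 0 or not 0 <= positive < total:
        # If every partition is positive the estimate is unbounded.
        raise ValueError("need total > 0 and 0 <= positive < total")
    p = positive / total
    return -math.log(1.0 - p)


def total_copies(positive: int, total: int) -> float:
    """Estimated number of target copies across all partitions."""
    return mean_copies_per_partition(positive, total) * total


if __name__ == "__main__":
    # Hypothetical run: 5,000 positive droplets out of 20,000.
    print(mean_copies_per_partition(5000, 20000))
    print(total_copies(5000, 20000))
```

The corrected total exceeds the raw count of positive droplets because some droplets hold more than one copy; the gap widens as the positive fraction grows.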
Biosynthesis of Lipoic Acid in Arabidopsis: Cloning and Characterization of the cDNA for Lipoic Acid Synthase

PubMed Central

Yasuno, Rie; Wada, Hajime

1998-01-01

Lipoic acid is a coenzyme that is essential for the activity of enzyme complexes such as those of pyruvate dehydrogenase and glycine decarboxylase. We report here the isolation and characterization of LIP1 cDNA for lipoic acid synthase of Arabidopsis. The Arabidopsis LIP1 cDNA was isolated using an expressed sequence tag homologous to the lipoic acid synthase of Escherichia coli. This cDNA was shown to code for Arabidopsis lipoic acid synthase by its ability to complement a lipA mutant of E. coli defective in lipoic acid synthase. DNA-sequence analysis of the LIP1 cDNA revealed an open reading frame predicting a protein of 374 amino acids. Comparisons of the deduced amino acid sequence with those of E. coli and yeast lipoic acid synthase homologs showed a high degree of sequence similarity and the presence of a leader sequence presumably required for import into the mitochondria. Southern-hybridization analysis suggested that LIP1 is a single-copy gene in Arabidopsis. Western analysis with an antibody against lipoic acid synthase demonstrated that this enzyme is located in the mitochondrial compartment in Arabidopsis cells as a 43-kD polypeptide. PMID:9808738
CodonLogo: a sequence logo-based viewer for codon patterns.

PubMed

Sharma, Virag; Murphy, David P; Provan, Gregory; Baranov, Pavel V

2012-07-15

Conserved patterns across a multiple sequence alignment can be visualized by generating sequence logos. Sequence logos show each column in the alignment as stacks of symbol(s) where the height of a stack is proportional to its informational content, whereas the height of each symbol within the stack is proportional to its frequency in the column. Sequence logos use symbols of either nucleotide or amino acid alphabets. However, certain regulatory signals in messenger RNA (mRNA) act as combinations of codons. Yet no tool is available for visualization of conserved codon patterns. We present the first application which allows visualization of conserved regions in a multiple sequence alignment in the context of codons. CodonLogo is based on WebLogo3 and uses the same heuristics but treats codons as inseparable units of a 64-letter alphabet. CodonLogo can discriminate patterns of codon conservation from patterns of nucleotide conservation that appear indistinguishable in standard sequence logos. The CodonLogo source code and its implementation (in a local version of the Galaxy Browser) are available at http://recode.ucc.ie/CodonLogo and through the Galaxy Tool Shed at http://toolshed.g2.bx.psu.edu/.
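The stack and symbol heights described in this abstract can be sketched for a codon alphabet as follows. This is a simplified illustration, not CodonLogo's or WebLogo3's code: it assumes an ungapped, in-frame nucleotide alignment, uses the uncorrected information content log2(64) minus the column entropy, and omits the small-sample correction that logo tools typically apply. All function names and the toy alignment are hypothetical.

```python
import math
from collections import Counter


def codon_columns(aligned_seqs):
    """Split an in-frame, ungapped alignment into columns of codons."""
    length = len(aligned_seqs[0])
    if length % 3 or any(len(s) != length for s in aligned_seqs):
        raise ValueError("sequences must share one length divisible by 3")
    return [[s[i:i + 3] for s in aligned_seqs] for i in range(0, length, 3)]


def information_content(column, alphabet_size=64):
    """Stack height in bits: log2(alphabet size) minus Shannon entropy."""
    n = len(column)
    entropy = -sum((c / n) * math.log2(c / n) for c in Counter(column).values())
    return math.log2(alphabet_size) - entropy


def symbol_heights(column, alphabet_size=64):
    """Height of each codon: its frequency times the stack height."""
    info = information_content(column, alphabet_size)
    n = len(column)
    return {codon: info * count / n for codon, count in Counter(column).items()}


# Toy alignment: a fully conserved ATG followed by four synonymous alanine codons.
alignment = ["ATGGCT", "ATGGCC", "ATGGCA", "ATGGCG"]
columns = codon_columns(alignment)
info = [information_content(col) for col in columns]
heights = [symbol_heights(col) for col in columns]
print(info)
print(heights)
```

In the toy alignment the second codon column is variable at the codon level even though its first two nucleotides are fully conserved, which is the kind of distinction a codon-level logo is meant to expose.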
A Mobile Element in mutS Drives Hypermutation in a Marine Vibrio

PubMed Central

Chu, Nathaniel D.; Clarke, Sean A.; Timberlake, Sonia; Polz, Martin F.; Grossman, Alan D.

2017-01-01

Bacteria face a trade-off between genetic fidelity, which reduces deleterious mistakes in the genome, and genetic innovation, which allows organisms to adapt. Evidence suggests that many bacteria balance this trade-off by modulating their mutation rates, but few mechanisms have been described for such modulation. Following experimental evolution and whole-genome resequencing of the marine bacterium Vibrio splendidus 12B01, we discovered one such mechanism, which allows this bacterium to switch to an elevated mutation rate. This switch is driven by the excision of a mobile element residing in mutS, which encodes a DNA mismatch repair protein. When integrated within the bacterial genome, the mobile element provides independent promoter and translation start sequences for mutS (different from the bacterium’s original mutS promoter region), which allow the bacterium to make a functional mutS gene product. Excision of this mobile element rejoins the mutS gene with host promoter and translation start sequences but leaves a 2-bp deletion in the mutS sequence, resulting in a frameshift and a hypermutator phenotype. We further identified hundreds of clinical and environmental bacteria across Betaproteobacteria and Gammaproteobacteria that possess putative mobile elements within the same amino acid motif in mutS. In a subset of these bacteria, we detected excision of the element but not a frameshift mutation; the mobile elements leave an intact mutS coding sequence after excision. Our findings reveal a novel mechanism by which one bacterium alters its mutation rate and hint at a possible evolutionary role for mobile elements within mutS in other bacteria. PMID:28174306
Molecular diagnosis in clinical parasitology: when and why?

PubMed

Wong, Samson S Y; Fung, Kitty S C; Chau, Sandy; Poon, Rosana W S; Wong, Sally C Y; Yuen, Kwok-Yung

2014-11-01

Microscopic detection and morphological identification of parasites from clinical specimens are the gold standards for the laboratory diagnosis of parasitic infections. The limitations of such diagnostic assays include insufficient sensitivity and operator dependence. Immunoassays for parasitic antigens are not available for most parasitic infections and have not significantly improved the sensitivity of laboratory detection. Advances in molecular detection by nucleic acid amplification may improve detection in asymptomatic infections with low parasitic burden. Rapidly accumulating genomic data on parasites allow the design of polymerase chain reaction (PCR) primers directed towards multi-copy gene targets, such as the ribosomal and mitochondrial genes, which further improve the sensitivity. Parasitic cells or free circulating parasitic DNA can be shed from parasites into blood and excreta, which may allow detection without the whole parasite being present within the portion of clinical sample used for DNA extraction. Multiplex nucleic acid amplification technology allows the simultaneous detection of many parasitic species within a single clinical specimen. In addition to improved sensitivity, nucleic acid amplification with sequencing can help to distinguish parasitic species that have similar morphology at different stages, detect and speciate parasites from fixed histopathological sections and identify anti-parasitic drug resistance. The use of consensus primers and PCR sequencing may even help to identify novel parasitic species. The key limitation of molecular detection is the technological expertise and expense it requires, which are usually unavailable in field settings in highly endemic areas. However, such tests can be useful for screening important parasitic infections in asymptomatic patients, donors or recipients coming from endemic areas in the settings of transfusion service or tertiary institutions with transplantation service. Such tests can also be used for monitoring these recipients or highly immunosuppressed patients, so that early preemptive treatment can be given for reactivated parasitic infections while the parasitic burden is still low. © 2014 by the Society for Experimental Biology and Medicine.
Helicase-dependent amplification of nucleic acids.

PubMed

Cao, Yun; Kim, Hyun-Jin; Li, Ying; Kong, Huimin; Lemieux, Bertrand

2013-10-11

Helicase-dependent amplification (HDA) is a novel method for the isothermal in vitro amplification of nucleic acids. The HDA reaction selectively amplifies a target sequence by extension of two oligonucleotide primers. Unlike the polymerase chain reaction (PCR), HDA uses a helicase enzyme to separate the deoxyribonucleic acid (DNA) strands, rather than heat denaturation. This allows DNA amplification without the need for thermal cycling. The helicase used in HDA is a helicase super family II protein obtained from a thermophilic organism, Thermoanaerobacter tengcongensis (TteUvrD). This thermostable helicase is capable of unwinding blunt-end nucleic acid substrates at elevated temperatures (60° to 65°C). The HDA reaction can also be coupled with reverse transcription for ribonucleic acid (RNA) amplification. The products of this reaction can be detected during the reaction using fluorescent probes when incubations are conducted in a fluorimeter. Alternatively, products can be detected after amplification using a disposable amplicon containment device that contains an embedded lateral flow strip. Copyright © 2013 John Wiley & Sons, Inc.
Phylo-mLogo: an interactive and hierarchical multiple-logo visualization tool for alignment of many sequences

PubMed Central

Shih, Arthur Chun-Chieh; Lee, DT; Peng, Chin-Lin; Wu, Yu-Wei

2007-01-01

Background: When several hundreds or thousands of sequences, such as epidemic virus sequences or homologous/orthologous sequences of some big gene families, are aligned to reconstruct their epidemiological history or phylogenies, analyzing and visualizing the alignment results of so many sequences becomes a new challenge for computational biologists. Although there are several tools available for visualization of very long sequence alignments, few of them are applicable to the alignments of many sequences. Results: A multiple-logo alignment visualization tool, called Phylo-mLogo, is presented in this paper. Phylo-mLogo calculates the variabilities and homogeneities of alignment sequences by base frequencies or entropies. Different from the traditional representations of sequence logos, Phylo-mLogo not only displays the global logo patterns of the whole alignment of multiple sequences, but also demonstrates their local homologous logos for each clade hierarchically. In addition, Phylo-mLogo also allows the user to focus only on the analysis of some important, structurally or functionally constrained sites in the alignment selected by the user or by built-in automatic calculation. Conclusion: With Phylo-mLogo, the user can symbolically and hierarchically visualize hundreds of aligned sequences simultaneously and easily check the changes of their amino acid sites when analyzing many homologous/orthologous or influenza virus sequences. More information on Phylo-mLogo can be found at URL . PMID:17319966
Trichoderma β-glucosidase

DOEpatents

Dunn-Coleman, Nigel; Goedegebuur, Frits; Ward, Michael; Yao, Jian

2006-01-03

The present invention provides a novel β-glucosidase nucleic acid sequence, designated bgl3, and the corresponding BGL3 amino acid sequence. The invention also provides expression vectors and host cells comprising a nucleic acid sequence encoding BGL3, recombinant BGL3 proteins and methods for producing the same.
Computer-aided visualization and analysis system for sequence evaluation

DOEpatents

Chee, Mark S.

1999-10-26

A computer system (1) for analyzing nucleic acid sequences is provided. The computer system is used to perform multiple methods for determining unknown bases by analyzing the fluorescence intensities of hybridized nucleic acid probes. The results of individual experiments may be improved by processing nucleic acid sequences together. Comparative analysis of multiple experiments is also provided by displaying reference sequences in one area (814) and sample sequences in another area (816) on a display device (3).
Computer-aided visualization and analysis system for sequence evaluation

DOEpatents

Chee, Mark S.

2001-06-05

A computer system (1) for analyzing nucleic acid sequences is provided. The computer system is used to perform multiple methods for determining unknown bases by analyzing the fluorescence intensities of hybridized nucleic acid probes. The results of individual experiments may be improved by processing nucleic acid sequences together. Comparative analysis of multiple experiments is also provided by displaying reference sequences in one area (814) and sample sequences in another area (816) on a display device (3).
Carbohydrate degrading polypeptide and uses thereof

DOEpatents

Sagt, Cornelis Maria Jacobus; Schooneveld-Bergmans, Margot Elisabeth Francoise; Roubos, Johannes Andries; Los, Alrik Pieter

2015-10-20

The invention relates to a polypeptide having carbohydrate material degrading activity which comprises the amino acid sequence set out in SEQ ID NO: 2 or an amino acid sequence encoded by the nucleotide sequence of SEQ ID NO: 1 or SEQ ID NO: 4, or a variant polypeptide or variant polynucleotide thereof, wherein the variant polypeptide has at least 96% sequence identity with the sequence set out in SEQ ID NO: 2 or the variant polynucleotide encodes a polypeptide that has at least 96% sequence identity with the sequence set out in SEQ ID NO: 2. The invention features the full length coding sequence of the novel gene as well as the amino acid sequence of the full-length functional protein and functional equivalents of the gene or the amino acid sequence. The invention also relates to methods for using the polypeptide in industrial processes. Also included in the invention are cells transformed with a polynucleotide according to the invention suitable for producing these proteins.
Vicilin and convicilin are potential major allergens from pea.

PubMed

Sanchez-Monge, R; Lopez-Torrejón, G; Pascual, C Y; Varela, J; Martin-Esteban, M; Salcedo, G

2004-11-01

Allergic reactions to pea (Pisum sativum) ingestion are frequently associated with lentil allergy in the Spanish population. Vicilin has been described as a major lentil allergen. The aim was to identify the main IgE binding components from pea seeds and to study their potential cross-reactivity with lentil vicilin. A serum pool or individual sera from 18 patients with pea allergy were used to detect IgE binding proteins from pea seeds by immunodetection and immunoblot inhibition assays. Protein preparations enriched in pea vicilin were obtained by gel filtration chromatography followed by reverse-phase high-performance liquid chromatography (HPLC). IgE binding components were identified by means of N-terminal amino acid sequencing. Complete cDNAs encoding pea vicilin were isolated by PCR, using primers based on the amino acid sequence of the reactive proteins. IgE immunodetection of crude pea extracts revealed that convicilin (63 kDa), as well as vicilin (44 kDa) and one of its proteolytic fragments (32 kDa), reacted with more than 50% of the individual sera tested. Additional proteolytic subunits of vicilin (36, 16 and 13 kDa) bound IgE from approximately 20% of the sera. The lentil vicilin allergen Len c 1 strongly inhibited the IgE binding to all components mentioned above. The characterization of cDNA clones encoding pea vicilin has allowed the deduction of its complete amino acid sequence (90% sequence identity to Len c 1), as well as those of its reactive proteolytically processed subunits. Vicilin and convicilin are potential major allergens from pea seeds. Furthermore, proteolytic fragments from vicilin are also relevant IgE binding pea components. All these proteins cross-react with the major lentil allergen Len c 1.
Structure of the beta-galactosidase gene from Thermus sp. strain T2: expression in Escherichia coli and purification in a single step of an active fusion protein.

PubMed

Vian, A; Carrascosa, A V; García, J L; Cortés, E

1998-06-01

The nucleotide sequence of both the bgaA gene, coding for a thermostable beta-galactosidase of Thermus sp. strain T2, and its flanking regions was determined. The deduced amino acid sequence of the enzyme predicts a polypeptide of 645 amino acids (Mr, 73,595). Comparative analysis of the open reading frames located in the flanking regions of the bgaA gene revealed that they might encode proteins involved in the transport and hydrolysis of sugars. The observed homology between the deduced amino acid sequences of BgaA and the beta-galactosidase of Bacillus stearothermophilus allows us to classify the new enzyme within family 42 of glycosyl hydrolases. BgaA was overexpressed in its active form in Escherichia coli, but more interestingly, an active chimeric beta-galactosidase was constructed by fusing the BgaA protein to the choline-binding domain of the major pneumococcal autolysin. This chimera illustrates a novel approach for producing an active and thermostable hybrid enzyme that can be purified in a single step by affinity chromatography on DEAE-cellulose, retaining the catalytic properties of the native enzyme. The chimeric enzyme showed a specific activity of 191,000 U/mg at 70 degrees C and a Km value of 1.6 mM with o-nitrophenyl-beta-D-galactopyranoside as a substrate, and it retained 50% of its initial activity after 1 h of incubation at 70 degrees C.
Combinatorial interactions of two amino acids with a single base pair define target site specificity in plant dimeric homeodomain proteins

PubMed Central

Tron, Adriana E.; Bertoncini, Carlos W.; Palena, Claudia M.; Chan, Raquel L.; Gonzalez, Daniel H.

2001-01-01

Four groups of plant homeodomain proteins contain a dimerization motif closely linked to the homeodomain. We here show that two sunflower homeodomain proteins, Hahb-4 and HAHR1, which belong to the Hd-Zip I and GL2/Hd-Zip IV groups, respectively, show different binding preferences at a defined position of a pseudopalindromic DNA-binding site used as a target. HAHR1 shows a preference for the sequence 5′-CATT(A/T)AATG-3′, rather than 5′-CAAT(A/T)ATTG-3′, recognized by Hahb-4. To analyze the molecular basis of this behavior, we have constructed a set of mutants with exchanged residues (Phe→Ile and Ile→Phe) at position 47 of the homeodomain, together with chimeric proteins between HAHR1 and Hahb-4. The results obtained indicate that Phe47, but not Ile47, allows binding to 5′-CATT(A/T)AATG-3′. However, the preference for this sequence is determined, in addition, by amino acids located C-terminal to residue 53 of the HAHR1 homeodomain. A double mutant of Hahb-4 (Ile47→Phe/Ala54→Thr) shows the same binding behavior as HAHR1, suggesting that combinatorial interactions of amino acid residues at positions 47 and 54 of the homeodomain are involved in establishing the affinity and selectivity of plant dimeric homeodomain proteins with different DNA target sequences. PMID:11726696
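The two target sites quoted in this abstract can be compared with a short script. This is only an illustrative sketch with hypothetical helper names: it takes one member of each degenerate site (central base A), shows that each differs from its own reverse complement only at the central position (hence "pseudopalindromic"), and locates the positions at which the two sites differ.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def mismatch_positions(a: str, b: str):
    """0-based positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]


# One member of each site, choosing A for the central (A/T) position.
hahr1_site = "CATTAAATG"   # from 5'-CATT(A/T)AATG-3'
hahb4_site = "CAATAATTG"   # from 5'-CAAT(A/T)ATTG-3'

# Each site matches its reverse complement everywhere except the center.
print(mismatch_positions(hahr1_site, reverse_complement(hahr1_site)))
print(mismatch_positions(hahb4_site, reverse_complement(hahb4_site)))

# The two sites differ at one position and at its symmetric partner.
print(mismatch_positions(hahr1_site, hahb4_site))
```

Because the sites are symmetric around the center, a single preference change in each monomer of the dimer appears as two differing positions in the full site.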
H3.Y discriminates between HIRA and DAXX chaperone complexes and reveals unexpected insights into human DAXX-H3.3-H4 binding and deposition requirements

PubMed Central

Zink, Lisa-Maria; Delbarre, Erwan; Eberl, H. Christian; Keilhauer, Eva C.; Bönisch, Clemens; Pünzeler, Sebastian; Bartkuhn, Marek; Collas, Philippe; Mann, Matthias

2017-01-01

Histone chaperones prevent promiscuous histone interactions before chromatin assembly. They guarantee faithful deposition of canonical histones and functionally specialized histone variants into chromatin in a spatially and temporally restricted manner. Here, we identify the binding partners of the primate-specific and H3.3-related histone variant H3.Y using several quantitative mass spectrometry approaches, and biochemical and cell biological assays. We find the HIRA, but not the DAXX/ATRX, complex to recognize H3.Y, explaining its presence in transcriptionally active euchromatic regions. Accordingly, H3.Y nucleosomes are enriched in the transcription-promoting FACT complex and depleted of repressive post-translational histone modifications. H3.Y mutational gain-of-function screens reveal an unexpected combinatorial amino acid sequence requirement for histone H3.3 interaction with DAXX but not HIRA, and for H3.3 recruitment to PML nuclear bodies. We demonstrate the importance and necessity of specific H3.3 core and C-terminal amino acids in discriminating between distinct chaperone complexes. Further, chromatin immunoprecipitation sequencing experiments reveal that in contrast to euchromatic HIRA-dependent deposition sites, human DAXX/ATRX-dependent regions of histone H3 variant incorporation are enriched in heterochromatic H3K9me3 and simple repeat sequences. These data demonstrate that H3.Y's unique amino acids allow a functional distinction between HIRA and DAXX binding and its consequent deposition into open chromatin. PMID:28334823
Automated Sanger Analysis Pipeline (ASAP): A Tool for Rapidly Analyzing Sanger Sequencing Data with Minimum User Interference.

PubMed

Singh, Aditya; Bhatia, Prateek

2016-12-01

Sanger sequencing platforms, such as Applied Biosystems instruments, generate chromatogram files. Generally, each region of a sequence is sequenced with both forward and reverse primers, which yields 2 sequences that need to be aligned and a consensus generated before mutation detection studies. This work is cumbersome and takes time, especially if the gene is large with many exons. Hence, we devised a rapid automated command system to filter, build, and align consensus sequences and also optionally extract exonic regions, translate them in all frames, and perform an amino acid alignment starting from raw sequence data within a very short time. With its full capabilities, the Automated Sanger Analysis Pipeline (ASAP) is able to read "*.ab1" chromatogram files through a command line interface, convert them to the FASTQ format, trim the low-quality regions, reverse-complement the reverse sequence, create a consensus sequence, extract the exonic regions using a reference exonic sequence, translate the sequence in all frames, and align the nucleic acid and amino acid sequences to reference nucleic acid and amino acid sequences, respectively. All files are created and can be used for further analysis. ASAP is available as a Python 3.x executable at https://github.com/aditya-88/ASAP. The version described in this paper is 0.28.
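Two of the steps listed in this abstract, reverse-complementing the reverse read and translating in all frames, can be sketched in plain Python. This is not ASAP's code or interface; the function names are hypothetical, the standard genetic code (NCBI translation table 1) is assumed, and chromatogram parsing, trimming, consensus building, and alignment are left out.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code (NCBI table 1), codons ordered T, C, A, G at each position.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an uppercase DNA string (N is kept as N)."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq: str) -> str:
    """Translate complete codons; unknown codons (for example with N) become X."""
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, usable, 3))


def six_frame_translations(seq: str) -> dict:
    """Translations of the three forward and three reverse-strand frames."""
    reverse = reverse_complement(seq)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(seq[offset:])
        frames[f"-{offset + 1}"] = translate(reverse[offset:])
    return frames


# Toy sequence, not taken from any real read.
frames = six_frame_translations("ATGGCCTAA")
for name in sorted(frames):
    print(name, frames[name])
```

Stop codons are reported as `*` rather than terminating translation, so each frame can be inspected in full before any exon extraction or alignment.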
Nucleic acid analysis using terminal-phosphate-labeled nucleotides

DOEpatents

Korlach, Jonas [Ithaca, NY]; Webb, Watt W [Ithaca, NY]; Levene, Michael [Ithaca, NY]; Turner, Stephen [Ithaca, NY]; Craighead, Harold G [Ithaca, NY]; Foquet, Mathieu [Ithaca, NY]

2008-04-22

The present invention is directed to a method of sequencing a target nucleic acid molecule having a plurality of bases. In its principle, the temporal order of base additions during the polymerization reaction is measured on a molecule of nucleic acid, i.e. the activity of a nucleic acid polymerizing enzyme on the template nucleic acid molecule to be sequenced is followed in real time. The sequence is deduced by identifying which base is being incorporated into the growing complementary strand of the target nucleic acid by the catalytic activity of the nucleic acid polymerizing enzyme at each step in the sequence of base additions. A polymerase on the target nucleic acid molecule complex is provided in a position suitable to move along the target nucleic acid molecule and extend the oligonucleotide primer at an active site. A plurality of labelled types of nucleotide analogs are provided proximate to the active site, with each distinguishable type of nucleotide analog being complementary to a different nucleotide in the target nucleic acid sequence. The growing nucleic acid strand is extended by using the polymerase to add a nucleotide analog to the nucleic acid strand at the active site, where the nucleotide analog being added is complementary to the nucleotide of the target nucleic acid at the active site. The nucleotide analog added to the oligonucleotide primer as a result of the polymerizing step is identified. The steps of providing labelled nucleotide analogs, polymerizing the growing nucleic acid strand, and identifying the added nucleotide analog are repeated so that the nucleic acid strand is further extended and the sequence of the target nucleic acid is determined.
Multigene panel next generation sequencing in a patient with cherry red macular spot: Identification of two novel mutations in NEU1 gene causing sialidosis type I associated with mild to unspecific biochemical and enzymatic findings.

PubMed

Mütze, Ulrike; Bürger, Friederike; Hoffmann, Jessica; Tegetmeyer, Helmut; Heichel, Jens; Nickel, Petra; Lemke, Johannes R; Syrbe, Steffen; Beblo, Skadi

2017-03-01

Lysosomal storage diseases (LSD) often manifest with cherry red macular spots. Diagnosis is based on clinical features and specific biochemical and enzymatic patterns. In uncertain cases, genetic testing with next generation sequencing can establish a diagnosis, especially in milder or atypical phenotypes. We report on the diagnostic work-up in a boy with sialidosis type I, presenting initially with marked cherry red macular spots but non-specific urinary oligosaccharide patterns and unusually mild excretion of bound sialic acid. Biochemical, enzymatic and genetic tests were performed in the patient. The clinical and electrophysiological data were reviewed and a genotype-phenotype analysis was performed. In addition, a systematic literature review was carried out. Cherry red macular spots were first noted at 6 years of age after routine screening for myopia. Physical examination, psychometric testing, laboratory investigations and cerebral MRI were unremarkable at 9 years of age. So far no clinical myoclonic seizures have occurred, but EEG displays generalized epileptic discharges and visual evoked potentials are prolonged bilaterally. Urine thin layer chromatography showed an oligosaccharide pattern compatible with different LSD including sialidosis, galactosialidosis, GM1 gangliosidosis or mucopolysaccharidosis type IV B. Urinary bound sialic acid excretion was mildly elevated in spontaneous and 24 h urine samples. In cultured fibroblasts, α-sialidase activity was markedly decreased to < 1%; however, bound and free sialic acid were within normal range. Diagnosis was eventually established by multigene panel next generation sequencing of genes associated with LSD, identifying two novel, compound heterozygous variants in the NEU1 gene (c.699C>A, p.S233R in exon 4 and c.803A>G, p.Y268C in exon 5 in NEU1 transcript NM_000434.3), leading to amino acid changes predicted to impair protein function. Sialidosis should be suspected in patients with cherry red macular spots, even with non-significant urinary sialic acid excretion. Multigene panel next generation sequencing can establish a definite diagnosis, allowing for counseling of the patient and family.

Draft genome sequence of Dethiosulfovibrio salsuginis DSM 21565T an anaerobic, slightly halophilic bacterium isolated from a Colombian saline spring.

PubMed

Díaz-Cárdenas, Carolina; López, Gina; Alzate-Ocampo, José David; González, Laura N; Shapiro, Nicole; Woyke, Tanja; Kyrpides, Nikos C; Restrepo, Silvia; Baena, Sandra

2017-01-01

A bacterium belonging to the phylum Synergistetes, genus Dethiosulfovibrio, was isolated in 2007 from a saline spring in Colombia. Dethiosulfovibrio salsuginis USBA 82T (DSM 21565T = KCTC 5659T) is a mesophilic, strictly anaerobic, slightly halophilic, Gram-negative bacterium with a diderm cell envelope. The strain ferments peptides, amino acids and a few organic acids. Here we present the description of the complete genome sequencing and annotation of the type species Dethiosulfovibrio salsuginis USBA 82T. The genome consisted of 2.68 Mbp with a G + C content of 53.7%. A total of 2609 genes were predicted and of those, 2543 were protein coding genes and 66 were RNA genes. In the USBA 82T genome we detected six Synergistetes conserved signature indels (CSIs), specific for Jonquetella, Pyramidobacter and Dethiosulfovibrio. The genome of D. salsuginis contained, as expected, genes related to amino acid transport, amino acid metabolism and thiosulfate reduction. These genes represent the major gene groups of Synergistetes, related to their phenotypic traits, and interestingly, 11.8% of the genes in the genome belonged to the amino acid fermentation COG category. In addition, we identified in the genome some ammonification genes such as nitrate reductase genes. The presence of proline operon genes could be related to de novo synthesis of proline to protect the cell in response to high osmolarity. Our bioinformatics workflow included antiSMASH and BAGEL3, which allowed us to identify bacteriocin genes in the genome.
Method for high-volume sequencing of nucleic acids: random and directed priming with libraries of oligonucleotides

DOEpatents

Studier, F. William

1995-04-18

Random and directed priming methods for determining nucleotide sequences by enzymatic sequencing techniques, using libraries of primers of lengths 8, 9 or 10 bases, are disclosed. These methods permit direct sequencing of nucleic acids as large as 45,000 base pairs or larger without the necessity for subcloning. Individual primers are used repeatedly to prime sequence reactions in many different nucleic acid molecules. Libraries containing as few as 10,000 octamers, 14,200 nonamers, or 44,000 decamers would have the capacity to determine the sequence of almost any cosmid DNA. Random priming with a fixed set of primers from a smaller library can also be used to initiate the sequencing of individual nucleic acid molecules, with the sequence being completed by directed priming with primers from the library. In contrast to random cloning techniques, a combined random and directed priming strategy is far more efficient.
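The library sizes quoted in this abstract can be put in context with a quick calculation: there are 4^k possible primers of length k, so the stated libraries cover only a fraction of all octamers, nonamers, or decamers. The sketch below does that arithmetic only; the variable names are hypothetical, and it says nothing about how the patent selects which primers to include.

```python
def possible_oligos(k: int) -> int:
    """Number of distinct DNA oligonucleotides of length k (4 bases per position)."""
    return 4 ** k


# Library sizes as stated in the abstract, keyed by primer length.
LIBRARY_SIZES = {8: 10_000, 9: 14_200, 10: 44_000}

coverage = {}
for k, size in LIBRARY_SIZES.items():
    total = possible_oligos(k)
    coverage[k] = size / total
    print(f"{k}-mers: {size:,} of {total:,} possible ({coverage[k]:.1%})")
```

Under this count the octamer library holds roughly 15% of all possible octamers, while the nonamer and decamer libraries each hold around 4 to 5% of their respective sequence spaces.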
Method for high-volume sequencing of nucleic acids: random and directed priming with libraries of oligonucleotides

DOEpatents

Studier, F.W.

1995-04-18

Random and directed priming methods for determining nucleotide sequences by enzymatic sequencing techniques, using libraries of primers of lengths 8, 9 or 10 bases, are disclosed. These methods permit direct sequencing of nucleic acids as large as 45,000 base pairs or larger without the necessity for subcloning. Individual primers are used repeatedly to prime sequence reactions in many different nucleic acid molecules. Libraries containing as few as 10,000 octamers, 14,200 nonamers, or 44,000 decamers would have the capacity to determine the sequence of almost any cosmid DNA. Random priming with a fixed set of primers from a smaller library can also be used to initiate the sequencing of individual nucleic acid molecules, with the sequence being completed by directed priming with primers from the library. In contrast to random cloning techniques, a combined random and directed priming strategy is far more efficient. 2 figs.
Variability of the protein sequences of lcrV between epidemic and atypical rhamnose-positive strains of Yersinia pestis.

PubMed

Anisimov, Andrey P; Panfertsev, Evgeniy A; Svetoch, Tat'yana E; Dentovskaya, Svetlana V

2007-01-01

Sequencing of lcrV genes and comparison of the deduced amino acid sequences from ten Y. pestis strains belonging mostly to the group of atypical rhamnose-positive isolates (non-pestis subspecies or pestoides group) showed that the LcrV proteins analyzed could be classified into five sequence types. This classification was based on major amino acid polymorphisms among LcrV proteins in the four "hot points" of the protein sequences. Some additional minor polymorphisms were found throughout these sequence types. The "hot points" corresponded to amino acids 18 (Lys --> Asn), 72 (Lys --> Arg), 273 (Cys --> Ser), and 324-326 (Ser-Gly-Lys --> Arg) in the LcrV sequence of the reference Y. pestis strain CO92. One possible explanation for polymorphism in amino acid sequences of LcrV among different strains is that strain-specific variation resulted from adaptation of the plague pathogen to different rodent and lagomorph hosts.
The complete genome structure and phylogenetic relationship of infectious hematopoietic necrosis virus

USGS Publications Warehouse

Morzunov, Sergey P.; Winton, James R.; Nichol, Stuart T.

1995-01-01

Infectious hematopoietic necrosis virus (IHNV), a member of the family Rhabdoviridae, causes a severe disease with high mortality in salmonid fish. The nucleotide sequence (11,131 bases) of the entire genome was determined for the pathogenic WRAC strain of IHNV from southern Idaho. This allowed detailed analysis of all 6 genes, the deduced amino acid sequences of their encoded proteins, and important control motifs including leader, trailer and gene junction regions. Sequence analysis revealed that the 6 virus genes are located along the genome in the 3′ to 5′ order: nucleocapsid (N), polymerase-associated phosphoprotein (P or M1), matrix protein (M or M2), surface glycoprotein (G), a unique non-virion protein (NV) and virus polymerase (L). The IHNV genome RNA was found to have highly complementary termini (15 of 16 nucleotides). The gene junction regions display the highly conserved sequence UCURUC(U)7RCCGUG(N)4CACR (in the vRNA sense), which includes the typical rhabdovirus transcription termination/polyadenylation signal and a novel putative transcription initiation signal. Phylogenetic analysis of M, G and L protein sequences allowed insights into the evolutionary and taxonomic relationship of rhabdoviruses of fish relative to those of insects or mammals, and a broader sense of the relationship of non-segmented negative-strand RNA viruses. Based on these data, a new genus, piscivirus, is proposed which will initially contain IHNV, viral hemorrhagic septicemia virus and Hirame rhabdovirus.
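The gene junction consensus given in this abstract, UCURUC(U)7RCCGUG(N)4CACR, can be turned into a searchable pattern. The sketch below assumes the usual IUPAC meanings (R = A or G, N = any base) and reads (X)n as n repeats of X; the helper name and the test sequence are hypothetical, and the test sequence is synthetic rather than taken from the IHNV genome.

```python
import re

# IUPAC codes used in the consensus, mapped to regex character classes.
IUPAC = {"A": "A", "C": "C", "G": "G", "U": "U", "R": "[AG]", "N": "[ACGU]"}

# Matches either a repeat token such as "(U)7" or a single letter.
TOKEN = re.compile(r"\(([A-Z])\)(\d+)|([A-Z])")


def consensus_to_regex(consensus: str) -> re.Pattern:
    """Convert a consensus with IUPAC codes and (X)n repeats to a compiled regex."""
    parts = []
    for match in TOKEN.finditer(consensus):
        repeated, count, single = match.groups()
        if repeated:
            parts.append(IUPAC[repeated] + "{" + count + "}")
        else:
            parts.append(IUPAC[single])
    return re.compile("".join(parts))


junction = consensus_to_regex("UCURUC(U)7RCCGUG(N)4CACR")
print(junction.pattern)

# Synthetic RNA (vRNA sense) containing one instance of the motif.
synthetic = "GG" + "UCUAUC" + "U" * 7 + "G" + "CCGUG" + "AAAA" + "CACA" + "CC"
hit = junction.search(synthetic)
print(hit.start(), hit.end())
```

Scanning a genome string with `junction.finditer` would list every candidate junction, which could then be compared against the annotated gene boundaries.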
β-glucosidase 5 (BGL5) compositions

DOEpatents

Dunn-Coleman, Nigel; Goedegebuur, Frits; Ward, Michael; Yao, Jian

2010-06-01

The present invention provides a novel β-glucosidase nucleic acid sequence, designated bgl5, and the corresponding BGL5 amino acid sequence. The invention also provides expression vectors and host cells comprising a nucleic acid sequence encoding BGL5, recombinant BGL5 proteins and methods for producing the same.
Evidence of Divergent Amino Acid Usage in Comparative Analyses of R5- and X4-Associated HIV-1 Vpr Sequences

PubMed Central

Antell, Gregory C.; Zhong, Wen; Kercher, Katherine; Passic, Shendra; Williams, Jean; Liu, Yucheng; James, Tony; Jacobson, Jeffrey M.; Szep, Zsofia

2017-01-01

Vpr is an HIV-1 accessory protein that plays numerous roles during viral replication, some of which are cell type dependent. To test the hypothesis that HIV-1 tropism extends beyond the envelope into the vpr gene, studies were performed to identify the associations between coreceptor usage and Vpr variation in HIV-1-infected patients. Colinear HIV-1 Env-V3 and Vpr amino acid sequences were obtained from the LANL HIV-1 sequence database and from well-suppressed patients in the Drexel/Temple Medicine CNS AIDS Research and Eradication Study (CARES) Cohort. Genotypic classification of Env-V3 sequences as X4 (CXCR4-utilizing) or R5 (CCR5-utilizing) was used to group colinear Vpr sequences. To reveal the sequences associated with a specific coreceptor usage genotype, Vpr amino acid sequences were assessed for amino acid diversity and Jensen-Shannon divergence between the two groups. Five amino acid alphabets were used to comprehensively examine the impact of amino acid substitutions involving side chains with similar physicochemical properties. Positions 36, 37, 41, 89, and 96 of Vpr were characterized by statistically significant divergence across multiple alphabets when X4 and R5 sequence groups were compared. In addition, consensus amino acid switches were found at positions 37 and 41 in comparisons of the R5 and X4 sequence populations. These results suggest an evolutionary link between Vpr and gp120 in HIV-1-infected patients. PMID:28620613
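The per-position comparison described above rests on the Jensen-Shannon divergence between the residue distributions of two sequence groups. The following is a minimal sketch of that calculation for a single alignment column, assuming base-2 logarithms and the standard 20-letter alphabet. The residue strings are invented for illustration and are not data from the study, and the function names are hypothetical.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def column_distribution(residues: str, alphabet: str = AMINO_ACIDS) -> list[float]:
    """Relative frequency of each alphabet letter in one alignment column."""
    counts = Counter(residues)
    total = sum(counts[a] for a in alphabet)
    if total == 0:
        raise ValueError("column contains no residues from the alphabet")
    return [counts[a] / total for a in alphabet]

def jensen_shannon_divergence(p: list[float], q: list[float]) -> float:
    """JSD with base-2 logs, so the result lies between 0 and 1."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]

    def kl(a, b):
        # Terms with a zero numerator contribute nothing to the sum.
        return sum(ai * math.log2(ai / bi) for ai, bi in zip(a, b) if ai > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

# Invented residues observed at one position in two hypothetical groups.
r5_column = "RRRRRG"
x4_column = "GGGGGR"
jsd = jensen_shannon_divergence(column_distribution(r5_column),
                                column_distribution(x4_column))
print(round(jsd, 3))
```

A reduced alphabet could be handled by mapping each residue to its class letter before calling `column_distribution`; the five alphabets used in the study are not specified in the abstract, so none is reproduced here.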
The Hypothesis that the Genetic Code Originated in Coupled Synthesis of Proteins and the Evolutionary Predecessors of Nucleic Acids in Primitive Cells

PubMed Central

Francis, Brian R.

2015-01-01

Although analysis of the genetic code has allowed explanations for its evolution to be proposed, little evidence exists in biochemistry and molecular biology to offer an explanation for the origin of the genetic code. In particular, two features of biology make the origin of the genetic code difficult to understand. First, nucleic acids are highly complicated polymers requiring numerous enzymes for biosynthesis. Secondly, proteins have a simple backbone with a set of 20 different amino acid side chains synthesized by a highly complicated ribosomal process in which mRNA sequences are read in triplets. Apparently, both nucleic acid and protein syntheses have extensive evolutionary histories. Supporting these processes is a complex metabolism and at the hub of metabolism are the carboxylic acid cycles. This paper advances the hypothesis that the earliest predecessor of the nucleic acids was a β-linked polyester made from malic acid, a highly conserved metabolite in the carboxylic acid cycles. In the β-linked polyester, the side chains are carboxylic acid groups capable of forming interstrand double hydrogen bonds. Evolution of the nucleic acids involved changes to the backbone and side chain of poly(β-d-malic acid). Conversion of the side chain carboxylic acid into a carboxamide or a longer side chain bearing a carboxamide group, allowed information polymers to form amide pairs between polyester chains. Aminoacylation of the hydroxyl groups of malic acid and its derivatives with simple amino acids such as glycine and alanine allowed coupling of polyester synthesis and protein synthesis. Use of polypeptides containing glycine and l-alanine for activation of two different monomers with either glycine or l-alanine allowed simple coded autocatalytic synthesis of polyesters and polypeptides and established the first genetic code. 
A primitive cell capable of supporting electron transport, thioester synthesis, reduction reactions, and synthesis of polyesters and polypeptides is proposed. The cell consists of an iron-sulfide particle enclosed by tholin, a heterogeneous organic material that is produced by Miller-Urey type experiments that simulate conditions on the early Earth. As the synthesis of nucleic acids evolved from β-linked polyesters, the singlet coding system for replication evolved into a four nucleotide/four amino acid process (AMP = aspartic acid, GMP = glycine, UMP = valine, CMP = alanine) and then into the triplet ribosomal process that permitted multiple copies of protein to be synthesized independent of replication. This hypothesis reconciles the “genetics first” and “metabolism first” approaches to the origin of life and explains why there are four bases in the genetic alphabet. PMID:25679748
Methods of diagnosing Alagille syndrome

DOEpatents

Li, Linheng; Hood, Leroy; Krantz, Ian D.; Spinner, Nancy B.

2004-03-09

The present invention provides an isolated polypeptide exhibiting substantially the same amino acid sequence as JAGGED, or an active fragment thereof, provided that the polypeptide does not have the amino acid sequence of SEQ ID NO:5 or SEQ ID NO:6. The invention further provides an isolated nucleic acid molecule containing a nucleotide sequence encoding substantially the same amino acid sequence as JAGGED, or an active fragment thereof, provided that the nucleotide sequence does not encode the amino acid sequence of SEQ ID NO:5 or SEQ ID NO:6. Also provided herein is a method of inhibiting differentiation of hematopoietic progenitor cells by contacting the progenitor cells with an isolated JAGGED polypeptide, or active fragment thereof. The invention additionally provides a method of diagnosing Alagille Syndrome in an individual. The method consists of detecting an Alagille Syndrome disease-associated mutation linked to a JAGGED locus.
Optimal protein library design using recombination or point mutations based on sequence-based scoring functions.

PubMed

Pantazes, Robert J; Saraf, Manish C; Maranas, Costas D

2007-08-01

In this paper, we introduce and test two new sequence-based protein scoring systems (i.e. S1, S2) for assessing the likelihood that a given protein hybrid will be functional. By binning together amino acids with similar properties (i.e. volume, hydrophobicity and charge) the scoring systems S1 and S2 allow for the quantification of the severity of mismatched interactions in the hybrids. The S2 scoring system is found to be able to significantly functionally enrich a cytochrome P450 library over other scoring methods. Given this scoring base, we subsequently constructed two separate optimization formulations (i.e. OPTCOMB and OPTOLIGO) for optimally designing protein combinatorial libraries involving recombination or mutations, respectively. Notably, two separate versions of OPTCOMB are generated (i.e. model M1, M2) with the latter allowing for position-dependent parental fragment skipping. Computational benchmarking results demonstrate the efficacy of models OPTCOMB and OPTOLIGO to generate high scoring libraries of a prespecified size.
A method for multi-codon scanning mutagenesis of proteins based on asymmetric transposons.

PubMed

Liu, Jia; Cropp, T Ashton

2012-02-01

Random mutagenesis followed by selection or screening is a commonly used strategy to improve protein function. Despite many available methods for random mutagenesis, nearly all generate mutations at the nucleotide level. An ideal mutagenesis method would allow for the generation of 'codon mutations' to change protein sequence with defined or mixed amino acids of choice. Herein we report a method that allows for mutations of one, two or three consecutive codons. Key to this method is the development of a Mu transposon variant with asymmetric terminal sequences. As a demonstration of the method, we performed multi-codon scanning on the gene encoding superfolder GFP (sfGFP). Characterization of 50 randomly chosen clones from each library showed that more than 40% of the mutants in these three libraries contained seamless, in-frame mutations with low site preference. By screening only 500 colonies from each library, we successfully identified several spectra-shift mutations, including an S205D variant that was found to bear a single excitation peak in the UV region.
PSI/TM-Coffee: a web server for fast and accurate multiple sequence alignments of regular and transmembrane proteins using homology extension on reduced databases.

PubMed

Floden, Evan W; Tommaso, Paolo D; Chatzou, Maria; Magis, Cedrik; Notredame, Cedric; Chang, Jia-Ming

2016-07-08

The PSI/TM-Coffee web server performs multiple sequence alignment (MSA) of proteins by combining homology extension with a consistency based alignment approach. Homology extension is performed with Position Specific Iterative (PSI) BLAST searches against a choice of redundant and non-redundant databases. The main novelty of this server is to allow databases of reduced complexity to rapidly perform homology extension. This server also gives the possibility to use transmembrane proteins (TMPs) reference databases to allow even faster homology extension on this important category of proteins. Aside from an MSA, the server also outputs topological prediction of TMPs using the HMMTOP algorithm. Previous benchmarking of the method has shown this approach outperforms the most accurate alignment methods such as MSAProbs, Kalign, PROMALS, MAFFT, ProbCons and PRALINE™. The web server is available at http://tcoffee.crg.cat/tmcoffee. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
Microbial genomic taxonomy

PubMed Central

2013-01-01

A need for a genomic species definition is emerging from several independent studies worldwide. In this commentary paper, we discuss recent studies on the genomic taxonomy of diverse microbial groups and a unified species definition based on genomics. Accordingly, strains from the same microbial species share >95% Average Amino Acid Identity (AAI) and Average Nucleotide Identity (ANI), >95% identity based on multiple alignment genes, <10 in Karlin genomic signature, and > 70% in silico Genome-to-Genome Hybridization similarity (GGDH). Species of the same genus will form monophyletic groups on the basis of 16S rRNA gene sequences, Multilocus Sequence Analysis (MLSA) and supertree analysis. In addition to the established requirements for species descriptions, we propose that new taxa descriptions should also include at least a draft genome sequence of the type strain in order to obtain a clear outlook on the genomic landscape of the novel microbe. The application of the new genomic species definition put forward here will allow researchers to use genome sequences to define simultaneously coherent phenotypic and genomic groups. PMID:24365132
Design and synthesis of digitally encoded polymers that can be decoded and erased

NASA Astrophysics Data System (ADS)

Roy, Raj Kumar; Meszynska, Anna; Laure, Chloé; Charles, Laurence; Verchin, Claire; Lutz, Jean-François

2015-05-01

Biopolymers such as DNA store information in their chains using controlled sequences of monomers. Here we describe a non-natural information-containing macromolecule that can store and retrieve digital information. Monodisperse sequence-encoded poly(alkoxyamine amide)s were synthesized using an iterative strategy employing two chemoselective steps: the reaction of a primary amine with an acid anhydride and the radical coupling of a carbon-centred radical with a nitroxide. A binary code was implemented in the polymer chains using three monomers: one nitroxide spacer and two interchangeable anhydrides defined as 0-bit and 1-bit. This methodology allows encryption of any desired sequence in the chains. Moreover, the formed sequences are easy to decode using tandem mass spectrometry. Indeed, these polymers follow predictable fragmentation pathways that can be easily deciphered. Moreover, poly(alkoxyamine amide)s are thermolabile. Thus, the digital information encrypted in the chains can be erased by heating the polymers in the solid state or in solution.
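The coding scheme described above (two interchangeable anhydrides standing for 0 and 1, separated by a nitroxide spacer) can be pictured as a mapping between a bit string and a monomer sequence. The sketch below is purely illustrative: the monomer labels `A0`, `A1` and `T` are hypothetical, and it assumes every coding anhydride is followed by one spacer, which the abstract does not state for the chain end.

```python
SPACER = "T"                        # hypothetical label for the nitroxide spacer
ANHYDRIDE = {"0": "A0", "1": "A1"}  # hypothetical labels for the coding anhydrides

def encode(bits: str) -> list[str]:
    """Turn a bit string into an alternating anhydride/spacer monomer list."""
    chain = []
    for bit in bits:
        chain.append(ANHYDRIDE[bit])  # coding monomer for this bit
        chain.append(SPACER)          # assumed spacer after every coding monomer
    return chain

def decode(chain: list[str]) -> str:
    """Recover the bit string by reading only the coding monomers."""
    lookup = {label: bit for bit, label in ANHYDRIDE.items()}
    return "".join(lookup[unit] for unit in chain if unit != SPACER)

message = "0110"
polymer = encode(message)
print(polymer)
print(decode(polymer))
```

In the work itself, decoding relies on tandem mass spectrometry fragmentation patterns rather than on direct inspection of monomer labels; the sketch only shows the logical bit-to-monomer correspondence.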
Design and synthesis of digitally encoded polymers that can be decoded and erased.

PubMed

Roy, Raj Kumar; Meszynska, Anna; Laure, Chloé; Charles, Laurence; Verchin, Claire; Lutz, Jean-François

2015-05-26

Biopolymers such as DNA store information in their chains using controlled sequences of monomers. Here we describe a non-natural information-containing macromolecule that can store and retrieve digital information. Monodisperse sequence-encoded poly(alkoxyamine amide)s were synthesized using an iterative strategy employing two chemoselective steps: the reaction of a primary amine with an acid anhydride and the radical coupling of a carbon-centred radical with a nitroxide. A binary code was implemented in the polymer chains using three monomers: one nitroxide spacer and two interchangeable anhydrides defined as 0-bit and 1-bit. This methodology allows encryption of any desired sequence in the chains. Moreover, the formed sequences are easy to decode using tandem mass spectrometry. Indeed, these polymers follow predictable fragmentation pathways that can be easily deciphered. Moreover, poly(alkoxyamine amide)s are thermolabile. Thus, the digital information encrypted in the chains can be erased by heating the polymers in the solid state or in solution.
Cleavage sites within the poliovirus capsid protein precursors

DOE Office of Scientific and Technical Information (OSTI.GOV)

Larsen, G.R.; Anderson, C.W.; Dorner, A.J.

1982-01-01

Partial amino-terminal sequence analysis was performed on radiolabeled poliovirus capsid proteins VP1, VP2, and VP3. A computer-assisted comparison of the amino acid sequences obtained with that predicted by the nucleotide sequence of the poliovirus genome allows assignment of the amino terminus of each capsid protein to a unique position within the virus polyprotein. Sequence analysis of trypsin-digested VP4, which has a blocked amino terminus, demonstrates that VP4 is encoded at or very near to the amino terminus of the polyprotein. The gene order of the capsid proteins is VP4-VP2-VP3-VP1. Cleavage of VP0 to VP4 and VP2 is shown to occur between asparagine and serine, whereas the cleavages that separate VP2/VP3 and VP3/VP1 occur between glutamine and glycine residues. This finding supports the hypothesis that the cleavage of VP0, which occurs during virion morphogenesis, is distinct from the cleavages that separate functional regions of the polyprotein.
CAFE: aCcelerated Alignment-FrEe sequence analysis.

PubMed

Lu, Yang Young; Tang, Kujin; Ren, Jie; Fuhrman, Jed A; Waterman, Michael S; Sun, Fengzhu

2017-07-03

Alignment-free genome and metagenome comparisons are increasingly important with the development of next generation sequencing (NGS) technologies. Recently developed state-of-the-art k-mer based alignment-free dissimilarity measures including CVTree, $d_2^*$ and $d_2^S$ are more computationally expensive than measures based solely on the k-mer frequencies. Here, we report a standalone software, aCcelerated Alignment-FrEe sequence analysis (CAFE), for efficient calculation of 28 alignment-free dissimilarity measures. CAFE allows for both assembled genome sequences and unassembled NGS shotgun reads as input, and wraps the output in a standard PHYLIP format. In downstream analyses, CAFE can also be used to visualize the pairwise dissimilarity measures, including dendrograms, heatmap, principal coordinate analysis and network display. CAFE serves as a general k-mer based alignment-free analysis platform for studying the relationships among genomes and metagenomes, and is freely available at https://github.com/younglululu/CAFE. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
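For orientation, the simplest kind of measure mentioned above, one based solely on k-mer frequencies, can be sketched in a few lines. This is not CAFE's implementation and it is not one of the CVTree, $d_2^*$ or $d_2^S$ measures. It is a plain Manhattan distance between normalized k-mer frequency vectors, with hypothetical function names and toy sequences.

```python
from collections import Counter
from itertools import product

def kmer_frequencies(seq: str, k: int) -> dict[str, float]:
    """Normalized frequency of every A/C/G/T k-mer in a sequence."""
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    # Enumerate all 4**k unambiguous k-mers so both vectors share the same keys.
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    total = sum(counts[m] for m in kmers)
    return {m: (counts[m] / total if total else 0.0) for m in kmers}

def manhattan_dissimilarity(a: str, b: str, k: int = 3) -> float:
    """Sum of absolute frequency differences; 0 = identical profiles, 2 = disjoint."""
    fa, fb = kmer_frequencies(a, k), kmer_frequencies(b, k)
    return sum(abs(fa[m] - fb[m]) for m in fa)

# Toy sequences, not real genomes.
print(manhattan_dissimilarity("ACGTACGTACGT", "ACGTACGTACGT"))
print(manhattan_dissimilarity("AAAA", "CCCC", k=2))
```

The background-adjusted measures named in the abstract additionally model the expected k-mer counts, which is why they are described as more computationally expensive.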
Microbial community structure in fermentation process of Shaoxing rice wine by Illumina-based metagenomic sequencing.

PubMed

Xie, Guangfa; Wang, Lan; Gao, Qikang; Yu, Wenjing; Hong, Xutao; Zhao, Lingyun; Zou, Huijun

2013-09-01

To understand the role of the community structure of microbes in the environment in the fermentation of Shaoxing rice wine, samples collected from a wine factory were subjected to Illumina-based metagenomic sequencing. De novo assembly of the sequencing reads allowed the characterisation of more than 23 thousand microbial genes derived from 1.7 and 1.88 Gbp of sequences from two samples fermented for 5 and 30 days respectively. The microbial community structure at different fermentation times of Shaoxing rice wine was revealed, showing the different roles of the microbiota in the fermentation process of Shaoxing rice wine. The gene function of both samples was also studied in the COG database, with most genes belonging to category S (function unknown), category E (amino acid transport and metabolism) and unclassified group. The results show that both the microbial community structure and gene function composition change greatly at different time points of Shaoxing rice wine fermentation. © 2013 Society of Chemical Industry.
3D RNA and functional interactions from evolutionary couplings

PubMed Central

Weinreb, Caleb; Riesselman, Adam; Ingraham, John B.; Gross, Torsten; Sander, Chris; Marks, Debora S.

2016-01-01

Non-coding RNAs are ubiquitous, but the discovery of new RNA gene sequences far outpaces research on their structure and functional interactions. We mine the evolutionary sequence record to derive precise information about function and structure of RNAs and RNA-protein complexes. As in protein structure prediction, we use maximum entropy global probability models of sequence co-variation to infer evolutionarily constrained nucleotide-nucleotide interactions within RNA molecules, and nucleotide-amino acid interactions in RNA-protein complexes. The predicted contacts allow all-atom blinded 3D structure prediction at good accuracy for several known RNA structures and RNA-protein complexes. For unknown structures, we predict contacts in 160 non-coding RNA families. Beyond 3D structure prediction, evolutionary couplings help identify important functional interactions, e.g., at switch points in riboswitches and at a complex nucleation site in HIV. Aided by accelerating sequence accumulation, evolutionary coupling analysis can accelerate the discovery of functional interactions and 3D structures involving RNA. PMID:27087444
Estimates of Soil Bacterial Ribosome Content and Diversity Are Significantly Affected by the Nucleic Acid Extraction Method Employed

PubMed Central

Wüst, Pia K.; Nacke, Heiko; Kaiser, Kristin; Marhan, Sven; Sikorski, Johannes; Kandeler, Ellen; Daniel, Rolf

2016-01-01

Modern sequencing technologies allow high-resolution analyses of total and potentially active soil microbial communities based on their DNA and RNA, respectively. In the present study, quantitative PCR and 454 pyrosequencing were used to evaluate the effects of different extraction methods on the abundance and diversity of 16S rRNA genes and transcripts recovered from three different types of soils (leptosol, stagnosol, and gleysol). The quality and yield of nucleic acids varied considerably with respect to both the applied extraction method and the analyzed type of soil. The bacterial ribosome content (calculated as the ratio of 16S rRNA transcripts to 16S rRNA genes) can serve as an indicator of the potential activity of bacterial cells and differed by 2 orders of magnitude between nucleic acid extracts obtained by the various extraction methods. Depending on the extraction method, the relative abundances of dominant soil taxa, in particular Actinobacteria and Proteobacteria, varied by a factor of up to 10. Through this systematic approach, the present study allows guidelines to be deduced for the selection of the appropriate extraction protocol according to the specific soil properties, the nucleic acid of interest, and the target organisms. PMID:26896137

HASP server: a database and structural visualization platform for comparative models of influenza A hemagglutinin proteins.

PubMed

Ambroggio, Xavier I; Dommer, Jennifer; Gopalan, Vivek; Dunham, Eleca J; Taubenberger, Jeffery K; Hurt, Darrell E

2013-06-18

Influenza A viruses possess RNA genomes that mutate frequently in response to immune pressures. The mutations in the hemagglutinin genes are particularly significant, as the hemagglutinin proteins mediate attachment and fusion to host cells, thereby influencing viral pathogenicity and species specificity. Large-scale influenza A genome sequencing efforts have been ongoing to understand past epidemics and pandemics and anticipate future outbreaks. Sequencing efforts thus far have generated nearly 9,000 distinct hemagglutinin amino acid sequences. Comparative models for all publicly available influenza A hemagglutinin protein sequences (8,769 to date) were generated using the Rosetta modeling suite. The C-alpha root mean square deviations between a randomly chosen test set of models and their crystallographic templates were less than 2 Å, suggesting that the modeling protocols yielded high-quality results. The models were compiled into an online resource, the Hemagglutinin Structure Prediction (HASP) server. The HASP server was designed as a scientific tool for researchers to visualize hemagglutinin protein sequences of interest in a three-dimensional context. With a built-in molecular viewer, hemagglutinin models can be compared side-by-side and navigated by a corresponding sequence alignment. The models and alignments can be downloaded for offline use and further analysis. The modeling protocols used in the HASP server scale well for large amounts of sequences and will keep pace with expanded sequencing efforts. The conservative approach to modeling and the intuitive search and visualization interfaces allow researchers to quickly analyze hemagglutinin sequences of interest in the context of the most highly related experimental structures, and allow them to directly compare hemagglutinin sequences to each other simultaneously in their two- and three-dimensional contexts. 
The models and methodology have shown utility in current research efforts and the ongoing aim of the HASP server is to continue to accelerate influenza A research and have a positive impact on global public health.
Intact Protein Analysis at 21 Tesla and X-Ray Crystallography Define Structural Differences in Single Amino Acid Variants of Human Mitochondrial Branched-Chain Amino Acid Aminotransferase 2 (BCAT2)

NASA Astrophysics Data System (ADS)

Anderson, Lissa C.; Håkansson, Maria; Walse, Björn; Nilsson, Carol L.

2017-09-01

Structural technologies are an essential component in the design of precision therapeutics. Precision medicine entails the development of therapeutics directed toward a designated target protein, with the goal to deliver the right drug to the right patient at the right time. In the field of oncology, protein structural variants are often associated with oncogenic potential. In a previous proteogenomic screen of patient-derived glioblastoma (GBM) tumor materials, we identified a sequence variant of human mitochondrial branched-chain amino acid aminotransferase 2 as a putative factor of resistance of GBM to standard-of-care-treatments. The enzyme generates glutamate, which is neurotoxic. To elucidate structural coordinates that may confer altered substrate binding or activity of the variant BCAT2 T186R, a 45 kDa protein, we applied combined ETD and CID top-down mass spectrometry in a LC-FT-ICR MS at 21 T, and X-Ray crystallography in the study of both the variant and non-variant intact proteins. The combined ETD/CID fragmentation pattern allowed for not only extensive sequence coverage but also confident localization of the amino acid variant to its position in the sequence. The crystallographic experiments confirmed the hypothesis generated by in silico structural homology modeling, that the Lys59 side-chain of BCAT2 may repulse the Arg186 in the variant protein (PDB code: 5MPR), leading to destabilization of the protein dimer and altered enzyme kinetics. Taken together, the MS and novel 3D structural data give us reason to further pursue BCAT2 T186R as a precision drug target in GBM.
Introduction on Using the FastPCR Software and the Related Java Web Tools for PCR and Oligonucleotide Assembly and Analysis.

PubMed

Kalendar, Ruslan; Tselykh, Timofey V; Khassenov, Bekbolat; Ramanculov, Erlan M

2017-01-01

This chapter introduces the FastPCR software as an integrated tool environment for PCR primer and probe design, which predicts properties of oligonucleotides based on experimental studies of the PCR efficiency. The software provides comprehensive facilities for designing primers for most PCR applications and their combinations. These include the standard PCR as well as the multiplex, long-distance, inverse, real-time, group-specific, unique, overlap extension PCR for multi-fragment assembly cloning and loop-mediated isothermal amplification (LAMP). It also contains a built-in program to design oligonucleotide sets both for long sequence assembly by ligase chain reaction and for design of amplicons that tile across a region(s) of interest. The software calculates the melting temperature for the standard and degenerate oligonucleotides including locked nucleic acid (LNA) and other modifications. It also provides analyses for a set of primers with the prediction of oligonucleotide properties, dimer and G/C-quadruplex detection, linguistic complexity as well as a primer dilution and resuspension calculator. The program consists of various bioinformatics tools for analysis of sequences with the GC or AT skew, CG% and GA% content, and the purine-pyrimidine skew. It also analyzes the linguistic sequence complexity, generates random DNA sequences and performs restriction endonuclease analysis. The program can find or create restriction enzyme recognition sites for coding sequences and supports the clustering of sequences. It performs efficient and complete detection of various repeat types with visual display. The FastPCR software supports batch processing of sequence files, which is essential for automation. The program is available for download at http://primerdigital.com/fastpcr.html, and its online version is located at http://primerdigital.com/tools/pcr.html.
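Two of the sequence statistics listed above, GC skew and GC content, have simple textbook definitions: skew = (G − C) / (G + C), and content = (G + C) / length. The sketch below applies those general definitions and is not FastPCR's code or interface; the function names, window sizes and toy sequences are assumptions chosen for illustration.

```python
def gc_skew(seq: str) -> float:
    """(G - C) / (G + C); returns 0.0 when the sequence has no G or C."""
    seq = seq.upper()
    g, c = seq.count("G"), seq.count("C")
    return (g - c) / (g + c) if (g + c) else 0.0

def gc_content(seq: str) -> float:
    """Fraction of bases that are G or C; returns 0.0 for an empty sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0

def windowed_gc_skew(seq: str, window: int, step: int) -> list[float]:
    """GC skew of each full-length window, moving `step` bases at a time."""
    return [gc_skew(seq[i:i + window])
            for i in range(0, len(seq) - window + 1, step)]

# Toy sequences chosen so the values are easy to verify by hand.
print(gc_skew("GGGC"))
print(gc_content("GGCCAATT"))
print(windowed_gc_skew("GGGGCCCC", window=4, step=4))
```

AT skew and purine-pyrimidine skew follow the same pattern with the counted bases swapped.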
Study of nucleic acid-gold nanorod interactions and detecting nucleic acid hybridization using gold nanorod solutions in the presence of sodium citrate.

PubMed

Kanjanawarut, Roejarek; Su, Xiaodi

2010-09-01

In this study, the authors report that sodium citrate can aggregate hexadecyl-trimethyl-ammonium ion(+)-coated gold nanorods (AuNRs), and nucleic acids of different charge and structure properties, i.e., single-stranded DNA (ssDNA), double-stranded DNA (dsDNA), single-stranded peptide nucleic acid (PNA), and PNA-DNA complex, can bind to the AuNRs and therefore retard the sodium citrate-induced aggregation to different extents. The discovery that hybridized dsDNA (and the PNA-DNA complex) has a more pronounced protection effect than ssDNA (and PNA) allows the authors to develop a homogeneous phase AuNRs-based UV-visible (UV-vis) spectral assay for detecting specific sequences of oligonucleotides (20 mer) with a single-base-mismatch selectivity and a limit of detection of 5 nM. This assay involves no tedious bioconjugation and on-particle hybridization. The simple "set and test" format allows for a highly efficient hybridization in a homogeneous phase and a rapid display of the results in less than a minute. By measuring the degree of reduction in AuNR aggregation in the presence of different nucleic acid samples, one can assess how different nucleic acids interact with the AuNRs to complement the knowledge of spherical gold nanoparticles. Besides UV-vis characterization, transmission electron microscopy and zeta potential measurements were conducted to provide visual evidence of the particle aggregation and to support the discussion of the assay principle.
Complete amino acid sequence of bovine colostrum low-Mr cysteine proteinase inhibitor.

PubMed

Hirado, M; Tsunasawa, S; Sakiyama, F; Niinobe, M; Fujii, S

1985-07-01

The complete amino acid sequence of bovine colostrum cysteine proteinase inhibitor was determined by sequencing native inhibitor and peptides obtained by cyanogen bromide degradation, Achromobacter lysylendopeptidase digestion and partial acid hydrolysis of reduced and S-carboxymethylated protein. Achromobacter peptidase digestion was successfully used to isolate two disulfide-containing peptides. The inhibitor consists of 112 amino acids with an Mr of 12787. Two disulfide bonds were established between Cys 66 and Cys 77 and between Cys 90 and Cys 110. A high degree of homology in the sequence was found between the colostrum inhibitor and human gamma-trace, human salivary acidic protein and chicken egg-white cystatin.
ORFer--retrieval of protein sequences and open reading frames from GenBank and storage into relational databases or text files.

PubMed

Büssow, Konrad; Hoffmann, Steve; Sievert, Volker

2002-12-19

Functional genomics involves the parallel experimentation with large sets of proteins. This requires management of large sets of open reading frames as a prerequisite of the cloning and recombinant expression of these proteins. A Java program was developed for retrieval of protein and nucleic acid sequences and annotations from NCBI GenBank, using the XML sequence format. Annotations retrieved by ORFer include sequence name, organism and also the completeness of the sequence. The program has a graphical user interface, although it can be used in a non-interactive mode. For protein sequences, the program also extracts the open reading frame sequence, if available, and checks its correct translation. ORFer accepts user input in the form of single or lists of GenBank GI identifiers or accession numbers. It can be used to extract complete sets of open reading frames and protein sequences from any kind of GenBank sequence entry, including complete genomes or chromosomes. Sequences are either stored with their features in a relational database or can be exported as text files in Fasta or tabulator delimited format. The ORFer program is freely available at http://www.proteinstrukturfabrik.de/orfer. The ORFer program allows for fast retrieval of DNA sequences, protein sequences and their open reading frames and sequence annotations from GenBank. Furthermore, storage of sequences and features in a relational database is supported. Such a database can supplement a laboratory information system (LIMS) with appropriate sequence information.
Detection and isolation of nucleic acid sequences using competitive hybridization probes

DOEpatents

Lucas, Joe N.; Straume, Tore; Bogen, Kenneth T.

1997-01-01

A method for detecting a target nucleic acid sequence in a sample is provided using hybridization probes which competitively hybridize to a target nucleic acid. According to the method, a target nucleic acid sequence is hybridized to first and second hybridization probes which are complementary to overlapping portions of the target nucleic acid sequence, the first hybridization probe including a first complexing agent capable of forming a binding pair with a second complexing agent and the second hybridization probe including a detectable marker. The first complexing agent attached to the first hybridization probe is contacted with a second complexing agent, the second complexing agent being attached to a solid support such that when the first and second complexing agents are attached, target nucleic acid sequences hybridized to the first hybridization probe become immobilized on to the solid support. The immobilized target nucleic acids are then separated and detected by detecting the detectable marker attached to the second hybridization probe. A kit for performing the method is also provided.
Detection and isolation of nucleic acid sequences using competitive hybridization probes

DOEpatents

Lucas, J.N.; Straume, T.; Bogen, K.T.

1997-04-01

A method for detecting a target nucleic acid sequence in a sample is provided using hybridization probes which competitively hybridize to a target nucleic acid. According to the method, a target nucleic acid sequence is hybridized to first and second hybridization probes which are complementary to overlapping portions of the target nucleic acid sequence, the first hybridization probe including a first complexing agent capable of forming a binding pair with a second complexing agent and the second hybridization probe including a detectable marker. The first complexing agent attached to the first hybridization probe is contacted with a second complexing agent, the second complexing agent being attached to a solid support such that when the first and second complexing agents are attached, target nucleic acid sequences hybridized to the first hybridization probe become immobilized on to the solid support. The immobilized target nucleic acids are then separated and detected by detecting the detectable marker attached to the second hybridization probe. A kit for performing the method is also provided. 7 figs.
Functional and Structural Characterization of a Cation-dependent O-Methyltransferase from the Cyanobacterium Synechocystis sp. Strain PCC 6803

PubMed Central

Kopycki, Jakub Grzegorz; Stubbs, Milton T.; Brandt, Wolfgang; Hagemann, Martin; Porzel, Andrea; Schmidt, Jürgen; Schliemann, Willibald; Zenk, Meinhart H.; Vogt, Thomas

2008-01-01

The coding sequence of the cyanobacterium Synechocystis sp. strain PCC 6803 slr0095 gene was cloned and functionally expressed in Escherichia coli. The corresponding enzyme was classified as a cation- and S-adenosyl-l-methionine-dependent O-methyltransferase (SynOMT), consistent with considerable amino acid sequence identities to eukaryotic O-methyltransferases (OMTs). The substrate specificity of SynOMT was similar to that of plant and mammalian CCoAOMT-like proteins, accepting a variety of hydroxycinnamic acids and flavonoids as substrates. In contrast to the known mammalian and plant enzymes, which exclusively methylate the meta-hydroxyl position of aromatic di- and trihydroxy systems, SynOMT also methylates the para-position of hydroxycinnamic acids like 5-hydroxyferulic and 3,4,5-trihydroxycinnamic acid, resulting in the formation of novel compounds. The x-ray structure of SynOMT indicates that the active site allows for two alternative orientations of the hydroxylated substrates in comparison to the active sites of animal and plant enzymes, consistent with the observed preferred para-methylation and position promiscuity. Lys3 close to the N terminus of the recombinant protein appears to play a key role in the activity of the enzyme. The possible implications of these results with respect to modifications of precursors of polymers like lignin are discussed. PMID:18502765
iFeature: a python package and web server for features extraction and selection from protein and peptide sequences.

PubMed

Chen, Zhen; Zhao, Pei; Li, Fuyi; Leier, André; Marquez-Lago, Tatiana T; Wang, Yanan; Webb, Geoffrey I; Smith, A Ian; Daly, Roger J; Chou, Kuo-Chen; Song, Jiangning

2018-03-08

Structural and physicochemical descriptors extracted from sequence data have been widely used to represent sequences and predict structural, functional, expression and interaction profiles of proteins and peptides as well as DNAs/RNAs. Here, we present iFeature, a versatile Python-based toolkit for generating various numerical feature representation schemes for both protein and peptide sequences. iFeature is capable of calculating and extracting a comprehensive spectrum of 18 major sequence encoding schemes that encompass 53 different types of feature descriptors. It also allows users to extract specific amino acid properties from the AAindex database. Furthermore, iFeature integrates 12 different types of commonly used feature clustering, selection, and dimensionality reduction algorithms, greatly facilitating training, analysis, and benchmarking of machine-learning models. The functionality of iFeature is made freely available via an online web server and a stand-alone toolkit: http://iFeature.erc.monash.edu/ and https://github.com/Superzchen/iFeature/. Contact: jiangning.song@monash.edu; kcchou@gordonlifescience.org; roger.daly@monash.edu. Supplementary data are available at Bioinformatics online.
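As a rough illustration of what a numerical feature representation of a protein sequence looks like, the sketch below computes amino acid composition, a simple and widely used descriptor. It does not use or reproduce the iFeature API; the function name `amino_acid_composition` and the choice to ignore non-standard residues are assumptions made for this example.

```python
from collections import Counter

# The 20 standard amino acids, in a fixed order that defines the vector layout.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence):
    """Return a 20-element list of residue frequencies for a protein sequence.

    Characters outside the 20 standard amino acids are ignored (an assumption
    of this sketch, not a documented behaviour of any particular toolkit).
    """
    residues = [aa for aa in sequence.upper() if aa in AMINO_ACIDS]
    if not residues:
        raise ValueError("sequence contains no standard amino acids")
    counts = Counter(residues)
    return [counts[aa] / len(residues) for aa in AMINO_ACIDS]
```

The resulting fixed-length vector can be passed to any clustering or classification routine, which is the role such descriptors play in the workflow the abstract describes.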
Detection of nucleic acids by multiple sequential invasive cleavages

DOEpatents

Hall, Jeff G.; Lyamichev, Victor I.; Mast, Andrea L.; Brow, Mary Ann D.

1999-01-01

The present invention relates to means for the detection and characterization of nucleic acid sequences, as well as variations in nucleic acid sequences. The present invention also relates to methods for forming a nucleic acid cleavage structure on a target sequence and cleaving the nucleic acid cleavage structure in a site-specific manner. The structure-specific nuclease activity of a variety of enzymes is used to cleave the target-dependent cleavage structure, thereby indicating the presence of specific nucleic acid sequences or specific variations thereof. The present invention further relates to methods and devices for the separation of nucleic acid molecules based on charge. The present invention also provides methods for the detection of non-target cleavage products via the formation of a complete and activated protein binding region. The invention further provides sensitive and specific methods for the detection of human cytomegalovirus nucleic acid in a sample.
Nucleic acid detection kits

DOEpatents

Hall, Jeff G.; Lyamichev, Victor I.; Mast, Andrea L.; Brow, Mary Ann; Kwiatkowski, Robert W.; Vavra, Stephanie H.

2005-03-29

The present invention relates to means for the detection and characterization of nucleic acid sequences, as well as variations in nucleic acid sequences. The present invention also relates to methods for forming a nucleic acid cleavage structure on a target sequence and cleaving the nucleic acid cleavage structure in a site-specific manner. The structure-specific nuclease activity of a variety of enzymes is used to cleave the target-dependent cleavage structure, thereby indicating the presence of specific nucleic acid sequences or specific variations thereof. The present invention further relates to methods and devices for the separation of nucleic acid molecules based on charge. The present invention also provides methods for the detection of non-target cleavage products via the formation of a complete and activated protein binding region. The invention further provides sensitive and specific methods for the detection of nucleic acid from various viruses in a sample.
Detection of nucleic acids by multiple sequential invasive cleavages 02

DOEpatents

Hall, Jeff G.; Lyamichev, Victor I.; Mast, Andrea L.; Brow, Mary Ann D.

2002-01-01

The present invention relates to means for the detection and characterization of nucleic acid sequences, as well as variations in nucleic acid sequences. The present invention also relates to methods for forming a nucleic acid cleavage structure on a target sequence and cleaving the nucleic acid cleavage structure in a site-specific manner. The structure-specific nuclease activity of a variety of enzymes is used to cleave the target-dependent cleavage structure, thereby indicating the presence of specific nucleic acid sequences or specific variations thereof. The present invention further relates to methods and devices for the separation of nucleic acid molecules based on charge. The present invention also provides methods for the detection of non-target cleavage products via the formation of a complete and activated protein binding region. The invention further provides sensitive and specific methods for the detection of human cytomegalovirus nucleic acid in a sample.
Detection of nucleic acids by multiple sequential invasive cleavages

DOEpatents

Hall, Jeff G; Lyamichev, Victor I; Mast, Andrea L; Brow, Mary Ann D

2012-10-16

The present invention relates to means for the detection and characterization of nucleic acid sequences, as well as variations in nucleic acid sequences. The present invention also relates to methods for forming a nucleic acid cleavage structure on a target sequence and cleaving the nucleic acid cleavage structure in a site-specific manner. The structure-specific nuclease activity of a variety of enzymes is used to cleave the target-dependent cleavage structure, thereby indicating the presence of specific nucleic acid sequences or specific variations thereof. The present invention further relates to methods and devices for the separation of nucleic acid molecules based on charge. The present invention also provides methods for the detection of non-target cleavage products via the formation of a complete and activated protein binding region. The invention further provides sensitive and specific methods for the detection of human cytomegalovirus nucleic acid in a sample.
Simple, multiplexed, PCR-based barcoding of DNA enables sensitive mutation detection in liquid biopsies using sequencing.

PubMed

Ståhlberg, Anders; Krzyzanowski, Paul M; Jackson, Jennifer B; Egyud, Matthew; Stein, Lincoln; Godfrey, Tony E

2016-06-20

Detection of cell-free DNA in liquid biopsies offers great potential for use in non-invasive prenatal testing and as a cancer biomarker. Fetal and tumor DNA fractions however can be extremely low in these samples and ultra-sensitive methods are required for their detection. Here, we report an extremely simple and fast method for introduction of barcodes into DNA libraries made from 5 ng of DNA. Barcoded adapter primers are designed with an oligonucleotide hairpin structure to protect the molecular barcodes during the first rounds of polymerase chain reaction (PCR) and prevent them from participating in mis-priming events. Our approach enables high-level multiplexing and next-generation sequencing library construction with flexible library content. We show that uniform libraries of 1-, 5-, 13- and 31-plex can be generated. Utilizing the barcodes to generate consensus reads for each original DNA molecule reduces background sequencing noise and allows detection of variant alleles below 0.1% frequency in clonal cell line DNA and in cell-free plasma DNA. Thus, our approach bridges the gap between the highly sensitive but specific capabilities of digital PCR, which only allows a limited number of variants to be analyzed, with the broad target capability of next-generation sequencing which traditionally lacks the sensitivity to detect rare variants. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
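The idea of collapsing barcoded reads into one consensus read per original DNA molecule can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: reads are assumed to arrive as (barcode, sequence) pairs, reads sharing a barcode are assumed to be aligned to the same start position, and the function name and the `min_family_size` threshold are hypothetical.

```python
from collections import Counter, defaultdict


def consensus_by_barcode(reads, min_family_size=3):
    """Build a majority-vote consensus sequence for each barcode family.

    reads: iterable of (barcode, sequence) pairs.
    Families with fewer than min_family_size reads are skipped, because a
    majority vote over very few reads cannot separate errors from signal.
    """
    families = defaultdict(list)
    for barcode, sequence in reads:
        families[barcode].append(sequence)

    consensus = {}
    for barcode, sequences in families.items():
        if len(sequences) < min_family_size:
            continue
        length = min(len(s) for s in sequences)  # compare only shared positions
        consensus[barcode] = "".join(
            Counter(s[i] for s in sequences).most_common(1)[0][0]
            for i in range(length)
        )
    return consensus
```

A base that appears in only one read of a family is outvoted by the other reads, which is how consensus building suppresses sequencing and late-cycle PCR errors while a true variant, present in every read of its family, is retained.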
The Microbial Genomes Atlas (MiGA) webserver: taxonomic and gene diversity analysis of Archaea and Bacteria at the whole genome level.

PubMed

Rodriguez-R, Luis M; Gunturu, Santosh; Harvey, William T; Rosselló-Mora, Ramon; Tiedje, James M; Cole, James R; Konstantinidis, Konstantinos T

2018-06-14

The small subunit ribosomal RNA gene (16S rRNA) has been successfully used to catalogue and study the diversity of prokaryotic species and communities but it offers limited resolution at the species and finer levels, and cannot represent the whole-genome diversity and fluidity. To overcome these limitations, we introduced the Microbial Genomes Atlas (MiGA), a webserver that allows the classification of an unknown query genomic sequence, complete or partial, against all taxonomically classified taxa with available genome sequences, as well as comparisons to other related genomes including uncultivated ones, based on the genome-aggregate Average Nucleotide and Amino Acid Identity (ANI/AAI) concepts. MiGA integrates best practices in sequence quality trimming and assembly and allows input to be raw reads or assemblies from isolate genomes, single-cell sequences, and metagenome-assembled genomes (MAGs). Further, MiGA can take as input hundreds of closely related genomes of the same or closely related species (a so-called 'Clade Project') to assess their gene content diversity and evolutionary relationships, and calculate important clade properties such as the pangenome and core gene sets. Therefore, MiGA is expected to facilitate a range of genome-based taxonomic and diversity studies, and quality assessment across environmental and clinical settings. MiGA is available at http://microbial-genomes.org/.
The short interspersed repetitive element of Trypanosoma cruzi, SIRE, is part of VIPER, an unusual retroelement related to long terminal repeat retrotransposons

PubMed Central

Vázquez, Martín; Ben-Dov, Claudia; Lorenzi, Hernan; Moore, Troy; Schijman, Alejandro; Levin, Mariano J.

2000-01-01

The short interspersed repetitive element (SIRE) of Trypanosoma cruzi was first detected when comparing the sequences of loci that encode the TcP2β genes. It is present in about 1,500–3,000 copies per genome, depending on the strain, and it is distributed in all chromosomes. An initial analysis of SIRE sequences from 21 genomic fragments allowed us to derive a consensus nucleotide sequence and structure for the element, consisting of three regions (I, II, and III) each harboring distinctive features. Analysis of 158 transcribed SIREs demonstrates that the consensus is highly conserved. The sequences of 51 cDNAs show that SIRE is included in the 3′ end of several mRNAs, always transcribed from the sense strand, contributing the polyadenylation site in 63% of the cases. This study led to the characterization of VIPER (vestigial interposed retroelement), a 2,326-bp-long unusual retroelement. VIPER's 5′ end is formed by the first 182 bp of SIRE, whereas its 3′ end is formed by the last 220 bp of the element. Both SIRE moieties are connected by a 1,924-bp-long fragment that carries a unique ORF encoding a complete reverse transcriptase-RNase H gene whose 15 C-terminal amino acids derive from codons specified by SIRE's region II. The amino acid sequence of VIPER's reverse transcriptase-RNase H shares significant homology to that of long terminal repeat retrotransposons. The fact that SIRE and VIPER sequences are found only in the T. cruzi genome may be of relevance for studies concerning the evolution and the genome flexibility of this protozoan parasite. PMID:10688909
Evolutionary relationships in the ilarviruses: nucleotide sequence of prunus necrotic ringspot virus RNA 3.

PubMed

Sánchez-Navarro, J A; Pallás, V

1997-01-01

The complete nucleotide sequence of an isolate of prunus necrotic ringspot virus (PNRSV) RNA 3 has been determined. Elucidation of the amino acid sequence of the proteins encoded by the two large open reading frames (ORFs) allowed us to carry out comparative and phylogenetic studies on the movement (MP) and coat (CP) proteins in the ilarvirus group. Amino acid sequence comparison of the MP revealed a highly conserved basic sequence motif with an amphipathic alpha-helical structure preceding the conserved motif of the '30K superfamily' proposed by Mushegian and Koonin [26] for MP's. Within this '30K' motif a strictly conserved transmembrane domain is present in all ilarviruses sequenced so far. At the amino-terminal end, prune dwarf virus (PDV) has an extension not present in other ilarviruses but which is observed in all bromo- and cucumoviruses, suggesting a common ancestor or a recombinational event in the Bromoviridae family. Examination of the N-terminus of the CP's of all ilarviruses revealed a highly basic region, part of which resembles the Arg-rich motif that has been characterized in the RNA-binding protein family. This motif has also been found in the other members of the Bromoviridae family, suggesting its involvement in a structural function. Furthermore this region is required for infectivity in ilarviruses. The similarities found in this Arg-rich motif are discussed in terms of this process known as genome activation. Finally, phylogenetic analysis of both the MP and CP proteins revealed a higher relationship of AlMV to PNRSV, apple mosaic virus (ApMV) and PDV than any other member of the ilarvirus group. In that sense, AlMV should be considered as a true ilarvirus instead of forming a distinct group of viruses.
An OmpA Family Protein, a Target of the GinI/GinR Quorum-Sensing System in Gluconacetobacter intermedius, Controls Acetic Acid Fermentation

PubMed Central

Iida, Aya; Ohnishi, Yasuo; Horinouchi, Sueharu

2008-01-01

Via N-acylhomoserine lactones, the GinI/GinR quorum-sensing system in Gluconacetobacter intermedius NCI1051, a gram-negative acetic acid bacterium, represses acetic acid and gluconic acid fermentation. Two-dimensional polyacrylamide gel electrophoretic analysis of protein profiles of strain NCI1051 and ginI and ginR mutants identified a protein that was produced in response to the GinI/GinR regulatory system. Cloning and nucleotide sequencing of the gene encoding this protein revealed that it encoded an OmpA family protein, named GmpA. gmpA was a member of the gene cluster containing three adjacent homologous genes, gmpA to gmpC, the organization of which appeared to be unique to vinegar producers, including “Gluconacetobacter polyoxogenes.” In addition, GmpA was unique among the OmpA family proteins in that its N-terminal membrane domain forming eight antiparallel transmembrane β-strands contained an extra sequence in one of the surface-exposed loops. Transcriptional analysis showed that only gmpA of the three adjacent gmp genes was activated by the GinI/GinR quorum-sensing system. However, gmpA was not controlled directly by GinR but was controlled by an 89-amino-acid protein, GinA, a target of this quorum-sensing system. A gmpA mutant grew more rapidly in the presence of 2% (vol/vol) ethanol and accumulated acetic acid and gluconic acid in greater final yields than strain NCI1051. Thus, GmpA plays a role in repressing oxidative fermentation, including acetic acid fermentation, which is unique to acetic acid bacteria and allows ATP synthesis via ethanol oxidation. Consistent with the involvement of gmpA in oxidative fermentation, its transcription was also enhanced by ethanol and acetic acid. PMID:18487322
Complete nucleotide and derived amino acid sequence of cDNA encoding the mitochondrial uncoupling protein of rat brown adipose tissue: lack of a mitochondrial targeting presequence.

PubMed Central

Ridley, R G; Patel, H V; Gerber, G E; Morton, R C; Freeman, K B

1986-01-01

A cDNA clone spanning the entire amino acid sequence of the nuclear-encoded uncoupling protein of rat brown adipose tissue mitochondria has been isolated and sequenced. With the exception of the N-terminal methionine the deduced N-terminus of the newly synthesized uncoupling protein is identical to the N-terminal 30 amino acids of the native uncoupling protein as determined by protein sequencing. This proves that the protein contains no N-terminal mitochondrial targeting prepiece and that a targeting region must reside within the amino acid sequence of the mature protein. PMID:3012461

Method of increasing conversion of a fatty acid to its corresponding dicarboxylic acid

DOEpatents

Craft, David L.; Wilson, C. Ron; Eirich, Dudley; Zhang, Yeyan

2004-09-14

A nucleic acid sequence including a CYP promoter operably linked to nucleic acid encoding a heterologous protein is provided to increase transcription of the nucleic acid. Expression vectors and host cells containing the nucleic acid sequence are also provided. The methods and compositions described herein are especially useful in the production of polycarboxylic acids by yeast cells.
A putative carbohydrate-binding domain of the lactose-binding Cytisus sessilifolius anti-H(O) lectin has a similar amino acid sequence to that of the L-fucose-binding Ulex europaeus anti-H(O) lectin.

PubMed

Konami, Y; Yamamoto, K; Osawa, T; Irimura, T

1995-04-01

The complete amino acid sequence of a lactose-binding Cytisus sessilifolius anti-H(O) lectin II (CSA-II) was determined using a protein sequencer. After digestion of CSA-II with endoproteinase Lys-C or Asp-N, the resulting peptides were purified by reversed-phase high performance liquid chromatography (HPLC) and then subjected to sequence analysis. Comparison of the complete amino acid sequence of CSA-II with the sequences of other leguminous seed lectins revealed regions of extensive homology. The amino acid sequence of a putative carbohydrate-binding domain of CSA-II was found to be similar to those of several anti-H(O) leguminous lectins, especially to that of the L-fucose-binding Ulex europaeus lectin I (UEA-I).
An intuitive graphical webserver for multiple-choice protein sequence search.

PubMed

Banky, Daniel; Szalkai, Balazs; Grolmusz, Vince

2014-04-10

Every day tens of thousands of sequence searches and sequence alignment queries are submitted to webservers. The capitalized word "BLAST" has become a verb, describing the act of performing sequence search and alignment. However, if one needs to search for sequences that contain, for example, two hydrophobic and three polar residues at five given positions, the query formation on the most frequently used webservers will be difficult. Some servers support the formation of queries with regular expressions, but most users are unfamiliar with their syntax. Here we present an intuitive, easily applicable webserver, the Protein Sequence Analysis server, that allows the formation of multiple choice queries by simply drawing the residues to their positions; if more than one residue is drawn to the same position, the residues are stacked on the user interface, indicating the multiple choice at that position. This computer-game-like interface is natural and intuitive, and the coloring of the residues makes it possible to form queries requiring not just certain amino acids in the given positions, but also small nonpolar, negatively charged, hydrophobic, positively charged, or polar ones. The webserver is available at http://psa.pitgroup.org. Copyright © 2014 Elsevier B.V. All rights reserved.
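The regular-expression form of a multiple-choice query, which the abstract notes most users find hard to write by hand, can be generated programmatically. The Python sketch below is an illustration only and is not the server's implementation; the function name `multiple_choice_pattern` and the residue groupings `HYDROPHOBIC` and `POLAR` are assumptions chosen for the example.

```python
import re

# Illustrative residue groups (an assumption; classifications vary by source).
HYDROPHOBIC = "AVLIMFW"
POLAR = "STNQ"


def multiple_choice_pattern(choices):
    """Compile a regex from a list of per-position residue choices.

    choices: one string per position listing the allowed residues;
    an empty string means any residue is accepted at that position.
    """
    parts = []
    for allowed in choices:
        if not allowed:
            parts.append(".")
            continue
        if not allowed.isalpha():
            raise ValueError("residue choices must be letters")
        parts.append("[" + "".join(sorted(set(allowed.upper()))) + "]")
    return re.compile("".join(parts))


# Two hydrophobic residues followed by three polar residues, the example
# query given in the abstract.
pattern = multiple_choice_pattern([HYDROPHOBIC, HYDROPHOBIC, POLAR, POLAR, POLAR])
match = pattern.search("GGALSTNGG")
```

Here `match.group()` is the five-residue window `ALSTN`, found at offset 2 of the query sequence.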
Metagenomic approaches for direct and cell culture evaluation of the virological quality of wastewater

DOE Office of Scientific and Technical Information (OSTI.GOV)

Aw, Tiong Gim; Howe, Adina; Rose, Joan B.

2014-12-01

Genomic-based molecular techniques are emerging as powerful tools that allow a comprehensive characterization of water and wastewater microbiomes. Most recently, next generation sequencing (NGS) technologies which produce large amounts of sequence data are beginning to impact the field of environmental virology. In this study, NGS and bioinformatics have been employed for the direct detection and characterization of viruses in wastewater and of viruses isolated after cell culture. Viral particles were concentrated and purified from sewage samples by polyethylene glycol precipitation. Viral nucleic acid was extracted and randomly amplified prior to sequencing using Illumina technology, yielding a total of 18 million sequence reads. Most of the viral sequences detected could not be characterized, indicating the great viral diversity that is yet to be discovered. This sewage virome was dominated by bacteriophages and contained sequences related to known human pathogenic viruses such as adenoviruses (species B, C and F), polyomaviruses JC and BK and enteroviruses (type B). An array of other animal viruses was also found, suggesting unknown zoonotic viruses. This study demonstrated the feasibility of metagenomic approaches to characterize viruses in complex environmental water samples.
Identification and Characterization of the Genes and Enzymes Belonging to the Bile Acid Catabolic Pathway in Pseudomonas.

PubMed

Luengo, José M; Olivera, Elías R

2017-01-01

The study of the catabolic potential of microbial species isolated from different habitats has allowed the identification and characterization of bacteria able to assimilate bile acids and other steroids (e.g., testosterone and 4-androsten-3,17-dione). From soil samples, we have isolated several strains belonging to genus Pseudomonas that grow efficiently in chemically defined media containing some cyclopentane-perhydro-phenanthrene derivatives as carbon sources. Genetic and biochemical studies performed with one of these bacteria (P. putida DOC21) allowed the identification of the genes and enzymes belonging to the 9,10-seco pathway, the route involved in the aerobic assimilation of steroids. In this manuscript, we describe the most relevant methods required for (1) isolation and characterization of these species; (2) determining the chromosomal location, nucleotide sequence, and functional analysis of the catabolic genes (or gene clusters) encoding the enzymes from this pathway; and (3) the tools employed to establish the role of some of the proteins that participate in this route.
Detection and Characterization of Viral Species/Subspecies Using Isothermal Recombinase Polymerase Amplification (RPA) Assays.

PubMed

Glais, Laurent; Jacquot, Emmanuel

2015-01-01

Numerous molecular-based detection protocols include an amplification step of the targeted nucleic acids. This step is important to reach the expected sensitive detection of pathogens in diagnostic procedures. Amplifications of nucleic acid sequences are generally performed, in the presence of appropriate primers, using thermocyclers. However, the time required to amplify molecular targets and the cost of the thermocycler machines could impair the use of these methods in routine diagnostics. The recombinase polymerase amplification (RPA) technique allows rapid (short-term incubation of sample and primers in an enzymatic mixture) and simple (isothermal) amplification of molecular targets. The RPA protocol requires only basic molecular steps such as extraction procedures and agarose gel electrophoresis. Thus, RPA can be considered as an interesting alternative to standard molecular-based diagnostic tools. In this paper, the complete procedures to set up an RPA assay, applied to detection of RNA (Potato virus Y, Potyvirus) and DNA (Wheat dwarf virus, Mastrevirus) viruses, are described. The proposed procedure allows the development of species- or subspecies-specific detection assays.
Amino-terminal domain of the v-fms oncogene product includes a functional signal peptide that directs synthesis of a transforming glycoprotein in the absence of feline leukemia virus gag sequences

DOE Office of Scientific and Technical Information (OSTI.GOV)

Wheeler, E.F.; Roussel, M.F.; Hampe, A.

1986-08-01

The nucleotide sequence of a 5' segment of the human genomic c-fms proto-oncogene suggested that recombination between feline leukemia virus and feline c-fms sequences might have occurred in a region encoding the 5' untranslated portion of c-fms mRNA. The polyprotein precursor gP180^gag-fms encoded by the McDonough strain of feline sarcoma virus was therefore predicted to contain 34 v-fms-coded amino acids derived from sequences of the c-fms gene that are not ordinarily translated from the proto-oncogene mRNA. The gP180^gag-fms polyprotein was cotranslationally cleaved near the gag-fms junction to remove its gag gene-coded portion. Determination of the amino-terminal sequence of the resulting v-fms-coded glycoprotein, gp120^v-fms, showed that the site of proteolysis corresponded to a predicted signal peptidase cleavage site within the c-fms gene product. Together, these analyses suggested that the linked gag sequences may not be necessary for expression of a biologically active v-fms gene product. The gag-fms sequences of feline sarcoma virus strain McDonough and the v-fms sequences alone were inserted into a murine retroviral vector containing a neomycin resistance gene. The authors conclude that a cryptic hydrophobic signal peptide sequence in v-fms was unmasked by gag deletion, thereby allowing the correct orientation and transport of the v-fms gene product within membranous organelles. It seems likely that the proteolytic cleavage of gP180^gag-fms is mediated by signal peptidase and that the amino termini of gp140^v-fms and the c-fms gene product are identical.
Primary structure of prostaglandin G/H synthase from sheep vesicular gland determined from the complementary DNA sequence.

PubMed Central

DeWitt, D L; Smith, W L

1988-01-01

Prostaglandin G/H synthase (8,11,14-icosatrienoate, hydrogen-donor:oxygen oxidoreductase, EC 1.14.99.1) catalyzes the first step in the formation of prostaglandins and thromboxanes, the conversion of arachidonic acid to prostaglandin endoperoxides G and H. This enzyme is the site of action of nonsteroidal anti-inflammatory drugs. We have isolated a 2.7-kilobase complementary DNA (cDNA) encompassing the entire coding region of prostaglandin G/H synthase from sheep vesicular glands. This cDNA, cloned from a lambda gt 10 library prepared from poly(A)+ RNA of vesicular glands, hybridizes with a single 2.75-kilobase mRNA species. The cDNA clone was selected using oligonucleotide probes modeled from amino acid sequences of tryptic peptides prepared from the purified enzyme. The full-length cDNA encodes a protein of 600 amino acids, including a signal sequence of 24 amino acids. Identification of the cDNA as coding for prostaglandin G/H synthase is based on comparison of amino acid sequences of seven peptides comprising 103 amino acids with the amino acid sequence deduced from the nucleotide sequence of the cDNA. The molecular weight of the unglycosylated enzyme lacking the signal peptide is 65,621. The synthase is a glycoprotein, and there are three potential sites for N-glycosylation, two of them in the amino-terminal half of the molecule. The serine reported to be acetylated by aspirin is at position 530, near the carboxyl terminus. There is no significant similarity between the sequence of the synthase and that of any other protein in amino acid or nucleotide sequence libraries, and a heme binding site(s) is not apparent from the amino acid sequence. The availability of a full-length cDNA clone coding for prostaglandin G/H synthase should facilitate studies of the regulation of expression of this enzyme and the structural features important for catalysis and for interaction with anti-inflammatory drugs. PMID:3125548
PubDNA Finder: a web database linking full-text articles to sequences of nucleic acids.

PubMed

García-Remesal, Miguel; Cuevas, Alejandro; Pérez-Rey, David; Martín, Luis; Anguita, Alberto; de la Iglesia, Diana; de la Calle, Guillermo; Crespo, José; Maojo, Víctor

2010-11-01

PubDNA Finder is an online repository that we have created to link PubMed Central manuscripts to the sequences of nucleic acids appearing in them. It extends the search capabilities provided by PubMed Central by enabling researchers to perform advanced searches involving sequences of nucleic acids. This includes, among other features (i) searching for papers mentioning one or more specific sequences of nucleic acids and (ii) retrieving the genetic sequences appearing in different articles. These additional query capabilities are provided by a searchable index that we created by using the full text of the 176 672 papers available at PubMed Central at the time of writing and the sequences of nucleic acids appearing in them. To automatically extract the genetic sequences occurring in each paper, we used an original method we have developed. The database is updated monthly by automatically connecting to the PubMed Central FTP site to retrieve and index new manuscripts. Users can query the database via the web interface provided. PubDNA Finder can be freely accessed at http://servet.dia.fi.upm.es:8080/pubdnafinder
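The abstract does not describe how nucleic acid sequences are recognised in article text, so the Python sketch below should be read as a naive stand-in rather than the authors' method: it simply collects runs of nucleotide letters above a minimum length. The function name `extract_nucleotide_sequences` and the default threshold of 12 are assumptions made for this example.

```python
import re


def extract_nucleotide_sequences(text, min_length=12):
    """Return runs of A/C/G/T/U letters of at least min_length characters.

    A run must not be directly preceded or followed by another letter, so
    fragments inside ordinary words are not reported. Short runs such as
    gene names ("CAT") fall below the length threshold and are ignored.
    """
    pattern = re.compile(
        r"(?<![A-Za-z])[ACGTUacgtu]{%d,}(?![A-Za-z])" % min_length
    )
    return [m.group().upper() for m in pattern.finditer(text)]
```

Applied to a sentence such as "The forward primer 5'-ATGGCCAAGTTGACCAGT-3' was used with the CAT gene.", this returns only the 18-nucleotide primer. A production index would also need to handle sequences broken across lines or written with spaces and position numbers, which this sketch does not attempt.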
Parameters of proteome evolution from histograms of amino-acid sequence identities of paralogous proteins

PubMed Central

Axelsen, Jacob Bock; Yan, Koon-Kiu; Maslov, Sergei

2007-01-01

Background: The evolution of the full repertoire of proteins encoded in a given genome is mostly driven by gene duplications, deletions, and sequence modifications of existing proteins. Indirect information about relative rates and other intrinsic parameters of these three basic processes is contained in the proteome-wide distribution of sequence identities of pairs of paralogous proteins. Results: We introduce a simple mathematical framework based on a stochastic birth-and-death model that allows one to extract some of this information and apply it to the set of all pairs of paralogous proteins in H. pylori, E. coli, S. cerevisiae, C. elegans, D. melanogaster, and H. sapiens. It was found that the histogram of sequence identities $p$ generated by an all-to-all alignment of all protein sequences encoded in a genome is well fitted with a power-law form $\sim p^{-\gamma}$ with the value of the exponent $\gamma$ around 4 for the majority of organisms used in this study. This implies that the intra-protein variability of substitution rates is best described by the Gamma-distribution with the exponent $\alpha \approx 0.33$. Different features of the shape of such histograms allow us to quantify the ratio between the genome-wide average deletion/duplication rates and the amino-acid substitution rate. Conclusion: We separately measure the short-term ("raw") duplication and deletion rates $r^{*}_{dup}$, $r^{*}_{del}$, which include gene copies that will be removed soon after the duplication event, and their dramatically reduced long-term counterparts $r_{dup}$, $r_{del}$. High deletion rate among recently duplicated proteins is consistent with a scenario in which they didn't have enough time to significantly change their functional roles and thus are to a large degree disposable. Systematic trends of each of the four duplication/deletion rates with the total number of genes in the genome were analyzed. All but the deletion rate of recent duplicates $r^{*}_{del}$ were shown to systematically increase with $N_{genes}$. Abnormally flat shapes of sequence identity histograms observed for yeast and human are consistent with lineages leading to these organisms undergoing one or more whole-genome duplications. This interpretation is corroborated by our analysis of the genome of Paramecium tetraurelia where the $p^{-4}$ profile of the histogram is gradually restored by the successive removal of paralogs generated in its four known whole-genome duplication events. PMID:18039386
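To make the power-law statement concrete, here is a minimal sketch of how an exponent γ could be estimated from a histogram of sequence identities by ordinary least squares on log-transformed values. This is an illustrative fitting approach chosen for simplicity, not the procedure used by the authors; the function name and the synthetic histogram are assumptions.

```python
import math


def fit_power_law_exponent(identities, counts):
    """Estimate gamma in counts ~ C * p**(-gamma).

    Fits a straight line to (log p, log count) by least squares and
    returns the negated slope. Bins with non-positive values are skipped.
    """
    points = [
        (math.log(p), math.log(c))
        for p, c in zip(identities, counts)
        if p > 0 and c > 0
    ]
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    return -sxy / sxx


# Synthetic, noise-free histogram that follows p**-4 exactly (invented data).
identity_bins = [0.30 + 0.05 * i for i in range(15)]
pair_counts = [1000.0 * p ** -4 for p in identity_bins]
gamma = fit_power_law_exponent(identity_bins, pair_counts)
```

On this noise-free input the fit recovers an exponent of 4 up to floating-point error. With real histograms, a flatter profile (smaller fitted γ) would correspond to the deviation the abstract associates with whole-genome duplications, although log-log regression on binned data is known to be a crude estimator.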
Superimposed Code Theoretic Analysis of Deoxyribonucleic Acid (DNA) Codes and DNA Computing

DTIC Science & Technology

2010-01-01

partitioned by font type) of sequences are allowed to be in each position (e.g., Arial = position 0, Comic = position 1, etc. ) and within each collection...movement was modeled by a Brownian motion 3 dimensional random walk. The one dimensional diffusion coefficient D for the ellipsoid shape with 3...temperature, kB is Boltzmann’s constant, and η is the viscosity of the medium. The random walk motion is modeled by assuming the oligo is on a three
Antigenic specificity of the Mycobacterium leprae homologue of ESAT-6.

PubMed

Spencer, John S; Marques, Maria Angela M; Lima, Monica C B S; Junqueira-Kipnis, Ana Paula; Gregory, Bruce C; Truman, Richard W; Brennan, Patrick J

2002-02-01

The sequence of the Mycobacterium leprae homologue of ESAT-6 shows only 36% amino acid correspondence to that from Mycobacterium tuberculosis. Anti-M. leprae ESAT-6 polyclonal and monoclonal antibodies and T-cell hybridomas reacted only with the homologous protein and allowed identification of the B- and T-cell epitopes. The protein is expressed in M. leprae and appears in the cell wall fraction. Thus, M. leprae ESAT-6 shows promise as a specific diagnostic agent for leprosy.
Nucleotide sequence analysis of the gene encoding the Deinococcus radiodurans surface protein, derived amino acid sequence, and complementary protein chemical studies

DOE Office of Scientific and Technical Information (OSTI.GOV)

Peters, J.; Peters, M.; Lottspeich, F.

1987-11-01

The complete nucleotide sequence of the gene encoding the surface (hexagonally packed intermediate (HPI))-layer polypeptide of Deinococcus radiodurans Sark was determined and found to encode a polypeptide of 1036 amino acids. Amino acid sequence analysis of about 30% of the residues revealed that the mature polypeptide consists of at least 978 amino acids. The N terminus was blocked to Edman degradation. The results of proteolytic modification of the HPI layer in situ and $M_r$ estimations of the HPI polypeptide expressed in Escherichia coli indicated that there is a leader sequence. The N-terminal region contained a very high percentage (29%) of threonine and serine, including a cluster of nine consecutive serine or threonine residues, whereas a stretch near the C terminus was extremely rich in aromatic amino acids (29%). The protein contained at least two disulfide bridges, as well as tightly bound reducing sugars and fatty acids.
Artificial mismatch hybridization

DOEpatents

Guo, Zhen; Smith, Lloyd M.

1998-01-01

An improved nucleic acid hybridization process is provided which employs a modified oligonucleotide and improves the ability to discriminate a control nucleic acid target from a variant nucleic acid target containing a sequence variation. The modified probe contains at least one artificial mismatch relative to the control nucleic acid target in addition to any mismatch(es) arising from the sequence variation. The invention has direct and advantageous application to numerous existing hybridization methods, including applications that employ, for example, the polymerase chain reaction, allele-specific nucleic acid sequencing methods, and diagnostic hybridization methods.
Detection and isolation of nucleic acid sequences using a bifunctional hybridization probe

DOEpatents

Lucas, Joe N.; Straume, Tore; Bogen, Kenneth T.

2000-01-01

A method for detecting and isolating a target sequence in a sample of nucleic acids is provided using a bifunctional hybridization probe capable of hybridizing to the target sequence that includes a detectable marker and a first complexing agent capable of forming a binding pair with a second complexing agent. A kit is also provided for detecting a target sequence in a sample of nucleic acids using a bifunctional hybridization probe according to this method.
A mobile element in mutS drives hypermutation in a marine Vibrio

DOE PAGES

Chu, Nathaniel D.; Clarke, Sean A.; Timberlake, Sonia; ...

2017-02-07

Bacteria face a trade-off between genetic fidelity, which reduces deleterious mistakes in the genome, and genetic innovation, which allows organisms to adapt. Evidence suggests that many bacteria balance this trade-off by modulating their mutation rates, but few mechanisms have been described for such modulation. Following experimental evolution and whole-genome resequencing of the marine bacterium Vibrio splendidus 12B01, we discovered one such mechanism, which allows this bacterium to switch to an elevated mutation rate. This switch is driven by the excision of a mobile element residing in mutS, which encodes a DNA mismatch repair protein. When integrated within the bacterial genome, the mobile element provides independent promoter and translation start sequences for mutS—different from the bacterium’s original mutS promoter region—which allow the bacterium to make a functional mutS gene product. Excision of this mobile element rejoins the mutS gene with host promoter and translation start sequences but leaves a 2-bp deletion in the mutS sequence, resulting in a frameshift and a hypermutator phenotype. We further identified hundreds of clinical and environmental bacteria across Betaproteobacteria and Gammaproteobacteria that possess putative mobile elements within the same amino acid motif in mutS. In a subset of these bacteria, we detected excision of the element but not a frameshift mutation; the mobile elements leave an intact mutS coding sequence after excision. Finally, our findings reveal a novel mechanism by which one bacterium alters its mutation rate and hint at a possible evolutionary role for mobile elements within mutS in other bacteria.
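The reason a 2-bp deletion is so disruptive is that it shifts the reading frame for every downstream codon. The sketch below illustrates this with the standard genetic code and an invented toy gene; the sequence, the deletion position, and the helper names are assumptions for illustration and have nothing to do with the actual mutS sequence.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate from position 0 until a stop codon or the last full codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def delete_bases(dna, start, length):
    """Remove `length` bases beginning at 0-based index `start`."""
    return dna[:start] + dna[start + length:]


# Invented toy gene: ATG GCT GAA AAA CTG GGT TAA
wild_type = "ATGGCTGAAAAACTGGGTTAA"
frameshifted = delete_bases(wild_type, start=6, length=2)

wild_protein = translate(wild_type)
mutant_protein = translate(frameshifted)
```

In this toy case the first two residues are unchanged, but every residue after the deletion differs and the original stop codon is no longer read in frame. A deletion whose length is a multiple of three would instead remove whole codons and leave the downstream frame intact, which is consistent with the abstract's observation that some excisions leave an intact coding sequence.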
A mobile element in mutS drives hypermutation in a marine Vibrio

DOE Office of Scientific and Technical Information (OSTI.GOV)

Chu, Nathaniel D.; Clarke, Sean A.; Timberlake, Sonia

Bacteria face a trade-off between genetic fidelity, which reduces deleterious mistakes in the genome, and genetic innovation, which allows organisms to adapt. Evidence suggests that many bacteria balance this trade-off by modulating their mutation rates, but few mechanisms have been described for such modulation. Following experimental evolution and whole-genome resequencing of the marine bacterium Vibrio splendidus 12B01, we discovered one such mechanism, which allows this bacterium to switch to an elevated mutation rate. This switch is driven by the excision of a mobile element residing in mutS, which encodes a DNA mismatch repair protein. When integrated within the bacterial genome, the mobile element provides independent promoter and translation start sequences for mutS—different from the bacterium’s original mutS promoter region—which allow the bacterium to make a functional mutS gene product. Excision of this mobile element rejoins the mutS gene with host promoter and translation start sequences but leaves a 2-bp deletion in the mutS sequence, resulting in a frameshift and a hypermutator phenotype. We further identified hundreds of clinical and environmental bacteria across Betaproteobacteria and Gammaproteobacteria that possess putative mobile elements within the same amino acid motif in mutS. In a subset of these bacteria, we detected excision of the element but not a frameshift mutation; the mobile elements leave an intact mutS coding sequence after excision. Finally, our findings reveal a novel mechanism by which one bacterium alters its mutation rate and hint at a possible evolutionary role for mobile elements within mutS in other bacteria.
Characterization of the Aspergillus nidulans aspnd1 gene demonstrates that the ASPND1 antigen, which it encodes, and several Aspergillus fumigatus immunodominant antigens belong to the same family.

PubMed Central

Calera, J A; Ovejero, M C; López-Medrano, R; Segurado, M; Puente, P; Leal, F

1997-01-01

For the first time, an immunodominant Aspergillus nidulans antigen (ASPND1) consistently reactive with serum samples from aspergilloma patients has been purified and characterized, and its coding gene (aspnd1) has been cloned and sequenced. ASPND1 is a glycoprotein with four N-glycosidically-bound sugar chains (around 2.1 kDa each) which are not necessary for reactivity with immune human sera. The polypeptide part is synthesized as a 277-amino-acid precursor of 30.6 kDa that after cleavage of a putative signal peptide of 16 amino acids, affords a mature protein of 261 amino acids with a molecular mass of 29 kDa and a pI of 4.24 (as deduced from the sequence). The ASPND1 protein is 53.1% identical to the AspfII allergen from Aspergillus fumigatus and 48% identical to an unpublished Candida albicans antigen. All of the cysteine residues and most of the glycosylation sites are perfectly conserved in the three proteins, suggesting a similar but yet unknown function. Analysis of the primary structure of the ASPND1 coding gene (aspnd1) has allowed the establishment of a clear relationship between several previously reported A. fumigatus and A. nidulans immunodominant antigens. PMID:9119471
Structured oligonucleotides for target indexing to allow single-vessel PCR amplification and solid support microarray hybridization

PubMed Central

Girard, Laurie D.; Boissinot, Karel; Peytavi, Régis; Boissinot, Maurice; Bergeron, Michel G.

2014-01-01

The combination of molecular diagnostic technologies is increasingly used to overcome limitations on sensitivity, specificity or multiplexing capabilities, and provide efficient lab-on-chip devices. Two such techniques, PCR amplification and microarray hybridization, are used serially to take advantage of the high sensitivity and specificity of the former combined with high multiplexing capacities of the latter. These methods are usually performed in different buffers and reaction chambers. However, these elaborate methods have a high complexity cost related to reagent requirements, liquid storage and the number of reaction chambers to integrate into automated devices. Furthermore, microarray hybridizations have a sequence-dependent efficiency that is not always predictable. In this work, we have developed the concept of a structured oligonucleotide probe which is activated by cleavage from polymerase exonuclease activity. This technology is called SCISSOHR for Structured Cleavage Induced Single-Stranded Oligonucleotide Hybridization Reaction. The SCISSOHR probes enable indexing the target sequence to a tag sequence. The SCISSOHR technology also allows the combination of nucleic acid amplification and microarray hybridization in a single vessel in presence of the PCR buffer only. The SCISSOHR technology uses an amplification probe that is irreversibly modified in presence of the target, releasing a single-stranded DNA tag for microarray hybridization. Each tag is composed of a 3-nucleotide sequence-dependent segment and a unique “target sequence-independent” 14-nucleotide segment allowing for optimal hybridization with minimal cross-hybridization. We evaluated the performance of five (5) PCR buffers to support microarray hybridization, compared to a conventional hybridization buffer. Finally, as a proof of concept, we developed a multiplexed assay for the amplification, detection, and identification of three (3) DNA targets. This new technology will facilitate the design of lab-on-chip microfluidic devices, while also reducing consumable costs. In the long term, it will allow the cost-effective automation of highly multiplexed assays for detection and identification of genetic targets. PMID:25489607
A new cofactor in prokaryotic enzyme: Tryptophan tryptophylquinone as the redox prosthetic group in methylamine dehydrogenase

DOE Office of Scientific and Technical Information (OSTI.GOV)

McIntire, W.S.; Wemmer, D.E.; Chistoserdov, A.

Methylamine dehydrogenase (MADH), an $\alpha_2\beta_2$ enzyme from numerous methylotrophic soil bacteria, contains a novel quinonoid redox prosthetic group that is covalently bound to its small β subunit through two amino acyl residues. A comparison of the amino acid sequence deduced from the gene sequence of the small subunit for the enzyme from Methylobacterium extorquens AM1 with the published amino acid sequence obtained by the Edman degradation method allowed the identification of the amino acyl constituents of the cofactor as two tryptophyl residues. This information was crucial for interpreting $^{1}$H and $^{13}$C nuclear magnetic resonance, and mass spectral data collected for the semicarbazide- and carboxymethyl-derivatized bis(tripeptidyl)-cofactor of MADH from bacterium W3A1. The cofactor is composed of two cross-linked tryptophyl residues. Although there are many possible isomers, only one is consistent with all the data: The first tryptophyl residue in the peptide sequence exists as an indole-6,7-dione, and is attached at its 4 position to the 2 position of the second, otherwise unmodified, indole side group. Contrary to earlier reports, the cofactor of MADH is not 2,7,9-tricarboxypyrroloquinoline quinone (PQQ), a derivative thereof, or pro-PQQ. This appears to be the only example of two cross-linked, modified amino acyl residues having a functional role in the active site of an enzyme, in the absence of other cofactors or metal ions.

Optical resolution of phenylthiohydantoin-amino acids by capillary electrophoresis and identification of the phenylthiohydantoin-D-amino acid residue of [D-Ala2]-methionine enkephalin.

PubMed

Kurosu, Y; Murayama, K; Shindo, N; Shisa, Y; Ishioka, N

1996-11-01

This is an initial report to propose a protein sequence analysis system with DL differentiation using capillary electrophoresis (CE). This system consists of a protein sequencer and a CE system. After fractionation of phenyl-thiohydantoin (PTH)-amino acids using a protein sequencer, optical resolution for each PTH-amino acid is performed by CE using some chiral selectors such as digitonin, beta-escin and others. As a model peptide, [D-Ala2]-methionine enkephalin (L-Tyr-D-Ala-Gly-L-Phe-L-Met), was used and the sequence with DL differentiation was determined, with the exception of the fourth amino acid, L-Phe, using our proposed system.
Combining Rosetta with molecular dynamics (MD): A benchmark of the MD-based ensemble protein design.

PubMed

Ludwiczak, Jan; Jarmula, Adam; Dunin-Horkawicz, Stanislaw

2018-07-01

Computational protein design is a set of procedures for computing amino acid sequences that will fold into a specified structure. Rosetta Design, a commonly used software for protein design, allows for the effective identification of sequences compatible with a given backbone structure, while molecular dynamics (MD) simulations can thoroughly sample near-native conformations. We benchmarked a procedure in which Rosetta design is started on MD-derived structural ensembles and showed that such a combined approach generates 20-30% more diverse sequences than currently available methods with only a slight increase in computation time. Importantly, the increase in diversity is achieved without a loss in the quality of the designed sequences assessed by their resemblance to natural sequences. We demonstrate that the MD-based procedure is also applicable to de novo design tasks started from backbone structures without any sequence information. In addition, we implemented a protocol that can be used to assess the stability of designed models and to select the best candidates for experimental validation. In sum our results demonstrate that the MD ensemble-based flexible backbone design can be a viable method for protein design, especially for tasks that require a large pool of diverse sequences. Copyright © 2018 Elsevier Inc. All rights reserved.
VKCDB: voltage-gated K+ channel database updated and upgraded.

PubMed

Gallin, Warren J; Boutet, Patrick A

2011-01-01

The Voltage-gated K(+) Channel DataBase (VKCDB) (http://vkcdb.biology.ualberta.ca) makes a comprehensive set of sequence data readily available for phylogenetic and comparative analysis. The current update contains 2063 entries for full-length or nearly full-length unique channel sequences from Bacteria (477), Archaea (18) and Eukaryotes (1568), an increase from 346 solely eukaryotic entries in the original release. In addition to protein sequences for channels, corresponding nucleotide sequences of the open reading frames corresponding to the amino acid sequences are now available and can be extracted in parallel with sets of protein sequences. Channels are categorized into subfamilies by phylogenetic analysis and by using hidden Markov model analyses. Although the raw database contains a number of fragmentary, duplicated, obsolete and non-channel sequences that were collected in early steps of data collection, the web interface will only return entries that have been validated as likely K(+) channels. The retrieval function of the web interface allows retrieval of entries that contain a substantial fraction of the core structural elements of VKCs, fragmentary entries, or both. The full database can be downloaded as either a MySQL dump or as an XML dump from the web site. We have now implemented automated updates at quarterly intervals.
Mitochondrial genome of the moon jelly Aurelia aurita (Cnidaria, Scyphozoa): A linear DNA molecule encoding a putative DNA-dependent DNA polymerase.

PubMed

Shao, Zhiyong; Graf, Shannon; Chaga, Oleg Y; Lavrov, Dennis V

2006-10-15

The 16,937-nucleotide sequence of the linear mitochondrial DNA (mt-DNA) molecule of the moon jelly Aurelia aurita (Cnidaria, Scyphozoa) - the first mtDNA sequence from the class Scyphozoa and the first sequence of a linear mtDNA from Metazoa - has been determined. This sequence contains genes for 13 energy pathway proteins, small and large subunit rRNAs, and methionine and tryptophan tRNAs. In addition, two open reading frames of 324 and 969 base pairs in length have been found. The deduced amino-acid sequence of one of them, ORF969, displays extensive sequence similarity with the polymerase [but not the exonuclease] domain of family B DNA polymerases, and this ORF has been tentatively identified as dnab. This is the first report of dnab in animal mtDNA. The genes in A. aurita mtDNA are arranged in two clusters with opposite transcriptional polarities; transcription proceeding toward the ends of the molecule. The determined sequences at the ends of the molecule are nearly identical but inverted and lack any obvious potential secondary structures or telomere-like repeat elements. The acquisition of mitochondrial genomic data for the second class of Cnidaria allows us to reconstruct characteristic features of mitochondrial evolution in this animal phylum.
Biochemical characterization of a novel L-Asparaginase with low glutaminase activity from Rhizomucor miehei and its application in food safety and leukemia treatment.

PubMed

Huang, Linhua; Liu, Yu; Sun, Yan; Yan, Qiaojuan; Jiang, Zhengqiang

2014-03-01

A novel fungal gene encoding the Rhizomucor miehei l-asparaginase (RmAsnase) was cloned and expressed in Escherichia coli. Its deduced amino acid sequence shared only 57% identity with the amino acid sequences of other reported l-asparaginases. The purified l-asparaginase homodimer had a molecular mass of 133.7 kDa, a high specific activity of 1,985 U/mg, and very low glutaminase activity. RmAsnase was optimally active at pH 7.0 and 45°C and was stable at this temperature for 30 min. The final level of acrylamide in biscuits and bread was decreased by about 81.6% and 94.2%, respectively, upon treatment with 10 U RmAsnase per mg flour. Moreover, this l-asparaginase was found to potentiate a lectin's induction of leukemic K562 cell apoptosis, allowing lowering of the drug dosage and shortening of the incubation time. Overall, our findings suggest that RmAsnase possesses a remarkable potential for the food industry and in chemotherapeutics for leukemia.
Biochemical Characterization of a Novel l-Asparaginase with Low Glutaminase Activity from Rhizomucor miehei and Its Application in Food Safety and Leukemia Treatment

PubMed Central

Huang, Linhua; Liu, Yu; Sun, Yan

2014-01-01

A novel fungal gene encoding the Rhizomucor miehei l-asparaginase (RmAsnase) was cloned and expressed in Escherichia coli. Its deduced amino acid sequence shared only 57% identity with the amino acid sequences of other reported l-asparaginases. The purified l-asparaginase homodimer had a molecular mass of 133.7 kDa, a high specific activity of 1,985 U/mg, and very low glutaminase activity. RmAsnase was optimally active at pH 7.0 and 45°C and was stable at this temperature for 30 min. The final level of acrylamide in biscuits and bread was decreased by about 81.6% and 94.2%, respectively, upon treatment with 10 U RmAsnase per mg flour. Moreover, this l-asparaginase was found to potentiate a lectin's induction of leukemic K562 cell apoptosis, allowing lowering of the drug dosage and shortening of the incubation time. Overall, our findings suggest that RmAsnase possesses a remarkable potential for the food industry and in chemotherapeutics for leukemia. PMID:24362429
Production of volatile fatty acids from sewage organic matter by combined bioflocculation and alkaline fermentation.

PubMed

Khiewwijit, Rungnapha; Temmink, Hardy; Labanda, Alvaro; Rijnaarts, Huub; Keesman, Karel J

2015-12-01

This study explored the potential of volatile fatty acids (VFA) production from sewage by a combined high-loaded membrane bioreactor and sequencing batch fermenter. VFA production was optimized with respect to SRT and alkaline pH (pH 8-10). Application of pH shock to a value of 9 at the start of a sequencing batch cycle, followed by a pH uncontrolled phase for 7 days, gave the highest VFA yield of 440 mg VFA-COD/g VSS. This yield was much higher than at fermentation without pH control or at a constant pH between 8 and 10. The high yield in the pH 9 shocked system could be explained by (1) a reduction of methanogenic activity, or (2) a high degree of solids degradation or (3) an enhanced protein hydrolysis and fermentation. VFA production can be further optimized by fine-tuning pH level and longer operation, possibly allowing enrichment of alkalophilic and alkali-tolerant fermenting microorganisms. Copyright © 2015 Elsevier Ltd. All rights reserved.
Effects of side chains in helix nucleation differ from helix propagation

PubMed Central

Miller, Stephen E.; Watkins, Andrew M.; Kallenbach, Neville R.; Arora, Paramjit S.

2014-01-01

Helix–coil transition theory connects observable properties of the α-helix to an ensemble of microstates and provides a foundation for analyzing secondary structure formation in proteins. Classical models account for cooperative helix formation in terms of an energetically demanding nucleation event (described by the σ constant) followed by a more facile propagation reaction, with corresponding s constants that are sequence dependent. Extensive studies of folding and unfolding in model peptides have led to the determination of the propagation constants for amino acids. However, the role of individual side chains in helix nucleation has not been separately accessible, so the σ constant is treated as independent of sequence. We describe here a synthetic model that allows the assessment of the role of individual amino acids in helix nucleation. Studies with this model lead to the surprising conclusion that widely accepted scales of helical propensity are not predictive of helix nucleation. Residues known to be helix stabilizers or breakers in propagation have only a tenuous relationship to residues that favor or disfavor helix nucleation. PMID:24753597
High sensitive and direct fluorescence detection of single viral DNA sequences by integration of double strand probes onto microgels particles.

PubMed

Aliberti, A; Cusano, A M; Battista, E; Causa, F; Netti, P A

2016-02-21

A novel class of probes for fluorescence detection was developed and combined with microgel particles for highly sensitive fluorescence detection of nucleic acids. A double-strand probe with an optimized fluorophore-quencher pair was designed for the detection of different lengths of nucleic acids (39 nt and 100 nt). The probe proved efficient in target detection in different contexts and specific even in the presence of serum proteins. The conjugation of double-strand probes onto polymeric microgels allows sensitive detection of DNA sequences from HIV, HCV and SARS coronaviruses with LODs of 1.4 fM, 3.7 fM and 1.4 fM, respectively, and a dynamic range of $10^{-9}$ to $10^{-15}$ M. This combination enhances detection sensitivity by almost five orders of magnitude compared with the probe alone. The proposed platform, based on the integration of an innovative double-strand probe into microgel particles, represents an attractive alternative to conventional sensitive DNA detection technologies that rely on amplification methods.
Identification and Analysis of Novel Amino-Acid Sequence Repeats in Bacillus anthracis str. Ames Proteome Using Computational Tools

PubMed Central

Hemalatha, G. R.; Rao, D. Satyanarayana; Guruprasad, L.

2007-01-01

We have identified four repeats and ten domains that are novel in proteins encoded by the Bacillus anthracis str. Ames proteome using automated in silico methods. A “repeat” corresponds to a region comprising less than 55-amino-acid residues that occur more than once in the protein sequence and sometimes present in tandem. A “domain” corresponds to a conserved region with greater than 55-amino-acid residues and may be present as single or multiple copies in the protein sequence. These correspond to (1) 57-amino-acid-residue PxV domain, (2) 122-amino-acid-residue FxF domain, (3) 111-amino-acid-residue YEFF domain, (4) 109-amino-acid-residue IMxxH domain, (5) 103-amino-acid-residue VxxT domain, (6) 84-amino-acid-residue ExW domain, (7) 104-amino-acid-residue NTGFIG domain, (8) 36-amino-acid-residue NxGK repeat, (9) 95-amino-acid-residue VYV domain, (10) 75-amino-acid-residue KEWE domain, (11) 59-amino-acid-residue AFL domain, (12) 53-amino-acid-residue RIDVK repeat, (13) (a) 41-amino-acid-residue AGQF repeat and (b) 42-amino-acid-residue GSAL repeat. A repeat or domain type is characterized by specific conserved sequence motifs. We discuss the presence of these repeats and domains in proteins from other genomes and their probable secondary structure. PMID:17538688
DNA detection using water-soluble conjugated polymers and peptide nucleic acid probes

PubMed Central

Gaylord, Brent S.; Heeger, Alan J.; Bazan, Guillermo C.

2002-01-01

The light-harvesting properties of cationic conjugated polymers are used to sensitize the emission of a dye on a specific peptide nucleic acid (PNA) sequence for the purpose of homogeneous, “real-time” DNA detection. Signal transduction is controlled by hybridization of the neutral PNA probe and the negative DNA target. Electrostatic interactions bring the hybrid complex and cationic polymer within distances required for Förster energy transfer. Conjugated polymer excitation provides fluorescein emission >25 times higher than that obtained by exciting the dye, allowing detection of target DNA at concentrations of 10 pM with a standard fluorometer. A simple and highly sensitive assay with optical amplification that uses the improved hybridization behavior of PNA/DNA complexes is thus demonstrated. PMID:12167673
Selection of Streptococcus lactis Mutants Defective in Malolactic Fermentation

PubMed Central

Renault, Pierre P.; Heslot, Henri

1987-01-01

An enrichment medium and a new sensitive medium were developed to detect malolactic variants in different strains of lactic bacteria. Factors such as the concentration of glucose and l-malate, pH level, and the type of indicator dye used are discussed with regard to the kinetics of malic acid conversion to lactic acid. Use of these media allowed a rapid and easier screening of mutagenized streptococcal cells unable to ferment l-malate. A collection of malolactic-negative mutants of Streptococcus lactis induced by UV, nitrosoguanidine, or transposonal mutagenesis were characterized. The results showed that several mutants were apparently defective in the structural gene of malolactic enzyme, whereas others contained mutations which may either inactivate a putative permease or affect a regulatory sequence. PMID:16347282
The 4-pyridylmethyl ester as a protecting group for glutamic and aspartic acids: 'flipping' peptide charge states for characterization by positive ion mode ESI-MS.

PubMed

Garapati, Sriramya; Burns, Colin S

2014-03-01

Use of the 4-pyridylmethyl ester group for side-chain protection of glutamic acid residues in solid-phase peptide synthesis enables switching of the charge state of a peptide from negative to positive, thus making detection by positive ion mode ESI-MS possible. The pyridylmethyl ester moiety is readily removed from peptides in high yield by hydrogenation. Combining the 4-pyridylmethyl ester protecting group with benzyl ester protection reduces the number of the former needed to produce a net positive charge and allows for purification by RP HPLC. This protecting group is useful in the synthesis of highly acidic peptide sequences, which are often beset by problems with purification by standard RP HPLC and characterization by ESI-MS. Copyright © 2014 European Peptide Society and John Wiley & Sons, Ltd.
Genetic variants of human serum cholinesterase influence metabolism of the muscle relaxant succinylcholine.

PubMed

Lockridge, O

1990-01-01

People with genetic variants of cholinesterase respond abnormally to succinylcholine, experiencing substantial prolongation of muscle paralysis with apnea rather than the usual 2-6 min. The structure of usual cholinesterase has been determined including the complete amino acid and nucleotide sequence. This has allowed identification of altered amino acids and nucleotides. The variant most frequently found in patients who respond abnormally to succinylcholine is atypical cholinesterase, which occurs in homozygous form in 1 out of 3500 Caucasians. Atypical cholinesterase has a single substitution at nucleotide 209 which changes aspartic acid 70 to glycine. This suggests that Asp 70 is part of the anionic site, and that the absence of this negatively charged amino acid explains the reduced affinity of atypical cholinesterase for positively charged substrates and inhibitors. The clinical consequence of reduced affinity for succinylcholine is that none of the succinylcholine is hydrolyzed in blood and a large overdose reaches the nerve-muscle junction where it causes prolonged muscle paralysis. Silent cholinesterase has a frame shift mutation at glycine 117 which prematurely terminates protein synthesis and yields no active enzyme. The K variant, named in honor of W. Kalow, has threonine in place of alanine 539. The K variant is associated with 33% lower activity. All variants arise from a single locus as there is only one gene for human cholinesterase (EC 3.1.1.8). Comparison of amino acid sequences of esterases and proteases shows that cholinesterase belongs to a new family of serine esterases which is different from the serine proteases.
Chemical Cleavage of an Asp-Cys Sequence Allows Efficient Production of Recombinant Peptides with an N-Terminal Cysteine Residue.

PubMed

Pane, Katia; Verrillo, Mariavittoria; Avitabile, Angela; Pizzo, Elio; Varcamonti, Mario; Zanfardino, Anna; Di Maro, Antimo; Rega, Camilla; Amoresano, Angela; Izzo, Viviana; Di Donato, Alberto; Cafaro, Valeria; Notomista, Eugenio

2018-04-18

Peptides with an N-terminal cysteine residue allow site-specific modification of proteins and peptides and chemical synthesis of proteins. They have been widely used to develop new strategies for imaging, drug discovery, diagnostics, and chip technologies. Here we present a method to produce recombinant peptides with an N-terminal cysteine residue as a convenient alternative to chemical synthesis. The method is based on the release of the desired peptide from a recombinant fusion protein by mild acid hydrolysis of an Asp-Cys sequence. To test the general validity of the method we prepared four fusion proteins bearing three different peptides (20-37 amino acid long) at the C-terminus of a ketosteroid isomerase-derived and two Onconase-derived carriers for the production of toxic peptides in E. coli. The chosen peptides were (C)GKY20, an antimicrobial peptide from the C-terminus of human thrombin, (C)ApoB L , an antimicrobial peptide from an inner region of human Apolipoprotein B, and (C)p53pAnt, an anticancer peptide containing the C-terminal region of the p53 protein fused to the cell penetrating peptide Penetratin. Cleavage efficiency of Asp-Cys bonds in the four fusion proteins was studied as a function of pH, temperature, and incubation time. In spite of the differences in the amino acid sequence (GTGDCGKY, GTGDCHVA, GSGTDCGSR, SQGSDCGSR) we obtained for all the proteins a cleavage efficiency of about 70-80% after 24 h incubation at 60 °C and pH 2. All the peptides were produced with very good yield (5-16 mg/L of LB cultures), high purity (>96%), and the expected content of free thiol groups (1 mol per mole of peptide). 
Furthermore, (C)GKY20 was modified with PyMPO-maleimide, a commercially available fluorophore bearing a thiol reactive group, and with 6-hydroxy-2-cyanobenzothiazole, a reagent specific for N-terminal cysteines, with yields of 100% thus demonstrating that our method is very well suited for the production of fully reactive peptides with an N-terminal cysteine residue.
HPLC-ESI-MS/MS analysis of hemoglobin peptides in tryptic digests of dried-blood spot extracts detects HbS, HbC, HbD, HbE, HbO-Arab, and HbG-Philadelphia mutations.

PubMed

Haynes, Christopher A; Guerra, Stephanie L; Fontana, Jessalyn C; DeJesús, Víctor R

2013-09-23

Hemoglobinopathies are mutations resulting in abnormal globin chain structure; some have clinically significant outcomes such as anemia or reduced lifespan. Five β-globin mutations are (c.20A>T, p.E6V), (c.19G>A, p.E6K), (c.79G>A, p.E26K), (c.364G>C, p.E121Q), and (c.364G>A, p.E121K), resulting in HbS (sickle-cell hemoglobin), HbC, HbE, HbD-Los Angeles, and HbO-Arab, respectively. One α-globin mutation is (c.[207C>G or 207C>A], p.N68K), resulting in HbG-Philadelphia. HPLC-ESI-MS/MS analysis of dried-blood spot (DBS) punches from newborns extracted with a trypsin-containing solution provides greater than 90% coverage of α-, β-, and γ-globin amino acid sequences. Because the (c.20A>T, p.E6V), (c.19G>A, p.E6K), (c.79G>A, p.E26K), (c.364G>C, p.E121Q), (c.364G>A, p.E121K), and (c.[207C>G or 207C>A], p.N68K) mutations generate globin peptides with novel amino acid sequences, detecting one of these peptides in DBS extracts is indicative of the presence of a hemoglobinopathy in the newborn. The method described here can distinguish normal β-globin peptides from the mutant HbS, HbC, HbE, HbD-Los Angeles and HbO-Arab peptides, as well as normal α-globin peptide from the mutant HbG-Philadelphia peptide, allowing the identification of unaffected heterozygotes such as HbAS, and of compound heterozygotes such as HbASG-Philadelphia. This HPLC-ESI-MS/MS analytical approach provides information that is not available from traditional hemoglobin analyses such as isoelectric focusing and HPLC-UV. It is also capable of determining the amino acid sequence of hemoglobin peptides, potentially allowing the detection of numerous hemoglobinopathies resulting from point mutations. Published by Elsevier B.V.
AT_CHLORO, a comprehensive chloroplast proteome database with subplastidial localization and curated information on envelope proteins.

PubMed

Ferro, Myriam; Brugière, Sabine; Salvi, Daniel; Seigneurin-Berny, Daphné; Court, Magali; Moyet, Lucas; Ramus, Claire; Miras, Stéphane; Mellal, Mourad; Le Gall, Sophie; Kieffer-Jaquinod, Sylvie; Bruley, Christophe; Garin, Jérôme; Joyard, Jacques; Masselon, Christophe; Rolland, Norbert

2010-06-01

Recent advances in the proteomics field have allowed a series of high throughput experiments to be conducted on chloroplast samples, and the data are available in several public databases. However, the accurate localization of many chloroplast proteins often remains hypothetical. This is especially true for envelope proteins. We went a step further into the knowledge of the chloroplast proteome by focusing, in the same set of experiments, on the localization of proteins in the stroma, the thylakoids, and envelope membranes. LC-MS/MS-based analyses first allowed building the AT_CHLORO database (http://www.grenoble.prabi.fr/protehome/grenoble-plant-proteomics/), a comprehensive repertoire of the 1323 proteins, identified by 10,654 unique peptide sequences, present in highly purified chloroplasts and their subfractions prepared from Arabidopsis thaliana leaves. This database also provides extensive proteomics information (peptide sequences and molecular weight, chromatographic retention times, MS/MS spectra, and spectral count) for a unique chloroplast protein accurate mass and time tag database gathering identified peptides with their respective and precise analytical coordinates, molecular weight, and retention time. We assessed the partitioning of each protein in the three chloroplast compartments by using a semiquantitative proteomics approach (spectral count). These data together with an in-depth investigation of the literature were compiled to provide accurate subplastidial localization of previously known and newly identified proteins. A unique knowledge base containing extensive information on the proteins identified in envelope fractions was thus obtained, allowing new insights into this membrane system to be revealed. Altogether, the data we obtained provide unexpected information about plastidial or subplastidial localization of some proteins that were not suspected to be associated to this membrane system. 
The spectral counting-based strategy was further validated as the compartmentation of well known pathways (for instance, photosynthesis and amino acid, fatty acid, or glycerolipid biosynthesis) within chloroplasts could be dissected. It also allowed revisiting the compartmentation of the chloroplast metabolism and functions.
Application of 2D graphic representation of protein sequence based on Huffman tree method.

PubMed

Qi, Zhao-Hui; Feng, Jun; Qi, Xiao-Qin; Li, Ling

2012-05-01

Based on the Huffman tree method, we propose a new 2D graphic representation of protein sequences. This representation completely avoids loss of information in the transfer of data from a protein sequence to its graphic representation. The method consists of two parts. The first derives 0-1 codes for the 20 amino acids from a Huffman tree built on amino acid frequency, where the frequency of an amino acid is defined as the number of times it occurs in the analyzed protein sequences. The second builds the 2D graphic representation of a protein sequence from these 0-1 codes. Applications of the method to ten ND5 genes and seven Escherichia coli strains are then presented in detail. The results show that the proposed model may provide new insights into the evolution patterns determined from protein sequences and complete genomes. Copyright © 2012 Elsevier Ltd. All rights reserved.
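The first part of the method, assigning 0-1 codes to amino acids from their observed frequencies, can be sketched as follows. This is a minimal illustration of standard Huffman coding and not the authors' implementation; the function names (huffman_codes, encode, decode) and the tie-breaking rule are assumptions, and the mapping from the bit string to a 2D curve is not described in the abstract, so it is not attempted here. Because Huffman codes are prefix-free, the bit string decodes back to the original sequence, which is consistent with the claim that no information is lost.

```python
import heapq
from collections import Counter
from itertools import count


def huffman_codes(sequences):
    """Build 0-1 codes for residues from their counts in `sequences`."""
    freq = Counter("".join(sequences))
    tiebreak = count()  # unique counter so dicts are never compared
    heap = [(n, next(tiebreak), {aa: ""}) for aa, n in freq.items()]
    heapq.heapify(heap)
    if len(heap) == 1:  # degenerate case: a single residue type
        return {aa: "0" for aa in heap[0][2]}
    while len(heap) > 1:
        n0, _, c0 = heapq.heappop(heap)  # two least frequent subtrees
        n1, _, c1 = heapq.heappop(heap)
        merged = {aa: "0" + code for aa, code in c0.items()}
        merged.update({aa: "1" + code for aa, code in c1.items()})
        heapq.heappush(heap, (n0 + n1, next(tiebreak), merged))
    return heap[0][2]


def encode(seq, codes):
    """Concatenate the code of every residue in `seq`."""
    return "".join(codes[aa] for aa in seq)


def decode(bits, codes):
    """Invert `encode`; works because the codes are prefix-free."""
    inverse = {code: aa for aa, code in codes.items()}
    out, buf = [], ""
    for b in bits:
        buf += b
        if buf in inverse:
            out.append(inverse[buf])
            buf = ""
    return "".join(out)
```

Frequent residues receive codes no longer than those of rare residues, so the encoded string is short on average while remaining exactly invertible.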
Opsin cDNA sequences of a UV and green rhodopsin of the satyrine butterfly Bicyclus anynana.

PubMed

Vanhoutte, K J A; Eggen, B J L; Janssen, J J M; Stavenga, D G

2002-11-01

The cDNAs of an ultraviolet (UV) and long-wavelength (LW) (green) absorbing rhodopsin of the bush brown Bicyclus anynana were partially identified. The UV sequence, encoding 377 amino acids, is 76-79% identical to the UV sequences of the papilionids Papilio glaucus and Papilio xuthus and the moth Manduca sexta. A dendrogram derived from aligning the amino acid sequences reveals an equidistant position of Bicyclus between Papilio and Manduca. The sequence of the green opsin cDNA fragment, which encodes 242 amino acids, represents six of the seven transmembrane regions. At the amino acid level, this fragment is more than 80% identical to the corresponding LW opsin sequences of Dryas, Heliconius, Papilio (rhodopsin 2) and Manduca. Whereas three LW absorbing rhodopsins were identified in the papilionid butterflies, only one green opsin was found in B. anynana.
Complete amino acid sequence of ananain and a comparison with stem bromelain and other plant cysteine proteases.

PubMed Central

Lee, K L; Albee, K L; Bernasconi, R J; Edmunds, T

1997-01-01

The amino acid sequences of ananain (EC 3.4.22.31) and stem bromelain (EC 3.4.22.32), two cysteine proteases from pineapple stem, are similar, yet ananain and stem bromelain possess distinct specificities towards synthetic peptide substrates and different reactivities towards the cysteine protease inhibitors E-64 and chicken egg white cystatin. We present here the complete amino acid sequence of ananain and compare it with the reported sequences of pineapple stem bromelain, papain and chymopapain from papaya and actinidin from kiwifruit. Ananain is comprised of 216 residues with a theoretical mass of 23464 Da. This primary structure includes a sequence insert between residues 170 and 174 not present in stem bromelain or papain and a hydrophobic series of amino acids adjacent to His-157. It is possible that these sequence differences contribute to the different substrate and inhibitor specificities exhibited by ananain and stem bromelain. PMID:9355753

PIPI: PTM-Invariant Peptide Identification Using Coding Method.

PubMed

Yu, Fengchao; Li, Ning; Yu, Weichuan

2016-12-02

In computational proteomics, the identification of peptides with an unlimited number of post-translational modification (PTM) types is a challenging task. The computational cost associated with database search increases exponentially with respect to the number of modified amino acids and linearly with respect to the number of potential PTM types at each amino acid. The problem becomes intractable very quickly if we want to enumerate all possible PTM patterns. To address this issue, one group of methods named restricted tools (including Mascot, Comet, and MS-GF+) only allow a small number of PTM types in database search process. Alternatively, the other group of methods named unrestricted tools (including MS-Alignment, ProteinProspector, and MODa) avoids enumerating PTM patterns with an alignment-based approach to localizing and characterizing modified amino acids. However, because of the large search space and PTM localization issue, the sensitivity of these unrestricted tools is low. This paper proposes a novel method named PIPI to achieve PTM-invariant peptide identification. PIPI belongs to the category of unrestricted tools. It first codes peptide sequences into Boolean vectors and codes experimental spectra into real-valued vectors. For each coded spectrum, it then searches the coded sequence database to find the top scored peptide sequences as candidates. After that, PIPI uses dynamic programming to localize and characterize modified amino acids in each candidate. We used simulation experiments and real data experiments to evaluate the performance in comparison with restricted tools (i.e., Mascot, Comet, and MS-GF+) and unrestricted tools (i.e., Mascot with error tolerant search, MS-Alignment, ProteinProspector, and MODa). Comparison with restricted tools shows that PIPI has a close sensitivity and running speed. Comparison with unrestricted tools shows that PIPI has the highest sensitivity except for Mascot with error tolerant search and ProteinProspector. 
These two tools simplify the task by only considering up to one modified amino acid in each peptide, which results in a higher sensitivity but has difficulty in dealing with multiple modified amino acids. The simulation experiments also show that PIPI has the lowest false discovery proportion, the highest PTM characterization accuracy, and the shortest running time among the unrestricted tools.
WebLogo

DOE Office of Scientific and Technical Information (OSTI.GOV)

Crooks, Gavin E.

WebLogo is a web-based application designed to make the generation of sequence logos as easy and painless as possible. Sequence logos are a graphical representation of an amino acid or nucleic acid multiple sequence alignment developed by Tom Schneider and Mike Stephens. Each logo consists of stacks of symbols, one stack for each position in the sequence. The overall height of the stack indicates the sequence conservation at that position, while the height of symbols within the stack indicates the relative frequency of each amino or nucleic acid at that position. In general, a sequence logo provides a richer and more precise description of, for example, a binding site, than would a consensus sequence.
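The relationship between stack height and symbol height can be illustrated with a short sketch. It assumes the commonly used information-content formulation (maximum entropy of the alphabet minus the observed entropy of the column, in bits) and omits the small-sample correction and any gap handling; it is not WebLogo's own code, and the function names are hypothetical.

```python
import math
from collections import Counter


def column_information(column, alphabet_size):
    """Conservation of one alignment column in bits: log2(size) - entropy."""
    counts = Counter(column)
    total = sum(counts.values())
    entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
    return math.log2(alphabet_size) - entropy


def logo_heights(alignment, alphabet_size=4):
    """Per-position dict mapping each symbol to its height in the stack.

    `alignment` is a list of equal-length strings; use alphabet_size=20
    for proteins. Symbol height = relative frequency * stack height.
    """
    stacks = []
    for column in zip(*alignment):
        info = column_information(column, alphabet_size)
        total = len(column)
        stacks.append({s: (n / total) * info for s, n in Counter(column).items()})
    return stacks
```

For the toy DNA alignment `["ACGT", "ACGA", "ACCA", "ATCA"]`, the fully conserved first column gets a stack of 2 bits, while the third column (two G, two C) gets a 1-bit stack split evenly between the two symbols.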
Contribution of Tryptophan Residues to the Combining Site of a Monoclonal Anti Dinitrophenyl Spin-Label Antibody

DTIC Science & Technology

1987-01-01

identified in the difference spectra, implying that there are five to seven tryptophans within 17 Å of the spin-label hapten. Amino acid sequences...of the heavy and light chains were obtained by a combination of amino acid and DNA sequencing. A molecular model was constructed from the sequence...Clore & Gronenborn, 1982, 1983). This technique should also identify...acids yields detailed information about the amino acid composition of the combining
Nucleotide sequence of the phosphoglycerate kinase gene from the extreme thermophile Thermus thermophilus. Comparison of the deduced amino acid sequence with that of the mesophilic yeast phosphoglycerate kinase.

PubMed Central

Bowen, D; Littlechild, J A; Fothergill, J E; Watson, H C; Hall, L

1988-01-01

Using oligonucleotide probes derived from amino acid sequencing information, the structural gene for phosphoglycerate kinase from the extreme thermophile, Thermus thermophilus, was cloned in Escherichia coli and its complete nucleotide sequence determined. The gene consists of an open reading frame corresponding to a protein of 390 amino acid residues (calculated Mr 41,791) with an extreme bias for G or C (93.1%) in the codon third base position. Comparison of the deduced amino acid sequence with that of the corresponding mesophilic yeast enzyme indicated a number of significant differences. These are discussed in terms of the unusual codon bias and their possible role in enhanced protein thermal stability. PMID:3052437
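The third-position G or C bias quoted above is a simple statistic to compute from a coding sequence. The sketch below is illustrative only: the function name is hypothetical, the input is assumed to be an in-frame coding sequence starting at the first base of the first codon, and the abstract does not say whether the stop codon was included in the reported 93.1%.

```python
def gc3_fraction(cds):
    """Fraction of codons whose third base is G or C.

    Assumes `cds` is in frame; a trailing partial codon is ignored.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    third_bases = cds[2:usable:3]  # bases at positions 3, 6, 9, ...
    if not third_bases:
        raise ValueError("sequence is shorter than one codon")
    return sum(base in "GC" for base in third_bases) / len(third_bases)
```

For example, `gc3_fraction("ATGGCCAAGCTT")` examines the codons ATG, GCC, AAG and CTT and returns 0.75, since three of the four third bases are G or C.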
Sequence of a cDNA encoding pancreatic preprosomatostatin-22.

PubMed Central

Magazin, M; Minth, C D; Funckes, C L; Deschenes, R; Tavianini, M A; Dixon, J E

1982-01-01

We report the nucleotide sequence of a precursor to somatostatin that upon proteolytic processing may give rise to a hormone of 22 amino acids. The nucleotide sequence of a cDNA from the channel catfish (Ictalurus punctatus) encodes a precursor to somatostatin that is 105 amino acids (Mr, 11,500). The cDNA coding for somatostatin-22 consists of 36 nucleotides in the 5' untranslated region, 315 nucleotides that code for the precursor to somatostatin-22, 269 nucleotides at the 3' untranslated region, and a variable length of poly(A). The putative preprohormone contains a sequence of hydrophobic amino acids at the amino terminus that has the properties of a "signal" peptide. A connecting sequence of approximately 57 amino acids is followed by a single Arg-Arg sequence, which immediately precedes the hormone. Somatostatin-22 is homologous to somatostatin-14 in 7 of the 14 amino acids, including the Phe-Trp-Lys sequence. Hybridization selection of mRNA, followed by its translation in a wheat germ cell-free system, resulted in the synthesis of a single polypeptide having a molecular weight of approximately 10,000 as estimated on Na-DodSO4/polyacrylamide gels. PMID:6127673
Phylogenetic Relationship of Necoclí Virus to Other South American Hantaviruses (Bunyaviridae: Hantavirus).

PubMed

Montoya-Ruiz, Carolina; Cajimat, Maria N B; Milazzo, Mary Louise; Diaz, Francisco J; Rodas, Juan David; Valbuena, Gustavo; Fulhorst, Charles F

2015-07-01

The results of a previous study suggested that Cherrie's cane rat (Zygodontomys cherriei) is the principal host of Necoclí virus (family Bunyaviridae, genus Hantavirus) in Colombia. Bayesian analyses of complete nucleocapsid protein gene sequences and complete glycoprotein precursor gene sequences in this study confirmed that Necoclí virus is phylogenetically closely related to Maporal virus, which is principally associated with the delicate pygmy rice rat (Oligoryzomys delicatus) in western Venezuela. In pairwise comparisons, nonidentities between the complete amino acid sequence of the nucleocapsid protein of Necoclí virus and the complete amino acid sequences of the nucleocapsid proteins of other hantaviruses were ≥8.7%. Likewise, nonidentities between the complete amino acid sequence of the glycoprotein precursor of Necoclí virus and the complete amino acid sequences of the glycoprotein precursors of other hantaviruses were ≥11.7%. Collectively, the unique association of Necoclí virus with Z. cherriei in Colombia, results of the Bayesian analyses of complete nucleocapsid protein gene sequences and complete glycoprotein precursor gene sequences, and results of the pairwise comparisons of amino acid sequences strongly support the notion that Necoclí virus represents a novel species in the genus Hantavirus. Further work is needed to determine whether Calabazo virus (a hantavirus associated with Z. brevicauda cherriei in Panama) and Necoclí virus are conspecific.
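A pairwise nonidentity value of the kind reported above can be computed from two aligned sequences as sketched below. The abstract does not state how alignment gaps were treated, so the convention here (columns gapped in both sequences are skipped, a gap opposite a residue counts as a difference) is an assumption, as is the function name.

```python
def percent_nonidentity(seq_a, seq_b, gap="-"):
    """Percentage of compared alignment columns at which two sequences differ.

    Both inputs must come from the same alignment and have equal length.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    mismatches = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap and b == gap:
            continue  # column carries no information for this pair
        compared += 1
        if a != b:
            mismatches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * mismatches / compared
```

For the toy pair `"MKTAYIAK"` and `"MKSAYLAK"`, two of eight columns differ, giving 25.0%.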
Identification and characterization of Theileria ovis surface protein (ToSp) resembled TaSp in Theileria annulata.

PubMed

Shayan, P; Jafari, S; Fattahi, R; Ebrahimzade, E; Amininia, N; Changizi, E

2016-05-01

Ovine theileriosis is an important hemoprotozoal disease of sheep and goats in tropical and subtropical regions which causes high economic losses in the livestock industry. Theileria annulata surface protein (TaSp) was used previously as a tool for serological analysis in livestock. Since the amino acid sequence of TaSp is, at least in part, very conserved in T. annulata, Theileria lestoquardi and Theileria china I and II, it is very important to determine the amino acid sequence of this protein in Theileria ovis as well, to avoid false interpretation of serological data based on this protein in small animals. In the present study, the nucleotide sequence and amino acid sequence of T. ovis surface protein (ToSp) were determined. The comparison of the nucleotide sequence of ToSp showed 96, 96, 99, and 86 % homology to the corresponding nucleotide sequences of the TaSp genes of T. annulata, T. china I, T. china II and T. lestoquardi, previously registered in GenBank under accession nos. AJ316260.1, AY274329.1, DQ120058.1, and EF092924.1, respectively. The amino acid sequence analysis showed 95, 81, 98 and 70 % homology to the corresponding amino acid sequences of T. annulata, T. china I, T. china II and T. lestoquardi, registered in GenBank under accession nos. CAC87478.1, AAP36993.1, AAZ30365.1 and AAP36999.11, respectively. Interestingly, in contrast to the C terminus, a significant difference in amino acid sequence in the N terminus of the ToSp protein could be determined compared to the other known corresponding TaSp sequences, which makes this region attractive for designing a suitable tool for serological diagnosis.
Brain cDNA clone for human cholinesterase

DOE Office of Scientific and Technical Information (OSTI.GOV)

McTiernan, C.; Adkins, S.; Chatonnet, A.

1987-10-01

A cDNA library from human basal ganglia was screened with oligonucleotide probes corresponding to portions of the amino acid sequence of human serum cholinesterase. Five overlapping clones, representing 2.4 kilobases, were isolated. The sequenced cDNA contained 207 base pairs of coding sequence 5' to the amino terminus of the mature protein in which there were four ATG translation start sites in the same reading frame as the protein. Only the ATG coding for Met-(-28) lay within a favorable consensus sequence for functional initiators. There were 1722 base pairs of coding sequence corresponding to the protein found circulating in human serum. The amino acid sequence deduced from the cDNA exactly matched the 574 amino acid sequence of human serum cholinesterase, as previously determined by Edman degradation. Therefore, our clones represented cholinesterase rather than acetylcholinesterase. It was concluded that the amino acid sequences of cholinesterase from two different tissues, human brain and human serum, were identical. Hybridization of genomic DNA blots suggested that a single gene, or very few genes coded for cholinesterase.
Trh (tdh-/trh+) gene analysis of clinical, environmental and food isolates of Vibrio parahaemolyticus as a tool for investigating pathogenicity.

PubMed

Leoni, Francesca; Talevi, Giulia; Masini, Laura; Ottaviani, Donatella; Rocchegiani, Elena

2016-05-16

Sequencing analysis of the trh gene encoding the TDH-related haemolysin of tdh-/trh+ Vibrio parahaemolyticus isolated in Italy between 2002 and 2011 from clinical, environmental, and food samples revealed the presence of the trh2 variant in all isolates. The trh2 of the clinical isolate was 100% identical to other clinical tdh-/trh2 V. parahaemolyticus from Europe. Nucleotide and amino acid differences in the trh2 sequences of clinical isolates from Italy and other countries allowed a differentiation of the clinical strains from the majority of environmental or food strains isolated in Italy. Aspartic acid and isoleucine at positions 113 and 115, encoded by nucleotide triplets GAT and ATT at positions 337-339 and 343-345 of the complete trh gene sequence, were present in clinical strains from Europe (Italy, Norway and Germany), Asia and the United States. Only 35.5% of the tdh-/trh2 V. parahaemolyticus of environmental or food origin from Italy shared the same triplets/amino acids detected in clinical isolates, while 64.5% of isolates from the marine environment were different from those of clinical origin, demonstrating that differences occur amongst the trh2 sequences of strains from the environment and that these polymorphisms may differentiate potentially pathogenic from less or non-pathogenic cultures found in the environment and seafood. In addition, the distribution of T3SS2 genes was investigated in this group of tdh-/trh+ V. parahaemolyticus from different sources and in three clinical tdh+/trh- V. parahaemolyticus isolates. All tdh-/trh+ V. parahaemolyticus of environmental or food source, independent of year of isolation or geographical origin, amplified all the screened T3SS2β genes and tested negative in PCR assays for all five T3SS2α genes, as did the tdh-/trh+ clinical V. parahaemolyticus isolate. The vopC genes, encoding one of the effector proteins of T3SS2, were partially sequenced and compared to clinical tdh-/trh+ and tdh+/trh+ V. parahaemolyticus isolates from other countries. Analysis of T3SS2β vopC sequences revealed variation in tdh-/trh2 isolates from Italy, which were separated from a group of vopC sequences derived from trh2 V. parahaemolyticus from the USA. Copyright © 2016 Elsevier B.V. All rights reserved.
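The correspondence between amino acid positions 113 and 115 and nucleotide positions 337-339 and 343-345 follows from simple codon arithmetic, sketched below. The sketch assumes 1-based numbering that starts at the first base of the start codon; the function name and the two-entry codon table are illustrative only and cover just the triplets named in the abstract.

```python
def codon_nucleotide_range(aa_position):
    """Return the 1-based (start, end) nucleotide positions of a codon.

    Assumes numbering begins at the first base of the coding sequence.
    """
    if aa_position < 1:
        raise ValueError("amino acid positions are 1-based")
    end = 3 * aa_position
    return end - 2, end


# Only the two triplets mentioned in the abstract (standard genetic code).
CODON_TO_RESIDUE = {"GAT": "Asp", "ATT": "Ile"}

for position, codon in [(113, "GAT"), (115, "ATT")]:
    start, end = codon_nucleotide_range(position)
    print(f"{CODON_TO_RESIDUE[codon]}{position}: {codon} at nucleotides {start}-{end}")
```

Running the loop prints `Asp113: GAT at nucleotides 337-339` and `Ile115: ATT at nucleotides 343-345`, matching the positions given in the abstract.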
Cloning and expression of cDNA coding for bouganin.

PubMed

den Hartog, Marcel T; Lubelli, Chiara; Boon, Louis; Heerkens, Sijmie; Ortiz Buijsse, Antonio P; de Boer, Mark; Stirpe, Fiorenzo

2002-03-01

Bouganin is a ribosome-inactivating protein that was recently isolated from Bougainvillea spectabilis Willd. In this work, the cloning and expression of the cDNA encoding bouganin is described. From the cDNA, the amino-acid sequence was deduced, which correlated with the primary sequence data obtained by amino-acid sequencing of the native protein. Bouganin is synthesized as a pro-peptide consisting of 305 amino acids, the first 26 of which act as a leader signal while the 29 C-terminal amino acids are cleaved during processing of the molecule. The mature protein consists of 250 amino acids. Using the cDNA sequence encoding the mature protein of 250 amino acids, a recombinant protein was expressed, purified and characterized. The recombinant molecule had activity in a cell-free protein synthesis assay and toxicity on living cells comparable to those of the isolated native bouganin.
Method for altering antibody light chain interactions

DOEpatents

Stevens, Fred J.; Stevens, Priscilla Wilkins; Raffen, Rosemarie; Schiffer, Marianne

2002-01-01

A method for recombinant antibody subunit dimerization including modifying at least one codon of a nucleic acid sequence to replace an amino acid occurring naturally in the antibody with a charged amino acid at a position in the interface segment of the light polypeptide variable region, the charged amino acid having a first polarity; and modifying at least one codon of the nucleic acid sequence to replace an amino acid occurring naturally in the antibody with a charged amino acid at a position in an interface segment of the heavy polypeptide variable region corresponding to a position in the light polypeptide variable region, the charged amino acid having a second polarity opposite the first polarity. Nucleic acid sequences which code for novel light chain proteins, the latter of which are used in conjunction with the inventive method, are also provided.
Large Scale Analyses and Visualization of Adaptive Amino Acid Changes Projects.

PubMed

Vázquez, Noé; Vieira, Cristina P; Amorim, Bárbara S R; Torres, André; López-Fernández, Hugo; Fdez-Riverola, Florentino; Sousa, José L R; Reboiro-Jato, Miguel; Vieira, Jorge

2018-03-01

When changes at few amino acid sites are the target of selection, adaptive amino acid changes in protein sequences can be identified using maximum-likelihood methods based on models of codon substitution (such as codeml). Although such methods have been employed numerous times using a variety of different organisms, the time needed to collect the data and prepare the input files means that tens or hundreds of coding regions are usually analyzed. Nevertheless, the recent availability of flexible and easy to use computer applications that collect relevant data (such as BDBM) and infer positively selected amino acid sites (such as ADOPS), means that the entire process is easier and quicker than before. However, the lack of a batch option in ADOPS, here reported, still precludes the analysis of hundreds or thousands of sequence files. Given the interest and possibility of running such large-scale projects, we have also developed a database where ADOPS projects can be stored. Therefore, this study also presents the B+ database, which is both a data repository and a convenient interface that looks at the information contained in ADOPS projects without the need to download and unzip the corresponding ADOPS project file. The ADOPS projects available at B+ can also be downloaded, unzipped, and opened using the ADOPS graphical interface. The availability of such a database ensures results repeatability, promotes data reuse with significant savings on the time needed for preparing datasets, and effortlessly allows further exploration of the data contained in ADOPS projects.
37 CFR 1.822 - Symbols and format to be used for nucleotide and/or amino acid sequence data.

Code of Federal Regulations, 2013 CFR

2013-07-01

... in WIPO Standard ST.25 (1998), Appendix 2, Tables 1 and 3. This incorporation by reference was... ST.25 (1998), Appendix 2, Tables 1 and 3, shall be listed in a given sequence as “n” or “Xaa... acids. (1) The amino acids in a protein or peptide sequence shall be listed using the three-letter...
37 CFR 1.822 - Symbols and format to be used for nucleotide and/or amino acid sequence data.

Code of Federal Regulations, 2010 CFR

2010-07-01

... in WIPO Standard ST.25 (1998), Appendix 2, Tables 1 and 3. This incorporation by reference was... ST.25 (1998), Appendix 2, Tables 1 and 3, shall be listed in a given sequence as “n” or “Xaa... acids. (1) The amino acids in a protein or peptide sequence shall be listed using the three-letter...
37 CFR 1.822 - Symbols and format to be used for nucleotide and/or amino acid sequence data.

Code of Federal Regulations, 2012 CFR

2012-07-01

... in WIPO Standard ST.25 (1998), Appendix 2, Tables 1 and 3. This incorporation by reference was... ST.25 (1998), Appendix 2, Tables 1 and 3, shall be listed in a given sequence as “n” or “Xaa... acids. (1) The amino acids in a protein or peptide sequence shall be listed using the three-letter...
Use of CYP52A2A promoter to increase gene expression in yeast

DOEpatents

Craft, David L.; Wilson, C. Ron; Eirich, Dudley; Zhang, Yeyan

2004-01-06

A nucleic acid sequence including a CYP promoter operably linked to nucleic acid encoding a heterologous protein is provided to increase transcription of the nucleic acid. Expression vectors and host cells containing the nucleic acid sequence are also provided. The methods and compositions described herein are especially useful in the production of polycarboxylic acids by yeast cells.
Method of Identifying a Base in a Nucleic Acid

DOEpatents

Fodor, Stephen P. A.; Lipshutz, Robert J.; Huang, Xiaohua

1999-01-01

Devices and techniques for hybridization of nucleic acids and for determining the sequence of nucleic acids. Arrays of nucleic acids are formed by techniques, preferably high resolution, light-directed techniques. Positions of hybridization of a target nucleic acid are determined by, e.g., epifluorescence microscopy. Devices and techniques are proposed to determine the sequence of a target nucleic acid more efficiently and more quickly through such synthesis and detection techniques.
Identifying a base in a nucleic acid

DOEpatents

Fodor, Stephen P. A.; Lipshutz, Robert J.; Huang, Xiaohua

2005-02-08

Devices and techniques for hybridization of nucleic acids and for determining the sequence of nucleic acids. Arrays of nucleic acids are formed by techniques, preferably high resolution, light-directed techniques. Positions of hybridization of a target nucleic acid are determined by, e.g., epifluorescence microscopy. Devices and techniques are proposed to determine the sequence of a target nucleic acid more efficiently and more quickly through such synthesis and detection techniques.
Purification and Characterization of Suicin 65, a Novel Class I Type B Lantibiotic Produced by Streptococcus suis.

PubMed

Vaillancourt, Katy; LeBel, Geneviève; Frenette, Michel; Fittipaldi, Nahuel; Gottschalk, Marcelo; Grenier, Daniel

2015-01-01

Bacteriocins are antimicrobial peptides of bacterial origin that are considered a promising alternative to the use of conventional antibiotics. Recently, our laboratory reported the purification and characterization of two lantibiotics, suicin 90-1330 and suicin 3908, produced by the swine pathogen and zoonotic agent Streptococcus suis (serotype 2). In this study, a novel bacteriocin produced by S. suis has been identified and characterized. The producing strain S. suis 65 (serotype 2) was found to belong to sequence type 28, which includes strains known to be weakly virulent or avirulent in a mouse model. The bacteriocin, whose production was only possible following growth on solid culture medium, was purified to homogeneity by cationic exchange and reversed-phase high-pressure liquid chromatography. The bacteriocin, named suicin 65, was heat, pH and protease resistant. Suicin 65 was active against all S. suis isolates tested, including antibiotic resistant strains. Amino acid sequencing of the purified bacteriocin by Edman degradation revealed the presence of modified amino acids suggesting a lantibiotic. Using the partial sequence obtained, a BLAST search was performed against published genomes of S. suis, which identified a putative lantibiotic locus in the genome of S. suis 89-1591. From this genome, primers were designed and the gene cluster involved in the production of suicin 65 by S. suis 65 was amplified by PCR. Sequence analysis revealed the presence of ten open reading frames, including a duplicate of the structural gene. The structural genes (sssA and sssA') of suicin 65 encode a 25-amino acid residue leader peptide and a 26-amino acid residue mature peptide yielding an active bacteriocin with a deduced molecular mass of 3,005 Da. Mature suicin 65 showed a high degree of identity with class I type B lantibiotics (globular structure) produced by Streptococcus pyogenes (streptococcin FF22; 84.6%), Streptococcus macedonicus (macedocin ACA-DC 198; 84.6%), and Lactococcus lactis subsp. lactis (lacticin 481; 74.1%). Further studies will evaluate the ability of suicin 65 or the producing strain to prevent experimental S. suis infections in pigs.
Purification and Characterization of Suicin 65, a Novel Class I Type B Lantibiotic Produced by Streptococcus suis

PubMed Central

Vaillancourt, Katy; LeBel, Geneviève; Frenette, Michel; Fittipaldi, Nahuel; Gottschalk, Marcelo; Grenier, Daniel

2015-01-01

Bacteriocins are antimicrobial peptides of bacterial origin that are considered a promising alternative to the use of conventional antibiotics. Recently, our laboratory reported the purification and characterization of two lantibiotics, suicin 90–1330 and suicin 3908, produced by the swine pathogen and zoonotic agent Streptococcus suis (serotype 2). In this study, a novel bacteriocin produced by S. suis has been identified and characterized. The producing strain S. suis 65 (serotype 2) was found to belong to sequence type 28, which includes strains known to be weakly virulent or avirulent in a mouse model. The bacteriocin, whose production was only possible following growth on solid culture medium, was purified to homogeneity by cationic exchange and reversed-phase high-pressure liquid chromatography. The bacteriocin, named suicin 65, was heat, pH and protease resistant. Suicin 65 was active against all S. suis isolates tested, including antibiotic resistant strains. Amino acid sequencing of the purified bacteriocin by Edman degradation revealed the presence of modified amino acids suggesting a lantibiotic. Using the partial sequence obtained, a BLAST search was performed against published genomes of S. suis, which identified a putative lantibiotic locus in the genome of S. suis 89–1591. From this genome, primers were designed and the gene cluster involved in the production of suicin 65 by S. suis 65 was amplified by PCR. Sequence analysis revealed the presence of ten open reading frames, including a duplicate of the structural gene. The structural genes (sssA and sssA’) of suicin 65 encode a 25-amino acid residue leader peptide and a 26-amino acid residue mature peptide yielding an active bacteriocin with a deduced molecular mass of 3,005 Da. Mature suicin 65 showed a high degree of identity with class I type B lantibiotics (globular structure) produced by Streptococcus pyogenes (streptococcin FF22; 84.6%), Streptococcus macedonicus (macedocin ACA-DC 198; 84.6%), and Lactococcus lactis subsp. lactis (lacticin 481; 74.1%). Further studies will evaluate the ability of suicin 65 or the producing strain to prevent experimental S. suis infections in pigs. PMID:26709705

Covalent attachment of TAT peptides and thiolated alkyl molecules on GaAs surfaces.

PubMed

Cho, Youngnam; Ivanisevic, Albena

2005-07-07

Four TAT peptide fragments were used to functionalize GaAs surfaces by adsorption from solution. In addition, two well-studied alkylthiols, mercaptohexadecanoic acid (MHA) and 1-octadecanethiol (ODT) were utilized as references to understand the structure of the TAT peptide monolayer on GaAs. The different sequences of TAT peptides were employed in recognition experiments where a synthetic RNA sequence was tested to verify the specific interaction with the TAT peptide. The modified GaAs surfaces were characterized by atomic force microscopy (AFM), X-ray photoelectron spectroscopy (XPS), and Fourier transform infrared reflection absorption spectroscopy (FT-IRRAS). AFM studies were used to compare the surface roughness before and after functionalization. XPS allowed us to characterize the chemical composition of the GaAs surface and conclude that the monolayers composed of different sequences of peptides have similar surface chemistries. Finally, FT-IRRAS experiments enabled us to deduce that the TAT peptide monolayers have a fairly ordered and densely packed alkyl chain structure. The recognition experiments showed preferred interaction of the RNA sequence toward peptides with high arginine content.
Molecular coevolution of mammalian ribosomal gene terminator sequences and the transcription termination factor TTF-I.

PubMed Central

Evers, R; Grummt, I

1995-01-01

Both the DNA elements and the nuclear factors that direct termination of ribosomal gene transcription exhibit species-specific differences. Even between mammals--e.g., human and mouse--the termination signals are not identical and the respective transcription termination factors (TTFs) which bind to the terminator sequence are not fully interchangeable. To elucidate the molecular basis for this species-specificity, we have cloned TTF-I from human and mouse cells and compared their structural and functional properties. Recombinant TTF-I exhibits species-specific DNA binding and terminates transcription both in cell-free transcription assays and in transfection experiments. Chimeric constructs of mouse TTF-I and human TTF-I reveal that the major determinant for species-specific DNA binding resides within the C terminus of TTF-I. Replacing 31 C-terminal amino acids of mouse TTF-I with the homologous human sequences relaxes the DNA-binding specificity and, as a consequence, allows the chimeric factor to bind the human terminator sequence and to specifically stop rDNA transcription. PMID:7597036
Towards comprehensive structural motif mining for better fold annotation in the "twilight zone" of sequence dissimilarity

PubMed Central

Jia, Yi; Huan, Jun; Buhr, Vincent; Zhang, Jintao; Carayannopoulos, Leonidas N

2009-01-01

Background: Automatic identification of structure fingerprints from a group of diverse protein structures is challenging, especially for proteins whose divergent amino acid sequences may fall into the "twilight-" or "midnight-" zones where pair-wise sequence identities to known sequences fall below 25% and sequence-based functional annotations often fail. Results: Here we report a novel graph database mining method and demonstrate its application to protein structure pattern identification and structure classification. The biologic motivation of our study is to recognize common structure patterns in "immunoevasins", proteins mediating virus evasion of host immune defense. Our experimental study, using both viral and non-viral proteins, demonstrates the efficiency and efficacy of the proposed method. Conclusion: We present a theoretic framework, offer a practical software implementation for incorporating prior domain knowledge, such as substitution matrices as studied here, and devise an efficient algorithm to identify approximate matched frequent subgraphs. By doing so, we significantly expanded the analytical power of sophisticated data mining algorithms in dealing with large volumes of complicated and noisy protein structure data. Without loss of generality, choice of appropriate compatibility matrices allows our method to be easily employed in domains where subgraph labels have some uncertainty. PMID:19208148
Cloning and characterization of a Prevotella melaninogenica hemolysin.

PubMed Central

Allison, H E; Hillman, J D

1997-01-01

Hemolysins have been proven to be important virulence factors in many medically relevant pathogenic organisms. Their production has also been implicated in the etiology of periodontal disease. Hemolytic strain 361B of Prevotella melaninogenica, a putative etiologic agent of periodontal disease, was used in this study. The cloning, sequencing, and characterization of phyA, the structural gene for a P. melaninogenica hemolysin, is described. No extensive sequence homology could be identified between phyA and any reported sequence at either the nucleotide or amino acid level. As predicted from sequence analysis, this gene produces a 39-kDa protein which has hemolytic activity as measured by zymogram analysis. Unlike many Ca2+-dependent bacterial hemolysins, both the cloned and native PhyA proteins were enhanced by the presence of EDTA in a dose-dependent fashion with 40 mM EDTA allowing maximum activity. Ca2+ and Mg2+ were found to be inhibitory. The hemolytic activity also was found to have a dose-dependent endpoint. Through recovery of hemolytic activity from a spent reaction, this endpoint was shown to be the result of end product inhibition. This is the first report describing the cloning and sequencing of a gene from P. melaninogenica. PMID:9199448
Cloning and characterization of a Prevotella melaninogenica hemolysin.

PubMed

Allison, H E; Hillman, J D

1997-07-01

Hemolysins have been proven to be important virulence factors in many medically relevant pathogenic organisms. Their production has also been implicated in the etiology of periodontal disease. Hemolytic strain 361B of Prevotella melaninogenica, a putative etiologic agent of periodontal disease, was used in this study. The cloning, sequencing, and characterization of phyA, the structural gene for a P. melaninogenica hemolysin, is described. No extensive sequence homology could be identified between phyA and any reported sequence at either the nucleotide or amino acid level. As predicted from sequence analysis, this gene produces a 39-kDa protein which has hemolytic activity as measured by zymogram analysis. Unlike many Ca2+-dependent bacterial hemolysins, both the cloned and native PhyA proteins were enhanced by the presence of EDTA in a dose-dependent fashion with 40 mM EDTA allowing maximum activity. Ca2+ and Mg2+ were found to be inhibitory. The hemolytic activity also was found to have a dose-dependent endpoint. Through recovery of hemolytic activity from a spent reaction, this endpoint was shown to be the result of end product inhibition. This is the first report describing the cloning and sequencing of a gene from P. melaninogenica.
Conserved features of eukaryotic hsp70 genes revealed by comparison with the nucleotide sequence of human hsp70.

PubMed Central

Hunt, C; Morimoto, R I

1985-01-01

We have determined the nucleotide sequence of the human hsp70 gene and 5' flanking region. The hsp70 gene is transcribed as an uninterrupted primary transcript of 2440 nucleotides composed of a 5' noncoding leader sequence of 212 nucleotides, a 3' noncoding region of 242 nucleotides, and a continuous open reading frame of 1986 nucleotides that encodes a protein with predicted molecular mass of 69,800 daltons. Upstream of the 5' terminus are the canonical TATAAA box, the sequence ATTGG that corresponds in the inverted orientation to the CCAAT motif, and the dyad sequence CTGGAAT/ATTCCCG that shares homology in 12 of 14 positions with the consensus transcription regulatory sequence common to Drosophila heat shock genes. Comparison of the predicted amino acid sequences of human hsp70 with the published sequences of Drosophila hsp70 and Escherichia coli dnaK reveals that human hsp70 is 73% identical to Drosophila hsp70 and 47% identical to E. coli dnaK. Surprisingly, the nucleotide sequences of the human and Drosophila genes are 72% identical and human and E. coli genes are 50% identical, which is more highly conserved than necessary given the degeneracy of the genetic code. The lack of accumulated silent nucleotide substitutions leads us to propose that there may be additional information in the nucleotide sequence of the hsp70 gene or the corresponding mRNA that precludes the maximum divergence allowed in the silent codon positions. PMID:3931075
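The identity figures quoted in this abstract (73%, 47%, 72%, 50%) are percent-identity values between aligned sequences. As a rough illustration of how such a number is computed, here is a minimal sketch that assumes the two sequences are already aligned to equal length and that gapped columns are excluded from the denominator. Both are assumptions of this example, since the abstract does not state the convention used, and the function name `percent_identity` is hypothetical. The sketch also checks that the transcript segment lengths given in the abstract add up to the stated total.

```python
def percent_identity(a: str, b: str, gap: str = "-") -> float:
    """Percent of identical positions between two pre-aligned sequences.

    Columns where either sequence has a gap are skipped (one of several
    conventions in use; chosen here only for illustration).
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(a, b):
        if x == gap or y == gap:
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Segment lengths reported in the abstract: 5' leader, 3' noncoding region, ORF.
TRANSCRIPT_PARTS = (212, 242, 1986)

if __name__ == "__main__":
    # The three segments sum to the reported primary transcript length.
    assert sum(TRANSCRIPT_PARTS) == 2440
    print(round(percent_identity("ATGGCC", "ATGGCA"), 1))
```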
Domain similarity based orthology detection.

PubMed

Bitard-Feildel, Tristan; Kemena, Carsten; Greenwood, Jenny M; Bornberg-Bauer, Erich

2015-05-13

Orthologous protein detection software mostly uses pairwise comparisons of amino-acid sequences to assert whether two proteins are orthologous or not. Accordingly, when the number of sequences for comparison increases, the number of comparisons to compute grows in a quadratic order. A current challenge of bioinformatic research, especially when taking into account the increasing number of sequenced organisms available, is to make this ever-growing number of comparisons computationally feasible in a reasonable amount of time. We propose to speed up the detection of orthologous proteins by using strings of domains to characterize the proteins. We present two new protein similarity measures, a cosine and a maximal weight matching score based on domain content similarity, and new software, named porthoDom. The qualities of the cosine and the maximal weight matching similarity measures are compared against curated datasets. The measures show that domain content similarities are able to correctly group proteins into their families. Accordingly, the cosine similarity measure is used inside porthoDom, the wrapper developed for proteinortho. porthoDom makes use of domain content similarity measures to group proteins together before searching for orthologs. By using domains instead of amino acid sequences, the reduction of the search space decreases the computational complexity of an all-against-all sequence comparison. We demonstrate that representing and comparing proteins as strings of discrete domains, i.e. as a concatenation of their unique identifiers, allows a drastic simplification of search space. porthoDom has the advantage of speeding up orthology detection while maintaining a degree of accuracy similar to proteinortho. The implementation of porthoDom is released using python and C++ languages and is available under the GNU GPL licence 3 at http://www.bornberglab.org/pages/porthoda .
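The cosine measure on domain content described in this abstract can be illustrated with a short sketch. This is not porthoDom's implementation: the function name `domain_cosine`, the placeholder domain identifiers, and the choice to count repeated domains are all assumptions made for illustration. Each protein is represented as a list of domain identifiers, turned into a count vector, and compared by the usual cosine formula.

```python
import math
from collections import Counter
from typing import Iterable


def domain_cosine(domains_a: Iterable[str], domains_b: Iterable[str]) -> float:
    """Cosine similarity between the domain-count vectors of two proteins."""
    ca, cb = Counter(domains_a), Counter(domains_b)
    # Only domains present in both proteins contribute to the dot product.
    dot = sum(ca[d] * cb[d] for d in ca.keys() & cb.keys())
    norm_a = math.sqrt(sum(v * v for v in ca.values()))
    norm_b = math.sqrt(sum(v * v for v in cb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a protein with no annotated domains matches nothing
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    # "D1", "D2", "D3" are placeholder identifiers, not real database accessions.
    print(domain_cosine(["D1", "D2"], ["D1", "D2"]))
    print(domain_cosine(["D1", "D2"], ["D3"]))
```

Grouping proteins by a score like this before running sequence-level comparison is what shrinks the all-against-all search space the abstract refers to: only proteins within the same group need pairwise alignment.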
Identification of antigenic regions on VP2 of African horsesickness virus serotype 3 by using phage-displayed epitope libraries.

PubMed

Bentley, L; Fehrsen, J; Jordaan, F; Huismans, H; du Plessis, D H

2000-04-01

VP2 is an outer capsid protein of African horsesickness virus (AHSV) and is recognized by serotype-discriminatory neutralizing antibodies. With the objective of locating its antigenic regions, a filamentous phage library was constructed that displayed peptides derived from the fragmentation of a cDNA copy of the gene encoding VP2. Peptides ranging in size from approximately 30 to 100 amino acids were fused with pIII, the attachment protein of the display vector, fUSE2. To ensure maximum diversity, the final library consisted of three sub-libraries. The first utilized enzymatically fragmented DNA encoding only the VP2 gene, the second included plasmid sequences, while the third included a PCR step designed to allow different peptide-encoding sequences to recombine before ligation into the vector. The resulting composite library was subjected to immunoaffinity selection with AHSV-specific polyclonal chicken IgY, polyclonal horse immunoglobulins and a monoclonal antibody (MAb) known to neutralize AHSV. Antigenic peptides were located by sequencing the DNA of phages bound by the antibodies. Most antigenic determinants capable of being mapped by this method were located in the N-terminal half of VP2. Important binding areas were mapped with high resolution by identifying the minimum overlapping areas of the selected peptides. The MAb was also used to screen a random 17-mer epitope library. Sequences that may be part of a discontinuous neutralization epitope were identified. The amino acid sequences of the antigenic regions on VP2 of serotype 3 were compared with corresponding regions on three other serotypes, revealing regions with the potential to discriminate AHSV serotypes serologically.
Rapid Identification of Cryptococcus neoformans var. grubii, C. neoformans var. neoformans, and C. gattii by Use of Rapid Biochemical Tests, Differential Media, and DNA Sequencing

PubMed Central

McTaggart, Lisa; Richardson, Susan E.; Seah, Christine; Hoang, Linda; Fothergill, Annette; Zhang, Sean X.

2011-01-01

Rapid identification of Cryptococcus neoformans var. grubii, Cryptococcus neoformans var. neoformans, and Cryptococcus gattii is imperative for facilitation of prompt treatment of cryptococcosis and for understanding the epidemiology of the disease. Our purpose was to evaluate a test algorithm incorporating commercial rapid biochemical tests, differential media, and DNA sequence analysis that will allow us to differentiate these taxa rapidly and accurately. We assessed 147 type, reference, and clinical isolates, including 6 other Cryptococcus spp. (10 isolates) and 14 other yeast species (24 isolates), using a 4-hour urea broth test (Remel), a 24-hour urea broth test (Becton Dickinson), a 4-hour caffeic acid disk test (Hardy Diagnostics and Remel), 40- to 44-hour growth assessment on l-canavanine glycine bromothymol blue (CGB) agar, and intergenic spacer (IGS) sequence analysis. All 123 Cryptococcus isolates hydrolyzed urea, along with 7 isolates of Rhodotorula and Trichosporon. Eighty-five of 86 C. neoformans (99%) and 26 of 27 C. gattii (96%) isolates had positive caffeic acid results, unlike the other cryptococci (0/10) and yeast species (0/24). Together, these two tests positively identified virtually all C. neoformans/C. gattii isolates (98%) within 4 h. CGB agar or IGS sequencing further differentiated these isolates within 48 h. On CGB, 25 of 27 (93%) C. gattii strains induced a blue color change, in contrast to 0 of 86 C. neoformans isolates. Neighbor-joining cluster analysis of IGS sequences differentiated C. neoformans var. grubii, C. neoformans var. neoformans, and C. gattii. Based on these results, we describe a rapid identification algorithm for use in a microbiology laboratory to distinguish clinically relevant Cryptococcus spp. PMID:21593254
Rapid identification of Cryptococcus neoformans var. grubii, C. neoformans var. neoformans, and C. gattii by use of rapid biochemical tests, differential media, and DNA sequencing.

PubMed

McTaggart, Lisa; Richardson, Susan E; Seah, Christine; Hoang, Linda; Fothergill, Annette; Zhang, Sean X

2011-07-01

Rapid identification of Cryptococcus neoformans var. grubii, Cryptococcus neoformans var. neoformans, and Cryptococcus gattii is imperative for facilitation of prompt treatment of cryptococcosis and for understanding the epidemiology of the disease. Our purpose was to evaluate a test algorithm incorporating commercial rapid biochemical tests, differential media, and DNA sequence analysis that will allow us to differentiate these taxa rapidly and accurately. We assessed 147 type, reference, and clinical isolates, including 6 other Cryptococcus spp. (10 isolates) and 14 other yeast species (24 isolates), using a 4-hour urea broth test (Remel), a 24-hour urea broth test (Becton Dickinson), a 4-hour caffeic acid disk test (Hardy Diagnostics and Remel), 40- to 44-hour growth assessment on l-canavanine glycine bromothymol blue (CGB) agar, and intergenic spacer (IGS) sequence analysis. All 123 Cryptococcus isolates hydrolyzed urea, along with 7 isolates of Rhodotorula and Trichosporon. Eighty-five of 86 C. neoformans (99%) and 26 of 27 C. gattii (96%) isolates had positive caffeic acid results, unlike the other cryptococci (0/10) and yeast species (0/24). Together, these two tests positively identified virtually all C. neoformans/C. gattii isolates (98%) within 4 h. CGB agar or IGS sequencing further differentiated these isolates within 48 h. On CGB, 25 of 27 (93%) C. gattii strains induced a blue color change, in contrast to 0 of 86 C. neoformans isolates. Neighbor-joining cluster analysis of IGS sequences differentiated C. neoformans var. grubii, C. neoformans var. neoformans, and C. gattii. Based on these results, we describe a rapid identification algorithm for use in a microbiology laboratory to distinguish clinically relevant Cryptococcus spp.
Genetically encoded fluorescent tags

PubMed Central

Thorn, Kurt

2017-01-01

Genetically encoded fluorescent tags are protein sequences that can be fused to a protein of interest to render it fluorescent. These tags have revolutionized cell biology by allowing nearly any protein to be imaged by light microscopy at submicrometer spatial resolution and subsecond time resolution in a live cell or organism. They can also be used to measure protein abundance in thousands to millions of cells using flow cytometry. Here I provide an introduction to the different genetic tags available, including both intrinsically fluorescent proteins and proteins that derive their fluorescence from binding of either endogenous or exogenous fluorophores. I discuss their optical and biological properties and guidelines for choosing appropriate tags for an experiment. Tools for tagging nucleic acid sequences and reporter molecules that detect the presence of different biomolecules are also briefly discussed. PMID:28360214
The neXtProt peptide uniqueness checker: a tool for the proteomics community.

PubMed

Schaeffer, Mathieu; Gateau, Alain; Teixeira, Daniel; Michel, Pierre-André; Zahn-Zabal, Monique; Lane, Lydie

2017-11-01

The neXtProt peptide uniqueness checker allows scientists to define which peptides can be used to validate the existence of human proteins, i.e. map uniquely versus multiply to human protein sequences taking into account isobaric substitutions, alternative splicing and single amino acid variants. The pepx program is available at https://github.com/calipho-sib/pepx and can be launched from the command line or through a cgi web interface. Indexing requires a sequence file in FASTA format. The peptide uniqueness checker tool is freely available on the web at https://www.nextprot.org/tools/peptide-uniqueness-checker and from the neXtProt API at https://api.nextprot.org/. Contact: lydie.lane@sib.swiss. © The Author(s) 2017. Published by Oxford University Press.
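The idea of a peptide mapping "uniquely versus multiply" can be sketched in a few lines. This is an illustrative toy, not the pepx algorithm: the function names and the example sequences are hypothetical, only the leucine/isoleucine pair is treated as isobaric, and alternative splicing and single amino acid variants are ignored entirely.

```python
from typing import Dict, List


def normalize(seq: str) -> str:
    """Collapse isoleucine onto leucine, since the two residues are isobaric."""
    return seq.upper().replace("I", "L")


def matching_proteins(peptide: str, proteins: Dict[str, str]) -> List[str]:
    """Names of all proteins whose sequence contains the peptide (I/L-insensitive)."""
    needle = normalize(peptide)
    return sorted(name for name, seq in proteins.items() if needle in normalize(seq))


def is_unique(peptide: str, proteins: Dict[str, str]) -> bool:
    """True when the peptide maps to exactly one protein in the collection."""
    return len(matching_proteins(peptide, proteins)) == 1


if __name__ == "__main__":
    # Made-up sequences for illustration only.
    toy = {"P1": "MKTAILGR", "P2": "MKTALLGR", "P3": "MSSQW"}
    print(matching_proteins("TAIL", toy), is_unique("TAIL", toy))
    print(matching_proteins("SSQ", toy), is_unique("SSQ", toy))
```

A linear scan like this is only workable for tiny collections; the abstract notes that the real tool builds an index from a FASTA file first.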
SA-Search: a web tool for protein structure mining based on a Structural Alphabet

PubMed Central

Guyon, Frédéric; Camproux, Anne-Claude; Hochez, Joëlle; Tufféry, Pierre

2004-01-01

SA-Search is a web tool that can be used to mine for protein structures and extract structural similarities. It is based on a hidden Markov model derived Structural Alphabet (SA) that allows the compression of three-dimensional (3D) protein conformations into a one-dimensional (1D) representation using a limited number of prototype conformations. Using such a representation, classical methods developed for amino acid sequences can be employed. Currently, SA-Search permits the performance of fast 3D similarity searches such as the extraction of exact words using a suffix tree approach, and the search for fuzzy words viewed as a simple 1D sequence alignment problem. SA-Search is available at http://bioserv.rpbs.jussieu.fr/cgi-bin/SA-Search. PMID:15215446
SA-Search: a web tool for protein structure mining based on a Structural Alphabet.

PubMed

Guyon, Frédéric; Camproux, Anne-Claude; Hochez, Joëlle; Tufféry, Pierre

2004-07-01

SA-Search is a web tool that can be used to mine for protein structures and extract structural similarities. It is based on a hidden Markov model derived Structural Alphabet (SA) that allows the compression of three-dimensional (3D) protein conformations into a one-dimensional (1D) representation using a limited number of prototype conformations. Using such a representation, classical methods developed for amino acid sequences can be employed. Currently, SA-Search permits the performance of fast 3D similarity searches such as the extraction of exact words using a suffix tree approach, and the search for fuzzy words viewed as a simple 1D sequence alignment problem. SA-Search is available at http://bioserv.rpbs.jussieu.fr/cgi-bin/SA-Search.
Streptococcal phosphoenolpyruvate-sugar phosphotransferase system: amino acid sequence and site of ATP-dependent phosphorylation of HPr

DOE Office of Scientific and Technical Information (OSTI.GOV)

Deutscher, J.; Pevec, B.; Beyreuther, K.

1986-10-21

The amino acid sequence of histidine-containing protein (HPr) from Streptococcus faecalis has been determined by direct Edman degradation of intact HPr and by amino acid sequence analysis of tryptic peptides, V8 proteolytic peptides, thermolytic peptides, and cyanogen bromide cleavage products. HPr from S. faecalis was found to contain 89 amino acid residues, corresponding to a molecular weight of 9438. The amino acid sequence of HPr from S. faecalis shows extended homology to the primary structure of HPr proteins from other bacteria. Besides the phosphoenolpyruvate-dependent phosphorylation of a histidyl residue in HPr, catalyzed by enzyme I of the bacterial phosphotransferase system, HPr was also found to be phosphorylated at a seryl residue in an ATP-dependent protein kinase catalyzed reaction. The site of ATP-dependent phosphorylation in HPr of S. faecalis has now been determined. [32P]P-Ser-HPr was digested with three different proteases, and in each case, a single labeled peptide was isolated. Following digestion with subtilisin, they obtained a peptide with the sequence -(P)Ser-Ile-Met-. Using chymotrypsin, they isolated a peptide with the sequence -Ser-Val-Asn-Leu-Lys-(P)Ser-Ile-Met-Gly-Val-Met-. The longest labeled peptide was obtained with V8 staphylococcal protease. According to amino acid analysis, this peptide contained 36 out of the 89 amino acid residues of HPr. The following sequence of 12 amino acid residues of the V8 peptide was determined: -Tyr-Lys-Gly-Lys-Ser-Val-Asn-Leu-Lys-(P)Ser-Ile-Met-. Thus, the site of ATP-dependent phosphorylation was determined to be Ser-46 within the primary structure of HPr.
Methods and compositions for regulating gene expression in plant cells

NASA Technical Reports Server (NTRS)

Dai, Shunhong (Inventor); Beachy, Roger N. (Inventor); Luis, Maria Isabel Ordiz (Inventor)

2010-01-01

Novel chimeric plant promoter sequences are provided, together with plant gene expression cassettes comprising such sequences. In certain preferred embodiments, the chimeric plant promoters comprise the BoxII cis element and/or derivatives thereof. In addition, novel transcription factors are provided, together with nucleic acid sequences encoding such transcription factors and plant gene expression cassettes comprising such nucleic acid sequences. In certain preferred embodiments, the novel transcription factors comprise the acidic domain, or fragments thereof, of the RF2a transcription factor. Methods for using the chimeric plant promoter sequences and novel transcription factors in regulating the expression of at least one gene of interest are provided, together with transgenic plants comprising such chimeric plant promoter sequences and novel transcription factors.
The complete amino acid sequence of human skeletal-muscle fructose-bisphosphate aldolase.

PubMed Central

Freemont, P S; Dunbar, B; Fothergill-Gilmore, L A

1988-01-01

The complete amino acid sequence of human skeletal-muscle fructose-bisphosphate aldolase, comprising 363 residues, was determined. The sequence was deduced by automated sequencing of CNBr-cleavage, o-iodosobenzoic acid-cleavage, trypsin-digest and staphylococcal-proteinase-digest fragments. Comparison of the sequence with other class I aldolase sequences shows that the mammalian muscle isoenzyme is one of the most highly conserved enzymes known, with only about 2% of the residues changing per 100 million years. Non-mammalian aldolases appear to be evolving at the same rate as other glycolytic enzymes, with about 4% of the residues changing per 100 million years. Secondary-structure predictions are analysed in an accompanying paper [Sawyer, Fothergill-Gilmore & Freemont (1988) Biochem. J. 249, 789-793]. PMID:3355497
Fragmentations of [M-H]- anions of peptides containing tyrosine sulfate. Does the sulfate group rearrange? A joint experimental and theoretical study.

PubMed

Tran, T T Nha; Wang, Tianfang; Hack, Sandra; Bowie, John H

2013-05-30

To investigate the fragmentations in the negative-ion electrospray mass spectra of peptides containing tyrosine sulfate. Possible fragmentation mechanisms were explored using a Waters QTOF2 tandem mass spectrometer in concert with calculations at the CAM-B3LYP/6-311++g(d,p) level of theory. The major negative ion formed in the ESI-MS of peptides containing tyrosine sulfate is [(M-H)-SO3](-) and this process normally yields the base peak of the spectrum. The basic backbone cleavages of [(M-H)-SO3](-) allowed the sequence of the peptide to be determined. Rearrangement reactions involving the formation of HOSO3(-) and [(M-H)-H2SO4](-) yielded minor peaks with relative abundances ≤ 10% and ≤ 2%, respectively. The mass spectra of the [M-H](-) and [(M-H)-SO3](-) anions of peptides containing tyrosine sulfate allowed the position of the tyrosine sulfate group to be determined, together with the amino acid sequence of the peptide. Copyright © 2013 John Wiley & Sons, Ltd.
Functional and evolutionary relationships between bacteriorhodopsin and halorhodopsin in the archaebacterium, Halobacterium halobium

NASA Technical Reports Server (NTRS)

Lanyi, J. K.

1986-01-01

The archaebacteria occupy a unique place in phylogenetic trees constructed from analyses of sequences from key informational macromolecules, and their study continues to yield interesting ideas on the early evolution and divergence of biological forms. It is now known that the halobacteria among these species contain various retinal-proteins, resembling eukaryotic rhodopsins, but with different functions. Two of these pigments, located in the cytoplasmic membranes of the bacteria, are bacteriorhodopsin (a light-driven proton pump) and halorhodopsin (a light-driven chloride pump). Comparison of these systems is expected to reveal structure/function relationships in these simple (primitive?) energy transducing membrane components and evolutionary relationships which had produced the structural features which allow the divergent functions. Findings indicate that very different primary structures are needed for these proteins to accomplish their different functions. Indeed, analysis of partial amino acid sequences from halo-opsin shows already that few if any long segments exist which are homologous to bacterio-opsin. Either these proteins diverged a very long time ago to allow for the observed differences, or the evolutionary clock in the halobacteria runs faster than usual.
Cloning and sequencing of the allophycocyanin genes from Spirulina maxima (Cyanophyta)

NASA Astrophysics Data System (ADS)

Qin, Song; Hiroyuki, Kojima; Yoshikazu, Kawata; Shin-Ichi, Yano; Zeng, Cheng-Kui

1998-03-01

The genes coding for the α- and β-subunits of allophycocyanin (apcA and apcB) from the cyanophyte Spirulina maxima were cloned and sequenced. The results revealed 44.4% nucleotide sequence similarity and 30.4% similarity of the deduced amino acid sequences between them. The amino acid sequence identities between S. maxima and S. platensis are 99.4% for the α subunit and 100% for the β subunit.
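Figures such as these are typically obtained from a pairwise alignment. As a minimal sketch, assuming the two sequences have already been aligned to equal length with "-" marking gaps, percent identity can be computed as below. The function name and the toy sequences are hypothetical, and this is not necessarily how the authors computed their values.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of gap-free alignment columns in which both sequences match."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # ignore columns that contain a gap
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


if __name__ == "__main__":
    # Toy nucleotide strings, not real apcA/apcB sequences.
    print(percent_identity("ATGAGTATC", "ATGCAAGAC"))
```

Note that the choice of denominator (gap-free columns here) is a convention; other tools divide by alignment length or by the shorter sequence length, which gives different percentages.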

Use of linalool synthase in genetic engineering of scent production

DOEpatents

Pichersky, E.

1998-12-15

A purified S-linalool synthase polypeptide from Clarkia breweri is disclosed as is the recombinant polypeptide and nucleic acid sequences encoding the polypeptide. Also disclosed are antibodies immunoreactive with the purified peptide and with recombinant versions of the polypeptide. Methods of using the nucleic acid sequences, as well as methods of enhancing the smell and the flavor of plants expressing the nucleic acid sequences are also disclosed. 5 figs.
Use of linalool synthase in genetic engineering of scent production

DOEpatents

Pichersky, Eran

1998-01-01

A purified S-linalool synthase polypeptide from Clarkia breweri is disclosed as is the recombinant polypeptide and nucleic acid sequences encoding the polypeptide. Also disclosed are antibodies immunoreactive with the purified peptide and with recombinant versions of the polypeptide. Methods of using the nucleic acid sequences, as well as methods of enhancing the smell and the flavor of plants expressing the nucleic acid sequences are also disclosed.
Probe kit for identifying a base in a nucleic acid

DOEpatents

Fodor, Stephen P. A.; Lipshutz, Robert J.; Huang, Xiaohua

2001-01-01

Devices and techniques for hybridization of nucleic acids and for determining the sequence of nucleic acids. Arrays of nucleic acids are formed by techniques, preferably high resolution, light-directed techniques. Positions of hybridization of a target nucleic acid are determined by, e.g., epifluorescence microscopy. Devices and techniques are proposed to determine the sequence of a target nucleic acid more efficiently and more quickly through such synthesis and detection techniques.
Crotoxin: Structural Studies, Mechanism of Action and Cloning of its Gene

DTIC Science & Technology

1988-03-01

... thirteen amino acids being acidic. Sequencing of the three peptides present in the acidic subunit, two of which are blocked by pyroglutamate ... the sequence determination of both the basic and acidic subunits of crotoxin. The acidic subunit peptides were difficult, since two of ... fluorescence spectroscopy. Results indicate a large conformational change occurs upon complex formation between the acidic and basic subunits of all four ...
37 CFR 1.822 - Symbols and format to be used for nucleotide and/or amino acid sequence data.

Code of Federal Regulations, 2014 CFR

2014-07-01

... base or modified or unusual amino acid may be presented in a given sequence as the corresponding unmodified base or amino acid if the modified base or modified or unusual amino acid is one of those listed... the Feature section. Otherwise, each occurrence of a base or amino acid not appearing in WIPO Standard...
G4RNA: an RNA G-quadruplex database

PubMed Central

Garant, Jean-Michel; Luce, Mikael J.; Scott, Michelle S.

2015-01-01

G-quadruplexes (G4) are tetrahelical structures formed from planar arrangement of guanines in nucleic acids. A simple, regular motif was originally proposed to describe G4-forming sequences. More recently, however, formation of G4 was discovered to depend, at least in part, on the contextual backdrop of neighboring sequences. Prediction of G4 folding is thus becoming more challenging as G4 outlier structures, not described by the originally proposed motif, are increasingly reported. Recent observations thus call for a comprehensive tool, capable of consolidating the expanding information on tested G4s, in order to conduct systematic comparative analyses of G4-promoting sequences. The G4RNA Database we propose was designed to help meet the need for easily-retrievable data on known RNA G4s. A user-friendly, flexible query system allows for data retrieval on experimentally tested sequences, from many separate genes, to assess G4-folding potential. Query output sorts data according to sequence position, G4 likelihood, experimental outcomes and associated bibliographical references. G4RNA also provides an ideal foundation to collect and store additional sequence and experimental data, considering the growing interest G4s currently generate. Database URL: scottgroup.med.usherbrooke.ca/G4RNA PMID:26200754
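The "simple, regular motif" is not spelled out in this abstract. The sketch below assumes the commonly cited canonical form, four runs of at least three guanines separated by loops of one to seven nucleotides, and searches for it with a regular expression. The function name and test sequence are hypothetical, and this is not the G4RNA query system itself. As the abstract notes, such a pattern will miss the outlier G4 structures that motivated the database.

```python
import re

# Assumed canonical motif: G3+ N1-7 G3+ N1-7 G3+ N1-7 G3+
G4_MOTIF = re.compile(r"G{3,}[ACGU]{1,7}G{3,}[ACGU]{1,7}G{3,}[ACGU]{1,7}G{3,}")


def find_canonical_g4(sequence: str):
    """Return (start, end, match) tuples for non-overlapping canonical motif hits.

    DNA input is accepted; T is converted to U before searching.
    Coordinates are 0-based with an exclusive end.
    """
    rna = sequence.upper().replace("T", "U")
    return [(m.start(), m.end(), m.group()) for m in G4_MOTIF.finditer(rna)]


if __name__ == "__main__":
    for hit in find_canonical_g4("AAGGGAGGGAGGGAGGGAA"):
        print(hit)
```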
IMG/VR: a database of cultured and uncultured DNA Viruses and retroviruses.

PubMed

Paez-Espino, David; Chen, I-Min A; Palaniappan, Krishna; Ratner, Anna; Chu, Ken; Szeto, Ernest; Pillay, Manoj; Huang, Jinghua; Markowitz, Victor M; Nielsen, Torben; Huntemann, Marcel; K Reddy, T B; Pavlopoulos, Georgios A; Sullivan, Matthew B; Campbell, Barbara J; Chen, Feng; McMahon, Katherine; Hallam, Steve J; Denef, Vincent; Cavicchioli, Ricardo; Caffrey, Sean M; Streit, Wolfgang R; Webster, John; Handley, Kim M; Salekdeh, Ghasem H; Tsesmetzis, Nicolas; Setubal, Joao C; Pope, Phillip B; Liu, Wen-Tso; Rivers, Adam R; Ivanova, Natalia N; Kyrpides, Nikos C

2017-01-04

Viruses represent the most abundant life forms on the planet. Recent experimental and computational improvements have led to a dramatic increase in the number of viral genome sequences identified primarily from metagenomic samples. As a result of the expanding catalog of metagenomic viral sequences, there exists a need for a comprehensive computational platform integrating all these sequences with associated metadata and analytical tools. Here we present IMG/VR (https://img.jgi.doe.gov/vr/), the largest publicly available database of 3908 isolate reference DNA viruses with 264 413 computationally identified viral contigs from >6000 ecologically diverse metagenomic samples. Approximately half of the viral contigs are grouped into genetically distinct quasi-species clusters. Microbial hosts are predicted for 20 000 viral sequences, revealing nine microbial phyla previously unreported to be infected by viruses. Viral sequences can be queried using a variety of associated metadata, including habitat type and geographic location of the samples, or taxonomic classification according to hallmark viral genes. IMG/VR has a user-friendly interface that allows users to interrogate all integrated data and interact by comparing with external sequences, thus serving as an essential resource in the viral genomics community. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
Continuous in vitro evolution of bacteriophage RNA polymerase promoters

NASA Technical Reports Server (NTRS)

Breaker, R. R.; Banerji, A.; Joyce, G. F.

1994-01-01

Rapid in vitro evolution of bacteriophage T7, T3, and SP6 RNA polymerase promoters was achieved by a method that allows continuous enrichment of DNAs that contain functional promoter elements. This method exploits the ability of a special class of nucleic acid molecules to replicate continuously in the presence of both a reverse transcriptase and a DNA-dependent RNA polymerase. Replication involves the synthesis of both RNA and cDNA intermediates. The cDNA strand contains an embedded promoter sequence, which becomes converted to a functional double-stranded promoter element, leading to the production of RNA transcripts. Synthetic cDNAs, including those that contain randomized promoter sequences, can be used to initiate the amplification cycle. However, only those cDNAs that contain functional promoter sequences are able to produce RNA transcripts. Furthermore, each RNA transcript encodes the RNA polymerase promoter sequence that was responsible for initiation of its own transcription. Thus, the population of amplifying molecules quickly becomes enriched for those templates that encode functional promoters. Optimal promoter sequences for phage T7, T3, and SP6 RNA polymerase were identified after a 2-h amplification reaction, initiated in each case with a pool of synthetic cDNAs encoding greater than 10^10 promoter sequence variants.
GeneSilico protein structure prediction meta-server.

PubMed

Kurowski, Michal A; Bujnicki, Janusz M

2003-07-01

Rigorous assessments of protein structure prediction have demonstrated that fold recognition methods can identify remote similarities between proteins when standard sequence search methods fail. It has been shown that the accuracy of predictions is improved when refined multiple sequence alignments are used instead of single sequences and if different methods are combined to generate a consensus model. There are several meta-servers available that integrate protein structure predictions performed by various methods, but they do not allow for submission of user-defined multiple sequence alignments and they seldom offer confidentiality of the results. We developed a novel WWW gateway for protein structure prediction, which combines the useful features of other meta-servers available, but with much greater flexibility of the input. The user may submit an amino acid sequence or a multiple sequence alignment to a set of methods for primary, secondary and tertiary structure prediction. Fold-recognition results (target-template alignments) are converted into full-atom 3D models and the quality of these models is uniformly assessed. A consensus between different FR methods is also inferred. The results are conveniently presented on-line on a single web page over a secure, password-protected connection. The GeneSilico protein structure prediction meta-server is freely available for academic users at http://genesilico.pl/meta.
GeneSilico protein structure prediction meta-server

PubMed Central

Kurowski, Michal A.; Bujnicki, Janusz M.

2003-01-01

Rigorous assessments of protein structure prediction have demonstrated that fold recognition methods can identify remote similarities between proteins when standard sequence search methods fail. It has been shown that the accuracy of predictions is improved when refined multiple sequence alignments are used instead of single sequences and if different methods are combined to generate a consensus model. There are several meta-servers available that integrate protein structure predictions performed by various methods, but they do not allow for submission of user-defined multiple sequence alignments and they seldom offer confidentiality of the results. We developed a novel WWW gateway for protein structure prediction, which combines the useful features of other meta-servers available, but with much greater flexibility of the input. The user may submit an amino acid sequence or a multiple sequence alignment to a set of methods for primary, secondary and tertiary structure prediction. Fold-recognition results (target-template alignments) are converted into full-atom 3D models and the quality of these models is uniformly assessed. A consensus between different FR methods is also inferred. The results are conveniently presented on-line on a single web page over a secure, password-protected connection. The GeneSilico protein structure prediction meta-server is freely available for academic users at http://genesilico.pl/meta. PMID:12824313
Phenotype–genotype correlation in Hirschsprung disease is illuminated by comparative analysis of the RET protein sequence

PubMed Central

Kashuk, Carl S.; Stone, Eric A.; Grice, Elizabeth A.; Portnoy, Matthew E.; Green, Eric D.; Sidow, Arend; Chakravarti, Aravinda; McCallion, Andrew S.

2005-01-01

The ability to discriminate between deleterious and neutral amino acid substitutions in the genes of patients remains a significant challenge in human genetics. The increasing availability of genomic sequence data from multiple vertebrate species allows inclusion of sequence conservation and physicochemical properties of residues to be used for functional prediction. In this study, the RET receptor tyrosine kinase serves as a model disease gene in which a broad spectrum (≥116) of disease-associated mutations has been identified among patients with Hirschsprung disease and multiple endocrine neoplasia type 2. We report the alignment of the human RET protein sequence with the orthologous sequences of 12 non-human vertebrates (eight mammalian, one avian, and three teleost species), their comparative analysis, the evolutionary topology of the RET protein, and predicted tolerance for all published missense mutations. We show that, although evolutionary conservation alone provides significant information to predict the effect of a RET mutation, a model that combines comparative sequence data with analysis of physicochemical properties in a quantitative framework provides far greater accuracy. Although the ability to discern the impact of a mutation is imperfect, our analyses permit substantial discrimination between predicted functional classes of RET mutations and disease severity even for a multigenic disease such as Hirschsprung disease. PMID:15956201
Phylogenetic Characterization of Transport Protein Superfamilies: Superiority of SuperfamilyTree Programs over Those Based on Multiple Alignments

PubMed Central

Chen, Jonathan S.; Reddy, Vamsee; Chen, Joshua H.; Shlykov, Maksim A.; Zheng, Wei Hao; Cho, Jaehoon; Yen, Ming Ren; Saier, Milton H.

2012-01-01

Transport proteins function in the translocation of ions, solutes and macromolecules across cellular and organellar membranes. These integral membrane proteins fall into >600 families as tabulated in the Transporter Classification Database (www.tcdb.org). Recent studies, some of which are reported here, define distant phylogenetic relationships between families with the creation of superfamilies. Several of these are analyzed using a novel set of programs designed to allow reliable prediction of phylogenetic trees when sequence divergence is too great to allow the use of multiple alignments. These new programs, called SuperfamilyTree1 and 2 (SFT1 and 2), allow display of protein and family relationships, respectively, based on thousands of comparative BLAST scores rather than multiple alignments. Superfamilies analyzed include: (1) Aerolysins, (2) RTX Toxins, (3) Defensins, (4) Ion Transporters, (5) Bile/Arsenite/Riboflavin Transporters, (6) Cation: Proton Antiporters, and (7) the Glucose/Fructose/Lactose superfamily within the prokaryotic phosphoenolpyruvate-dependent Phosphotransferase System. In addition to defining the phylogenetic relationships of the proteins and families within these seven superfamilies, evidence is provided showing that the SFT programs outperform programs that are based on multiple alignments whenever sequence divergence of superfamily members is extensive. The SFT programs should be applicable to virtually any superfamily of proteins or nucleic acids. PMID:22286036
Mouse Vk gene classification by nucleic acid sequence similarity.

PubMed

Strohal, R; Helmberg, A; Kroemer, G; Kofler, R

1989-01-01

Analyses of immunoglobulin (Ig) variable (V) region gene usage in the immune response, estimates of V gene germline complexity, and other nucleic acid hybridization-based studies depend on the extent to which such genes are related (i.e., sequence similarity) and their organization in gene families. While mouse Igh heavy chain V region (VH) gene families are relatively well-established, a corresponding systematic classification of Igk light chain V region (Vk) genes has not been reported. The present analysis, in the course of which we reviewed the known extent of the Vk germline gene repertoire and Vk gene usage in a variety of responses to foreign and self antigens, provides a classification of mouse Vk genes in gene families composed of members with greater than 80% overall nucleic acid sequence similarity. This classification differed in several aspects from that of VH genes: only some Vk gene families were as clearly separated (by greater than 25% sequence dissimilarity) as typical VH gene families; most Vk gene families were closely related and, in several instances, members from different families were very similar (greater than 80%) over large sequence portions; frequently, classification by nucleic acid sequence similarity diverged from existing classifications based on amino-terminal protein sequence similarity. Our data have implications for Vk gene analyses by nucleic acid hybridization and describe potentially important differences in sequence organization between VH and Vk genes.
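Grouping genes into families by a similarity cutoff, as described in this abstract, can be illustrated with a small clustering sketch. The assumptions here are mine rather than the authors': sequences are pre-aligned to equal length, similarity is plain percent identity, and families are formed by single linkage, so two genes share a family if a chain of pairs above the threshold connects them. All names and toy sequences are hypothetical.

```python
from itertools import combinations
from typing import Dict, List, Set


def pairwise_similarity(a: str, b: str) -> float:
    """Percent identical positions between two equal-length aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    same = sum(1 for x, y in zip(a, b) if x == y)
    return 100.0 * same / len(a)


def group_into_families(seqs: Dict[str, str], threshold: float = 80.0) -> List[Set[str]]:
    """Single-linkage grouping: link any pair whose similarity exceeds threshold."""
    parent = {name: name for name in seqs}

    def find(x: str) -> str:
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(seqs, 2):
        if pairwise_similarity(seqs[a], seqs[b]) > threshold:
            parent[find(a)] = find(b)

    families: Dict[str, Set[str]] = {}
    for name in seqs:
        families.setdefault(find(name), set()).add(name)
    return sorted(families.values(), key=lambda members: sorted(members))


if __name__ == "__main__":
    toy = {"v1": "AAAAAAAAAA", "v2": "AAAAAAAAAC", "v3": "CCCCCCCCCC"}
    print(group_into_families(toy))
```

Single linkage can chain together members that are individually below the cutoff, which echoes the abstract's observation that some Vk families are closely related and not cleanly separated.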
Complementary DNA cloning and molecular evolution of opine dehydrogenases in some marine invertebrates.

PubMed

Kimura, Tomohiro; Nakano, Toshiki; Yamaguchi, Toshiyasu; Sato, Minoru; Ogawa, Tomohisa; Muramoto, Koji; Yokoyama, Takehiko; Kan-No, Nobuhiro; Nagahisa, Eizou; Janssen, Frank; Grieshaber, Manfred K

2004-01-01

The complete complementary DNA sequences of genes presumably coding for opine dehydrogenases from Arabella iricolor (sandworm), Haliotis discus hannai (abalone), and Patinopecten yessoensis (scallop) were determined, and partial cDNA sequences were derived for Meretrix lusoria (Japanese hard clam) and Spisula sachalinensis (Sakhalin surf clam). The primers ODH-9F and ODH-11R proved useful for amplifying the sequences for opine dehydrogenases from the 4 mollusk species investigated in this study. The sequence of the sandworm was obtained using primers constructed from the amino acid sequence of tauropine dehydrogenase, the main opine dehydrogenase in A. iricolor. The complete cDNA sequence of A. iricolor, H. discus hannai, and P. yessoensis encode 397, 400, and 405 amino acids, respectively. All sequences were aligned and compared with published databank sequences of Loligo opalescens, Loligo vulgaris (squid), Sepia officinalis (cuttlefish), and Pecten maximus (scallop). As expected, a high level of homology was observed for the cDNA from closely related species, such as for cephalopods or scallops, whereas cDNA from the other species showed lower-level homologies. A similar trend was observed when the deduced amino acid sequences were compared. Furthermore, alignment of these sequences revealed some structural motifs that are possibly related to the binding sites of the substrates. The phylogenetic trees derived from the nucleotide and amino acid sequences were consistent with the classification of species resulting from classical taxonomic analyses.
Evolution of DMY, a newly emergent male sex-determination gene of medaka fish.

PubMed

Zhang, Jianzhi

2004-04-01

The Japanese medaka fish Oryzias latipes has an XX/XY sex-determination system. The Y-linked sex-determination gene DMY is a duplicate of the autosomal gene DMRT1, which encodes a DM-domain-containing transcriptional factor. DMY appears to have originated recently within Oryzias, allowing a detailed evolutionary study of the initial steps that led to the new gene and new sex-determination system. Here I analyze the publicly available DMRT1 and DMY gene sequences of Oryzias species and report the following findings. First, the synonymous substitution rate in DMY is 1.73 times that in DMRT1, consistent with the male-driven evolution hypothesis. Second, the ratio of the rate of nonsynonymous nucleotide substitution (d(N)) to that of synonymous substitution (d(S)) is significantly higher in DMY than in DMRT1. Third, in DMRT1, the d(N)/d(S) ratio for the DM domain is lower than that for non-DM regions, as expected from the functional importance of the DM domain. But in DMY, the opposite is observed and the DM domain is likely under positive Darwinian selection. Fourth, only one characteristic amino acid distinguishes all DMY sequences from all DMRT1 sequences, suggesting that a single amino acid change may be largely responsible for the establishment of DMY as the male sex-determination gene in medaka fish.
Molecular and Mutational Analysis of a Gelsolin-Family Member Encoded by the Flightless I Gene of Drosophila Melanogaster

PubMed Central

de-Couet, H. G.; Fong, KSK.; Weeds, A. G.; McLaughlin, P. J.; Miklos, GLG.

1995-01-01

The flightless locus of Drosophila melanogaster has been analyzed at the genetic, molecular, ultrastructural and comparative crystallographic levels. The gene encodes a single transcript encoding a protein consisting of a leucine-rich amino terminal half and a carboxyterminal half with high sequence similarity to gelsolin. We determined the genomic sequence of the flightless landscape, the breakpoints of four chromosomal rearrangements, and the molecular lesions in two lethal and two viable alleles of the gene. The two alleles that lead to flight muscle abnormalities encode mutant proteins exhibiting amino acid replacements within the S1-like domain of their gelsolin-like region. Furthermore, the deduced intron-exon structure of the D. melanogaster gene has been compared with that of the Caenorhabditis elegans homologue. In addition, the sequence similarities of the flightless protein with gelsolin allow it to be evaluated in the context of the published crystallographic structure of the S1 domain of gelsolin. Amino acids considered essential for the structural integrity of the core are found to be highly conserved in the predicted flightless protein. Some of the residues considered essential for actin and calcium binding in gelsolin S1 and villin V1 are also well conserved. These data are discussed in light of the phenotypic characteristics of the mutants and the putative functions of the protein. PMID:8582612
Epidemiology of transmissible diseases: Array hybridization and next generation sequencing as universal nucleic acid-mediated typing tools.

PubMed

Michael Dunne, W; Pouseele, Hannes; Monecke, Stefan; Ehricht, Ralf; van Belkum, Alex

2017-09-21

The magnitude of interest in the epidemiology of transmissible human diseases is reflected in the vast number of tools and methods developed recently with the expressed purpose to characterize and track evolutionary changes that occur in agents of these diseases over time. Within the past decade a new suite of such tools has become available with the emergence of the so-called "omics" technologies. Among these, two are exponents of the ongoing genomic revolution. Firstly, high-density nucleic acid probe arrays have been proposed and developed using various chemical and physical approaches. Via hybridization-mediated detection of entire genes or genetic polymorphisms in such genes and intergenic regions these so called "DNA chips" have been successfully applied for distinguishing very closely related microbial species and strains. Second and even more phenomenal, next generation sequencing (NGS) has facilitated the assessment of the complete nucleotide sequence of entire microbial genomes. This technology currently provides the most detailed level of bacterial genotyping and hence allows for the resolution of microbial spread and short-term evolution in minute detail. We will here review the very recent history of these two technologies, sketch their usefulness in the elucidation of the spread and epidemiology of mostly hospital-acquired infections and discuss future developments. Copyright © 2017 Elsevier B.V. All rights reserved.
Amino acid and structural variability of Yersinia pestis LcrV protein

DOE Office of Scientific and Technical Information (OSTI.GOV)

Anisimov, A P; Dentovskaya, S V; Panfertsev, E A

2009-11-09

The LcrV protein is a multifunctional virulence factor and protective antigen of the plague bacterium which is generally conserved between the epidemic strains of Yersinia pestis. The authors investigated the diversity in the LcrV sequences among non-epidemic Y. pestis strains which have a limited virulence in selected animal models and for humans. Sequencing of lcrV genes from ten Y. pestis strains belonging to different phylogenetic groups (subspecies) showed that the LcrV proteins possess four major variable hotspots at positions 18, 72, 273, and 324-326. These major variations, together with other minor substitutions in amino acid sequences, allowed them to classify the LcrV alleles into five sequence types (A-E). They observed that the strains of different Y. pestis subspecies can have the same type of LcrV, and different types of LcrV can exist within the same natural plague focus. The LcrV polymorphisms were structurally analyzed by comparing the modeled structures of LcrV from all available strains. All changes except one occurred either in flexible regions or on the surface of the protein, but local chemical properties (i.e. those of a hydrophobic, hydrophilic, amphipathic, or charged nature) were conserved across all of the strains. Polymorphisms in flexible and surface regions are likely subject to less selective pressure, and have a limited impact on the structure. In contrast, the substitution of tryptophan at position 113 with either glutamic acid or glycine likely has a serious influence on the regional structure of the protein, and these mutations might have an effect on the function of LcrV. The polymorphisms at positions 18, 72 and 273 were accountable for differences in oligomerization of LcrV. The importance of the latter property in emergence of epidemic strains of Y. pestis during evolution of this pathogen will need to be further investigated.
Methods for making nucleotide probes for sequencing and synthesis

DOEpatents

Church, George M; Zhang, Kun; Chou, Joseph

2014-07-08

Compositions and methods for making a plurality of probes for analyzing a plurality of nucleic acid samples are provided. Compositions and methods for analyzing a plurality of nucleic acid samples to obtain sequence information in each nucleic acid sample are also provided.
Soil amino acid composition across a boreal forest successional sequence

Treesearch

Nancy R. Werdin-Pfisterer; Knut Kielland; Richard D. Boone

2009-01-01

Soil amino acids are important sources of organic nitrogen for plant nutrition, yet few studies have examined which amino acids are most prevalent in the soil. In this study, we examined the composition, concentration, and seasonal patterns of soil amino acids across a primary successional sequence encompassing a natural gradient of plant productivity and soil...

37 CFR 1.821 - Nucleotide and/or amino acid sequence disclosures in patent applications.

Code of Federal Regulations, 2014 CFR

2014-07-01

...” means those amino acids other than “Xaa” and those nucleotide bases other than “n” defined in accordance... 37 Patents, Trademarks, and Copyrights 1 2014-07-01 2014-07-01 false Nucleotide and/or amino acid... Biotechnology Invention Disclosures Application Disclosures Containing Nucleotide And/or Amino Acid Sequences...
37 CFR 1.821 - Nucleotide and/or amino acid sequence disclosures in patent applications.

Code of Federal Regulations, 2013 CFR

2013-07-01

...” means those amino acids other than “Xaa” and those nucleotide bases other than “n” defined in accordance... 37 Patents, Trademarks, and Copyrights 1 2013-07-01 2013-07-01 false Nucleotide and/or amino acid... Biotechnology Invention Disclosures Application Disclosures Containing Nucleotide And/or Amino Acid Sequences...
37 CFR 1.821 - Nucleotide and/or amino acid sequence disclosures in patent applications.

Code of Federal Regulations, 2012 CFR

2012-07-01

...” means those amino acids other than “Xaa” and those nucleotide bases other than “n” defined in accordance... 37 Patents, Trademarks, and Copyrights 1 2012-07-01 2012-07-01 false Nucleotide and/or amino acid... Biotechnology Invention Disclosures Application Disclosures Containing Nucleotide And/or Amino Acid Sequences...
Amino-terminal sequence of glycoprotein D of herpes simplex virus types 1 and 2

DOE Office of Scientific and Technical Information (OSTI.GOV)

Eisenberg, R.J.; Long, D.; Hogue-Angeletti, R.

1984-01-01

Glycoprotein D (gD) of herpes simplex virus is a structural component of the virion envelope which stimulates production of high titers of herpes simplex virus type-common neutralizing antibody. The authors carried out automated N-terminal amino acid sequencing studies on radiolabeled preparations of gD-1 (gD of herpes simplex virus type 1) and gD-2 (gD of herpes simplex virus type 2). Although some differences were noted, particularly in the methionine and alanine profiles for gD-1 and gD-2, the amino acid sequence of a number of the first 30 residues of the amino terminus of gD-1 and gD-2 appears to be quite similar. For both proteins, the first residue is a lysine. When we compared our sequence data for gD-1 with those predicted by nucleic acid sequencing, the two sequences could be aligned (with one exception) starting at residue 26 (lysine) of the predicted sequence. Thus, the first 25 amino acids of the predicted sequence are absent from the polypeptides isolated from infected cells.
Cloning and sequencing of a gene encoding a novel extracellular neutral proteinase from Streptomyces sp. strain C5 and expression of the gene in Streptomyces lividans 1326.

PubMed Central

Lampel, J S; Aphale, J S; Lampel, K A; Strohl, W R

1992-01-01

The gene encoding a novel milk protein-hydrolyzing proteinase was cloned on a 6.56-kb SstI fragment from Streptomyces sp. strain C5 genomic DNA into Streptomyces lividans 1326 by using the plasmid vector pIJ702. The gene encoding the small neutral proteinase (snpA) was located within a 2.6-kb BamHI-SstI restriction fragment that was partially sequenced. The molecular mass of the deduced amino acid sequence of the mature protein was determined to be 15,740, which corresponds very closely with the relative molecular mass of the purified protein (15,500) determined by sodium dodecyl sulfate-polyacrylamide gel electrophoresis. The N-terminal amino acid sequence of the purified neutral proteinase was determined, and the DNA encoding this sequence was found to be located within the sequenced DNA. The deduced amino acid sequence contains a conserved zinc binding site, although secondary ligand binding and active sites typical of thermolysin-like metalloproteinases are absent. The combination of its small size, deduced amino acid sequence, and substrate and inhibition profile indicate that snpA encodes a novel neutral proteinase. PMID:1569011
Hymenoptera Genome Database: integrating genome annotations in HymenopteraMine.

PubMed

Elsik, Christine G; Tayal, Aditi; Diesh, Colin M; Unni, Deepak R; Emery, Marianne L; Nguyen, Hung N; Hagen, Darren E

2016-01-04

We report an update of the Hymenoptera Genome Database (HGD) (http://HymenopteraGenome.org), a model organism database for insect species of the order Hymenoptera (ants, bees and wasps). HGD maintains genomic data for 9 bee species, 10 ant species and 1 wasp, including the versions of genome and annotation data sets published by the genome sequencing consortiums and those provided by NCBI. A new data-mining warehouse, HymenopteraMine, based on the InterMine data warehousing system, integrates the genome data with data from external sources and facilitates cross-species analyses based on orthology. New genome browsers and annotation tools based on JBrowse/WebApollo provide easy genome navigation, and viewing of high throughput sequence data sets and can be used for collaborative genome annotation. All of the genomes and annotation data sets are combined into a single BLAST server that allows users to select and combine sequence data sets to search. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
Identification of amino acid residues involved in substrate specificity of plant acyl-ACP thioesterases using a bioinformatics-guided approach

PubMed Central

Mayer, Kimberly M; Shanklin, John

2007-01-01

Background The large amount of available sequence information for the plant acyl-ACP thioesterases (TEs) made it possible to use a bioinformatics-guided approach to identify amino acid residues involved in substrate specificity. The Conserved Property Difference Locator (CPDL) program allowed the identification of putative specificity-determining residues that differ between the FatA and FatB TE classes. Six of the FatA residue differences identified by CPDL were incorporated into the FatB-like parent via site-directed mutagenesis and the effect of each on TE activity was determined. Variants were expressed in E. coli strain K27 that allows determination of enzyme activity by GCMS analysis of fatty acids released into the medium. Results Substitutions at four of the positions (74, 86, 141, and 174) changed substrate specificity to varying degrees while changes at the remaining two positions, 110 and 221, essentially inactivated the thioesterase. The effects of substitutions at positions 74, 141, and 174 (3-MUT) or 74, 86, 141, 174 (4-MUT) were not additive with respect to specificity. Conclusion Four of six putative specificity determining positions in plant TEs, identified with the use of CPDL, were validated experimentally; a novel colorimetric screen that discriminates between active and inactive TEs is also presented. PMID:17201914
Biodiversity and technological-functional potential of lactic acid bacteria isolated from spontaneously fermented quinoa sourdoughs.

PubMed

Ruiz Rodríguez, L; Vera Pingitore, E; Rollan, G; Cocconcelli, P S; Fontana, C; Saavedra, L; Vignolo, G; Hebert, E M

2016-05-01

To analyse lactic acid bacteria (LAB) diversity and technological-functional and safety properties of strains present during spontaneously fermented quinoa sourdoughs. Fermentation was performed by daily backslopping at 30°C for 10 days. Autochthonous LAB microbiota was monitored by a biphasic approach combining random amplified polymorphic DNA (RAPD)-PCR and rRNA gene sequencing with PCR-denaturing gradient gel electrophoresis (DGGE) analysis. Identification and intraspecies differentiation allowed isolates to be grouped within nine LAB species belonging to four genera. A succession of LAB species occurred during the 10 days of backslopping; Lactobacillus plantarum and Lactobacillus brevis were detected as dominant species in the consortium. The characterization of 15 representative LAB strains was performed based on the acidifying capacity, starch and protein hydrolysis, γ-aminobutyric acid and exopolysaccharides production, antimicrobial activity and antibiotic resistance. Strain characterization led to the selection of Lact. plantarum CRL1905 and Leuconostoc mesenteroides CRL1907 as candidates to be assayed as functional starter culture for the gluten-free (GF) quinoa fermented products. Results on native LAB microbiota present during quinoa sourdough fermentation will allow the selection of strains with appropriate technological properties to be used as a novel functional starter culture for GF-fermented products. © 2016 The Society for Applied Microbiology.
Chirality- and sequence-selective successive self-sorting via specific homo- and complementary-duplex formations

PubMed Central

Makiguchi, Wataru; Tanabe, Junki; Yamada, Hidekazu; Iida, Hiroki; Taura, Daisuke; Ousaka, Naoki; Yashima, Eiji

2015-01-01

Self-recognition and self-discrimination within complex mixtures are of fundamental importance in biological systems, which entirely rely on the preprogrammed monomer sequences and homochirality of biological macromolecules. Here we report artificial chirality- and sequence-selective successive self-sorting of chiral dimeric strands bearing carboxylic acid or amidine groups joined by chiral amide linkers with different sequences through homo- and complementary-duplex formations. A mixture of carboxylic acid dimers linked by racemic-1,2-cyclohexane bis-amides with different amide sequences (NHCO or CONH) self-associate to form homoduplexes in a completely sequence-selective way, the structures of which are different from each other depending on the linker amide sequences. The further addition of an enantiopure amide-linked amidine dimer to a mixture of the racemic carboxylic acid dimers resulted in the formation of a single optically pure complementary duplex with a 100% diastereoselectivity and complete sequence specificity stabilized by the amidinium–carboxylate salt bridges, leading to the perfect chirality- and sequence-selective duplex formation. PMID:26051291
ANCAC: amino acid, nucleotide, and codon analysis of COGs--a tool for sequence bias analysis in microbial orthologs.

PubMed

Meiler, Arno; Klinger, Claudia; Kaufmann, Michael

2012-09-08

The COG database is the most popular collection of orthologous proteins from many different completely sequenced microbial genomes. Per definition, a cluster of orthologous groups (COG) within this database exclusively contains proteins that most likely achieve the same cellular function. Recently, the COG database was extended by assigning to every protein both the corresponding amino acid and its encoding nucleotide sequence resulting in the NUCOCOG database. This extended version of the COG database is a valuable resource connecting sequence features with the functionality of the respective proteins. Here we present ANCAC, a web tool and MySQL database for the analysis of amino acid, nucleotide, and codon frequencies in COGs on the basis of freely definable phylogenetic patterns. We demonstrate the usefulness of ANCAC by analyzing amino acid frequencies, codon usage, and GC-content in a species- or function-specific context. With respect to amino acids we, at least in part, confirm the cognate bias hypothesis by using ANCAC's NUCOCOG dataset as the largest one available for that purpose thus far. Using the NUCOCOG datasets, ANCAC connects taxonomic, amino acid, and nucleotide sequence information with the functional classification via COGs and provides a GUI for flexible mining for sequence-bias. Thereby, to our knowledge, it is the only tool for the analysis of sequence composition in the light of physiological roles and phylogenetic context without requirement of substantial programming-skills.
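The kind of sequence-composition statistics the abstract mentions (codon frequencies and GC-content) can be illustrated with a minimal Python sketch. This is not ANCAC's code or interface; the function names and the toy coding sequence are assumptions made purely for illustration.

```python
from collections import Counter


def gc_content(seq: str) -> float:
    """Fraction of G and C bases in a nucleotide sequence (0.0 if empty)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def codon_counts(cds: str) -> Counter:
    """Count in-frame codons; a trailing partial codon is ignored."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    return Counter(cds[i:i + 3] for i in range(0, usable, 3))


def codon_frequencies(cds: str) -> dict:
    """Relative frequency of each codon observed in the coding sequence."""
    counts = codon_counts(cds)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {codon: n / total for codon, n in counts.items()}


# Hypothetical toy coding sequence (not taken from the COG database).
example_cds = "ATGGCCGCCAAATAA"
example_freqs = codon_frequencies(example_cds)
example_gc = gc_content(example_cds)
```

Applying such functions per COG and per group of genomes would give the species- or function-specific view the abstract describes; how ANCAC itself aggregates the values is not stated in the record.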
ANCAC: amino acid, nucleotide, and codon analysis of COGs – a tool for sequence bias analysis in microbial orthologs

PubMed Central

2012-01-01

Background The COG database is the most popular collection of orthologous proteins from many different completely sequenced microbial genomes. Per definition, a cluster of orthologous groups (COG) within this database exclusively contains proteins that most likely achieve the same cellular function. Recently, the COG database was extended by assigning to every protein both the corresponding amino acid and its encoding nucleotide sequence resulting in the NUCOCOG database. This extended version of the COG database is a valuable resource connecting sequence features with the functionality of the respective proteins. Results Here we present ANCAC, a web tool and MySQL database for the analysis of amino acid, nucleotide, and codon frequencies in COGs on the basis of freely definable phylogenetic patterns. We demonstrate the usefulness of ANCAC by analyzing amino acid frequencies, codon usage, and GC-content in a species- or function-specific context. With respect to amino acids we, at least in part, confirm the cognate bias hypothesis by using ANCAC’s NUCOCOG dataset as the largest one available for that purpose thus far. Conclusions Using the NUCOCOG datasets, ANCAC connects taxonomic, amino acid, and nucleotide sequence information with the functional classification via COGs and provides a GUI for flexible mining for sequence-bias. Thereby, to our knowledge, it is the only tool for the analysis of sequence composition in the light of physiological roles and phylogenetic context without requirement of substantial programming-skills. PMID:22958836
RNAblueprint: flexible multiple target nucleic acid sequence design.

PubMed

Hammer, Stefan; Tschiatschek, Birgit; Flamm, Christoph; Hofacker, Ivo L; Findeiß, Sven

2017-09-15

Realizing the value of synthetic biology in biotechnology and medicine requires the design of molecules with specialized functions. Due to its close structure to function relationship, and the availability of good structure prediction methods and energy models, RNA is perfectly suited to be synthetically engineered with predefined properties. However, currently available RNA design tools cannot be easily adapted to accommodate new design specifications. Furthermore, complicated sampling and optimization methods are often developed to suit a specific RNA design goal, adding to their inflexibility. We developed a C++ library implementing a graph coloring approach to stochastically sample sequences compatible with structural and sequence constraints from the typically very large solution space. The approach allows the solution space to be specified and explored in a well defined way. Our library also guarantees uniform sampling, which makes optimization runs performant by not only avoiding re-evaluation of already found solutions, but also by raising the probability of finding better solutions for long optimization runs. We show that our software can be combined with any other software package to allow diverse RNA design applications. Scripting interfaces allow the easy adaption of existing code to accommodate new scenarios, making the whole design process very flexible. We implemented example design approaches written in Python to demonstrate these advantages. RNAblueprint, Python implementations and benchmark datasets are available on GitHub: https://github.com/ViennaRNA. Contact: s.hammer@univie.ac.at, ivo@tbi.univie.ac.at or sven@tbi.univie.ac.at. Supplementary data are available at Bioinformatics online. © The Author(s) 2017. Published by Oxford University Press.
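To make the idea of sampling sequences compatible with a structural constraint concrete, here is a deliberately simplified Python sketch for a single target structure in dot-bracket notation. It is not the RNAblueprint library or its API, and it does not implement the graph coloring approach needed for multiple targets. With only one structure, each base pair and each unpaired position can be drawn independently, which is enough to sample uniformly among compatible sequences. All names are hypothetical.

```python
import random

# Canonical and wobble pairs allowed between paired positions.
PAIRS = ["AU", "UA", "GC", "CG", "GU", "UG"]
BASES = "ACGU"


def pair_table(structure: str) -> dict:
    """Map each opening position to its closing partner in a dot-bracket string."""
    stack, pairs = [], {}
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced structure")
            pairs[stack.pop()] = i
        elif ch != ".":
            raise ValueError(f"unexpected character {ch!r}")
    if stack:
        raise ValueError("unbalanced structure")
    return pairs


def sample_compatible(structure: str, rng: random.Random) -> str:
    """Draw one sequence whose paired positions can all form allowed pairs."""
    seq = [rng.choice(BASES) for _ in structure]   # unpaired positions are free
    for i, j in pair_table(structure).items():
        seq[i], seq[j] = rng.choice(PAIRS)         # overwrite both pair partners
    return "".join(seq)


sampled = sample_compatible("((..))", random.Random(0))
```

When several target structures are imposed at once, a position may have different partners in different structures, so the choices are no longer independent; that coupling is the problem the abstract says the library's graph coloring approach addresses.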
The primary structure of the thymidine kinase gene of fish lymphocystis disease virus.

PubMed

Schnitzler, P; Handermann, M; Szépe, O; Darai, G

1991-06-01

The DNA nucleotide sequence of the thymidine kinase (TK) gene of fish lymphocystis disease virus (FLDV) which has been localized between the coordinates 0.678 to 0.688 of the viral genome was determined. The analysis of the DNA nucleotide sequence located between the recognition sites of HindIII (0.669 map unit; nucleotide position 1) and AccI (nucleotide position 2032) revealed the presence of an open reading frame of 954 bp on the lower strand of this region between nucleotide positions 1868 (ATG) and 915 (TAA). It encodes a protein of 318 amino acid residues. The evolutionary relationships of the TK gene of FLDV to the other known TK genes were investigated using the method of progressive sequence alignment. These analyses revealed a high degree of diversity between the protein sequence of FLDV TK gene and the amino acid composition of other TKs tested. However, significant conservations were detected at several regions of amino acid residues of the FLDV TK protein when compared to the amino acid sequence of TKs of African swine fever virus, fowlpox virus, shope fibroma virus, and vaccinia virus and to the amino acid sequences of the cellular cytoplasmic TK of chicken, mouse, and man.
Proteogenomic Analysis of Polymorphisms and Gene Annotation Divergences in Prokaryotes using a Clustered Mass Spectrometry-Friendly Database

PubMed Central

de Souza, Gustavo A.; Arntzen, Magnus Ø.; Fortuin, Suereta; Schürch, Anita C.; Målen, Hiwa; McEvoy, Christopher R. E.; van Soolingen, Dick; Thiede, Bernd; Warren, Robin M.; Wiker, Harald G.

2011-01-01

Precise annotation of genes or open reading frames is still a difficult task that results in divergence even for data generated from the same genomic sequence. This has an impact on further proteomic studies, and also compromises the characterization of clinical isolates with many specific genetic variations that may not be represented in the selected database. We recently developed software called multistrain mass spectrometry prokaryotic database builder (MSMSpdbb) that can merge protein databases from several sources and be applied on any prokaryotic organism, in a proteomic-friendly approach. We generated a database for the Mycobacterium tuberculosis complex (using three strains of Mycobacterium bovis and five of M. tuberculosis), and analyzed data collected from two laboratory strains and two clinical isolates of M. tuberculosis. We identified 2561 proteins, of which 24 were present in M. tuberculosis H37Rv samples, but not annotated in the M. tuberculosis H37Rv genome. We were also able to identify 280 nonsynonymous single amino acid polymorphisms and confirm 367 translational start sites. As a proof of concept, we applied the database to whole-genome DNA sequencing data of one of the clinical isolates, which allowed the validation of 116 predicted single amino acid polymorphisms and the annotation of 131 N-terminal start sites. Moreover, we identified regions not present in the original M. tuberculosis H37Rv sequence, indicating strain divergence or errors in the reference sequence. In conclusion, we demonstrated the potential of using a merged database to better characterize laboratory or clinical bacterial strains. PMID:21030493
Linking Microbial Community Structure, Activity and Carbon Cycling in Biological Soil Crust

NASA Astrophysics Data System (ADS)

Swenson, T.; Karaoz, U.; Swenson, J.; Bowen, B.; Northen, T.

2016-12-01

Soils play a key role in the global carbon cycle, but the relationships between soil microbial communities and metabolic pathways are poorly understood. In this study, biological soil crusts (biocrusts) from the Colorado Plateau are being used to develop soil metabolomics methods and statistical models to link active microbes to the abundance and turnover of soil metabolites and to examine the detailed substrate and product profiles of individual soil bacteria isolated from biocrust. To simulate a pulsed activity (wetting) event and to analyze the subsequent correlations between soil metabolite dynamics, community structure and activity, biocrusts were wetted with water and samples (porewater and DNA) were taken at various timepoints up to 49.5 hours post-wetup. DNA samples were sequenced using the HiSeq sequencing platform and porewater metabolites were analyzed using untargeted liquid chromatography/mass spectrometry. Exometabolite analysis revealed the release of a breadth of metabolites including sugars, amino acids, fatty acids, dicarboxylic acids, nucleobases and osmolytes. In general, many metabolites (e.g. amino acids and nucleobases) immediately increased in abundance following wetup and then steadily decreased. However, a few continued to increase over time (e.g. xanthine). Interestingly, in a previous study exploring utilization of soil metabolites by sympatric bacterial isolates from biocrust, we observed xanthine to be released by some Bacilli sp. Furthermore, our current metagenomics data show that members of the Paenibacillaceae family increase in abundance in late wetup samples. Previous 16S amplicon data also show a "Firmicutes bloom" following wetup, with the new metagenomic data resolving this at genome level. Our continued metagenome and exometabolome analyses are allowing us to examine complex pulsed-activity events in biocrust microbial communities specifically by correlating the abundance of microbes to the release of soil metabolites. Ultimately, these approaches will provide an important complement to sequencing efforts linking soil microbes and soil metabolites to enable genomic sciences approaches for understanding and modeling soil carbon cycling.
Molecular cloning and expression of the hyu genes from Microbacterium liquefaciens AJ 3912, responsible for the conversion of 5-substituted hydantoins to alpha-amino acids, in Escherichia coli.

PubMed

Suzuki, Shun'ichi; Takenaka, Yasuhiro; Onishi, Norimasa; Yokozeki, Kenzo

2005-08-01

A DNA fragment from Microbacterium liquefaciens AJ 3912, containing the genes responsible for the conversion of 5-substituted-hydantoins to alpha-amino acids, was cloned in Escherichia coli and sequenced. Seven open reading frames (hyuP, hyuA, hyuH, hyuC, ORF1, ORF2, and ORF3) were identified on the 7.5 kb fragment. The deduced amino acid sequence encoded by the hyuA gene included the N-terminal amino acid sequence of the hydantoin racemase from M. liquefaciens AJ 3912. The hyuA, hyuH, and hyuC genes were heterologously expressed in E. coli; their presence corresponded with the detection of hydantoin racemase, hydantoinase, and N-carbamoyl alpha-amino acid amido hydrolase enzymatic activities respectively. The deduced amino acid sequences of hyuP were similar to those of the allantoin (5-ureido-hydantoin) permease from Saccharomyces cerevisiae, suggesting that hyuP protein might function as a hydantoin transporter.
Large-Scale Concatenation cDNA Sequencing

PubMed Central

Yu, Wei; Andersson, Björn; Worley, Kim C.; Muzny, Donna M.; Ding, Yan; Liu, Wen; Ricafrente, Jennifer Y.; Wentland, Meredith A.; Lennon, Greg; Gibbs, Richard A.

1997-01-01

A total of 100 kb of DNA derived from 69 individual human brain cDNA clones of 0.7–2.0 kb were sequenced by concatenated cDNA sequencing (CCS), whereby multiple individual DNA fragments are sequenced simultaneously in a single shotgun library. The method yielded accurate sequences and a similar efficiency compared with other shotgun libraries constructed from single DNA fragments (>20 kb). Computer analyses were carried out on 65 cDNA clone sequences and their corresponding end sequences to examine both nucleic acid and amino acid sequence similarities in the databases. Thirty-seven clones revealed no DNA database matches, 12 clones generated exact matches (≥98% identity), and 16 clones generated nonexact matches (57%–97% identity) to either known human or other species genes. Of those 28 matched clones, 8 had corresponding end sequences that failed to identify similarities. In a protein similarity search, 27 clone sequences displayed significant matches, whereas only 20 of the end sequences had matches to known protein sequences. Our data indicate that full-length cDNA insert sequences provide significantly more nucleic acid and protein sequence similarity matches than expressed sequence tags (ESTs) for database searching. [All 65 cDNA clone sequences described in this paper have been submitted to the GenBank data library under accession nos. U79240–U79304.] PMID:9110174
Human Retroviruses and AIDS. A compilation and analysis of nucleic acid and amino acid sequences: I--II; III--V

DOE Office of Scientific and Technical Information (OSTI.GOV)

Myers, G.; Korber, B.; Wain-Hobson, S.

1993-12-31

This compendium and the accompanying floppy diskettes are the result of an effort to compile and rapidly publish all relevant molecular data concerning the human immunodeficiency viruses (HIV) and related retroviruses. The scope of the compendium and database is best summarized by the five parts that it comprises: (I) HIV and SIV Nucleotide Sequences; (II) Amino Acid Sequences; (III) Analyses; (IV) Related Sequences; and (V) Database Communications. Information within all the parts is updated at least twice in each year, which accounts for the modes of binding and pagination in the compendium.
Complete cDNA sequence and amino acid analysis of a bovine ribonuclease K6 gene.

PubMed

Pietrowski, D; Förster, M

2000-01-01

The complete cDNA sequence of a ribonuclease k6 gene of Bos taurus has been determined. It codes for a protein with 154 amino acids and contains the invariant cysteine, histidine and lysine residues as well as the characteristic motifs specific to ribonuclease active sites. The deduced protein sequence is 27 residues longer than other known ribonucleases k6 and shows amino acid exchanges which could reflect a strain specificity or polymorphism within the bovine genome. Based on sequence similarity we have termed the identified gene bovine ribonuclease k6 b (brk6b).
Cloning of an avilamycin biosynthetic gene cluster from Streptomyces viridochromogenes Tü57.

PubMed Central

Gaisser, S; Trefzer, A; Stockert, S; Kirschning, A; Bechthold, A

1997-01-01

A 65-kb region of DNA from Streptomyces viridochromogenes Tü57, containing genes encoding proteins involved in the biosynthesis of avilamycins, was isolated. The DNA sequence of a 6.4-kb fragment from this region revealed four open reading frames (ORF1 to ORF4), three of which are fully contained within the sequenced fragment. The deduced amino acid sequence of AviM, encoded by ORF2, shows 37% identity to a 6-methylsalicylic acid synthase from Penicillium patulum. Cultures of S. lividans TK24 and S. coelicolor CH999 containing plasmids with ORF2 on a 5.5-kb PstI fragment were able to produce orsellinic acid, an unreduced version of 6-methylsalicylic acid. The amino acid sequence encoded by ORF3 (AviD) is 62% identical to that of StrD, a dTDP-glucose synthase from S. griseus. The deduced amino acid sequence of AviE, encoded by ORF4, shows 55% identity to a dTDP-glucose dehydratase (StrE) from S. griseus. Gene insertional inactivation experiments of aviE abolished avilamycin production, indicating the involvement of aviE in the biosynthesis of avilamycins. PMID:9335272

Generation of sequence signatures from DNA amplification fingerprints with mini-hairpin and microsatellite primers.

PubMed

Caetano-Anollés, G; Gresshoff, P M

1996-06-01

DNA amplification fingerprinting (DAF) with mini-hairpins harboring arbitrary "core" sequences at their 3' termini was used to fingerprint a variety of templates, including PCR products and whole genomes, to establish genetic relationships between plant taxa at the interspecific and intraspecific level, and to identify closely related fungal isolates and plant accessions. No correlation was observed between the sequence of the arbitrary core, the stability of the mini-hairpin structure and DAF efficiency. Mini-hairpin primers with short arbitrary cores and primers complementary to simple sequence repeats present in microsatellites were also used to generate arbitrary signatures from amplification profiles (ASAP). The ASAP strategy is a dual-step amplification procedure that uses at least one primer in each fingerprinting stage. ASAP was able to reproducibly amplify DAF products (representing about 10-15 kb of sequence) following careful optimization of amplification parameters such as primer and template concentration. Avoidance of primer sequences partially complementary to DAF product termini was necessary in order to produce distinct fingerprints. This allowed the combinatorial use of oligomers in nucleic acid screening, with numerous ASAP fingerprinting reactions based on a limited number of primer sequences. Mini-hairpin primers and ASAP analysis significantly increased detection of polymorphic DNA, separating closely related bermudagrass (Cynodon) cultivars and detecting putatively linked markers in bulked segregant analysis of the soybean (Glycine max) supernodulation (nitrate-tolerant symbiosis) locus.
ScaffoldSeq: Software for characterization of directed evolution populations.

PubMed

Woldring, Daniel R; Holec, Patrick V; Hackel, Benjamin J

2016-07-01

ScaffoldSeq is software designed for the numerous applications, including directed evolution analysis, in which a user generates a population of DNA sequences encoding partially diverse proteins with related functions and would like to characterize the single site and pairwise amino acid frequencies across the population. A common scenario for enzyme maturation, antibody screening, and alternative scaffold engineering involves naïve and evolved populations that contain diversified regions, varying in both sequence and length, within a conserved framework. Analyzing the diversified regions of such populations is facilitated by high-throughput sequencing platforms; however, length variability within these regions (e.g., antibody CDRs) encumbers the alignment process. To overcome this challenge, the ScaffoldSeq algorithm takes advantage of conserved framework sequences to quickly identify diverse regions. Beyond this, unintended biases in sequence frequency are generated throughout the experimental workflow required to evolve and isolate clones of interest prior to DNA sequencing. ScaffoldSeq software uniquely handles this issue by providing tools to quantify and remove background sequences, cluster similar protein families, and dampen the impact of dominant clones. The software produces graphical and tabular summaries for each region of interest, allowing users to evaluate diversity in a site-specific manner as well as identify epistatic pairwise interactions. The code and detailed information are freely available at http://research.cems.umn.edu/hackel. Proteins 2016; 84:869-874. © 2016 Wiley Periodicals, Inc.
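A rough Python sketch of site-wise amino acid frequencies with dominant clones dampened may help picture the analysis. The record does not say how ScaffoldSeq performs the dampening, so the power-law weighting used here (each clone contributes its read count raised to a `damping` exponent) is an assumption, as are all names. The sketch also assumes the diversified region has already been extracted at a fixed length.

```python
from collections import defaultdict


def sitewise_frequencies(clone_counts: dict, damping: float = 0.5) -> list:
    """Per-position amino acid frequencies from {sequence: read_count}.

    Each clone is weighted by read_count ** damping (an assumed scheme):
    damping=1.0 uses raw counts, damping=0.0 counts every unique clone once.
    """
    lengths = {len(seq) for seq in clone_counts}
    if len(lengths) != 1:
        raise ValueError("all sequences must share one length")
    length = lengths.pop()

    totals = [defaultdict(float) for _ in range(length)]
    for seq, count in clone_counts.items():
        weight = count ** damping
        for pos, residue in enumerate(seq):
            totals[pos][residue] += weight

    frequencies = []
    for site in totals:
        norm = sum(site.values())
        frequencies.append({aa: w / norm for aa, w in site.items()})
    return frequencies


# Hypothetical population: one dominant clone and one rare clone.
population = {"AC": 100, "GC": 1}
damped = sitewise_frequencies(population, damping=0.5)
raw = sitewise_frequencies(population, damping=1.0)
```

With the toy population above, the rare clone's residue at the first position carries more weight in `damped` than in `raw`, which is the qualitative effect the abstract describes.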
How Many Protein Sequences Fold to a Given Structure? A Coevolutionary Analysis.

PubMed

Tian, Pengfei; Best, Robert B

2017-10-17

Quantifying the relationship between protein sequence and structure is key to understanding the protein universe. A fundamental measure of this relationship is the total number of amino acid sequences that can fold to a target protein structure, known as the "sequence capacity," which has been suggested as a proxy for how designable a given protein fold is. Although sequence capacity has been extensively studied using lattice models and theory, numerical estimates for real protein structures are currently lacking. In this work, we have quantitatively estimated the sequence capacity of 10 proteins with a variety of different structures using a statistical model based on residue-residue co-evolution to capture the variation of sequences from the same protein family. Remarkably, we find that even for the smallest protein folds, such as the WW domain, the number of foldable sequences is extremely large, exceeding the Avogadro constant. In agreement with earlier theoretical work, the calculated sequence capacity is positively correlated with the size of the protein, or better, the density of contacts. This allows the absolute sequence capacity of a given protein to be approximately predicted from its structure. On the other hand, the relative sequence capacity, i.e., normalized by the total number of possible sequences, is an extremely tiny number and is strongly anti-correlated with the protein length. Thus, although there may be more foldable sequences for larger proteins, it will be much harder to find them. Lastly, we have correlated the evolutionary age of proteins in the CATH database with their sequence capacity as predicted by our model. The results suggest a trade-off between the opposing requirements of high designability and the likelihood of a novel fold emerging by chance. Published by Elsevier Inc.
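The distinction between absolute and relative sequence capacity is easier to follow with numbers, so the following Python sketch works in log10 units. It is only an arithmetic illustration: the 35-residue length used for a WW-like domain is an assumed round figure, and the Avogadro constant stands in as the lower bound the abstract mentions, not as a value reported for any protein.

```python
import math

AVOGADRO = 6.02214076e23  # exact SI value, per mole


def log10_total_sequences(length: int, alphabet_size: int = 20) -> float:
    """log10 of the number of possible sequences of a given length."""
    return length * math.log10(alphabet_size)


def log10_relative_capacity(log10_capacity: float, length: int) -> float:
    """log10 of (foldable sequences / all possible sequences)."""
    return log10_capacity - log10_total_sequences(length)


# Assumed length for a small WW-like domain (illustrative only).
ww_length = 35
ww_total = log10_total_sequences(ww_length)
# Relative capacity if the absolute capacity were exactly Avogadro's number.
ww_relative = log10_relative_capacity(math.log10(AVOGADRO), ww_length)
```

Even when the absolute capacity is astronomically large, dividing by 20 to the power of the length leaves a very small fraction, and the denominator grows by a factor of 20 with every added residue, consistent with the anti-correlation with length described in the abstract.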
A "push and slide" mechanism allows sequence-insensitive translocation of secretory proteins by the SecA ATPase.

PubMed

Bauer, Benedikt W; Shemesh, Tom; Chen, Yu; Rapoport, Tom A

2014-06-05

In bacteria, most secretory proteins are translocated across the plasma membrane by the interplay of the SecA ATPase and the SecY channel. How SecA moves a broad range of polypeptide substrates is only poorly understood. Here we show that SecA moves polypeptides through the SecY channel by a "push and slide" mechanism. In its ATP-bound state, SecA interacts through a two-helix finger with a subset of amino acids in a substrate, pushing them into the channel. A polypeptide can also passively slide back and forth when SecA is in the predominant ADP-bound state or when SecA encounters a poorly interacting amino acid in its ATP-bound state. SecA performs multiple rounds of ATP hydrolysis before dissociating from SecY. The proposed push and slide mechanism is supported by a mathematical model and explains how SecA allows translocation of a wide range of polypeptides. This mechanism may also apply to hexameric polypeptide-translocating ATPases. Copyright © 2014 Elsevier Inc. All rights reserved.
Fragger: a protein fragment picker for structural queries.

PubMed

Berenger, Francois; Simoncini, David; Voet, Arnout; Shrestha, Rojan; Zhang, Kam Y J

2017-01-01

Protein modeling and design activities often require querying the Protein Data Bank (PDB) with a structural fragment, possibly containing gaps. For some applications, it is preferable to work on a specific subset of the PDB or with unpublished structures. These requirements, along with specific user needs, motivated the creation of a new software to manage and query 3D protein fragments. Fragger is a protein fragment picker that allows protein fragment databases to be created and queried. All fragment lengths are supported and any set of PDB files can be used to create a database. Fragger can efficiently search a fragment database with a query fragment and a distance threshold. Matching fragments are ranked by distance to the query. The query fragment can have structural gaps and the allowed amino acid sequences matching a query can be constrained via a regular expression of one-letter amino acid codes. Fragger also incorporates a tool to compute the backbone RMSD of one versus many fragments in high throughput. Fragger should be useful for protein design, loop grafting and related structural bioinformatics tasks.
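Constraining matched fragments with a regular expression over one-letter amino acid codes can be sketched with Python's standard `re` module. This is not Fragger's command-line interface or code; the pattern, the candidate sequences and the function name are hypothetical.

```python
import re

# The 20 standard amino acids in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def matches_constraint(fragment_seq: str, pattern: str) -> bool:
    """True if the entire fragment sequence satisfies the regex constraint."""
    return re.fullmatch(pattern, fragment_seq) is not None


# Hypothetical constraint: glycine, any two residues, then proline or alanine.
pattern = f"G[{AMINO_ACIDS}]{{2}}[PA]"

# Hypothetical sequences of fragments that passed the structural search.
candidates = ["GKLP", "GSTA", "AKLP", "GKLPA"]
allowed = [seq for seq in candidates if matches_constraint(seq, pattern)]
```

Using `re.fullmatch` rather than `re.search` means the pattern must describe the whole fragment, so a longer fragment that merely contains the motif is rejected; whether Fragger anchors its patterns this way is not stated in the record.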
Escherichia coli tatC mutations that suppress defective twin-arginine transporter signal peptides.

PubMed

Strauch, Eva-Maria; Georgiou, George

2007-11-23

In vitro studies have suggested that the TatBC complex serves as the receptor for signal peptides targeted for export via the twin-arginine translocation (Tat) pathway. Substitution of the hallmark twin-arginine dipeptide with two lysines abrogates export of physiological substrates in all organisms. We report the isolation and characterization of suppressor mutations that allow export of an ssTor(KK)-GFP-SsrA tripartite fusion. We identified two amino acid suppressor mutations in the first cytoplasmic loop of TatC. In addition, two other amino acids in the first cytoplasmic loop exhibit epistatic suppression. Surprisingly, we also identified a suppressor mutation predicted to lie within the second periplasmic loop of TatC, a region that is not expected to interact directly with the signal peptide. The suppressor mutations allowed export of the native Escherichia coli Tat substrate trimethylamine N-oxide reductase with a twin-lysine substitution in its signal sequence. The cytoplasmic suppressor mutations conferred SDS sensitivity and partial filamentation, indicating that Tat export of authentic substrates was impaired.
Identification of Bacteriophage N4 Virion RNA Polymerase-Nucleic Acid Interactions in Transcription Complexes*

PubMed Central

Davydova, Elena K.; Kaganman, Irene; Kazmierczak, Krystyna M.; Rothman-Denes, Lucia B.

2009-01-01

Bacteriophage N4 mini-virion RNA polymerase (mini-vRNAP), the 1106-amino acid transcriptionally active domain of vRNAP, recognizes single-stranded DNA template-containing promoters composed of conserved sequences and a 3-base loop–5-base pair stem hairpin structure. The major promoter recognition determinants are a purine located at the center of the hairpin loop (–11G) and a base at the hairpin stem (–8G). Mini-vRNAP is an evolutionarily highly diverged member of the T7 family of RNAPs. A two-plasmid system was developed to measure the in vivo activity of mutant mini-vRNAP enzymes. Five mini-vRNAP derivatives, each containing a pair of cysteine residues separated by ∼100 amino acids and single cysteine-containing enzymes, were generated. These reagents were used to determine the smallest catalytically active polypeptide and to map promoter, substrate, and RNA-DNA hybrid contact sites to single amino acid residues in the enzyme by using end-labeled 5-iododeoxyuridine- and azidophenacyl-substituted oligonucleotides, cross-linkable derivatives of the initiating nucleotide, and RNA products with 5-iodouridine incorporated at specific positions. Localization of functionally important amino acid residues in the recently determined crystal structures of apomini-vRNAP and the mini-vRNAP-promoter complex and comparison with the crystal structures of the T7 RNAP initiation and elongation complexes allowed us to predict major rearrangements in mini-vRNAP in the transition from transcription initiation to elongation similar to those observed in T7 RNAP, a task otherwise precluded by the lack of sequence homology between N4 mini-vRNAP and T7 RNAP. PMID:19015264
Deletion mutants of Harvey ras p21 protein reveal the absolute requirement of at least two distant regions for GTP-binding and transforming activities.

PubMed Central

Lacal, J C; Anderson, P S; Aaronson, S A

1986-01-01

Deletions of small sequences from the viral Harvey ras gene have been generated, and resulting ras p21 mutants have been expressed in Escherichia coli. Purification of each deleted protein allowed the in vitro characterization of GTP-binding, GTPase and autokinase activity of the proteins. Microinjection of the highly purified proteins into quiescent NIH/3T3 cells, as well as transfection experiments utilizing a long terminal repeat (LTR)-containing vector, were utilized to analyze the biological activity of the deleted proteins. Two small regions located at 6-23 and 152-165 residues are shown to be absolutely required for in vitro and in vivo activities of the ras product. By contrast, the variable region comprising amino acids 165-184 was shown not to be necessary for either in vitro or in vivo activities. Thus, we demonstrate that: (i) amino acid sequences at positions 5-23 and 152-165 of ras p21 protein are probably directly involved in the GTP-binding activity; (ii) GTP-binding is required for the transforming activity of ras p21 and by extension for the normal function of the proto-oncogene product; and (iii) the variable region at the C-terminal end of the ras p21 molecule from amino acids 165 to 184 is not required for transformation. PMID:3011420
Accurate prediction of hot spot residues through physicochemical characteristics of amino acid sequences.

PubMed

Chen, Peng; Li, Jinyan; Wong, Limsoon; Kuwahara, Hiroyuki; Huang, Jianhua Z; Gao, Xin

2013-08-01

Hot spot residues of proteins are fundamental interface residues that help proteins perform their functions. Detecting hot spots by experimental methods is costly and time-consuming. Sequential and structural information has been widely used in the computational prediction of hot spots. However, structural information is not always available. In this article, we investigated the problem of identifying hot spots using only physicochemical characteristics extracted from amino acid sequences. We first extracted 132 relatively independent physicochemical features from a set of the 544 properties in AAindex1, an amino acid index database. Each feature was utilized to train a classification model with a novel encoding schema for hot spot prediction by the IBk algorithm, an extension of the K-nearest neighbor algorithm. The combinations of the individual classifiers were explored and the classifiers that appeared frequently in the top performing combinations were selected. The hot spot predictor was built as an ensemble of these classifiers that works in a voting manner. Experimental results demonstrated that our method effectively exploited the feature space and allowed flexible weights of features for different queries. On the commonly used hot spot benchmark sets, our method significantly outperformed other machine learning algorithms and state-of-the-art hot spot predictors. The program is available at http://sfb.kaust.edu.sa/pages/software.aspx. Copyright © 2013 Wiley Periodicals, Inc.
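The voting scheme in this abstract (one nearest-neighbor classifier per feature, with the final call made by majority vote) can be illustrated with a small sketch. It is not the authors' implementation: the encoding schema, the feature selection and the IBk settings are not given here, so the functions, the value of `k` and the toy data below are all assumptions.

```python
from collections import Counter


def knn_predict(train_x, train_y, x, k=3):
    """Classify scalar `x` by the majority label among its k nearest training values."""
    nearest = sorted(zip(train_x, train_y), key=lambda p: abs(p[0] - x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def ensemble_predict(feature_columns, labels, query, k=3):
    """Train-free ensemble: one k-NN vote per feature, then a majority vote."""
    votes = [knn_predict(col, labels, q, k)
             for col, q in zip(feature_columns, query)]
    return Counter(votes).most_common(1)[0][0]


# Toy data: six residues, three made-up features, 1 = hot spot, 0 = not.
labels = [1, 1, 1, 0, 0, 0]
features = [
    [0.9, 0.8, 0.7, 0.2, 0.1, 0.3],  # informative feature
    [5.0, 4.5, 4.8, 1.0, 1.5, 0.8],  # informative feature
    [0.1, 0.9, 0.2, 0.8, 0.3, 0.7],  # uninformative feature
]

prediction = ensemble_predict(features, labels, query=[0.75, 4.6, 0.75])
print(prediction)
```

With an odd number of binary voters and an odd `k`, neither level of voting can tie, which keeps the sketch deterministic; a real ensemble would need an explicit tie-breaking rule.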
Evaluation of commercial soy sauce koji strains of Aspergillus oryzae for γ-aminobutyric acid (GABA) production.

PubMed

Ab Kadir, Safuan; Wan-Mohtar, Wan Abd Al Qadr Imad; Mohammad, Rosfarizan; Abdul Halim Lim, Sarina; Sabo Mohammed, Abdulkarim; Saari, Nazamid

2016-10-01

In this study, four selected commercial strains of Aspergillus oryzae were collected from soy sauce koji. These A. oryzae strains designated as NSK, NSZ, NSJ and NST shared similar morphological characteristics with the reference strain (A. oryzae FRR 1675) which confirmed them as A. oryzae species. They were further evaluated for their ability to produce γ-aminobutyric acid (GABA) by cultivating the spore suspension in a broth medium containing 0.4 % (w/v) of glutamic acid as a substrate for GABA production. The results showed that these strains were capable of producing GABA; however, the concentrations differed significantly (P < 0.05) among themselves. Based on the A. oryzae strains, highest GABA concentration was obtained from NSK (194 mg/L) followed by NSZ (63 mg/L), NSJ (51.53 mg/L) and NST (31.66 mg/L). Therefore, A. oryzae NSK was characterized and the sequence was found to be similar to A. oryzae and A. flavus with 99 % similarity. The evolutionary distance (K nuc) between sequences of identical fungal species was calculated and a phylogenetic tree prepared from the K nuc data showed that the isolate belonged to the A. oryzae species. This finding may allow the development of GABA-rich ingredients using A. oryzae NSK as a starter culture for soy sauce production.
Conformational analysis of the N-terminal sequence Met1-Val60 of the tyrosine hydroxylase

NASA Astrophysics Data System (ADS)

Alieva, Irada N.; Mustafayeva, Narmina N.; Gojayev, Niftali M.

2006-03-01

Molecular mechanics and molecular dynamics (MD) simulation techniques are used to study the effect of amino acid substitutions on the structure and molecular dynamics of the Met1-Val60 segment of the N-terminal regulatory domain of tyrosine hydroxylase (TH) and of its mutants, in which the positively charged arginine residues at positions 37 and 38 were replaced by electrically neutral Gly and negatively charged Glu, and the serine residue at position 40 was replaced by Ala or Asp. Our study allowed us to draw the following conclusions: (i) the Met1-Arg16 sequence shows higher conformational flexibility than the other parts of the N-terminus; (ii) the stretch of amino acid residues Met30-Ser40 within the N-terminus forms a β-turn so that two α-helices (residues 16-29 and residues 41-60) are parallel to one another; (iii) the significant differences observed for the Arg37→Gly37 and Arg37-Arg38→Glu37-Glu38 mutant segments indicate that the positive charge of the Arg37 and Arg38 residues is one of the main factors maintaining the characteristics of the turn; (iv) no major conformational changes are observed between the Ser40→Ala40 and Ser40→Asp40 mutant segments.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Leong, JoAnn Ching

The nucleotide sequence of the IHNV glycoprotein gene has been determined from a cDNA clone containing the entire coding region. The glycoprotein cDNA clone contained a leader sequence of 48 bases, a coding region of 1524 nucleotides, and 39 bases at the 3' end. The entire cDNA clone contains 1609 nucleotides and encodes a protein of 508 amino acids. The deduced amino acid sequence gave a translated molecular weight of 56,795 daltons. A hydropathicity profile of the deduced amino acid sequence indicated that there were two major hydrophobic domains: one, at the N-terminus, delineating a signal peptide of 18 amino acids and the other, at the C-terminus, delineating the transmembrane region. Five possible sites of N-linked glycosylation were identified. Although no nucleic acid homology existed between the IHNV glycoprotein gene and the glycoprotein genes of rabies and VSV, there was significant homology at the amino acid level between all three rhabdovirus glycoproteins.
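The relationship between the reported coding-region length, protein length and molecular weight can be checked with simple arithmetic. The sketch below uses only the three figures quoted in the abstract; the variable names are illustrative.

```python
CODING_NT = 1524       # coding region length reported in the abstract
PROTEIN_AA = 508       # encoded protein length reported in the abstract
MOL_WEIGHT_DA = 56795  # translated molecular weight reported in the abstract

# Three nucleotides per codon: 1524 nt should give a whole number of codons.
codons, remainder = divmod(CODING_NT, 3)

# Mean mass per residue implied by the reported weight and length.
mean_residue_mass = MOL_WEIGHT_DA / PROTEIN_AA

print(codons, remainder)            # 508 0
print(round(mean_residue_mass, 1))  # 111.8
```

The codon count equals the stated 508 amino acids, and a mean residue mass near 110 Da is typical for proteins, so the reported figures are mutually consistent.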
Pelagirhabdus alkalitolerans gen. nov., sp. nov., an alkali-tolerant and thermotolerant bacterium isolated from beach sediment, and reclassification of Amphibacillus fermentum as Pelagirhabdus fermentum comb. nov.

PubMed

Sultanpuram, Vishnuvardhan Reddy; Mothe, Thirumala; Chintalapati, Sasikala; Chintalapati, Venkata Ramana

2016-01-01

A novel bacterial strain, designated S5T, was isolated from Pingaleshwar beach, in India. Cells were Gram-stain-positive, rod-shaped, non-motile and non-endospore-forming. Based on 16S rRNA gene sequence analysis, the strain was identified as belonging to the class Firmibacteria and was related most closely to Amphibacillus fermentum DSM 13869T (97.6 % sequence similarity). However, it shared only 93.1 % 16S rRNA gene sequence similarity with Amphibacillus xylanus NBRC 15112T, the type species of the genus, indicating that strain S5T might not be a member of the genus Amphibacillus. The DNA-DNA relatedness between strain S5T and Amphibacillus fermentum DSM 13869T was 39 %. The cell-wall peptidoglycan contained meso-diaminopimelic acid. Polar lipids included diphosphatidylglycerol, phosphatidylglycerol and two phospholipids. Isoprenoid quinones were absent from strain S5T. Fatty acid analysis revealed that anteiso-C15 : 0, C16 : 0 and iso-C15 : 0 were the predominant fatty acids present. The results of phylogenetic, chemotaxonomic and biochemical tests allowed the clear differentiation of strain S5T, which is considered to represent a novel species of a new genus in the family Bacillaceae, for which the name Pelagirhabdus alkalitolerans gen. nov., sp. nov. is proposed. The type strain of Pelagirhabdus alkalitolerans is S5T ( = KCTC 33632T = CGMCC 1.15177T). Based on the present study, it is also suggested to transfer Amphibacillus fermentum to this new genus, as Pelagirhabdus fermentum comb. nov. The type strain of Pelagirhabdus fermentum is Z-7984T ( = DSM 13869T = UNIQEM 210T).
Potential Value of Major Antigenic Protein 2 for Serological Diagnosis of Heartwater and Related Ehrlichial Infections

PubMed Central

Bowie, Michael V.; Reddy, G. Roman; Semu, Shalt M.; Mahan, Suman M.; Barbet, Anthony F.

1999-01-01

Cowdria ruminantium is the etiologic agent of heartwater, a disease causing major economic loss in ruminants in sub-Saharan Africa and the Caribbean. Development of a serodiagnostic test is essential for determining the carrier status of animals from regions where heartwater is endemic, but most available tests give false-positive reactions with sera against related Ehrlichia species. Current approaches rely on molecular methods to define proteins and epitopes that may allow specific diagnosis. Two major antigenic proteins (MAPs), MAP1 and MAP2, have been examined for their use as antigens in the serodiagnosis of heartwater. The objectives of this study were (i) to determine if MAP2 is conserved among five geographically divergent strains of C. ruminantium and (ii) to determine if MAP2 homologs are present in Ehrlichia canis, the causative agent of canine ehrlichiosis, and Ehrlichia chaffeensis, the organism responsible for human monocytic ehrlichiosis. These two agents are closely related to C. ruminantium. The map2 gene from four strains of C. ruminantium was cloned, sequenced, and compared with the previously reported map2 gene from the Crystal Springs strain. Only 10 nucleic acid differences between the strains were identified, and they translate to only 3 amino acid changes, indicating that MAP2 is highly conserved. Genes encoding MAP2 homologs from E. canis and E. chaffeensis also were cloned and sequenced. Amino acid analysis of MAP2 homologs of E. chaffeensis and E. canis with MAP2 of C. ruminantium revealed 83.4 and 84.4% identities, respectively. Further analysis of MAP2 and its homologs revealed that the whole protein lacks specificity for heartwater diagnosis. The development of epitope-specific assays using this sequence information may produce diagnostic tests suitable for C. ruminantium and also other related rickettsiae. PMID:10066656
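Identity figures such as the 83.4 and 84.4% quoted above are computed from a pairwise alignment. The sketch below shows one common convention (matches divided by the number of aligned, ungapped columns); the abstract does not state which convention or alignment program was used, and the two aligned strings are invented for illustration rather than taken from MAP2.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over columns where neither aligned sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b)
             if a != gap and b != gap]
    if not pairs:
        return 0.0
    matches = sum(a == b for a, b in pairs)
    return 100.0 * matches / len(pairs)


# Hypothetical aligned fragments: 9 ungapped columns, 7 of them identical.
identity = percent_identity("MKTAYIAK-Q", "MKSAYLAKRQ")
print(round(identity, 1))
```

Other conventions divide by the full alignment length or by the shorter sequence length, so identity values from different tools are not always directly comparable.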
Genomic and Transcriptomic Analysis of Escherichia coli Strains Associated with Persistent and Transient Bovine Mastitis and the Role of Colanic Acid.

PubMed

Lippolis, John D; Holman, Devin B; Brunelle, Brian W; Thacker, Tyler C; Bearson, Bradley L; Reinhardt, Timothy A; Sacco, Randy E; Casey, Thomas A

2018-01-01

Escherichia coli is a leading cause of bacterial mastitis in dairy cattle. It is most often transient in nature, causing an infection that lasts 2 to 3 days. However, E. coli has been shown to cause a persistent infection in a minority of cases. Mechanisms that allow for a persistent E. coli infection are not fully understood. The goal of this work was to determine differences between E. coli strains originally isolated from dairy cattle with transient and persistent mastitis. Using RNA sequencing, we show gene expression differences in nearly 200 genes when bacteria from the two clinical phenotypes are compared. We sequenced the genomes of the E. coli strains and report genes unique to the two phenotypes. Differences in the wca operon, which encodes colanic acid, were identified by DNA as well as RNA sequencing and differentiated the two phenotypes. Previous work demonstrated that E. coli strains that cause persistent infections were more motile than those that cause transient infections. Deletion of genes in the wca operon from a persistent-infection strain resulted in a reduction of motility as measured in swimming and swarming assays. Furthermore, colanic acid has been shown to protect bacteria from complement-mediated killing. We show that transient-infection E. coli strains were more sensitive to complement-mediated killing. The deletion of genes from the wca operon caused a persistent-infection E. coli strain to become sensitive to complement-mediated killing. This work identifies important differences between E. coli strains that cause persistent and transient mammary infections in dairy cattle. This is a work of the U.S. Government and is not subject to copyright protection in the United States. Foreign copyrights may apply.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Branda, Steven S.; Lane, Todd W.; Misra, Milind

Bioweapons and emerging infectious diseases pose formidable and growing threats to our national security. Rapid advances in biotechnology and the increasing efficiency of global transportation networks virtually guarantee that the United States will face potentially devastating infectious disease outbreaks caused by novel ('unknown') pathogens either intentionally or accidentally introduced into the population. Unfortunately, our nation's biodefense and public health infrastructure is primarily designed to handle previously characterized ('known') pathogens. While modern DNA assays can identify known pathogens quickly, identifying unknown pathogens currently depends upon slow, classical microbiological methods of isolation and culture that can take weeks to produce actionable information. In many scenarios that delay would be costly, in terms of casualties and economic damage; indeed, it can mean the difference between a manageable public health incident and a full-blown epidemic. To close this gap in our nation's biodefense capability, we will develop, validate, and optimize a system to extract nucleic acids from unknown pathogens present in clinical samples drawn from infected patients. This system will extract nucleic acids from a clinical sample and amplify pathogen and specific host response nucleic acid sequences. These sequences will then be suitable for ultra-high-throughput sequencing (UHTS) carried out by a third party. The data generated from UHTS will then be processed through a new data assimilation and bioinformatic analysis pipeline that will allow us to characterize an unknown pathogen in hours to days instead of weeks to months. Our methods will require no a priori knowledge of the pathogen, and no isolation or culturing; they will therefore circumvent many of the major roadblocks confronting a clinical microbiologist or virologist when presented with an unknown or engineered pathogen.
Analysis of expressed sequence tags from Actinidia: applications of a cross species EST database for gene discovery in the areas of flavor, health, color and ripening

PubMed Central

Crowhurst, Ross N; Gleave, Andrew P; MacRae, Elspeth A; Ampomah-Dwamena, Charles; Atkinson, Ross G; Beuning, Lesley L; Bulley, Sean M; Chagne, David; Marsh, Ken B; Matich, Adam J; Montefiori, Mirco; Newcomb, Richard D; Schaffer, Robert J; Usadel, Björn; Allan, Andrew C; Boldingh, Helen L; Bowen, Judith H; Davy, Marcus W; Eckloff, Rheinhart; Ferguson, A Ross; Fraser, Lena G; Gera, Emma; Hellens, Roger P; Janssen, Bart J; Klages, Karin; Lo, Kim R; MacDiarmid, Robin M; Nain, Bhawana; McNeilage, Mark A; Rassam, Maysoon; Richardson, Annette C; Rikkerink, Erik HA; Ross, Gavin S; Schröder, Roswitha; Snowden, Kimberley C; Souleyre, Edwige JF; Templeton, Matt D; Walton, Eric F; Wang, Daisy; Wang, Mindy Y; Wang, Yanming Y; Wood, Marion; Wu, Rongmei; Yauk, Yar-Khing; Laing, William A

2008-01-01

Background: Kiwifruit (Actinidia spp.) are a relatively new, but economically important crop grown in many different parts of the world. Commercial success is driven by the development of new cultivars with novel consumer traits including flavor, appearance, healthful components and convenience. To increase our understanding of the genetic diversity and gene-based control of these key traits in Actinidia, we have produced a collection of 132,577 expressed sequence tags (ESTs). Results: The ESTs were derived mainly from four Actinidia species (A. chinensis, A. deliciosa, A. arguta and A. eriantha) and fell into 41,858 non redundant clusters (18,070 tentative consensus sequences and 23,788 EST singletons). Analysis of flavor and fragrance-related gene families (acyltransferases and carboxylesterases) and pathways (terpenoid biosynthesis) is presented in comparison with a chemical analysis of the compounds present in Actinidia including esters, acids, alcohols and terpenes. ESTs are identified for most genes in color pathways controlling chlorophyll degradation and carotenoid biosynthesis. In the health area, data are presented on the ESTs involved in ascorbic acid and quinic acid biosynthesis showing not only that genes for many of the steps in these pathways are represented in the database, but that genes encoding some critical steps are absent. In the convenience area, genes related to different stages of fruit softening are identified. Conclusion: This large EST resource will allow researchers to undertake the tremendous challenge of understanding the molecular basis of genetic diversity in the Actinidia genus as well as provide an EST resource for comparative fruit genomics. The various bioinformatics analyses we have undertaken demonstrate the extent of coverage of ESTs for genes encoding different biochemical pathways in Actinidia. PMID:18655731
Cloning and Characterization of a Novel β-Transaminase from Mesorhizobium sp. Strain LUK: a New Biocatalyst for the Synthesis of Enantiomerically Pure β-Amino Acids

PubMed Central

Kim, Juhan; Kyung, Dohyun; Yun, Hyungdon; Cho, Byung-Kwan; Seo, Joo-Hyun; Cha, Minho; Kim, Byung-Gee

2007-01-01

A novel β-transaminase gene was cloned from Mesorhizobium sp. strain LUK. By using N-terminal sequence and an internal protein sequence, a digoxigenin-labeled probe was made for nonradioactive hybridization, and a 2.5-kb gene fragment was obtained by colony hybridization of a cosmid library. Through Southern blotting and sequence analysis of the selected cosmid clone, the structural gene of the enzyme (1,335 bp) was identified, which encodes a protein of 47,244 Da with a theoretical pI of 6.2. The deduced amino acid sequence of the β-transaminase showed the highest sequence similarity with glutamate-1-semialdehyde aminomutase of transaminase subgroup II. The β-transaminase showed higher activities toward d-β-aminocarboxylic acids such as 3-aminobutyric acid, 3-amino-5-methylhexanoic acid, and 3-amino-3-phenylpropionic acid. The β-transaminase has an unusually broad specificity for amino acceptors such as pyruvate and α-ketoglutarate/oxaloacetate. The enantioselectivity of the enzyme suggested that the recognition mode of β-aminocarboxylic acids in the active site is reversed relative to that of α-amino acids. After comparison of its primary structure with transaminase subgroup II enzymes, it was proposed that R43 interacts with the carboxylate group of the β-aminocarboxylic acids and the carboxylate group on the side chain of dicarboxylic α-keto acids such as α-ketoglutarate and oxaloacetate. R404 is another conserved residue, which interacts with the α-carboxylate group of the α-amino acids and α-keto acids. The β-transaminase was used for the asymmetric synthesis of enantiomerically pure β-aminocarboxylic acids. (3S)-Amino-3-phenylpropionic acid was produced from the ketocarboxylic acid ester substrate by coupled reaction with a lipase using 3-aminobutyric acid as amino donor. PMID:17259358
The nucleotide sequence of HLA-B*2704 reveals a new amino acid substitution in exon 4 which is also present in HLA-B*2706

DOE Office of Scientific and Technical Information (OSTI.GOV)

Rudwaleit, M.; Bowness, P.; Wordsworth, P.

1996-12-31

The HLA-B27 subtype HLA-B*2704 is virtually absent in Caucasians but common in Orientals, where it is associated with ankylosing spondylitis. The amino acid sequence of HLA-B*2704 has been established by peptide mapping and was shown to differ by two amino acids from HLA-B*2705. HLA-B*2704 is characterized by a serine for aspartic acid substitution at position 77 and glutamic acid for valine at position 152. To date, however, no nucleotide sequence confirming these changes at the DNA level has been published. 13 refs., 2 figs.
77 FR 28541 - Request for Comments on the Recommendation for the Disclosure of Sequence Listings Using XML...

Federal Register 2010, 2011, 2012, 2013, 2014

2012-05-15

... (EPO) as the lead, to propose a revised standard for the filing of nucleotide and/or amino acid.... ST.25 uses a controlled vocabulary of feature keys to describe nucleic acid and amino acid sequences... patent data purposes. The XML standard also includes four qualifiers for amino acids. These feature keys...

Molecular cloning of the pheromone biosynthesis-activating neuropeptide in Helicoverpa zea.

PubMed Central

Davis, M T; Vakharia, V N; Henry, J; Kempe, T G; Raina, A K

1992-01-01

Pheromone biosynthesis-activating neuropeptide (PBAN) regulates sex pheromone biosynthesis in female Helicoverpa (Heliothis) zea. Two oligonucleotide probes representing two overlapping amino acid regions of PBAN were used to screen 2.5 × 10^5 recombinant plaques, and a positive recombinant clone was isolated. Sequence analysis of the isolated clone showed that the PBAN gene is interrupted after the codon encoding amino acid 14 by a 0.63-kilobase (kb) intron. Preceding the PBAN amino acid sequence is a 10-amino acid sequence containing a pentapeptide Phe-Thr-Pro-Arg-Leu, which is followed by a Gly-Arg-Arg processing site. Immediately after the PBAN amino acid sequence is a Gly-Arg processing site and a short stretch of 10 amino acids. This 10-amino acid sequence contains a repeat of the PBAN C-terminal pentapeptide Phe-Ser-Pro-Arg-Leu and is terminated by another Gly-Arg processing site. It is suggested that the PBAN gene in H. zea might carry, besides PBAN, a 7- and an 8-residue amidated peptide, which share with PBAN the core C-terminal pentapeptide Phe-(Ser or Thr)-Pro-Arg-Leu-NH2. The C-terminal pentapeptide sequence of PBAN represents the minimum sequence required for pheromonotropic activity in H. zea and also bears a high degree of homology to the pyrokinin family of insect peptides with myotropic activity. It is possible that the putative heptapeptide and octapeptide might be new members of the pyrokinin family, with pheromonotropic and/or myotropic activities. Thus, the PBAN gene products, besides affecting sexual behavior, might have broad influence on many biological processes in H. zea. PMID:1729680
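The core pentapeptide Phe-(Ser or Thr)-Pro-Arg-Leu described above corresponds to the one-letter pattern `F[ST]PRL`, which makes it easy to scan a translated sequence for candidate copies. The sketch below is illustrative only: the toy sequence is invented to contain both variants followed by Gly-Arg sites and is not the real PBAN precursor.

```python
import re

# One-letter form of Phe-(Ser or Thr)-Pro-Arg-Leu.
CORE_MOTIF = re.compile(r"F[ST]PRL")


def find_core_motifs(protein):
    """Return (0-based start, matched text) for each non-overlapping motif hit."""
    return [(m.start(), m.group()) for m in CORE_MOTIF.finditer(protein)]


# Made-up sequence for illustration; not taken from the H. zea gene.
toy_precursor = "AAFTPRLGRRMMFSPRLGRKK"

hits = find_core_motifs(toy_precursor)
print(hits)
```

Because the motif cannot overlap itself, the non-overlapping search performed by `finditer` reports every occurrence.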
Host Cell Virus Entry Mediated by Australian Bat Lyssavirus Envelope G glycoprotein

DTIC Science & Technology

2013-10-24

39 Figure 7. Comparison of the amino acid sequences of Saccolaimus and Pteropus ABLV G mature protein... sequence analysis revealed that the PCR products were identical. Sequence comparisons of the ABLV N and other lyssavirus N proteins showed that ABLV...Saccolaimus flaviventris) (129). Nucleoprotein sequence comparisons revealed that the Saccolaimus N protein shared 96% amino acid homology with the Pteropus
DNA sequence similarity recognition by hybridization to short oligomers

DOEpatents

Milosavljevic, Aleksandar

1999-01-01

Methods are disclosed for the comparison of nucleic acid sequences. Data is generated by hybridizing sets of oligomers with target nucleic acids. The data thus generated is manipulated simultaneously with respect to both (i) matching between oligomers and (ii) matching between oligomers and putative reference sequences available in databases. Using data compression methods to manipulate this mutual information, sequences for the target can be constructed.
Preparation of Nucleic Acid Libraries for Personalized Sequencing Systems Using an Integrated Microfluidic Hub Technology (Seventh Annual Sequencing, Finishing, Analysis in the Future (SFAF) Meeting 2012)

ScienceCinema

Patel, Kamlesh D.

2018-01-22

Kamlesh (Ken) Patel from Sandia National Laboratories (Livermore, California) presents "Preparation of Nucleic Acid Libraries for Personalized Sequencing Systems Using an Integrated Microfluidic Hub Technology" at the 7th Annual Sequencing, Finishing, Analysis in the Future (SFAF) Meeting held in June, 2012 in Santa Fe, NM.
Preparation of Nucleic Acid Libraries for Personalized Sequencing Systems Using an Integrated Microfluidic Hub Technology (Seventh Annual Sequencing, Finishing, Analysis in the Future (SFAF) Meeting 2012)

DOE Office of Scientific and Technical Information (OSTI.GOV)

Patel, Kamlesh D.

2012-06-01

Kamlesh (Ken) Patel from Sandia National Laboratories (Livermore, California) presents "Preparation of Nucleic Acid Libraries for Personalized Sequencing Systems Using an Integrated Microfluidic Hub Technology" at the 7th Annual Sequencing, Finishing, Analysis in the Future (SFAF) Meeting held in June, 2012 in Santa Fe, NM.
Crotoxin: Structural Studies, Mechanism of Action and Cloning of Its Gene

DTIC Science & Technology

1987-03-01

other venoms and examine their toxin neutralizing ability. The amino acid sequences of both crotoxin subunits were determined as a prelude to cloning...be examined for their potential as anti-idiotype vaccines. The complete amino acid sequence of the basic subunit and two of the three acidic subunit chains...of crotoxin from the venom of C.d. terrificus has been determined. Sequence comparison data suggest that the non-toxic, acidic subunit was derived
Cloning and sequence analysis of the invertase gene INV 1 from the yeast Pichia anomala.

PubMed

Pérez, J A; Rodríguez, J; Rodríguez, L; Ruiz, T

1996-02-01

A genomic library from the yeast Pichia anomala has been constructed and employed to clone the gene encoding the sucrose-hydrolysing enzyme invertase by complementation of a sucrose non-fermenting mutant of Saccharomyces cerevisiae. The cloned gene, INV1, was sequenced and found to encode a polypeptide of 550 amino acids which contained a 22 amino-acid signal sequence and ten potential glycosylation sites. The amino-acid sequence shows significant identity with other yeast invertases and also with Kluyveromyces marxianus inulinase, a yeast beta-fructofuranosidase which has a different substrate specificity. The nucleotide sequences of the 5' and 3' non-coding regions were found to contain several consensus motifs probably involved in the initiation and termination of gene transcription.
Molecular cloning of two human liver 3 alpha-hydroxysteroid/dihydrodiol dehydrogenase isoenzymes that are identical with chlordecone reductase and bile-acid binder.

PubMed Central

Deyashiki, Y; Ogasawara, A; Nakayama, T; Nakanishi, M; Miyabe, Y; Sato, K; Hara, A

1994-01-01

Human liver contains two dihydrodiol dehydrogenases, DD2 and DD4, associated with 3 alpha-hydroxysteroid dehydrogenase activity. We have raised polyclonal antibodies that cross-reacted with the two enzymes and isolated two 1.2 kb cDNA clones (C9 and C11) for the two enzymes from a human liver cDNA library using the antibodies. The clones of C9 and C11 contained coding sequences corresponding to 306 and 321 amino acid residues respectively, but lacked 5'-coding regions around the initiation codon. Sequence analyses of several peptides obtained by enzymic and chemical cleavages of the two purified enzymes verified that the C9 and C11 clones encoded DD2 and DD4 respectively, and further indicated that the sequence of DD2 had at least 16 additional residues upstream of the N-terminal sequence deduced from the cDNA. There was 82% amino acid sequence identity between the two enzymes, indicating that the enzymes are genetic isoenzymes. A computer-based comparison of the cDNAs of the isoenzymes with the DNA sequence database revealed that the nucleotide and amino acid sequences of DD2 and DD4 are virtually identical with those of human bile-acid binder and human chlordecone reductase cDNAs respectively. PMID:8172617
PANTHER version 11: expanded annotation data from Gene Ontology and Reactome pathways, and data analysis tool enhancements.

PubMed

Mi, Huaiyu; Huang, Xiaosong; Muruganujan, Anushya; Tang, Haiming; Mills, Caitlin; Kang, Diane; Thomas, Paul D

2017-01-04

The PANTHER database (Protein ANalysis THrough Evolutionary Relationships, http://pantherdb.org) contains comprehensive information on the evolution and function of protein-coding genes from 104 completely sequenced genomes. PANTHER software tools allow users to classify new protein sequences, and to analyze gene lists obtained from large-scale genomics experiments. In the past year, major improvements include a large expansion of classification information available in PANTHER, as well as significant enhancements to the analysis tools. Protein subfamily functional classifications have more than doubled due to progress of the Gene Ontology Phylogenetic Annotation Project. For human genes (as well as a few other organisms), PANTHER now also supports enrichment analysis using pathway classifications from the Reactome resource. The gene list enrichment tools include a new 'hierarchical view' of results, enabling users to leverage the structure of the classifications/ontologies; the tools also allow users to upload genetic variant data directly, rather than requiring prior conversion to a gene list. The updated coding single-nucleotide polymorphisms (SNP) scoring tool uses an improved algorithm. The hidden Markov model (HMM) search tools now use HMMER3, dramatically reducing search times and improving accuracy of E-value statistics. Finally, the PANTHER Tree-Attribute Viewer has been implemented in JavaScript, with new views for exploring protein sequence evolution. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
Negative Ion In-Source Decay Matrix-Assisted Laser Desorption/Ionization Mass Spectrometry for Sequencing Acidic Peptides

NASA Astrophysics Data System (ADS)

McMillen, Chelsea L.; Wright, Patience M.; Cassady, Carolyn J.

2016-05-01

Matrix-assisted laser desorption/ionization (MALDI) in-source decay was studied in the negative ion mode on deprotonated peptides to determine its usefulness for obtaining extensive sequence information for acidic peptides. Eight biological acidic peptides, ranging in size from 11 to 33 residues, were studied by negative ion mode ISD (nISD). The matrices 2,5-dihydroxybenzoic acid, 2-aminobenzoic acid, 2-aminobenzamide, 1,5-diaminonaphthalene, 5-amino-1-naphthol, 3-aminoquinoline, and 9-aminoacridine were used with each peptide. Optimal fragmentation was produced with 1,5-diaminonaphthalene (DAN), and extensive sequence-informative fragmentation was observed for every peptide except hirudin(54-65). Cleavage at the N-Cα bond of the peptide backbone, producing c' and z' ions, was dominant for all peptides. Cleavage of the N-Cα bond N-terminal to proline residues was not observed. The formation of c and z ions is also found in electron transfer dissociation (ETD), electron capture dissociation (ECD), and positive ion mode ISD, which are considered to be radical-driven techniques. Oxidized insulin chain A, which has four highly acidic oxidized cysteine residues, had less extensive fragmentation. This peptide also exhibited the only charge-localized fragmentation, with more pronounced product ion formation adjacent to the highly acidic residues. In addition, spectra were obtained by positive ion mode ISD for each protonated peptide; more sequence-informative fragmentation was observed via nISD for all peptides. Three of the peptides studied had no product ion formation in ISD, but extensive sequence-informative fragmentation was found in their nISD spectra. The results of this study indicate that nISD can be used to readily obtain sequence information for acidic peptides.
A New F131V Mutation in Chlamydomonas Phytoene Desaturase Locates a Cluster of Norflurazon Resistance Mutations near the FAD-Binding Site in 3D Protein Models

PubMed Central

Suarez, Julio V.; Banks, Stephen; Thomas, Paul G.; Day, Anil

2014-01-01

The green alga Chlamydomonas reinhardtii provides a tractable genetic model to study herbicide mode of action using forward genetics. The herbicide norflurazon inhibits phytoene desaturase, which is required for carotenoid synthesis. Locating amino acid substitutions in mutant phytoene desaturases conferring norflurazon resistance provides a genetic approach to map the herbicide binding site. We isolated a UV-induced mutant able to grow in very high concentrations of norflurazon (150 µM). The phytoene desaturase gene in the mutant strain contained the first resistance mutation to be localised to the dinucleotide-binding Rossmann-like domain. A highly conserved phenylalanine at position 131 of the 564 amino acid precursor protein was changed to a valine in the mutant protein. F131, and two other amino acids whose substitution confers norflurazon resistance in homologous phytoene desaturase proteins, map to distant regions in the primary sequence of the C. reinhardtii protein (V472, L505), but in tertiary models these residues cluster together in a region close to the predicted FAD binding site. The mutant gene allowed direct 5 µM norflurazon-based selection of transformants, which were tolerant to other bleaching herbicides including fluridone, flurtamone, and diflufenican but were more sensitive to beflubutamid than wild type cells. Norflurazon resistance and beflubutamid sensitivity allow either positive or negative selection against transformants expressing the mutant phytoene desaturase gene. PMID:24936791
pH-Modulated Watson-Crick duplex-quadruplex equilibria of guanine-rich and cytosine-rich DNA sequences 140 base pairs upstream of the c-kit transcription initiation site.

PubMed

Bucek, Pavel; Jaumot, Joaquim; Aviñó, Anna; Eritja, Ramon; Gargallo, Raimundo

2009-11-23

Guanine-rich regions of DNA are sequences capable of forming G-quadruplex structures. The formation of a G-quadruplex structure in a region 140 base pairs (bp) upstream of the c-kit transcription initiation site was recently proposed (Fernando et al., Biochemistry, 2006, 45, 7854). In the present study, the acid-base equilibria and the thermally induced unfolding of the structures formed by a guanine-rich region and by its complementary cytosine-rich strand in c-kit were studied by means of circular dichroism and molecular absorption spectroscopies. In addition, competition between the Watson-Crick duplex and the isolated structures was studied as a function of pH value and temperature. Multivariate data analysis methods based on both hard and soft modeling were used to allow accurate quantification of the various acid-base species present in the mixtures. Results showed that the G-quadruplex and i-motif coexist with the Watson-Crick duplex over the pH range from 3.0 to 6.5, approximately, under the experimental conditions tested in this study. At pH 7.0, the duplex is practically the only species present.
PNA-COMBO-FISH: From combinatorial probe design in silico to vitality compatible, specific labelling of gene targets in cell nuclei.

PubMed

Müller, Patrick; Rößler, Jens; Schwarz-Finsterle, Jutta; Schmitt, Eberhard; Hausmann, Michael

2016-07-01

Recently, advantages concerning targeting specificity of PCR constructed oligonucleotide FISH probes in contrast to established FISH probes, e.g. BAC clones, have been demonstrated. These techniques, however, are still using labelling protocols with DNA denaturing steps applying harsh heat treatment with or without further denaturing chemical agents. COMBO-FISH (COMBinatorial Oligonucleotide FISH) allows the design of specific oligonucleotide probe combinations in silico. Thus, being independent from primer libraries or PCR laboratory conditions, the probe sequences extracted by computer sequence data base search can also be synthesized as single stranded PNA-probes (Peptide Nucleic Acid probes) or TINA-DNA (Twisted Intercalating Nucleic Acids). Gene targets can be specifically labelled with at least about 20 probes obtaining visibly background free specimens. By using appropriately designed triplex forming oligonucleotides, the denaturing procedures can completely be omitted. These results reveal a significant step towards oligonucleotide-FISH maintaining the 3d-nanostructure and even the viability of the cell target. The method is demonstrated with the detection of Her2/neu and GRB7 genes, which are indicators in breast cancer diagnosis and therapy. Copyright © 2016. Published by Elsevier Inc.
Structure of the horseradish peroxidase isozyme C genes.

PubMed

Fujiyama, K; Takemura, H; Shibayama, S; Kobayashi, K; Choi, J K; Shinmyo, A; Takano, M; Yamada, Y; Okada, H

1988-05-02

We have isolated, cloned and characterized three cDNAs and two genomic DNAs corresponding to the mRNAs and genes for the horseradish (Armoracia rusticana) peroxidase isoenzyme C (HRP C). The amino acid sequence of HRP C1, deduced from the nucleotide sequence of one of the cDNA clones, pSK1, contained the same primary sequence as that of the purified enzyme established by Welinder [FEBS Lett. 72, 19-23 (1976)], with additional sequences at the N and C termini. All three inserts in the cDNA clones, pSK1, pSK2 and pSK3, coded for peptides of the same size (308 amino acid residues) if these are processed in the same way, and the amino acid sequences were 91-94% homologous to each other. Functional amino acids, including His40, His170, Tyr185 and Arg183 and S-S-bond-forming Cys, were conserved in the three isozymes, but a few N-glycosylation sites were not the same. Two HRP C isoenzyme genomic genes, prxC1 and prxC2, were arranged in tandem on the chromosomal DNA, and each gene consisted of four exons and three introns. The positions in the exons interrupted by introns were the same in the two genes. We observed a putative promoter sequence 5' upstream and a poly(A) signal 3' downstream in both genes. The gene product of prxC1 might be processed with a signal sequence of 30 amino acid residues at the N terminus and a peptide consisting of 15 amino acid residues at the C terminus.
Useful halophilic, thermostable and ionic liquids tolerant cellulases

DOEpatents

Zhang, Tao; Datta, Supratim; Simmons, Blake A.; Rubin, Edward M.

2016-06-28

The present invention provides for an isolated or recombinant polypeptide comprising an amino acid sequence having at least 70% identity with the amino acid sequence of a Halorhabdus utahensis cellulase, such as Hu-CBH1, wherein said amino acid sequence has a halophilic thermostable and/or thermophilic cellobiohydrolase (CBH) activity. In some embodiments, the polypeptide has a CBH activity that is resistant to up to about 20% of ionic liquids. The present invention also provides for compositions comprising and methods using the isolated or recombinant polypeptide.
Diagnostics based on nucleic acid sequence variant profiling: PCR, hybridization, and NGS approaches.

PubMed

Khodakov, Dmitriy; Wang, Chunyan; Zhang, David Yu

2016-10-01

Nucleic acid sequence variations have been implicated in many diseases, and reliable detection and quantitation of DNA/RNA biomarkers can inform effective therapeutic action, enabling precision medicine. Nucleic acid analysis technologies being translated into the clinic can broadly be classified into hybridization, PCR, and sequencing, as well as their combinations. Here we review the molecular mechanisms of popular commercial assays, and their progress in translation into in vitro diagnostics. Copyright © 2016 The Authors. Published by Elsevier B.V. All rights reserved.
Molecular cloning and sequence analysis of full-length growth hormone cDNAs from six important economic fishes.

PubMed

Zhang, Jing-Nan; Song, Ping; Hu, Jia-Rui; Mo, Sai-Jun; Peng, Mao-Yu; Zhou, Wei; Zou, Ji-Xing; Hu, Yin-Chang

2005-01-01

In this study, full-length GH (growth hormone) cDNAs were isolated from six economically important fishes: Siniperca kneri, Epinephelus coioides, Monopterus albus, Silurus asotus, Misgurnus anguillicaudatus and Carassius auratus gibelio Bloch. Except for E. coioides GH, these GH sequences are cloned here for the first time. The lengths of the cDNAs are, respectively, 953 bp, 1,023 bp, 825 bp, 1,082 bp, 1,154 bp and 1,180 bp. Each sequence includes an ORF of about 600 bp that encodes a protein of about 200 amino acids: 204 amino acids for the S. kneri, E. coioides and M. albus GHs, 200 amino acids for S. asotus GH, and 210 amino acids for the M. anguillicaudatus and C. auratus gibelio GHs. A detailed sequence analysis of the six GHs together with many other fish sequences was then performed. The six sequences all showed high homology to other sequences, especially to sequences within the same order, and many conserved residues were identified, most localized in five domains. The phylogenetic trees (MP and NJ) of many fish GH ORF sequences (including the six new ones), with Amia calva as outgroup, were generally resolved and largely congruent with the morphology-based tree, though some incongruities were observed, suggesting that the GH ORF deserves more attention in teleostean phylogeny.
Characterization and mapping of cDNA encoding aspartate aminotransferase in rice, Oryza sativa L.

PubMed

Song, J; Yamamoto, K; Shomura, A; Yano, M; Minobe, Y; Sasaki, T

1996-10-31

Fifteen cDNA clones, putatively identified as encoding aspartate aminotransferase (AST, EC 2.6.1.1), were isolated and partially sequenced. Together with six previously isolated clones putatively identified as encoding ASTs (Sasaki, et al. 1994, Plant Journal 6, 615-624), their sequences were characterized and classified into 4 cDNA species. Two of the isolated clones, C60213 and C2079, were full-length cDNAs, and their complete nucleotide sequences were determined. C60213 was 1612 bp long and its deduced amino acid sequence showed 88% homology with that of Panicum miliaceum L. mitochondrial AST. The C60213-encoded protein had an N-terminal amino acid sequence that was characteristic of a mitochondrial transit peptide. On the other hand, C2079 was 1546 bp long and had 91% amino acid sequence homology with P. miliaceum L. cytosolic AST but lacked the transit peptide sequence. The nucleotide sequences and deduced amino acid sequences of C2079 and C60213 shared 54% and 52% homology, respectively. C2079 and C60213 were mapped on chromosomes 1 and 6, respectively, by restriction fragment length polymorphism linkage analysis. Northern blot analysis using C2079 as a probe revealed much higher transcript levels in callus and root than in green and etiolated shoots, suggesting tissue-specific variations of AST gene expression.
Molecular cloning, sequence and structural analysis of dehairing Mn(2+) dependent alkaline serine protease (MASPT) of Bacillus pumilus TMS55.

PubMed

Ibrahim, Kalibulla Syed; Muniyandi, Jeyaraj; Pandian, Shunmugiah Karutha

2011-10-01

Leather industries release large amounts of pollution-causing chemicals, making them one of the major sources of industrial pollution. The development of enzyme-based processes as a potent alternative to pollution-causing chemicals is useful to overcome this issue. Proteases are enzymes which have extensive applications in leather processing and in several bioremediation processes due to their high alkaline protease activity and dehairing efficacy. In the present study, we report the cloning and characterization of a Mn2+ dependent alkaline serine protease gene (MASPT) of Bacillus pumilus TMS55. The gene encoding the protease from B. pumilus TMS55 was cloned and its nucleotide sequence was determined. This gene has an open reading frame (ORF) of 1,149 bp that encodes a polypeptide of 383 amino acid residues. Our analysis showed that this polypeptide is composed of an N-terminal signal peptide of 29 residues, a propeptide of 79 residues and a mature protein of 275 amino acids. We performed bioinformatics analysis to compare the MASPT enzyme with other proteases. Homology modeling was employed to model the three-dimensional structure of MASPT. Structural analysis showed that the MASPT structure is composed of nine α-helices and nine β-strands. It has 3 catalytic residues and 14 metal binding residues. Docking analysis showed that residues S223, A260, N263, T328 and S329 interact with Mn2+. This study allows initial inferences about the structure of the protease and will allow the rational design of its derivatives for structure-function studies and also for further improvement of the enzyme.
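The lengths quoted in this abstract can be cross-checked with simple arithmetic: an ORF of 1,149 bp corresponds to 383 codons, and the three reported segments (29 + 79 + 275 residues) sum to the same 383 residues. The sketch below performs that check. The function name is hypothetical, and treating the 1,149 bp figure as excluding a stop codon is an assumption inferred from the numbers, not something the abstract states.

```python
def codons_in_orf(orf_bp: int) -> int:
    """Number of whole codons in an ORF of the given length in base pairs."""
    if orf_bp <= 0 or orf_bp % 3 != 0:
        raise ValueError("ORF length must be a positive multiple of 3")
    return orf_bp // 3


# Values as reported in the abstract.
orf_bp = 1149
segments = {"signal peptide": 29, "propeptide": 79, "mature protein": 275}

codons = codons_in_orf(orf_bp)
total_residues = sum(segments.values())

# Assumption: the reported ORF length does not count a stop codon,
# so codons and residues should be equal.
print(codons, total_residues, codons == total_residues)
```

Both quantities come out to 383, so the reported ORF length and segment sizes are mutually consistent under that assumption.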
