Singh, Aditya; Bhatia, Prateek
2016-12-01
Sanger sequencing platforms, such as Applied Biosystems instruments, generate chromatogram files. Generally, a given region is sequenced with both forward and reverse primers, which yields 2 sequences that need to be aligned and a consensus generated before mutation detection studies. This work is cumbersome and takes time, especially if the gene is large with many exons. Hence, we devised a rapid automated command system to filter, build, and align consensus sequences and also optionally extract exonic regions, translate them in all frames, and perform an amino acid alignment starting from raw sequence data within a very short time. At its full capabilities, the Automated Mutation Analysis Pipeline (ASAP) is able to read "*.ab1" chromatogram files through a command line interface, convert them to the FASTQ format, trim the low-quality regions, reverse-complement the reverse sequence, create a consensus sequence, extract the exonic regions using a reference exonic sequence, translate the sequence in all frames, and align the nucleic acid and amino acid sequences to reference nucleic acid and amino acid sequences, respectively. All files are created and can be used for further analysis. ASAP is available as a Python 3.x executable at https://github.com/aditya-88/ASAP. The version described in this paper is 0.28.
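Two of the steps named in the abstract above, reverse-complementing the reverse read and merging it with the forward read, can be illustrated with a minimal sketch. This is not ASAP's code: the function names are hypothetical, and the merge assumes both reads cover exactly the same span with no insertions or deletions, which a real pipeline would handle with an alignment step.

```python
# Complement table for unambiguous DNA bases plus N (upper- and lowercase).
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def simple_consensus(forward: str, reverse_read: str) -> str:
    """Merge a forward read with a reverse read of the same region.

    Assumption (illustrative only): after reverse-complementing, the two
    reads have identical length and are already in register. Positions
    where they disagree are reported as 'N'.
    """
    rc = reverse_complement(reverse_read)
    if len(rc) != len(forward):
        raise ValueError("reads must cover the same span in this sketch")
    return "".join(f if f == r else "N" for f, r in zip(forward, rc))


if __name__ == "__main__":
    print(reverse_complement("ATGC"))
    print(simple_consensus("ATGCA", "TGGAT"))
```

Marking disagreements as N is only one possible policy; a quality-aware merge would instead keep the base with the higher Phred score.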
2014-01-01
Background Ambiscript is a graphically-designed nucleic acid notation that uses symbol symmetries to support sequence complementation, highlight biologically-relevant palindromes, and facilitate the analysis of consensus sequences. Although the original Ambiscript notation was designed to easily represent consensus sequences for multiple sequence alignments, the notation’s black-on-white ambiguity characters are unable to reflect the statistical distribution of nucleotides found at each position. We now propose a color-augmented ambigraphic notation to encode the frequency of positional polymorphisms in these consensus sequences. Results We have implemented this color-coding approach by creating an Adobe Flash® application (http://www.ambiscript.org) that shades and colors modified Ambiscript characters according to the prevalence of the encoded nucleotide at each position in the alignment. The resulting graphic helps viewers perceive biologically-relevant patterns in multiple sequence alignments by uniquely combining color, shading, and character symmetries to highlight palindromes and inverted repeats in conserved DNA motifs. Conclusion Juxtaposing an intuitive color scheme over the deliberate character symmetries of an ambigraphic nucleic acid notation yields a highly-functional nucleic acid notation that maximizes information content and successfully embodies key principles of graphic excellence put forth by the statistician and graphic design theorist, Edward Tufte. PMID:24447494
Watanabe, K; Yoshioka, K; Ito, H; Ishigami, M; Takagi, K; Utsunomiya, S; Kobayashi, M; Kishimoto, H; Yano, M; Kakumu, S
1999-11-10
Hypervariable region 1 (HVR1) proteins of hepatitis C virus (HCV) have been reported to react broadly with sera of patients with HCV infection. However, the variability of the broad reactivity of individual HVR1 proteins has not been elucidated. We assessed the reactivity of 25 different HVR1 proteins (genotype 1b) with sera of 81 patients with HCV infection (genotype 1b) by Western blot. HVR1 proteins reacted with 2-60 sera. The number of sera reactive with each HVR1 protein significantly correlated with the number of amino acid residues identical to the consensus sequence defined by Puntoriero et al. (G. Puntoriero, A. Lahm, S. Zucchelli, B. B. Ercole, R. Tafi, M. Penzzanera, M. U. Mondelli, R. Cortese, A. Tramontano, G. Galfre', and A. Nicosia. 1998. EMBO J. 17, 3521-3533) (r = 0.561, P < 0.005). The most widely reactive HVR1 protein, 12-22, had a sequence similar to the consensus sequence. The peptide with the C-terminal 13-amino-acid sequence of HVR1 protein 12-22 (NH2-CSFTSLFTPGPSQK) was injected into rabbits as an immunogen. The rabbit immune sera reacted with 9 of 25 HVR1 proteins of genotype 1b including HVR1 protein 12-22 and with 3 of 12 proteins of genotype 2a. These results indicate that the HVR1 protein broadly reactive with patients' sera has a sequence similar to the consensus sequence, can induce broadly reactive sera, and could be one of the candidate immunogens in a prophylactic vaccine against HCV. Copyright 1999 Academic Press.
The Functional Human C-Terminome
Hedden, Michael; Lyon, Kenneth F.; Brooks, Steven B.; David, Roxanne P.; Limtong, Justin; Newsome, Jacklyn M.; Novakovic, Nemanja; Rajasekaran, Sanguthevar; Thapar, Vishal; Williams, Sean R.; Schiller, Martin R.
2016-01-01
All translated proteins end with a carboxylic acid commonly called the C-terminus. Many short functional sequences (minimotifs) are located on or immediately proximal to the C-terminus. However, information about the function of protein C-termini has not been consolidated into a single source. Here, we built a new “C-terminome” database and web system focused on human proteins. Approximately 3,600 C-termini in the human proteome have a minimotif with an established molecular function. To help evaluate the function of the remaining C-termini in the human proteome, we inferred minimotifs identified by experimentation in rodent cells, predicted minimotifs based upon consensus sequence matches, and predicted novel highly repetitive sequences in C-termini. Predictions can be ranked by enrichment scores or Gene Evolutionary Rate Profiling (GERP) scores, a measurement of evolutionary constraint. By searching for new anchored sequences on the last 10 amino acids of proteins in the human proteome with lengths between 3–10 residues and up to 5 degenerate positions in the consensus sequences, we have identified new consensus sequences that predict instances in the majority of human genes. All of this information is consolidated into a database that can be accessed through a C-terminome web system with search and browse functions for minimotifs and human proteins. A known consensus sequence-based predicted function is assigned to nearly half the proteins in the human proteome. Weblink: http://cterminome.bio-toolkit.com. PMID:27050421
Valliere-Douglass, John F; Kodama, Paul; Mujacic, Mirna; Brady, Lowell J; Wang, Wes; Wallace, Alison; Yan, Boxu; Reddy, Pranhitha; Treuheit, Michael J; Balland, Alain
2009-11-20
We report that N-linked oligosaccharide structures can be present on an asparagine residue not adhering to the consensus site motif NX(S/T), where X is not proline, described in the literature. We have observed oligosaccharides on a non-consensus asparaginyl residue in the C(H)1 constant domain of IgG1 and IgG2 antibodies. The initial findings were obtained from characterization of charge variant populations evident in a recombinant human antibody of the IgG2 subclass. HPLC-MS results indicated that cation-exchange chromatography acidic variant populations were enriched in antibody with a second glycosylation site, in addition to the well documented canonical glycosylation site located in the C(H)2 domain. Subsequent tryptic and chymotryptic peptide map data indicated that the second glycosylation site was associated with the amino acid sequence TVSWN(162)SGAL in the C(H)1 domain of the antibody. This highly atypical modification is present at levels of 0.5-2.0% on most of the recombinant antibodies that have been tested and has also been observed in IgG1 antibodies derived from human donors. Site-directed mutagenesis of the C(H)1 domain sequence in a recombinant-human IgG1 antibody resulted in an increase in non-consensus glycosylation to 3.15%, a greater than 4-fold increase over the level observed in the wild type, by changing the -1 and +1 amino acids relative to the asparagine residue at position 162. We believe that further understanding of the phenomenon of non-consensus glycosylation can be used to gain fundamental insights into the fidelity of the cellular glycosylation machinery.
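The consensus site motif NX(S/T), with X not proline, translates directly into a pattern search. The sketch below is illustrative only (the function name is hypothetical) and uses a lookahead so that overlapping sites are all reported. Applied to the C(H)1 peptide quoted in the abstract, it finds no canonical site, which is what makes the reported glycosylation non-consensus.

```python
import re

# N followed by any residue except proline, then serine or threonine.
# The lookahead keeps the match zero-width after N so overlapping
# sequons (e.g. in "NNSS") are all found.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_sequons(protein: str) -> list:
    """Return 1-based positions of asparagines in an N-X-(S/T) sequon."""
    return [m.start() + 1 for m in SEQUON.finditer(protein.upper())]


if __name__ == "__main__":
    # Peptide from the abstract, written without the position label.
    print(find_sequons("TVSWNSGAL"))
    print(find_sequons("AANGTNPSA"))
```

A pattern hit only marks a potential site; whether it is glycosylated in a given protein is an experimental question.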
CODEHOP (COnsensus-DEgenerate Hybrid Oligonucleotide Primer) PCR primer design
Rose, Timothy M.; Henikoff, Jorja G.; Henikoff, Steven
2003-01-01
We have developed a new primer design strategy for PCR amplification of distantly related gene sequences based on consensus-degenerate hybrid oligonucleotide primers (CODEHOPs). An interactive program has been written to design CODEHOP PCR primers from conserved blocks of amino acids within multiply-aligned protein sequences. Each CODEHOP consists of a pool of related primers containing all possible nucleotide sequences encoding 3–4 highly conserved amino acids within a 3′ degenerate core. A longer 5′ non-degenerate clamp region contains the most probable nucleotide predicted for each flanking codon. CODEHOPs are used in PCR amplification to isolate distantly related sequences encoding the conserved amino acid sequence. The primer design software and the CODEHOP PCR strategy have been utilized for the identification and characterization of new gene orthologs and paralogs in different plant, animal and bacterial species. In addition, this approach has been successful in identifying new pathogen species. The CODEHOP designer (http://blocks.fhcrc.org/codehop.html) is linked to BlockMaker and the Multiple Alignment Processor within the Blocks Database World Wide Web (http://blocks.fhcrc.org). PMID:12824413
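The size of the primer pool implied by a degenerate core can be estimated by multiplying the number of codons for each conserved amino acid. The sketch below is a simplified illustration under stated assumptions, not the CODEHOP program's algorithm: it uses the standard genetic code, counts every synonymous codon in full, and the function name is hypothetical.

```python
from collections import Counter
from itertools import product
from math import prod

# Standard genetic code in the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}
# Number of synonymous codons per amino acid (e.g. L -> 6, M -> 1).
CODONS_PER_AA = Counter(CODON_TABLE.values())


def core_degeneracy(peptide: str) -> int:
    """Count the distinct nucleotide sequences that encode a peptide."""
    for aa in peptide:
        if aa == "*" or aa not in CODONS_PER_AA:
            raise ValueError(f"not a standard amino acid: {aa!r}")
    return prod(CODONS_PER_AA[aa] for aa in peptide)


if __name__ == "__main__":
    for core in ("WDMY", "LSR"):
        print(core, core_degeneracy(core))
```

The contrast between a core such as WDMY and one such as LSR shows why the choice of conserved block matters: residues with six codons inflate the pool quickly.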
Nucleotide sequence of the gene encoding the nitrogenase iron protein of Thiobacillus ferrooxidans
DOE Office of Scientific and Technical Information (OSTI.GOV)
Pretorius, I.M.; Rawlings, D.E.; O'Neill, E.G.
1987-01-01
The DNA sequence was determined for the cloned Thiobacillus ferrooxidans nifH and part of the nifD genes. The DNA chains were radiolabeled with [α-³²P]dCTP (3000 Ci/mmol) or [α-³⁵S]dCTP (400 Ci/mmol). A putative T. ferrooxidans nifH promoter was identified whose sequences showed perfect consensus with those of the Klebsiella pneumoniae nif promoter. Two putative consensus upstream activator sequences were also identified. The amino acid sequence was deduced from the DNA sequence. In a comparison of nifH DNA sequences from T. ferrooxidans and eight other nitrogen-fixing microbes, a Rhizobium sp. isolated from Parasponia andersonii showed the greatest homology (74%) and Clostridium pasteurianum (nifH1) showed the least homology (54%). In the comparison of the amino acid sequences of the Fe proteins, the Rhizobium sp. and Rhizobium japonicum showed the greatest homology (both 86%) and C. pasteurianum (nifH1 gene product) demonstrated the least homology (56%) to the T. ferrooxidans Fe protein.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Crooks, Gavin E.
WebLogo is a web-based application designed to make the generation of sequence logos as easy and painless as possible. Sequence logos are a graphical representation of an amino acid or nucleic acid multiple sequence alignment developed by Tom Schneider and Mike Stephens. Each logo consists of stacks of symbols, one stack for each position in the sequence. The overall height of the stack indicates the sequence conservation at that position, while the height of symbols within the stack indicates the relative frequency of each amino or nucleic acid at that position. In general, a sequence logo provides a richer and more precise description of, for example, a binding site, than would a consensus sequence.
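The stack heights described above can be sketched as information content per alignment column: the maximum possible entropy minus the observed Shannon entropy, with each symbol's height proportional to its frequency. This is a minimal illustration, not WebLogo's code. The function names are hypothetical, and it assumes an ungapped alignment and omits the small-sample correction used in practice.

```python
from collections import Counter
from math import log2


def column_information(column: str, alphabet_size: int = 4) -> float:
    """Information content of one column in bits.

    Computed as log2(alphabet_size) minus the Shannon entropy of the
    observed symbol frequencies (no small-sample correction).
    """
    total = len(column)
    counts = Counter(column)
    entropy = -sum((n / total) * log2(n / total) for n in counts.values())
    return log2(alphabet_size) - entropy


def logo_heights(alignment, alphabet_size: int = 4):
    """Return, per column, a dict mapping each symbol to its stack height."""
    heights = []
    for column in zip(*alignment):
        info = column_information("".join(column), alphabet_size)
        counts = Counter(column)
        heights.append({s: info * n / len(column) for s, n in counts.items()})
    return heights


if __name__ == "__main__":
    for position, stack in enumerate(logo_heights(["AC", "AG"]), start=1):
        print(position, stack)
```

For protein alignments, alphabet_size would be 20, giving a maximum of about 4.32 bits per position.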
Stephens, E B; Mukherjee, S; Sahni, M; Zhuge, W; Raghavan, R; Singh, D K; Leung, K; Atkinson, B; Li, Z; Joag, S V; Liu, Z Q; Narayan, O
1997-05-12
We have examined both the sequence changes in the LTR, gag, vif, vpr, vpx, tat, rev, vpu, env, and nef genes and the cell tropism of a cell-free stock of chimeric simian-human immunodeficiency virus (SHIV) isolated from the cerebrospinal fluid of a pig-tailed macaque (PNb) that developed AIDS. This virus (SHIVKU-1) is highly pathogenic when inoculated into other macaques. DNA sequence analysis of PCR-amplified products revealed a total of 5 nucleotide changes in the LTR while vif had 2 consensus amino acid changes. The gag, vif, and vpx had no consensus amino acid substitutions, whereas vpr had 1 consensus substitution. The tat and rev genes of the HXB2 region of SHIVKU-1 had 2 and 1 consensus amino acid changes, respectively. The vpu gene of the HXB2 region of SHIV, which originally had an ACG at the beginning of the gene, reverted to an initiation ATG codon and in addition contained a consensus amino acid substitution at position 69 of this protein. As expected, the majority of the nucleotide substitutions were found in the env and nef genes. Thirteen and 5 amino acid changes were predicted for the corresponding Env and Nef proteins, respectively. In addition, one-third of the env gene clones isolated from the SHIVKU-1 stock had a 5-amino-acid deletion in the V4 region. Using three independent assays, we determined that the changes in the SHIVKU-1 were associated with an increase in the efficiency of replication in macrophages. The strikingly few consensus changes in the virus suggest that conversion of this virus to one capable of causing AIDS in pig-tailed macaques was associated with relatively few changes in the viral envelope and/or accessory genes. These results will provide the basis for the development of a pathogenic, molecular clone of SHIV capable of causing AIDS in pig-tailed macaques.
Kozutsumi, Daisuke; Tsunematsu, Masako; Yamaji, Taketo; Kino, Kohsuke
2007-01-01
Cry-consensus peptide is a linearly linked peptide of T-cell epitopes for the management of Japanese cedar (JC) pollinosis and is expected to become a new drug for immunotherapy. However, the mechanism of T-cell epitopes in allergic diseases is not well understood, and thus, a simple in vitro procedure for evaluation of its biological activity is desired. Peripheral blood mononuclear cells (PBMC) were isolated from 27 JC pollinosis patients and 10 healthy subjects, and cultured in vitro for 4 days in the presence of Cry-consensus peptide and [³H]-thymidine. The relationship between growth stimulation (stimulation index; SI) and antigen-specific IgE levels in serum was also investigated in JC pollinosis patients. Moreover, to confirm the importance of the primary sequence in Cry-consensus peptide, heat-treated Cry-consensus peptide and a mixture of the amino acids of which Cry-consensus peptide is composed were tested, and their [³H]-thymidine uptake was compared with that of Cry-consensus peptide. Finally, whether Cry-consensus peptide stimulates PBMCs from healthy subjects was investigated. The mean SI of JC patients showed a good correlation with Cry-consensus peptide concentration in the culture medium; however, the SI was independent of the anti-Cry j 1 IgE level. Heat-denatured Cry-consensus peptide retained a PBMC proliferation stimulatory effect comparable to the original Cry-consensus peptide, while the mixture of amino acids constituting Cry-consensus peptide did not stimulate PBMC proliferation. PBMCs from healthy subjects did not respond to Cry-consensus peptide at all. These data indicate that the PBMC response of patients suffering from JC pollinosis to Cry-consensus peptide is specific for the sequence of T cell epitopes thereof and may be useful for the evaluation of the efficacy of Cry-consensus peptide in vivo.
Hydroxyapatite-binding peptides for bone growth and inhibition
Bertozzi, Carolyn R [Berkeley, CA]; Song, Jie [Shrewsbury, MA]; Lee, Seung-Wuk [Walnut Creek, CA]
2011-09-20
Hydroxyapatite (HA)-binding peptides are selected using combinatorial phage library display. Pseudo-repetitive consensus amino acid sequences possessing periodic hydroxyl side chains in every two or three amino acid sequences are obtained. These sequences resemble the (Gly-Pro-Hyp)x repeat of human type I collagen, a major component of extracellular matrices of natural bone. A consistent presence of basic amino acid residues is also observed. The peptides are synthesized by the solid-phase synthetic method and then used for template-driven HA-mineralization. Microscopy reveals that the peptides template the growth of polycrystalline HA crystals ~40 nm in size.
Cloning and sequence analysis of the invertase gene INV 1 from the yeast Pichia anomala.
Pérez, J A; Rodríguez, J; Rodríguez, L; Ruiz, T
1996-02-01
A genomic library from the yeast Pichia anomala has been constructed and employed to clone the gene encoding the sucrose-hydrolysing enzyme invertase by complementation of a sucrose non-fermenting mutant of Saccharomyces cerevisiae. The cloned gene, INV1, was sequenced and found to encode a polypeptide of 550 amino acids which contained a 22 amino-acid signal sequence and ten potential glycosylation sites. The amino-acid sequence shows significant identity with other yeast invertases and also with Kluyveromyces marxianus inulinase, a yeast beta-fructofuranosidase which has a different substrate specificity. The nucleotide sequences of the 5' and 3' non-coding regions were found to contain several consensus motifs probably involved in the initiation and termination of gene transcription.
Provencher, Cathy; LaPointe, Gisèle; Sirois, Stéphane; Van Calsteren, Marie-Rose; Roy, Denis
2003-01-01
A primer design strategy named CODEHOP (consensus-degenerate hybrid oligonucleotide primer) for amplification of distantly related sequences was used to detect the priming glycosyltransferase (GT) gene in strains of the Lactobacillus casei group. Each hybrid primer consisted of a short 3′ degenerate core based on four highly conserved amino acids and a longer 5′ consensus clamp region based on six sequences of the priming GT gene products from exopolysaccharide (EPS)-producing bacteria. The hybrid primers were used to detect the priming GT gene of 44 commercial isolates and reference strains of Lactobacillus rhamnosus, L. casei, Lactobacillus zeae, and Streptococcus thermophilus. The priming GT gene was detected in the genome of both non-EPS-producing (EPS−) and EPS-producing (EPS+) strains of L. rhamnosus. The sequences of the cloned PCR products were similar to those of the priming GT gene of various gram-negative and gram-positive EPS+ bacteria. Specific primers designed from the L. rhamnosus RW-9595M GT gene were used to sequence the end of the priming GT gene in selected EPS+ strains of L. rhamnosus. Phylogenetic analysis revealed that Lactobacillus spp. form a distinctive group apart from other lactic acid bacteria for which GT genes have been characterized to date. Moreover, the sequences show a divergence existing among strains of L. rhamnosus with respect to the terminal region of the priming GT gene. Thus, the PCR approach with consensus-degenerate hybrid primers designed with CODEHOP is a practical approach for the detection of similar genes containing conserved motifs in different bacterial genomes. PMID:12788729
USDA-ARS's Scientific Manuscript database
Lipase gene (lip) of a biodegradable polyhydroxyalkanoate- (PHA-) synthesizing bacterium P. resinovorans NRRL B-2649 was cloned, sequenced and characterized by using consensus primers and PCR-based genome walking method. The ORF of the putative Lip (314 amino acids) and its active site (Ser111, Asp...
Mosaic protein and nucleic acid vaccines against hepatitis C virus
Yusim, Karina; Korber, Bette T. M.; Kuiken, Carla L.; Fischer, William M.
2013-06-11
The invention relates to immunogenic compositions useful as HCV vaccines. Provided are HCV mosaic polypeptide and nucleic acid compositions which provide higher levels of T-cell epitope coverage while minimizing the occurrence of unnatural and rare epitopes compared to natural HCV polypeptides and consensus HCV sequences.
Presence of a consensus DNA motif at nearby DNA sequence of the mutation susceptible CG nucleotides.
Chowdhury, Kaushik; Kumar, Suresh; Sharma, Tanu; Sharma, Ankit; Bhagat, Meenakshi; Kamai, Asangla; Ford, Bridget M; Asthana, Shailendra; Mandal, Chandi C
2018-01-10
Complexity in tissues affected by cancer arises from somatic mutations and epigenetic modifications in the genome. The mutation susceptible hotspots present within the genome indicate a non-random nature and/or a position specific selection of mutation. An association exists between the occurrence of mutations and epigenetic DNA methylation. This study is primarily aimed at determining mutation status, and identifying a signature for predicting mutation prone zones of tumor suppressor (TS) genes. Nearby sequences from the top five positions having a higher mutation frequency in each gene of 42 TS genes were selected from the COSMIC database and were considered as mutation prone zones. The conserved motifs present in the mutation prone DNA fragments were identified. Molecular docking studies were done to determine putative interactions between the identified conserved motifs and the methyltransferase enzyme DNMT1. Collective analysis of 42 TS genes found GC as the most commonly replaced and AT as the most commonly formed residues after mutation. Analysis of the top 5 mutated positions of each gene (210 DNA segments for 42 TS genes) identified that CG nucleotides of the amino acid codons (e.g., Arginine) are most susceptible to mutation, and found a consensus DNA "T/AGC/GAGGA/TG" sequence present in these mutation prone DNA segments. Similar to TS genes, analysis of 54 oncogenes not only found CG nucleotides of the amino acid Arg as the most susceptible to mutation, but also identified the presence of similar consensus DNA motifs in the mutation prone DNA fragments (270 DNA segments for 54 oncogenes) of oncogenes. Docking studies suggested that when DNMT1 binds to and methylates this consensus DNA motif (C residues of CpG islands), mutation is likely to occur.
Thus, this study proposes that DNMT1 mediated methylation in chromosomal DNA may decrease if a foreign DNA segment containing this consensus sequence along with CG nucleotides is exogenously introduced to dividing cancer cells. Copyright © 2017 Elsevier B.V. All rights reserved.
Homologues of insulinase, a new superfamily of metalloendopeptidases.
Rawlings, N D; Barrett, A J
1991-01-01
On the basis of a statistical analysis of an alignment of the amino acid sequences, a new superfamily of metalloendopeptidases is proposed, consisting of human insulinase, Escherichia coli protease III and mitochondrial processing endopeptidases from Saccharomyces and Neurospora. These enzymes do not contain the 'HEXXH' consensus sequence found in all previously recognized zinc metalloendopeptidases. PMID:2025223
Bowen, David M; Lewis, Jessica A; Lu, Wenzhe; Schein, Catherine H
2012-09-14
Designing proteins that reflect the natural variability of a pathogen is essential for developing novel vaccines and drugs. Flaviviruses, including Dengue (DENV) and West Nile (WNV), evolve rapidly and can "escape" neutralizing monoclonal antibodies by mutation. Designing antigens that represent many distinct strains is important for DENV, where infection with a strain from one of the four serotypes may lead to severe hemorrhagic disease on subsequent infection with a strain from another serotype. Here, a DENV physicochemical property (PCP)-consensus sequence was derived from 671 unique sequences from the Flavitrack database. PCP-consensus proteins for domain 3 of the envelope protein (EdomIII) were expressed from synthetic genes in Escherichia coli. The ability of the purified consensus proteins to bind polyclonal antibodies generated in response to infection with strains from each of the four DENV serotypes was determined. The initial consensus protein bound antibodies from DENV-1-3 in ELISA and Western blot assays. This sequence was altered in 3 steps to incorporate regions of maximum variability, identified as significant changes in the PCPs, characteristic of DENV-4 strains. The final protein was recognized by antibodies against all four serotypes. Two amino acids essential for efficient binding to all DENV antibodies are part of a discontinuous epitope previously defined for a neutralizing monoclonal antibody. The PCP-consensus method can significantly reduce the number of experiments required to define a multivalent antigen, which is particularly important when dealing with pathogens that must be tested at higher biosafety levels. Copyright © 2012 Elsevier Ltd. All rights reserved.
Brain cDNA clone for human cholinesterase
DOE Office of Scientific and Technical Information (OSTI.GOV)
McTiernan, C.; Adkins, S.; Chatonnet, A.
1987-10-01
A cDNA library from human basal ganglia was screened with oligonucleotide probes corresponding to portions of the amino acid sequence of human serum cholinesterase. Five overlapping clones, representing 2.4 kilobases, were isolated. The sequenced cDNA contained 207 base pairs of coding sequence 5' to the amino terminus of the mature protein in which there were four ATG translation start sites in the same reading frame as the protein. Only the ATG coding for Met-(-28) lay within a favorable consensus sequence for functional initiators. There were 1722 base pairs of coding sequence corresponding to the protein found circulating in human serum. The amino acid sequence deduced from the cDNA exactly matched the 574 amino acid sequence of human serum cholinesterase, as previously determined by Edman degradation. Therefore, our clones represented cholinesterase rather than acetylcholinesterase. It was concluded that the amino acid sequences of cholinesterase from two different tissues, human brain and human serum, were identical. Hybridization of genomic DNA blots suggested that a single gene, or very few genes coded for cholinesterase.
Klein, Wolfgang; Westendorf, Carolin; Schmidt, Antje; Conill-Cortés, Mercè; Rutz, Claudia; Blohs, Marcus; Beyermann, Michael; Protze, Jonas; Krause, Gerd; Krause, Eberhard; Schülein, Ralf
2015-01-01
The cyclodepsipeptide cotransin was described to inhibit the biosynthesis of a small subset of proteins by a signal sequence-discriminatory mechanism at the Sec61 protein-conducting channel. However, it was not clear how selective cotransin is, i.e. how many proteins are sensitive. Moreover, a consensus motif in signal sequences mediating cotransin sensitivity has not yet been described. To address these questions, we performed a proteomic study using cotransin-treated human hepatocellular carcinoma cells and the stable isotope labelling by amino acids in cell culture technique in combination with quantitative mass spectrometry. We used a saturating concentration of cotransin (30 micromolar) to also identify less-sensitive proteins and to discriminate the latter from completely resistant proteins. We found that the biosynthesis of almost all secreted proteins was cotransin-sensitive under these conditions. In contrast, biosynthesis of the majority of the integral membrane proteins was cotransin-resistant. Cotransin sensitivity of signal sequences was neither related to their length nor to their hydrophobicity. Instead, in the case of signal anchor sequences, we identified for the first time a conformational consensus motif mediating cotransin sensitivity. PMID:25806945
AMS 4.0: consensus prediction of post-translational modifications in protein sequences.
Plewczynski, Dariusz; Basu, Subhadip; Saha, Indrajit
2012-08-01
We present here the 2011 update of the AutoMotif Service (AMS 4.0) that predicts a wide selection of 88 different types of single amino acid post-translational modifications (PTM) in protein sequences. The selection of experimentally confirmed modifications is acquired from the latest UniProt and Phospho.ELM databases for training. The sequence vicinity of each modified residue is represented using amino acid physico-chemical features encoded using high quality indices (HQI) obtained by automatic clustering of known indices extracted from the AAindex database. For each type of numerical representation, the method builds an ensemble of Multi-Layer Perceptron (MLP) pattern classifiers, each optimising different objectives during the training (for example the recall, precision or area under the ROC curve (AUC)). The consensus is built using brainstorming technology, which combines multi-objective instances of machine learning algorithm, and the data fusion of different training objects representations, in order to boost the overall prediction accuracy of conserved short sequence motifs. The performance of AMS 4.0 is compared with the accuracy of previous versions, which were constructed using single machine learning methods (artificial neural networks, support vector machine). Our software improves the average AUC score of the earlier version by close to 7 % as calculated on the test datasets of all 88 PTM types. Moreover, for the selected most-difficult sequence motifs types it is able to improve the prediction performance by almost 32 %, when compared with previously used single machine learning methods. Summarising, the brainstorming consensus meta-learning methodology on the average boosts the AUC score up to around 89 %, averaged over all 88 PTM types. Detailed results for single machine learning methods and the consensus methodology are also provided, together with the comparison to previously published methods and state-of-the-art software tools. 
The source code and precompiled binaries of brainstorming tool are available at http://code.google.com/p/automotifserver/ under Apache 2.0 licensing.
Cofactor specificity switch in Shikimate dehydrogenase by rational design and consensus engineering.
García-Guevara, Fernando; Bravo, Iris; Martínez-Anaya, Claudia; Segovia, Lorenzo
2017-08-01
Consensus engineering has been used to design more stable variants using the most frequent amino acid at each site of a multiple sequence alignment; sometimes consensus engineering modifies function, but efforts have mainly been focused on studying stability. Here we constructed a consensus Rossmann domain for the Shikimate dehydrogenase enzyme; separately we decided to switch the cofactor specificity through rational design in the Escherichia coli Shikimate dehydrogenase enzyme and then analyzed the effect of consensus mutations on top of our design. We found that consensus mutations closest to the 2' adenine moiety increased the activity in our design. Consensus engineering has been shown to result in more stable proteins and our findings suggest it could also be used as a complementary tool for increasing or modifying enzyme activity during design. © The Author 2017. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
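The core operation of consensus engineering as described above, taking the most frequent amino acid at each site of a multiple sequence alignment, can be sketched in a few lines. This is an illustrative sketch rather than the authors' procedure. The function name is hypothetical, gaps are ignored when counting, and ties are broken alphabetically, both of which are assumptions made here for determinism.

```python
from collections import Counter


def consensus_sequence(alignment, gap: str = "-") -> str:
    """Return the per-column majority residue of an aligned sequence set.

    Assumptions (illustrative): all sequences have equal length, gap
    characters are not counted, a column consisting only of gaps yields
    a gap, and ties go to the alphabetically first residue.
    """
    if len({len(seq) for seq in alignment}) != 1:
        raise ValueError("alignment must be non-empty with equal-length rows")
    consensus = []
    for column in zip(*alignment):
        counts = Counter(res for res in column if res != gap)
        if not counts:
            consensus.append(gap)
            continue
        top = max(counts.values())
        consensus.append(min(res for res, n in counts.items() if n == top))
    return "".join(consensus)


if __name__ == "__main__":
    print(consensus_sequence(["MKTA", "MRTA", "MKSA"]))
```

Because gaps are ignored, a column that is mostly gaps still yields a residue; a stricter variant might require a minimum column occupancy before calling one.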
Sequence and structural analyses of nuclear export signals in the NESdb database
Xu, Darui; Farmer, Alicia; Collett, Garen; Grishin, Nick V.; Chook, Yuh Min
2012-01-01
We compiled >200 nuclear export signal (NES)–containing CRM1 cargoes in a database named NESdb. We analyzed the sequences and three-dimensional structures of natural, experimentally identified NESs and of false-positive NESs that were generated from the database in order to identify properties that might distinguish the two groups of sequences. Analyses of amino acid frequencies, sequence logos, and agreement with existing NES consensus sequences revealed strong preferences for the Φ1-X3-Φ2-X2-Φ3-X-Φ4 pattern and for negatively charged amino acids in the nonhydrophobic positions of experimentally identified NESs but not of false positives. Strong preferences against certain hydrophobic amino acids in the hydrophobic positions were also revealed. These findings led to a new and more precise NES consensus. More important, three-dimensional structures are now available for 68 NESs within 56 different cargo proteins. Analyses of these structures showed that experimentally identified NESs are more likely than the false positives to adopt α-helical conformations that transition to loops at their C-termini and more likely to be surface accessible within their protein domains or be present in disordered or unobserved parts of the structures. Such distinguishing features for real NESs might be useful in future NES prediction efforts. Finally, we also tested CRM1-binding of 40 NESs that were found in the 56 structures. We found that 16 of the NES peptides did not bind CRM1, hence illustrating how NESs are easily misidentified. PMID:22833565
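The Φ1-X3-Φ2-X2-Φ3-X-Φ4 spacing pattern mentioned above can be expressed as a regular expression once a set of hydrophobic residues is chosen for Φ. The abstract does not list that set, so the sketch below assumes L, I, V, F and M purely for illustration, and the function name is hypothetical. As the abstract stresses, a pattern match alone is prone to false positives and is not evidence of a functional NES.

```python
import re

# Assumed hydrophobic set for the Phi positions (not taken from the abstract).
PHI = "[LIVFM]"

# Phi1-X3-Phi2-X2-Phi3-X-Phi4, wrapped in a lookahead so that
# overlapping candidate sites are all reported.
NES_PATTERN = re.compile(rf"(?=({PHI}.{{3}}{PHI}.{{2}}{PHI}.{PHI}))")


def find_nes_candidates(protein: str) -> list:
    """Return (1-based start, matched segment) for each pattern match."""
    return [(m.start() + 1, m.group(1)) for m in NES_PATTERN.finditer(protein.upper())]


if __name__ == "__main__":
    for start, segment in find_nes_candidates("GSLALKLAGLDINK"):
        print(start, segment)
```

Filters of the kind the study describes, such as charge preferences at the X positions, helical propensity, and surface accessibility, would be applied after this step to reduce false positives.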
Zimmermann, Karel; Gibrat, Jean-François
2010-01-04
Sequence comparisons make use of a one-letter representation for amino acids, the necessary quantitative information being supplied by the substitution matrices. This paper deals with the problem of finding a representation that provides a comprehensive description of amino acid intrinsic properties consistent with the substitution matrices. We present a Euclidian vector representation of the amino acids, obtained by the singular value decomposition of the substitution matrices. The substitution matrix entries correspond to the dot product of amino acid vectors. We apply this vector encoding to the study of the relative importance of various amino acid physicochemical properties upon the substitution matrices. We also characterize and compare the PAM and BLOSUM series substitution matrices. This vector encoding introduces a Euclidian metric in the amino acid space, consistent with substitution matrices. Such a numerical description of the amino acid is useful when intrinsic properties of amino acids are necessary, for instance, building sequence profiles or finding consensus sequences, using machine learning algorithms such as Support Vector Machine and Neural Networks algorithms.
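One way to read the statement that matrix entries correspond to dot products of amino acid vectors is through the spectral decomposition of a symmetric matrix. The derivation below is a sketch under the assumption that the substitution matrix S is symmetric; the symbols Q, λ, and v are introduced here for illustration and are not taken from the paper.

```latex
% Assumption: S is a symmetric 20 x 20 substitution matrix.
S = Q \Lambda Q^{\top}, \qquad \Lambda = \operatorname{diag}(\lambda_1, \dots, \lambda_{20})

% Assign amino acid i the vector with components
(v_i)_k = \sqrt{|\lambda_k|}\; Q_{ik}

% If every eigenvalue is non-negative, entries are ordinary dot products:
v_i \cdot v_j = \sum_k \lambda_k\, Q_{ik} Q_{jk} = S_{ij}

% If some eigenvalues are negative, a sign per component is required:
S_{ij} = \sum_k \operatorname{sgn}(\lambda_k)\,(v_i)_k (v_j)_k
```

For a symmetric matrix the singular values equal the absolute values of the eigenvalues, which is why a singular value decomposition yields the same vectors up to sign. How the paper treats negative eigenvalues is not stated in the abstract.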
Antell, Gregory C.; Zhong, Wen; Kercher, Katherine; Passic, Shendra; Williams, Jean; Liu, Yucheng; James, Tony; Jacobson, Jeffrey M.; Szep, Zsofia
2017-01-01
Vpr is an HIV-1 accessory protein that plays numerous roles during viral replication, some of which are cell type dependent. To test the hypothesis that HIV-1 tropism extends beyond the envelope into the vpr gene, studies were performed to identify the associations between coreceptor usage and Vpr variation in HIV-1-infected patients. Colinear HIV-1 Env-V3 and Vpr amino acid sequences were obtained from the LANL HIV-1 sequence database and from well-suppressed patients in the Drexel/Temple Medicine CNS AIDS Research and Eradication Study (CARES) Cohort. Genotypic classification of Env-V3 sequences as X4 (CXCR4-utilizing) or R5 (CCR5-utilizing) was used to group colinear Vpr sequences. To reveal the sequences associated with a specific coreceptor usage genotype, Vpr amino acid sequences were assessed for amino acid diversity and Jensen-Shannon divergence between the two groups. Five amino acid alphabets were used to comprehensively examine the impact of amino acid substitutions involving side chains with similar physicochemical properties. Positions 36, 37, 41, 89, and 96 of Vpr were characterized by statistically significant divergence across multiple alphabets when X4 and R5 sequence groups were compared. In addition, consensus amino acid switches were found at positions 37 and 41 in comparisons of the R5 and X4 sequence populations. These results suggest an evolutionary link between Vpr and gp120 in HIV-1-infected patients. PMID:28620613
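The per-position comparison described above can be illustrated with a small Jensen-Shannon divergence calculation between the residue distributions of two sequence groups at one alignment column. This is a hedged sketch: the function names and the toy columns are hypothetical, the logarithm base (2, giving values between 0 and 1) is an assumption, and the study's significance testing and reduced amino acid alphabets are not reproduced.

```python
import math
from collections import Counter


def frequencies(residues):
    """Convert a list of residues at one position into a frequency dict."""
    counts = Counter(residues)
    total = sum(counts.values())
    return {aa: n / total for aa, n in counts.items()}


def jensen_shannon_divergence(p, q):
    """Jensen-Shannon divergence in bits between two discrete distributions."""
    keys = set(p) | set(q)
    # Mixture distribution M = (P + Q) / 2.
    m = {k: 0.5 * (p.get(k, 0.0) + q.get(k, 0.0)) for k in keys}

    def kl_to_mixture(a):
        # Kullback-Leibler divergence from a to the mixture; zero-probability terms drop out.
        return sum(a[k] * math.log2(a[k] / m[k]) for k in a if a[k] > 0)

    return 0.5 * kl_to_mixture(p) + 0.5 * kl_to_mixture(q)


# Hypothetical residues observed at one position in two groups of sequences.
group_a = frequencies(list("RRRKRQ"))
group_b = frequencies(list("IIILII"))
print(jensen_shannon_divergence(group_a, group_b))
```

Running the same calculation after mapping residues onto a reduced alphabet (for example, grouping by charge) would show whether a divergent position reflects a change in side-chain chemistry or only a swap between similar residues.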
Cloning and characterization of the gene encoding IMP dehydrogenase from Arabidopsis thaliana.
Collart, F R; Osipiuk, J; Trent, J; Olsen, G J; Huberman, E
1996-10-03
We have cloned and characterized the gene encoding inosine monophosphate dehydrogenase (IMPDH) from Arabidopsis thaliana (At). The transcription unit of the At gene spans approximately 1900 bp and specifies a protein of 503 amino acids with a calculated relative molecular mass (M(r)) of 54,190. The gene is comprised of a minimum of four introns and five exons with all donor and acceptor splice sequences conforming to previously proposed consensus sequences. The deduced IMPDH amino-acid sequence from At shows a remarkable similarity to other eukaryotic IMPDH sequences, with a 48% identity to human Type II enzyme. Allowing for conservative substitutions, the enzyme is 69% similar to human Type II IMPDH. The putative active-site sequence of At IMPDH conforms to the IMP dehydrogenase/guanosine monophosphate reductase motif and contains an essential active-site cysteine residue.
Bricheux, G; Brugerolle, G
1997-08-01
The parasitic protozoan Trichomonas vaginalis is known to contain the ubiquitous and highly conserved protein actin. A genomic library and a cDNA library have been screened to identify and clone the actin gene(s) of T. vaginalis. The nucleotide sequence of one gene and its flanking regions have been determined. The open reading frame encodes a protein of 376 amino acids. The sequence is not interrupted by any introns and the promoter could be represented by a 10 bp motif close to a consensus motif also found upstream of most sequenced T. vaginalis genes. The five different clones isolated from the cDNA library have similar sequences and encode three actin proteins differing only by one or two amino acids. A phylogenetic analysis of 31 actin sequences by distance matrix and parsimony methods, using centractin as outgroup, gives congruent trees with Parabasala branching above Diplomonadida.
Detection of the CLOCK/BMAL1 heterodimer using a nucleic acid probe with cycling probe technology.
Nakagawa, Kazuhiro; Yamamoto, Takuro; Yasuda, Akio
2010-09-15
An isothermal signal amplification technique for specific DNA sequences, known as cycling probe technology (CPT), has enabled rapid acquisition of genomic information. Here we report an analogous technique for the detection of an activated transcription factor, a transcription element-binding assay with fluorescent amplification by apurinic/apyrimidinic (AP) site lysis cycle (TEFAL). This simple amplification assay can detect activated transcription factors by using a unique nucleic acid probe containing a consensus binding sequence and an AP site, which enables the CPT reaction with AP endonuclease. In this article, we demonstrate that this method detects the functional CLOCK/BMAL1 heterodimer via the TEFAL probe containing the E-box consensus sequence to which the CLOCK/BMAL1 heterodimer binds. Using TEFAL combined with immunoassays, we measured oscillations in the amount of CLOCK/BMAL1 heterodimer in serum-stimulated HeLa cells. Furthermore, we succeeded in measuring the circadian accumulation of the functional CLOCK/BMAL1 heterodimer in human buccal mucosa cells. TEFAL contributes greatly to the study of transcription factor activation in mammalian tissues and cell extracts and is a powerful tool for less invasive investigation of human circadian rhythms. 2010 Elsevier Inc. All rights reserved.
PipeOnline 2.0: automated EST processing and functional data sorting.
Ayoubi, Patricia; Jin, Xiaojing; Leite, Saul; Liu, Xianghui; Martajaja, Jeson; Abduraham, Abdurashid; Wan, Qiaolan; Yan, Wei; Misawa, Eduardo; Prade, Rolf A
2002-11-01
Expressed sequence tags (ESTs) are generated and deposited in the public domain, as redundant, unannotated, single-pass reactions, with virtually no biological content. PipeOnline automatically analyses and transforms large collections of raw DNA-sequence data from chromatograms or FASTA files by calling the quality of bases, screening and removing vector sequences, assembling and rewriting consensus sequences of redundant input files into a unigene EST data set and finally through translation, amino acid sequence similarity searches, annotation of public databases and functional data. PipeOnline generates an annotated database, retaining the processed unigene sequence, clone/file history, alignments with similar sequences, and proposed functional classification, if available. Functional annotation is automatic and based on a novel method that relies on homology of amino acid sequence multiplicity within GenBank records. Records are examined through a function ordered browser or keyword queries with automated export of results. PipeOnline offers customization for individual projects (MyPipeOnline), automated updating and alert service. PipeOnline is available at http://stress-genomics.org.
Deiana, Antonio; Giansanti, Andrea
2010-04-21
Natively unfolded proteins lack a well defined three dimensional structure but have important biological functions, suggesting a re-assignment of the structure-function paradigm. To assess that a given protein is natively unfolded requires laborious experimental investigations, then reliable sequence-only methods for predicting whether a sequence corresponds to a folded or to an unfolded protein are of interest in fundamental and applicative studies. Many proteins have amino acidic compositions compatible both with the folded and unfolded status, and belong to a twilight zone between order and disorder. This makes difficult a dichotomic classification of protein sequences into folded and natively unfolded ones. In this work we propose an operational method to identify proteins belonging to the twilight zone by combining into a consensus score good performing single predictors of folding. In this methodological paper dichotomic folding indexes are considered: hydrophobicity-charge, mean packing, mean pairwise energy, Poodle-W and a new global index, that is called here gVSL2, based on the local disorder predictor VSL2. The performance of these indexes is evaluated on different datasets, in particular on a new dataset composed by 2369 folded and 81 natively unfolded proteins. Poodle-W, gVSL2 and mean pairwise energy have good performance and stability in all the datasets considered and are combined into a strictly unanimous combination score SSU, that leaves proteins unclassified when the consensus of all combined indexes is not reached. The unclassified proteins: i) belong to an overlap region in the vector space of amino acidic compositions occupied by both folded and unfolded proteins; ii) are composed by approximately the same number of order-promoting and disorder-promoting amino acids; iii) have a mean flexibility intermediate between that of folded and that of unfolded proteins. Our results show that proteins unclassified by SSU belong to a twilight zone. 
Proteins left unclassified by the consensus score SSU have physical properties intermediate between those of folded and those of natively unfolded proteins and their structural properties and evolutionary history are worth to be investigated.
2010-01-01
Background Natively unfolded proteins lack a well defined three dimensional structure but have important biological functions, suggesting a re-assignment of the structure-function paradigm. To assess that a given protein is natively unfolded requires laborious experimental investigations, then reliable sequence-only methods for predicting whether a sequence corresponds to a folded or to an unfolded protein are of interest in fundamental and applicative studies. Many proteins have amino acidic compositions compatible both with the folded and unfolded status, and belong to a twilight zone between order and disorder. This makes difficult a dichotomic classification of protein sequences into folded and natively unfolded ones. In this work we propose an operational method to identify proteins belonging to the twilight zone by combining into a consensus score good performing single predictors of folding. Results In this methodological paper dichotomic folding indexes are considered: hydrophobicity-charge, mean packing, mean pairwise energy, Poodle-W and a new global index, that is called here gVSL2, based on the local disorder predictor VSL2. The performance of these indexes is evaluated on different datasets, in particular on a new dataset composed by 2369 folded and 81 natively unfolded proteins. Poodle-W, gVSL2 and mean pairwise energy have good performance and stability in all the datasets considered and are combined into a strictly unanimous combination score SSU, that leaves proteins unclassified when the consensus of all combined indexes is not reached. The unclassified proteins: i) belong to an overlap region in the vector space of amino acidic compositions occupied by both folded and unfolded proteins; ii) are composed by approximately the same number of order-promoting and disorder-promoting amino acids; iii) have a mean flexibility intermediate between that of folded and that of unfolded proteins. 
Conclusions Our results show that proteins unclassified by SSU belong to a twilight zone. Proteins left unclassified by the consensus score SSU have physical properties intermediate between those of folded and those of natively unfolded proteins and their structural properties and evolutionary history are worth to be investigated. PMID:20409339
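The strictly unanimous combination rule described above, where a protein is left unclassified unless all combined indexes agree, can be sketched as follows. The function name, the label strings, and the toy predictor calls are assumptions for illustration; the actual indexes (Poodle-W, gVSL2, mean pairwise energy) each require their own scoring and thresholds, which are not reproduced here.

```python
def strictly_unanimous(calls):
    """Combine binary predictor calls; return 'unclassified' unless all agree.

    `calls` maps a predictor name to 'folded' or 'unfolded'.
    """
    if not calls:
        raise ValueError("at least one predictor call is required")
    labels = set(calls.values())
    if len(labels) == 1:
        return labels.pop()
    return "unclassified"


# Hypothetical calls for two proteins; the values are invented for illustration.
protein_a = {"Poodle-W": "folded", "gVSL2": "folded", "mean pairwise energy": "folded"}
protein_b = {"Poodle-W": "folded", "gVSL2": "unfolded", "mean pairwise energy": "folded"}
print(strictly_unanimous(protein_a))
print(strictly_unanimous(protein_b))
```

Requiring unanimity trades coverage for reliability: fewer proteins receive a label, and the ones left over are the twilight-zone candidates the abstract describes.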
Nucleotide Sequence of the blaRTG-2 (CARB-5) Gene and Phylogeny of a New Group of Carbenicillinases
Choury, Daniele; Szajnert, Marie-France; Joly-Guillou, Marie-Laure; Azibi, Kemal; Delpech, Marc; Paul, Gérard
2000-01-01
We determined the nucleotide sequence of the bla gene for the Acinetobacter calcoaceticus β-lactamase previously described as CARB-5. Alignment of the deduced amino acid sequence with those of known β-lactamases revealed that CARB-5 possesses an RTG triad in box VII, as described for the Proteus mirabilis GN79 enzyme, instead of the RSG consensus characteristic of the other carbenicillinases. Phylogenetic studies showed that these RTG enzymes constitute a new, separate group, possibly ancestors of the carbenicillinase family. PMID:10722515
Jerjos, Michael; Hohman, Baily; Lauterbur, M. Elise; Kistler, Logan
2017-01-01
Several taxonomically distinct mammalian groups—certain microbats and cetaceans (e.g., dolphins)—share both morphological adaptations related to echolocation behavior and strong signatures of convergent evolution at the amino acid level across seven genes related to auditory processing. Aye-ayes (Daubentonia madagascariensis) are nocturnal lemurs with a specialized auditory processing system. Aye-ayes tap rapidly along the surfaces of trees, listening to reverberations to identify the mines of wood-boring insect larvae; this behavior has been hypothesized to functionally mimic echolocation. Here we investigated whether there are signals of convergence in auditory processing genes between aye-ayes and known mammalian echolocators. We developed a computational pipeline (Basic Exon Assembly Tool) that produces consensus sequences for regions of interest from shotgun genomic sequencing data for nonmodel organisms without requiring de novo genome assembly. We reconstructed complete coding region sequences for the seven convergent echolocating bat–dolphin genes for aye-ayes and another lemur. We compared sequences from these two lemurs in a phylogenetic framework with those of bat and dolphin echolocators and appropriate nonecholocating outgroups. Our analysis reaffirms the existence of amino acid convergence at these loci among echolocating bats and dolphins; some methods also detected signals of convergence between echolocating bats and both mice and elephants. However, we observed no significant signal of amino acid convergence between aye-ayes and echolocating bats and dolphins, suggesting that aye-aye tap-foraging auditory adaptations represent distinct evolutionary innovations. These results are also consistent with a developing consensus that convergent behavioral ecology does not reliably predict convergent molecular evolution. PMID:28810710
The Malarial Host-Targeting Signal Is Conserved in the Irish Potato Famine Pathogen
Liolios, Konstantinos; Win, Joe; Kanneganti, Thirumala-Devi; Young, Carolyn; Kamoun, Sophien; Haldar, Kasturi
2006-01-01
Animal and plant eukaryotic pathogens, such as the human malaria parasite Plasmodium falciparum and the potato late blight agent Phytophthora infestans, are widely divergent eukaryotic microbes. Yet they both produce secretory virulence and pathogenic proteins that alter host cell functions. In P. falciparum, export of parasite proteins to the host erythrocyte is mediated by leader sequences shown to contain a host-targeting (HT) motif centered on an RxLx (E, D, or Q) core: this motif appears to signify a major pathogenic export pathway with hundreds of putative effectors. Here we show that a secretory protein of P. infestans, which is perceived by plant disease resistance proteins and induces hypersensitive plant cell death, contains a leader sequence that is equivalent to the Plasmodium HT-leader in its ability to export fusion of green fluorescent protein (GFP) from the P. falciparum parasite to the host erythrocyte. This export is dependent on an RxLR sequence conserved in P. infestans leaders, as well as in leaders of all ten secretory oomycete proteins shown to function inside plant cells. The RxLR motif is also detected in hundreds of secretory proteins of P. infestans, Phytophthora sojae, and Phytophthora ramorum and has high value in predicting host-targeted leaders. A consensus motif further reveals E/D residues enriched within ~25 amino acids downstream of the RxLR, which are also needed for export. Together the data suggest that in these plant pathogenic oomycetes, a consensus HT motif may reside in an extended sequence of ~25–30 amino acids, rather than in a short linear sequence. Evidence is presented that although the consensus is much shorter in P. falciparum, information sufficient for vacuolar export is contained in a region of ~30 amino acids, which includes sequences flanking the HT core. Finally, positional conservation between Phytophthora RxLR and P. falciparum RxLx (E, D, Q) is consistent with the idea that the context of their presentation is constrained. These studies provide the first evidence to our knowledge that eukaryotic microbes share equivalent pathogenic HT signals and thus conserved mechanisms to access host cells across plant and animal kingdoms that may present unique targets for prophylaxis across divergent pathogens. PMID:16733545
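The two motifs named in this abstract, the P. falciparum RxLx(E/D/Q) core and the oomycete RxLR sequence with E/D residues enriched within about 25 amino acids downstream, can be expressed as simple pattern searches. The sketch below is illustrative: the function names, the 25-residue window default, and the toy sequences are assumptions, and a real predictor would also require a signal peptide and positional constraints that are not modeled here.

```python
import re

HT_CORE = r"R.L.[EDQ]"  # P. falciparum host-targeting core as written in the abstract
RXLR = r"R.LR"          # oomycete motif


def find_motif(protein, pattern):
    """Return (start, match) for every occurrence, overlapping ones included."""
    return [(m.start(), m.group(1)) for m in re.finditer(f"(?=({pattern}))", protein)]


def rxlr_with_acidic_count(protein, window=25):
    """For each RxLR hit, count E/D residues in the `window` residues after it."""
    results = []
    for start, match in find_motif(protein, RXLR):
        downstream = protein[start + len(match): start + len(match) + window]
        acidic = sum(downstream.count(res) for res in "ED")
        results.append((start, match, acidic))
    return results


# Hypothetical toy sequences, not real effector proteins.
print(find_motif("AARKLAEGG", HT_CORE))
print(rxlr_with_acidic_count("MKLSRALRAAAEEDDKK"))
```

Reporting the downstream acidic count alongside each hit reflects the abstract's point that the oomycete signal appears to span an extended region rather than the four-residue motif alone.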
Takai, T; Nishita, Y; Iguchi-Ariga, S M; Ariga, H
1994-01-01
We have previously reported the human cDNA encoding MSSP-1, a sequence-specific double- and single-stranded DNA binding protein [Negishi, Nishita, Saëgusa, Kakizaki, Galli, Kihara, Tamai, Miyajima, Iguchi-Ariga and Ariga (1994) Oncogene, 9, 1133-1143]. MSSP-1 binds to a DNA replication origin/transcriptional enhancer of the human c-myc gene and has turned out to be identical with Scr2, a human protein which complements the defect of cdc2 kinase in S.pombe [Kataoka and Nojima (1994) Nucleic Acid Res., 22, 2687-2693]. We have cloned the cDNA for MSSP-2, another member of the MSSP family of proteins. The MSSP-2 cDNA shares highly homologous sequences with MSSP-1 cDNA, except for the insertion of 48 bp coding 16 amino acids near the C-terminus. Like MSSP-1, MSSP-2 has RNP-1 consensus sequences. The results of the experiments using bacterially expressed MSSP-2, and its deletion mutants, as histidine fusion proteins suggested that the binding specificity of MSSP-2 to double- and single-stranded DNA is the same as that of MSSP-1, and that the RNP consensus sequences are required for the DNA binding of the protein. MSSP-2 stimulated the DNA replication of an SV40-derived plasmid containing the binding sequence for MSSP-1 or -2. MSSP-2 is hence suggested to play an important role in regulation of DNA replication. PMID:7838710
NASA Technical Reports Server (NTRS)
Funderburgh, J. L.; Funderburgh, M. L.; Brown, S. J.; Vergnes, J. P.; Hassell, J. R.; Mann, M. M.; Conrad, G. W.; Spooner, B. S. (Principal Investigator)
1993-01-01
Amino acid sequence from tryptic peptides of three different bovine corneal keratan sulfate proteoglycan (KSPG) core proteins (designated 37A, 37B, and 25) showed similarities to the sequence of a chicken KSPG core protein lumican. Bovine lumican cDNA was isolated from a bovine corneal expression library by screening with chicken lumican cDNA. The bovine cDNA codes for a 342-amino acid protein, M(r) 38,712, containing amino acid sequences identified in the 37B KSPG core protein. The bovine lumican is 68% identical to chicken lumican, with an 83% identity excluding the N-terminal 40 amino acids. Location of 6 cysteine and 4 consensus N-glycosylation sites in the bovine sequence were identical to those in chicken lumican. Bovine lumican had about 50% identity to bovine fibromodulin and 20% identity to bovine decorin and biglycan. About two-thirds of the lumican protein consists of a series of 10 amino acid leucine-rich repeats that occur in regions of calculated high beta-hydrophobic moment, suggesting that the leucine-rich repeats contribute to beta-sheet formation in these proteins. Sequences obtained from 37A and 25 core proteins were absent in bovine lumican, thus predicting a unique primary structure and separate mRNA for each of the three bovine KSPG core proteins.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chang, Soo-Ik; Hammes, G.G.
1989-11-01
Homology analyses of the protein sequences of chicken liver and rat mammary gland fatty acid synthases were carried out. The amino acid sequences of the chicken and rat enzymes are 67% identical. If conservative substitutions are allowed, 78% of the amino acids are matched. A region of low homologies exists between the functional domains, in particular around amino acid residues 1059-1264 of the chicken enzyme. Homologies between the active sites of chicken and rat and of chicken and yeast enzymes have been analyzed by an alignment method. A high degree of homology exists between the active sites of the chicken and rat enzymes. However, the chicken and yeast enzymes show a lower degree of homology. The NADPH-binding dinucleotide folds of the β-ketoacyl reductase and the enoyl reductase sites were identified by comparison with a known consensus sequence for the NADP- and FAD-binding dinucleotide folds. The active sites of all of the enzymes are primarily in hydrophobic regions of the protein. This study suggests that the genes for the functional domains of fatty acid synthase were originally separated, and these genes were connected to each other by using different connecting nucleotide sequences in different species. An alternative explanation for the differences in rat and chicken is a common ancestry and mutations in the joining regions during evolution.
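The distinction drawn above between percent identity (67%) and the match rate once conservative substitutions are allowed (78%) can be made concrete with a short calculation over two pre-aligned sequences. This is a sketch under stated assumptions: the conservative substitution groups, the choice to exclude gapped columns from the denominator, the function name, and the toy sequences are all illustrative and are not taken from the study.

```python
# Assumed conservative substitution groups; published analyses differ on these.
CONSERVATIVE_GROUPS = [set("ILVM"), set("FYW"), set("KRH"), set("DE"), set("ST"), set("NQ")]


def percent_identity_and_similarity(seq_a, seq_b, gap="-"):
    """Return (% identical, % identical or conservatively substituted).

    Both inputs must already be aligned; columns containing a gap are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = identical = similar = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue
        compared += 1
        if a == b:
            identical += 1
            similar += 1
        elif any(a in group and b in group for group in CONSERVATIVE_GROUPS):
            similar += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared, 100.0 * similar / compared


# Hypothetical aligned fragments, not the synthase sequences.
print(percent_identity_and_similarity("MKVLDE", "MRVIDQ"))
```

Because the denominator convention (ungapped columns, shorter sequence length, or full alignment length) changes the reported percentage, identity figures from different papers are only comparable when the convention is known.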
Titus, James K; Kay, Matthew K; Glaser, CDR Jacob J
2017-01-01
Snakebite envenomation is an important global health concern. The current standard treatment approach for snakebite envenomation relies on antibody-based antisera, which are expensive, not universally available, and can lead to adverse physiological effects. Phage display techniques offer a powerful tool for the selection of phage-expressed peptides, which can bind with high specificity and affinity towards venom components. In this research, the amino acid sequences of Phospholipase A2 (PLA2) from multiple cottonmouth species were analyzed, and a consensus peptide synthesized. Three phage display libraries were panned against this consensus peptide, crosslinked to capillary tubes, followed by a modified surface panning procedure. This high throughput selection method identified four phage clones with anti-PLA2 activity against Western cottonmouth venom, and the amino acid sequences of the displayed peptides were identified. This is the first report identifying short peptide sequences capable of inhibiting PLA2 activity of Western cottonmouth venom in vitro, using a phage display technique. Additionally, this report utilizes synthetic panning targets, designed using venom proteomic data, to mimic epitope regions. M13 phages displaying circular 7-mer or linear 12-mer peptides with antivenom activity may offer a novel alternative to traditional antibody-based therapy. PMID:29285351
Titus, James K; Kay, Matthew K; Glaser, Cdr Jacob J
2017-01-01
Snakebite envenomation is an important global health concern. The current standard treatment approach for snakebite envenomation relies on antibody-based antisera, which are expensive, not universally available, and can lead to adverse physiological effects. Phage display techniques offer a powerful tool for the selection of phage-expressed peptides, which can bind with high specificity and affinity towards venom components. In this research, the amino acid sequences of Phospholipase A2 (PLA2) from multiple cottonmouth species were analyzed, and a consensus peptide synthesized. Three phage display libraries were panned against this consensus peptide, crosslinked to capillary tubes, followed by a modified surface panning procedure. This high throughput selection method identified four phage clones with anti-PLA2 activity against Western cottonmouth venom, and the amino acid sequences of the displayed peptides were identified. This is the first report identifying short peptide sequences capable of inhibiting PLA2 activity of Western cottonmouth venom in vitro, using a phage display technique. Additionally, this report utilizes synthetic panning targets, designed using venom proteomic data, to mimic epitope regions. M13 phages displaying circular 7-mer or linear 12-mer peptides with antivenom activity may offer a novel alternative to traditional antibody-based therapy.
Chromosome-Encoded Broad-Spectrum Ambler Class A β-Lactamase RUB-1 from Serratia rubidaea
Didi, Jennifer; Ergani, Ayla; Lima, Sandra
2016-01-01
Whole-genome sequencing of Serratia rubidaea CIP 103234T revealed a chromosomally located Ambler class A β-lactamase gene. The gene was cloned, and the β-lactamase, RUB-1, was characterized. RUB-1 displayed 74% and 73% amino acid sequence identity with the GIL-1 and TEM-1 penicillinases, respectively, and its substrate profile was similar to that of the latter β-lactamases. Analysis by 5′ rapid amplification of cDNA ends revealed promoter sequences highly divergent from the Escherichia coli σ70 consensus sequence. This work further illustrates the heterogeneity of β-lactamases among Serratia spp. PMID:27956418
Chromosome-Encoded Broad-Spectrum Ambler Class A β-Lactamase RUB-1 from Serratia rubidaea.
Bonnin, Rémy A; Didi, Jennifer; Ergani, Ayla; Lima, Sandra; Naas, Thierry
2017-02-01
Whole-genome sequencing of Serratia rubidaea CIP 103234T revealed a chromosomally located Ambler class A β-lactamase gene. The gene was cloned, and the β-lactamase, RUB-1, was characterized. RUB-1 displayed 74% and 73% amino acid sequence identity with the GIL-1 and TEM-1 penicillinases, respectively, and its substrate profile was similar to that of the latter β-lactamases. Analysis by 5' rapid amplification of cDNA ends revealed promoter sequences highly divergent from the Escherichia coli σ70 consensus sequence. This work further illustrates the heterogeneity of β-lactamases among Serratia spp. Copyright © 2017 American Society for Microbiology.
Molecular cloning of crustins from the hemocytes of Brazilian penaeid shrimps.
Rosa, Rafael Diego; Bandeira, Paula Terra; Barracco, Margherita Anna
2007-09-01
Crustins are antimicrobial peptides initially identified in the hemocytes of the crab Carcinus maenas (11.5-kDa peptide or carcinin) and recently also recognized in penaeid shrimps and other crustacean species. The aim of this study was to identify sequences encoding for crustins from the hemocytes of four Brazilian penaeid species: Farfantepenaeus paulensis, Farfantepenaeus subtilis, Farfantepenaeus brasiliensis and Litopenaeus schmitti. Using primers based on consensus nucleotide alignment of crustins from different crustaceans, cDNA sequences coding for crustins in all indigenous penaeid species were amplified. The obtained four crustin sequences encoded for peptides containing a hydrophobic N-terminal region rich in glycine repeats and a C-terminal part with 12 cysteine residues and a conserved whey acidic protein domain. All obtained crustin sequences showed high amino acidic similarity among each other and with crustins from litopenaeid shrimps (76-98%). This is the first report of crustins in native Brazilian penaeid shrimps.
Analysis of the regulatory region of the protease III (ptr) gene of Escherichia coli K-12.
Claverie-Martin, F; Diaz-Torres, M R; Kushner, S R
1987-01-01
The ptr gene of Escherichia coli encodes protease III (Mr 110,000) and a 50-kDa polypeptide, both of which are found in the periplasmic space. The gene is physically located between the recC and recB loci on the E. coli chromosome. The nucleotide sequence of a 1167-bp EcoRV-ClaI fragment of chromosomal DNA containing the promoter region and 885 bp of the ptr coding sequence has been determined. S1 nuclease mapping analysis showed that the major 5' end of the ptr mRNA was localized 127 bp upstream from the ATG start codon. The open reading frame (ORF), preceded by a Shine-Dalgarno sequence, extends to the end of the sequenced DNA. Downstream from the -35 and -10 regions is a sequence that strongly fits the consensus sequence of known nitrogen-regulated promoters. A signal peptide of 23 amino acid residues is present at the N terminus of the derived amino acid sequence. The cleavage site as well as the ORF were confirmed by sequencing the N terminus of mature protease III.
Regulation of the alpha-glucuronidase-encoding gene ( aguA) from Aspergillus niger.
de Vries, R P; van de Vondervoort, P J I; Hendriks, L; van de Belt, M; Visser, J
2002-09-01
The alpha-glucuronidase gene aguA from Aspergillus niger was cloned and characterised. Analysis of the promoter region of aguA revealed the presence of four putative binding sites for the major carbon catabolite repressor protein CREA and one putative binding site for the transcriptional activator XLNR. In addition, a sequence motif was detected which differed only in the last nucleotide from the XLNR consensus site. A construct in which part of the aguA coding region was deleted still resulted in production of a stable mRNA upon transformation of A. niger. The putative XLNR binding sites and two of the putative CREA binding sites were mutated individually in this construct and the effects on expression were examined in A. niger transformants. Northern analysis of the transformants revealed that the consensus XLNR site is not actually functional in the aguA promoter, whereas the sequence that diverges from the consensus at a single position is functional. This indicates that XLNR is also able to bind to the sequence GGCTAG, and the XLNR binding site consensus should therefore be changed to GGCTAR. Both CREA sites are functional, indicating that CREA has a strong influence on aguA expression. A detailed expression analysis of aguA in four genetic backgrounds revealed a second regulatory system involved in activation of aguA gene expression. This system responds to the presence of glucuronic and galacturonic acids, and is not dependent on XLNR.
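The revised XLNR consensus GGCTAR uses the IUPAC ambiguity code R (A or G), so it covers both GGCTAA and GGCTAG. A promoter scan for such a degenerate site on both strands can be sketched as below. The function names, the partial IUPAC table, and the toy sequence are assumptions for illustration; the snippet only locates candidate sites and says nothing about whether they are functional, which the abstract shows must be tested experimentally.

```python
import re

# Partial IUPAC table (an assumption: only the codes needed for short examples).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "R": "[AG]", "Y": "[CT]", "N": "[ACGT]"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of an uppercase DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def find_sites(dna, consensus="GGCTAR"):
    """Return sorted (start, strand, site) hits; start is 0-based on the given strand."""
    pattern = re.compile("(?=(" + "".join(IUPAC[base] for base in consensus) + "))")
    hits = [(m.start(), "+", m.group(1)) for m in pattern.finditer(dna)]
    for m in pattern.finditer(reverse_complement(dna)):
        # Convert the reverse-strand coordinate back to the forward strand.
        start = len(dna) - m.start() - len(consensus)
        hits.append((start, "-", m.group(1)))
    return sorted(hits)


# Hypothetical promoter fragment, not the aguA sequence.
print(find_sites("TTGGCTAAACCGGCTAGTT"))
```

Scanning with the older, non-degenerate consensus (GGCTAA) would miss the GGCTAG site that the study found to be the functional one, which illustrates why the consensus was broadened.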
Yuryev, A.; Corden, J. L.
1996-01-01
The largest subunit of RNA polymerase II contains a repetitive C-terminal domain (CTD) consisting of tandem repeats of the consensus sequence Tyr(1)Ser(2)Pro(3)Thr(4)Ser(5)Pro(6)Ser(7). Substitution of nonphosphorylatable amino acids at positions two or five of the Saccharomyces cerevisiae CTD is lethal. We developed a selection system for isolating suppressors of this lethal phenotype and cloned a gene, SCA1 (suppressor of CTD alanine), which complements recessive suppressors of lethal multiple-substitution mutations. A partial deletion of SCA1 (sca1Δ::hisG) suppresses alanine or glutamate substitutions at position two of the consensus CTD sequence, and a lethal CTD truncation mutation, but SCA1 deletion does not suppress alanine or glutamate substitutions at position five. SCA1 is identical to SRB9, a suppressor of a cold-sensitive CTD truncation mutation. Strains carrying dominant SRB mutations have the same suppression properties as a sca1Δ::hisG strain. These results reveal a functional difference between positions two and five of the consensus CTD heptapeptide repeat. The ability of SCA1 and SRB mutant alleles to suppress CTD truncation mutations suggests that substitutions at position two, but not at position five, cause a defect in RNA polymerase II function similar to that introduced by CTD truncation. PMID:8725217
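The heptapeptide consensus and the position-two and position-five substitutions discussed above can be written out in one-letter code to make the mutant repeats explicit. The helper names and the repeat count below are illustrative assumptions; the snippet only builds and compares sequences and does not model the genetics.

```python
CONSENSUS_HEPTAD = "YSPTSPS"  # Tyr1-Ser2-Pro3-Thr4-Ser5-Pro6-Ser7 in one-letter code


def substitute_position(heptad, position, residue):
    """Replace the residue at a 1-based position of the heptad."""
    if not 1 <= position <= len(heptad):
        raise ValueError("position must lie within the heptad")
    return heptad[:position - 1] + residue + heptad[position:]


def build_ctd(n_repeats, heptad=CONSENSUS_HEPTAD):
    """Concatenate n_repeats copies of a heptad into a tandem array."""
    return heptad * n_repeats


def count_consensus_repeats(ctd, heptad=CONSENSUS_HEPTAD):
    """Count in-register repeats that exactly match the consensus heptad."""
    size = len(heptad)
    return sum(ctd[i:i + size] == heptad for i in range(0, len(ctd) - size + 1, size))


# Alanine substitutions at the two positions compared in the abstract.
ser2_to_ala = substitute_position(CONSENSUS_HEPTAD, 2, "A")
ser5_to_ala = substitute_position(CONSENSUS_HEPTAD, 5, "A")
print(ser2_to_ala, ser5_to_ala)
# Three repeats is an arbitrary illustrative length, not the yeast CTD length.
print(count_consensus_repeats(build_ctd(3)), count_consensus_repeats(build_ctd(3, ser2_to_ala)))
```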
Bankoff, Richard J; Jerjos, Michael; Hohman, Baily; Lauterbur, M Elise; Kistler, Logan; Perry, George H
2017-07-01
Several taxonomically distinct mammalian groups (certain microbats and cetaceans, e.g., dolphins) share both morphological adaptations related to echolocation behavior and strong signatures of convergent evolution at the amino acid level across seven genes related to auditory processing. Aye-ayes (Daubentonia madagascariensis) are nocturnal lemurs with a specialized auditory processing system. Aye-ayes tap rapidly along the surfaces of trees, listening to reverberations to identify the mines of wood-boring insect larvae; this behavior has been hypothesized to functionally mimic echolocation. Here we investigated whether there are signals of convergence in auditory processing genes between aye-ayes and known mammalian echolocators. We developed a computational pipeline (Basic Exon Assembly Tool) that produces consensus sequences for regions of interest from shotgun genomic sequencing data for nonmodel organisms without requiring de novo genome assembly. We reconstructed complete coding region sequences for the seven convergent echolocating bat-dolphin genes for aye-ayes and another lemur. We compared sequences from these two lemurs in a phylogenetic framework with those of bat and dolphin echolocators and appropriate nonecholocating outgroups. Our analysis reaffirms the existence of amino acid convergence at these loci among echolocating bats and dolphins; some methods also detected signals of convergence between echolocating bats and both mice and elephants. However, we observed no significant signal of amino acid convergence between aye-ayes and echolocating bats and dolphins, suggesting that aye-aye tap-foraging auditory adaptations represent distinct evolutionary innovations. These results are also consistent with a developing consensus that convergent behavioral ecology does not reliably predict convergent molecular evolution. © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Haynes, Barton F [Durham, NC]; Gao, Feng [Durham, NC]; Korber, Bette T [Los Alamos, NM]; Hahn, Beatrice H [Birmingham, AL]; Shaw, George M [Birmingham, AL]; Kothe, Denise [Birmingham, AL]; Li, Ying Ying [Hoover, AL]; Decker, Julie [Alabaster, AL]; Liao, Hua-Xin [Chapel Hill, NC]
2011-12-06
The present invention relates, in general, to an immunogen and, in particular, to an immunogen for inducing antibodies that neutralize a wide spectrum of HIV primary isolates and/or to an immunogen that induces a T cell immune response. The invention also relates to a method of inducing anti-HIV antibodies, and/or to a method of inducing a T cell immune response, using such an immunogen. The invention further relates to nucleic acid sequences encoding the present immunogens.
Hirotani, M; Kuroda, R; Suzuki, H; Yoshikawa, T
2000-05-01
A cDNA encoding UDP-glucose: baicalein 7-O-glucosyltransferase (UBGT) was isolated from a cDNA library from hairy root cultures of Scutellaria baicalensis Georgi probed with a partial-length cDNA clone of a UDP-glucose: flavonoid 3-O-glucosyltransferase (UFGT) from grape (Vitis vinifera L.). The heterologous probe contained a glucosyltransferase consensus amino acid sequence which was also present in the Scutellaria cDNA clones. The complete nucleotide sequence of the 1688-bp cDNA insert was determined and the deduced amino acid sequences are presented. The nucleotide sequence analysis of UBGT revealed an open reading frame encoding a polypeptide of 476 amino acids with a calculated molecular mass of 53,094 Da. The reaction product for baicalein and UDP-glucose catalyzed by recombinant UBGT in Escherichia coli was identified as authentic baicalein 7-O-glucoside using high-performance liquid chromatography and proton nuclear magnetic resonance spectroscopy. The enzyme activities of recombinant UBGT expressed in E. coli were also detected towards flavonoids such as baicalein, wogonin, apigenin, scutellarein, 7,4'-dihydroxyflavone and kaempferol, and phenolic compounds. The accumulation of UBGT mRNA in hairy roots was in response to wounding or salicylic acid treatments.
A retrotransposable element from the mosquito Anopheles gambiae.
Besansky, N J
1990-01-01
A family of middle repetitive elements from the African malaria vector Anopheles gambiae is described. Approximately 100 copies of the element, designated T1Ag, are dispersed in the genome. Full-length elements are 4.6 kilobase pairs in length, but truncation of the 5' end is common. Nucleotide sequences of one full-length, two 5'-truncated, and two 5' ends of T1Ag elements were determined and aligned to define a consensus sequence. Sequence analysis revealed two long, overlapping open reading frames followed by a polyadenylation signal, AATAAA, and a tail consisting of tandem repetitions of the motif TGAAA. No direct or inverted long terminal repeats (LTRs) were detected. The first open reading frame, 442 amino acids in length, includes a domain resembling that of nucleic acid-binding proteins. The second open reading frame, 975 amino acids long, resembles the reverse transcriptases of a category of retrotransposable elements without LTRs, variously termed class II retrotransposons, class III elements or non-LTR retrotransposons. Similarity at the sequence and structural levels places T1Ag in this category. PMID:1689457
Veldman, G M; Klootwijk, J; van Heerikhuizen, H; Planta, R J
1981-01-01
We have determined the nucleotide sequence of part of a cloned yeast ribosomal RNA operon extending from the 5.8S RNA gene downstream into the 5'-terminal region of the 26S RNA gene. We mapped the pertinent processing sites, viz. the 5' end of 26S rRNA and the 3' ends of 5.8S rRNA and its immediate precursor, 7S RNA. At the 3' end of 7S RNA we find the sequence UCGUUU which is very similar to the type I consensus sequence UCAUUA/U present at the 3' ends of 17S, 5.8S and 26S rRNA as well as 18S precursor rRNA in yeast. At the 5' end of the 26S RNA gene we find a sequence of thirteen nucleotides which is homologous to the type II sequence present at the 5' termini of both the 17S and the 5.8S RNA gene. These findings further support the suggestion put forward earlier (G.M. Veldman et al. (1980) Nucl. Acids Res. 8, 2907-2920) that both consensus sequences are involved in the recognition of precursor rRNA by the processing nuclease(s). We discuss a model for the processing of yeast rRNA in which a processing enzyme sequentially recognizes several combinations of a type I and a type II consensus sequence. We also describe the existence of a significant base complementarity between sequences in the 5'-terminal region of 26S rRNA and the 3'-terminal region of 5.8S rRNA. We suggest that base pairing between these sequences contributes to the binding between 5.8S and 26S rRNA. PMID:7312619
Hepatitis delta genotypes in chronic delta infection in the northeast of Spain (Catalonia).
Cotrina, M; Buti, M; Jardi, R; Quer, J; Rodriguez, F; Pascual, C; Esteban, R; Guardia, J
1998-06-01
Based on genetic analysis of variants obtained around the world, three genotypes of the hepatitis delta virus have been defined. Hepatitis delta virus variants have been associated with different disease patterns and geographic distributions. To determine the prevalence of hepatitis delta virus genotypes in the northeast of Spain (Catalonia) and the correlation with transmission routes and clinical disease, we studied the nucleotide divergence of the consensus sequence of HDV RNA obtained from 33 patients with chronic delta hepatitis (24 were intravenous drug users and nine had no risk factors), and four patients with acute self-limited delta infection. Serum HDV RNA was amplified by the polymerase chain reaction technique and a fragment of 350 nucleotides (nt 910 to 1259) was directly sequenced. Genetic analysis of the nucleotide consensus sequence obtained showed a high degree of conservation among sequences (93% of mean). Comparison of these sequences with those derived from different geographic areas and pertaining to genotypes I, II and III, showed a mean sequence identity of 92% with genotype I, 73% with genotype II and 61% with genotype III. At the amino acid level (aa 115 to 214), the mean identity was 87% with genotype I, 63% with genotype II and 56% with genotype III. Conserved regions included the RNA editing domain, the carboxyl terminal 19 amino acids of the hepatitis delta antigen and the polyadenylation signal of the viral mRNA. Hepatitis delta virus isolates in the northeast of Spain are exclusively genotype I, independently of the transmission route and the type of infection. No hepatitis delta virus subgenotypes were found, suggesting that the origin of hepatitis delta virus infection in our geographical area is homogeneous.
Vázquez, Martín; Ben-Dov, Claudia; Lorenzi, Hernan; Moore, Troy; Schijman, Alejandro; Levin, Mariano J.
2000-01-01
The short interspersed repetitive element (SIRE) of Trypanosoma cruzi was first detected when comparing the sequences of loci that encode the TcP2β genes. It is present in about 1,500–3,000 copies per genome, depending on the strain, and it is distributed in all chromosomes. An initial analysis of SIRE sequences from 21 genomic fragments allowed us to derive a consensus nucleotide sequence and structure for the element, consisting of three regions (I, II, and III) each harboring distinctive features. Analysis of 158 transcribed SIREs demonstrates that the consensus is highly conserved. The sequences of 51 cDNAs show that SIRE is included in the 3′ end of several mRNAs, always transcribed from the sense strand, contributing the polyadenylation site in 63% of the cases. This study led to the characterization of VIPER (vestigial interposed retroelement), a 2,326-bp-long unusual retroelement. VIPER's 5′ end is formed by the first 182 bp of SIRE, whereas its 3′ end is formed by the last 220 bp of the element. Both SIRE moieties are connected by a 1,924-bp-long fragment that carries a unique ORF encoding a complete reverse transcriptase-RNase H gene whose 15 C-terminal amino acids derive from codons specified by SIRE's region II. The amino acid sequence of VIPER's reverse transcriptase-RNase H shares significant homology to that of long terminal repeat retrotransposons. The fact that SIRE and VIPER sequences are found only in the T. cruzi genome may be of relevance for studies concerning the evolution and the genome flexibility of this protozoan parasite. PMID:10688909
WebLogo: A Sequence Logo Generator
Crooks, Gavin E.; Hon, Gary; Chandonia, John-Marc; Brenner, Steven E.
2004-01-01
WebLogo generates sequence logos, graphical representations of the patterns within a multiple sequence alignment. Sequence logos provide a richer and more precise description of sequence similarity than consensus sequences and can rapidly reveal significant features of the alignment otherwise difficult to perceive. Each logo consists of stacks of letters, one stack for each position in the sequence. The overall height of each stack indicates the sequence conservation at that position (measured in bits), whereas the height of symbols within the stack reflects the relative frequency of the corresponding amino or nucleic acid at that position. WebLogo has been enhanced recently with additional features and options, to provide a convenient and highly configurable sequence logo generator. A command line interface and the complete, open WebLogo source code are available for local installation and customization. PMID:15173120
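The stack-height rule described above can be illustrated with a short sketch. It assumes a gap-free nucleotide alignment and computes conservation as the maximum entropy (2 bits for four bases) minus the observed Shannon entropy of each column. Any small-sample correction is omitted, and the function name `logo_heights` and the toy alignment are hypothetical, not part of WebLogo.

```python
import math
from collections import Counter

def logo_heights(alignment, alphabet_size=4):
    """Per-column symbol heights in bits for a gap-free alignment.

    Stack height = log2(alphabet_size) - Shannon entropy of the column;
    each symbol's height = its relative frequency * stack height.
    """
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    max_bits = math.log2(alphabet_size)
    stacks = []
    for column in zip(*alignment):
        total = len(column)
        freqs = {sym: n / total for sym, n in Counter(column).items()}
        entropy = -sum(f * math.log2(f) for f in freqs.values())
        conservation = max_bits - entropy
        stacks.append({sym: f * conservation for sym, f in freqs.items()})
    return stacks

# Toy alignment (hypothetical data).
stacks = logo_heights(["ACGT", "ACGA", "ACCA", "ATCA"])
```

A fully conserved column reaches the full 2 bits, while a column split evenly between two bases carries only 1 bit, shared equally by the two symbols.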
Oba, Mami; Tsuchiaka, Shinobu; Omatsu, Tsutomu; Katayama, Yukie; Otomaru, Konosuke; Hirata, Teppei; Aoki, Hiroshi; Murata, Yoshiteru; Makino, Shinji; Nagai, Makoto; Mizutani, Tetsuya
2018-01-08
We tested the usefulness of a target enrichment system, SureSelect, a comprehensive viral nucleic acid detection method, for rapid identification of viral pathogens in feces samples of cattle, pigs and goats. This system enriches nucleic acids of target viruses in clinical/field samples by using a library of biotinylated RNAs with sequences complementary to the target viruses. The enriched nucleic acids are amplified by PCR and subjected to next generation sequencing to identify the target viruses. In many samples, the SureSelect target enrichment method increased efficiencies for detection of the viruses listed in the biotinylated RNA library. Furthermore, this method enabled us to determine the nearly full-length genome sequence of porcine parainfluenza virus 1 and greatly increased Breadth, a value indicating the ratio of the mapping consensus length in the reference genome, in pig samples. Our data showed the usefulness of the SureSelect target enrichment system for comprehensive analysis of genomic information of various viruses in field samples. Copyright © 2017 Elsevier Inc. All rights reserved.
Camicia, Federico; Paredes, Rodolfo; Chalar, Cora; Galanti, Norbel; Kamenetzky, Laura; Gutierrez, Ariana; Rosenzvit, Mara C
2008-03-31
We have sequenced and partially characterized an Echinococcus granulosus cDNA, termed egat1, from a protoscolex signal sequence trap (SST) cDNA library. The isolated 1627 bp long cDNA contains an ORF of 489 amino acids and shows an amino acid identity of 30% with neutral and excitatory amino acid transporter members of the Dicarboxylate/Amino Acid Na+ and/or H+ Cation Symporter family (DAACS) (TC 2.A.23). Additional bioinformatics analysis of EgAT1 confirmed the results obtained by similarity searches and showed the presence of 9 to 10 transmembrane domains, consensus sequences for N-glycosylation between the third and fourth transmembrane domain, a highly similar hydropathy profile with ASCT1 (a known member of DAACS family), high score with SDF (Sodium Dicarboxylate Family) and similar motifs with EDTRANSPORT, a fingerprint of excitatory amino acid transporters. The localization of the putative amino acid transporter was analyzed by in situ hybridization and immunofluorescence in protoscoleces and associated germinal layer. The in situ hybridization labelling indicates the distribution of egat1 mRNA throughout the tegument. EgAT1 protein, which showed in Western blots a molecular mass of approximately 60 kD, is localized in the subtegumental region of the metacestode, particularly around suckers and rostellum of protoscoleces and layers from brood capsules. The sequence and expression analyses of EgAT1 pave the way for functional analysis of amino acid transporters of E. granulosus and their evaluation as new drug targets against cystic echinococcosis.
Consensus generation and variant detection by Celera Assembler.
Denisov, Gennady; Walenz, Brian; Halpern, Aaron L; Miller, Jason; Axelrod, Nelson; Levy, Samuel; Sutton, Granger
2008-04-15
We present an algorithm to identify allelic variation given a Whole Genome Shotgun (WGS) assembly of haploid sequences, and to produce a set of haploid consensus sequences rather than a single consensus sequence. Existing WGS assemblers take a column-by-column approach to consensus generation, and produce a single consensus sequence which can be inconsistent with the underlying haploid alleles, and inconsistent with any of the aligned sequence reads. Our new algorithm uses a dynamic windowing approach. It detects alleles by simultaneously processing the portions of aligned reads spanning a region of sequence variation, assigns reads to their respective alleles, phases adjacent variant alleles and generates a consensus sequence corresponding to each confirmed allele. This algorithm was used to produce the first diploid genome sequence of an individual human. It can also be applied to assemblies of multiple diploid individuals and hybrid assemblies of multiple haploid organisms. When applied to the individual human genome assembly, the new algorithm detects exactly two confirmed alleles and reports two consensus sequences in 98.98% of the total number of 2,033,311 detected regions of sequence variation. In 33,269 out of 460,373 detected regions of size >1 bp, it fixes the constructed errors of a mosaic haploid representation of a diploid locus as produced by the original Celera Assembler consensus algorithm. Using an optimized procedure calibrated against 1,506,344 known SNPs, it detects 438,814 new heterozygous SNPs with a false positive rate of 12%. The open source code is available at: http://wgs-assembler.cvs.sourceforge.net/wgs-assembler/
GeneSilico protein structure prediction meta-server.
Kurowski, Michal A; Bujnicki, Janusz M
2003-07-01
Rigorous assessments of protein structure prediction have demonstrated that fold recognition methods can identify remote similarities between proteins when standard sequence search methods fail. It has been shown that the accuracy of predictions is improved when refined multiple sequence alignments are used instead of single sequences and if different methods are combined to generate a consensus model. There are several meta-servers available that integrate protein structure predictions performed by various methods, but they do not allow for submission of user-defined multiple sequence alignments and they seldom offer confidentiality of the results. We developed a novel WWW gateway for protein structure prediction, which combines the useful features of other meta-servers available, but with much greater flexibility of the input. The user may submit an amino acid sequence or a multiple sequence alignment to a set of methods for primary, secondary and tertiary structure prediction. Fold-recognition results (target-template alignments) are converted into full-atom 3D models and the quality of these models is uniformly assessed. A consensus between different FR methods is also inferred. The results are conveniently presented on-line on a single web page over a secure, password-protected connection. The GeneSilico protein structure prediction meta-server is freely available for academic users at http://genesilico.pl/meta. PMID:12824313
Gál, Zita; Hegedüs, Csilla; Szakács, Gergely; Váradi, András; Sarkadi, Balázs; Özvegy-Laczka, Csilla
2015-02-01
Human ABCG2 is a plasma membrane glycoprotein causing multidrug resistance in cancer. Membrane cholesterol and bile acids are efficient regulators of ABCG2 function, while the molecular nature of the sterol-sensing sites has not been elucidated. The cholesterol recognition amino acid consensus (CRAC, L/V-(X)(1-5)-Y-(X)(1-5)-R/K) sequence is one of the conserved motifs involved in cholesterol binding in several proteins. We have identified five potential CRAC motifs in the transmembrane domain of the human ABCG2 protein. In order to define their roles in sterol-sensing, the central tyrosines of these CRACs (Y413, 459, 469, 570 and 645) were mutated to S or F and the mutants were expressed both in insect and mammalian cells. We found that mutation in Y459 prevented protein expression; the Y469S and Y645S mutants lost their activity; while the Y570S, Y469F, and Y645F mutants retained function as well as cholesterol and bile acid sensitivity. We found that in the case of the Y413S mutant, drug transport was efficient, while modulation of the ATPase activity by cholesterol and bile acids was significantly altered. We suggest that the Y413 residue within a putative CRAC motif has a role in sterol-sensing and the ATPase/drug transport coupling in the ABCG2 multidrug transporter. Copyright © 2014. Published by Elsevier B.V.
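The CRAC pattern quoted above, L/V-(X)(1-5)-Y-(X)(1-5)-R/K, can be turned into a simple scanner. The sketch below enumerates every combination of an L or V, a central Y separated from it by one to five residues, and an R or K one to five residues after the Y. The function name and the test peptide are hypothetical, and a match is only a candidate motif, not evidence of cholesterol binding.

```python
def find_crac_motifs(protein):
    """Return 1-based (L/V position, central Y position, R/K position) triples."""
    seq = protein.upper()
    hits = []
    for i, first in enumerate(seq):
        if first not in "LV":
            continue
        # One to five arbitrary residues between L/V and the central Y.
        for j in range(i + 2, min(i + 7, len(seq))):
            if seq[j] != "Y":
                continue
            # One to five arbitrary residues between Y and the closing R/K.
            for k in range(j + 2, min(j + 7, len(seq))):
                if seq[k] in "RK":
                    hits.append((i + 1, j + 1, k + 1))
    return hits

# Hypothetical peptide, not an ABCG2 fragment.
hits = find_crac_motifs("AALGGYAAKAA")
```

Reporting the position of the central tyrosine makes it straightforward to list candidate residues for mutagenesis of the kind described in the abstract.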
A FRET Biosensor for ROCK Based on a Consensus Substrate Sequence Identified by KISS Technology.
Li, Chunjie; Imanishi, Ayako; Komatsu, Naoki; Terai, Kenta; Amano, Mutsuki; Kaibuchi, Kozo; Matsuda, Michiyuki
2017-01-11
Genetically-encoded biosensors based on Förster/fluorescence resonance energy transfer (FRET) are versatile tools for studying the spatio-temporal regulation of signaling molecules within not only the cells but also tissues. Perhaps the hardest task in the development of a FRET biosensor for protein kinases is to identify the kinase-specific substrate peptide to be used in the FRET biosensor. To solve this problem, we took advantage of kinase-interacting substrate screening (KISS) technology, which deduces a consensus substrate sequence for the protein kinase of interest. Here, we show that a consensus substrate sequence for ROCK identified by KISS yielded a FRET biosensor for ROCK, named Eevee-ROCK, with high sensitivity and specificity. By treating HeLa cells with inhibitors or siRNAs against ROCK, we show that a substantial part of the basal FRET signal of Eevee-ROCK was derived from the activities of ROCK1 and ROCK2. Eevee-ROCK readily detected ROCK activation by epidermal growth factor, lysophosphatidic acid, and serum. When cells stably-expressing Eevee-ROCK were time-lapse imaged for three days, ROCK activity was found to increase after the completion of cytokinesis, concomitant with the spreading of cells. Eevee-ROCK also revealed a gradual increase in ROCK activity during apoptosis. Thus, Eevee-ROCK, which was developed from a substrate sequence predicted by the KISS technology, will pave the way to a better understanding of the function of ROCK in a physiological context.
Anderson, Carl W.; Connelly, Margery A.
2004-10-12
The present invention provides a method for detecting DNA-activated protein kinase (DNA-PK) activity in a biological sample. The method includes contacting a biological sample with a detectably-labeled phosphate donor and a synthetic peptide substrate defined by the following features to provide specific recognition and phosphorylation by DNA-PK: (1) a phosphate-accepting amino acid pair which may include serine-glutamine (Ser-Gln) (SQ), threonine-glutamine (Thr-Gln) (TQ), glutamine-serine (Gln-Ser) (QS), or glutamine-threonine (Gln-Thr) (QT); (2) enhancer amino acids which may include glutamic acid or glutamine immediately adjacent at the amino- or carboxyl- side of the amino acid pair and forming an amino acid pair-enhancer unit; (3) a first spacer sequence at the amino terminus of the amino acid pair-enhancer unit; (4) a second spacer sequence at the carboxyl terminus of the amino acid pair-enhancer unit, which spacer sequences may include any combination of amino acids that does not provide a phosphorylation site consensus sequence motif; and, (5) a tag moiety, which may be an amino acid sequence or another chemical entity that permits separating the synthetic peptide from the phosphate donor. A composition and a kit for the detection of DNA-PK activity are also provided. Methods for detecting DNA, protein phosphatases and substances that alter the activity of DNA-PK are also provided. The present invention also provides a method of monitoring protein kinase and DNA-PK activity in living cells. A composition and a kit for monitoring protein kinase activity in vitro and a composition and a kit for monitoring DNA-PK activities in living cells are also provided. A method for identifying agents that alter protein kinase activity in vitro and a method for identifying agents that alter DNA-PK activity in living cells are also provided.
Yarimizu, Tohru; Nakamura, Mikiko; Hoshida, Hisashi; Akada, Rinji
2015-02-14
Targeting of cellular proteins to the extracellular environment is directed by a secretory signal sequence located at the N-terminus of a secretory protein. These signal sequences usually contain an N-terminal basic amino acid followed by a stretch containing hydrophobic residues, although no consensus signal sequence has been identified. In this study, simple modeling of signal sequences was attempted using Gaussia princeps secretory luciferase (GLuc) in the yeast Kluyveromyces marxianus, which allowed comprehensive recombinant gene construction to substitute synthetic signal sequences. Mutational analysis of the GLuc signal sequence revealed that the GLuc hydrophobic peptide length was at the lower limit for effective secretion and that the N-terminal basic residue was indispensable. Deletion of the 16th Glu caused enhanced levels of secreted protein, suggesting that this hydrophilic residue defined the boundary of a hydrophobic peptide stretch. Consequently, we redesigned this domain as a repeat of a single hydrophobic amino acid between the N-terminal Lys and C-terminal Glu. Stretches consisting of Phe, Leu, Ile, or Met were effective for secretion but the number of residues affected secretory activity. A stretch containing sixteen consecutive methionine residues (M16) showed the highest activity; the M16 sequence was therefore utilized for the secretory production of human leukemia inhibitory factor protein in yeast, resulting in enhanced secreted protein yield. We present a new concept for the provision of secretory signal sequence ability in the yeast K. marxianus, determined by the number of residues of a single hydrophobic residue located between N-terminal basic and C-terminal acidic amino acid boundaries.
Structure and expression of the attacin genes in Hyalophora cecropia.
Sun, S C; Lindström, I; Lee, J Y; Faye, I
1991-02-26
To study the regulation of the immune genes in insects, we have cloned and sequenced the attacin gene locus of the giant silk moth Hyalophora cecropia. The locus contains one acidic and one basic attacin gene as well as two pseudogenes, which are remnants of basic attacin genes. A small insertion element was found within the locus. The two functional attacin genes are transcribed in opposite directions and have two introns inserted at homologous positions. A common sequence, GGGGATTCCT, is found at nucleotide position -48 in the acidic gene and at nucleotide position -58 in the basic gene. Interestingly, this decanucleotide is similar to the consensus of the NF-κB-binding site. Expression studies revealed that both attacins are strongly induced by phorbol 12-myristate 13-acetate, lipopolysaccharide and bacteria. However, only the acidic attacin gene showed a clear response to injury.
Predicting the transmembrane secondary structure of ligand-gated ion channels.
Bertaccini, E; Trudell, J R
2002-06-01
Recent mutational analyses of ligand-gated ion channels (LGICs) have demonstrated a plausible site of anesthetic action within their transmembrane domains. Although there is a consensus that the transmembrane domain is formed from four membrane-spanning segments, the secondary structure of these segments is not known. We utilized 10 state-of-the-art bioinformatics techniques to predict the transmembrane topology of the tetrameric regions within six members of the LGIC family that are relevant to anesthetic action. They are the human forms of the GABA alpha 1 receptor, the glycine alpha 1 receptor, the 5HT3 serotonin receptor, the nicotinic AChR alpha 4 and alpha 7 receptors and the Torpedo nAChR alpha 1 receptor. The algorithms utilized were HMMTOP, TMHMM, TMPred, PHDhtm, DAS, TMFinder, SOSUI, TMAP, MEMSAT and TOPPred2. The resulting predictions were superimposed on to a multiple sequence alignment of the six amino acid sequences created using the CLUSTAL W algorithm. There was a clear statistical consensus for the presence of four alpha helices in those regions experimentally thought to span the membrane. The consensus of 10 topology prediction techniques supports the hypothesis that the transmembrane subunits of the LGICs are tetrameric bundles of alpha helices.
Fabrication of a New Lineage of Artificial Luciferases from Natural Luciferase Pools.
Kim, Sung Bae; Nishihara, Ryo; Citterio, Daniel; Suzuki, Koji
2017-09-11
The fabrication of artificial luciferases (ALucs) with unique optical properties has a fundamental impact on bioassays and molecular imaging. In this study, we developed a new lineage of ALucs with unique substrate preferences by extracting consensus amino acids from the alignment of 25 copepod luciferase sequences available in natural luciferase pools. The primary sequence was first created with a sequence logo generator resulting in a total of 11 sibling sequences. Phylogenetic analysis shows that the newly fabricated ALucs form an independent branch, genetically isolated from the natural luciferases, and from a prior series of ALucs produced by our laboratory using a smaller basis set. The new lineage of ALucs was strongly luminescent in living mammalian cells with specific substrate selectivity to native coelenterazine. A single-residue-level comparison of the C-terminal sequences of new ALucs reveals that some amino acids in the C-terminal ends strongly influence the optical intensities but have limited effect on the color variance. The success of this approach provides guidance on how to engineer and functionalize marine luciferases for bioluminescence imaging and assays.
Amino acid sequence of the human fibronectin receptor
1987-01-01
The amino acid sequence deduced from cDNA of the human placental fibronectin receptor is reported. The receptor is composed of two subunits: an alpha subunit of 1,008 amino acids which is processed into two polypeptides disulfide bonded to one another, and a beta subunit of 778 amino acids. Each subunit has near its COOH terminus a hydrophobic segment. This and other sequence features suggest a structure for the receptor in which the hydrophobic segments serve as transmembrane domains anchoring each subunit to the membrane and dividing each into a large ectodomain and a short cytoplasmic domain. The alpha subunit ectodomain has five sequence elements homologous to consensus Ca2+-binding sites of several calcium-binding proteins, and the beta subunit contains a fourfold repeat strikingly rich in cysteine. The alpha subunit sequence is 46% homologous to the alpha subunit of the vitronectin receptor. The beta subunit is 44% homologous to the human platelet adhesion receptor subunit IIIa and 47% homologous to a leukocyte adhesion receptor beta subunit. The high degree of homology (85%) of the beta subunit with one of the polypeptides of a chicken adhesion receptor complex referred to as integrin complex strongly suggests that the latter polypeptide is the chicken homologue of the fibronectin receptor beta subunit. These receptor subunit homologies define a superfamily of adhesion receptors. The availability of the entire protein sequence for the fibronectin receptor will facilitate studies on the functions of these receptors. PMID:2958481
Jurka, Jerzy W.
1997-01-01
Enhanced homologous recombination is obtained by employing a consensus sequence which has been found to be associated with integration of repeat sequences, such as Alu and ID. The consensus sequence or sequence having a single transition mutation determines one site of a double break which allows for high efficiency of integration at the site. By introducing single or double stranded DNA having the consensus sequence flanking region joined to a sequence of interest, one can reproducibly direct integration of the sequence of interest at one or a limited number of sites. In this way, specific sites can be identified and homologous recombination achieved at the site by employing a second flanking sequence associated with a sequence proximal to the 3'-nick.
The gamma subunit of transducin is farnesylated.
Lai, R K; Perez-Sala, D; Cañada, F J; Rando, R R
1990-01-01
Protein prenylation with farnesyl or geranylgeranyl moieties is an important posttranslational modification that affects the activity of such diverse proteins as the nuclear lamins, the yeast mating factor mata, and the ras oncogene products. In this article, we show that whole retinal cultures incorporate radioactive mevalonic acid into proteins of 23-26 kDa and one of 8 kDa. The former proteins are probably the "small" guanine nucleotide-binding regulatory proteins (G proteins) and the 8-kDa protein is the gamma subunit of the well-studied retinal heterotrimeric G protein (transducin). After deprenylating purified transducin and its subunits with Raney nickel or methyl iodide/base, the adducted prenyl group can be identified as an all-trans-farnesyl moiety covalently linked to a cysteine residue. Thus far, prenylation reactions have been found to occur at cysteine in a carboxyl-terminal consensus CAAX sequence, where C is the cysteine, A is an aliphatic amino acid, and X is undefined. Both the alpha and gamma subunits of transducin have this consensus sequence, but only the gamma subunit is prenylated. Therefore, the CAAX motif alone is not sufficient to direct prenylation. Finally, since transducin is the best understood G protein, both structurally and mechanistically, the discovery that it is farnesylated should allow for a quantitative understanding of this post-translational modification. PMID:2217200
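The CAAX rule described above can be sketched as a simple C-terminal pattern check. This is an illustration only: the set of residues counted as "aliphatic" (A, V, L, I, M here) is an assumption, since definitions vary, and the example sequences are hypothetical rather than real transducin sequences.

```python
import re

# Assumption: residues treated as "aliphatic" for the two A positions.
ALIPHATIC = "AVLIM"

# C, two aliphatic residues, then any residue, anchored at the C-terminus.
CAAX_RE = re.compile(rf"C[{ALIPHATIC}]{{2}}[A-Z]$")


def has_caax_box(protein: str) -> bool:
    """Return True if the protein ends in a C-A-A-X motif."""
    return CAAX_RE.search(protein.upper()) is not None


# Hypothetical sequences, for illustration only.
print(has_caax_box("MKKGGCVIS"))  # ends in CVIS
print(has_caax_box("MKKGGCGGS"))  # glycine is not in the assumed aliphatic set
```

As the abstract itself points out, a positive match only marks a candidate: the alpha subunit carries the motif yet is not prenylated.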
To Clone or Not To Clone: Method Analysis for Retrieving Consensus Sequences In Ancient DNA Samples
Winters, Misa; Barta, Jodi Lynn; Monroe, Cara; Kemp, Brian M.
2011-01-01
The challenges associated with the retrieval and authentication of ancient DNA (aDNA) evidence are principally due to post-mortem damage which makes ancient samples particularly prone to contamination from “modern” DNA sources. The necessity for authentication of results has led many aDNA researchers to adopt methods considered to be “gold standards” in the field, including cloning aDNA amplicons as opposed to directly sequencing them. However, no standardized protocol has emerged regarding the necessary number of clones to sequence, how a consensus sequence is most appropriately derived, or how results should be reported in the literature. In addition, there has been no systematic demonstration of the degree to which direct sequences are affected by damage or whether direct sequencing would provide disparate results from a consensus of clones. To address this issue, a comparative study was designed to examine both cloned and direct sequences amplified from ∼3,500 year-old ancient northern fur seal DNA extracts. Majority rules and the Consensus Confidence Program were used to generate consensus sequences for each individual from the cloned sequences, which exhibited damage at 31 of 139 base pairs across all clones. In no instance did the consensus of clones differ from the direct sequence. This study demonstrates that, when appropriate, cloning need not be the default method, but instead, should be used as a measure of authentication on a case-by-case basis, especially when this practice adds time and cost to studies where it may be superfluous. PMID:21738625
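A majority-rule consensus of the kind mentioned above can be sketched in a few lines. This is a minimal sketch, not the Consensus Confidence Program: it assumes the clone sequences are already aligned to equal length, and the choice to report ties as "N" is an assumption made here.

```python
from collections import Counter


def majority_consensus(clones, tie="N"):
    """Majority-rule consensus of equal-length, pre-aligned sequences."""
    if len({len(s) for s in clones}) != 1:
        raise ValueError("clones must be non-empty and of equal length")
    consensus = []
    for column in zip(*clones):
        ranked = Counter(column).most_common()
        # A tie for first place has no majority; report the tie symbol.
        if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
            consensus.append(tie)
        else:
            consensus.append(ranked[0][0])
    return "".join(consensus)


# Hypothetical clones; the third carries a C-to-T change of the kind
# produced by post-mortem damage.
clones = ["ACGTCA", "ACGTCA", "ATGTCA"]
print(majority_consensus(clones))
```

With three or more clones, a damage-induced change confined to a single clone is outvoted, which is the behaviour the comparison with direct sequencing relies on.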
Sequence specificity of single-stranded DNA-binding proteins: a novel DNA microarray approach
Morgan, Hugh P.; Estibeiro, Peter; Wear, Martin A.; Max, Klaas E.A.; Heinemann, Udo; Cubeddu, Liza; Gallagher, Maurice P.; Sadler, Peter J.; Walkinshaw, Malcolm D.
2007-01-01
We have developed a novel DNA microarray-based approach for identification of the sequence-specificity of single-stranded nucleic-acid-binding proteins (SNABPs). For verification, we have shown that the major cold shock protein (CspB) from Bacillus subtilis binds with high affinity to pyrimidine-rich sequences, with a binding preference for the consensus sequence, 5′-GTCTTTG/T-3′. The sequence was modelled onto the known structure of CspB and a cytosine-binding pocket was identified, which explains the strong preference for a cytosine base at position 3. This microarray method offers a rapid high-throughput approach for determining the specificity and strength of ss DNA–protein interactions. Further screening of this newly emerging family of transcription factors will help provide an insight into their cellular function. PMID:17488853
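The consensus 5′-GTCTTTG/T-3′ can be written with the IUPAC code K (G or T) as GTCTTTK. The sketch below expands such a motif into a regular expression and scans a single strand for it; the function names and the test sequence are hypothetical, and only the given strand is searched because the protein binds single-stranded DNA.

```python
import re

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def motif_to_regex(motif: str) -> str:
    """Expand an IUPAC motif into a regular-expression string."""
    return "".join(IUPAC[base] for base in motif.upper())


def find_sites(seq: str, motif: str):
    """Return 0-based start positions of all (possibly overlapping) matches."""
    pattern = re.compile(f"(?=({motif_to_regex(motif)}))")
    return [m.start() for m in pattern.finditer(seq.upper())]


# Hypothetical sequence containing both variants of the consensus.
print(find_sites("AAGTCTTTGCCGTCTTTTAA", "GTCTTTK"))
```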
Fuller, James R; Pitzer, Joshua E; Godwin, Ulla; Albertino, Mark; Machon, Benjamin D; Kearse, Kelly P; McConnell, Thomas J
2004-05-17
Folding and assembly of MHC molecules in mammals occurs in the endoplasmic reticulum (ER), but has not been studied in teleosts. Calnexin (CNX) is an ER chaperone that associates with glycoproteins bearing a monoglucosylated N-linked oligosaccharide side chain. Here we report the first identification and characterization of a full-length CNX cDNA clone in a teleost, and the association of the CNX chaperone with MHC class II in a channel catfish T cell line. The 1.8 kb CNX clone encodes a protein of 607 amino acids that is 72% identical to the consensus sequence of mammalian CNXs. The association of CNX with class II is of particular interest because the native MHC class II alpha chain of Ictalurus punctatus does not bear any N-linked oligosaccharide consensus glycosylation sequences. Thus the assembly of class II molecules in the catfish probably proceeds via different steps than occurs in mammals. Copyright 2003 Elsevier Ltd.
Nicolotti, Orazio; Miscioscia, Teresa Fabiola; Leonetti, Francesco; Muncipinto, Giovanni; Carotti, Angelo
2007-01-01
A total of 142 matrix metalloproteinase (MMP) X-ray crystallographic structures were retrieved from the Protein Data Bank (PDB) and analyzed by an automated and efficient routine, developed in-house, with a series of bioinformatic tools. Highly informative heat maps and hierarchical clusterograms provided a reliable and comprehensive representation of the relationships existing among MMPs, enlarging and complementing the current knowledge in the field. Multiple sequence and structural alignments permitted better location and display of key MMP motifs and quantification of the residue consensus at each amino acid position in the most critical binding subsites of MMPs. The MMP active site consensus sequences, the C-alpha root-mean-square deviation (RMSd) analysis of diverse enzymatic subsites, and the examination of the chemical nature, binding topologies, and zinc binding groups (ZBGs) of ligands extracted from crystallographic complexes provided useful insights on the structural arrangements of the most potent MMP inhibitors.
Iimura, Yosuke; Tatsumi, Kenji
2002-07-01
We isolated and analysed two genomic DNAs that encode the heat-shock protein Hsp30 from Coriolus versicolor. The two deduced amino acid sequences differ by only three amino acid substitutions. The promoter regions contain the consensus heat-shock element, a xenobiotic-response element, a stress-response element, and a metal-response element. The levels of mRNAs for Hsp30 increased markedly after exposure of C. versicolor to pentachlorophenol and were higher than those after heat shock.
Giardina, P; Cannio, R; Martirani, L; Marzullo, L; Palmieri, G; Sannia, G
1995-01-01
The gene (pox1) encoding a phenol oxidase from Pleurotus ostreatus, a lignin-degrading basidiomycete, was cloned and sequenced, and the corresponding pox1 cDNA was also synthesized and sequenced. The isolated gene consists of 2,592 bp, with the coding sequence being interrupted by 19 introns and flanked by an upstream region in which putative CAAT and TATA consensus sequences could be identified at positions -174 and -84, respectively. The isolation of a second cDNA (pox2 cDNA), showing 84% similarity, and of the corresponding truncated genomic clones demonstrated the existence of a multigene family coding for isoforms of laccase in P. ostreatus. PCR amplifications of specific regions on the DNA of isolated monokaryons proved that the two genes are not allelic forms. The POX1 amino acid sequence deduced was compared with those of other known laccases from different fungi. PMID:7793961
Cuypers, Lize; Li, Guangdi; Libin, Pieter; Piampongsant, Supinya; Vandamme, Anne-Mieke; Theys, Kristof
2015-09-16
Treatment with pan-genotypic direct-acting antivirals, targeting different viral proteins, is the best option for clearing hepatitis C virus (HCV) infection in chronically infected patients. However, the diversity of the HCV genome is a major obstacle for the development of antiviral drugs, vaccines, and genotyping assays. In this large-scale analysis, genome-wide diversity and selective pressure were mapped, focusing on positions important for treatment, drug resistance, and resistance testing. A dataset of 1415 full-genome sequences, including genotypes 1-6 from the Los Alamos database, was analyzed. In 44% of all full-genome positions, the consensus amino acid was different for at least one genotype. Focusing on positions sharing the same consensus amino acid in all genotypes revealed that only 15% was defined as pan-genotypic highly conserved (≥99% amino acid identity) and an additional 24% as pan-genotypic conserved (≥95%). Despite its large genetic diversity, across all genotypes, codon positions were rarely identified to be positively selected (0.23%-0.46%) and predominantly found to be under negative selective pressure, suggesting mainly neutral evolution. For NS3, NS5A, and NS5B, respectively, 40% (6/15), 33% (3/9), and 14% (2/14) of the resistance-related positions harbored as consensus the amino acid variant related to resistance, potentially impeding treatment. For example, the NS3 variant 80K, conferring resistance to simeprevir used for treatment of HCV1 infected patients, was present in 39.3% of the HCV1a strains and 0.25% of HCV1b strains. Both NS5A variants 28M and 30S, known to be associated with resistance to the pan-genotypic drug daclatasvir, were found in a significant proportion of HCV4 strains (10.7%). NS5B variant 556G, known to confer resistance to non-nucleoside inhibitor dasabuvir, was observed in 8.4% of the HCV1b strains.
Given the large HCV genetic diversity, sequencing efforts for resistance testing purposes may need to be genotype-specific or geographically tailored.
Fernández-Caballero Rico, Jose Ángel; Chueca Porcuna, Natalia; Álvarez Estévez, Marta; Mosquera Gutiérrez, María Del Mar; Marcos Maeso, María Ángeles; García, Federico
2018-02-01
To show how to generate, from massively parallel sequencing data obtained in routine HIV antiretroviral resistance studies, a consensus sequence that may be suitable for molecular epidemiology studies. Paired Sanger (Trugene-Siemens) and next-generation sequencing (NGS) (454 GSJunior-Roche) HIV RT and protease sequences from 62 patients were studied. NGS consensus sequences were generated using Mesquite, using 10%, 15%, and 20% thresholds. Molecular evolutionary genetics analysis (MEGA) was used for phylogenetic studies. At a 10% threshold, NGS-Sanger sequences from 17/62 patients were phylogenetically related, with a median bootstrap value of 88% (IQR 83.5-95.5). Association increased to 36/62 sequences, with a median bootstrap value of 94% (IQR 85.5-98), using a 15% threshold. Maximum association was at the 20% threshold, with 61/62 sequences associated, and a median bootstrap value of 99% (IQR 98-100). A safe method is presented to generate consensus sequences from HIV-NGS data at a 20% threshold, which will prove useful for molecular epidemiological studies. Copyright © 2016 Elsevier España, S.L.U. and Sociedad Española de Enfermedades Infecciosas y Microbiología Clínica. All rights reserved.
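The effect of the threshold can be illustrated with a small sketch. It is not Mesquite's algorithm: it assumes pre-aligned reads of equal length, keeps every base whose frequency at a position reaches the threshold, and writes the surviving set as an IUPAC ambiguity code. The read counts in the example are invented.

```python
from collections import Counter

# IUPAC ambiguity code for each set of nucleotides.
IUPAC_CODES = {
    frozenset("A"): "A", frozenset("C"): "C",
    frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y", frozenset("CG"): "S",
    frozenset("AT"): "W", frozenset("GT"): "K", frozenset("AC"): "M",
    frozenset("CGT"): "B", frozenset("AGT"): "D",
    frozenset("ACT"): "H", frozenset("ACG"): "V",
    frozenset("ACGT"): "N",
}


def threshold_consensus(reads, threshold):
    """Consensus keeping every base with per-column frequency >= threshold."""
    consensus = []
    for column in zip(*reads):
        counts = Counter(column)
        total = sum(counts.values())
        kept = frozenset(b for b, n in counts.items() if n / total >= threshold)
        # Unknown symbols or an empty set fall back to "N".
        consensus.append(IUPAC_CODES.get(kept, "N"))
    return "".join(consensus)


# Hypothetical position covered by 88 reads with A and 12 reads with G.
reads = ["A"] * 88 + ["G"] * 12
for t in (0.10, 0.15, 0.20):
    print(t, threshold_consensus(reads, t))
```

A minority variant at 12% appears as the ambiguity code R at the 10% threshold and disappears at 15% and 20%, so higher thresholds yield consensus sequences with fewer ambiguous positions.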
Prokop, Jeremy W.; Santos, Robson A. S.; Milsted, Amy
2013-01-01
The renin-angiotensin system is involved in multiple conditions ranging from cardiovascular disorders to cancer. Components of the pathway, including ACE, renin and angiotensin receptors are targets for disease treatment. This study addresses three receptors of the pathway: AT1, AT2, and MAS and how the receptors are similar and differ in activation by angiotensin peptides. Combining biochemical and amino acid variation data with multiple species sequence alignments, structural models, and docking site predictions allows for visualization of how angiotensin peptides may bind and activate the receptors; allowing identification of conserved and variant mechanisms in the receptors. MAS differs from AT1 favoring Ang-(1–7) and not Ang II binding, while AT2 recently has been suggested to preferentially bind Ang III. A new model of Ang peptide binding to AT1 and AT2 is proposed that correlates data from site directed mutagenesis and photolabeled experiments that were previously considered conflicting. Ang II binds AT1 and AT2 through a conserved initial binding mode involving amino acids 111 (consensus 325) of AT1 (Asn) interacting with Tyr (4) of Ang II and 199 and 256 (consensus 512 and 621, a Lys and His respectively) interacting with Phe (8) of Ang II. In MAS these sites are not conserved, leading to differential binding and activation by Ang-(1–7). In both AT1 and AT2, the Ang II peptide may internalize through Phe (8) of Ang II propagating through the receptors’ conserved aromatic amino acids to the final photolabeled positioning relative to either AT1 (amino acid 294, Asn, consensus 725) or AT2 (138, Leu, consensus 336). Understanding receptor activation provides valuable information for drug design and identification of other receptors that can potentially bind Ang peptides. PMID:23755216
Primary and secondary structural analyses of glutathione S-transferase pi from human placenta.
Ahmad, H; Wilson, D E; Fritz, R R; Singh, S V; Medh, R D; Nagle, G T; Awasthi, Y C; Kurosky, A
1990-05-01
The primary structure of glutathione S-transferase (GST) pi from a single human placenta was determined. The structure was established by chemical characterization of tryptic and cyanogen bromide peptides as well as automated sequence analysis of the intact enzyme. The structural analysis indicated that the protein is comprised of 209 amino acid residues and gave no evidence of post-translational modifications. The amino acid sequence differed from that of the deduced amino acid sequence determined by nucleotide sequence analysis of a cDNA clone (Kano, T., Sakai, M., and Muramatsu, M., 1987, Cancer Res. 47, 5626-5630) at position 104 which contained both valine and isoleucine whereas the deduced sequence from nucleotide sequence analysis identified only isoleucine at this position. These results demonstrated that in the one individual placenta studied at least two GST pi genes are coexpressed, probably as a result of allelomorphism. Computer assisted consensus sequence evaluation identified a hydrophobic region in GST pi (residues 155-181) that was predicted to be either a buried transmembrane helical region or a signal sequence region. The significance of this hydrophobic region was interpreted in relation to the mode of action of the enzyme especially in regard to the potential involvement of a histidine in the active site mechanism. A comparison of the chemical similarity of five known human GST complete enzyme structures, one of pi, one of mu, two of alpha, and one microsomal, gave evidence that all five enzymes have evolved by a divergent evolutionary process after gene duplication, with the microsomal enzyme representing the most divergent form.
Jiang, W; Woitach, J T; Gupta, D; Bhavanandan, V P
1998-10-20
Secreted epithelial mucins are extremely large and heterogeneous glycoproteins. We report the 5 kilobase DNA sequence of a second gene, BSM2, which encodes bovine submaxillary mucin. The determined nucleotide and deduced amino acid sequences of BSM2 are 95.2% and 92.2% identical, respectively, to those of the previously described BSM1 gene isolated from the same cow. Further, the five predicted protein domains of the two genes are 100%, 94%, 93%, 77%, and 88% identical. Based on the above results, we propose that expression of multiple homologous core proteins from a single animal is a factor in generating diversity of saccharides in mucins and in providing resistance of the molecules to proteolysis. In addition, this work raises several important issues in mucin cloning such as assembling sequences from seemingly overlapping clones and deducing consensus sequences for nearly identical tandem repeats. Copyright 1998 Academic Press.
van Verk, Marcel C; Pappaioannou, Dimitri; Neeleman, Lyda; Bol, John F; Linthorst, Huub J M
2008-04-01
PR-1a is a salicylic acid-inducible defense gene of tobacco (Nicotiana tabacum). One-hybrid screens identified a novel tobacco WRKY transcription factor (NtWRKY12) with specific binding sites in the PR-1a promoter at positions -564 (box WK(1)) and -859 (box WK(2)). NtWRKY12 belongs to the class of transcription factors in which the WRKY sequence is followed by a GKK rather than a GQK sequence. The binding sequence of NtWRKY12 (WK box TTTTCCAC) deviated significantly from the consensus sequence (W box TTGAC[C/T]) shown to be recognized by WRKY factors with the GQK sequence. Mutation of the GKK sequence in NtWRKY12 into GQK or GEK abolished binding to the WK box. The WK(1) box is in close proximity to binding sites in the PR-1a promoter for transcription factors TGA1a (as-1 box) and Myb1 (MBSII box). Expression studies with PR-1a promoter::beta-glucuronidase (GUS) genes in stably and transiently transformed tobacco indicated that NtWRKY12 and TGA1a act synergistically in PR-1a expression induced by salicylic acid and bacterial elicitors. Cotransfection of Arabidopsis thaliana protoplasts with 35S::NtWRKY12 and PR-1a::GUS promoter fusions showed that overexpression of NtWRKY12 resulted in a strong increase in GUS expression, which required functional WK boxes in the PR-1a promoter.
In silico analysis of β-1,3-glucanase from a psychrophilic yeast, Glaciozyma antarctica PI12
NASA Astrophysics Data System (ADS)
Mohammadi, Salimeh; Bakar, Farah Diba Abu; Rabu, Amir; Murad, Abdul Munir Abdul
2014-09-01
1,3-beta-glucanase is an industrially important enzyme with a wide range of applications, especially in the food industry. It is crucial to gain an understanding of the structural and functional aspects of the various beta-1,3-glucanases produced from diverse sources. In this study, a cDNA encoding β-1,3-glucanase (GaExg55) was isolated from a psychrophilic yeast, Glaciozyma antarctica PI12. The cDNA sequence has been submitted to GenBank under accession number KJ436377. Subsequently, the predicted protein was analyzed using various bioinformatics tools to explore its properties. GaExg55 consists of 1,440 bp encoding 480 amino acid residues. Alignment of the deduced amino acid sequence of GaExg55 with other exo-β-1,3-glucanases available in the NCBI database indicates that the deduced amino acid sequences share a consensus motif, NEP, which is a signature pattern of GH5 hydrolases. The predicted molecular weight of GaExg55 is 53.66 kDa. The GaExg55 sequence possesses a signal peptide sequence and is highly conserved with other fungal exo-beta-1,3-glucanases.
Embedding strategies for effective use of information from multiple sequence alignments.
Henikoff, S.; Henikoff, J. G.
1997-01-01
We describe a new strategy for utilizing multiple sequence alignment information to detect distant relationships in searches of sequence databases. A single sequence representing a protein family is enriched by replacing conserved regions with position-specific scoring matrices (PSSMs) or consensus residues derived from multiple alignments of family members. In comprehensive tests of these and other family representations, PSSM-embedded queries produced the best results overall when used with a special version of the Smith-Waterman searching algorithm. Moreover, embedding consensus residues instead of PSSMs improved performance with readily available single sequence query searching programs, such as BLAST and FASTA. Embedding PSSMs or consensus residues into a representative sequence improves searching performance by extracting multiple alignment information from motif regions while retaining single sequence information where alignment is uncertain. PMID:9070452
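The two family representations named above, a PSSM and consensus residues, can both be derived from an ungapped alignment block. The sketch below is a simplified illustration, not the authors' procedure: it assumes a uniform background frequency and a flat pseudocount, whereas real PSSM construction typically uses sequence weighting and substitution-based pseudocounts. The example block is hypothetical.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_pssm(block, pseudocount=1.0):
    """Log-odds PSSM (one dict per column) from an ungapped alignment block."""
    background = 1.0 / len(AMINO_ACIDS)  # assumption: uniform background
    pssm = []
    for column in zip(*block):
        counts = Counter(column)
        total = len(column) + pseudocount * len(AMINO_ACIDS)
        pssm.append({
            aa: math.log2(((counts[aa] + pseudocount) / total) / background)
            for aa in AMINO_ACIDS
        })
    return pssm


def score(pssm, segment):
    """Sum of column scores for a segment as long as the PSSM."""
    return sum(col[aa] for col, aa in zip(pssm, segment))


def consensus(pssm):
    """Highest-scoring residue per column (first residue wins ties)."""
    return "".join(max(col, key=col.get) for col in pssm)


# Hypothetical conserved block from three family members.
block = ["HEAGH", "HELGH", "HEIGH"]
pssm = build_pssm(block)
print(consensus(pssm))
print(score(pssm, "HEAGH") > score(pssm, "WWWWW"))
```

Embedding then amounts to replacing the conserved region of a representative sequence with either the PSSM columns or the consensus string, while the remaining positions keep their original residues.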
Gomes, S L; Gober, J W; Shapiro, L
1990-01-01
Caulobacter crescentus has a single dnaK gene that is highly homologous to the hsp70 family of heat shock genes. Analysis of the cloned and sequenced dnaK gene has shown that the deduced amino acid sequence could encode a protein of 67.6 kilodaltons that is 68% identical to the DnaK protein of Escherichia coli and 49% identical to the Drosophila and human hsp70 protein family. A partial open reading frame 165 base pairs 3' to the end of dnaK encodes a peptide of 190 amino acids that is 59% identical to DnaJ of E. coli. Northern blot analysis revealed a single 4.0-kilobase mRNA homologous to the cloned fragment. Since the dnaK coding region is 1.89 kilobases, dnaK and dnaJ may be transcribed as a polycistronic message. S1 mapping and primer extension experiments showed that transcription initiated at two sites 5' to the dnaK coding sequence. A single start site of transcription was identified during heat shock at 42 degrees C, and the predicted promoter sequence conformed to the consensus heat shock promoters of E. coli. At normal growth temperature (30 degrees C), a different start site was identified 3' to the heat shock start site that conformed to the E. coli sigma 70 promoter consensus sequence. S1 protection assays and analysis of expression of the dnaK gene fused to the lux transcription reporter gene showed that expression of dnaK is temporally controlled under normal physiological conditions and that transcription occurs just before the initiation of DNA replication. Thus, in both human cells (I. K. L. Milarski and R. I. Morimoto, Proc. Natl. Acad. Sci. USA 83:9517-9521, 1986) and in a simple bacterium, the transcription of a hsp70 gene is temporally controlled as a function of the cell cycle under normal growth conditions. PMID:2345134
Sugimura; Sawabe; Ezura
2000-01-01
The alginate lyase-coding genes of Vibrio halioticoli IAM 14596(T), which was isolated from the gut of the abalone Haliotis discus hannai, were cloned using plasmid vector pUC 18, and expressed in Escherichia coli. Three alginate lyase-positive clones, pVHB, pVHC, and pVHE, were obtained, and all clones expressed the enzyme activity specific for polyguluronate. Three genes, alyVG1, alyVG2, and alyVG3, encoding polyguluronate lyase were sequenced: alyVG1 from pVHB was composed of a 1056-bp open reading frame (ORF) encoding 352 amino acid residues; alyVG2 gene from pVHC was composed of a 993-bp ORF encoding 331 amino acid residues; and alyVG3 gene from pVHE was composed of a 705-bp ORF encoding 235 amino acid residues. Comparison of nucleotide and deduced amino acid sequences among AlyVG1, AlyVG2, and AlyVG3 revealed low homologies. The identity value between AlyVG1 and AlyVG2 was 18.7%, and that between AlyVG2 and AlyVG3 was 17.0%. A higher identity value (26.0%) was observed between AlyVG1 and AlyVG3. Sequence comparison among known polyguluronate lyases including AlyVG1, AlyVG2, and AlyVG3 also did not reveal an identical region in these sequences. However, AlyVG1 showed the highest identity value (36.2%) and the highest similarity (73.3%) to AlyA from Klebsiella pneumoniae. A consensus region comprising nine amino acids (YFKAGXYXQ) in the carboxy-terminal region previously reported by Mallisard and colleagues was observed only in AlyVG1 and AlyVG2.
Isolation of laccase gene-specific sequences from white rot and brown rot fungi by PCR.
D'Souza, T M; Boominathan, K; Reddy, C A
1996-01-01
Degenerate primers corresponding to the consensus sequences of the copper-binding regions in the N-terminal domains of known basidiomycete laccases were used to isolate laccase gene-specific sequences from strains representing nine genera of wood rot fungi. All except three gave the expected PCR product of about 200 bp. Computer searches of the databases identified the sequence of each of the PCR products analyzed as a laccase gene sequence, suggesting the specificity of the primers. PCR products of the white rot fungi Ganoderma lucidum, Phlebia brevispora, and Trametes versicolor showed 65 to 74% nucleotide sequence similarity to each other; the similarity in deduced amino acid sequences was 83 to 91%. The PCR products of Lentinula edodes and Lentinus tigrinus, on the other hand, showed relatively low nucleotide and amino acid similarities (58 to 64 and 62 to 81%, respectively); however, these similarities were still much higher than when compared with the corresponding regions in the laccases of the ascomycete fungi Aspergillus nidulans and Neurospora crassa. A few of the white rot fungi, as well as Gloeophyllum trabeum, a brown rot fungus, gave a 144-bp PCR fragment which had a nucleotide sequence similarity of 60 to 71%. Demonstration of laccase activity in G. trabeum and several other brown rot fungi was of particular interest because these organisms were not previously shown to produce laccases. PMID:8837429
Cook, W B; Walker, J C
1992-01-01
A cDNA encoding a nuclear-encoded chloroplast nucleic acid-binding protein (NBP) has been isolated from maize. Identified as an in vitro DNA-binding activity, NBP belongs to a family of nuclear-encoded chloroplast proteins which share a common domain structure and are thought to be involved in posttranscriptional regulation of chloroplast gene expression. NBP contains an N-terminal chloroplast transit peptide, a highly acidic domain and a pair of ribonucleoprotein consensus sequence domains. NBP is expressed in a light-dependent, organ-specific manner which is consistent with its involvement in chloroplast biogenesis. The relationship of NBP to the other members of this protein family and their possible regulatory functions are discussed. PMID:1346929
Singh, Reema; Schilde, Christina; Schaap, Pauline
2016-11-17
Dictyostelia are a well-studied group of organisms with colonial multicellularity, which are members of the mostly unicellular Amoebozoa. A phylogeny based on SSU rDNA data subdivided all Dictyostelia into four major groups, but left the position of the root and of six group-intermediate taxa unresolved. Recent phylogenies inferred from 30 or 213 proteins from sequenced genomes, positioned the root between two branches, each containing two major groups, but lacked data to position the group-intermediate taxa. Since the positions of these early diverging taxa are crucial for understanding the evolution of phenotypic complexity in Dictyostelia, we sequenced six representative genomes of early diverging taxa. We retrieved orthologs of 47 housekeeping proteins with an average size of 890 amino acids from six newly sequenced and eight published genomes of Dictyostelia and unicellular Amoebozoa and inferred phylogenies from single and concatenated protein sequence alignments. Concatenated alignments of all 47 proteins, and four out of five subsets of nine concatenated proteins all produced the same consensus phylogeny with 100% statistical support. Trees inferred from just two out of the 47 proteins, individually reproduced the consensus phylogeny, highlighting that single gene phylogenies will rarely reflect correct species relationships. However, sets of two or three concatenated proteins again reproduced the consensus phylogeny, indicating that a small selection of genes suffices for low cost classification of as yet unincorporated or newly discovered dictyostelid and amoebozoan taxa by gene amplification. The multi-locus consensus phylogeny shows that groups 1 and 2 are sister clades in branch I, with the group-intermediate taxon D. polycarpum positioned as outgroup to group 2. 
Branch II consists of groups 3 and 4, with the group-intermediate taxon Polysphondylium violaceum positioned as sister to group 4, and the group-intermediate taxon Dictyostelium polycephalum branching at the base of that whole clade. Given the data, the approximately unbiased test rejects all alternative topologies favoured by SSU rDNA and individual proteins with high statistical support. The test also rejects monophyletic origins for the genera Acytostelium, Polysphondylium and Dictyostelium. The current position of Acytostelium ellipticum in the consensus phylogeny indicates that somatic cells were lost twice in Dictyostelia.
Grötzinger, Stefan W.; Alam, Intikhab; Ba Alawi, Wail; Bajic, Vladimir B.; Stingl, Ulrich; Eppinger, Jörg
2014-01-01
Reliable functional annotation of genomic data is the key step in the discovery of novel enzymes. Intrinsic sequencing data quality problems of single amplified genomes (SAGs) and poor homology of novel extremophile genomes pose significant challenges for the attribution of functions to the coding sequences identified. The anoxic deep-sea brine pools of the Red Sea are a promising source of novel enzymes with unique evolutionary adaptation. Sequencing data from Red Sea brine pool cultures and SAGs are annotated and stored in the Integrated Data Warehouse of Microbial Genomes (INDIGO) data warehouse. Low sequence homology of annotated genes (no similarity for 35% of these genes) may translate into false positives when searching for specific functions. The Profile and Pattern Matching (PPM) strategy described here was developed to eliminate false positive annotations of enzyme function before progressing to labor-intensive hyper-saline gene expression and characterization. It utilizes InterPro-derived Gene Ontology (GO)-terms (which represent enzyme function profiles) and annotated relevant PROSITE IDs (which are linked to an amino acid consensus pattern). The PPM algorithm was tested on 15 protein families, which were selected based on scientific and commercial potential. An initial list of 2577 enzyme commission (E.C.) numbers was translated into 171 GO-terms and 49 consensus patterns. A subset of INDIGO-sequences consisting of 58 SAGs from six different taxa of bacteria and archaea was selected from six different brine pool environments. Those SAGs code for 74,516 genes, which were independently scanned for the GO-terms (profile filter) and PROSITE IDs (pattern filter). Following stringent reliability filtering, the non-redundant hits (106 profile hits and 147 pattern hits) are classified as reliable if at least two relevant descriptors (GO-terms and/or consensus patterns) are present.
Scripts for annotation, as well as for the PPM algorithm, are available through the INDIGO website. PMID:24778629
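The final reliability rule (at least two relevant descriptors per gene) can be sketched as a set intersection. This is only an illustration of that one step, not the INDIGO scripts; the gene names and the GO and PROSITE identifiers below are placeholders chosen for the example, not values from the study.

```python
def reliable_hits(gene_descriptors, relevant, min_descriptors=2):
    """Keep genes annotated with at least `min_descriptors` relevant IDs.

    gene_descriptors: dict mapping gene ID -> set of GO terms / PROSITE IDs
    relevant: set of descriptors associated with the enzyme family of interest
    """
    hits = {}
    for gene, found in gene_descriptors.items():
        matched = found & relevant
        if len(matched) >= min_descriptors:
            hits[gene] = sorted(matched)
    return hits


# Placeholder identifiers, for illustration only.
relevant = {"GO:0000001", "GO:0000002", "PS00001"}
annotations = {
    "gene_a": {"GO:0000001", "PS00001", "GO:9999999"},  # two relevant descriptors
    "gene_b": {"GO:0000002"},                           # only one
    "gene_c": {"GO:9999999"},                           # none
}
print(reliable_hits(annotations, relevant))
```

Requiring agreement between independent descriptors trades sensitivity for fewer false positives, which matches the stated aim of filtering before labor-intensive expression work.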
Multiple splicing defects in an intronic false exon.
Sun, H; Chasin, L A
2000-09-01
Splice site consensus sequences alone are insufficient to dictate the recognition of real constitutive splice sites within the typically large transcripts of higher eukaryotes, and large numbers of pseudoexons flanked by pseudosplice sites with good matches to the consensus sequences can be easily designated. In an attempt to identify elements that prevent pseudoexon splicing, we have systematically altered known splicing signals, as well as immediately adjacent flanking sequences, of an arbitrarily chosen pseudoexon from intron 1 of the human hprt gene. The substitution of a 5' splice site that perfectly matches the 5' consensus combined with mutation to match the CAG/G sequence of the 3' consensus failed to get this model pseudoexon included as the central exon in a dhfr minigene context. Provision of a real 3' splice site and a consensus 5' splice site and removal of an upstream inhibitory sequence were necessary and sufficient to confer splicing on the pseudoexon. This activated context also supported the splicing of a second pseudoexon sequence containing no apparent enhancer. Thus, both the 5' splice site sequence and the polypyrimidine tract of the pseudoexon are defective despite their good agreement with the consensus. On the other hand, the pseudoexon body did not exert a negative influence on splicing. The introduction into the pseudoexon of a sequence selected for binding to ASF/SF2 or its replacement with beta-globin exon 2 only partially reversed the effect of the upstream negative element and the defective polypyrimidine tract. These results support the idea that exon-bridging enhancers are not a prerequisite for constitutive exon definition and suggest that intrinsically defective splice sites and negative elements play important roles in distinguishing the real splicing signal from the vast number of false splicing signals.
NASA Astrophysics Data System (ADS)
Prasetyo, Afiono Agung; Dharmawan, Ruben; Sari, Yulia; Sariyatun, Ratna
2017-02-01
Human immunodeficiency virus type 1 (HIV-1) remains a global health problem. Continued study of HIV-1 genetic and immunological profiles is important for finding strategies against the virus. This study aimed to analyse sequence conservation, the HLA-E-restricted peptide, and best-defined CTL/CD8+ epitopes in p24 (capsid) of HIV-1 subtype B worldwide. The p24-coding sequences from 3,557 HIV subtype B isolates were aligned using MUSCLE and analysed. Some highly conserved regions (sequence conservation ≥95%) were observed. Two considerably long stretches with 100% conservation were observed at bases 349-356 and 550-557 of p24 (HXB2 numbering). The consensus from all aligned isolates was precisely the same as consensus B in the Los Alamos HIV Database. The HLA-E-restricted peptide at amino acids (aa) 14-22 of HIV-1 p24 (AISPRTLNA) was found in 55.9% (1,987/3,557) of HIV-1 subtype B isolates worldwide. Forty-four best-defined CTL/CD8+ epitopes were observed, of which the VKNWMTETL epitope (aa 181-189 of p24), restricted by B*4801, was the most frequent, found in 94.9% of isolates. These results contribute information about HIV-1 subtype B and may benefit further work to develop diagnostic and therapeutic strategies against the virus.
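The per-position conservation reported in this abstract can be illustrated with a minimal sketch. It assumes the sequences have already been aligned to equal length (the study used MUSCLE for that step). The toy alignment and the function name `column_conservation` are hypothetical, and this is not the authors' own analysis code.

```python
from collections import Counter

def column_conservation(aligned):
    """Return (consensus, fractions) for equal-length aligned sequences.

    fractions[i] is the share of sequences carrying the most common
    character in alignment column i.
    """
    if len({len(s) for s in aligned}) != 1:
        raise ValueError("sequences must be aligned to the same length")
    consensus, fractions = [], []
    for column in zip(*aligned):
        residue, count = Counter(column).most_common(1)[0]
        consensus.append(residue)
        fractions.append(count / len(column))
    return "".join(consensus), fractions

# Hypothetical four-sequence alignment, for illustration only.
toy = ["ATGCA", "ATGCA", "ATGTA", "ATGCA"]
consensus, fractions = column_conservation(toy)
# 1-based columns meeting the >=95% threshold used in the abstract.
conserved = [i + 1 for i, f in enumerate(fractions) if f >= 0.95]
```

Runs of consecutive conserved columns would then correspond to conserved stretches such as those the abstract reports.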
Sasazawa, Yukiko; Sato, Natsumi; Suzuki, Takehiro; Dohmae, Naoshi; Simizu, Siro
The thrombopoietin receptor, also known as c-Mpl, is a member of the cytokine superfamily, which regulates the differentiation of megakaryocytes and formation of platelets by binding to its ligand, thrombopoietin (TPO), through Janus kinase (JAK)-signal transducer and activator of transcription (STAT) signaling. Loss-of-function mutations of c-Mpl cause severe thrombocytopenia due to impaired megakaryocytopoiesis, and gain-of-function mutations cause thrombocythemia. c-Mpl contains two Trp-Ser-Xaa-Trp-Ser (Xaa represents any amino acid) sequences, which are characteristic sequences of type I cytokine receptors, corresponding to the C-mannosylation consensus sequence Trp-Xaa-Xaa-Trp/Cys. C-mannosylation is a post-translational modification of tryptophan residues in which one mannose is attached to the first tryptophan residue in the consensus sequence via a C-C linkage. Although c-Mpl contains some C-mannosylation sequences, whether c-Mpl is C-mannosylated has not been investigated. We identified that c-Mpl is C-mannosylated not only at Trp(269) and Trp(474), which are putative C-mannosylation sites, but also at Trp(272), Trp(416), and Trp(477). Using C-mannosylation-defective mutants of c-Mpl, we found that the C-mannosylated tryptophan residues at four sites (Trp(269), Trp(272), Trp(474), and Trp(477)) are essential for c-Mpl-mediated JAK-STAT signaling. Our findings suggest that C-mannosylation of c-Mpl is a possible therapeutic target for platelet disorders. Copyright © 2015 Elsevier Inc. All rights reserved.
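As an illustration of the Trp-Xaa-Xaa-Trp/Cys consensus named above, the following sketch scans a protein sequence for candidate motifs. The function name `candidate_sites` and the example peptides are hypothetical. A sequence match only marks a putative site: the abstract itself reports experimentally modified residues (for example Trp(272)) that are not the first Trp of a match.

```python
import re

# Trp-Xaa-Xaa-Trp/Cys in one-letter code. The lookahead makes the scan
# report overlapping motifs instead of skipping past each match.
C_MANNOSYLATION = re.compile(r"(?=W..[WC])")

def candidate_sites(protein):
    """Return 1-based positions of the first Trp of each W-x-x-(W/C) match."""
    return [m.start() + 1 for m in C_MANNOSYLATION.finditer(protein.upper())]

# Hypothetical peptides, not taken from c-Mpl.
print(candidate_sites("MWSAWSGG"))
print(candidate_sites("WAAWAAC"))
```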
Enterocin T, a novel class IIa bacteriocin produced by Enterococcus sp. 812.
Chen, Yi-Sheng; Yu, Chi-Rong; Ji, Si-Hua; Liou, Min-Shiuan; Leong, Kun-Hon; Pan, Shwu-Fen; Wu, Hui-Chung; Lin, Yu-Hsuan; Yu, Bi; Yanagida, Fujitoshi
2013-09-01
Enterococcus sp. 812, isolated from fresh broccoli, was previously found to produce a bacteriocin active against a number of Gram-positive bacteria, including Listeria monocytogenes. Bacteriocin activity decreased slightly after autoclaving (121 °C for 15 min), but was inactivated by protease K. Mass spectrometry analysis revealed the bacteriocin mass to be approximately 4,521.34 Da. N-terminal amino acid sequencing yielded a partial sequence, NH2-ATYYGNGVYXDKKKXWVEWGQA, by Edman degradation, which contained the consensus class IIa bacteriocin motif YGNGV in the N-terminal region. The obtained partial sequence showed high homology with some enterococcal bacteriocins; however, no identical peptide or protein was found. This peptide was therefore considered to be a novel bacteriocin produced by Enterococcus sp. 812 and was termed enterocin T.
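The position of the class IIa motif in the reported partial sequence can be checked with a short sketch. The sequence string is copied from the abstract (X marks a residue that was not identified); the helper name `motif_position` is hypothetical.

```python
def motif_position(sequence, motif):
    """Return the 1-based start of the first exact motif match, or None."""
    index = sequence.upper().find(motif.upper())
    return None if index < 0 else index + 1

# Partial N-terminal sequence of enterocin T as given in the abstract.
partial = "ATYYGNGVYXDKKKXWVEWGQA"
position = motif_position(partial, "YGNGV")
print(position)
```

An exact string search is enough here because the motif itself contains no ambiguous positions.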
Peptide Array X-Linking (PAX): A New Peptide-Protein Identification Approach
Okada, Hirokazu; Uezu, Akiyoshi; Soderblom, Erik J.; Moseley, M. Arthur; Gertler, Frank B.; Soderling, Scott H.
2012-01-01
Many protein interaction domains bind short peptides based on canonical sequence consensus motifs. Here we report the development of a peptide array-based proteomics tool to identify proteins directly interacting with ligand peptides from cell lysates. Array-formatted bait peptides containing an amino acid-derived cross-linker are photo-induced to crosslink with interacting proteins from lysates of interest. Indirect associations are removed by high stringency washes under denaturing conditions. Covalently trapped proteins are subsequently identified by LC-MS/MS and screened by cluster analysis and domain scanning. We apply this methodology to peptides with different proline-containing consensus sequences and show successful identifications from brain lysates of known and novel proteins containing polyproline motif-binding domains such as EH, EVH1, SH3, and WW domains. These results suggest that the capacity of arrayed peptide ligands to capture and subsequently identify proteins by mass spectrometry is relatively broad and robust. Additionally, the approach is rapid and applicable to cell or tissue fractions from any source, making it a flexible tool for initial protein-protein interaction discovery. PMID:22606326
Sequence analysis and expression of the M1 and M2 matrix protein genes of hirame rhabdovirus (HIRRV)
Nishizawa, T.; Kurath, G.; Winton, J.R.
1997-01-01
We have cloned and sequenced a 2318 nucleotide region of the genomic RNA of hirame rhabdovirus (HIRRV), an important viral pathogen of Japanese flounder Paralichthys olivaceus. This region comprises approximately two-thirds of the 3' end of the nucleocapsid protein (N) gene and the complete matrix protein (M1 and M2) genes with the associated intergenic regions. The partial N gene sequence was 812 nucleotides in length with an open reading frame (ORF) that encoded the carboxyl-terminal 250 amino acids of the N protein. The M1 and M2 genes were 771 and 700 nucleotides in length, respectively, with ORFs encoding proteins of 227 and 193 amino acids. The M1 gene sequence contained an additional small ORF that could encode a highly basic, arginine-rich protein of 25 amino acids. Comparisons of the N, M1, and M2 gene sequences of HIRRV with the corresponding sequences of the fish rhabdoviruses, infectious hematopoietic necrosis virus (IHNV) or viral hemorrhagic septicemia virus (VHSV) indicated that HIRRV was more closely related to IHNV than to VHSV, but was clearly distinct from either. The putative consensus gene termination sequence for IHNV and VHSV, AGAYAG(A)(7), was present in the N-M1, M1-M2, and M2-G intergenic regions of HIRRV as were the putative transcription initiation sequences YGGCAC and AACA. An Escherichia coli expression system was used to produce recombinant proteins from the M1 and M2 genes of HIRRV. These were the same size as the authentic M1 and M2 proteins and reacted with anti-HIRRV rabbit serum in western blots. These reagents can be used for further study of the fish immune response and to test novel control methods.
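The putative termination consensus AGAYAG(A)(7) uses the IUPAC ambiguity code Y (C or T). A minimal sketch of searching for it follows. It assumes that (A)(7) means a run of seven adenines and that sequences are written with DNA letters. The function names and test sequence are hypothetical, and only a subset of IUPAC codes is included.

```python
import re

# Subset of IUPAC nucleotide codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "N": "[ACGT]",
}

def iupac_to_regex(pattern):
    """Translate an IUPAC nucleotide pattern into a regular expression."""
    return "".join(IUPAC[base] for base in pattern.upper())

# AGAYAG followed by seven A residues (assumed reading of "(A)(7)").
TERMINATION = re.compile(iupac_to_regex("AGAYAG") + "A{7}")

def termination_sites(sequence):
    """Return 1-based start positions of non-overlapping matches."""
    return [m.start() + 1 for m in TERMINATION.finditer(sequence.upper())]

# Hypothetical sequence built so the motif starts at position 3.
toy = "CC" + "AGATAG" + "A" * 7 + "GG"
print(termination_sites(toy))
```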
Polypeptide p41 of a Norwalk-Like Virus Is a Nucleic Acid-Independent Nucleoside Triphosphatase
Pfister, Thomas; Wimmer, Eckard
2001-01-01
Southampton virus (SHV) is a member of the Norwalk-like viruses (NLVs), one of four genera of the family Caliciviridae. The genome of SHV contains three open reading frames (ORFs). ORF 1 encodes a polyprotein that is autocatalytically processed into six proteins, one of which is p41. p41 shares sequence motifs with protein 2C of picornaviruses and superfamily 3 helicases. We have expressed p41 of SHV in bacteria. Purified p41 exhibited nucleoside triphosphate (NTP)-binding and NTP hydrolysis activities. The NTPase activity was not stimulated by single-stranded nucleic acids. SHV p41 had no detectable helicase activity. Protein sequence comparison between the consensus sequences of NLV p41 and enterovirus protein 2C revealed regions of high similarity. According to secondary structure prediction, the conserved regions were located within a putative central domain of alpha helices and beta strands. This study reveals for the first time an NTPase activity associated with a calicivirus-encoded protein. Based on enzymatic properties and sequence information, a functional relationship between NLV p41 and enterovirus 2C is discussed in regard to the role of 2C-like proteins in virus replication. PMID:11160659
Genomic Structure of the Luciferase Gene from the Bioluminescent Beetle, Nyctophila cf. Caucasica
Day, John C.; Chaichi, Mohammad J.; Najafil, Iraj; Whiteley, Andrew S.
2006-01-01
The gene coding for beetle luciferase, the enzyme responsible for bioluminescence in over two thousand coleopteran species has, to date, only been characterized from one Palearctic species of Lampyridae. Here we report the characterization of the luciferase gene from a female beetle of an Iranian lampyrid species, Nyctophila cf. caucasica (Coleoptera:Lampyridae). The luciferase gene was composed of seven exons, coding for 547 amino acids, separated by six introns spanning 1976 bp of genomic DNA. The deduced amino acid sequences of the luciferase gene of N. caucasica showed 98.9% homology to that of the Palearctic species Lampyris noctiluca. Analysis of the 810 bp upstream region of the luciferase gene revealed three TATA boxes and several other consensus transcriptional factor recognition sequences presenting evidence for a putative core promoter region conserved in Lampyrinae from -190 through to -155 upstream of the luciferase start codon. Along with the core promoter region the luciferase gene was compared with orthologous sequences from other lampyrid species and found to have greatest identity to Lampyris turkistanicus and Lampyris noctiluca. The significant sequence identity to the former is discussed in relation to taxonomic issues of Iranian lampyrids. PMID:20298115
Ito, Y; Ikeuchi, A; Imamura, C
2013-01-01
We aimed at constructing thermostable cellulase variants of cellobiohydrolase II, derived from the mesophilic fungus Phanerochaete chrysosporium, by using an advanced evolutionary molecular engineering method. By aligning the amino acid sequences of the catalytic domains of five thermophilic fungal CBH2 and PcCBH2 proteins, we identified 45 positions where the PcCBH2 genes differ from the consensus sequence of two to five thermophilic fungal CBH2s. PcCBH2 variants with the consensus mutations were obtained by a cell-free translation system that was chosen for easy evaluation of thermostability. From the small library of consensus mutations, advantageous mutations for improving thermostability were found to occur with much higher frequency relative to a random library. To further improve thermostability, advantageous mutations were accumulated within the wild-type gene. Finally, we obtained the most thermostable variant Mall4, which contained all 15 advantageous mutations found in this study. This variant had the same specific cellulase activity as the wild type and retained sufficient activity at 50°C for >72 h, whereas wild-type PcCBH2 retained much less activity under the same conditions. The history of the accumulation process indicated that evolution of PcCBH2 toward improved thermostability was ideally and rapidly accomplished through the evolutionary process employed in this study.
Takagi, M; Kobayashi, N; Sugimoto, M; Fujii, T; Watari, J; Yano, K
1987-01-01
The expression of a LEU gene from Candida maltosa (designated as C-LEU2) isolated previously (Kawamura et al. 1983) was shown to be regulated, when transferred into Saccharomyces cerevisiae, by leucine and threonine in the medium, as in the case of LEU2 gene of S. cerevisiae. The coding region together with the regulatory region was subcloned and the nucleotide sequence was determined. When the sequence of the coding region was compared with that of LEU2, the homology was 72% for base pairs and 76% for deduced amino acids. Comparison of the regulatory region of C-LEU2 with those of LEU1 and LEU2 suggested a few short consensus sequences which are involved in regulation of gene expression by leucine and threonine in the medium.
Li, Fan; Ma, Liying; Feng, Yi; Hu, Jing; Ni, Na; Ruan, Yuhua; Shao, Yiming
2017-06-01
HIV-1 transmission in intravenous drug users (IDUs) is characterized by high genetic multiplicity, which suggests a greater challenge for blocking HIV-1 infection. We investigated a total of 749 sequences of the full-length gp160 gene obtained by single genome sequencing (SGS) from 22 HIV-1 early infected IDUs in Xinjiang province, northwest China, and generated a transmitted and founder virus (T/F virus) consensus sequence (IDU.CON). The T/F virus was classified as subtype CRF07_BC and predicted to be a CCR5-tropic virus. The variable regions (V1, V2, and V4 loops) of IDU.CON showed length variation compared with the heterosexual T/F virus consensus sequence (HSX.CON) and the homosexual T/F virus consensus sequence (MSM.CON). A total of 26 N-linked glycosylation sites were discovered in the IDU.CON sequence, fewer than in MSM.CON and HSX.CON. Characterization of T/F virus from IDUs highlights the genetic make-up and complexity of the virus near the moment of transmission or in early infection preceding systemic dissemination and is important for the development of effective HIV-1 preventive methods, including vaccines.
Enterocin TW21, a novel bacteriocin from dochi-isolated Enterococcus faecium D081821.
Chang, S-Y; Chen, Y-S; Pan, S-F; Lee, Y-S; Chang, C-H; Chang, C-H; Yu, B; Wu, H-C
2013-09-01
Purification and characterization of a novel bacteriocin produced by strain Enterococcus faecium D081821. Enterococcus faecium D081821, isolated from the traditional Taiwanese fermented food dochi (fermented black beans), was previously found to produce a bacteriocin against Listeria monocytogenes and some Gram-positive bacteria. This bacteriocin, termed enterocin TW21, was purified from culture supernatant by ammonium sulfate precipitation, Sep-Pak C18 cartridge, ion-exchange and gel filtration chromatography. Mass spectrometry analysis showed the mass of the peptide to be approximately 5300.6 Da. N-terminal amino acid sequencing by Edman degradation yielded a partial sequence, NH2-ATYYGNGVYxNTQK, which contains the consensus class IIa bacteriocin motif YGNGV in the N-terminal region. The open reading frame (ORF) encoding the bacteriocin was identified from the draft genome sequence of Enterococcus faecium D081821, and sequence analysis of this peptide indicated that enterocin TW21 is a novel bacteriocin. Enterococcus faecium D081821 produced a bacteriocin named enterocin TW21; its molecular weight and amino acid sequence both indicated that it is a novel bacteriocin. A new member of the class IIa bacteriocins was identified. This bacteriocin shows great inhibitory ability against L. monocytogenes and could be applied as a natural food preservative. © 2013 The Society for Applied Microbiology.
Bryant, D A; de Lorimier, R; Lambert, D H; Dubbs, J M; Stirewalt, V L; Stevens, S E; Porter, R D; Tam, J; Jay, E
1985-01-01
The genes for the alpha- and beta-subunit apoproteins of allophycocyanin (AP) were isolated from the cyanelle genome of Cyanophora paradoxa and subjected to nucleotide sequence analysis. The AP beta-subunit apoprotein gene was localized to a 7.8-kilobase-pair Pst I restriction fragment from cyanelle DNA by hybridization with a tetradecameric oligonucleotide probe. Sequence analysis using that oligonucleotide and its complement as primers for the dideoxy chain-termination sequencing method confirmed the presence of both AP alpha- and beta-subunit genes on this restriction fragment. Additional oligonucleotide primers were synthesized as sequencing progressed and were used to determine rapidly the nucleotide sequence of a 1336-base-pair region of this cloned fragment. This strategy allowed the sequencing to be completed without a detailed restriction map and without extensive and time-consuming subcloning. The sequenced region contains two open reading frames whose deduced amino acid sequences are 81-85% homologous to cyanobacterial and red algal AP subunits whose amino acid sequences have been determined. The two open reading frames are in the same orientation and are separated by 39 base pairs. AP alpha is 5' to AP beta and both coding sequences are preceded by a polypurine, Shine-Dalgarno-type sequence. Sequences upstream from AP alpha closely resemble the Escherichia coli consensus promoter sequences and also show considerable homology to promoter sequences for several chloroplast-encoded psbA genes. A 56-base-pair palindromic sequence downstream from the AP beta gene could play a role in the termination of transcription or translation. The allophycocyanin apoprotein subunit genes are located on the large single-copy region of the cyanelle genome. PMID:2987916
Zhou, Gaofeng; Jian, Jianbo; Wang, Penghao; Li, Chengdao; Tao, Ye; Li, Xuan; Renshaw, Daniel; Clements, Jonathan; Sweetingham, Mark; Yang, Huaan
2018-01-01
An ultra-high density genetic map containing 34,574 sequence-defined markers was developed in Lupinus angustifolius. Markers closely linked to nine genes of agronomic traits were identified. A physical map was improved to cover 560.5 Mb genome sequence. Lupin (Lupinus angustifolius L.) is a recently domesticated legume grain crop. In this study, we applied the restriction-site associated DNA sequencing (RADseq) method to genotype an F9 recombinant inbred line population derived from a wild type × domesticated cultivar (W × D) cross. A high density linkage map was developed based on the W × D population. By integrating sequence-defined DNA markers reported in previous mapping studies, we established an ultra-high density consensus genetic map, which contains 34,574 markers consisting of 3508 loci covering 2399 cM on 20 linkage groups. The largest gap in the entire consensus map was 4.73 cM. The high density W × D map and the consensus map were used to develop an improved physical map, which covered 560.5 Mb of genome sequence data. The ultra-high density consensus linkage map, the improved physical map and the markers linked to genes of breeding interest reported in this study provide a common tool for genome sequence assembly, structural genomics, comparative genomics, functional genomics, QTL mapping, and molecular plant breeding in lupin.
Musinova, Yana R; Kananykhina, Eugenia Y; Potashnikova, Daria M; Lisitsyna, Olga M; Sheval, Eugene V
2015-01-01
The majority of known nucleolar proteins are freely exchanged between the nucleolus and the surrounding nucleoplasm. One way proteins are retained in the nucleoli is by the presence of specific amino acid sequences, namely nucleolar localization signals (NoLSs). The mechanism by which NoLSs retain proteins inside the nucleoli is still unclear. Here, we present data showing that the charge-dependent (electrostatic) interactions of NoLSs with nucleolar components lead to nucleolar accumulation as follows: (i) known NoLSs are enriched in positively charged amino acids, but the NoLS structure is highly heterogeneous, and it is not possible to identify a consensus sequence for this type of signal; (ii) in two analyzed proteins (NF-κB-inducing kinase and HIV-1 Tat), the NoLS corresponds to a region that is enriched for positively charged amino acid residues; substituting charged amino acids with non-charged ones reduced the nucleolar accumulation in proportion to the charge reduction, and nucleolar accumulation efficiency was strongly correlated with the predicted charge of the tested sequences; and (iii) sequences containing only lysine or arginine residues (which were referred to as imitative NoLSs, or iNoLSs) are accumulated in the nucleoli in a charge-dependent manner. The results of experiments with iNoLSs suggested that charge-dependent accumulation inside the nucleoli was dependent on interactions with nucleolar RNAs. The results of this work are consistent with the hypothesis that nucleolar protein accumulation by NoLSs can be determined by the electrostatic interaction of positively charged regions with nucleolar RNAs rather than by any sequence-specific mechanism. Copyright © 2014 Elsevier B.V. All rights reserved.
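The idea of a "predicted charge" for a candidate signal sequence can be illustrated with a deliberately crude sketch. It counts Lys/Arg as +1 and Asp/Glu as -1, ignoring histidine, the termini, and pKa effects. The abstract does not say how the authors computed charge, so this is an assumption for illustration only, and the function name is hypothetical.

```python
POSITIVE = set("KR")  # lysine, arginine
NEGATIVE = set("DE")  # aspartate, glutamate

def predicted_charge(peptide):
    """Crude integer net charge: (K + R) minus (D + E)."""
    peptide = peptide.upper()
    positives = sum(aa in POSITIVE for aa in peptide)
    negatives = sum(aa in NEGATIVE for aa in peptide)
    return positives - negatives

# Hypothetical sequences: an arginine-only stretch and a mixed peptide.
print(predicted_charge("RRRRRRRRR"))
print(predicted_charge("KKRRD"))
```

Under this simplification, replacing a charged residue with a non-charged one changes the estimate by exactly one unit, which mirrors the substitution experiments described in the abstract.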
Campion, S R; Ameen, A S; Lai, L; King, J M; Munzenmaier, T N
2001-08-15
This report describes the application of a simple computational tool, AAPAIR.TAB, for the systematic analysis of the cysteine-rich EGF, Sushi, and Laminin motif/sequence families at the two-amino acid level. Automated dipeptide frequency/bias analysis detects preferences in the distribution of amino acids in established protein families, by determining which "ordered dipeptides" occur most frequently in comprehensive motif-specific sequence data sets. Graphic display of the dipeptide frequency/bias data revealed family-specific preferences for certain dipeptides, but more importantly detected a shared preference for employment of the ordered dipeptides Gly-Tyr (GY) and Gly-Phe (GF) in all three protein families. The dipeptide Asn-Gly (NG) also exhibited high-frequency and bias in the EGF and Sushi motif families, whereas Asn-Thr (NT) was distinguished in the Laminin family. Evaluation of the distribution of dipeptides identified by frequency/bias analysis subsequently revealed the highly restricted localization of the G(F/Y) and N(G/T) sequence elements at two separate sites of extreme conservation in the consensus sequence of all three sequence families. The similar employment of the high-frequency/bias dipeptides in three distinct protein sequence families was further correlated with the concurrence of these shared molecular determinants at similar positions within the distinctive scaffolds of three structurally divergent, but similarly employed, motif modules.
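The counting step behind an ordered-dipeptide frequency analysis can be sketched as follows. This is not AAPAIR.TAB itself: the abstract does not define the tool's bias statistic, so only raw counts are computed here, and the function name and example sequences are hypothetical.

```python
from collections import Counter

def dipeptide_counts(sequences):
    """Count ordered dipeptides (overlapping residue pairs) across sequences."""
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        # Each residue pairs with its immediate C-terminal neighbour.
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    return counts

# Hypothetical motif fragments, for illustration only.
counts = dipeptide_counts(["CGYGF", "NGY"])
print(counts.most_common(3))
```

Because the pairs are ordered, "GY" and "YG" are tallied separately, matching the abstract's distinction between ordered dipeptides.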
Böer, Erik; Bode, Rüdiger; Mock, Hans-Peter; Piontek, Michael; Kunze, Gotthard
2009-06-01
The tannase-encoding Arxula adeninivorans gene ATAN1 was isolated from genomic DNA by PCR, using as primers oligonucleotide sequences derived from peptides obtained after tryptic digestion of the purified tannase protein. The gene harbours an ORF of 1764 bp, encoding a 587-amino acid protein, preceded by an N-terminal secretion sequence comprising 28 residues. The deduced amino acid sequence was similar to those of tannases from Aspergillus oryzae (50% identity), A. niger (48%) and putative tannases from A. fumigatus (52%) and A. nidulans (50%). The sequence contains the consensus pentapeptide motif (-Gly-X-Ser-X-Gly-) which forms part of the catalytic centre of serine hydrolases. Expression of ATAN1 is regulated by the carbon source. Supplementation with tannic acid or gallic acid leads to induction of ATAN1, and accumulation of the native tannase enzyme in the medium. The enzymes recovered from both wild-type and recombinant strains were essentially indistinguishable. A molecular mass of approximately 320 kDa was determined, indicating that the native, glycosylated tannase consists of four identical subunits. The enzyme has a temperature optimum at 35-40 degrees C and a pH optimum at approximately 6.0. The enzyme is able to remove gallic acid from both condensed and hydrolysable tannins. The wild-type strain LS3 secreted amounts of tannase equivalent to 100 U/l under inducing conditions, while the transformant strain, which overexpresses the ATAN1 gene from the strong, constitutively active A. adeninivorans TEF1 promoter, produced levels of up to 400 U/l when grown in glucose medium in shake flasks. Copyright (c) 2009 John Wiley & Sons, Ltd.
Vettore, André L.; da Silva, Felipe R.; Kemper, Edson L.; Souza, Glaucia M.; da Silva, Aline M.; Ferro, Maria Inês T.; Henrique-Silva, Flavio; Giglioti, Éder A.; Lemos, Manoel V.F.; Coutinho, Luiz L.; Nobrega, Marina P.; Carrer, Helaine; França, Suzelei C.; Bacci, Maurício; Goldman, Maria Helena S.; Gomes, Suely L.; Nunes, Luiz R.; Camargo, Luis E.A.; Siqueira, Walter J.; Van Sluys, Marie-Anne; Thiemann, Otavio H.; Kuramae, Eiko E.; Santelli, Roberto V.; Marino, Celso L.; Targon, Maria L.P.N.; Ferro, Jesus A.; Silveira, Henrique C.S.; Marini, Danyelle C.; Lemos, Eliana G.M.; Monteiro-Vitorello, Claudia B.; Tambor, José H.M.; Carraro, Dirce M.; Roberto, Patrícia G.; Martins, Vanderlei G.; Goldman, Gustavo H.; de Oliveira, Regina C.; Truffi, Daniela; Colombo, Carlos A.; Rossi, Magdalena; de Araujo, Paula G.; Sculaccio, Susana A.; Angella, Aline; Lima, Marleide M.A.; de Rosa, Vicente E.; Siviero, Fábio; Coscrato, Virginia E.; Machado, Marcos A.; Grivet, Laurent; Di Mauro, Sonia M.Z.; Nobrega, Francisco G.; Menck, Carlos F.M.; Braga, Marilia D.V.; Telles, Guilherme P.; Cara, Frank A.A.; Pedrosa, Guilherme; Meidanis, João; Arruda, Paulo
2003-01-01
To contribute to our understanding of the genome complexity of sugarcane, we undertook a large-scale expressed sequence tag (EST) program. More than 260,000 cDNA clones were partially sequenced from 26 standard cDNA libraries generated from different sugarcane tissues. After the processing of the sequences, 237,954 high-quality ESTs were identified. These ESTs were assembled into 43,141 putative transcripts. Of the assembled sequences, 35.6% presented no matches with existing sequences in public databases. A global analysis of the whole SUCEST data set indicated that 14,409 assembled sequences (33% of the total) contained at least one cDNA clone with a full-length insert. Annotation of the 43,141 assembled sequences associated almost 50% of the putative identified sugarcane genes with protein metabolism, cellular communication/signal transduction, bioenergetics, and stress responses. Inspection of the translated assembled sequences for conserved protein domains revealed 40,821 amino acid sequences with 1415 Pfam domains. Reassembling the consensus sequences of the 43,141 transcripts revealed a 22% redundancy in the first assembling. This indicated that possibly 33,620 unique genes had been identified and indicated that >90% of the sugarcane expressed genes were tagged. PMID:14613979
López-Bueno, Alberto; Segovia, José C; Bueren, Juan A; O'Sullivan, M Gerard; Wang, Feng; Tattersall, Peter; Almendral, José M
2008-02-01
Very little is known about the role that evolutionary dynamics plays in diseases caused by mammalian DNA viruses. To address this issue in a natural host model, we compared the pathogenesis and genetics of the attenuated fibrotropic and the virulent lymphohematotropic strains of the parvovirus minute virus of mice (MVM), and of two invasive fibrotropic MVM (MVMp) variants carrying the I362S or K368R change in the VP2 major capsid protein, in the infection of severe combined immunodeficient (SCID) mice. By 14 to 18 weeks after oronasal inoculation, the I362S and K368R viruses caused lethal leukopenia characterized by tissue damage and inclusion bodies in hemopoietic organs, a pattern of disease found by 7 weeks postinfection with the lymphohematotropic MVM (MVMi) strain. The MVMp populations emerging in leukopenic mice showed consensus sequence changes in the MVMi genotype at residues G321E and A551V of VP2 in the I362S virus infections or A551V and V575A changes in the K368R virus infections, as well as a high level of genetic heterogeneity within a capsid domain at the twofold depression where these residues lay. Amino acids forming this capsid domain are important MVM tropism determinants, as exemplified by the switch in MVMi host range toward mouse fibroblasts conferred by coordinated changes of some of these residues and by the essential character of glutamate at residue 321 for maintaining MVMi tropism toward primary hemopoietic precursors. The few viruses within the spectrum of mutants from mice that maintained the respective parental 321G and 575V residues were infectious in a plaque assay, whereas the viruses with the main consensus sequences exhibited low levels of fitness in culture. 
Consistent with this finding, a recombinant MVMp virus carrying the consensus sequence mutations arising in the K368R virus background in mice failed to initiate infection in cell lines of different tissue origins, even though it caused rapid-course lethal leukopenia in SCID mice. The parental consensus genotype prevailed during leukopenia development, but plaque-forming viruses with the reversion of the 575A residue to valine emerged in affected organs. The disease caused by the DNA virus in mice, therefore, involves the generation of heterogeneous viral populations that may cooperatively interact for the hemopoietic syndrome. The evolutionary changes delineate a sector of the surface of the capsid that determines tropism and that surrounds the sialic acid receptor binding domain.
Characterization of sams genes of Amoeba proteus and the endosymbiotic X-bacteria.
Jeon, Taeck J; Jeon, Kwang W
2003-01-01
As a result of harboring obligatory bacterial endosymbionts, the xD strain of Amoeba proteus no longer produces its own S-adenosylmethionine synthetase (SAMS). When symbiont-free D amoebae are infected with symbionts (X-bacteria), the amount of amoeba SAMS decreases to a negligible level within four weeks, but about 47% of the SAMS activity, which apparently comes from another source, is still detected. Complete nucleotide sequences of sams genes of D and xD amoebae are presented and show that there are no differences between the two. Long-established xD amoebae contain an intact sams gene and thus the loss of xD amoeba's SAMS is not due to the loss of the gene itself. The open reading frame of the amoeba's sams gene has 1,281 nucleotides, encoding SAMS of 426 amino acids with a mass of 48 kDa and pI of 6.5. The amino acid sequence of amoeba SAMS is longer than the SAMS of other organisms by having an extra internal stretch of 28 amino acids. The 5'-flanking region of amoeba sams contains consensus-binding sites for several transcription factors that are related to the regulation of sams genes in E. coli and yeast. The complete nucleotide sequence of the symbiont's sams gene is also presented. The open reading frame of X-bacteria sams is 1,146 nucleotides long, encoding SAMS of 381 amino acids with a mass of 41 kDa and pI of 6.0. The X-bacteria SAMS has 45% sequence identity with that of A. proteus.
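The ORF and protein lengths quoted above are mutually consistent if each ORF length includes the stop codon, which is an assumption rather than something the abstract states. A minimal sketch of the arithmetic, with a hypothetical function name:

```python
def orf_protein_length(orf_nt, includes_stop=True):
    """Number of amino acids encoded by an ORF of orf_nt nucleotides."""
    if orf_nt % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    codons = orf_nt // 3
    # The stop codon occupies one codon but encodes no residue.
    return codons - 1 if includes_stop else codons

amoeba = orf_protein_length(1281)    # amoeba sams ORF
symbiont = orf_protein_length(1146)  # X-bacteria sams ORF
print(amoeba, symbiont)
```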
Matsui, Daisuke; Nakano, Shogo; Dadashipour, Mohammad; Asano, Yasuhisa
2017-08-25
Insolubility of proteins expressed in the Escherichia coli expression system hinders the progress of both basic and applied research. Insoluble proteins contain residues that decrease their solubility (aggregation hotspots). Mutating these hotspots to optimal amino acids is expected to improve protein solubility. To date, however, the identification of these hotspots has proven difficult. In this study, using a combination of approaches involving directed evolution and primary sequence analysis, we found two rules to help inductively identify hotspots: the α-helix rule, which focuses on the hydrophobicity of amino acids in the α-helix structure, and the hydropathy contradiction rule, which focuses on the difference in hydrophobicity relative to the corresponding amino acid in the consensus protein. By properly applying these two rules, we succeeded in improving the probability that expressed proteins would be soluble. Our methods should facilitate research on various insoluble proteins that were previously difficult to study due to their low solubility.
Fujibuchi, Wataru; Anderson, John S. J.; Landsman, David
2001-01-01
Consensus pattern and matrix-based searches designed to predict cis-acting transcriptional regulatory sequences have historically been subject to large numbers of false positives. We sought to decrease false positives by incorporating expression profile data into a consensus pattern-based search method. We have systematically analyzed the expression phenotypes of over 6000 yeast genes, across 121 expression profile experiments, and correlated them with the distribution of 14 known regulatory elements over sequences upstream of the genes. Our method is based on a metric we term probabilistic element assessment (PEA), which is a ranking of potential sites based on sequence similarity in the upstream regions of genes with similar expression phenotypes. For eight of the 14 known elements that we examined, our method had a much higher selectivity than a naïve consensus pattern search. Based on our analysis, we have developed a web-based tool called PROSPECT, which allows consensus pattern-based searching of gene clusters obtained from microarray data. PMID:11574681
Isolation of laccase gene-specific sequences from white rot and brown rot fungi by PCR
DOE Office of Scientific and Technical Information (OSTI.GOV)
D'Souza, T.M.; Boominathan, K.; Reddy, C.A.
1996-10-01
Degenerate primers corresponding to the consensus sequences of the copper-binding regions in the N-terminal domains of known basidiomycete laccases were used to isolate laccase gene-specific sequences from strains representing nine genera of wood rot fungi. All except three gave the expected PCR product of about 200 bp. Computer searches of the databases identified the sequence of each of the PCR products analyzed as a laccase gene sequence, suggesting the specificity of the primers. PCR products of the white rot fungi Ganoderma lucidum, Phlebia brevispora, and Trametes versicolor showed 65 to 74% nucleotide sequence similarity to each other; the similarity in deduced amino acid sequences was 83 to 91%. The PCR products of Lentinula edodes and Lentinus tigrinus, on the other hand, showed relatively low nucleotide and amino acid similarities (58 to 64 and 62 to 81%, respectively); however, these similarities were still much higher than when compared with the corresponding regions in the laccases of the ascomycete fungi Aspergillus nidulans and Neurospora crassa. A few of the white rot fungi, as well as Gloeophyllum trabeum, a brown rot fungus, gave a 144-bp PCR fragment which had a nucleotide sequence similarity of 60 to 71%. Demonstration of laccase activity in G. trabeum and several other brown rot fungi was of particular interest because these organisms were not previously shown to produce laccases. 36 refs., 6 figs., 2 tabs.
Anbar, Michael; Gul, Ozgur; Lamed, Raphael; Sezerman, Ugur O.
2012-01-01
The use of thermostable cellulases is advantageous for the breakdown of lignocellulosic biomass toward the commercial production of biofuels. Previously, we have demonstrated the engineering of an enhanced thermostable family 8 cellulosomal endoglucanase (EC 3.2.1.4), Cel8A, from Clostridium thermocellum, using random error-prone PCR and a combination of three beneficial mutations, dominated by an intriguing serine-to-glycine substitution (M. Anbar, R. Lamed, E. A. Bayer, ChemCatChem 2:997–1003, 2010). In the present study, we used a bioinformatics-based approach involving sequence alignment of homologous family 8 glycoside hydrolases to create a library of consensus mutations in which residues of the catalytic module are replaced at specific positions with the most prevalent amino acids in the family. One of the mutants (G283P) displayed a higher thermal stability than the wild-type enzyme. Introducing this mutation into the previously engineered Cel8A triple mutant resulted in an optimized enzyme, increasing the half-life of activity by 14-fold at 85°C. Remarkably, no loss of catalytic activity was observed compared to that of the wild-type endoglucanase. The structural changes were simulated by molecular dynamics analysis, and specific regions were identified that contributed to the observed thermostability. Intriguingly, most of the proteins used for sequence alignment in determining the consensus residues were derived from mesophilic bacteria, with optimal temperatures well below that of C. thermocellum Cel8A. PMID:22389377
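The consensus-mutation idea described above (replace a residue with the most prevalent amino acid at that aligned position) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function name, the toy sequences, and the assumption that all sequences are already aligned to equal length with "-" for gaps are hypothetical.

```python
from collections import Counter

def consensus_mutations(target, homologs):
    """List (position, target_residue, consensus_residue) wherever the target
    differs from the most common residue in the aligned column.
    Positions are 1-based; gap characters ('-') are ignored."""
    mutations = []
    for i, residue in enumerate(target):
        column = [seq[i] for seq in homologs if seq[i] != "-"]
        if residue == "-" or not column:
            continue
        most_common, _ = Counter(column).most_common(1)[0]
        if most_common != residue:
            mutations.append((i + 1, residue, most_common))
    return mutations

# Toy alignment (hypothetical sequences, not Cel8A data).
target = "MKGAL"
homologs = ["MKPAL", "MRPAL", "MKPSL", "MKGAL"]
print(consensus_mutations(target, homologs))  # [(3, 'G', 'P')]
```

Each tuple is a candidate substitution in the style of G283P; whether a given candidate actually improves stability still has to be tested experimentally.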
Nanda, Kumiko; Taniguchi, Mariko; Ujike, Satoshi; Ishihara, Nobuhiro; Mori, Hirotaka; Ono, Hisayo; Murooka, Yoshikatsu
2001-01-01
Bacterial strains were isolated from samples of Japanese rice vinegar (komesu) and unpolished rice vinegar (kurosu) fermented by the traditional static method. Fermentations have never been inoculated with a pure culture since they were started in 1907. A total of 178 isolates were divided into groups A and B on the basis of enterobacterial repetitive intergenic consensus-PCR and random amplified polymorphic DNA fingerprinting analyses. The 16S ribosomal DNA sequences of strains belonging to each group showed similarities of more than 99% with Acetobacter pasteurianus. Group A strains overwhelmingly dominated all stages of fermentation of both types of vinegar. Our results indicate that appropriate strains of acetic acid bacteria have spontaneously established almost pure cultures during nearly a century of komesu and kurosu fermentation. PMID:11157275
Berstein, R M; Schluter, S F; Shen, S; Marchalonis, J J
1996-04-16
All immunoglobulins and T-cell receptors throughout phylogeny share regions of highly conserved amino acid sequence. To identify possible primitive immunoglobulins and immunoglobulin-like molecules, we utilized 3' RACE (rapid amplification of cDNA ends) and a highly conserved constant region consensus amino acid sequence to isolate a new immunoglobulin class from the sandbar shark Carcharhinus plumbeus. The immunoglobulin, termed IgW, in its secreted form consists of 782 amino acids and is expressed in both the thymus and the spleen. The molecule overall most closely resembles mu chains of the skate and human and a new putative antigen binding molecule isolated from the nurse shark (NAR). The full-length IgW chain has a variable region resembling human and shark heavy-chain (VH) sequences and a novel joining segment containing the WGXGT motif characteristic of H chains. However, unlike any other H-chain-type molecule, it contains six constant (C) domains. The first C domain contains the cysteine residue characteristic of C mu1 that would allow dimerization with a light (L) chain. The fourth and sixth domains also contain comparable cysteines that would enable dimerization with other H chains or homodimerization. Comparison of the sequences of IgW V and C domains shows homology greater than that found in comparisons among VH and C mu or VL, or CL thereby suggesting that IgW may retain features of the primordial immunoglobulin in evolution.
Is There Scientific Consensus on Acid Rain? -- Excerpts from Six Governmental Reports.
ERIC Educational Resources Information Center
Environmental Education Report and Newsletter, 1986
1986-01-01
Compiles a series of direct quotations from six governmental reports that reflect a scientific consensus on major aspects of acid deposition. Presents the statements in a question and answer format. Also reviews the sources, extent, and effects of acid rain. (ML)
Arif, Rabia; Akram, Faiza; Jamil, Tazeen; Lee, Siu Fai
2017-01-01
Posttranslational modifications (PTMs) occur in all essential proteins taking command of their functions. There are many domains inside proteins where modifications take place on side-chains of amino acids through various enzymes to generate different species of proteins. In this manuscript we have, for the first time, predicted posttranslational modifications of frequency clock and mating type a-1 proteins in Sordaria fimicola collected from different sites to see the effect of environment on proteins or various amino acids pickings and their ultimate impact on consensus sequences present in mating type proteins using bioinformatics tools. Furthermore, we have also measured and walked through genomic DNA of various Sordaria strains to determine genetic diversity by genotyping the short sequence repeats (SSRs) of wild strains of S. fimicola collected from contrasting environments of two opposing slopes (harsh and xeric south facing slope and mild north facing slope) of Evolution Canyon (EC), Israel. Based on the whole genome sequence of S. macrospora, we targeted 20 genomic regions in S. fimicola which contain short sequence repeats (SSRs). Our data revealed genetic variations in strains from south facing slope and these findings assist in the hypothesis that genetic variations caused by stressful environments lead to evolution. PMID:28717646
Arif, Rabia; Akram, Faiza; Jamil, Tazeen; Mukhtar, Hamid; Lee, Siu Fai; Saleem, Muhammad
2017-01-01
Posttranslational modifications (PTMs) occur in all essential proteins taking command of their functions. There are many domains inside proteins where modifications take place on side-chains of amino acids through various enzymes to generate different species of proteins. In this manuscript we have, for the first time, predicted posttranslational modifications of frequency clock and mating type a-1 proteins in Sordaria fimicola collected from different sites to see the effect of environment on proteins or various amino acids pickings and their ultimate impact on consensus sequences present in mating type proteins using bioinformatics tools. Furthermore, we have also measured and walked through genomic DNA of various Sordaria strains to determine genetic diversity by genotyping the short sequence repeats (SSRs) of wild strains of S. fimicola collected from contrasting environments of two opposing slopes (harsh and xeric south facing slope and mild north facing slope) of Evolution Canyon (EC), Israel. Based on the whole genome sequence of S. macrospora, we targeted 20 genomic regions in S. fimicola which contain short sequence repeats (SSRs). Our data revealed genetic variations in strains from south facing slope and these findings assist in the hypothesis that genetic variations caused by stressful environments lead to evolution.
Chutiwitoonchai, Nopporn; Kakisaka, Michinori; Yamada, Kazunori; Aida, Yoko
2014-01-01
The assembly of influenza virus progeny virions requires machinery that exports viral genomic ribonucleoproteins from the cell nucleus. Currently, seven nuclear export signal (NES) consensus sequences have been identified in different viral proteins, including NS1, NS2, M1, and NP. The present study examined the roles of viral NES consensus sequences and their significance in terms of viral replication and nuclear export. Mutation of the NP-NES3 consensus sequence resulted in a failure to rescue viruses using a reverse genetics approach, whereas mutation of the NS2-NES1 and NS2-NES2 sequences led to a strong reduction in viral replication kinetics compared with the wild-type sequence. While the viral replication kinetics for other NES mutant viruses were also lower than those of the wild-type, the difference was not so marked. Immunofluorescence analysis after transient expression of NP-NES3, NS2-NES1, or NS2-NES2 proteins in host cells showed that they accumulated in the cell nucleus. These results suggest that the NP-NES3 consensus sequence is mostly required for viral replication. Therefore, each of the hydrophobic (Φ) residues within this NES consensus sequence (Φ1, Φ2, Φ3, or Φ4) was mutated, and its viral replication and nuclear export function were analyzed. No viruses harboring NP-NES3 Φ2 or Φ3 mutants could be rescued. Consistent with this, the NP-NES3 Φ2 and Φ3 mutants showed reduced binding affinity with CRM1 in a pull-down assay, and both accumulated in the cell nucleus. Indeed, a nuclear export assay revealed that these mutant proteins showed lower nuclear export activity than the wild-type protein. Moreover, the Φ2 and Φ3 residues (along with other Φ residues) within the NP-NES3 consensus were highly conserved among different influenza A viruses, including human, avian, and swine. Taken together, these results suggest that the Φ2 and Φ3 residues within the NP-NES3 protein are important for its nuclear export function during viral replication.
How to Fabricate Functional Artificial Luciferases for Bioassays.
Kim, Sung-Bae; Fujii, Rika
2016-01-01
The present protocol introduces the fabrication of artificial luciferases (ALuc(®)) by extracting the consensus amino acids from an alignment of copepod luciferase sequences. The resulting ALucs have unique sequence identities that are phylogenetically distinct from those of any existing copepod luciferase. Some ALucs exhibited heat stability and strong, greatly prolonged optical intensities. The ALucs are applicable to various bioassays as an optical readout, including live cell imaging, single-chain probes, and bioluminescent tags of antibodies. The present protocol describes how to fabricate a unique artificial luciferase with designed optical properties and functionalities.
Coronado, Liani; Liniger, Matthias; Muñoz-González, Sara; Postel, Alexander; Pérez, Lester Josue; Pérez-Simó, Marta; Perera, Carmen Laura; Frías-Lepoureau, Maria Teresa; Rosell, Rosa; Grundhoff, Adam; Indenbirken, Daniela; Alawi, Malik; Fischer, Nicole; Becher, Paul; Ruggli, Nicolas; Ganges, Llilianne
2017-03-01
In this study, we compared the virulence in weaner pigs of the Pinar del Rio isolate and the virulent Margarita strain. The latter caused the Cuban classical swine fever (CSF) outbreak of 1993. Our results showed that the Pinar del Rio virus isolated during an endemic phase is clearly of low virulence. We analysed the complete nucleotide sequence of the Pinar del Rio virus isolated after persistence in newborn piglets, as well as the genome sequence of the inoculum. The consensus genome sequence of the Pinar del Rio virus remained completely unchanged after 28 days of persistent infection in swine. More importantly, a unique poly-uridine tract was discovered in the 3'UTR of the Pinar del Rio virus, which was not found in the Margarita virus or any other known CSFV sequences. Based on RNA secondary structure prediction, the poly-uridine tract results in a long single-stranded intervening sequence (SS) between the stem-loops I and II of the 3'UTR, without major changes in the stem-loop structures when compared to the Margarita virus. The possible implications of this novel insertion on persistence and attenuation remain to be investigated. In addition, comparison of the amino acid sequence of the viral proteins Erns, E1, E2 and p7 of the Margarita and Pinar del Rio viruses showed that all non-conservative amino acid substitutions acquired by the Pinar del Rio isolate clustered in E2, with two of them being located within the B/C domain. Immunisation and cross-neutralisation experiments in pigs and rabbits suggest differences between these two viruses, which may be attributable to the amino acid differences observed in E2. Altogether, these data provide fresh insights into viral molecular features which might be associated with the attenuation and adaptation of CSFV for persistence in the field. Copyright © 2017 Elsevier B.V. All rights reserved.
An artificial intelligence approach fit for tRNA gene studies in the era of big sequence data.
Iwasaki, Yuki; Abe, Takashi; Wada, Kennosuke; Wada, Yoshiko; Ikemura, Toshimichi
2017-09-12
Unsupervised data mining capable of extracting a wide range of knowledge from big data without prior knowledge or particular models is a timely application in the era of big sequence data accumulation in genome research. By handling oligonucleotide compositions as high-dimensional data, we have previously modified the conventional self-organizing map (SOM) for genome informatics and established BLSOM, which can analyze more than ten million sequences simultaneously. Here, we develop BLSOM specialized for tRNA genes (tDNAs) that can cluster (self-organize) more than one million microbial tDNAs according to their cognate amino acid solely depending on tetra- and pentanucleotide compositions. This unsupervised clustering can reveal combinatorial oligonucleotide motifs that are responsible for the amino acid-dependent clustering, as well as other functionally and structurally important consensus motifs, which have been evolutionarily conserved. BLSOM is also useful for identifying tDNAs as phylogenetic markers for special phylotypes. When we constructed BLSOM with 'species-unknown' tDNAs from metagenomic sequences plus 'species-known' microbial tDNAs, a large portion of metagenomic tDNAs self-organized with species-known tDNAs, yielding information on microbial communities in environmental samples. BLSOM can also enhance accuracy in the tDNA database obtained from big sequence data. This unsupervised data mining should become important for studying numerous functionally unclear RNAs obtained from a wide range of organisms.
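The oligonucleotide compositions that serve as high-dimensional input in the abstract above can be illustrated with a short sketch. It only builds a tetranucleotide frequency vector for one sequence; it does not implement BLSOM, and the function name and normalisation choice are assumptions made for illustration.

```python
from collections import Counter
from itertools import product

def kmer_composition(seq, k=4):
    """Return the relative frequency of every k-mer over ACGT, in
    lexicographic order (256 values for k=4). Windows containing other
    symbols (e.g. N) are ignored."""
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts[km] for km in kmers)
    return [counts[km] / total if total else 0.0 for km in kmers]

vector = kmer_composition("ACGTACGT")
print(len(vector))  # 256
```

A vector like this, one per sequence, is the kind of feature that a self-organizing map or any other clustering method could take as input.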
Palzkill, T G; Oliver, S G; Newlon, C S
1986-01-01
Four fragments of Saccharomyces cerevisiae chromosome III DNA which carry ARS elements have been sequenced. Each fragment contains multiple copies of sequences that match a previously reported 11 bp core consensus sequence at 10 or more of its 11 positions. A survey of these new ARS sequences and previously reported sequences revealed the presence of an additional 11 bp conserved element located on the 3' side of the T-rich strand of the core consensus. Subcloning analysis as well as deletion and transposon insertion mutagenesis of ARS fragments support a role for the 3' conserved sequence in promoting ARS activity. PMID:3529036
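A search for near matches to an 11 bp consensus (at least 10 of 11 positions) can be sketched as below. The abstract does not spell out the core consensus, so the pattern WTTTAYRTTTW used here is an assumption (the commonly cited form of the ARS core consensus), and the test sequence is invented. Only the given strand is scanned.

```python
# IUPAC codes needed for the assumed consensus.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "W": "AT", "Y": "CT", "R": "AG"}

def near_matches(seq, consensus, min_match):
    """Return (start, window, score) for every window of len(consensus)
    that agrees with the consensus at >= min_match positions (0-based start)."""
    k = len(consensus)
    hits = []
    for start in range(len(seq) - k + 1):
        window = seq[start:start + k]
        score = sum(base in IUPAC[sym] for base, sym in zip(window, consensus))
        if score >= min_match:
            hits.append((start, window, score))
    return hits

CORE = "WTTTAYRTTTW"  # assumed core consensus, not taken from the abstract
seq = "GGCATTTATGTTTAGCCATTTACGTCTAGG"  # invented example
print(near_matches(seq, CORE, 10))
# [(3, 'ATTTATGTTTA', 11), (17, 'ATTTACGTCTA', 10)]
```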
Mexican consensus on lysosomal acid lipase deficiency diagnosis.
Vázquez-Frias, R; García-Ortiz, J E; Valencia-Mayoral, P F; Castro-Narro, G E; Medina-Bravo, P G; Santillán-Hernández, Y; Flores-Calderón, J; Mehta, R; Arellano-Valdés, C A; Carbajal-Rodríguez, L; Navarrete-Martínez, J I; Urbán-Reyes, M L; Valadez-Reyes, M T; Zárate-Mondragón, F; Consuelo-Sánchez, A
Lysosomal acid lipase deficiency (LAL-D) causes progressive cholesteryl ester and triglyceride accumulation in the lysosomes of hepatocytes and monocyte-macrophage system cells, resulting in a systemic disease with various manifestations that may go unnoticed. It is indispensable to recognize the deficiency, which can present in patients at any age, so that specific treatment can be given. The aim of the present review was to offer a guide for physicians in understanding the fundamental diagnostic aspects of LAL-D, to successfully aid in its identification. The review was designed by a group of Mexican experts and is presented as an orienting algorithm for the pediatrician, internist, gastroenterologist, endocrinologist, geneticist, pathologist, radiologist, and other specialists that could come across this disease in their patients. An up-to-date review of the literature in relation to the clinical manifestations of LAL-D and its diagnosis was performed. The statements were formulated based on said review and were then voted upon. The structured quantitative method employed for reaching consensus was the nominal group technique. A practical algorithm of the diagnostic process in LAL-D patients was proposed, based on clinical and laboratory data indicative of the disease and in accordance with the consensus established for each recommendation. The algorithm provides a sequence of clinical actions from different studies for optimizing the diagnostic process of patients suspected of having LAL-D. Copyright © 2017 Asociación Mexicana de Gastroenterología. Publicado por Masson Doyma México S.A. All rights reserved.
Amexis, Georgios; Rubin, Steven; Chatterjee, Nando; Carbone, Kathryn; Chumakov, Kostantin
2003-06-01
A single clinical isolate of mumps virus designated 88-1961 was obtained from a patient hospitalized with a clinical history of upper respiratory tract infection, parotitis, severe headache, fever and lymphadenopathy. We have sequenced the full-length genome of 88-1961 and compared it against all available full-length sequences of mumps virus. Based upon the nucleotide sequence of its SH gene, 88-1961 was identified as a genotype H mumps strain. The overall extent of nucleotide and amino acid differences between each individual gene and protein of 88-1961 and the full-length mumps samples showed that the missense to silent ratios were unevenly distributed. Upon evaluation of the consensus sequence of 88-1961, four positions were found to be clearly heterogeneous at the nucleotide level (NP 315C/T, NP 318C/T, F 271A/C, and HN 855C/T). Sequence analysis revealed that the amino acid sequences for the NP, M, and the L protein were the most conserved, whereas the SH protein exhibited the highest variability among the compared mumps genotypes A, B, and G. No identifying molecular patterns in the non-coding (intergenic) or coding regions of 88-1961 were found when we compared it against relatively virulent (Urabe AM9 B, Glouc1/UK96, 87-1004 and 87-1005) and non-virulent mumps strains (Jeryl Lynn and all Urabe AM9 A substrains). Copyright 2003 Wiley-Liss, Inc.
Liu, Yanbin; Koh, Chong Mei John; Ngoh, Si Te; Ji, Lianghui
2015-10-26
Rhodosporidium and Rhodotorula are two genera of oleaginous red yeast with great potential for industrial biotechnology. To date, there is no effective method for inducible expression of proteins and RNAs in these hosts. We have developed a luciferase gene reporter assay based on a new codon-optimized LUC2 reporter gene (RtLUC2), which is flanked with CAR2 homology arms and can be integrated into the CAR2 locus in the nuclear genome at >90% efficiency. We characterized the upstream DNA sequence of a D-amino acid oxidase gene (DAO1) from R. toruloides ATCC 10657 by nested deletions. By comparing the upstream DNA sequences of several putative DAO1 homologs of Basidiomycetous fungi, we identified a conserved DNA motif with a consensus sequence of AGGXXGXAGX11GAXGAXGG within a 0.2 kb region from the mRNA translation initiation site. Deletion of this motif led to strong mRNA transcription under non-inducing conditions. Interestingly, DAO1 promoter activity was enhanced about fivefold when the 108 bp intron 1 was included in the reporter construct. We identified a conserved CT-rich motif in the intron with a consensus sequence of TYTCCCYCTCCYCCCCACWYCCGA, deletion or point mutations of which drastically reduced promoter strength under both inducing and non-inducing conditions. Additionally, we created a selection marker-free DAO1-null mutant (∆dao1e) which displayed greatly improved inducible gene expression, particularly when both glucose and nitrogen were present in high levels. To avoid adding unwanted peptide to proteins to be expressed, we converted the original translation initiation codon to ATC and re-created a translation initiation codon at the start of exon 2. This promoter, named PDAO1-in1m1, showed very similar luciferase activity to the wild-type promoter upon induction with D-alanine. The inducible system was tunable by adjusting the levels of inducers, carbon source and nitrogen source.
The intron 1-containing DAO1 promoters coupled with a DAO1 null mutant makes an efficient and tight D-amino acid-inducible gene expression system in Rhodosporidium and Rhodotorula genera. The system will be a valuable tool for metabolic engineering and enzyme expression in these yeast hosts.
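The two consensus motifs quoted in this abstract can be turned into regular expressions for scanning other promoter sequences. The sketch below assumes that X stands for any nucleotide, that "X11" means a run of 11 unspecified nucleotides, and that Y and W carry their usual IUPAC meanings (C/T and A/T); the helper name and the test sequences are hypothetical.

```python
import re

# Symbol table; the meaning of X and of a trailing count is an assumption.
SYMBOLS = {"A": "A", "C": "C", "G": "G", "T": "T",
           "Y": "[CT]", "W": "[AT]", "X": "[ACGT]"}

def motif_to_regex(motif):
    """Translate a motif such as 'AGGXXGXAGX11GAXGAXGG' into a compiled regex.
    A number after a symbol is read as a repeat count for that symbol."""
    parts = []
    for symbol, count in re.findall(r"([A-Z])(\d*)", motif):
        parts.append(SYMBOLS[symbol] + ("{" + count + "}" if count else ""))
    return re.compile("".join(parts))

upstream = motif_to_regex("AGGXXGXAGX11GAXGAXGG")
intron = motif_to_regex("TYTCCCYCTCCYCCCCACWYCCGA")

print(intron.pattern)  # T[CT]TCCC[CT]CTCC[CT]CCCCAC[AT][CT]CCGA
print(bool(upstream.fullmatch("AGGCTGAAGACGTACGTACGGATGACGG")))  # True
```

Calling `intron.search(sequence)` on a longer sequence returns the first occurrence, if any; scanning the reverse strand would need the reverse complement as well.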
Shanehbandi, Dariush; Majidi, Jafar; Kazemi, Tohid; Baradaran, Behzad; Aghebati-Maleki, Leili
2017-01-01
CD20-based targeting of B-cells in hematologic malignancies and autoimmune disorders is associated with outstanding clinical outcomes. Isolation and characterization of VH and VL cDNAs encoding the variable regions of the heavy and light chains of monoclonal antibodies (MAb) is necessary to produce next generation MAbs and their derivatives such as bispecific antibodies (bsAb) and single-chain variable fragments (scFv). This study was aimed at cloning and characterization of the VH and VL cDNAs from a hybridoma cell line producing an anti-CD20 MAb. VH and VL fragments were amplified, cloned and characterized. Furthermore, amino acid sequences of VH, VL and corresponding complementarity-determining regions (CDR) were determined and compared with those of four approved MAbs including Rituximab (RTX), Ibritumomab tiuxetan, Ofatumumab and GA101. The cloned VH and VL cDNAs were found to be functional and follow a consensus pattern. Amino acid sequences corresponding to the VH and VL fragments also indicated noticeable homologies to those of RTX and Ibritumomab. Furthermore, amino acid sequences of the relating CDRs had remarkable similarities to their counterparts in RTX and Ibritumomab. Successful recovery of VH and VL fragments encourages the development of novel CD20 targeting bsAbs, scFvs, antibody conjugates and T-cells armed with chimeric antigen receptors.
enoLOGOS: a versatile web tool for energy normalized sequence logos
Workman, Christopher T.; Yin, Yutong; Corcoran, David L.; Ideker, Trey; Stormo, Gary D.; Benos, Panayiotis V.
2005-01-01
enoLOGOS is a web-based tool that generates sequence logos from various input sources. Sequence logos have become a popular way to graphically represent DNA and amino acid sequence patterns from a set of aligned sequences. Each position of the alignment is represented by a column of stacked symbols with its total height reflecting the information content in this position. Currently, the available web servers are able to create logo images from a set of aligned sequences, but none of them generates weighted sequence logos directly from energy measurements or other sources. With the advent of high-throughput technologies for estimating the contact energy of different DNA sequences, tools that can create logos directly from binding affinity data are useful to researchers. enoLOGOS generates sequence logos from a variety of input data, including energy measurements, probability matrices, alignment matrices, count matrices and aligned sequences. Furthermore, enoLOGOS can represent the mutual information of different positions of the consensus sequence, a unique feature of this tool. Another web interface for our software, C2H2-enoLOGOS, generates logos for the DNA-binding preferences of the C2H2 zinc-finger transcription factor family members. enoLOGOS and C2H2-enoLOGOS are accessible over the web at . PMID:15980495
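The statement that each logo column has a total height equal to its information content can be made concrete for the simplest case, a set of aligned DNA sequences. The sketch below computes log2(4) minus the Shannon entropy of each column. It omits the small-sample correction and does not cover the energy-based or weighted inputs that enoLOGOS accepts; the function name and example sites are hypothetical.

```python
import math
from collections import Counter

def information_content(aligned, alphabet_size=4):
    """Return the information content in bits for each alignment column:
    log2(alphabet_size) minus the column's Shannon entropy."""
    heights = []
    for column in zip(*aligned):
        n = len(column)
        entropy = -sum((c / n) * math.log2(c / n)
                       for c in Counter(column).values())
        heights.append(math.log2(alphabet_size) - entropy)
    return heights

sites = ["AC", "AG", "AT", "AA"]  # invented aligned sites
print(information_content(sites))  # [2.0, 0.0]
```

A fully conserved column reaches the 2-bit maximum for DNA, while a column with all four bases equally frequent carries no information and would have zero height in the logo.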
Isolation and characterization of target sequences of the chicken CdxA homeobox gene.
Margalit, Y; Yarus, S; Shapira, E; Gruenbaum, Y; Fainsod, A
1993-01-01
The DNA binding specificity of the chicken homeodomain protein CDXA was studied. Using a CDXA-glutathione-S-transferase fusion protein, DNA fragments containing the binding site for this protein were isolated. The sources of DNA were oligonucleotides with random sequence and chicken genomic DNA. The DNA fragments isolated were sequenced and tested in DNA binding assays. Sequencing revealed that most DNA fragments are AT rich, which is a common feature of homeodomain binding sites. By electrophoretic mobility shift assays it was shown that the different target sequences isolated bind to the CDXA protein with different affinities. The specific sequences bound by the CDXA protein in the genomic fragments isolated were determined by DNase I footprinting. From the footprinted sequences, the CDXA consensus binding site was determined. The CDXA protein binds the consensus sequence A, A/T, T, A/T, A, T, A/G. The CAUDAL binding site in the ftz promoter is also included in this consensus sequence. When tested, some of the genomic target sequences were capable of enhancing the transcriptional activity of reporter plasmids when introduced into CDXA expressing cells. This study determined the DNA sequence specificity of the CDXA protein and it also shows that this protein can further activate transcription in cells in culture. PMID:7909943
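The consensus A, A/T, T, A/T, A, T, A/G reported above maps directly onto a regular expression. The sketch below scans one strand of an invented sequence; the function name is hypothetical, and a lookahead is used so that overlapping sites would also be reported.

```python
import re

# A, A/T, T, A/T, A, T, A/G written as a character-class pattern.
CDXA_SITE = re.compile(r"(?=(A[AT]T[AT]AT[AG]))")

def find_cdxa_sites(seq):
    """Return (0-based start, matched site) for every consensus match
    on the given strand, including overlapping matches."""
    return [(m.start(), m.group(1)) for m in CDXA_SITE.finditer(seq)]

print(find_cdxa_sites("GGCATTTATGGCATTAATACC"))
# [(3, 'ATTTATG'), (12, 'ATTAATA')]
```

A match only says that a sequence fits the consensus; as the abstract notes, the sites isolated bound CDXA with different affinities.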
Liew, Steven; Sundaram, Hema; De Boulle, Koenraad L.; Goodman, Greg J.; Monheit, Gary; Wu, Yan; Trindade de Almeida, Ada R.; Swift, Arthur; Vieira Braz, André
2016-01-01
Background: Although the safety profile of hyaluronic acid fillers is favorable, adverse reactions can occur. Clinicians and patients can benefit from ongoing guidance on adverse reactions to hyaluronic acid fillers and their management. Methods: A multinational, multidisciplinary group of experts in cosmetic medicine convened the Global Aesthetics Consensus Group to review the properties and clinical uses of Hylacross and Vycross hyaluronic acid products and develop updated consensus recommendations for early and late complications associated with hyaluronic acid fillers. Results: The consensus panel provided specific recommendations focusing on early and late complications of hyaluronic acid fillers and their management. The impact of patient-, product-, and technique-related factors on such reactions was described. Most of these were noted to be mild and transient. Serious adverse events are rare. Early adverse reactions to hyaluronic acid fillers include vascular infarction and compromise; inflammatory reactions; injection-related events; and inappropriate placement of filler material. Among late reactions are nodules, granulomas, and skin discoloration. Most adverse events can be avoided with proper planning and technique. Detailed understanding of facial anatomy, proper patient and product selection, and appropriate technique can further reduce the risks. Should adverse reactions occur, the clinician must be prepared and have tools available for effective treatment. Conclusions: Adverse reactions with hyaluronic acid fillers are uncommon. Clinicians should take steps to further reduce the risk and be prepared to treat any complications that arise. PMID:27219265
Unusual glycosylation of proteins: Beyond the universal sequon and other amino acids.
Dutta, Devawati; Mandal, Chhabinath; Mandal, Chitra
2017-12-01
Glycosylation of proteins is the most common, multifaceted co- and post-translational modification responsible for many biological processes and cellular functions. Significant alterations and aberrations of these processes are related to various pathological conditions, and often turn out to be disease biomarkers. Conventional N-glycosylation occurs through the recognition of the consensus sequon, asparagine (Asn)-X-serine (Ser)/threonine (Thr), where X is any amino acid except for proline, with N-acetylglucosamine (GlcNAc) as the first glycosidic linkage. Usually, O-glycosylation adds a glycan to the hydroxyl group of Ser or Thr beginning with N-acetylgalactosamine (GalNAc). Protein glycosylation is further governed by additional diversifications in sequon and structure, which are yet to be fully explored. This review mainly focuses on the occurrence of N-glycosylation in non-consensus motifs, where Ser/Thr at the +2 position is substituted by other amino acids. Additionally, N-glycosylation is also observed in other amide/amine group-containing amino acids. Similarly, O-glycosylation occurs at hydroxyl group-containing amino acids other than serine/threonine. The neighbouring amino acids and local structural features around the potential glycosylation site also play a significant role in determining the extent of glycosylation. All of these phenomena that yield glycosylation at the atypical sites are reported in a variety of biological systems, including different pathological conditions. Therefore, the discovery of more novel sequence patterns for N- and O-glycosylation may help in understanding the functions of complex biological processes and cellular functions. Taken together, the information provided in this review should be helpful to readers in the biological sciences. Copyright © 2017 Elsevier B.V. All rights reserved.
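The conventional sequon defined above, Asn-X-Ser/Thr with X being any residue except proline, is straightforward to search for in a protein sequence. The following is a minimal sketch with a hypothetical function name and an invented peptide; it finds candidate sites only and says nothing about the atypical motifs the review discusses.

```python
import re

# N, then any residue except P, then S or T; the lookahead keeps overlaps.
SEQUON = re.compile(r"(?=(N[^P][ST]))")

def find_sequons(protein):
    """Return (1-based position of Asn, tripeptide) for each canonical
    N-glycosylation sequon in a one-letter protein sequence."""
    return [(m.start() + 1, m.group(1)) for m in SEQUON.finditer(protein)]

print(find_sequons("MNGSANPTQNNTS"))
# [(2, 'NGS'), (10, 'NNT'), (11, 'NTS')]
```

The Asn at position 6 is skipped because it is followed by proline. Presence of a sequon marks a potential site only; as the abstract points out, neighbouring residues and local structure influence whether it is actually glycosylated.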
Yu, J S; Chen, W J; Ni, M H; Chan, W H; Yang, S D
1998-08-15
Autophosphorylation-dependent protein kinase (auto-kinase) was identified from pig brain and liver on the basis of its unique autophosphorylation/activation property [Yang, Fong, Yu and Liu (1987) J. Biol. Chem. 262, 7034-7040; Yang, Chang and Soderling (1987) J. Biol. Chem. 262, 9421-9427]. Its substrate consensus sequence motif was determined as being -R-X-(X)-S*/T*-X3-S/T-. To characterize auto-kinase further, we partly sequenced the kinase purified from pig liver. The N-terminal sequence (VDGGAKTSDKQKKKAXMTDE) and two internal peptide sequences (EKLRTIV and LQNPEK/ILTP/FI) of auto-kinase were obtained. These sequences identify auto-kinase as a C-terminal catalytic fragment of p21-activated protein kinase 2 (PAK2 or gamma-PAK) lacking its N-terminal regulatory region. Auto-kinase can be recognized by an antibody raised against the C-terminal peptide of human PAK2 by immunoblotting. Furthermore the autophosphorylation site sequence of auto-kinase was successfully predicted on the basis of its substrate consensus sequence motif and the known PAK2 sequence, and was further demonstrated to be RST(P)MVGTPYWMAPEVVTR by phosphoamino acid analysis, manual Edman degradation and phosphopeptide mapping via the help of phosphorylation site analysis of a synthetic peptide corresponding to the sequence of PAK2 from residues 396 to 418. During the activation process, auto-kinase autophosphorylates mainly on a single threonine residue Thr402 (according to the sequence numbering of human PAK2). In addition, a phospho-specific antibody against a synthetic phosphopeptide containing this identified sequence was generated and shown to be able to differentially recognize the activated auto-kinase autophosphorylated at Thr402 but not the non-phosphorylated/inactive auto-kinase. 
Immunoblot analysis with this phospho-specific antibody further revealed that the change in phosphorylation level of Thr402 of auto-kinase was well correlated with the activity change of the kinase during both autophosphorylation/activation and protein phosphatase-mediated dephosphorylation/inactivation processes. Taken together, our results identify Thr402 as the regulatory autophosphorylation site of auto-kinase, which is a C-terminal catalytic fragment of PAK2.
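The substrate consensus motif quoted in the abstract above, -R-X-(X)-S*/T*-X3-S/T-, can be checked against the reported autophosphorylation site peptide with a small pattern search. This is a minimal sketch under the assumption that "(X)" means an optional second spacer residue and that S*/T* marks the phosphoacceptor; the function name `find_phosphoacceptors` is hypothetical and is not the authors' prediction procedure.

```python
import re


def find_phosphoacceptors(seq):
    """Return 0-based positions of Ser/Thr residues that fit R-X-(X)-S*/T*-X3-S/T."""
    hits = set()
    # Try both spacer lengths (one or two arbitrary residues after Arg) so that
    # a site matching either form of the motif is reported.
    for gap in (1, 2):
        pattern = re.compile(r"(?=R.{%d}([ST]).{3}[ST])" % gap)
        hits.update(m.start(1) for m in pattern.finditer(seq.upper()))
    return sorted(hits)


# Peptide reported in the abstract, written without the phosphate annotation.
peptide = "RSTMVGTPYWMAPEVVTR"
print(find_phosphoacceptors(peptide))
```

On this peptide the only hit is the threonine that follows R-S, which agrees with the RST(P) annotation given in the abstract.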
Yu, J S; Chen, W J; Ni, M H; Chan, W H; Yang, S D
1998-01-01
Autophosphorylation-dependent protein kinase (auto-kinase) was identified from pig brain and liver on the basis of its unique autophosphorylation/activation property [Yang, Fong, Yu and Liu (1987) J. Biol. Chem. 262, 7034-7040; Yang, Chang and Soderling (1987) J. Biol. Chem. 262, 9421-9427]. Its substrate consensus sequence motif was determined as being -R-X-(X)-S*/T*-X3-S/T-. To characterize auto-kinase further, we partly sequenced the kinase purified from pig liver. The N-terminal sequence (VDGGAKTSDKQKKKAXMTDE) and two internal peptide sequences (EKLRTIV and LQNPEK/ILTP/FI) of auto-kinase were obtained. These sequences identify auto-kinase as a C-terminal catalytic fragment of p21-activated protein kinase 2 (PAK2 or gamma-PAK) lacking its N-terminal regulatory region. Auto-kinase can be recognized by an antibody raised against the C-terminal peptide of human PAK2 by immunoblotting. Furthermore the autophosphorylation site sequence of auto-kinase was successfully predicted on the basis of its substrate consensus sequence motif and the known PAK2 sequence, and was further demonstrated to be RST(P)MVGTPYWMAPEVVTR by phosphoamino acid analysis, manual Edman degradation and phosphopeptide mapping via the help of phosphorylation site analysis of a synthetic peptide corresponding to the sequence of PAK2 from residues 396 to 418. During the activation process, auto-kinase autophosphorylates mainly on a single threonine residue Thr402 (according to the sequence numbering of human PAK2). In addition, a phospho-specific antibody against a synthetic phosphopeptide containing this identified sequence was generated and shown to be able to differentially recognize the activated auto-kinase autophosphorylated at Thr402 but not the non-phosphorylated/inactive auto-kinase. 
Immunoblot analysis with this phospho-specific antibody further revealed that the change in phosphorylation level of Thr402 of auto-kinase was well correlated with the activity change of the kinase during both autophosphorylation/activation and protein phosphatase-mediated dephosphorylation/inactivation processes. Taken together, our results identify Thr402 as the regulatory autophosphorylation site of auto-kinase, which is a C-terminal catalytic fragment of PAK2. PMID:9693111
Shivange, Amol V; Hoeffken, Hans Wolfgang; Haefner, Stefan; Schwaneberg, Ulrich
2016-12-01
Protein consensus-based surface engineering (ProCoS) is a simple and efficient method for directed protein evolution combining computational analysis and molecular biology tools to engineer protein surfaces. ProCoS is based on the hypothesis that conserved residues originated from a common ancestor and that these residues are crucial for the function of a protein, whereas highly variable regions (situated on the surface of a protein) can be targeted for surface engineering to maximize performance. ProCoS comprises four main steps: ( i ) identification of conserved and highly variable regions; ( ii ) protein sequence design by substituting residues in the highly variable regions, and gene synthesis; ( iii ) in vitro DNA recombination of synthetic genes; and ( iv ) screening for active variants. ProCoS is a simple method for surface mutagenesis in which multiple sequence alignment is used for selection of surface residues based on a structural model. To demonstrate the technique's utility for directed evolution, the surface of a phytase enzyme from Yersinia mollaretii (Ymphytase) was subjected to ProCoS. Screening just 1050 clones from ProCoS engineering-guided mutant libraries yielded an enzyme with 34 amino acid substitutions. The surface-engineered Ymphytase exhibited 3.8-fold higher pH stability (at pH 2.8 for 3 h) and retained 40% of the enzyme's specific activity (400 U/mg) compared with the wild-type Ymphytase. The pH stability might be attributed to a significantly increased (20 percentage points; from 9% to 29%) number of negatively charged amino acids on the surface of the engineered phytase.
Kim, Seong K; Shakya, Akhalesh K; O'Callaghan, Dennis J
2016-01-04
The immediate-early protein (IEP) of equine herpesvirus 1 (EHV-1) has extensive homology to the IEP of alphaherpesviruses and possesses domains essential for trans-activation, including an acidic trans-activation domain (TAD) and binding domains for DNA, TFIIB, and TBP. Our data showed that the IEP directly interacted with transcription factor TFIIA, which is known to stabilize the binding of TBP and TFIID to the TATA box of core promoters. When the TATA box of the EICP0 promoter was mutated to a nonfunctional TATA box, IEP-mediated trans-activation was reduced from 22-fold to 7-fold. The IEP trans-activated the viral promoters in a TATA motif-dependent manner. Our previous data showed that the IEP is able to repress its own promoter when the IEP-binding sequence (IEBS) is located within 26-bp from the TATA box. When the IEBS was located at 100 bp upstream of the TATA box, IEP-mediated trans-activation was very similar to that of the minimal IE(nt -89 to +73) promoter lacking the IEBS. As the distance from the IEBS to the TATA box decreased, IEP-mediated trans-activation progressively decreased, indicating that the IEBS located within 100 bp from the TATA box sequence functions as a distance-dependent repressive element. These results indicated that IEP-mediated full trans-activation requires a consensus TATA box of core promoters, but not its binding to the cognate sequence (IEBS). Copyright © 2015 Elsevier B.V. All rights reserved.
Kim, Seong K.; Shakya, Akhalesh K.; O'Callaghan, Dennis J.
2015-01-01
The immediate-early protein (IEP) of equine herpesvirus 1 (EHV-1) has extensive homology to the IEP of alphaherpesviruses and possesses domains essential for trans-activation, including an acidic trans-activation domain (TAD) and binding domains for DNA, TFIIB, and TBP. Our data showed that the IEP directly interacted with transcription factor TFIIA, which is known to stabilize the binding of TBP and TFIID to the TATA box of core promoters. When the TATA box of the EICP0 promoter was mutated to a nonfunctional TATA box, IEP-mediated trans-activation was reduced from 22-fold to 7-fold. The IEP trans-activated the viral promoters in a TATA motif-dependent manner. Our previous data showed that the IEP is able to repress its own promoter when the IEP-binding sequence (IEBS) is located within 26-bp from the TATA box. When the IEBS was located at 100 bp upstream of the TATA box, IEP-mediated trans-activation was very similar to that of the minimal IE(nt −89 to +73) promoter lacking the IEBS. As the distance from the IEBS to the TATA box decreased, IEP-mediated trans-activation progressively decreased, indicating that the IEBS located within 100 bp from the TATA box sequence functions as a distance-dependent repressive element. These results indicated that IEP-mediated full trans-activation requires a consensus TATA box of core promoters, but not its binding to the cognate sequence (IEBS). PMID:26541315
Automated sequence analysis and editing software for HIV drug resistance testing.
Struck, Daniel; Wallis, Carole L; Denisov, Gennady; Lambert, Christine; Servais, Jean-Yves; Viana, Raquel V; Letsoalo, Esrom; Bronze, Michelle; Aitken, Sue C; Schuurman, Rob; Stevens, Wendy; Schmit, Jean Claude; Rinke de Wit, Tobias; Perez Bercoff, Danielle
2012-05-01
Access to antiretroviral treatment in resource-limited settings is inevitably paralleled by the emergence of HIV drug resistance. Monitoring treatment efficacy and HIV drug resistance testing are therefore of increasing importance in resource-limited settings. Yet low-cost technologies and procedures suited to the particular context and constraints of such settings are still lacking. The ART-A (Affordable Resistance Testing for Africa) consortium brought together public and private partners to address this issue. The objective was to develop automated sequence analysis and editing software to support high-throughput automated sequencing. The ART-A Software was designed to automatically process and edit ABI chromatograms or FASTA files from HIV-1 isolates. The ART-A Software performs the basecalling, assigns quality values, aligns query sequences against a set reference, infers a consensus sequence, identifies the HIV type and subtype, translates the nucleotide sequence to amino acids and reports insertions/deletions, premature stop codons, ambiguities and mixed calls. The results can be automatically exported to Excel to identify mutations. Automated analysis was compared to manual analysis using a panel of 1624 PR-RT sequences generated in 3 different laboratories. Discrepancies between manual and automated sequence analysis were 0.69% at the nucleotide level and 0.57% at the amino acid level (668,047 AA analyzed), and discordances at major resistance mutations were recorded in 62 cases (4.83% of differences, 0.04% of all AA) for PR and 171 (6.18% of differences, 0.03% of all AA) cases for RT. The ART-A Software is a time-sparing tool for pre-analyzing HIV and viral quasispecies sequences in high throughput laboratories and highlighting positions requiring attention. Copyright © 2012 Elsevier B.V. All rights reserved.
History of retinoic acid receptors.
Benbrook, Doris M; Chambon, Pierre; Rochette-Egly, Cécile; Asson-Batres, Mary Ann
2014-01-01
The discovery of retinoic acid receptors arose from research into how vitamins are essential for life. Early studies indicated that Vitamin A was metabolized into an active factor, retinoic acid (RA), which regulates RNA and protein expression in cells. Each step forward in our understanding of retinoic acid in human health was accomplished by the development and application of new technologies. Development of cDNA cloning techniques and the discovery of nuclear receptors for steroid hormones provided the basis for identification of two classes of retinoic acid receptors, RARs and RXRs, each of which has three isoforms, α, β and γ. DNA manipulation and crystallographic studies revealed that the receptors contain discrete functional domains responsible for binding to DNA, ligands and cofactors. Ligand binding was shown to induce conformational changes in the receptors that cause release of corepressors and recruitment of coactivators to create functional complexes that are bound to consensus promoter DNA sequences called retinoic acid response elements (RAREs) and that cause opening of chromatin and transcription of adjacent genes. Homologous recombination technology allowed the development of mice lacking expression of retinoic acid receptors, individually or in various combinations, which demonstrated that the receptors exhibit vital, but redundant, functions in fetal development and in vision, reproduction, and other functions required for maintenance of adult life. More recent advancements in sequencing and proteomic technologies reveal the complexity of retinoic acid receptor involvement in cellular function through regulation of gene expression and kinase activity. Future directions will require systems biology approaches to decipher how these integrated networks affect human stem cells, health, and disease.
Knierim, Dennis; Maiss, Edgar; Kenyon, Lawrence; Winter, Stephan; Menzel, Wulf
2015-10-01
Luffa aphid-borne yellows virus (LABYV) was proposed as the name for a previously undescribed polerovirus based on partial genome sequences obtained from samples of cucurbit plants collected in Thailand between 2008 and 2013. In this study, we determined the first full-length genome sequence of LABYV. Based on phylogenetic analysis and genome properties, it is clear that this virus represents a distinct species in the genus Polerovirus. Analysis of sequences from sample TH24, which was collected in 2010 from a luffa plant in Thailand, reveals the presence of two different full-length genome consensus sequences.
Zeng, Lu; Kortschak, R Daniel; Raison, Joy M; Bertozzi, Terry; Adelson, David L
2018-01-01
Transposable Elements (TEs) are mobile DNA sequences that make up significant fractions of amniote genomes. However, they are difficult to detect and annotate ab initio because of their variable features, lengths and clade-specific variants. We have addressed this problem by refining and developing a Comprehensive ab initio Repeat Pipeline (CARP) to identify and cluster TEs and other repetitive sequences in genome assemblies. The pipeline begins with a pairwise alignment using krishna, a custom aligner. Single linkage clustering is then carried out to produce families of repetitive elements. Consensus sequences are then filtered for protein coding genes and then annotated using Repbase and a custom library of retrovirus and reverse transcriptase sequences. This process yields three types of family: fully annotated, partially annotated and unannotated. Fully annotated families reflect recently diverged/young known TEs present in Repbase. The remaining two types of families contain a mixture of novel TEs and segmental duplications. These can be resolved by aligning these consensus sequences back to the genome to assess copy number vs. length distribution. Our pipeline has three significant advantages compared to other methods for ab initio repeat identification: 1) we generate not only consensus sequences, but keep the genomic intervals for the original aligned sequences, allowing straightforward analysis of evolutionary dynamics, 2) consensus sequences represent low-divergence, recently/currently active TE families, 3) segmental duplications are annotated as a useful by-product. We have compared our ab initio repeat annotations for 7 genome assemblies to other methods and demonstrate that CARP compares favourably with RepeatModeler, the most widely used repeat annotation package.
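The single linkage clustering step mentioned in the abstract above groups sequences into a family whenever they are connected by any chain of pairwise alignments. The following is a minimal sketch of that idea using a union-find structure; it is not CARP's implementation, and the function name `single_linkage` and the interval labels are hypothetical.

```python
def single_linkage(pairs):
    """Group items into families: two items share a family if linked by any chain of pairs."""
    parent = {}

    def find(x):
        # Register unseen items as their own root, then walk up to the root,
        # shortening the path along the way (path halving).
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # merge the two families

    families = {}
    for item in list(parent):
        families.setdefault(find(item), set()).add(item)
    return sorted(sorted(members) for members in families.values())


# Hypothetical pairwise alignment hits between genomic intervals.
hits = [("chr1:100-900", "chr2:50-850"), ("chr2:50-850", "chr5:10-800"),
        ("chr3:1-400", "chr7:300-700")]
print(single_linkage(hits))
```

Single linkage merges families through any one shared hit, so in practice a similarity or length threshold on the input pairs would be needed to avoid chaining unrelated elements; the threshold choice is outside what the abstract specifies.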
Zeng, Lu; Kortschak, R. Daniel; Raison, Joy M.
2018-01-01
Transposable Elements (TEs) are mobile DNA sequences that make up significant fractions of amniote genomes. However, they are difficult to detect and annotate ab initio because of their variable features, lengths and clade-specific variants. We have addressed this problem by refining and developing a Comprehensive ab initio Repeat Pipeline (CARP) to identify and cluster TEs and other repetitive sequences in genome assemblies. The pipeline begins with a pairwise alignment using krishna, a custom aligner. Single linkage clustering is then carried out to produce families of repetitive elements. Consensus sequences are then filtered for protein coding genes and then annotated using Repbase and a custom library of retrovirus and reverse transcriptase sequences. This process yields three types of family: fully annotated, partially annotated and unannotated. Fully annotated families reflect recently diverged/young known TEs present in Repbase. The remaining two types of families contain a mixture of novel TEs and segmental duplications. These can be resolved by aligning these consensus sequences back to the genome to assess copy number vs. length distribution. Our pipeline has three significant advantages compared to other methods for ab initio repeat identification: 1) we generate not only consensus sequences, but keep the genomic intervals for the original aligned sequences, allowing straightforward analysis of evolutionary dynamics, 2) consensus sequences represent low-divergence, recently/currently active TE families, 3) segmental duplications are annotated as a useful by-product. We have compared our ab initio repeat annotations for 7 genome assemblies to other methods and demonstrate that CARP compares favourably with RepeatModeler, the most widely used repeat annotation package. PMID:29538441
Singh, B N; Mudgil, Yashwanti; Sopory, S K; Reddy, M K
2003-07-01
We have successfully expressed enzymatically active plant topoisomerase II in Escherichia coli for the first time, which has enabled its biochemical characterization. Using a PCR-based strategy, we obtained a full-length cDNA and the corresponding genomic clone of tobacco topoisomerase II. The genomic clone has 18 exons interrupted by 17 introns. Most of the 5' and 3' splice junctions follow the typical canonical consensus dinucleotide sequence GU-AG present in other plant introns. The position of introns and phasing with respect to primary amino acid sequence in tobacco TopII and Arabidopsis TopII are highly conserved, suggesting that the two genes are evolved from the common ancestral type II topoisomerase gene. The cDNA encodes a polypeptide of 1482 amino acids. The primary amino acid sequence shows a striking sequence similarity, preserving all the structural domains that are conserved among eukaryotic type II topoisomerases in an identical spatial order. We have expressed the full-length polypeptide in E. coli and purified the recombinant protein to homogeneity. The full-length polypeptide relaxed supercoiled DNA and decatenated the catenated DNA in a Mg(2+)- and ATP-dependent manner, and this activity was inhibited by 4'-(9-acridinylamino)-3'-methoxymethanesulfonanilide (m-AMSA). The immunofluorescence and confocal microscopic studies, with antibodies developed against the N-terminal region of tobacco recombinant topoisomerase II, established the nuclear localization of topoisomerase II in tobacco BY2 cells. The regulated expression of tobacco topoisomerase II gene under the GAL1 promoter functionally complemented a temperature-sensitive TopII(ts) yeast mutant.
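The GU-AG rule mentioned in the abstract above is easy to test once intron boundaries are known: the intron begins with GU (GT in DNA) and ends with AG. A minimal sketch, assuming intron sequences have already been extracted by comparing the genomic clone with the cDNA; the function name `is_canonical_intron` and the example sequences are hypothetical.

```python
def is_canonical_intron(intron):
    """Return True if the intron starts with GT/GU and ends with AG."""
    # Normalise case and treat RNA (U) and DNA (T) notation alike.
    seq = intron.upper().replace("U", "T")
    return len(seq) >= 4 and seq.startswith("GT") and seq.endswith("AG")


# Hypothetical intron sequences, for illustration only.
introns = ["GTAAGTCCCTTTAG", "GCAAGTTTAG", "guaaguuuag"]
print([is_canonical_intron(i) for i in introns])
```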
Genetic map of artichoke × wild cardoon: toward a consensus map for Cynara cardunculus.
Sonnante, Gabriella; Gatto, Angela; Morgese, Anita; Montemurro, Francesco; Sarli, Giulio; Blanco, Emanuela; Pignone, Domenico
2011-11-01
An integrated consensus linkage map is proposed for globe artichoke. Maternal and paternal genetic maps were constructed on the basis of an F(1) progeny derived from crossing an artichoke genotype (Mola) with its progenitor, the wild cardoon (Tolfa), using EST-derived SSRs, genomic SSRs, AFLPs, ten genes, and two morphological traits. For most genes, mainly belonging to the chlorogenic acid pathway, new markers were developed. Five of these were SNP markers analyzed through high-resolution melt technology. From the maternal (Mola) and paternal (Tolfa) maps, an integrated map was obtained, containing 337 molecular and one morphological markers ordered in 17 linkage groups (LGs), linked between Mola and Tolfa. The integrated map covers 1,488.8 cM, with an average distance of 4.4 cM between markers. The map was aligned with already existing maps for artichoke, and 12 LGs were linked via 31 bridge markers. LG numbering has been proposed. A total of 124 EST-SSRs and two genes were mapped here for the first time, providing a framework for the construction of a functional map in artichoke. The establishment of a consensus map represents a necessary condition to plan a complete sequencing of the globe artichoke genome.
Fine-tuning structural RNA alignments in the twilight zone.
Bremges, Andreas; Schirmer, Stefanie; Giegerich, Robert
2010-04-30
A widely used method to find conserved secondary structure in RNA is to first construct a multiple sequence alignment, and then fold the alignment, optimizing a score based on thermodynamics and covariance. This method works best around 75% sequence similarity. However, in a "twilight zone" below 55% similarity, the sequence alignment tends to obscure the covariance signal used in the second phase. Therefore, while the overall shape of the consensus structure may still be found, the degree of conservation cannot be estimated reliably. Based on a combination of available methods, we present a method named planACstar for improving structure conservation in structural alignments in the twilight zone. After constructing a consensus structure by alignment folding, planACstar abandons the original sequence alignment, refolds the sequences individually, but consistent with the consensus, aligns the structures, irrespective of sequence, by a pure structure alignment method, and derives an improved sequence alignment from the alignment of structures, to be re-submitted to alignment folding, and so on. This cycle may be iterated as long as structural conservation improves, but normally, one step suffices. Employing the tools ClustalW, RNAalifold, and RNAforester, we find that for sequences with 30-55% sequence identity, structural conservation can be improved by 10% on average, with a large variation, measured in terms of RNAalifold's own criterion, the structure conservation index.
Scammell, Jonathan G.; Funkhouser, Jane D.; Moyer, Felricia S.; Gibson, Susan V.; Willis, Donna L.
2008-01-01
The goal of this study was to characterize the gonadotropins expressed in pituitary glands of the New World squirrel monkey (Saimiri sp.) and owl monkey (Aotus sp.). The various subunits were amplified from total RNA from squirrel monkey and owl monkey pituitary glands by reverse transcription-polymerase chain reaction and the deduced amino acid sequences compared to those of other species. Mature squirrel monkey and owl monkey glycoprotein hormone α-polypeptides (96 amino acids in length) were determined to be 80% homologous to the human sequence. The sequences of mature β subunits of follicle stimulating hormone (FSHβ) from squirrel monkey and owl monkey (111 amino acids in length) are 92% homologous to human FSHβ. New World primate glycoprotein hormone α-polypeptides and FSHβ subunits showed conservation of all cysteine residues and consensus N-linked glycosylation sites. Attempts to amplify the β-subunit of luteinizing hormone from squirrel monkey and owl monkey pituitary glands were unsuccessful. Rather, the β-subunit of chorionic gonadotropin (CG) was amplified from pituitaries of both New World primates. Squirrel monkey and owl monkey CGβ are 143 and 144 amino acids in length and 77% homologous with human CGβ. The greatest divergence is in the C terminus, where all four sites for O-linked glycosylation in human CGβ, responsible for delayed metabolic clearance, are predicted to be absent in New World primate CGβs. It is likely that CG secreted from pituitary of New World primates exhibits a relatively short half-life compared to human CG. PMID:17897645
Scammell, Jonathan G; Funkhouser, Jane D; Moyer, Felricia S; Gibson, Susan V; Willis, Donna L
2008-02-01
The goal of this study was to characterize the gonadotropins expressed in pituitary glands of the New World squirrel monkey (Saimiri sp.) and owl monkey (Aotus sp.). The various subunits were amplified from total RNA from squirrel monkey and owl monkey pituitary glands by reverse transcription-polymerase chain reaction and the deduced amino acid sequences compared to those of other species. Mature squirrel monkey and owl monkey glycoprotein hormone alpha-polypeptides (96 amino acids in length) were determined to be 80% homologous to the human sequence. The sequences of mature beta subunits of follicle stimulating hormone (FSHbeta) from squirrel monkey and owl monkey (111 amino acids in length) are 92% homologous to human FSHbeta. New World primate glycoprotein hormone alpha-polypeptides and FSHbeta subunits showed conservation of all cysteine residues and consensus N-linked glycosylation sites. Attempts to amplify the beta-subunit of luteinizing hormone from squirrel monkey and owl monkey pituitary glands were unsuccessful. Rather, the beta-subunit of chorionic gonadotropin (CG) was amplified from pituitaries of both New World primates. Squirrel monkey and owl monkey CGbeta are 143 and 144 amino acids in length and 77% homologous with human CGbeta. The greatest divergence is in the C terminus, where all four sites for O-linked glycosylation in human CGbeta, responsible for delayed metabolic clearance, are predicted to be absent in New World primate CGbetas. It is likely that CG secreted from pituitary of New World primates exhibits a relatively short half-life compared to human CG.
González-Pedrajo, B; Ballado, T; Campos, A; Sockett, R E; Camarena, L; Dreyfus, G
1997-01-01
Motility in the photosynthetic bacterium Rhodobacter sphaeroides is achieved by the unidirectional rotation of a single subpolar flagellum. In this study, transposon mutagenesis was used to obtain nonmotile flagellar mutants from this bacterium. We report here the isolation and characterization of a mutant that shows a polyhook phenotype. Morphological characterization of the mutant was done by electron microscopy. Polyhooks were obtained by shearing and were used to purify the hook protein monomer (FlgE). The apparent molecular mass of the hook protein was 50 kDa. N-terminal amino acid sequencing and comparisons with the hook proteins of other flagellated bacteria indicated that the Rhodobacter hook protein has consensus sequences common to axial flagellar components. A 25-kb fragment from an R. sphaeroides WS8 cosmid library restored wild-type flagellation and motility to the mutant. Using DNA adjacent to the inserted transposon as a probe, we identified a 4.6-kb SalI restriction fragment that contained the gene responsible for the polyhook phenotype. Nucleotide sequence analysis of this region revealed an open reading frame with a deduced amino acid sequence that was 23.4% identical to that of FliK of Salmonella typhimurium, the polypeptide responsible for hook length control in that enteric bacterium. The relevance of a gene homologous to fliK in the uniflagellated bacterium R. sphaeroides is discussed. PMID:9352903
Sidell, Neil; Mathad, Raveendra I.; Shu, Feng-jue; Zhang, Zhenjiang; Kallen, Caleb B.; Yang, Danzhou
2011-01-01
DNA-intercalating molecules can impair DNA replication, DNA repair, and gene transcription. We previously demonstrated that XR5944, a DNA bis-intercalator, specifically blocks binding of estrogen receptor-α (ERα) to the consensus estrogen response element (ERE). The consensus ERE sequence is AGGTCAnnnTGACCT, where nnn is known as the tri-nucleotide spacer. Recent work has shown that the tri-nucleotide spacer can modulate ERα-ERE binding affinity and ligand-mediated transcriptional responses. To further understand the mechanism by which XR5944 inhibits ERα-ERE binding, we tested its ability to interact with consensus EREs with variable tri-nucleotide spacer sequences and with natural but non-consensus ERE sequences using one dimensional nuclear magnetic resonance (1D 1H NMR) titration studies. We found that the tri-nucleotide spacer sequence significantly modulates the binding of XR5944 to EREs. Of the sequences that were tested, EREs with CGG and AGG spacers showed the best binding specificity with XR5944, while those spaced with TTT demonstrated the least specific binding. The binding stoichiometry of XR5944 with EREs was 2:1, which can explain why the spacer influences the drug-DNA interaction; each XR5944 spans four nucleotides (including portions of the spacer) when intercalating with DNA. To validate our NMR results, we conducted functional studies using reporter constructs containing consensus EREs with tri-nucleotide spacers CGG, CTG, and TTT. Results of reporter assays in MCF-7 cells indicated that XR5944 was significantly more potent in inhibiting the activity of CGG- than TTT-spaced EREs, consistent with our NMR results. Taken together, these findings predict that the anti-estrogenic effects of XR5944 will depend not only on ERE half-site composition but also on the tri-nucleotide spacer sequence of EREs located in the promoters of estrogen-responsive genes. PMID:21333738
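The consensus ERE given in the abstract above, AGGTCAnnnTGACCT, can be located in a DNA sequence and its tri-nucleotide spacer extracted with a short pattern search. This is a minimal sketch for illustration; the function names `find_eres` and `reverse_complement` and the test sequence are hypothetical, and the sketch finds only perfect consensus half-sites, not the natural non-consensus EREs also studied.

```python
import re

# Consensus ERE: two inverted half-sites separated by a three-nucleotide spacer.
# The spacer is captured so it can be compared across sites.
ERE = re.compile(r"(?=AGGTCA([ACGT]{3})TGACCT)")

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of an unambiguous DNA string."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def find_eres(dna):
    """Return (0-based start, spacer) for every consensus ERE in the sequence."""
    return [(m.start(), m.group(1)) for m in ERE.finditer(dna.upper())]


# Hypothetical promoter fragment containing one CGG-spaced ERE.
fragment = "ttAGGTCAcggTGACCTaa"
print(find_eres(fragment))
print(reverse_complement("AGGTCA"))  # the second half-site is the inverted repeat of the first
```

Extracting the spacer this way makes it straightforward to group sites by spacer sequence (for example CGG versus TTT), which is the variable the abstract reports as modulating XR5944 binding.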
Lam, Kathy N; Charles, Trevor C
2015-01-01
Clone libraries provide researchers with a powerful resource to study nucleic acid from diverse sources. Metagenomic clone libraries in particular have aided in studies of microbial biodiversity and function, and allowed the mining of novel enzymes. Libraries are often constructed by cloning large inserts into cosmid or fosmid vectors. Recently, there have been reports of GC bias in fosmid metagenomic libraries, and it was speculated to be a result of fragmentation and loss of AT-rich sequences during cloning. However, evidence in the literature suggests that transcriptional activity or gene product toxicity may play a role. To explore possible mechanisms responsible for sequence bias in clone libraries, we constructed a cosmid library from a human microbiome sample and sequenced DNA from different steps during library construction: crude extract DNA, size-selected DNA, and cosmid library DNA. We confirmed a GC bias in the final cosmid library, and we provide evidence that the bias is not due to fragmentation and loss of AT-rich sequences but is likely occurring after DNA is introduced into Escherichia coli. To investigate the influence of strong constitutive transcription, we searched the sequence data for promoters and found that rpoD/σ(70) promoter sequences were underrepresented in the cosmid library. Furthermore, when we examined the genomes of taxa that were differentially abundant in the cosmid library relative to the original sample, we found the bias to be more correlated with the number of rpoD/σ(70) consensus sequences in the genome than with simple GC content. The GC bias of metagenomic libraries does not appear to be due to DNA fragmentation. Rather, analysis of promoter sequences provides support for the hypothesis that strong constitutive transcription from sequences recognized as rpoD/σ(70) consensus-like in E. coli may lead to instability, causing loss of the plasmid or loss of the insert DNA that gives rise to the transcription. 
Despite widespread use of E. coli to propagate foreign DNA in metagenomic libraries, the effects of in vivo transcriptional activity on clone stability are not well understood. Further work is required to tease apart the effects of transcription from those of gene product toxicity.
Zhu, J K; Bressan, R A; Hasegawa, P M
1993-09-15
We demonstrate that ANJ1, a higher plant homolog of the bacterial molecular chaperone DnaJ, is a substrate in vitro for protein farnesyl- and geranylgeranyl-transferase activities present in cell extracts of the plant Atriplex nummularia and yeast Saccharomyces cerevisiae. Isoprenylation did not occur when cysteine was replaced by serine in the CAQQ motif at the carboxyl terminus of ANJ1, indicating that this sequence functions as a CaaX consensus sequence for polyisoprenylation (where C is cysteine, a is an aliphatic residue, and X is any amino acid residue). Substitution of leucine for the terminal glutamine did not result in the expected geranylgeranylation as occurs with mammalian proteins containing a carboxyl-terminal leucine. Unlike the wild-type ANJ1, neither of the proteins containing these amino acid substitutions could functionally complement the yeast temperature-sensitive mutant mas5. Farnesylation enhanced the association of ANJ1 with A. nummularia microsomal membranes. Electrophoretic mobility of ANJ1 from the plant indicated that the protein is isoprenylated in vivo.
Zhu, J K; Bressan, R A; Hasegawa, P M
1993-01-01
We demonstrate that ANJ1, a higher plant homolog of the bacterial molecular chaperone DnaJ, is a substrate in vitro for protein farnesyl- and geranylgeranyl-transferase activities present in cell extracts of the plant Atriplex nummularia and yeast Saccharomyces cerevisiae. Isoprenylation did not occur when cysteine was replaced by serine in the CAQQ motif at the carboxyl terminus of ANJ1, indicating that this sequence functions as a CaaX consensus sequence for polyisoprenylation (where C is cysteine, a is an aliphatic residue, and X is any amino acid residue). Substitution of leucine for the terminal glutamine did not result in the expected geranylgeranylation as occurs with mammalian proteins containing a carboxyl-terminal leucine. Unlike the wild-type ANJ1, neither of the proteins containing these amino acid substitutions could functionally complement the yeast temperature-sensitive mutant mas5. Farnesylation enhanced the association of ANJ1 with A. nummularia microsomal membranes. Electrophoretic mobility of ANJ1 from the plant indicated that the protein is isoprenylated in vivo. PMID:8378331
Ghosh, Jayadri Sekhar; Bhattacharya, Samik; Pal, Amita
2017-06-01
The unavailability of the reproductive structure and unpredictability of vegetative characters for the identification and phylogenetic study of bamboo prompted the application of molecular techniques for greater resolution and consensus. We first employed internal transcribed spacer (ITS1, 5.8S rRNA and ITS2) sequences to construct the phylogenetic tree of 21 tropical bamboo species. While the sequence alone could grossly reconstruct the traditional phylogeny amongst the 21-tropical species studied, some anomalies were encountered that prompted a further refinement of the phylogenetic analyses. Therefore, we integrated the secondary structure of the ITS sequences to derive individual sequence-structure matrix to gain more resolution on the phylogenetic reconstruction. The results showed that ITS sequence-structure is the reliable alternative to the conventional phenotypic method for the identification of bamboo species. The best-fit topology obtained by the sequence-structure based phylogeny over the sole sequence based one underscores closer clustering of all the studied Bambusa species (Sub-tribe Bambusinae), while Melocanna baccifera, which belongs to Sub-Tribe Melocanneae, disjointedly clustered as an out-group within the consensus phylogenetic tree. In this study, we demonstrated the dependability of the combined (ITS sequence+structure-based) approach over the only sequence-based analysis for phylogenetic relationship assessment of bamboo.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Stenger, Drake C., E-mail: drake.stenger@ars.usda.
Population structure of Homalodisca coagulata Virus-1 (HoCV-1) among and within field-collected insects sampled from a single point in space and time was examined. Polymorphism in complete consensus sequences among single-insect isolates was dominated by synonymous substitutions. The mutant spectrum of the C2 helicase region within each single-insect isolate was unique and dominated by nonsynonymous singletons. Bootstrapping was used to correct the within-isolate nonsynonymous:synonymous arithmetic ratio (N:S) for RT-PCR error, yielding an N:S value ~one log-unit greater than that of consensus sequences. Probability of all possible single-base substitutions for the C2 region predicted N:S values within 95% confidence limits of the corrected within-isolate N:S when the only constraint imposed was viral polymerase error bias for transitions over transversions. These results indicate that bottlenecks coupled with strong negative/purifying selection drive consensus sequences toward neutral sequence space, and that most polymorphism within single-insect isolates is composed of newly-minted mutations sampled prior to selection. -- Highlights: •Sampling protocol minimized differential selection/history among isolates. •Polymorphism among consensus sequences dominated by negative/purifying selection. •Within-isolate N:S ratio corrected for RT-PCR error by bootstrapping. •Within-isolate mutant spectrum dominated by new mutations yet to undergo selection.
Zhu, X; Naz, R K
1999-03-01
The deduced ZP3 amino acid (aa) sequences of 13 vertebrate species namely mouse, hamster, rabbit, pig, porcine, cow, dog, cat, human, bonnet, marmoset, carp, and frog were compared using the PILEUP and PRETTY alignment programs (GCG, Wisconsin, USA). The published aa sequences obtained from 13 vertebrate species indicated overall evolutionary conservation in the N-terminus, central region, and C-terminus of the ZP3 polypeptide. More variations of ZP3 polypeptide sequences were seen in the alignments of carp and frog from the 11 mammalian species, making the leader sequence more prominent. The canonical furin proteolytic processing signal at the C-terminus was found in all the ZP3 polypeptide sequences except those of carp and frog. In the central region, the ZP3 deduced aa sequences of all the 13 vertebrate species aligned well, and six relatively conserved sequences were found. There are 11 conserved cysteine residues in the central region across all species including carp and frog, indicating that these residues have a longer evolutionary history. The ZP3 aa sequence similarities were examined using the GAP program (GCG). The highest aa similarities are observed between the members of the same order within the class mammalia, and also (95.4%) between pig (ungulata) and rabbit (lagomorpha). The deduced ZP3 aa sequences per se may not be enough to build a phylogenetic tree.
Transcriptomic analysis of rice aleurone cells identified a novel abscisic acid response element.
Watanabe, Kenneth A; Homayouni, Arielle; Gu, Lingkun; Huang, Kuan-Ying; Ho, Tuan-Hua David; Shen, Qingxi J
2017-09-01
Seeds serve as a great model to study plant responses to drought stress, which is largely mediated by abscisic acid (ABA). The ABA responsive element (ABRE) is a key cis-regulatory element in ABA signalling. However, its consensus sequence (ACGTG(G/T)C) is present in the promoters of only about 40% of ABA-induced genes in rice aleurone cells, suggesting other ABREs may exist. To identify novel ABREs, RNA sequencing was performed on aleurone cells of rice seeds treated with 20 μM ABA. Gibbs sampling was used to identify enriched elements, and particle bombardment-mediated transient expression studies were performed to verify the function. Gene ontology analysis was performed to predict the roles of genes containing the novel ABREs. This study revealed 2443 ABA-inducible genes and a novel ABRE, designated as ABREN, which was experimentally verified to mediate ABA signalling in rice aleurone cells. Many of the ABREN-containing genes are predicted to be involved in stress responses and transcription. Analysis of other species suggests that the ABREN may be monocot specific. This study also revealed interesting expression patterns of genes involved in ABA metabolism and signalling. Collectively, this study advanced our understanding of diverse cis-regulatory sequences and the transcriptomes underlying ABA responses in rice aleurone cells. © 2017 John Wiley & Sons Ltd.
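The ABRE consensus quoted in the abstract above, ACGTG(G/T)C, can be written directly as a regular expression. The following is only a minimal sketch of scanning a promoter sequence for that consensus on both strands; it is not the Gibbs-sampling analysis used in the study, and the function names and the example sequences are hypothetical illustrations.

```python
import re

# Consensus from the abstract: ACGTG(G/T)C, i.e. G or T at the sixth position.
ABRE_PATTERN = re.compile(r"ACGTG[GT]C")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_abre_sites(promoter: str) -> list[tuple[int, str]]:
    """Return (0-based forward-strand start, strand) for each consensus match.

    Hypothetical helper: matches on the reverse strand are reported at the
    forward-strand coordinate of their leftmost base.
    """
    promoter = promoter.upper()
    n = len(promoter)
    hits = [(m.start(), "+") for m in ABRE_PATTERN.finditer(promoter)]
    for m in ABRE_PATTERN.finditer(reverse_complement(promoter)):
        hits.append((n - m.end(), "-"))
    return sorted(hits)


if __name__ == "__main__":
    # Made-up sequence for illustration only.
    print(find_abre_sites("TTACGTGGCAA"))
```

A promoter with no hit under this exact-match scan would fall among the ABA-induced genes that lack the classical consensus, which is the gap the abstract describes.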
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lai, Xiaokuang; Davis, F.C.; Ingram, L.O.
1997-02-01
Genomic libraries from nine cellobiose-metabolizing bacteria were screened for cellobiose utilization. Positive clones were recovered from six libraries, all of which encode phosphoenolpyruvate:carbohydrate phosphotransferase system (PTS) proteins. Clones from Bacillus subtilis, Butyrivibrio fibrisolvens, and Klebsiella oxytoca allowed the growth of recombinant Escherichia coli in cellobiose-M9 minimal medium. The K. oxytoca clone, pLOI1906, exhibited an unusually broad substrate range (cellobiose, arbutin, salicin, and methylumbelliferyl derivatives of glucose, cellobiose, mannose, and xylose) and was sequenced. The insert in this plasmid encoded the carboxy-terminal region of a putative regulatory protein, cellobiose permease (single polypeptide), and phospho-β-glucosidase, which appear to form an operon (casRAB). Subclones allowed both casA and casB to be expressed independently, as evidenced by in vitro complementation. An analysis of the translated sequences from the EIIC domains of cellobiose, aryl-β-glucoside, and other disaccharide permeases allowed the identification of a 50-amino-acid conserved region. A disaccharide consensus sequence is proposed for the most conserved segment (13 amino acids), which may represent part of the EIIC active site for binding and phosphorylation. 63 refs., 4 figs., 4 tabs.
Production, purification, sequencing and activity spectra of mutacins D-123.1 and F-59.1
2011-01-01
Background The increase in bacterial resistance to antibiotics impels the development of new anti-bacterial substances. Mutacins (bacteriocins) are small antibacterial peptides produced by Streptococcus mutans showing activity against bacterial pathogens. The objective of the study was to produce and characterise additional mutacins in order to find new useful antibacterial substances. Results Mutacin F-59.1 was produced in liquid media by S. mutans 59.1 while production of mutacin D-123.1 by S. mutans 123.1 was obtained in semi-solid media. Mutacins were purified by hydrophobic chromatography. The amino acid sequences of the mutacins were obtained by Edman degradation and their molecular mass was determined by mass spectrometry. Mutacin F-59.1 consists of 25 amino acids, containing the YGNGV consensus sequence of pediocin-like bacteriocins with a molecular mass calculated at 2719 Da. Mutacin D-123.1 has an identical molecular mass (2364 Da) with the same first 9 amino acids as mutacin I. Mutacins D-123.1 and F-59.1 have wide activity spectra inhibiting human and food-borne pathogens. The lantibiotic mutacin D-123.1 possesses a broader activity spectrum than mutacin F-59.1 against the bacterial strains tested. Conclusion Mutacin F-59.1 is the first pediocin-like bacteriocin identified and characterised that is produced by Streptococcus mutans. Mutacin D-123.1 appears to be identical to mutacin I previously identified in different strains of S. mutans. PMID:21477375
Production, purification, sequencing and activity spectra of mutacins D-123.1 and F-59.1.
Nicolas, Guillaume G; LaPointe, Gisèle; Lavoie, Marc C
2011-04-10
The increase in bacterial resistance to antibiotics impels the development of new anti-bacterial substances. Mutacins (bacteriocins) are small antibacterial peptides produced by Streptococcus mutans showing activity against bacterial pathogens. The objective of the study was to produce and characterise additional mutacins in order to find new useful antibacterial substances. Mutacin F-59.1 was produced in liquid media by S. mutans 59.1 while production of mutacin D-123.1 by S. mutans 123.1 was obtained in semi-solid media. Mutacins were purified by hydrophobic chromatography. The amino acid sequences of the mutacins were obtained by Edman degradation and their molecular mass was determined by mass spectrometry. Mutacin F-59.1 consists of 25 amino acids, containing the YGNGV consensus sequence of pediocin-like bacteriocins with a molecular mass calculated at 2719 Da. Mutacin D-123.1 has an identical molecular mass (2364 Da) with the same first 9 amino acids as mutacin I. Mutacins D-123.1 and F-59.1 have wide activity spectra inhibiting human and food-borne pathogens. The lantibiotic mutacin D-123.1 possesses a broader activity spectrum than mutacin F-59.1 against the bacterial strains tested. Mutacin F-59.1 is the first pediocin-like bacteriocin identified and characterised that is produced by Streptococcus mutans. Mutacin D-123.1 appears to be identical to mutacin I previously identified in different strains of S. mutans.
Samal, Sweety; Kumar, Sachin; Khattar, Sunil K; Samal, Siba K
2011-10-01
A key determinant of Newcastle disease virus (NDV) virulence is the amino acid sequence at the fusion (F) protein cleavage site. The NDV F protein is synthesized as an inactive precursor, F(0), and is activated by proteolytic cleavage between amino acids 116 and 117 to produce two disulfide-linked subunits, F(1) and F(2). The consensus sequence of the F protein cleavage site of virulent [(112)(R/K)-R-Q-(R/K)-R↓F-I(118)] and avirulent [(112)(G/E)-(K/R)-Q-(G/E)-R↓L-I(118)] strains contains a conserved glutamine residue at position 114. Recently, some NDV strains from Africa and Madagascar were isolated from healthy birds and have been reported to contain five basic residues (R-R-R-K-R↓F-I/V or R-R-R-R-R↓F-I/V) at the F protein cleavage site. In this study, we have evaluated the role of this conserved glutamine residue in the replication and pathogenicity of NDV by using the moderately pathogenic Beaudette C strain and by making Q114R, K115R and I118V mutants of the F protein in this strain. Our results showed that changing the glutamine to a basic arginine residue reduced viral replication and attenuated the pathogenicity of the virus in chickens. The pathogenicity was further reduced when the isoleucine at position 118 was replaced by valine.
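The two cleavage-site consensus sequences given in the abstract above can be checked mechanically. The sketch below assumes the input is the seven-residue peptide spanning positions 112 to 118 in one-letter code; the function name and the category labels are hypothetical, and the check says nothing about actual pathogenicity, which the study measured experimentally.

```python
import re

# Residues 112-118, taken from the consensus sequences in the abstract.
VIRULENT_SITE = re.compile(r"[RK]RQ[RK]RFI")     # (R/K)-R-Q-(R/K)-R|F-I
AVIRULENT_SITE = re.compile(r"[GE][KR]Q[GE]RLI")  # (G/E)-(K/R)-Q-(G/E)-R|L-I


def classify_cleavage_site(site: str) -> str:
    """Classify a 7-residue F cleavage-site peptide (positions 112-118)."""
    site = site.upper()
    if VIRULENT_SITE.fullmatch(site):
        return "virulent consensus"
    if AVIRULENT_SITE.fullmatch(site):
        return "avirulent consensus"
    return "non-consensus"


if __name__ == "__main__":
    for peptide in ("RRQKRFI", "GRQGRLI", "RRRKRFI"):
        print(peptide, "->", classify_cleavage_site(peptide))
```

The third peptide, with five basic residues and no glutamine at position 114, matches neither pattern, which corresponds to the unusual African and Madagascan isolates mentioned in the abstract.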
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gacias, Mar; Perez-Marti, Albert; Pujol-Vidal, Magdalena
Highlights: •The Cact gene is induced in mouse skeletal muscle after 24 h of fasting. •The Cact gene contains a functional consensus sequence for ERR. •This sequence binds ERRα both in vivo and in vitro. •This ERRE is required for the activation of Cact expression by the PGC-1/ERR axis. •Our results add Cact as a genuine gene target of these transcriptional regulators. -- Abstract: Carnitine/acylcarnitine translocase (CACT) is a mitochondrial-membrane carrier protein that mediates the transport of acylcarnitines into the mitochondrial matrix for their oxidation by the mitochondrial fatty acid-oxidation pathway. CACT deficiency causes a variety of pathological conditions, such as hypoketotic hypoglycemia, cardiac arrest, hepatomegaly, hepatic dysfunction and muscle weakness, and it can be fatal in newborns and infants. Here we report that expression of the Cact gene is induced in mouse skeletal muscle after 24 h of fasting. To gain insight into the control of Cact gene expression, we examine the transcriptional regulation of the mouse Cact gene. We show that the 5′-flanking region of this gene is transcriptionally active and contains a consensus sequence for the estrogen-related receptor (ERR), a member of the nuclear receptor family of transcription factors. This sequence binds ERRα in vivo and in vitro and is required for the activation of Cact expression by the peroxisome proliferator-activated receptor gamma coactivator (PGC)-1/ERR axis. We also demonstrate that XTC790, the inverse agonist of ERRα, specifically blocks Cact activation by PGC-1β in C2C12 cells.
Litwin, Christine M.; Byrne, Burke L.
1998-01-01
Vibrio vulnificus is a halophilic, marine pathogen that has been associated with septicemia and serious wound infections in patients with iron overload and preexisting liver disease. For V. vulnificus, the ability to acquire iron from the host has been shown to correlate with virulence. V. vulnificus is able to use host iron sources such as hemoglobin and heme. We previously constructed a fur mutant of V. vulnificus which constitutively expresses at least two iron-regulated outer membrane proteins, of 72 and 77 kDa. The N-terminal amino acid sequence of the 77-kDa protein purified from the V. vulnificus fur mutant had 67% homology with the first 15 amino acids of the mature protein of the Vibrio cholerae heme receptor, HutA. In this report, we describe the cloning, DNA sequence, mutagenesis, and analysis of transcriptional regulation of the structural gene for HupA, the heme receptor of V. vulnificus. DNA sequencing of hupA demonstrated a single open reading frame of 712 amino acids that was 50% identical and 66% similar to the sequence of V. cholerae HutA and similar to those of other TonB-dependent outer membrane receptors. Primer extension analysis localized one promoter for the V. vulnificus hupA gene. Analysis of the promoter region of V. vulnificus hupA showed a sequence homologous to the consensus Fur box. Northern blot analysis showed that the transcript was strongly regulated by iron. An internal deletion in the V. vulnificus hupA gene, done by using marker exchange, resulted in the loss of expression of the 77-kDa protein and the loss of the ability to use hemin or hemoglobin as a source of iron. The hupA deletion mutant of V. vulnificus will be helpful in future studies of the role of heme iron in V. vulnificus pathogenesis. PMID:9632577
Shotgun Protein Sequencing with Meta-contig Assembly
Guthals, Adrian; Clauser, Karl R.; Bandeira, Nuno
2012-01-01
Full-length de novo sequencing from tandem mass (MS/MS) spectra of unknown proteins such as antibodies or proteins from organisms with unsequenced genomes remains a challenging open problem. Conventional algorithms designed to individually sequence each MS/MS spectrum are limited by incomplete peptide fragmentation or low signal to noise ratios and tend to result in short de novo sequences at low sequencing accuracy. Our shotgun protein sequencing (SPS) approach was developed to ameliorate these limitations by first finding groups of unidentified spectra from the same peptides (contigs) and then deriving a consensus de novo sequence for each assembled set of spectra (contig sequences). But whereas SPS enables much more accurate reconstruction of de novo sequences longer than can be recovered from individual MS/MS spectra, it still requires error-tolerant matching to homologous proteins to group smaller contig sequences into full-length protein sequences, thus limiting its effectiveness on sequences from poorly annotated proteins. Using low and high resolution CID and high resolution HCD MS/MS spectra, we address this limitation with a Meta-SPS algorithm designed to overlap and further assemble SPS contigs into Meta-SPS de novo contig sequences extending as long as 100 amino acids at over 97% accuracy without requiring any knowledge of homologous protein sequences. We demonstrate Meta-SPS using distinct MS/MS data sets obtained with separate enzymatic digestions and discuss how the remaining de novo sequencing limitations relate to MS/MS acquisition settings. PMID:22798278
Shotgun protein sequencing with meta-contig assembly.
Guthals, Adrian; Clauser, Karl R; Bandeira, Nuno
2012-10-01
Full-length de novo sequencing from tandem mass (MS/MS) spectra of unknown proteins such as antibodies or proteins from organisms with unsequenced genomes remains a challenging open problem. Conventional algorithms designed to individually sequence each MS/MS spectrum are limited by incomplete peptide fragmentation or low signal to noise ratios and tend to result in short de novo sequences at low sequencing accuracy. Our shotgun protein sequencing (SPS) approach was developed to ameliorate these limitations by first finding groups of unidentified spectra from the same peptides (contigs) and then deriving a consensus de novo sequence for each assembled set of spectra (contig sequences). But whereas SPS enables much more accurate reconstruction of de novo sequences longer than can be recovered from individual MS/MS spectra, it still requires error-tolerant matching to homologous proteins to group smaller contig sequences into full-length protein sequences, thus limiting its effectiveness on sequences from poorly annotated proteins. Using low and high resolution CID and high resolution HCD MS/MS spectra, we address this limitation with a Meta-SPS algorithm designed to overlap and further assemble SPS contigs into Meta-SPS de novo contig sequences extending as long as 100 amino acids at over 97% accuracy without requiring any knowledge of homologous protein sequences. We demonstrate Meta-SPS using distinct MS/MS data sets obtained with separate enzymatic digestions and discuss how the remaining de novo sequencing limitations relate to MS/MS acquisition settings.
Low molecular weight squash trypsin inhibitors from Sechium edule seeds.
Laure, Hélen J; Faça, Vítor M; Izumi, Clarice; Padovan, Júlio C; Greene, Lewis J
2006-02-01
Nine chromatographic components containing trypsin inhibitor activity were isolated from Sechium edule seeds by acetone fractionation, gel filtration, affinity chromatography and RP-HPLC in an overall yield of 46% of activity and 0.05% of protein. The components obtained with highest yield of total activity and highest specific activity were sequenced by Edman degradation and their molecular masses determined by mass spectrometry. The inhibitors contained 31, 32 and 27 residues per molecule and their sequences were: SETI-IIa, EDRKCPKILMRCKRDSDCLAKCTCQESGYCG; SETI-IIb, EEDRKCPKILMRCKRDSDCLAKCTCQESGYCG and SETI-V, CPRILMKCKLDTDCFPTCTCRPSGFCG. SETI-IIa and SETI-IIb, which differed by an amino-terminal E in the IIb form, were not separable under the conditions employed. The sequences are consistent with consensus sequences obtained from 37 other inhibitors: CPriI1meCk_DSDCla_C_C_G_CG, where capital letters are invariant amino acid residues and lower case letters are the most preserved in this position. SETI-II and SETI-V form complexes with trypsin with a 1:1 stoichiometry and have dissociation constants of 5.4 × 10⁻¹¹ M and 1.1 × 10⁻⁹ M, respectively.
Fine-tuning structural RNA alignments in the twilight zone
2010-01-01
Background A widely used method to find conserved secondary structure in RNA is to first construct a multiple sequence alignment, and then fold the alignment, optimizing a score based on thermodynamics and covariance. This method works best around 75% sequence similarity. However, in a "twilight zone" below 55% similarity, the sequence alignment tends to obscure the covariance signal used in the second phase. Therefore, while the overall shape of the consensus structure may still be found, the degree of conservation cannot be estimated reliably. Results Based on a combination of available methods, we present a method named planACstar for improving structure conservation in structural alignments in the twilight zone. After constructing a consensus structure by alignment folding, planACstar abandons the original sequence alignment, refolds the sequences individually but consistently with the consensus, aligns the structures, irrespective of sequence, by a pure structure alignment method, and derives an improved sequence alignment from the alignment of structures, to be re-submitted to alignment folding, and so on. This cycle may be iterated as long as structural conservation improves, but normally, one step suffices. Conclusions Employing the tools ClustalW, RNAalifold, and RNAforester, we find that for sequences with 30-55% sequence identity, structural conservation can be improved by 10% on average, with a large variation, measured in terms of RNAalifold's own criterion, the structure conservation index. PMID:20433706
Merino, Susana; Knirel, Yuriy A.; Regué, Miguel; Tomás, Juan M.
2013-01-01
We experimentally identified the activities of six predicted heptosyltransferases in Actinobacillus pleuropneumoniae genome serotype 5b strain L20 and serotype 3 strain JL03. The initial identification was based on a bioinformatic analysis of the amino acid similarity between these putative heptosyltransferases with others of known function from enteric bacteria and Aeromonas. The putative functions of all the Actinobacillus pleuropneumoniae heptosyltransferases were determined by using surrogate LPS acceptor molecules from well-defined A. hydrophila AH-3 and A. salmonicida A450 mutants. Our results show that heptosyltransferases APL_0981 and APJL_1001 are responsible for the transfer of the terminal outer core D-glycero-D-manno-heptose (D,D-Hep) residue although they are not currently included in the CAZY glycosyltransferase 9 family. The WahF heptosyltransferase group signature sequence [S(T/S)(GA)XXH] differs from the heptosyltransferases consensus signature sequence [D(TS)(GA)XXH], because of the substitution of D261 for S261, being unique. PMID:23383222
Listorti, Valeria; Laconi, Andrea; Catelli, Elena; Cecchinato, Mattia; Lupini, Caterina; Naylor, Clive J
2017-10-09
IBV genotype QX causes sufficient disease in Europe for several commercial companies to have started developing live attenuated vaccines. Here, one of those vaccines (L1148) was fully consensus sequenced alongside its progenitor field strain (1148-A) to determine vaccine markers, thereby enabling detection on farms. Twenty-eight single nucleotide substitutions were associated with the 1148-A attenuation, of which any combination can identify vaccine L1148 in the field. Sixteen substitutions resulted in amino acid coding changes of which half were in spike. One change in the 1b gene altered the normally highly conserved final 5 nucleotides of the transcription regulatory sequence of the S gene, common to all IBV QX genes. No mutations can currently be associated with the attenuation process. Field vaccination strategies would greatly benefit by such comparative sequence data being mandatorily submitted to regulators prior to vaccine release following a successful registration process. Copyright © 2017. Published by Elsevier Ltd.
de la Rosa, Guillermo; Corrales-García, Ligia L; Rodriguez-Ruiz, Ximena; López-Vera, Estuardo; Corzo, Gerardo
2018-07-01
The three-fingered toxin family and more precisely short-chain α-neurotoxins (also known as Type I α-neurotoxins) are crucial in defining the elapid envenomation process, but paradoxically, they are barely neutralized by current elapid snake antivenoms. This work has been focused on the primary structural identity among Type I neurotoxins in order to create a consensus short-chain α-neurotoxin with conserved characteristics. A multiple sequence alignment considering the twelve most toxic short-chain α-neurotoxins reported from the venoms of the elapid genera Acanthophis, Oxyuranus, Walterinnesia, Naja, Dendroaspis and Micrurus led us to propose a short-chain consensus α-neurotoxin, here named ScNtx. The synthetic ScNtx gene was de novo constructed and cloned into the expression vector pQE30 containing a 6His-Tag and an FXa proteolytic cleavage region. Escherichia coli Origami cells transfected with the pQE30/ScNtx vector expressed the recombinant consensus neurotoxin in a soluble form with a yield of 1.5 mg/L of culture medium. The 60-amino acid residue ScNtx contains canonical structural motifs similar to α-neurotoxins from African elapids and its LD50 of 3.8 µg/mouse is similar to the most toxic short-chain α-neurotoxins reported from elapid venoms. Furthermore, ScNtx was also able to antagonize muscular, but not neuronal, nicotinic acetylcholine receptors (nAChR). Rabbits immunized with ScNtx were able to immune-recognize short-chain α-neurotoxins within whole elapid venoms. Type I neurotoxins are difficult to isolate and purify from natural sources; therefore, the heterologous expression of molecules such as ScNtx, bearing crucial motifs and key amino acids, is a step forward to create common immunogens for developing cost-effective antivenoms with a wider spectrum of efficacy, quality and strong therapeutic value.
SAM-VI RNAs selectively bind S-adenosylmethionine and exhibit similarities to SAM-III riboswitches.
Mirihana Arachchilage, Gayan; Sherlock, Madeline E; Weinberg, Zasha; Breaker, Ronald R
2018-03-04
Five distinct riboswitch classes that regulate gene expression in response to the cofactor S-adenosylmethionine (SAM) or its metabolic breakdown product S-adenosylhomocysteine (SAH) have been reported previously. Collectively, these SAM- or SAH-sensing RNAs constitute the most abundant collection of riboswitches, and are found in nearly every major bacterial lineage. Here, we report a potential sixth member of this pervasive riboswitch family, called SAM-VI, which is predominantly found in Bifidobacterium species. SAM-VI aptamers selectively bind the cofactor SAM and strongly discriminate against SAH. The consensus sequence and structural model for SAM-VI share some features with the consensus model for the SAM-III riboswitch class, whose members are mainly found in lactic acid bacteria. However, there are sufficient differences between the two classes such that current bioinformatics methods separately cluster representatives of the two motifs. These findings highlight the abundance of RNA structures that can form to selectively recognize SAM, and showcase the ability of RNA to utilize diverse strategies to perform similar biological functions.
Characterization and chromosomal mapping of the human TFG gene involved in thyroid carcinoma
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mencinger, M.; Panagopoulos, I.; Andreasson, P.
1997-05-01
Homology searches in the Expressed Sequence Tag Database were performed using SPYGQ-rich regions as query sequences to find genes encoding protein regions similar to the N-terminal parts of the sarcoma-associated EWS and FUS proteins. Clone 22911 (T74973), encoding a SPYGQ-rich region in its 5′ end, and several other clones that overlapped 22911 were selected. The combined data made it possible to assemble a full-length cDNA sequence. This cDNA sequence is 1677 bp, containing an initiation codon ATG, an open reading frame of 400 amino acids, a poly(A) signal, and a poly(A) tail. We found 100% identity between the 5′ part of the consensus sequence and the 598-bp-long sequence named TFG. The TFG sequence is fused to the 3′ end of NTRK1, generating the TRK-T3 fusion transcript found in papillary thyroid carcinoma. The cDNA therefore represents the full-length transcript of the TFG gene. TFG was localized to 3q11-q12 by fluorescence in situ hybridization. The 3′ and the 5′ ends of the TFG cDNA probe hybridized to a 2.2-kb band on Northern blot filters in all tissues examined. 28 refs., 5 figs., 1 tab.
Savidor, Alon; Barzilay, Rotem; Elinger, Dalia; Yarden, Yosef; Lindzen, Moshit; Gabashvili, Alexandra; Adiv Tal, Ophir; Levin, Yishai
2017-06-01
Traditional "bottom-up" proteomic approaches use proteolytic digestion, LC-MS/MS, and database searching to elucidate peptide identities and their parent proteins. Protein sequences absent from the database cannot be identified, and even if present in the database, complete sequence coverage is rarely achieved even for the most abundant proteins in the sample. Thus, sequencing of unknown proteins such as antibodies or constituents of metaproteomes remains a challenging problem. To date, there is no available method for full-length protein sequencing, independent of a reference database, in high throughput. Here, we present Database-independent Protein Sequencing, a method for unambiguous, rapid, database-independent, full-length protein sequencing. The method is a novel combination of non-enzymatic, semi-random cleavage of the protein, LC-MS/MS analysis, peptide de novo sequencing, extraction of peptide tags, and their assembly into a consensus sequence using an algorithm named "Peptide Tag Assembler." As proof-of-concept, the method was applied to samples of three known proteins representing three size classes and to a previously un-sequenced, clinically relevant monoclonal antibody. Excluding leucine/isoleucine and glutamic acid/deamidated glutamine ambiguities, end-to-end full-length de novo sequencing was achieved with 99-100% accuracy for all benchmarking proteins and the antibody light chain. Accuracy of the sequenced antibody heavy chain, including the entire variable region, was also 100%, but there was a 23-residue gap in the constant region sequence. © 2017 by The American Society for Biochemistry and Molecular Biology, Inc.
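The idea of assembling overlapping peptide tags into one consensus sequence, as described in the abstract above, can be illustrated with a generic greedy suffix-prefix merge. This is explicitly not the authors' "Peptide Tag Assembler", whose details are not given here; it is a textbook-style sketch that assumes error-free tags, and every name, the minimum overlap of 3 residues, and the example tags are hypothetical.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of a equal to a prefix of b, or 0."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(tags: list[str], min_len: int = 3) -> list[str]:
    """Repeatedly merge the pair of tags with the largest overlap."""
    tags = list(dict.fromkeys(tags))  # drop exact duplicates, keep order
    # Drop tags fully contained in another tag.
    tags = [t for t in tags if not any(t != u and t in u for u in tags)]
    while len(tags) > 1:
        best_k, best_i, best_j = 0, -1, -1
        for i, a in enumerate(tags):
            for j, b in enumerate(tags):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # no remaining overlaps; leave separate contigs
        merged = tags[best_i] + tags[best_j][best_k:]
        tags = [t for idx, t in enumerate(tags)
                if idx not in (best_i, best_j)] + [merged]
    return tags


if __name__ == "__main__":
    # Made-up peptide tags for illustration only.
    print(greedy_assemble(["MKTAYIAK", "YIAKQRQI", "QRQISFVK"]))
```

A real assembler would also have to tolerate de novo sequencing errors and the leucine/isoleucine ambiguity noted in the abstract, neither of which this exact-match sketch handles.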
Chromosome-encoded narrow-spectrum Ambler class A beta-lactamase GIL-1 from Citrobacter gillenii.
Naas, Thierry; Aubert, Daniel; Ozcan, Ayla; Nordmann, Patrice
2007-04-01
A novel beta-lactamase gene was cloned from the whole-cell DNA of an enterobacterial Citrobacter gillenii reference strain that displayed a weak narrow-spectrum beta-lactam-resistant phenotype and was expressed in Escherichia coli. It encoded a clavulanic acid-inhibited Ambler class A beta-lactamase, GIL-1, with a pI value of 7.5 and a molecular mass of ca. 29 kDa. GIL-1 had the highest percent amino acid sequence identity with TEM-1 and SHV-1, 77% and 67%, respectively, and only 46%, 31%, and 32% amino acid sequence identity with CKO-1 (C. koseri), CdiA1 (C. diversus), and SED-1 (C. sedlaki), respectively. The substrate profile of the purified GIL-1 was similar to that of beta-lactamases TEM-1 and SHV-1. The blaGIL-1 gene was chromosomally located, as revealed by I-CeuI experiments, and was constitutively expressed at a low level in C. gillenii. No gene homologous to the regulatory ampR genes of chromosomal class C beta-lactamases was found upstream of the blaGIL-1 gene, which fits the noninducibility of beta-lactamase expression in C. gillenii. Rapid amplification of DNA 5' ends analysis of the promoter region revealed putative promoter sequences that diverge from what has been identified as the consensus sequence in E. coli. The blaGIL-1 gene was part of a 5.5-kb DNA fragment bracketed by a 9-bp duplication and inserted between the d-lactate dehydrogenase gene and the ydbH genes; this DNA fragment was absent in other Citrobacter species. This work further illustrates the heterogeneity of beta-lactamases in Citrobacter spp., which may indicate that the variability of Citrobacter species is greater than expected.
Rogan, P K; Schneider, T D
1995-01-01
Predicting the effects of nucleotide substitutions in human splice sites has been based on analysis of consensus sequences. We used a graphic representation of sequence conservation and base frequency, the sequence logo, to demonstrate that a change in a splice acceptor of hMSH2 (a gene associated with familial nonpolyposis colon cancer) probably does not reduce splicing efficiency. This confirms a population genetic study that suggested that this substitution is a genetic polymorphism. The information theory-based sequence logo is quantitative and more sensitive than the corresponding splice acceptor consensus sequence for detection of true mutations. Information analysis may potentially be used to distinguish polymorphisms from mutations in other types of transcriptional, translational, or protein-coding motifs.
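To make the contrast between a consensus sequence and an information-based measure concrete, the following minimal Python sketch computes the per-position information content (in bits) of a set of aligned DNA sites as 2 minus the Shannon entropy of the observed base frequencies. The function name `position_information` and the four toy sites are illustrative assumptions, not data from the study, and the sketch omits any small-sample correction, so it should be read as an approximation of the idea rather than the authors' procedure.

```python
import math
from collections import Counter


def position_information(aligned_sites):
    """Return the information content (bits) of each column of aligned DNA sites.

    Uses 2 - H, where H is the Shannon entropy of the observed base
    frequencies in the column. No small-sample correction is applied.
    """
    length = len(aligned_sites[0])
    if any(len(site) != length for site in aligned_sites):
        raise ValueError("all sites must have the same length")

    info = []
    for col in range(length):
        counts = Counter(site[col] for site in aligned_sites)
        total = sum(counts.values())
        entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
        info.append(2.0 - entropy)
    return info


# Hypothetical toy alignment (not real splice-acceptor data).
sites = ["TTTCAGG", "CTTCAGA", "TCTCAGG", "TTCTAGG"]
bits = position_information(sites)
```

A fully conserved column scores 2 bits, while a column with mixed bases scores less; a consensus string would report only the most frequent base in either case and so hides this difference.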
Lahr, Roni M; Mack, Seshat M; Héroux, Annie; Blagden, Sarah P; Bousquet-Antonelli, Cécile; Deragon, Jean-Marc; Berman, Andrea J
2015-09-18
La-related protein 1 (LARP1) regulates the stability of many mRNAs. These include 5'TOPs, mTOR-kinase responsive mRNAs with pyrimidine-rich 5' UTRs, which encode ribosomal proteins and translation factors. We determined that the highly conserved LARP1-specific C-terminal DM15 region of human LARP1 directly binds a 5'TOP sequence. The crystal structure of this DM15 region refined to 1.86 Å resolution has three structurally related and evolutionarily conserved helix-turn-helix modules within each monomer. These motifs resemble HEAT repeats, ubiquitous helical protein-binding structures, but their sequences are inconsistent with consensus sequences of known HEAT modules, suggesting this structure has been repurposed for RNA interactions. A putative mTORC1-recognition sequence sits within a flexible loop C-terminal to these repeats. We also present modelling of pyrimidine-rich single-stranded RNA onto the highly conserved surface of the DM15 region. These studies lay the foundation necessary for proceeding toward a structural mechanism by which LARP1 links mTOR signalling to ribosome biogenesis. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
Hamilton, P T; Reeve, J N
1985-01-01
DNA fragments cloned from the methanogenic archaebacterium Methanobrevibacter smithii which complement mutations in the purE and proC genes of E. coli have been sequenced. Sequence analyses, transposon mutagenesis and expression in E. coli minicells indicate that purE and proC complementations result from the synthesis of M. smithii polypeptides with molecular weights of 36,697 and 27,836 respectively. The encoding genes appear to be located in operons. The M. smithii genome contains 69% A/T basepairs (bp) which is reflected in unusual codon usages and intergenic regions containing approximately 85% A/T bp. An insertion element, designated ISM1, was found within the cloned M. smithii DNA located adjacent to the proC complementing region. ISM1 is 1381 bp in length, has 29 bp terminal inverted repeat sequences and contains one major ORF encoded in 87% of the ISM1 sequence. ISM1 is mobile, present in approximately 10 copies per genome and integration duplicates 8 bp at the site of insertion. The duplicated sequences show homology with sequences within the 29 bp terminal repeat sequence of ISM1. Comparison of our data with sequences from halophilic archaebacteria suggests that 5'GAANTTTCA and 5'TTTTAATATAAA may be consensus promoter sequences for archaebacteria. These sequences closely resemble the consensus sequences which precede Drosophila heat-shock genes (Pelham 1982; Davidson et al. 1983). Methanogens appear to employ the eubacterial system of mRNA:16S rRNA hybridization to ensure initiation of translation; the consensus ribosome binding sequence is 5'AGGTGA.
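A consensus such as 5'GAANTTTCA contains an ambiguity symbol (N, any nucleotide), so it cannot be located with a plain substring search. The sketch below, in Python, expands standard IUPAC nucleotide codes into a regular expression and reports every match position, including overlapping ones. The function name `find_consensus` and the test string are hypothetical; only the consensus itself is taken from the abstract.

```python
import re

# Standard IUPAC nucleotide ambiguity codes expressed as regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def find_consensus(consensus, dna):
    """Return 0-based start positions where `consensus` matches `dna`.

    A lookahead is used so that overlapping occurrences are all reported.
    """
    body = "".join(IUPAC[base] for base in consensus.upper())
    pattern = re.compile(f"(?=({body}))")
    return [m.start() for m in pattern.finditer(dna.upper())]


# Hypothetical sequence containing one occurrence of the proposed consensus.
hits = find_consensus("GAANTTTCA", "ccGAAGTTTCAtt")
misses = find_consensus("GAANTTTCA", "ccGAACTTGCAtt")
```

Only the forward strand is scanned here; searching the reverse complement as well would be a natural extension.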
El-Assaad, Atlal; Dawy, Zaher; Nemer, Georges; Hajj, Hazem; Kobeissy, Firas H
2017-01-01
Degradomics is a novel discipline that involves determination of the proteases/substrate fragmentation profile, called the substrate degradome, and has been recently applied in different disciplines. A major application of degradomics is its utility in the field of biomarkers where the breakdown products (BDPs) of different proteases have been investigated. Among the major proteases assessed, calpain and caspase proteases have been associated with the execution phases of pro-apoptotic and pro-necrotic cell death, generating caspase/calpain-specific cleaved fragments. The distinction between calpain and caspase protein fragments has been applied to distinguish injury mechanisms. Advanced proteomics technology has been used to identify these BDPs experimentally. However, it has been a challenge to identify these BDPs with high precision and efficiency, especially if we are targeting a number of proteins at one time. In this chapter, we present a novel bioinformatic detection method that identifies BDPs accurately and efficiently with validation against experimental data. This method aims at predicting the consensus sequence occurrences and their variants in a large set of experimentally detected protein sequences based on state-of-the-art sequence matching and alignment algorithms. After detection, the method generates all the potential fragments cleaved by a specific protease. This space- and time-efficient algorithm is flexible enough to handle the different orientations that the consensus sequence and the protein sequence can take before cleaving. It is O(mn) in space complexity and O(Nmn) in time complexity, where N is the number of protein sequences, m the length of the consensus sequence, and n the length of each protein sequence. Ultimately, this knowledge will feed into the development of a novel tool for researchers to detect diverse types of selected BDPs as putative disease markers, contributing to the diagnosis and treatment of related disorders.
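The stated complexity bounds are what one would expect from a dynamic-programming table of size m by n filled once per protein. As an illustration only, the Python sketch below uses a textbook approximate-matching recurrence (edit distance with a free start in the protein) to locate where a consensus motif ends, then splits the protein at those positions. It is not the chapter's algorithm; the function names, the `max_edits` parameter, the rule "cut immediately after the matched motif", and the toy sequence are all assumptions made for the example.

```python
def match_ends(consensus, protein, max_edits=0):
    """Return 1-based end positions in `protein` where `consensus` matches
    some substring with at most `max_edits` edits.

    dist[i][j] is the minimum number of edits needed to turn consensus[:i]
    into a substring of `protein` ending at position j. The table has
    (m + 1) x (n + 1) cells, so space and time per protein are O(mn).
    """
    m, n = len(consensus), len(protein)
    dist = [[0] * (n + 1) for _ in range(m + 1)]  # row 0 stays 0: free start
    for i in range(1, m + 1):
        dist[i][0] = i
        for j in range(1, n + 1):
            substitute = dist[i - 1][j - 1] + (consensus[i - 1] != protein[j - 1])
            dist[i][j] = min(substitute, dist[i - 1][j] + 1, dist[i][j - 1] + 1)
    return [j for j in range(1, n + 1) if dist[m][j] <= max_edits]


def fragments(protein, cut_positions):
    """Split `protein` after each cut position, dropping empty pieces."""
    cuts = [0] + sorted(set(cut_positions)) + [len(protein)]
    return [protein[a:b] for a, b in zip(cuts, cuts[1:]) if a < b]


def scan(consensus, proteins, max_edits=0):
    """Apply the search to N proteins, giving O(Nmn) time overall."""
    return {
        name: fragments(seq, match_ends(consensus, seq, max_edits))
        for name, seq in proteins.items()
    }


# Hypothetical protein containing one exact DEVD motif.
result = scan("DEVD", {"toy": "MKDEVDGSTT"})
```

With `max_edits` greater than zero, neighbouring end positions around a single site can all fall under the threshold, so a real tool would need to merge them before cutting.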
Cao, Jingyuan; Zhou, Wenting; Yi, Yao; Jia, Zhiyuan; Bi, Shengli
2013-01-01
Hepatitis A virus (HAV) is the most common cause of infectious hepatitis throughout the world, spread largely by the fecal-oral route. To characterize the genetic diversity of the virus circulating in China, where HAV is endemic, we selected the outbreak cases with identical sequences in the VP1-2A junction region and compiled a panel of 42 isolates. The VP3-VP1-2A regions of the HAV capsid-coding genes were further sequenced and analyzed. The quasispecies distribution was evaluated by cloning the VP3 and VP1-2A genes in three clinical samples. Phylogenetic analysis demonstrated that the same genotyping results could be obtained whether using the complete VP3, VP1, or partial VP1-2A genes for analysis in this study, although some differences did exist. Most isolates clustered in sub-genotype IA, and fewer in sub-genotype IB. No amino acid mutations were found at the published neutralizing epitope sites; however, several unique amino acid substitutions in the VP3 or VP1 region were identified, with two amino acid variants closely located to the immunodominant site. Quasispecies analysis showed the mutation frequencies were in the range of 7.22 × 10^-4 to 2.33 × 10^-3 substitutions per nucleotide for VP3, VP1, or VP1-2A. When compared with the consensus sequences, mutated nucleotide sites represented the minority of all the analyzed sequence sites. HAV replicated as a complex distribution of closely genetically related variants referred to as quasispecies, and was under negative selection. The results indicate that diverse HAV strains and quasispecies inside the viral populations are present in China, with unique amino acid substitutions detected close to the immunodominant site, and that the possibility of antigenic escape mutants cannot be ruled out and needs to be further analyzed. PMID:24069343
2012-01-01
Background Invertebrate biominerals are characterized by their extraordinary functionality and physical properties, such as strength, stiffness and toughness that by far exceed those of the pure mineral component of such composites. This is attributed to the organic matrix, secreted by specialized cells, which pervades and envelops the mineral crystals. Despite the obvious importance of the protein fraction of the organic matrix, only few in-depth proteomic studies have been performed due to the lack of comprehensive protein sequence databases. The recent public release of the gastropod Lottia gigantea genome sequence and the associated protein sequence database provides for the first time the opportunity to do a state-of-the-art proteomic in-depth analysis of the organic matrix of a mollusc shell. Results Using three different sodium hypochlorite washing protocols before shell demineralization, a total of 569 proteins were identified in Lottia gigantea shell matrix. Of these, 311 were assembled in a consensus proteome comprising identifications contained in all proteomes irrespective of shell cleaning procedure. Some of these proteins were similar in amino acid sequence, amino acid composition, or domain structure to proteins identified previously in different bivalve or gastropod shells, such as BMSP, dermatopontin, nacrein, perlustrin, perlucin, or Pif. In addition there were dozens of previously uncharacterized proteins, many containing repeated short linear motifs or homorepeats. Such proteins may play a role in shell matrix construction or control of mineralization processes. Conclusions The organic matrix of Lottia gigantea shells is a complex mixture of proteins comprising possible homologs of some previously characterized mollusc shell proteins, but also many novel proteins with a possible function in biomineralization as framework building blocks or as regulatory components. 
We hope that this data set, the most comprehensive available at present, will provide a platform for the further exploration of biomineralization processes in molluscs. PMID:22540284
Yasukawa, Hiro; Kuroita, Toshihiro; Tamura, Kentaro; Yamaguchi, Kazuo
2003-07-01
Penicillin binding proteins (PBPs) are penicillin-sensitive DD-peptidases catalyzing the terminal stages of bacterial cell wall assembly. We identified a Dictyostelium discoideum gene that encodes a protein of 522 amino acids showing similarity to Escherichia coli PBP4. The D. discoideum protein conserves three consensus sequences (SXXK, SXN and KTG) that are responsible for the catalytic activities of PBPs. The gene product prepared in the cell-free translation system showed carboxypeptidase activity but the activity was not detected in the presence of penicillin G. These results demonstrate that the D. discoideum gene encodes a eukaryotic form of penicillin-sensitive carboxypeptidase.
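The three motifs named above are short patterns in which X stands for any amino acid, so they map directly onto regular expressions. The following Python sketch reports the 1-based positions of each motif in a protein sequence. The helper name `find_motifs` and the toy sequence are hypothetical and do not come from the D. discoideum protein.

```python
import re

# X means "any residue", written as "." in the regex.
MOTIFS = {"SXXK": "S..K", "SXN": "S.N", "KTG": "KTG"}


def find_motifs(protein):
    """Return {motif name: [1-based start positions]} for the PBP motifs."""
    hits = {}
    for name, pattern in MOTIFS.items():
        # Lookahead so that overlapping occurrences are not skipped.
        regex = re.compile(f"(?=({pattern}))")
        hits[name] = [m.start() + 1 for m in regex.finditer(protein)]
    return hits


# Hypothetical toy sequence with one copy of each motif.
found = find_motifs("MASTLKAAASGNPPKTGA")
```

Because such short patterns occur by chance in many proteins, a hit is only suggestive; the abstract's conclusion rests on the activity assay, not on motif presence alone.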
Stabilizing IkappaBalpha by "consensus" design.
Ferreiro, Diego U; Cervantes, Carla F; Truhlar, Stephanie M E; Cho, Samuel S; Wolynes, Peter G; Komives, Elizabeth A
2007-01-26
IkappaBalpha is the major regulator of transcription factor NF-kappaB function. The ankyrin repeat region of IkappaBalpha mediates specific interactions with NF-kappaB dimers, but ankyrin repeats 1, 5 and 6 display a highly dynamic character when not in complex with NF-kappaB. Using chemical denaturation, we show here that IkappaBalpha displays two folding transitions: a non-cooperative conversion under weak perturbation, and a major cooperative folding phase upon stronger insult. Taking advantage of a native Trp residue in ankyrin repeat (AR) 6 and engineered Trp residues in AR2, AR4 and AR5, we show that the cooperative transition involves AR2 and AR3, while the non-cooperative transition involves AR5 and AR6. The major structural transition can be affected by single amino acid substitutions converging to the "consensus" ankyrin repeat sequence, increasing the native state stability significantly. We further characterized the structural and dynamic properties of the native state ensemble of IkappaBalpha and the stabilized mutants by H/(2)H exchange mass spectrometry and NMR. The solution experiments were complemented with molecular dynamics simulations to elucidate the microscopic origins of the stabilizing effect of the consensus substitutions, which can be traced to the fast conformational dynamics of the folded ensemble.
Seo, H S; Kim, H Y; Jeong, J Y; Lee, S Y; Cho, M J; Bahk, J D
1995-03-01
A cDNA clone, RGA1, was isolated by using a GPA1 cDNA clone of Arabidopsis thaliana G protein alpha subunit as a probe from a rice (Oryza sativa L. IR-36) seedling cDNA library from roots and leaves. Sequence analysis of genomic clone reveals that the RGA1 gene has 14 exons and 13 introns, and encodes a polypeptide of 380 amino acid residues with a calculated molecular weight of 44.5 kDa. The encoded protein exhibits a considerable degree of amino acid sequence similarity to all the other known G protein alpha subunits. A putative TATA sequence (ATATGA), a potential CAAT box sequence (AGCAATAC), and a cis-acting element, CCACGTGG (ABRE), known to be involved in ABA induction are found in the promoter region. The RGA1 protein contains all the consensus regions of G protein alpha subunits except the cysteine residue near the C-terminus for ADP-ribosylation by pertussis toxin. The RGA1 polypeptide expressed in Escherichia coli was, however, ADP-ribosylated by 10 microM [adenylate-32P] NAD and activated cholera toxin. Southern analysis indicates that there are no other genes similar to the RGA1 gene in the rice genome. Northern analysis reveals that the RGA1 mRNA is 1.85 kb long and expressed in vegetative tissues, including leaves and roots, and that its expression is regulated by light.
Plucienniczak, A; Schroeder, E; Zettlmeissl, G; Streeck, R E
1985-01-01
The nucleotide sequence of a 7.6 kb vaccinia DNA segment from a genomic region conserved among different orthopoxviruses has been determined. This segment contains a tight cluster of 12 partly overlapping open reading frames, most of which can be correlated with previously identified early and late proteins and mRNAs. Regulatory signals used by vaccinia virus have been studied. Presumptive promoter regions are rich in A and T and carry the consensus sequences TATA and AATAA spaced at 20-24 base pairs. Tandem repeats of a CTATTC consensus sequence are proposed to be involved in the termination of early transcription. PMID:2987815
Pester, Michael; Rattei, Thomas; Flechl, Stefan; Gröngröft, Alexander; Richter, Andreas; Overmann, Jörg; Reinhold-Hurek, Barbara; Loy, Alexander; Wagner, Michael
2012-01-01
Ammonia-oxidizing archaea (AOA) play an important role in nitrification and many studies exploit their amoA genes as a marker for their diversity and abundance. We present an archaeal amoA consensus phylogeny based on all publicly available sequences (status June 2010) and provide evidence for the diversification of AOA into four previously recognized clusters and one newly identified major cluster. These clusters, for which we suggest a new nomenclature, harboured 83 AOA species-level OTUs (using an inferred species threshold of 85% amoA identity). 454 pyrosequencing of amoA amplicons from 16 soils sampled in Austria, Costa Rica, Greenland and Namibia revealed that only 2% of retrieved sequences had no database representative at the species level and represented 30–37 additional species-level OTUs. With the exception of an acidic soil from which mostly amoA amplicons of the Nitrosotalea cluster were retrieved, all soils were dominated by amoA amplicons from the Nitrososphaera cluster (also called group I.1b), indicating that the previously reported AOA from the Nitrosopumilus cluster (also called group I.1a) are absent or represent minor populations in soils. AOA richness estimates at the species level ranged from 8–83 co-existing AOAs per soil. Presence/absence of amoA OTUs (97% identity level) correlated with geographic location, indicating that besides contemporary environmental conditions also dispersal limitation across different continents and/or historical environmental conditions might influence AOA biogeography in soils. PMID:22141924
Reddy, M K; Nair, S; Singh, B N; Mudgil, Y; Tewari, K K; Sopory, S K
2001-01-24
We report the cloning and sequencing of both cDNA and genomic DNA of a 33 kDa chloroplast ribonucleoprotein (33RNP) from pea. The analysis of the predicted amino acid sequence of the cDNA clone revealed that the encoded protein contains two RNA binding domains, including the conserved consensus ribonucleoprotein sequences CS-RNP1 and CS-RNP2, on the C-terminus half and the presence of a putative transit peptide sequence in the N-terminus region. The phylogenetic and multiple sequence alignment analysis of pea chloroplast RNP along with RNPs reported from the other plant sources revealed that the pea 33RNP is very closely related to Nicotiana sylvestris 31RNP and 28RNP and also to 31RNP and 28RNP of Arabidopsis and spinach, respectively. The pea 33RNP was expressed in Escherichia coli and purified to homogeneity. The in vitro import of precursor protein into chloroplasts confirmed that the N-terminus putative transit peptide is a bona fide transit peptide and 33RNP is localized in the chloroplast. The nucleic acid-binding properties of the recombinant protein, as revealed by South-Western analysis, showed that 33RNP has higher binding affinity for poly (U) and oligo dT than for ssDNA and dsDNA. The steady state transcript level was higher in leaves than in roots and the expression of this gene is light stimulated. Sequence analysis of the genomic clone revealed that the gene contains four exons and three introns. We have also isolated and analyzed the 5' flanking region of the pea 33RNP gene.
R2R--software to speed the depiction of aesthetic consensus RNA secondary structures.
Weinberg, Zasha; Breaker, Ronald R
2011-01-04
With continuing identification of novel structured noncoding RNAs, there is an increasing need to create schematic diagrams showing the consensus features of these molecules. RNA structural diagrams are typically made either with general-purpose drawing programs like Adobe Illustrator, or with automated or interactive programs specific to RNA. Unfortunately, the use of applications like Illustrator is extremely time consuming, while existing RNA-specific programs produce figures that are useful, but usually not of the same aesthetic quality as those produced at great cost in Illustrator. Additionally, most existing RNA-specific applications are designed for drawing single RNA molecules, not consensus diagrams. We created R2R, a computer program that facilitates the generation of aesthetic and readable drawings of RNA consensus diagrams in a fraction of the time required with general-purpose drawing programs. Since the inference of a consensus RNA structure typically requires a multiple-sequence alignment, the R2R user annotates the alignment with commands directing the layout and annotation of the RNA. R2R creates SVG or PDF output that can be imported into Adobe Illustrator, Inkscape or CorelDRAW. R2R can be used to create consensus sequence and secondary structure models for novel RNA structures or to revise models when new representatives for known RNA classes become available. Although R2R does not currently have a graphical user interface, it has proven useful in our efforts to create 100 schematic models of distinct noncoding RNA classes. R2R makes it possible to obtain high-quality drawings of the consensus sequence and structural models of many diverse RNA structures with a more practical amount of effort. R2R software is available at http://breaker.research.yale.edu/R2R and as an Additional file.
Polyphasic taxonomy, a consensus approach to bacterial systematics.
Vandamme, P; Pot, B; Gillis, M; de Vos, P; Kersters, K; Swings, J
1996-01-01
Over the last 25 years, a much broader range of taxonomic studies of bacteria has gradually replaced the former reliance upon morphological, physiological, and biochemical characterization. This polyphasic taxonomy takes into account all available phenotypic and genotypic data and integrates them in a consensus type of classification, framed in a general phylogeny derived from 16S rRNA sequence analysis. In some cases, the consensus classification is a compromise containing a minimum of contradictions. It is thought that the more parameters that will become available in the future, the more polyphasic classification will gain stability. In this review, the practice of polyphasic taxonomy is discussed for four groups of bacteria chosen for their relevance, complexity, or both: the genera Xanthomonas and Campylobacter, the lactic acid bacteria, and the family Comamonadaceae. An evaluation of our present insights, the conclusions derived from it, and the perspectives of polyphasic taxonomy are discussed, emphasizing the keystone role of the species. Taxonomists did not succeed in standardizing species delimitation by using percent DNA hybridization values. Together with the absence of another "gold standard" for species definition, this has an enormous repercussion on bacterial taxonomy. This problem is faced in polyphasic taxonomy, which does not depend on a theory, a hypothesis, or a set of rules, presenting a pragmatic approach to a consensus type of taxonomy, integrating all available data maximally. In the future, polyphasic taxonomy will have to cope with (i) enormous amounts of data, (ii) large numbers of strains, and (iii) data fusion (data aggregation), which will demand efficient and centralized data storage. In the future, taxonomic studies will require collaborative efforts by specialized laboratories even more than now is the case. Whether these future developments will guarantee a more stable consensus classification remains an open question. PMID:8801440
Benmansour, A; Brahimi, M; Tuffereau, C; Coulon, P; Lafay, F; Flamand, A
1992-03-01
The sequence of the glycoprotein gene of a street rabies virus was determined directly using fragments of a rabid dog brain after PCR amplification. Compared with that of the prototype strain CVS, this sequence displayed 10% divergence in overall amino acid composition. However only 6% divergence was noted in the ectodomain suggesting that structural constraints are exerted on this portion of the glycoprotein. A human strain isolated on cell culture from the saliva of a patient with clinical rabies had only five amino acid differences with the canine isolate, an indication of their close relatedness. These differences could have originated during transmission from dog to dog, or from dog to man, or during isolation on cell culture; they are nonetheless indicative of a genetic evolution of street rabies virus. This evolution was further evidenced by the selection of cell-adapted variants which displayed new amino acid substitutions in the glycoprotein. One of them concerned antigenic site III where arginine at position 333 was replaced by glutamine. As expected this substitution conferred resistance to a site IIIa monoclonal antibody (MAb), but surprisingly did not abolish neurovirulence for adult mice. However, a decrease in the neurovirulence of the cell-adapted variant in the presence of a site IIIa specific MAb was noted, suggesting that neurovirulence was due to a subpopulation neutralizable by the MAb. Simultaneous presence of both the parental and variant sequences was indeed evidenced in the brain of a mouse inoculated with the cell-adapted variant; during multiplication in the mouse brain, the frequency of the parental sequence rose from less than 10% to nearly 50%, indicating the selective advantage conferred by arginine 333 in nervous tissue. Altogether these results were suggestive of an intrinsic heterogeneity of street rabies virus. 
This heterogeneity was further demonstrated by the sequencing of molecular clones of the glycoprotein gene, which revealed that only one-third of the viral genomes present in the brain of a rabid dog had the consensus sequence. Two-thirds of the clones analyzed displayed from one to three amino acid substitutions. Such heterogeneous populations have been referred to as quasispecies, a concept which implies heterogeneous populations kept together in a dynamic equilibrium. This equilibrium could be rapidly displaced, giving the virus the capacity to adapt easily to new environmental conditions.
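The observation that only a minority of individual clones carry the consensus sequence follows from how a consensus is built: it takes the most frequent residue at each position independently. A minimal Python sketch of that calculation is given below. The function names and the six toy clones are invented for illustration and are not the rabies glycoprotein data.

```python
from collections import Counter


def majority_consensus(clones):
    """Return the per-column most frequent residue of equal-length sequences.

    Ties are broken by first occurrence in the column, which is arbitrary.
    """
    length = len(clones[0])
    if any(len(seq) != length for seq in clones):
        raise ValueError("all clones must have the same length")
    return "".join(Counter(column).most_common(1)[0][0] for column in zip(*clones))


def fraction_matching(clones):
    """Return (consensus, fraction of clones identical to the consensus)."""
    consensus = majority_consensus(clones)
    matching = sum(seq == consensus for seq in clones)
    return consensus, matching / len(clones)


# Hypothetical clones: each variant differs from the others at a different site.
toy_clones = ["KRQT", "KRQT", "ERQT", "KGQT", "KRHT", "KRQS"]
consensus, fraction = fraction_matching(toy_clones)
```

In this toy set every column has a clear majority, yet only two of six clones equal the consensus, mirroring the one-third figure reported above.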
Iwasaki, H; Shiba, T; Makino, K; Nakata, A; Shinagawa, H
1989-01-01
The ruvA and ruvB genes of Escherichia coli constitute an operon which belongs to the SOS regulon. Genetic evidence suggests that the products of the ruv operon are involved in DNA repair and recombination. To begin biochemical characterization of these proteins, we developed a plasmid system that overproduced RuvB protein to 20% of total cell protein. Starting from the overproducing system, we purified RuvB protein. The purified RuvB protein behaved like a monomer in gel filtration chromatography and had an apparent relative molecular mass of 38 kilodaltons in sodium dodecyl sulfate-polyacrylamide gel electrophoresis, which agrees with the value predicted from the DNA sequence. The amino acid sequence of the amino-terminal region of the purified protein was analyzed, and the sequence agreed with the one deduced from the DNA sequence. Since the deduced sequence of RuvB protein contained the consensus sequence for ATP-binding proteins, we examined the ATP-binding and ATPase activities of the purified RuvB protein. RuvB protein had a stronger affinity for ADP than for ATP and weak ATPase activity. The results suggest that the weak ATPase activity of RuvB protein is at least partly due to end product inhibition by ADP. PMID:2529252
Abo, Ryan P; Ducar, Matthew; Garcia, Elizabeth P; Thorner, Aaron R; Rojas-Rudilla, Vanesa; Lin, Ling; Sholl, Lynette M; Hahn, William C; Meyerson, Matthew; Lindeman, Neal I; Van Hummelen, Paul; MacConaill, Laura E
2015-02-18
Genomic structural variation (SV), a common hallmark of cancer, has important predictive and therapeutic implications. However, accurately detecting SV using high-throughput sequencing data remains challenging, especially for 'targeted' resequencing efforts. This is critically important in the clinical setting where targeted resequencing is frequently being applied to rapidly assess clinically actionable mutations in tumor biopsies in a cost-effective manner. We present BreaKmer, a novel approach that uses a 'kmer' strategy to assemble misaligned sequence reads for predicting insertions, deletions, inversions, tandem duplications and translocations at base-pair resolution in targeted resequencing data. Variants are predicted by realigning an assembled consensus sequence created from sequence reads that were abnormally aligned to the reference genome. Using targeted resequencing data from tumor specimens with orthogonally validated SV, non-tumor samples and whole-genome sequencing data, BreaKmer had a 97.4% overall sensitivity for known events and predicted 17 positively validated, novel variants. Relative to four publicly available algorithms, BreaKmer detected SV with increased sensitivity and limited calls in non-tumor samples, key features for variant analysis of tumor specimens in both the clinical and research settings. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
3'-terminal sequence of a small round structured virus (SRSV) in Japan.
Utagawa, E T; Takeda, N; Inouye, S; Kasuga, K; Yamazaki, S
1994-01-01
We determined the nucleotide sequence of about 1,000 bases from the 3'-terminus of a small round structured virus (SRSV), which caused a gastroenteritis outbreak in Chiba Prefecture, Japan, in 1987. The sequence was compared with the corresponding sequence region of Norwalk virus; it consisted of a part of the open reading frame 2 (ORF2), whole ORF3, and 3'-noncoding region (NCR). The 624-base-long ORF3 had sequence homology of 68% with the corresponding region of Norwalk virus. (The amino acid sequence homology was 74%.) The 94-base-long NCR had 65% homology with Norwalk virus. We then selected two consensus-sequence portions in the above sequence between Chiba and Norwalk viruses for primers in the reverse transcriptase-polymerase chain reaction (RT-PCR). Using this primer set, we detected 669-bp bands in agarose gel electrophoresis of RT-PCR products from feces containing Chiba or Norwalk viruses. Furthermore, in Southern hybridization with Chiba probes which were labeled with digoxigenin-dUTP in PCR, the bands of the two viruses were clearly stained under a low stringency condition. Since both Chiba and Norwalk viruses were detected by the above primer set although they are geographically and chronologically different viruses, our primer-pair may be useful for detection of a broad range of SRSVs which cause gastroenteritis in different areas.
Pujar, Shashikant; O'Leary, Nuala A; Farrell, Catherine M; Loveland, Jane E; Mudge, Jonathan M; Wallin, Craig; Girón, Carlos G; Diekhans, Mark; Barnes, If; Bennett, Ruth; Berry, Andrew E; Cox, Eric; Davidson, Claire; Goldfarb, Tamara; Gonzalez, Jose M; Hunt, Toby; Jackson, John; Joardar, Vinita; Kay, Mike P; Kodali, Vamsi K; Martin, Fergal J; McAndrews, Monica; McGarvey, Kelly M; Murphy, Michael; Rajput, Bhanu; Rangwala, Sanjida H; Riddick, Lillian D; Seal, Ruth L; Suner, Marie-Marthe; Webb, David; Zhu, Sophia; Aken, Bronwen L; Bruford, Elspeth A; Bult, Carol J; Frankish, Adam; Murphy, Terence; Pruitt, Kim D
2018-01-04
The Consensus Coding Sequence (CCDS) project provides a dataset of protein-coding regions that are identically annotated on the human and mouse reference genome assembly in genome annotations produced independently by NCBI and the Ensembl group at EMBL-EBI. This dataset is the product of an international collaboration that includes NCBI, Ensembl, HUGO Gene Nomenclature Committee, Mouse Genome Informatics and University of California, Santa Cruz. Identically annotated coding regions, which are generated using an automated pipeline and pass multiple quality assurance checks, are assigned a stable and tracked identifier (CCDS ID). Additionally, coordinated manual review by expert curators from the CCDS collaboration helps in maintaining the integrity and high quality of the dataset. The CCDS data are available through an interactive web page (https://www.ncbi.nlm.nih.gov/CCDS/CcdsBrowse.cgi) and an FTP site (ftp://ftp.ncbi.nlm.nih.gov/pub/CCDS/). In this paper, we outline the ongoing work, growth and stability of the CCDS dataset and provide updates on new collaboration members and new features added to the CCDS user interface. We also present expert curation scenarios, with specific examples highlighting the importance of an accurate reference genome assembly and the crucial role played by input from the research community. Published by Oxford University Press on behalf of Nucleic Acids Research 2017.
Adult Schistosoma mansoni express cathepsin L proteinase activity.
Smith, A M; Dalton, J P; Clough, K A; Kilbane, C L; Harrop, S A; Hole, N; Brindley, P J
1994-09-01
This report presents the deduced amino acid sequence of a novel cathepsin L proteinase from Schistosoma mansoni, and describes cathepsin L-like activity in extracts of adult schistosomes. Using consensus primers specific for cysteine proteinases, gene fragments were amplified from adult S. mansoni cDNA by PCR and cloned. One of these fragments showed marked identity to Sm31, the cathepsin B cysteine proteinase of adult S. mansoni, whereas another differed from Sm31 and was employed as a probe to isolate two cDNAs from an adult S. mansoni gene library. Together these cDNAs encoded a novel preprocathepsin L of 319 amino acids; this zymogen is predicted to be processed in vivo into a mature, active cathepsin L proteinase of 215 amino acids. Closest homologies were with cathepsins L from rat, mouse, and chicken (46-47% identity). Southern hybridization analysis suggested that only one or a few copies of the gene were present per genome, and demonstrated that its locus was distinct from that of Sm31 and that a homologous sequence was present in Schistosoma japonicum. Because these results indicated that schistosomes expressed a cathepsin L proteinase, extracts of adult S. mansoni were examined for acidic, cysteine proteinase activity. Based on rates of cleavage of peptidyl substrates employed to discriminate between classes of cysteine proteinases, namely cathepsin L (Z-phe-arg-AMC), cathepsin B (Z-arg-arg-AMC) and cathepsin H (Bz-arg-AMC), the extracts were found to contain vigorous cathepsin L-like activity. (ABSTRACT TRUNCATED AT 250 WORDS)
Gastric acid secretion: activation and inhibition.
Sachs, G.; Prinz, C.; Loo, D.; Bamberg, K.; Besancon, M.; Shin, J. M.
1994-01-01
Peripheral regulation of gastric acid secretion is initiated by the release of gastrin from the G cell. Gastrin then stimulates the cholecystokinin-B receptor on the enterochromaffin-like cell, beginning a calcium signaling cascade. An exocytotic release of histamine follows with concomitant activation of a Cl- current. The released histamine begins the H2-receptor mediated sequence of events in the parietal cell, which results in activation of the gastric H+/K+-ATPase. This enzyme is the final common pathway of acid secretion. The H+/K+-ATPase is composed of two subunits: the larger alpha-subunit couples ion transport to hydrolysis of ATP, and the smaller beta-subunit is required for appropriate assembly of the holoenzyme. Both the membrane and extracytoplasmic domain contain the ion transport pathway, and therefore, this region is the target for the antisecretory drugs of the post-H2 era. The 100 kDa alpha-subunit probably has 10 membrane-spanning segments and, therefore, five extracytoplasmic loops. The 35 kDa beta-subunit has a single membrane-spanning segment, and most of this protein is extracytoplasmic with the six or seven N-glycosylation consensus sequences occupied. Omeprazole is an acid-accumulated, acid-activated prodrug that binds covalently to two cysteine residues at positions 813 (or 822) and 892, accessible from the acidic face of the pump. Lansoprazole binds to cys321, 813 (or 822) and 892; pantoprazole binds to cys813 and 822. The common binding site for these drugs (cys813 or 822) is responsible for the inhibition of acid transport. Covalent inhibition of the acid pump improves control of acid secretion, but since the effective half life of the inhibition in man is about 48 hr, full inhibition of acid secretion, perhaps necessary for eradication of Helicobacter pylori in combination with a single antibiotic, will require prolongation of the effect of this class of drug. PMID:7502535
Hunt, C; Morimoto, R I
1985-01-01
We have determined the nucleotide sequence of the human hsp70 gene and 5' flanking region. The hsp70 gene is transcribed as an uninterrupted primary transcript of 2440 nucleotides composed of a 5' noncoding leader sequence of 212 nucleotides, a 3' noncoding region of 242 nucleotides, and a continuous open reading frame of 1986 nucleotides that encodes a protein with predicted molecular mass of 69,800 daltons. Upstream of the 5' terminus are the canonical TATAAA box, the sequence ATTGG that corresponds in the inverted orientation to the CCAAT motif, and the dyad sequence CTGGAAT/ATTCCCG that shares homology in 12 of 14 positions with the consensus transcription regulatory sequence common to Drosophila heat shock genes. Comparison of the predicted amino acid sequences of human hsp70 with the published sequences of Drosophila hsp70 and Escherichia coli dnaK reveals that human hsp70 is 73% identical to Drosophila hsp70 and 47% identical to E. coli dnaK. Surprisingly, the nucleotide sequences of the human and Drosophila genes are 72% identical and human and E. coli genes are 50% identical, which is more highly conserved than necessary given the degeneracy of the genetic code. The lack of accumulated silent nucleotide substitutions leads us to propose that there may be additional information in the nucleotide sequence of the hsp70 gene or the corresponding mRNA that precludes the maximum divergence allowed in the silent codon positions. PMID:3931075
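Two statements in this abstract can be checked mechanically: that the three reported segments add up to the 2440-nucleotide transcript, and that ATTGG read in the inverted orientation gives CCAAT. The following is a minimal Python sketch; the helper names are illustrative and do not come from the paper.

```python
# Complement table for unambiguous, upper-case DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def count_matches(a: str, b: str) -> int:
    """Count positions at which two equal-length strings agree."""
    return sum(x == y for x, y in zip(a, b))


# Segment lengths as reported in the abstract (nucleotides).
leader, orf, trailer = 212, 1986, 242
transcript_length = leader + orf + trailer

# ATTGG on one strand corresponds to CCAAT on the opposite strand.
inverted_motif = reverse_complement("ATTGG")

# The two halves of the dyad CTGGAAT/ATTCCCG are near-perfect inverted repeats.
dyad_matches = count_matches(reverse_complement("CTGGAAT"), "ATTCCCG")

print(transcript_length, inverted_motif, dyad_matches)
```

The dyad comparison is only an illustration of how an inverted repeat can be scored; the 12-of-14 figure in the abstract refers to the match with the Drosophila consensus, which is not reproduced here.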
Sequence patterns mediating functions of disordered proteins.
Exarchos, Konstantinos P; Kourou, Konstantina; Exarchos, Themis P; Papaloukas, Costas; Karamouzis, Michalis V; Fotiadis, Dimitrios I
2015-01-01
Disordered proteins lack specific 3D structure in their native state and have been implicated in numerous cellular functions as well as in the induction of severe diseases, e.g., cardiovascular and neurodegenerative diseases as well as diabetes. Due to their conformational flexibility they are often found to interact with a multitude of protein molecules; this one-to-many interaction, which is vital for their versatile functioning, involves short consensus protein sequences, which are normally detected using slow and cumbersome experimental procedures. In this work we exploit information from disorder-oriented protein interaction networks focused specifically on humans, in order to assemble, by means of overrepresentation, a set of sequence patterns that mediate the functioning of disordered proteins; hence, we are able to identify how a single protein achieves such functional promiscuity. Next, we study the sequential characteristics of the extracted patterns, which exhibit a striking preference towards a very limited subset of amino acids; specifically, residues leucine, glutamic acid, and serine are particularly frequent among the extracted patterns, and we also observe a nontrivial propensity towards alanine and glycine. Furthermore, based on the extracted patterns we set out to infer potential functional implications in order to verify our findings and potentially further extrapolate our knowledge regarding the functioning of disordered proteins. We observe that the extracted patterns are primarily involved in regulation, binding and posttranslational modifications, which constitute the most prominent functions of disordered proteins.
Apparent founder effect during the early years of the San Francisco HIV type 1 epidemic (1978-1979).
Foley, B; Pan, H; Buchbinder, S; Delwart, E L
2000-10-10
HIV-1 envelope sequence variants were RT-PCR amplified from serum samples cryopreserved in San Francisco in 1978-1979. The HIV-1 subtype B env V3-V5 sequences from four homosexual men clustered phylogenetically, with a median nucleotide distance of 2.8%, reflecting a recent common origin. These early U.S. HIV-1 env variants mapped close to the phylogenetic root of the subtype B tree while env variants collected in the United States throughout the 1980s and 1990s showed, on average, increasing genetic diversity and divergence from the subtype B consensus sequence. These results indicate that the majority of HIV-1 currently circulating in the United States may be descended from an initial introduction and rapid spread during the mid- to late 1970s of subtype B viruses with limited variability (i.e., a founder effect). As expected from the starburst-shaped phylogeny of HIV-1 subtype B, contemporary U.S. strains were, on average, more closely related at the nucleic acid and amino acid levels to the earlier 1978-1979 env variants than to each other. The growing levels of HIV-1 genetic diversity, one of multiple obstacles in designing a protective vaccine, may therefore be mitigated by using epidemic founding variants as antigenic strains for protection against contemporary strains.
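The median nucleotide distance quoted above is a summary over pairwise comparisons of aligned sequences. The abstract does not say which distance measure was used, so the sketch below assumes the simplest one, the uncorrected p-distance (proportion of differing sites), and runs it on invented toy sequences rather than real env data.

```python
from itertools import combinations
from statistics import median


def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned sequences.

    Columns where either sequence has a gap ('-') are skipped.
    """
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in pairs) / len(pairs)


def median_pairwise_distance(seqs: list[str]) -> float:
    """Median p-distance over all unordered pairs of aligned sequences."""
    return median(p_distance(a, b) for a, b in combinations(seqs, 2))


# Toy alignment (hypothetical, for illustration only).
toy = ["ACGTACGTAC", "ACGTACGTAT", "ACGTACGAAC", "ACGTACGTAC"]
print(median_pairwise_distance(toy))
```

Published analyses usually apply a substitution model to correct for multiple hits, so values from this sketch would only approximate model-based distances for closely related sequences.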
Yadav, Kamlesh Kumar; Rajasekharan, Ram
2016-11-01
PHM8 is an important enzyme in nonpolar lipid metabolism because of its role in triacylglycerol (TAG) biosynthesis under phosphate stress conditions. It is positively regulated by the PHO4 transcription factor under low phosphate conditions; however, its regulation has not been explored under normal physiological conditions. General control nonderepressible (GCN4), a basic leucine-zipper transcription factor, activates the transcription of amino acid and purine biosynthesis genes and many stress response genes under various stress conditions. In this study, we demonstrate that the level of TAG is regulated by the transcription factor GCN4. GCN4 directly binds to its consensus recognition sequence (TGACTC) in the PHM8 promoter and controls its expression. The analysis of cells expressing the P_PHM8-lacZ reporter gene showed that mutations (TGACTC to GGGCCC) in the GCN4-binding sequence caused a significant increase in β-galactosidase activity. Mutation in the GCN4 binding sequence causes an increase in PHM8 expression, lysophosphatidic acid phosphatase activity and TAG level. PHM8, in conjunction with DGA1, a mono- and diacylglycerol transferase, controls the level of TAG. These results revealed that GCN4 negatively regulates PHM8 and that deletion of GCN4 causes de-repression of PHM8, which is responsible for the increased TAG content in gcn4∆ cells.
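Locating a short recognition sequence such as TGACTC in a promoter amounts to an exact-match scan of both strands. The sketch below shows one way to do this in Python; the promoter string is a made-up toy sequence, not the PHM8 promoter, and the function names are illustrative.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_motif_both_strands(seq: str, motif: str) -> list[tuple[int, str]]:
    """Return (0-based start, strand) for every exact motif occurrence.

    A '-' hit means the reverse complement of the motif was found at that
    position on the given (forward) strand.
    """
    seq, motif = seq.upper(), motif.upper()
    rc = reverse_complement(motif)
    hits = []
    for i in range(len(seq) - len(motif) + 1):
        window = seq[i:i + len(motif)]
        if window == motif:
            hits.append((i, "+"))
        elif window == rc:  # elif avoids double-counting palindromic motifs
            hits.append((i, "-"))
    return hits


# Hypothetical toy promoter containing one site on each strand.
toy_promoter = "AATGACTCGGTTGAGTCATT"
print(find_motif_both_strands(toy_promoter, "TGACTC"))
```

An exact-match scan will miss degenerate sites; a position weight matrix would be the usual next step if mismatches need to be tolerated.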
Complete genome sequence of the phenanthrene-degrading soil bacterium Delftia acidovorans Cs1-4
DOE Office of Scientific and Technical Information (OSTI.GOV)
Shetty, Ameesha R.; de Gannes, Vidya; Obi, Chioma C.; ...
2015-08-15
Polycyclic aromatic hydrocarbons (PAH) are ubiquitous environmental pollutants and microbial biodegradation is an important means of remediation of PAH-contaminated soil. Delftia acidovorans Cs1-4 (formerly Delftia sp. Cs1-4) was isolated by using phenanthrene as the sole carbon source from PAH-contaminated soil in Wisconsin. Its full genome sequence was determined to gain insights into the mechanisms underlying biodegradation of PAH. Three genomic libraries were constructed and sequenced: an Illumina GAii shotgun library (916,416,493 reads), a 454 Titanium standard library (770,171 reads) and one paired-end 454 library (average insert size of 8 kb, 508,092 reads). The initial assembly contained 40 contigs in two scaffolds. The 454 Titanium standard data and the 454 paired-end data were assembled together and the consensus sequences were computationally shredded into 2 kb overlapping shreds. Illumina sequencing data was assembled, and the consensus sequence was computationally shredded into 1.5 kb overlapping shreds. Gaps between contigs were closed by editing in Consed, by PCR and by Bubble PCR primer walks. A total of 182 additional reactions were needed to close gaps and to raise the quality of the finished sequence. The final assembly is based on 253.3 Mb of 454 draft data (averaging 38.4 X coverage) and 590.2 Mb of Illumina draft data (averaging 89.4 X coverage). The genome of strain Cs1-4 consists of a single circular chromosome of 6,685,842 bp (66.7 %G+C) containing 6,028 predicted genes; 5,931 of these genes were protein-encoding and 4,425 gene products were assigned to a putative function. Genes encoding phenanthrene degradation were localized to a 232 kb genomic island (termed the phn island), which contained near its 3’ end a bacteriophage P4-like integrase, an enzyme often associated with chromosomal integration of mobile genetic elements. Other biodegradation pathways reconstructed from the genome sequence included: benzoate (by the acetyl-CoA pathway), styrene, nicotinic acid (by the maleamate pathway) and the pesticides Dicamba and Fenitrothion. Lastly, determination of the complete genome sequence of D. acidovorans Cs1-4 has provided new insights into the microbial mechanisms of PAH biodegradation that may shape the process in the environment.
Casillas, Rosario; Tabernero, David; Gregori, Josep; Belmonte, Irene; Cortese, Maria Francesca; González, Carolina; Riveiro-Barciela, Mar; López, Rosa Maria; Quer, Josep; Esteban, Rafael; Buti, Maria; Rodríguez-Frías, Francisco
2018-01-01
AIM To determine the variability/conservation of the domain of hepatitis B virus (HBV) preS1 region that interacts with sodium-taurocholate cotransporting polypeptide (hereafter, NTCP-interacting domain) and the prevalence of the rs2296651 polymorphism (S267F, NTCP variant) in a Spanish population. METHODS Serum samples from 246 individuals were included and divided into 3 groups: patients with chronic HBV infection (CHB) (n = 41, 73% Caucasians), patients with resolved HBV infection (n = 100, 100% Caucasians) and an HBV-uninfected control group (n = 105, 100% Caucasians). Variability/conservation of the amino acid (aa) sequences of the NTCP-interacting domain, (aa 2-48 in viral genotype D) and a highly conserved preS1 domain associated with virion morphogenesis (aa 92-103 in viral genotype D) were analyzed by next-generation sequencing and compared in 18 CHB patients with viremia > 4 log IU/mL. The rs2296651 polymorphism was determined in all individuals in all 3 groups using an in-house real-time PCR melting curve analysis. RESULTS The HBV preS1 NTCP-interacting domain showed a high degree of conservation among the examined viral genomes especially between aa 9 and 21 (in the genotype D consensus sequence). As compared with the virion morphogenesis domain, the NTCP-interacting domain had a smaller proportion of HBV genotype-unrelated changes comprising > 1% of the quasispecies (25.5% vs 31.8%), but a larger proportion of genotype-associated viral polymorphisms (34% vs 27.3%), according to consensus sequences from GenBank patterns of HBV genotypes A to H. Variation/conservation in both domains depended on viral genotype, with genotype C being the most highly conserved and genotype E the most variable (limited finding, only 2 genotype E included). Of note, proline residues were highly conserved in both domains, and serine residues showed changes only to threonine or tyrosine in the virion morphogenesis domain. 
The rs2296651 polymorphism was not detected in any participant. CONCLUSION In our CHB population, the NTCP-interacting domain was highly conserved, particularly the proline residues and essential amino acids involved in the NTCP interaction, and the prevalence of rs2296651 was low/null. PMID:29456407
Freyhult, Eva; Moulton, Vincent; Ardell, David H.
2006-01-01
Sequence logos are stacked bar graphs that generalize the notion of consensus sequence. They employ entropy statistics very effectively to display variation in a structural alignment of sequences of a common function, while emphasizing its over-represented features. Yet sequence logos cannot display features that distinguish functional subclasses within a structurally related superfamily nor do they display under-represented features. We introduce two extensions to address these needs: function logos and inverse logos. Function logos display subfunctions that are over-represented among sequences carrying a specific feature. Inverse logos generalize both sequence logos and function logos by displaying under-represented, rather than over-represented, features or functions in structural alignments. To make inverse logos, a compositional inverse is applied to the feature or function frequency distributions before logo construction, where a compositional inverse is a mathematical transform that makes common features or functions rare and vice versa. We applied these methods to a database of structurally aligned bacterial tDNAs to create highly condensed, birds-eye views of potentially all so-called identity determinants and antideterminants that confer specific amino acid charging or initiator function on tRNAs in bacteria. We recovered both known and a few potentially novel identity elements. Function logos and inverse logos are useful tools for exploratory bioinformatic analysis of structure–function relationships in sequence families and superfamilies. PMID:16473848
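The entropy statistic behind a conventional sequence logo can be written compactly. For an alignment column with symbol frequencies $p_i$ over an alphabet of size $s$, the information content is $R = \log_2 s - H$ with $H = -\sum_i p_i \log_2 p_i$, and each symbol is drawn with height $p_i R$. The following is a minimal Python sketch of that standard calculation only; it omits the small-sample correction, does not treat gaps specially, and does not implement the function logos or inverse logos introduced in the paper.

```python
import math
from collections import Counter


def column_information(column: str, alphabet_size: int = 4) -> float:
    """Information content (bits) of one alignment column: log2(s) - H."""
    n = len(column)
    entropy = -sum((c / n) * math.log2(c / n) for c in Counter(column).values())
    return math.log2(alphabet_size) - entropy


def logo_heights(alignment: list[str]) -> list[dict[str, float]]:
    """Per-column symbol heights (frequency times column information)."""
    heights = []
    for column in zip(*alignment):
        info = column_information("".join(column))
        n = len(column)
        heights.append({sym: (cnt / n) * info for sym, cnt in Counter(column).items()})
    return heights


# Toy nucleotide alignment (hypothetical).
toy_alignment = ["ACGT", "ACGA", "ACCA", "AGTA"]
for position, column_heights in enumerate(logo_heights(toy_alignment)):
    print(position, column_heights)
```

A fully conserved column reaches the 2-bit maximum for nucleotides, while a column with all four bases at equal frequency carries no information and would be drawn with zero height.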
Li, You-Hai; Han, Wen-Jin; Gui, Xi-Wu; Wei, Tao; Tang, Shuang-Yan; Jin, Jian-Ming
2016-08-02
Tentoxin, a cyclic tetrapeptide produced by several Alternaria species, inhibits the F₁-ATPase activity of chloroplasts, resulting in chlorosis in sensitive plants. In this study, we report two clustered genes, encoding a putative non-ribosomal peptide synthetase (NRPS) TES and a cytochrome P450 protein TES1, that are required for tentoxin biosynthesis in Alternaria alternata strain ZJ33, which was isolated from blighted leaves of Eupatorium adenophorum. Using a pair of primers designed according to the consensus sequences of the adenylation domain of NRPSs, two fragments containing putative adenylation domains were amplified from A. alternata ZJ33, and subsequent PCR analyses demonstrated that these fragments belonged to the same NRPS coding sequence. With no introns, TES consists of a single 15,486 base pair open reading frame encoding a predicted 5161 amino acid protein. Meanwhile, the TES1 gene is predicted to contain five introns and encode a 506 amino acid protein. The TES protein is predicted to be composed of four peptide synthase modules with two additional N-methylation domains, and the number and arrangement of the modules in TES were consistent with the number and arrangement of the amino acid residues of tentoxin. Notably, both TES and TES1 null mutants generated via homologous recombination failed to produce tentoxin. This study provides the first evidence concerning the biosynthesis of tentoxin in A. alternata.
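The ORF and protein sizes reported for TES are related by simple codon arithmetic. The sketch below assumes that the stated ORF length includes the stop codon, which is an assumption on our part but is consistent with the two numbers given; the function name is illustrative.

```python
def protein_length_from_orf(orf_bp: int, includes_stop: bool = True) -> int:
    """Number of amino acids encoded by an intron-free ORF of orf_bp base pairs."""
    if orf_bp % 3 != 0:
        raise ValueError("ORF length is not a multiple of three")
    codons = orf_bp // 3
    # The stop codon occupies one codon but encodes no residue.
    return codons - 1 if includes_stop else codons


print(protein_length_from_orf(15486))
```

This check applies to TES because it has no introns; for TES1, which is predicted to contain five introns, the genomic length would first have to be reduced to the spliced coding sequence.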
Distribution and sequence homogeneity of an abundant satellite DNA in the beetle, Tenebrio molitor.
Davis, C A; Wyatt, G R
1989-01-01
The mealworm beetle, Tenebrio molitor, contains an unusually abundant and homogeneous satellite DNA which constitutes up to 60% of its genome. The satellite DNA is shown to be present in all of the chromosomes by in situ hybridization. Eighteen dimers of the repeat unit were cloned and sequenced. The consensus sequence is 142 nt long and lacks any internal repeat structure. Monomers of the sequence are very similar, showing on average a 2% divergence from the calculated consensus. Variant nucleotides are scattered randomly throughout the sequence although some variants are more common than others. Neighboring repeat units are no more alike than randomly chosen ones. The results suggest that some mechanism, perhaps gene conversion, is acting to maintain the homogeneity of the satellite DNA despite its abundance and distribution on all of the chromosomes. PMID:2762148
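A calculated consensus and the average divergence of monomers from it can be obtained with a per-column majority vote over aligned repeat units. The abstract does not describe how the consensus was computed, so the following Python sketch shows a common approach on invented toy monomers; ties are broken by first occurrence in the column.

```python
from collections import Counter


def majority_consensus(aligned: list[str]) -> str:
    """Most frequent symbol in each column of equal-length aligned sequences."""
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*aligned))


def divergence(seq: str, consensus: str) -> float:
    """Fraction of positions at which seq differs from the consensus."""
    return sum(a != b for a, b in zip(seq, consensus)) / len(consensus)


def mean_divergence(aligned: list[str]) -> float:
    """Average divergence of the aligned sequences from their own consensus."""
    consensus = majority_consensus(aligned)
    return sum(divergence(s, consensus) for s in aligned) / len(aligned)


# Toy monomers (hypothetical; the real repeat unit is 142 nt long).
toy_monomers = ["ACGTAC", "ACGTAC", "ACGAAC", "TCGTAC"]
print(majority_consensus(toy_monomers), mean_divergence(toy_monomers))
```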
Nucleotide sequences of two genomic DNAs encoding peroxidase of Arabidopsis thaliana.
Intapruk, C; Higashimura, N; Yamamoto, K; Okada, N; Shinmyo, A; Takano, M
1991-02-15
The peroxidase (EC 1.11.1.7)-encoding gene of Arabidopsis thaliana was screened from a genomic library using a cDNA encoding a neutral isozyme of horseradish, Armoracia rusticana, peroxidase (HRP) as a probe, and two positive clones were isolated. From the comparison with the sequences of the HRP-encoding genes, we concluded that two clones contained peroxidase-encoding genes, and they were named prxCa and prxEa. Both genes consisted of four exons and three introns; the introns had consensus nucleotides, GT and AG, at the 5' and 3' ends, respectively. Each putative exon of the prxEa gene had the same length as the corresponding exon of the HRP-basic-isozyme-encoding gene, prxC3; prxEa coded for 349 amino acids (aa) with a sequence homology of 89% to that encoded by prxC3. The prxCa gene was very close to the HRP-neutral-isozyme-encoding gene, prxC1b, and coded for 354 aa with 91% homology to that encoded by prxC1b. The aa sequence homology was 64% between the two peroxidases encoded by prxCa and prxEa.
Kovács, Endre R; Benko, Mária
2009-03-01
Partial genome characterisation of a novel adenovirus, found recently in organ samples of multiple species of dead birds of prey, was carried out by sequence analysis of PCR-amplified DNA fragments. The virus, named raptor adenovirus 1 (RAdV-1), was originally detected by a nested PCR method with consensus primers targeting the adenoviral DNA polymerase gene. Phylogenetic analysis with the deduced amino acid sequence of the small PCR product implied that a new siadenovirus type was present in the samples. Since virus isolation attempts remained unsuccessful, further characterisation of this putative novel siadenovirus was carried out with the use of PCR on the infected organ samples. The DNA sequence of the central genome part of RAdV-1, encompassing nine full (pTP, 52K, pIIIa, III, pVII, pX, pVI, hexon, protease) and two partial (DNA polymerase and DBP) genes and exceeding 12 kb pairs in size, was determined. Phylogenetic tree reconstructions, based on several genes, unambiguously confirmed the preliminary classification of RAdV-1 as a new species within the genus Siadenovirus. Further study of RAdV-1 is of interest since it represents a rare adenovirus genus of yet undetermined host origin.
Applying the Concept of Peptide Uniqueness to Anti-Polio Vaccination.
Kanduc, Darja; Fasano, Candida; Capone, Giovanni; Pesce Delfino, Antonella; Calabrò, Michele; Polimeno, Lorenzo
2015-01-01
Although rare, adverse events may be associated with anti-poliovirus vaccination, thus possibly hampering global polio eradication. The aim was to design peptide-based anti-polio vaccines exempt from potential cross-reactivity risks and possibly able to reduce rare potential adverse events such as the postvaccine paralytic poliomyelitis due to the tendency of the poliovirus genome to mutate. Proteins from poliovirus type 1, strain Mahoney, were analyzed for amino acid sequence identity to the human proteome at the pentapeptide level, searching for sequences that (1) have zero percent identity to human proteins, (2) are potentially endowed with an immunologic potential, and (3) are highly conserved among poliovirus strains. Sequence analyses produced a set of consensus epitopic peptides potentially able to generate specific anti-polio immune responses exempt from cross-reactivity with the human host. Peptide sequences unique to poliovirus proteins and conserved among polio strains might help formulate a specific and universal anti-polio vaccine able to react with multiple viral strains and exempt from the burden of possible cross-reactions with human proteins. As an additional advantage, using a peptide-based vaccine instead of current anti-polio DNA vaccines would eliminate the rare post-polio poliomyelitis cases and other disabling symptoms that may appear following vaccination.
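Screening for pentapeptides with zero identity to a host proteome can be expressed as a set difference between k-mer sets. The following Python sketch illustrates the idea on toy strings; it does not reproduce the authors' pipeline, the sequences are hypothetical, and a real screen would load the full human proteome and also apply the conservation and immunogenicity criteria listed above.

```python
def kmers(seq: str, k: int = 5) -> set[str]:
    """All overlapping k-mers of a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def unique_kmers(query: str, background: list[str], k: int = 5) -> list[tuple[int, str]]:
    """(0-based position, k-mer) for query k-mers absent from every background sequence."""
    seen: set[str] = set()
    for protein in background:
        seen |= kmers(protein, k)
    return [
        (i, query[i:i + k])
        for i in range(len(query) - k + 1)
        if query[i:i + k] not in seen
    ]


# Hypothetical toy sequences (not poliovirus or human proteins).
toy_viral = "ACDEFGHIK"
toy_host = ["WWCDEFGWW", "EFGHIPP"]
print(unique_kmers(toy_viral, toy_host))
```

Holding the background as a single set keeps each lookup constant-time, which matters when the background is a whole proteome rather than two short strings.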
Koike-Takeshita, A; Koyama, T; Ogura, K
1997-05-09
We recently described the isolation and sequence analysis of a DNA region containing the genes of Bacillus stearothermophilus heptaprenyl diphosphate synthase, which catalyzes the synthesis of the prenyl side chain of menaquinone-7 of this bacterium. Sequence analyses revealed the presence of three open reading frames (ORFs), designated as ORF-1, ORF-2, and ORF-3, and the structural genes of the heptaprenyl diphosphate synthase were proved to consist of ORF-1 (heps-1) and ORF-3 (heps-2) (Koike-Takeshita, A., Koyama, T., Obata, S., and Ogura, K. (1995) J. Biol. Chem. 270, 18396-18400). The predicted amino acid sequence of ORF-2 (234 amino acids) contains a methyltransferase consensus sequence and shows a 22% identity with UbiG of Escherichia coli, which catalyzes S-adenosyl-L-methionine-dependent methylation of 2-octaprenyl-3-methyl-5-hydroxy-6-methoxy-1,4-benzoquinone. These pieces of information led us to identify the ORF-2 gene product. The cell-free homogenate of the transformant of E. coli with an expression vector of ORF-2 catalyzed the incorporation of S-adenosyl-L-methionine into menaquinone-8, indicating that ORF-2 encodes 2-heptaprenyl-1,4-naphthoquinone methyltransferase, which participates in the terminal step of the menaquinone biosynthesis. Thus it is concluded that the ORF-1, ORF-2, and ORF-3 genes, designated heps-1, menG, and heps-2, respectively, form another cluster involved in menaquinone biosynthesis in addition to the cluster of menB, menC, menD, and menE already identified in the Bacillus subtilis and E. coli chromosomes.
Paldurai, Anandan; Subbiah, Madhuri; Kumar, Sachin; Collins, Peter L.; Samal, Siba K.
2009-01-01
Complete consensus genome sequences were determined for avian paramyxovirus type 8 (APMV-8) strains goose/Delaware/1053/76 (prototype strain) and pintail/Wakuya/20/78. The genome of each strain is 15,342 nucleotides (nt) long, which follows the “rule of six”. The genome consists of six genes in the order of 3′-N-P/V/W-M-F-HN-L-5′. The genes are flanked on either side by conserved transcription start and stop signals, and have intergenic regions ranging from 1 to 30 nt. The genome contains a 55 nt leader region at the 3′-end and a 171 nt trailer region at the 5′-end. Comparison of sequences of strains Delaware and Wakuya showed nucleotide identity of 96.8% at the genome level and amino acid identities of 99.3%, 96.5%, 98.6%, 99.4%, 98.6% and 99.1% for the predicted N, P, M, F, HN and L proteins, respectively. Both strains grew in embryonated chicken eggs, in primary chicken embryo kidney cells and in 293T cells. Both strains contained only a single basic residue at the cleavage activation site of the F protein, and their efficiency of replication in vitro depended on, and was augmented by, the presence of exogenous protease in most cell lines. Sequence alignment and phylogenetic analysis of the predicted amino acid sequence of APMV-8 strain Delaware proteins with the cognate proteins of other available APMV serotypes showed that APMV-8 is more closely related to APMV-2 and -6 than to APMV-1, -3 and -4. PMID:19341613
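The "rule of six" mentioned above states that the genome length of these viruses is an exact multiple of six nucleotides. A minimal Python check of the reported APMV-8 genome length, with an illustrative function name:

```python
def follows_rule_of_six(genome_length_nt: int) -> bool:
    """True if the genome length is an exact multiple of six nucleotides."""
    return genome_length_nt % 6 == 0


apmv8_length = 15342  # nt, as reported in the abstract
print(follows_rule_of_six(apmv8_length), apmv8_length // 6)
```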
Shahin, Arwa; Smulders, Marinus J. M.; van Tuyl, Jaap M.; Arens, Paul; Bakker, Freek T.
2014-01-01
Next Generation Sequencing (NGS) may enable estimation of relationships among genotypes using allelic variation of multiple nuclear genes simultaneously. We explored the potential and caveats of this strategy in four genetically distant Lilium cultivars to estimate their genetic divergence from transcriptome sequences using three approaches: POFAD (Phylogeny of Organisms from Allelic Data, uses allelic information of sequence data), RAxML (Randomized Accelerated Maximum Likelihood, tree building based on concatenated consensus sequences) and Consensus Network (constructing a network summarizing among gene tree conflicts). Twenty-six gene contigs were chosen based on the presence of orthologous sequences in all cultivars, seven of which also had an orthologous sequence in Tulipa, used as out-group. The three approaches generated the same topology. Although the resolution offered by these approaches is high, in this case there was no extra benefit in using allelic information. We conclude that these 26 genes can be widely applied to construct a species tree for the genus Lilium. PMID:25368628
Chloroplast Phylogenomics Indicates that Ginkgo biloba Is Sister to Cycads
Wu, Chung-Shien; Chaw, Shu-Miaw; Huang, Ya-Yi
2013-01-01
Molecular phylogenetic studies have not yet reached a consensus on the placement of Ginkgoales, which is represented by the only living species, Ginkgo biloba (common name: ginkgo). At least six discrepant placements of ginkgo have been proposed. This study aimed to use the chloroplast phylogenomic approach to examine possible factors that lead to such disagreeing placements. We found the sequence types used in the analyses to be the most critical factor in the conflicting placements of ginkgo. In addition, the placement of ginkgo varied in the trees inferred from nucleotide (NU) sequences, which notably depended on breadth of taxon sampling, tree-building methods, codon positions, positions of Gnetopsida (common name: gnetophytes), and including or excluding gnetophytes in data sets. In contrast, the trees inferred from amino acid (AA) sequences congruently supported the monophyly of a ginkgo and Cycadales (common name: cycads) clade, regardless of which factors were examined. Our site-stripping analysis further revealed that the high substitution saturation of NU sequences mainly derived from the third codon positions and contributed to the variable placements of ginkgo. In summary, the factors we surveyed did not affect results inferred from analyses of AA sequences. Congruent topologies in our AA trees give more confidence in supporting the ginkgo–cycad sister-group hypothesis. PMID:23315384
Specific DNA binding of the two chicken Deformed family homeodomain proteins, Chox-1.4 and Chox-a.
Sasaki, H; Yokoyama, E; Kuroiwa, A
1990-01-01
The cDNA clones encoding two chicken Deformed (Dfd) family homeobox-containing genes, Chox-1.4 and Chox-a, were isolated. Comparison of their amino acid sequences with another chicken Dfd family homeodomain protein and with those of mouse homologues revealed that strong homologies are located in the amino terminal regions and around the homeodomains. Although homologies in other regions were relatively low, some short conserved sequences were also identified. E. coli-made full length proteins were purified and used for the production of specific antibodies and for DNA binding studies. The binding profiles of these proteins to the 5'-leader and 5'-upstream sequences of Chox-1.4 and Chox-a coding regions were analyzed by immunoprecipitation and DNase I footprint assays. These two Chox proteins bound to the same sites in the 5'-flanking sequences of their coding regions with various affinities and their binding affinities to each site were nearly the same. The consensus sequences of the high and low affinity binding sites were TAATGA(C/G) and CTAATTTT, respectively. A clustered binding site was identified in the 5'-upstream of the Chox-a gene, suggesting that this clustered binding site works as a cis-regulatory element for auto- and/or cross-regulation of Chox-a gene expression. PMID:1970866
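The two binding-site consensus sequences given above translate directly into regular expressions, with the degenerate position (C/G) written as a character class. The Python sketch below scans a made-up toy sequence on the forward strand only; it is not the Chox-a upstream region, and the labels "high" and "low" simply echo the affinity classes named in the abstract.

```python
import re

# Consensus sites from the abstract, written as regular expressions.
HIGH_AFFINITY = re.compile(r"TAATGA[CG]")
LOW_AFFINITY = re.compile(r"CTAATTTT")


def find_sites(seq: str) -> list[tuple[int, str, str]]:
    """Return (0-based start, affinity class, matched text), sorted by position."""
    seq = seq.upper()
    sites = [(m.start(), "high", m.group()) for m in HIGH_AFFINITY.finditer(seq)]
    sites += [(m.start(), "low", m.group()) for m in LOW_AFFINITY.finditer(seq)]
    return sorted(sites)


# Hypothetical toy sequence with two high-affinity sites and one low-affinity site.
toy_upstream = "GGTAATGACTTCTAATTTTAATAATGAGCC"
print(find_sites(toy_upstream))
```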
Samuel, Arthur S.; Kumar, Sachin; Madhuri, Subbiah; Collins, Peter L.; Samal, Siba K.
2009-01-01
The complete genome consensus sequence was determined for avian paramyxovirus (APMV) serotype 9 prototype strain PMV-9/domestic Duck/New York/22/78. The genome is 15,438 nucleotides (nt) long and encodes six non-overlapping genes in the order of 3′-N-P/V/W-M-F-HN-L-5′ with intergenic regions of 0–30 nt. The genome length follows the “rule of six” and contains a 55-nt leader sequence at the 3′ end and a 47-nt trailer sequence at the 5′ end. The cleavage site of the F protein is I-R-E-G-R-I↓F, which does not conform to the conventional cleavage site of the ubiquitous cellular protease furin. The virus required exogenous protease for in vitro replication and grew only in a few established cell lines, indicating a restricted host range. Alignment and phylogenetic analysis of the predicted amino acid sequences of APMV-9 proteins with the cognate proteins of viruses of all five genera of family Paramyxoviridae showed that APMV-9 is more closely related to APMV-1 than to other APMVs. The mean death time in embryonated chicken eggs was found to be more than 120 h, indicating APMV-9 to be avirulent for chickens. PMID:19185593
Miyazaki, Saori; Sato, Yutaka; Asano, Tomoya; Nagamura, Yoshiaki; Nonomura, Ken-Ichi
2015-10-01
Post-transcriptional gene regulation by RNA recognition motif (RRM) proteins through binding to cis-elements in the 3'-untranslated region (3'-UTR) is widely used in eukaryotes to complete various biological processes. Rice MEIOSIS ARRESTED AT LEPTOTENE2 (MEL2) is an RRM protein that functions in the properly timed transition to meiosis. In vitro, the MEL2 RRM preferentially associated with the U-rich RNA consensus UUAGUU[U/A][U/G][A/U/G]U, in a sequence-dependent manner and in proportion to the amount of MEL2 protein. The consensus sequences were located in the putative looped structures of the RNA ligand. A genome-wide survey revealed a tendency for the MEL2-binding consensus to appear in the 3'-UTR of rice genes. Of 249 genes that conserved the consensus in their 3'-UTR, 13 genes were spatiotemporally co-expressed with MEL2 in meiotic flowers, and these included several genes with a presumed function in meiosis, such as Replication protein A and OsMADS3. The proteome analysis revealed that the amounts of small ubiquitin-related modifier-like protein and eukaryotic translation initiation factor3-like protein were dramatically altered in mel2 mutant anthers. Taken together with transcriptome and gene ontology results, we propose that the rice MEL2 is involved in the translational regulation of key meiotic genes on 3'-UTRs to achieve the faithful transition of germ cells to meiosis.
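A genome-wide survey for the consensus UUAGUU[U/A][U/G][A/U/G]U can be approximated with a regular-expression scan. The sketch below is illustrative only: the function name and the example 3'-UTR fragment are invented here, and the authors' actual survey method is not described in the abstract.

```python
import re

# Consensus from the abstract, UUAGUU[U/A][U/G][A/U/G]U, written as a regex.
# The lookahead lets overlapping sites be reported as well.
MEL2_CONSENSUS = re.compile(r"(?=(UUAGUU[UA][UG][AUG]U))")


def find_consensus_sites(rna: str) -> list[tuple[int, str]]:
    """Return (0-based start, matched sequence) for every consensus hit."""
    rna = rna.upper().replace("T", "U")  # accept DNA-style input as well
    return [(m.start(), m.group(1)) for m in MEL2_CONSENSUS.finditer(rna)]


# Hypothetical 3'-UTR fragment, invented for illustration only.
utr = "GGCAUUAGUUUUAUCCGAUUAGUUAGGUAA"
print(find_consensus_sites(utr))
# [(4, 'UUAGUUUUAU'), (18, 'UUAGUUAGGU')]
```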
Castilla, Agustín; Panizza, Paola; Rodríguez, Diego; Bonino, Luis; Díaz, Pilar; Irazoqui, Gabriela; Rodríguez Giordano, Sonia
2017-03-01
Janibacter sp. strain R02 (BNM 560) was isolated in our laboratory from an Antarctic soil sample. A remarkable trait of the strain was its high lipolytic activity, detected in Rhodamine-olive oil supplemented plates. Supernatants of Janibacter sp. R02 displayed superb activity on transesterification of acyl glycerols, thus being a good candidate for lipase prospection. Considering the lack of information concerning lipases of the genus Janibacter, we focused on the identification, cloning, expression and characterization of the extracellular lipases of this strain. By means of sequence alignment and clustering of consensus nucleotide sequences, a DNA fragment of 1272 bp was amplified, cloned and expressed in E. coli. The resulting recombinant enzyme, named LipJ2, showed preference for short to medium chain-length substrates, and displayed maximum activity at 80°C and pH 8-9, being strongly activated by a mixture of Na+ and K+. The enzyme presented an outstanding stability regarding both pH and temperature. Bioinformatics analysis of the amino acid sequence of LipJ2 revealed the presence of a consensus catalytic triad and a canonical pentapeptide. However, two additional rare motifs were found in LipJ2: an SXXL β-lactamase motif and two putative Y-type oxyanion holes (YAP). Although some of the previous features could allow assigning LipJ2 to the bacterial lipase families VIII or X, the phylogenetic analysis showed that LipJ2 clusters apart from other members of known lipase families, indicating that the newly isolated Janibacter esterase LipJ2 would be the first characterized member of a new family of bacterial lipases. Published by Elsevier Inc.
Medina-Carmona, Encarnación; Fuchs, Julian E; Gavira, Jose A; Mesa-Torres, Noel; Neira, Jose L; Salido, Eduardo; Palomino-Morales, Rogelio; Burgos, Miguel; Timson, David J; Pey, Angel L
2017-09-15
Human proteins are vulnerable towards disease-associated single amino acid replacements affecting protein stability and function. Interestingly, a few studies have shown that consensus amino acids from mammals or vertebrates can enhance protein stability when incorporated into human proteins. Here, we investigate yet unexplored relationships between the high vulnerability of human proteins towards disease-associated inactivation and recent evolutionary site-specific divergence of stabilizing amino acids. Using phylogenetic, structural and experimental analyses, we show that divergence from the consensus amino acids at several sites during mammalian evolution has caused local protein destabilization in two human proteins linked to disease: cancer-associated NQO1 and alanine:glyoxylate aminotransferase, mutated in primary hyperoxaluria type I. We demonstrate that a single consensus mutation (H80R) acts as a disease suppressor on the most common cancer-associated polymorphism in NQO1 (P187S). The H80R mutation reactivates P187S by enhancing FAD binding affinity through local and dynamic stabilization of its binding site. Furthermore, we show how a second suppressor mutation (E247Q) cooperates with H80R in protecting the P187S polymorphism towards inactivation through long-range allosteric communication within the structural ensemble of the protein. Our results support that recent divergence of consensus amino acids may have occurred with neutral effects on many functional and regulatory traits of wild-type human proteins. However, divergence at certain sites may have increased the propensity of some human proteins towards inactivation due to disease-associated mutations and polymorphisms. Consensus mutations also emerge as a potential strategy to identify structural hot-spots in proteins as targets for pharmacological rescue in loss-of-function genetic diseases. © The Author 2017. Published by Oxford University Press. All rights reserved. 
R2R - software to speed the depiction of aesthetic consensus RNA secondary structures
2011-01-01
Background With continuing identification of novel structured noncoding RNAs, there is an increasing need to create schematic diagrams showing the consensus features of these molecules. RNA structural diagrams are typically made either with general-purpose drawing programs like Adobe Illustrator, or with automated or interactive programs specific to RNA. Unfortunately, the use of applications like Illustrator is extremely time consuming, while existing RNA-specific programs produce figures that are useful, but usually not of the same aesthetic quality as those produced at great cost in Illustrator. Additionally, most existing RNA-specific applications are designed for drawing single RNA molecules, not consensus diagrams. Results We created R2R, a computer program that facilitates the generation of aesthetic and readable drawings of RNA consensus diagrams in a fraction of the time required with general-purpose drawing programs. Since the inference of a consensus RNA structure typically requires a multiple-sequence alignment, the R2R user annotates the alignment with commands directing the layout and annotation of the RNA. R2R creates SVG or PDF output that can be imported into Adobe Illustrator, Inkscape or CorelDRAW. R2R can be used to create consensus sequence and secondary structure models for novel RNA structures or to revise models when new representatives for known RNA classes become available. Although R2R does not currently have a graphical user interface, it has proven useful in our efforts to create 100 schematic models of distinct noncoding RNA classes. Conclusions R2R makes it possible to obtain high-quality drawings of the consensus sequence and structural models of many diverse RNA structures with a more practical amount of effort. R2R software is available at http://breaker.research.yale.edu/R2R and as an Additional file. PMID:21205310
Nonhybrid, finished microbial genome assemblies from long-read SMRT sequencing data.
Chin, Chen-Shan; Alexander, David H; Marks, Patrick; Klammer, Aaron A; Drake, James; Heiner, Cheryl; Clum, Alicia; Copeland, Alex; Huddleston, John; Eichler, Evan E; Turner, Stephen W; Korlach, Jonas
2013-06-01
We present a hierarchical genome-assembly process (HGAP) for high-quality de novo microbial genome assemblies using only a single, long-insert shotgun DNA library in conjunction with Single Molecule, Real-Time (SMRT) DNA sequencing. Our method uses the longest reads as seeds to recruit all other reads for construction of highly accurate preassembled reads through a directed acyclic graph-based consensus procedure, which we follow with assembly using off-the-shelf long-read assemblers. In contrast to hybrid approaches, HGAP does not require highly accurate raw reads for error correction. We demonstrate efficient genome assembly for several microorganisms using as few as three SMRT Cell zero-mode waveguide arrays of sequencing and for BACs using just one SMRT Cell. Long repeat regions can be successfully resolved with this workflow. We also describe a consensus algorithm that incorporates SMRT sequencing primary quality values to produce de novo genome sequence exceeding 99.999% accuracy.
Wellehan, James F. X.; Johnson, April J.; Harrach, Balázs; Benkö, Mária; Pessier, Allan P.; Johnson, Calvin M.; Garner, Michael M.; Childress, April; Jacobson, Elliott R.
2004-01-01
A consensus nested-PCR method was designed for investigation of the DNA polymerase gene of adenoviruses. Gene fragments were amplified and sequenced from six novel adenoviruses from seven lizard species, including four species from which adenoviruses had not previously been reported. Host species included Gila monster, leopard gecko, fat-tail gecko, blue-tongued skink, Tokay gecko, bearded dragon, and mountain chameleon. This is the first sequence information from lizard adenoviruses. Phylogenetic analysis indicated that these viruses belong to the genus Atadenovirus, supporting the reptilian origin of atadenoviruses. This PCR method may be useful for obtaining templates for initial sequencing of novel adenoviruses. PMID:15542689
SubCellProt: predicting protein subcellular localization using machine learning approaches.
Garg, Prabha; Sharma, Virag; Chaudhari, Pradeep; Roy, Nilanjan
2009-01-01
High-throughput genome sequencing projects continue to churn out enormous amounts of raw sequence data. However, most of this raw sequence data is unannotated and, hence, not very useful. Among the various approaches to decipher the function of a protein, one is to determine its localization. Experimental approaches for proteome annotation including determination of a protein's subcellular localizations are very costly and labor intensive. Besides the available experimental methods, in silico methods present alternative approaches to accomplish this task. Here, we present two machine learning approaches for prediction of the subcellular localization of a protein from the primary sequence information. Two machine learning algorithms, k Nearest Neighbor (k-NN) and Probabilistic Neural Network (PNN) were used to classify an unknown protein into one of the 11 subcellular localizations. The final prediction is made on the basis of a consensus of the predictions made by two algorithms and a probability is assigned to it. The results indicate that the primary sequence derived features like amino acid composition, sequence order and physicochemical properties can be used to assign subcellular localization with a fair degree of accuracy. Moreover, with the enhanced accuracy of our approach and the definition of a prediction domain, this method can be used for proteome annotation in a high throughput manner. SubCellProt is available at www.databases.niper.ac.in/SubCellProt.
Vaira, A M; Accotto, G P; Costantini, A; Milne, R G
2003-06-01
A 4018 nucleotide sequence was obtained for RNA 1 of Ranunculus white mottle virus (RWMV), genus Ophiovirus, representing an incomplete ORF of 1339 aa. Amino acid sequence analysis revealed significant similarities with RNA polymerases of viruses in the family Rhabdoviridae and a conserved domain of 685 aa, corresponding to the RdRp domain of those in the order Mononegavirales. Phylogenetic analysis indicated that the genus Ophiovirus is not related to the genus Tenuivirus or the family Bunyaviridae, with which it has been linked, and probably deserves a special taxonomic position, within a new family. A pair of degenerate primers was designed from a consensus sequence obtained from a relatively conserved region in the RNA 1 of two members of the genus, Citrus psorosis virus (CPsV) and RWMV. The primers, used in RT-PCR experiments, amplified a 136 bp DNA fragment from all the three recognized members of the genus, i.e. CPsV, RWMV and Tulip mild mottle mosaic virus (TMMMV) and from two tentative ophioviruses from lettuce and freesia. The amplified DNAs were sequenced and compared with the corresponding sequences of CPsV and RWMV and phylogenetic relationships were evaluated. Assays using extracts from plants infected by viruses belonging to the genera Tospovirus, Tenuivirus, Rhabdovirus and Varicosavirus indicated that the primers are genus-specific.
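Designing degenerate primers from a conserved region amounts to collapsing each alignment column into one IUPAC ambiguity code. The sketch below illustrates that idea under stated assumptions: the three aligned sequences are invented, gaps are not handled, and this is not the authors' primer-design procedure.

```python
from math import prod

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "AG": "R", "CT": "Y", "CG": "S", "AT": "W", "GT": "K", "AC": "M",
    "CGT": "B", "AGT": "D", "ACT": "H", "ACG": "V", "ACGT": "N",
}
CODE_FOR = {frozenset(bases): code for bases, code in IUPAC.items()}
BASES_FOR = {code: bases for bases, code in IUPAC.items()}


def degenerate_consensus(aligned: list[str]) -> str:
    """Collapse equal-length, gap-free aligned sequences into one IUPAC string."""
    if len({len(s) for s in aligned}) != 1:
        raise ValueError("sequences must be aligned to the same length")
    columns = zip(*(s.upper() for s in aligned))
    return "".join(CODE_FOR[frozenset(col)] for col in columns)


def degeneracy(primer: str) -> int:
    """Number of distinct plain sequences a degenerate primer represents."""
    return prod(len(BASES_FOR[code]) for code in primer.upper())


# Hypothetical aligned fragments, invented for illustration only.
fragments = ["ATGGCA", "ATGGCT", "ACGGCA"]
primer = degenerate_consensus(fragments)
print(primer, degeneracy(primer))  # AYGGCW 4
```

Keeping the degeneracy low matters in practice because each extra ambiguous position dilutes the fraction of primer molecules that match any one template.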
Identification and application of self-binding zipper-like sequences in SARS-CoV spike protein.
Zhang, Si Min; Liao, Ying; Neo, Tuan Ling; Lu, Yanning; Liu, Ding Xiang; Vahlne, Anders; Tam, James P
2018-05-22
Self-binding peptides containing zipper-like sequences, such as the Leu/Ile zipper sequence within the coiled coil regions of proteins and the cross-β spine steric zippers within the amyloid-like fibrils, could bind to the protein-of-origin through homophilic sequence-specific zipper motifs. These self-binding sequences represent opportunities for the development of biochemical tools and/or therapeutics. Here, we report on the identification of a putative self-binding β-zipper-forming peptide within the severe acute respiratory syndrome-associated coronavirus spike (S) protein and its application in viral detection. Peptide array scanning of overlapping peptides covering the entire length of S protein identified 34 putative self-binding peptides in six clusters, five of which contained octapeptide core consensus sequences. The Cluster I consensus octapeptide sequence GINITNFR was predicted by Eisenberg's 3D profile method to have high amyloid-like fibrillation potential through steric β-zipper formation. Peptide C6 containing the Cluster I consensus sequence was shown to oligomerize and form amyloid-like fibrils. Taking advantage of this, C6 was further applied to detect the S protein expression in vitro by fluorescence staining. Meanwhile, the coiled-coil-forming Leu/Ile heptad repeat sequences within the S protein were under-represented during peptide array scanning, consistent with the long peptide lengths required to attain high helix-mediated interaction avidity. The data suggest that short β-zipper-like self-binding peptides within the S protein could be identified through combining the peptide scanning and predictive methods, and could be exploited as biochemical detection reagents for viral infection. Copyright © 2018. Published by Elsevier Ltd.
Esmagambetov, Ilias; Bagaev, Alexander; Pichugin, Alexey; Lysenko, Andrey; Shcherbinin, Dmitry; Sedova, Elena; Logunov, Denis; Shmarov, Maxim; Ataullakhanov, Ravshan; Naroditsky, Boris; Gintsburg, Alexander
2018-01-01
To avoid outbreaks of influenza virus epidemics and pandemics among human populations, modern medicine requires the development of new universal vaccines that are able to provide protection from a wide range of influenza A virus strains. In the course of development of a universal vaccine, it is necessary to consider that immunity must be generated even against viruses from different hosts because new human epidemic virus strains have their origins in viruses of birds and other animals. We have enriched two conserved viral proteins, nucleoprotein (NP) and matrix protein 2 (M2), with B- and T-cell epitopes of not only human but also swine and avian origin. For this purpose, we analyzed M2 and NP sequences with respect to changes in the sequences of known T- and B-cell epitopes and chose conserved and evolutionarily significant epitopes. Eventually, we found consensus sequences of M2 and NP that have the maximum quantity of epitopes that are 100% coincident with them. Consensus epitope-enriched amino acid sequences of M2 and NP proteins were included in a recombinant adenoviral vector. Immunization with Ad5-tet-M2NP induced strong CD8 and CD4 T cell responses, specific to each of the encoded antigens, i.e. M2 and NP. Eight months after immunization with Ad5-tet-M2NP, high numbers of M2- and NP-responding “effector memory” CD44posCD62neg T cells were found in the mouse spleens, which revealed a long-term T cell immune memory conferred by the immunization. In all, the challenge experiments showed an extraordinarily wide-ranging efficacy of protection by the Ad5-tet-M2NP vaccine, covering 5 different heterosubtypes of influenza A virus (2 human, 2 avian and 1 swine). PMID:29377916
In silico analysis of the polygalacturonase inhibiting protein 1 from apple, Malus domestica.
Matsaunyane, Lerato Bt; Oelofse, Dean; Dubery, Ian A
2015-03-11
The MdPGIP1 gene, encoding the Malus domestica polygalacturonase inhibiting protein 1 (MdPGIP1), was isolated from the Granny Smith apple cultivar (GenBank accession no. DQ185063). The gene was used to transform tobacco and potato for enhanced resistance against fungal diseases. Analysis of the MdPGIP1 nucleotide sequence revealed that the gene comprises 993 nucleotides that encode a 330 amino acid polypeptide. In silico characterization of the MdPGIP1 polypeptide revealed domains typical of PGIP proteins, which include a 24 amino acid putative signal peptide, a potential cleavage site [Alanine-Leucine-Serine (ALS)] for the signal peptide, a 238 amino acid leucine-rich repeat (LRR) domain, a 46 amino acid N-terminal domain and a 22 amino acid C-terminal domain. The hydropathic evaluation of MdPGIP1 indicated a repetitive hydrophobic motif in the LRR domain and a hydrophilic surface area consistent with a globular protein. The typical consensus glycosylation sequence of Asn-X-Ser/Thr was identified in MdPGIP1, indicating potential N-linked glycosylation of MdPGIP1. The molecular mass of non-glycosylated MdPGIP1 was calculated as 36.615 kDa and the theoretical isoelectric point as 6.98. Furthermore, the secondary and tertiary structure of MdPGIP1 was modelled, and revealed that MdPGIP1 is a curved and elongated molecule that contains sheet B1, sheet B2 and 3₁₀-helices on its LRR domain. The overall properties of the MdPGIP1 protein are similar to those of the prototypical Phaseolus vulgaris PGIP 2 (PvPGIP2), and the detected differences supported its use in biotechnological applications as an inhibitor of targeted fungal polygalacturonases (PGs).
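Two of the in silico steps described here are easy to illustrate: scanning for the Asn-X-Ser/Thr sequon and checking that 993 nucleotides are consistent with a 330-residue polypeptide. The sketch below makes two assumptions that are not in the abstract: proline is excluded at the X position (a common convention), and the example peptide is invented rather than taken from MdPGIP1.

```python
import re

# N-linked glycosylation sequon Asn-X-Ser/Thr. Excluding Pro at X is an
# assumption here; the abstract only states Asn-X-Ser/Thr.
SEQUON = re.compile(r"(?=(N[^P][ST]))")


def find_sequons(protein: str) -> list[int]:
    """Return 1-based positions of the Asn in each candidate sequon."""
    return [m.start() + 1 for m in SEQUON.finditer(protein.upper())]


# Hypothetical peptide, not the MdPGIP1 sequence.
peptide = "MKNLSAANPTQNGTLL"
print(find_sequons(peptide))  # [3, 12]

# 993 nucleotides = 331 codons; one stop codon leaves 330 amino acids.
codons = 993 // 3
print(codons - 1)  # 330
```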
Ruane, Karen M; Lloyd, Adrian J; Fülöp, Vilmos; Dowson, Christopher G; Barreteau, Hélène; Boniface, Audrey; Dementin, Sébastien; Blanot, Didier; Mengin-Lecreulx, Dominique; Gobec, Stanislav; Dessen, Andréa; Roper, David I
2013-11-15
Formation of the peptidoglycan stem pentapeptide requires the insertion of both L and D amino acids by the ATP-dependent ligase enzymes MurC, -D, -E, and -F. The stereochemical control of the third position amino acid in the pentapeptide is crucial to maintain the fidelity of later biosynthetic steps contributing to cell morphology, antibiotic resistance, and pathogenesis. Here we determined the x-ray crystal structure of Staphylococcus aureus MurE UDP-N-acetylmuramoyl-L-alanyl-D-glutamate:meso-2,6-diaminopimelate ligase (MurE) (E.C. 6.3.2.7) at 1.8 Å resolution in the presence of ADP and the reaction product, UDP-MurNAc-L-Ala-γ-D-Glu-L-Lys. This structure provides for the first time a molecular understanding of how this Gram-positive enzyme discriminates between L-lysine and D,L-diaminopimelic acid, the predominant amino acid that replaces L-lysine in Gram-negative peptidoglycan. Despite the presence of a consensus sequence previously implicated in the selection of the third position residue in the stem pentapeptide in S. aureus MurE, the structure shows that only part of this sequence is involved in the selection of L-lysine. Instead, other parts of the protein contribute substrate-selecting residues, resulting in a lysine-binding pocket based on charge characteristics. Despite the absolute specificity for L-lysine, S. aureus MurE binds this substrate relatively poorly. In vivo analysis and metabolomic data reveal that this is compensated for by high cytoplasmic L-lysine concentrations. Therefore, both metabolic and structural constraints maintain the structural integrity of the staphylococcal peptidoglycan. This study provides a novel focus for S. aureus-directed antimicrobials based on dual targeting of essential amino acid biogenesis and its linkage to cell wall assembly.
Foulon, Veerle; Antonenkov, Vasily D.; Croes, Kathleen; Waelkens, Etienne; Mannaerts, Guy P.; Van Veldhoven, Paul P.; Casteels, Minne
1999-01-01
In the third step of the α-oxidation of 3-methyl-branched fatty acids such as phytanic acid, a 2-hydroxy-3-methylacyl-CoA is cleaved into formyl-CoA and a 2-methyl-branched fatty aldehyde. The cleavage enzyme was purified from the matrix protein fraction of rat liver peroxisomes and identified as a protein made up of four identical subunits of 63 kDa. Its activity proved to depend on Mg2+ and thiamine pyrophosphate, a hitherto unrecognized cofactor of α-oxidation. Formyl-CoA and 2-methylpentadecanal were identified as reaction products when the purified enzyme was incubated with 2-hydroxy-3-methylhexadecanoyl-CoA as the substrate. Hence the enzyme catalyzes a carbon–carbon cleavage, and we propose calling it 2-hydroxyphytanoyl-CoA lyase. Sequences derived from tryptic peptides of the purified rat protein were used as queries to recover human expressed sequence tags from the databases. The composite cDNA sequence of the human lyase contained an ORF of 1,734 bases that encodes a polypeptide with a calculated molecular mass of 63,732 Da. Recombinant human protein, expressed in mammalian cells, exhibited lyase activity. The lyase displayed homology to a putative Caenorhabditis elegans protein that resembles bacterial oxalyl-CoA decarboxylases. Similarly to the decarboxylases, a thiamine pyrophosphate-binding consensus domain was present in the C-terminal part of the lyase. Although no peroxisome targeting signal, neither 1 nor 2, was apparent, transfection experiments with constructs encoding green fluorescent protein fused to the full-length lyase or its C-terminal pentapeptide indicated that the C terminus of the lyase represents a peroxisome targeting signal 1 variant. PMID:10468558
Crowhurst, Ross N; Gleave, Andrew P; MacRae, Elspeth A; Ampomah-Dwamena, Charles; Atkinson, Ross G; Beuning, Lesley L; Bulley, Sean M; Chagne, David; Marsh, Ken B; Matich, Adam J; Montefiori, Mirco; Newcomb, Richard D; Schaffer, Robert J; Usadel, Björn; Allan, Andrew C; Boldingh, Helen L; Bowen, Judith H; Davy, Marcus W; Eckloff, Rheinhart; Ferguson, A Ross; Fraser, Lena G; Gera, Emma; Hellens, Roger P; Janssen, Bart J; Klages, Karin; Lo, Kim R; MacDiarmid, Robin M; Nain, Bhawana; McNeilage, Mark A; Rassam, Maysoon; Richardson, Annette C; Rikkerink, Erik HA; Ross, Gavin S; Schröder, Roswitha; Snowden, Kimberley C; Souleyre, Edwige JF; Templeton, Matt D; Walton, Eric F; Wang, Daisy; Wang, Mindy Y; Wang, Yanming Y; Wood, Marion; Wu, Rongmei; Yauk, Yar-Khing; Laing, William A
2008-01-01
Background Kiwifruit (Actinidia spp.) are a relatively new, but economically important crop grown in many different parts of the world. Commercial success is driven by the development of new cultivars with novel consumer traits including flavor, appearance, healthful components and convenience. To increase our understanding of the genetic diversity and gene-based control of these key traits in Actinidia, we have produced a collection of 132,577 expressed sequence tags (ESTs). Results The ESTs were derived mainly from four Actinidia species (A. chinensis, A. deliciosa, A. arguta and A. eriantha) and fell into 41,858 non-redundant clusters (18,070 tentative consensus sequences and 23,788 EST singletons). Analysis of flavor and fragrance-related gene families (acyltransferases and carboxylesterases) and pathways (terpenoid biosynthesis) is presented in comparison with a chemical analysis of the compounds present in Actinidia including esters, acids, alcohols and terpenes. ESTs are identified for most genes in color pathways controlling chlorophyll degradation and carotenoid biosynthesis. In the health area, data are presented on the ESTs involved in ascorbic acid and quinic acid biosynthesis showing not only that genes for many of the steps in these pathways are represented in the database, but that genes encoding some critical steps are absent. In the convenience area, genes related to different stages of fruit softening are identified. Conclusion This large EST resource will allow researchers to undertake the tremendous challenge of understanding the molecular basis of genetic diversity in the Actinidia genus as well as provide an EST resource for comparative fruit genomics. The various bioinformatics analyses we have undertaken demonstrate the extent of coverage of ESTs for genes encoding different biochemical pathways in Actinidia. PMID:18655731
Pfeiffer, M; Klein, A; Steinert, P; Schomburg, D
The 25 amino acid long subunit VhuU of the F420-non-reducing hydrogenase from Methanococcus voltae contains selenocysteine within the consensus sequence of known [NiFe] hydrogenases DP(C or U)CxxCxxH (U = selenocysteine). The sulfur analogue VhuUc was chemically synthesized and purified, and its metal binding capability, catalytic properties, and structural features were investigated. The polypeptide was able to bind nickel, but did not catalyse the heterolytic activation of H2. 2D-NMR spectroscopy revealed an alpha-helical secondary structure for the 15 N-terminal amino acids in 50% TFE. Nickel only binds to the C-terminus, which contains the conserved amino acid motif. Structures derived from the NMR data are compatible with the participation of both sulfur atoms from the conserved cysteine residues in metal ion binding. Structures obtained from the data sets for Ni.VhuUc as well as Zn.VhuUc showed no further ligands. The informational value for Ni.VhuUc was low due to paramagnetism.
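The consensus DP(C or U)CxxCxxH translates directly into a pattern in which x is any residue and U stands for selenocysteine. A minimal sketch follows; the 25-residue peptide is invented for illustration and is not the VhuU sequence, and the helper names are hypothetical.

```python
import re

# DP(C or U)CxxCxxH from the abstract; U = selenocysteine, x = any residue.
NIFE_MOTIF = re.compile(r"DP[CU]C..C..H")


def find_motif(peptide: str):
    """Return (1-based start, matched text) of the first motif hit, or None."""
    m = NIFE_MOTIF.search(peptide.upper())
    return (m.start() + 1, m.group()) if m else None


def to_sulfur_analogue(peptide: str) -> str:
    """Replace selenocysteine (U) with cysteine (C), as in a Sec-to-Cys analogue."""
    return peptide.upper().replace("U", "C")


# Hypothetical 25-residue peptide with the motif at its C-terminus.
example = "MSEKLAVIEELLKKADPUCTGCAVH"
print(find_motif(example))                      # (16, 'DPUCTGCAVH')
print(find_motif(to_sulfur_analogue(example)))  # (16, 'DPCCTGCAVH')
```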
DOE Office of Scientific and Technical Information (OSTI.GOV)
He, Guo-Shun; Grabowski, G.A.
1992-10-01
Gaucher disease is the most frequent lysosomal storage disease and the most prevalent Jewish genetic disease. About 30 identified missense mutations are causal to the defective activity of acid β-glucosidase in this disease. cDNAs were characterized from a moderately affected 9-year-old Ashkenazi Jewish Gaucher disease type 1 patient whose 80-year-old, enzyme-deficient, 1226G (Asn370→Ser [N370S]) homozygous grandfather was nearly asymptomatic. Sequence analyses revealed four populations of cDNAs with either the 1226G mutation, an exact exon 2 (ΔEX2) deletion, a deletion of exon 2 and the first 115 bp of exon 3 (ΔEX2-3), or a completely normal sequence. About 50% of the cDNAs were the ΔEX2, the ΔEX2-3, and the normal cDNAs, in a ratio of 6:3:1. Specific amplification and characterization of exon 2 and 5′ and 3′ intronic flanking sequences from the structural gene demonstrated clones with either the normal sequence or with a G(+1)→A(+1) transition at the exon 2/intron 2 boundary. This mutation destroyed the splice donor consensus site (U1 binding site) for mRNA processing. This transition also was present at the corresponding exon/intron boundary of the highly homologous pseudogene. This new mutation, termed “IVS2 G(+1),” is the first in the Ashkenazi Jewish population. The occurrence of this “pseudogene”-type mutation in the structural gene indicates the role of acid β-glucosidase pseudogene and structural gene rearrangements in the pathogenesis of this disease. 33 refs., 8 figs., 1 tab.
Qu, Wen; Cingolani, Pablo; Zeeberg, Barry R; Ruden, Douglas M
2017-01-01
Deep sequencing of cDNAs made from spliced mRNAs indicates that most coding genes in many animals and plants have pre-mRNA transcripts that are alternatively spliced. In pre-mRNAs, in addition to invariant exons that are present in almost all mature mRNA products, there are at least 6 additional types of exons, such as exons from alternative promoters or with alternative polyA sites, mutually exclusive exons, skipped exons, or exons with alternative 5' or 3' splice sites. Our bioinformatics-based hypothesis is that, in analogy to the genetic code, there is an "alternative-splicing code" in introns and flanking exon sequences that directs alternative splicing of many of the 36 types of introns. In humans, we identified 42 different consensus sequences that are each present in at least 100 human introns. 37 of the 42 top consensus sequences are significantly enriched or depleted in at least one of the 36 types of introns. We further supported our hypothesis by showing that 96 out of 96 analyzed human disease mutations that affect RNA splicing, and change alternative splicing from one class to another, can be partially explained by a mutation altering a consensus sequence from one type of intron to that of another type of intron. Some of the alternative splicing consensus sequences, and presumably their small-RNA or protein targets, are evolutionarily conserved across 50 species ranging from plants to animals. We also noticed that the set of introns within a gene usually share the same splicing codes, thus arguing that one sub-type of spliceosome might process all (or most) of the introns in a given gene. Our work sheds new light on a possible mechanism for generating the tremendous diversity in protein structure by alternative splicing of pre-mRNAs.
Reid-Bayliss, Kate S; Loeb, Lawrence A
2017-08-29
Transcriptional mutagenesis (TM) due to misincorporation during RNA transcription can result in mutant RNAs, or epimutations, that generate proteins with altered properties. TM has long been hypothesized to play a role in aging, cancer, and viral and bacterial evolution. However, inadequate methodologies have limited progress in elucidating a causal association. We present a high-throughput, highly accurate RNA sequencing method to measure epimutations with single-molecule sensitivity. Accurate RNA consensus sequencing (ARC-seq) uniquely combines RNA barcoding and generation of multiple cDNA copies per RNA molecule to eliminate errors introduced during cDNA synthesis, PCR, and sequencing. The stringency of ARC-seq can be scaled to accommodate the quality of input RNAs. We apply ARC-seq to directly assess transcriptome-wide epimutations resulting from RNA polymerase mutants and oxidative stress.
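The general idea of barcode-based consensus sequencing is to group reads that share a molecular barcode and keep only the base calls on which the copies agree. The sketch below illustrates that generic principle; it is not the ARC-seq algorithm, and the thresholds, function name, and example reads are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def family_consensus(reads, min_size=3, min_agreement=0.9):
    """Build one consensus per barcode family.

    reads: iterable of (barcode, sequence); sequences within a family are
    assumed to be pre-aligned and of equal length.
    Families smaller than min_size are discarded; positions where the most
    common base falls below min_agreement are masked with 'N'.
    """
    families = defaultdict(list)
    for barcode, seq in reads:
        families[barcode].append(seq)

    consensus = {}
    for barcode, seqs in families.items():
        if len(seqs) < min_size:
            continue  # too few copies to correct errors
        called = []
        for column in zip(*seqs):
            base, count = Counter(column).most_common(1)[0]
            called.append(base if count / len(seqs) >= min_agreement else "N")
        consensus[barcode] = "".join(called)
    return consensus


# Hypothetical reads: one family of three copies, one singleton.
reads = [("AAC", "ACGT"), ("AAC", "ACGT"), ("AAC", "ACCT"), ("GGT", "ACGA")]
print(family_consensus(reads))                     # {'AAC': 'ACNT'}
print(family_consensus(reads, min_agreement=0.6))  # {'AAC': 'ACGT'}
```

Raising `min_size` or `min_agreement` trades yield for accuracy, which is one way stringency could be scaled to input quality.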
NASA Astrophysics Data System (ADS)
Basu, Sankar; Söderquist, Fredrik; Wallner, Björn
2017-05-01
The focus of the computational structural biology community has taken a dramatic shift over the past one-and-a-half decades from the classical protein structure prediction problem to the possible understanding of intrinsically disordered proteins (IDP) or proteins containing regions of disorder (IDPR). The current interest lies in the unraveling of a disorder-to-order transitioning code embedded in the amino acid sequences of IDPs/IDPRs. Disordered proteins are characterized by an enormous amount of structural plasticity which makes them promiscuous in binding to different partners, multi-functional in cellular activity and atypical in folding energy landscapes resembling partially folded molten globules. Also, their involvement in several deadly human diseases (e.g. cancer, cardiovascular and neurodegenerative diseases) makes them attractive drug targets, and important for a biochemical understanding of the disease(s). The study of the structural ensemble of IDPs is rather difficult, in particular for transient interactions. When bound to a structured partner, an IDPR adopts an ordered conformation in the complex. The residues that undergo this disorder-to-order transition are called protean residues and are generally found in short contiguous stretches; the first step in understanding the modus operandi of an IDP/IDPR would be to predict these residues. There are a few available methods which predict these protean segments from their amino acid sequences; however, their performance reported in the literature leaves clear room for improvement. With this background, the current study presents 'Proteus', a random forest classifier that predicts the likelihood of a residue undergoing a disorder-to-order transition upon binding to a potential partner protein. The prediction is based on features that can be calculated using the amino acid sequence alone. Proteus compares favorably with existing methods, predicting twice as many true positives as the second best method (55% vs. 27%) with a much higher precision on an independent data set. The current study also sheds some light on a possible 'disorder-to-order' transitioning consensus, untangled, yet embedded in the amino acid sequence of IDPs. Some guidelines have also been suggested for proceeding with a real-life structural modeling involving an IDPR using Proteus.
USDA-ARS's Scientific Manuscript database
Lipase (lip) and lipase-specific foldase (lif) genes of a biodegradable polyhydroxyalkanoate- (PHA-) synthesizing Pseudomonas resinovorans NRRL B-2649 were cloned using primers based on consensus sequences, followed by PCR-based genome walking. Sequence analyses showed a putative Lip gene-product (...
Cheung, Wing-I; Chan, Henry Lik-Yuen; Leung, Vincent King-Sun; Tse, Chi-Hang; Fung, Kitty; Lin, Shek-Ying; Wong, Ann; Wong, Vincent Wai-Sun; Chau, Tai-Nin
2010-02-01
In patients with occult hepatitis B virus (HBV) infection, acute exacerbation may occur when they become immunocompromised. Usually, these patients develop hepatitis B surface antigen (HBsAg) seroreversion during the flare. Here we report on a patient with occult HBV infection, who developed HBV exacerbation after chemotherapy for diffuse large B-cell lymphoma. The resurgence of HBV DNA preceded the elevation of liver enzymes for 20 weeks. Atypically, despite high viraemia, serological tests showed persistently negative HBsAg using three different sensitive HBsAg assays (i.e., Architect, Murex and AxSYM). On comparing the amino acid sequence of the index patient with the consensus sequence, five mutations were found at pre-S1, five at pre-S2 and twenty-three mutations at the S region. Six amino acid mutations were located in the 'a' determinant, including P120T, K122R, M133T, F134L, D144A and G145A. The mutants K122R, F134L and G145A in our patient have not been tested for their sensitivity to Architect and Murex assays by the previous investigators and might represent the escape mutants to these assays.
Bioinformatic flowchart and database to investigate the origins and diversity of Clan AA peptidases
Llorens, Carlos; Futami, Ricardo; Renaud, Gabriel; Moya, Andrés
2009-01-01
Background Clan AA of aspartic peptidases relates the family of pepsin monomers evolutionarily with all dimeric peptidases encoded by eukaryotic LTR retroelements. Recent findings describing various pools of single-domain nonviral host peptidases, in prokaryotes and eukaryotes, indicate that the diversity of clan AA is larger than previously thought. The ensuing approach to investigate this enzyme group is by studying its phylogeny. However, clan AA is a difficult case to study due to the low similarity and different rates of evolution. This work is an ongoing attempt to investigate the different clan AA families to understand the cause of their diversity. Results In this paper, we describe in-progress database and bioinformatic flowchart designed to characterize the clan AA protein domain based on all possible protein families through ancestral reconstructions, sequence logos, and hidden markov models (HMMs). The flowchart includes the characterization of a major consensus sequence based on 6 amino acid patterns with correspondence with Andreeva's model, the structural template describing the clan AA peptidase fold. The set of tools is work in progress we have organized in a database within the GyDB project, referred to as Clan AA Reference Database . Conclusion The pre-existing classification combined with the evolutionary history of LTR retroelements permits a consistent taxonomical collection of sequence logos and HMMs. This set is useful for gene annotation but also a reference to evaluate the diversity of, and the relationships among, the different families. Comparisons among HMMs suggest a common ancestor for all dimeric clan AA peptidases that is halfway between single-domain nonviral peptidases and those coded by Ty3/Gypsy LTR retroelements. Sequence logos reveal how all clan AA families follow similar protein domain architecture related to the peptidase fold. 
In particular, each family nucleates a particular consensus motif in the sequence position related to the flap. The different motifs constitute a network where an alanine-asparagine-like variable motif predominates, instead of the canonical flap of the HIV-1 peptidase and closer relatives. Reviewers This article was reviewed by Daniel H. Haft, Vladimir Kapitonov (nominated by Jerry Jurka), and Ben M. Dunn (nominated by Claus Wilke). PMID:19173708
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bian, Chuanbing; Yuan, Cai; Chen, Liqing
2010-04-05
Triacylglycerol lipases (EC 3.1.1.3) are present in many different organisms including animals, plants, and microbes. Lipases catalyze the hydrolysis of long-chain triglycerides into fatty acids and glycerol at the interface between the water-insoluble substrate and the aqueous phase. Lipases can also catalyze the reverse esterification reaction to form glycerides under certain conditions. Lipases of microbial origin are of considerable commercial interest for a wide variety of biotechnological applications in industries including detergent, food, cosmetic, pharmaceutical, fine chemicals, and biodiesel. Microbial lipases have become one of the most important classes of industrial enzymes. PEL (Penicillium expansum lipase) is a fungal lipase from Penicillium expansum strain PF898, isolated from Chinese soil, that has been subjected to several generations of mutagenesis to increase its enzymatic activity. PEL belongs to the triacylglycerol lipase family, and its catalytic characteristics have been studied. The enzyme has been used in the Chinese laundry detergent industry for several years (http://www.leveking.com). However, the poor thermal stability of the enzyme limits its application. To further study and improve this enzyme, PEL was cloned and sequenced. Furthermore, it was overexpressed in Pichia pastoris. PEL contains the GHSLG sequence, which matches the lipase consensus sequence Gly-X1-Ser-X2-Gly, but has low amino acid sequence identity to other lipases. The most similar lipases are those of Rhizomucor miehei (PML) and Rhizopus niveus (PNL), with 21% and 20% sequence identity to PEL, respectively. Interestingly, the similarity of PEL to the known esterases is somewhat higher, with 24% sequence identity to feruloyl esterase A. Here, we report the 1.3 Å resolution crystal structure of PEL determined by sulfur SAD phasing.
This structure not only presents a new lipase structure at high resolution, but also provides a structural platform to analyze the published mutagenesis results. The structure may also open up new avenues for future protein engineering study on PEL.
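The consensus Gly-X1-Ser-X2-Gly cited in this abstract maps directly onto a regular expression. The sketch below is illustrative only: the function name `find_lipase_motif` is hypothetical, and the test string is a made-up fragment that merely contains GHSLG, not the PEL sequence.

```python
import re

# Gly-X1-Ser-X2-Gly in one-letter code; X1 and X2 may be any residue.
# The lookahead wrapper lets overlapping hits be reported as well.
LIPASE_MOTIF = re.compile(r"(?=(G.S.G))")

def find_lipase_motif(protein):
    """Return (0-based start, matched pentapeptide) for every motif hit."""
    return [(m.start(), m.group(1)) for m in LIPASE_MOTIF.finditer(protein)]

toy_fragment = "MKTAGHSLGAAL"  # hypothetical fragment, not real PEL sequence
hits = find_lipase_motif(toy_fragment)
```

A motif hit of this kind says nothing about overall similarity, which is consistent with the abstract's point that PEL carries the consensus yet shares only about 20% identity with other lipases.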
Woldring, Daniel R.; Holec, Patrick V.; Zhou, Hong; Hackel, Benjamin J.
2015-01-01
Discovering new binding function via a combinatorial library in small protein scaffolds requires balance between appropriate mutations to introduce favorable intermolecular interactions while maintaining intramolecular integrity. Sitewise constraints exist in a non-spatial gradient from diverse to conserved in evolved antibody repertoires; yet non-antibody scaffolds generally do not implement this strategy in combinatorial libraries. Despite the fact that biased amino acid distributions, typically elevated in tyrosine, serine, and glycine, have gained wider use in synthetic scaffolds, these distributions are still predominantly applied uniformly to diversified sites. While select sites in fibronectin domains and DARPins have shown benefit from sitewise designs, they have not been deeply evaluated. Inspired by this disparity between diversity distributions in natural libraries and synthetic scaffold libraries, we hypothesized that binders resulting from discovery and evolution would exhibit a non-spatial, sitewise gradient of amino acid diversity. To identify sitewise diversities consistent with efficient evolution in the context of a hydrophilic fibronectin domain, >10^5 binders to six targets were evolved and sequenced. Evolutionarily favorable amino acid distributions at 25 sites reveal Shannon entropies (range: 0.3–3.9; median: 2.1; standard deviation: 1.1) supporting the diversity gradient hypothesis. Sitewise constraints in evolved sequences are consistent with complementarity, stability, and consensus biases. Implementation of sitewise constrained diversity enables direct selection of nanomolar affinity binders validating an efficient strategy to balance inter- and intra-molecular interaction demands at each site. PMID:26383268
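Sitewise Shannon entropy of the kind reported here can be computed per alignment column as H = sum over residues of p * log2(1/p). The sketch below is a generic illustration on toy data, not the authors' analysis. It assumes entropies are in bits, which seems likely because the reported maximum of 3.9 exceeds ln 20 (about 3.0) but not log2 20 (about 4.32); the function name `sitewise_entropy` is hypothetical.

```python
import math
from collections import Counter

def sitewise_entropy(sequences):
    """Shannon entropy in bits of the residue distribution at each aligned site."""
    entropies = []
    for column in zip(*sequences):
        total = len(column)
        counts = Counter(column)
        h = sum((n / total) * math.log2(total / n) for n in counts.values())
        entropies.append(h)
    return entropies

# Toy aligned binders (hypothetical): site 1 is conserved, sites 2 and 3 vary.
binders = ["YSG", "YAG", "YSD", "YTN"]
entropies = sitewise_entropy(binders)
```

A fully conserved site scores 0 bits and a site with all 20 amino acids at equal frequency scores about 4.32 bits, so the reported range of 0.3 to 3.9 spans nearly the whole scale.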
Li, You-Hai; Han, Wen-Jin; Gui, Xi-Wu; Wei, Tao; Tang, Shuang-Yan; Jin, Jian-Ming
2016-01-01
Tentoxin, a cyclic tetrapeptide produced by several Alternaria species, inhibits the F1-ATPase activity of chloroplasts, resulting in chlorosis in sensitive plants. In this study, we report two clustered genes, encoding a putative non-ribosome peptide synthetase (NRPS) TES and a cytochrome P450 protein TES1, that are required for tentoxin biosynthesis in Alternaria alternata strain ZJ33, which was isolated from blighted leaves of Eupatorium adenophorum. Using a pair of primers designed according to the consensus sequences of the adenylation domain of NRPSs, two fragments containing putative adenylation domains were amplified from A. alternata ZJ33, and subsequent PCR analyses demonstrated that these fragments belonged to the same NRPS coding sequence. With no introns, TES consists of a single 15,486 base pair open reading frame encoding a predicted 5161 amino acid protein. Meanwhile, the TES1 gene is predicted to contain five introns and encode a 506 amino acid protein. The TES protein is predicted to be comprised of four peptide synthase modules with two additional N-methylation domains, and the number and arrangement of the modules in TES were consistent with the number and arrangement of the amino acid residues of tentoxin, respectively. Notably, both TES and TES1 null mutants generated via homologous recombination failed to produce tentoxin. This study provides the first evidence concerning the biosynthesis of tentoxin in A. alternata. PMID:27490569
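The reported ORF and protein lengths can be cross-checked with simple codon arithmetic. The sketch below assumes the stated ORF length includes the stop codon, which is an assumption that happens to reconcile the two figures; `protein_length_from_orf` is a hypothetical helper name.

```python
def protein_length_from_orf(orf_bp, includes_stop=True):
    """Number of amino acids encoded by an intron-free ORF of orf_bp base pairs."""
    if orf_bp % 3 != 0:
        raise ValueError("ORF length is not a multiple of three")
    codons = orf_bp // 3
    # The stop codon occupies one codon but encodes no residue.
    return codons - 1 if includes_stop else codons

tes_residues = protein_length_from_orf(15486)  # 5162 codons, one of them a stop
```

Under that assumption 15,486 bp gives 5,162 codons and therefore 5,161 residues, matching the predicted protein size.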
A first report and complete genome sequence of alfalfa enamovirus from Sudan
USDA-ARS's Scientific Manuscript database
A full genome sequence of a viral pathogen, provisionally named alfalfa enamovirus 2 (AEV-2), was reconstructed from short reads obtained by Illumina RNA sequencing of alfalfa sample originating from Sudan. Ambiguous nucleotides in the resultant consensus assembly and identity of the predicted virus...
A systematic approach to novel virus discovery in emerging infectious disease outbreaks.
Sridhar, Siddharth; To, Kelvin K W; Chan, Jasper F W; Lau, Susanna K P; Woo, Patrick C Y; Yuen, Kwok-Yung
2015-05-01
The discovery of novel viruses is of great importance to human health-both in the setting of emerging infectious disease outbreaks and in disease syndromes of unknown etiology. Despite the recent proliferation of many efficient virus discovery methods, careful selection of a combination of methods is important to demonstrate a novel virus, its clinical associations, and its relevance in a timely manner. The identification of a patient or an outbreak with distinctive clinical features and negative routine microbiological workup is often the starting point for virus hunting. This review appraises the roles of culture, electron microscopy, and nucleic acid detection-based methods in optimizing virus discovery. Cell culture is generally slow but may yield viable virus. Although the choice of cell line often involves trial and error, it may be guided by the clinical syndrome. Electron microscopy is insensitive but fast, and may provide morphological clues to choice of cell line or consensus primers for nucleic acid detection. Consensus primer PCR can be used to detect viruses that are closely related to known virus families. Random primer amplification and high-throughput sequencing can catch any virus genome but cannot yield an infectious virion for testing Koch postulates. A systematic approach that incorporates carefully chosen combinations of virus detection techniques is required for successful virus discovery. Copyright © 2015 American Society for Investigative Pathology and the Association for Molecular Pathology. Published by Elsevier Inc. All rights reserved.
A consensus linkage map of lentil based on DArT markers from three RIL mapping populations.
Ates, Duygu; Aldemir, Secil; Alsaleh, Ahmad; Erdogmus, Semih; Nemli, Seda; Kahriman, Abdullah; Ozkan, Hakan; Vandenberg, Albert; Tanyolac, Bahattin
2018-01-01
Lentil (Lens culinaris ssp. culinaris Medikus) is a diploid (2n = 2x = 14), self-pollinating grain legume with a haploid genome size of about 4 Gbp and is grown throughout the world with current annual production of 4.9 million tonnes. A consensus map of lentil (Lens culinaris ssp. culinaris Medikus) was constructed using three different lentils recombinant inbred line (RIL) populations, including "CDC Redberry" x "ILL7502" (LR8), "ILL8006" x "CDC Milestone" (LR11) and "PI320937" x "Eston" (LR39). The lentil consensus map was composed of 9,793 DArT markers, covered a total of 977.47 cM with an average distance of 0.10 cM between adjacent markers and constructed 7 linkage groups representing 7 chromosomes of the lentil genome. The consensus map had no gap larger than 12.67 cM and only 5 gaps were found to be between 12.67 cM and 6.0 cM (on LG3 and LG4). The localization of the SNP markers on the lentil consensus map were in general consistent with their localization on the three individual genetic linkage maps and the lentil consensus map has longer map length, higher marker density and shorter average distance between the adjacent markers compared to the component linkage maps. This high-density consensus map could provide insight into the lentil genome. The consensus map could also help to construct a physical map using a Bacterial Artificial Chromosome library and map based cloning studies. Sequence information of DArT may help localization of orientation scaffolds from Next Generation Sequencing data.
A consensus linkage map of lentil based on DArT markers from three RIL mapping populations
Ates, Duygu; Aldemir, Secil; Alsaleh, Ahmad; Erdogmus, Semih; Nemli, Seda; Kahriman, Abdullah; Ozkan, Hakan; Vandenberg, Albert
2018-01-01
Background Lentil (Lens culinaris ssp. culinaris Medikus) is a diploid (2n = 2x = 14), self-pollinating grain legume with a haploid genome size of about 4 Gbp and is grown throughout the world with current annual production of 4.9 million tonnes. Materials and methods A consensus map of lentil (Lens culinaris ssp. culinaris Medikus) was constructed using three different lentils recombinant inbred line (RIL) populations, including “CDC Redberry” x “ILL7502” (LR8), “ILL8006” x “CDC Milestone” (LR11) and “PI320937” x “Eston” (LR39). Results The lentil consensus map was composed of 9,793 DArT markers, covered a total of 977.47 cM with an average distance of 0.10 cM between adjacent markers and constructed 7 linkage groups representing 7 chromosomes of the lentil genome. The consensus map had no gap larger than 12.67 cM and only 5 gaps were found to be between 12.67 cM and 6.0 cM (on LG3 and LG4). The localization of the SNP markers on the lentil consensus map were in general consistent with their localization on the three individual genetic linkage maps and the lentil consensus map has longer map length, higher marker density and shorter average distance between the adjacent markers compared to the component linkage maps. Conclusion This high-density consensus map could provide insight into the lentil genome. The consensus map could also help to construct a physical map using a Bacterial Artificial Chromosome library and map based cloning studies. Sequence information of DArT may help localization of orientation scaffolds from Next Generation Sequencing data. PMID:29351563
Applying the Concept of Peptide Uniqueness to Anti-Polio Vaccination
Kanduc, Darja; Fasano, Candida; Capone, Giovanni; Pesce Delfino, Antonella; Calabrò, Michele; Polimeno, Lorenzo
2015-01-01
Background. Although rare, adverse events may associate with anti-poliovirus vaccination thus possibly hampering global polio eradication worldwide. Objective. To design peptide-based anti-polio vaccines exempt from potential cross-reactivity risks and possibly able to reduce rare potential adverse events such as the postvaccine paralytic poliomyelitis due to the tendency of the poliovirus genome to mutate. Methods. Proteins from poliovirus type 1, strain Mahoney, were analyzed for amino acid sequence identity to the human proteome at the pentapeptide level, searching for sequences that (1) have zero percent of identity to human proteins, (2) are potentially endowed with an immunologic potential, and (3) are highly conserved among poliovirus strains. Results. Sequence analyses produced a set of consensus epitopic peptides potentially able to generate specific anti-polio immune responses exempt from cross-reactivity with the human host. Conclusion. Peptide sequences unique to poliovirus proteins and conserved among polio strains might help formulate a specific and universal anti-polio vaccine able to react with multiple viral strains and exempt from the burden of possible cross-reactions with human proteins. As an additional advantage, using a peptide-based vaccine instead of current anti-polio DNA vaccines would eliminate the rare post-polio poliomyelitis cases and other disabling symptoms that may appear following vaccination. PMID:26568962
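Criterion (1) of the Methods, zero pentapeptide identity to the host proteome, amounts to a set-membership test over sliding 5-residue windows. The sketch below illustrates only that criterion on made-up sequences; immunologic potential and conservation among strains are not modeled, and the names `kmers` and `unique_pentapeptides` are hypothetical.

```python
def kmers(seq, k=5):
    """Set of all overlapping k-residue windows in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def unique_pentapeptides(viral_protein, host_proteins, k=5):
    """Viral k-mers, in order of first appearance, absent from every host protein."""
    host_kmers = set()
    for protein in host_proteins:
        host_kmers |= kmers(protein, k)

    seen = set()
    unique = []
    for i in range(len(viral_protein) - k + 1):
        peptide = viral_protein[i:i + k]
        if peptide not in host_kmers and peptide not in seen:
            seen.add(peptide)
            unique.append(peptide)
    return unique

# Toy sequences (hypothetical, not real poliovirus or human proteins).
viral = "MGAQVSSQK"
host = ["AAMGAQVTT", "SSQKLL"]
unique = unique_pentapeptides(viral, host)
```

For a real proteome the host k-mer set would be built once and reused across all viral proteins, since set lookup keeps each query window at constant cost.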
Stewart, J M; Blakely, J A; Karpowicz, P A; Kalanxhi, E; Thatcher, B J; Martin, B M
2004-03-01
We purified myoglobin from beluga whale (Delphinapterus leucas) muscle (longissimus dorsi) with size exclusion and cation exchange chromatographies. The molecular mass was determined by mass spectrometry (17,081 Da) and the isoelectric pH (9.4) by capillary isoelectric focusing. The near-complete amino acid sequence was determined and a phylogeny indicated that beluga was in the same clade as Dall's and harbor porpoises. There were consensus motifs for a phosphorylation site on the protein surface with the most likely site at serine-117. This motif was common to all cetacean myoglobins examined. Two oxygen-binding studies at 37 °C indicated dissociation constants (20.5 and 23.6 µM) 5.7-6.6 times larger than horse myoglobin (3.6 µM). The autoxidation rate of beluga myoglobin at 37 °C, pH 7.2 was 0.218 ± 0.028 h^-1, 1/3 larger than reported for myoglobin of terrestrial mammals. There was no clear sequence change to explain the difference in oxygen binding or autoxidation although substitutions (N66 and T67) in an invariant rich sequence (HGNTV) distal to the heme may play a role. Structural models based on the protein sequence and constructed on topologies of known templates (horse and sperm whale crystal structures) were not adequate to assess perturbation of the heme pocket.
Purification, cDNA cloning, and regulation of lysophospholipase from rat liver.
Sugimoto, H; Hayashi, H; Yamashita, S
1996-03-29
A lysophospholipase was purified 506-fold from rat liver supernatant. The preparation gave a single 24-kDa protein band on SDS-polyacrylamide gel electrophoresis. The enzyme hydrolyzed lysophosphatidylcholine, lysophosphatidylethanolamine, lysophosphatidylinositol, lysophosphatidylserine, and 1-oleoyl-2-acetyl-sn-glycero-3-phosphocholine at pH 6-8. The purified enzyme was used for the preparation of antibody and peptide sequencing. A cDNA clone was isolated by screening a rat liver lambda gt11 cDNA library with the antibody, followed by the selection of further extended clones from a lambda gt10 library. The isolated cDNA was 2,362 base pairs in length and contained an open reading frame encoding 230 amino acids with a Mr of 24,708. The peptide sequences determined were found in the reading frame. When the cDNA was expressed in Escherichia coli cells as the beta-galactosidase fusion, lysophosphatidylcholine-hydrolyzing activity was markedly increased. The deduced amino acid sequence showed significant similarity to Pseudomonas fluorescence esterase A and Spirulina platensis esterase. The three sequences contained the GXSXG consensus at similar positions. The transcript was found in various tissues with the following order of abundance: spleen, heart, kidney, brain, lung, stomach, and testis = liver. In contrast, the enzyme protein was abundant in the following order: testis, liver, kidney, heart, stomach, lung, brain, and spleen. Thus the mRNA abundance disagreed with the level of the enzyme protein in liver, testis, and spleen. When HL-60 cells were induced to differentiate into granulocytes with dimethyl sulfoxide, the 24-kDa lysophospholipase protein increased significantly, but the mRNA abundance remained essentially unchanged. Thus a posttranscriptional control mechanism is present for the regulation of 24-kDa lysophospholipase.
Consensus statement: Virus taxonomy in the age of metagenomics.
Simmonds, Peter; Adams, Mike J; Benkő, Mária; Breitbart, Mya; Brister, J Rodney; Carstens, Eric B; Davison, Andrew J; Delwart, Eric; Gorbalenya, Alexander E; Harrach, Balázs; Hull, Roger; King, Andrew M Q; Koonin, Eugene V; Krupovic, Mart; Kuhn, Jens H; Lefkowitz, Elliot J; Nibert, Max L; Orton, Richard; Roossinck, Marilyn J; Sabanadzovic, Sead; Sullivan, Matthew B; Suttle, Curtis A; Tesh, Robert B; van der Vlugt, René A; Varsani, Arvind; Zerbini, F Murilo
2017-03-01
The number and diversity of viral sequences that are identified in metagenomic data far exceeds that of experimentally characterized virus isolates. In a recent workshop, a panel of experts discussed the proposal that, with appropriate quality control, viruses that are known only from metagenomic data can, and should be, incorporated into the official classification scheme of the International Committee on Taxonomy of Viruses (ICTV). Although a taxonomy that is based on metagenomic sequence data alone represents a substantial departure from the traditional reliance on phenotypic properties, the development of a robust framework for sequence-based virus taxonomy is indispensable for the comprehensive characterization of the global virome. In this Consensus Statement article, we consider the rationale for why metagenomic sequence data should, and how it can, be incorporated into the ICTV taxonomy, and present proposals that have been endorsed by the Executive Committee of the ICTV.
Brylinski, Michal; Konieczny, Leszek; Kononowicz, Andrzej; Roterman, Irena
2008-03-21
The well-known procedure implemented in ClustalW for sequence comparison was applied to structure comparison. The consensus sequence as well as the consensus structure has been defined for proteins belonging to the serpin family. The structure of the early stage intermediate was the object of the similarity search. High values of W(sequence) were found to accord with high values of W(structure), making structure comparison possible using common criteria for sequence and structure comparison. Since the early stage structural form has been created according to a limited conformational sub-space which does not include the beta-structure (this structure is mediated by the C7eq structural form), it is particularly important to see that the C7eq structural form may be treated as the seed for the beta-structure present in the final native structure of the protein. The applicability of the ClustalW procedure to structure comparison unifies these two comparisons.
Quick, Josh; Grubaugh, Nathan D; Pullan, Steven T; Claro, Ingra M; Smith, Andrew D; Gangavarapu, Karthik; Oliveira, Glenn; Robles-Sikisaka, Refugio; Rogers, Thomas F; Beutler, Nathan A; Burton, Dennis R; Lewis-Ximenez, Lia Laura; de Jesus, Jaqueline Goes; Giovanetti, Marta; Hill, Sarah; Black, Allison; Bedford, Trevor; Carroll, Miles W; Nunes, Marcio; Alcantara, Luiz Carlos; Sabino, Ester C; Baylis, Sally A; Faria, Nuno; Loose, Matthew; Simpson, Jared T; Pybus, Oliver G; Andersen, Kristian G; Loman, Nicholas J
2018-01-01
Genome sequencing has become a powerful tool for studying emerging infectious diseases; however, genome sequencing directly from clinical samples without isolation remains challenging for viruses such as Zika, where metagenomic sequencing methods may generate insufficient numbers of viral reads. Here we present a protocol for generating coding-sequence complete genomes comprising an online primer design tool, a novel multiplex PCR enrichment protocol, optimised library preparation methods for the portable MinION sequencer (Oxford Nanopore Technologies) and the Illumina range of instruments, and a bioinformatics pipeline for generating consensus sequences. The MinION protocol does not require an internet connection for analysis, making it suitable for field applications with limited connectivity. Our method relies on multiplex PCR for targeted enrichment of viral genomes from samples containing as few as 50 genome copies per reaction. Viral consensus sequences can be achieved starting with clinical samples in 1-2 days following a simple laboratory workflow. This method has been successfully used by several groups studying Zika virus evolution and is facilitating an understanding of the spread of the virus in the Americas. PMID:28538739
Mbanzibwa, Deusdedith R.; Tian, Yanping; Mukasa, Settumba B.; Valkonen, Jari P. T.
2009-01-01
The complete positive-sense single-stranded RNA genome of Cassava brown streak virus (CBSV; genus Ipomovirus; Potyviridae) was found to consist of 9,069 nucleotides and predicted to produce a polyprotein of 2,902 amino acids. It was lacking helper-component proteinase but contained a single P1 serine proteinase that strongly suppressed RNA silencing. Besides the exceptional structure of the 5′-proximal part of the genome, CBSV also contained a Maf/HAM1-like sequence (678 nucleotides, 226 amino acids) recombined between the replicase and coat protein domains in the 3′-proximal part of the genome, which is highly conserved in Potyviridae. HAM1 was flanked by consensus proteolytic cleavage sites for ipomovirus NIaPro cysteine proteinase. Homology of CBSV HAM1 with cellular Maf/HAM1 pyrophosphatases suggests that it may intercept noncanonical nucleoside triphosphates to reduce mutagenesis of viral RNA. PMID:19386713
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bhuvanakantham, Raghavan; Chong, Mun-Keat; Ng, Mah-Lee, E-mail: micngml@nus.edu.sg
2009-11-06
West Nile virus (WNV) capsid (C) protein has been shown to enter the nucleus of infected cells. However, the mechanism by which C protein enters the nucleus is unknown. In this study, we have unveiled for the first time that nuclear transport of WNV and Dengue virus C protein is mediated by their direct association with importin-α. This interplay is mediated by the consensus sequences of bipartite nuclear localization signal located between amino acid residues 85-101 together with amino acid residues 42 and 43 of C protein. Elucidation of biological significance of importin-α/C protein interaction demonstrated that the binding efficiency of this association influenced the nuclear entry of C protein and virus production. Collectively, this study illustrated the molecular mechanism by which the C protein of arthropod-borne flavivirus enters the nucleus and showed the importance of importin-α/C protein interaction in the context of flavivirus life-cycle.
Fast and accurate de novo genome assembly from long uncorrected reads
Vaser, Robert; Sović, Ivan; Nagarajan, Niranjan
2017-01-01
The assembly of long reads from Pacific Biosciences and Oxford Nanopore Technologies typically requires resource-intensive error-correction and consensus-generation steps to obtain high-quality assemblies. We show that the error-correction step can be omitted and that high-quality consensus sequences can be generated efficiently with a SIMD-accelerated, partial-order alignment–based, stand-alone consensus module called Racon. Based on tests with PacBio and Oxford Nanopore data sets, we show that Racon coupled with miniasm enables consensus genomes with similar or better quality than state-of-the-art methods while being an order of magnitude faster. PMID:28100585
Sequence specificity of the human mRNA N6-adenosine methylase in vitro.
Harper, J E; Miceli, S M; Roberts, R J; Manley, J L
1990-01-01
N6-adenosine methylation is a frequent modification of mRNAs and their precursors, but little is known about the mechanism of the reaction or the function of the modification. To explore these questions, we developed conditions to examine N6-adenosine methylase activity in HeLa cell nuclear extracts. Transfer of the methyl group from S-[3H-methyl]-adenosylmethionine to unlabeled random copolymer RNA substrates of varying ribonucleotide composition revealed a substrate specificity consistent with a previously deduced consensus sequence, Pu[G>A]AC[A/C/U]. 32P-labeled RNA substrates of defined sequence were used to examine the minimum sequence requirements for methylation. Each RNA was 20 nucleotides long, and contained either the core consensus sequence GGACU, or some variation of this sequence. RNAs containing GGACU, either in single or multiple copies, were good substrates for methylation, whereas RNAs containing single base substitutions within the GGACU sequence gave dramatically reduced methylation. These results demonstrate that the N6-adenosine methylase has a strict sequence specificity, and that there is no requirement for extended sequences or secondary structures for methylation. Recognition of this sequence does not require an RNA component, as micrococcal nuclease pretreatment of nuclear extracts actually increased methylation efficiency. PMID:2216767
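The consensus quoted in this abstract (a purine, then G or A with G preferred, then A, C, and one of A/C/U) can be scanned for with a regular expression. The sketch below is illustrative: the 20-mer is invented for the example, not one of the substrates used in the study, and the pattern records only which bases are allowed, not the G-over-A preference.

```python
import re

# Purine, then G or A, then A, C, and A/C/U.
# The lookahead wrapper allows overlapping sites to be reported.
M6A_CONSENSUS = re.compile(r"(?=([AG][GA]AC[ACU]))")

def find_consensus_sites(rna):
    """Return (0-based start, matched 5-mer) for each consensus site."""
    rna = rna.upper().replace("T", "U")  # accept DNA-style input as well
    return [(m.start(), m.group(1)) for m in M6A_CONSENSUS.finditer(rna)]

substrate = "CUUGGACUCUUGGUCUCUUC"  # hypothetical 20-mer with one GGACU core
mutant = "CUUGGUCUCUUGGUCUCUUC"     # same 20-mer with A to U inside the core

sites = find_consensus_sites(substrate)
mutant_sites = find_consensus_sites(mutant)
```

The single substitution removes the only match, which mirrors, in a purely string-matching sense, the sharp loss of methylation reported for single-base variants of GGACU.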
Baig, Tayyba T.; Lanchy, Jean-Marc; Lodmell, J. Stephen
2009-01-01
The packaging signal (ψ) of human immunodeficiency virus type 2 (HIV-2) is present in the 5′ noncoding region of RNA and contains a 10-nucleotide palindrome (pal; 5′-392-GGAGUGCUCC) located upstream of the dimerization signal stem-loop 1 (SL1). pal has been shown to be functionally important in vitro and in vivo. We previously showed that the 3′ side of pal (GCUCC-3′) is involved in base-pairing interactions with a sequence downstream of SL1 to make an extended SL1, which is important for replication in vivo and the regulation of dimerization in vitro. However, the role of the 5′ side of pal (5′-GGAGU) was less clear. Here, we characterized this role using an in vivo SELEX approach. We produced a population of HIV-2 DNA genomes with random sequences within the 5′ side of pal and transfected these into COS-7 cells. Viruses from COS-7 cells were used to infect C8166 permissive cells. After several weeks of serial passage in C8166 cells, surviving viruses were sequenced. On the 5′ side of pal there was a striking convergence toward a GGRGN consensus sequence. Individual clones with consensus and nonconsensus sequences were tested in infectivity and packaging assays. Analysis of individuals that diverged from the consensus sequence showed normal viral RNA and protein synthesis but had replication defects and impaired RNA packaging. These findings clearly indicate that the GGRG motif is essential for viral replication and genomic RNA packaging. PMID:18971263
Evolutionary divergence in the catalytic activity of the CAM-1, ROR1 and ROR2 kinase domains.
Bainbridge, Travis W; DeAlmeida, Venita I; Izrael-Tomasevic, Anita; Chalouni, Cécile; Pan, Borlan; Goldsmith, Joshua; Schoen, Alia P; Quiñones, Gabriel A; Kelly, Ryan; Lill, Jennie R; Sandoval, Wendy; Costa, Mike; Polakis, Paul; Arnott, David; Rubinfeld, Bonnee; Ernst, James A
2014-01-01
Receptor tyrosine kinase-like orphan receptors (ROR) 1 and 2 are atypical members of the receptor tyrosine kinase (RTK) family and have been associated with several human diseases. The vertebrate RORs contain an ATP binding domain that deviates from the consensus amino acid sequence, although the impact of this deviation on catalytic activity is not known and the kinase function of these receptors remains controversial. Recently, ROR2 was shown to signal through a Wnt responsive, β-catenin independent pathway and suppress a canonical Wnt/β-catenin signal. In this work we demonstrate that both ROR1 and ROR2 kinase domains are catalytically deficient while CAM-1, the C. elegans homolog of ROR, has an active tyrosine kinase domain, suggesting a divergence in the signaling processes of the ROR family during evolution. In addition, we show that substitution of the non-consensus residues from ROR1 or ROR2 into CAM-1 and MuSK markedly reduce kinase activity, while restoration of the consensus residues in ROR does not restore robust kinase function. We further demonstrate that the membrane-bound extracellular domain alone of either ROR1 or ROR2 is sufficient for suppression of canonical Wnt3a signaling, and that this domain can also enhance Wnt5a suppression of Wnt3a signaling. Based on these data, we conclude that human ROR1 and ROR2 are RTK-like pseudokinases.
CapZyme-Seq Comprehensively Defines Promoter-Sequence Determinants for RNA 5' Capping with NAD.
Vvedenskaya, Irina O; Bird, Jeremy G; Zhang, Yuanchao; Zhang, Yu; Jiao, Xinfu; Barvík, Ivan; Krásný, Libor; Kiledjian, Megerditch; Taylor, Deanne M; Ebright, Richard H; Nickels, Bryce E
2018-05-03
Nucleoside-containing metabolites such as NAD+ can be incorporated as 5' caps on RNA by serving as non-canonical initiating nucleotides (NCINs) for transcription initiation by RNA polymerase (RNAP). Here, we report CapZyme-seq, a high-throughput-sequencing method that employs NCIN-decapping enzymes NudC and Rai1 to detect and quantify NCIN-capped RNA. By combining CapZyme-seq with multiplexed transcriptomics, we determine efficiencies of NAD+ capping by Escherichia coli RNAP for ∼16,000 promoter sequences. The results define preferred transcription start site (TSS) positions for NAD+ capping and define a consensus promoter sequence for NAD+ capping: HRRASWW (TSS underlined). By applying CapZyme-seq to E. coli total cellular RNA, we establish that sequence determinants for NCIN capping in vivo match the NAD+-capping consensus defined in vitro, and we identify and quantify NCIN-capped small RNAs (sRNAs). Our findings define the promoter-sequence determinants for NCIN capping with NAD+ and provide a general method for analysis of NCIN capping in vitro and in vivo. Copyright © 2018 Elsevier Inc. All rights reserved.
Glynn, Neil C; Comstock, Jack C; Sood, Sushma G; Dang, Phat M; Chaparro, Jose X
2008-01-01
Resistance gene analogues (RGAs) have been isolated from many crops and offer potential in breeding for disease resistance through marker-assisted selection, either as closely linked or as perfect markers. Many R-gene sequences contain kinase domains, and indeed kinase genes have been reported as being proximal to R-genes, making kinase analogues an additionally promising target. The first step towards utilizing RGAs as markers for disease resistance is isolation and characterization of the sequences. Sugarcane clone US01-1158 was identified as resistant to yellow leaf caused by the sugarcane yellow leaf virus (SCYLV) and moderately resistant to rust caused by Puccinia melanocephala Sydow & Sydow. Degenerate primers that had previously proved useful for isolating RGAs and kinase analogues in wheat and soybean were used to amplify DNA from sugarcane (Saccharum spp.) clone US01-1158. Sequences generated from 1512 positive clones were assembled into 134 contigs of between two and 105 sequences. Comparison of the contig consensuses with the NCBI sequence database using BLASTx showed that 20 had sequence homology to nucleotide binding site and leucine rich repeat (NBS-LRR) RGAs, and eight to kinase genes. Alignment of the deduced amino acid sequences with similar sequences from the NCBI database allowed the identification of several conserved domains. The alignment and resulting phenetic tree showed that many of the sequences had greater similarity to sequences from other species than to one another. The use of degenerate primers is a useful method for isolating novel sugarcane RGA and kinase gene analogues. Further studies are needed to evaluate the role of these genes in disease resistance.
Dickinson, Louise; Ahmed, Hashim U; Allen, Clare; Barentsz, Jelle O; Carey, Brendan; Futterer, Jurgen J; Heijmink, Stijn W; Hoskin, Peter J; Kirkham, Alex; Padhani, Anwar R; Persad, Raj; Puech, Philippe; Punwani, Shonit; Sohaib, Aslam S; Tombal, Bertrand; Villers, Arnauld; van der Meulen, Jan; Emberton, Mark
2011-04-01
Multiparametric magnetic resonance imaging (mpMRI) may have a role in detecting clinically significant prostate cancer in men with raised serum prostate-specific antigen levels. Variations in technique and the interpretation of images have contributed to inconsistency in its reported performance characteristics. Our aim was to make recommendations on a standardised method for the conduct, interpretation, and reporting of prostate mpMRI for prostate cancer detection and localisation. A consensus meeting of 16 European prostate cancer experts was held; it followed the UCLA-RAND Appropriateness Method and was facilitated by an independent chair. Before the meeting, 520 items were scored for "appropriateness" by panel members; the items were then discussed face to face and rescored. Agreement was reached in 67% of 260 items related to imaging sequence parameters. T2-weighted, dynamic contrast-enhanced, and diffusion-weighted MRI were the key sequences incorporated into the minimum requirements. Consensus was also reached on 54% of 260 items related to image interpretation and reporting, including features of malignancy on individual sequences. A 5-point scale was agreed on for communicating the probability of malignancy, with a minimum of 16 prostatic regions of interest, to include a pictorial representation of suspicious foci. Limitations relate to consensus methodology. Dominant personalities are known to affect the opinions of the group and were countered by a neutral chairperson. Consensus was reached on a number of areas related to the conduct, interpretation, and reporting of mpMRI for the detection, localisation, and characterisation of prostate cancer. Before optimal dissemination of this technology, these outcomes will require formal validation in prospective trials. Copyright © 2010 European Association of Urology. Published by Elsevier B.V. All rights reserved.
A filtering method to generate high quality short reads using illumina paired-end technology.
Eren, A Murat; Vineis, Joseph H; Morrison, Hilary G; Sogin, Mitchell L
2013-01-01
Consensus between independent reads improves the accuracy of genome and transcriptome analyses; however, lack of consensus between very similar sequences in metagenomic studies can and often does represent natural variation of biological significance. Machine-assigned quality scores, commonly used on next-generation platforms, do not necessarily correlate with accuracy. Here, we describe using the overlap of paired-end, short sequence reads to identify error-prone reads in marker gene analyses and their contribution to spurious OTUs following clustering analysis using QIIME. Our approach can also reduce error in shotgun sequencing data generated from libraries with small, tightly constrained insert sizes. The open-source implementation of this algorithm in the Python programming language, with user instructions, can be obtained from https://github.com/meren/illumina-utils.
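The idea of using paired-end overlap to flag error-prone reads can be sketched as follows. This is a simplified illustration under stated assumptions, not the illumina-utils implementation: the function names, the fixed known overlap length, and the zero-mismatch default are all hypothetical choices made for the example.

```python
# Translation table for reverse-complementing DNA (N stays N).
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]

def overlap_mismatches(read1: str, read2: str, overlap: int) -> int:
    """Count disagreements between the last `overlap` bases of read1 and the
    first `overlap` bases of the reverse-complemented read2."""
    if overlap <= 0:
        raise ValueError("overlap must be positive")
    tail = read1[-overlap:]
    head = reverse_complement(read2)[:overlap]
    return sum(a != b for a, b in zip(tail, head))

def keep_pair(read1: str, read2: str, overlap: int, max_mismatches: int = 0) -> bool:
    """Keep a read pair only if the two reads agree within the overlap."""
    return overlap_mismatches(read1, read2, overlap) <= max_mismatches

# Toy 14-base fragment sequenced with two 10-base reads, so the overlap is 6.
fragment = "ACGTTGCAAGGCTA"
read1 = fragment[:10]
read2 = reverse_complement(fragment[-10:])
print(keep_pair(read1, read2, overlap=6))  # True

# Introduce one error in the overlapping part of read1.
read1_err = read1[:-1] + "T"
print(keep_pair(read1_err, read2, overlap=6))  # False
```

A real pipeline would also have to locate the overlap rather than assume its length; that step is omitted here.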
Ability of HIV-1 Nef to downregulate CD4 and HLA class I differs among viral subtypes
2013-01-01
Background The highly genetically diverse HIV-1 group M subtypes may differ in their biological properties. Nef is an important mediator of viral pathogenicity; however, to date, a comprehensive inter-subtype comparison of Nef in vitro function has not been undertaken. Here, we investigate two of Nef’s most well-characterized activities, CD4 and HLA class I downregulation, for clones obtained from 360 chronic patients infected with HIV-1 subtypes A, B, C or D. Results Single HIV-1 plasma RNA Nef clones were obtained from N=360 antiretroviral-naïve, chronically infected patients from Africa and North America: 96 (subtype A), 93 (B), 85 (C), and 86 (D). Nef clones were expressed by transfection in an immortalized CD4+ T-cell line. CD4 and HLA class I surface levels were assessed by flow cytometry. Nef expression was verified by Western blot. Subset analyses and multivariable linear regression were used to adjust for differences in age, sex and clinical parameters between cohorts. Consensus HIV-1 subtype B and C Nef sequences were synthesized and functionally assessed. Exploratory sequence analyses were performed to identify potential genotypic correlates of Nef function. Subtype B Nef clones displayed marginally greater CD4 downregulation activity (p = 0.03) and markedly greater HLA class I downregulation activity (p < 0.0001) than clones from other subtypes. Subtype C Nefs displayed the lowest in vitro functionality. Inter-subtype differences in HLA class I downregulation remained statistically significant after controlling for differences in age, sex, and clinical parameters (p < 0.0001). The synthesized consensus subtype B Nef showed higher activities compared to consensus C Nef, which was most pronounced in cells expressing lower protein levels. 
Nef clones exhibited substantial inter-subtype diversity: cohort consensus residues differed at 25% of codons, while a similar proportion of codons exhibited substantial inter-subtype differences in major variant frequency. These amino acids, along with others identified in intra-subtype analyses, represent candidates for mediating inter-subtype differences in Nef function. Conclusions Results support a functional hierarchy of subtype B > A/D > C for Nef-mediated CD4 and HLA class I downregulation. The mechanisms underlying these differences and their relevance to HIV-1 pathogenicity merit further investigation. PMID:24041011
Hulot, Sandrine L.; Korber, Bette; Giorgi, Elena E.; Vandergrift, Nathan; Saunders, Kevin O.; Balachandran, Harikrishnan; Mach, Linh V.; Lifton, Michelle A.; Pantaleo, Giuseppe; Tartaglia, Jim; Phogat, Sanjay; Jacobs, Bertram; Kibler, Karen; Perdiguero, Beatriz; Gomez, Carmen E.; Esteban, Mariano; Rosati, Margherita; Felber, Barbara K.; Pavlakis, George N.; Parks, Robert; Lloyd, Krissey; Sutherland, Laura; Scearce, Richard; Letvin, Norman L.; Seaman, Michael S.; Alam, S. Munir; Montefiori, David; Liao, Hua-Xin; Haynes, Barton F.
2015-01-01
ABSTRACT An effective human immunodeficiency virus type 1 (HIV-1) vaccine must induce protective antibody responses, as well as CD4+ and CD8+ T cell responses, that can be effective despite extraordinary diversity of HIV-1. The consensus and mosaic immunogens are complete but artificial proteins, computationally designed to elicit immune responses with improved cross-reactive breadth, to attempt to overcome the challenge of global HIV diversity. In this study, we have compared the immunogenicity of a transmitted-founder (T/F) B clade Env (B.1059), a global group M consensus Env (Con-S), and a global trivalent mosaic Env protein in rhesus macaques. These antigens were delivered using a DNA prime-recombinant NYVAC (rNYVAC) vector and Env protein boost vaccination strategy. While Con-S Env was a single sequence, mosaic immunogens were a set of three Envs optimized to include the most common forms of potential T cell epitopes. Both Con-S and mosaic sequences retained common amino acids encompassed by both antibody and T cell epitopes and were central to globally circulating strains. Mosaics and Con-S Envs expressed as full-length proteins bound well to a number of neutralizing antibodies with discontinuous epitopes. Also, both consensus and mosaic immunogens induced significantly higher gamma interferon (IFN-γ) enzyme-linked immunosorbent spot assay (ELISpot) responses than B.1059 immunogen. Immunization with these proteins, particularly Con-S, also induced significantly higher neutralizing antibodies to viruses than B.1059 Env, primarily to tier 1 viruses. Both Con-S and mosaics stimulated more potent CD8-T cell responses against heterologous Envs than did B.1059. Both antibody and cellular data from this study strengthen the concept of using in silico-designed centralized immunogens for global HIV-1 vaccine development strategies. 
IMPORTANCE There is an increasing appreciation for the importance of vaccine-induced anti-Env antibody responses for preventing HIV-1 acquisition. This nonhuman primate study demonstrates that in silico-designed global HIV-1 immunogens, designed for a human clinical trial, are capable of eliciting not only T lymphocyte responses but also potent anti-Env antibody responses. PMID:25855741
Bernardes, Juliana; Zaverucha, Gerson; Vaquero, Catherine; Carbone, Alessandra
2016-01-01
Traditional protein annotation methods describe known domains with probabilistic models representing consensus among homologous domain sequences. However, when relevant signals become too weak to be identified by a global consensus, attempts at annotation fail. Here we address the fundamental question of domain identification for highly divergent proteins. By using high performance computing, we demonstrate that the limits of state-of-the-art annotation methods can be bypassed. We design a new strategy based on the observation that many structural and functional protein constraints are not globally conserved through all species but might be locally conserved in separate clades. We propose a novel exploitation of the large amount of data available: 1. for each known protein domain, several probabilistic clade-centered models are constructed from a large and differentiated panel of homologous sequences, 2. a decision-making protocol combines outcomes obtained from multiple models, 3. a multi-criteria optimization algorithm finds the most likely protein architecture. The method is evaluated for domain and architecture prediction over several datasets and statistical testing hypotheses. Its performance is compared against HMMScan and HHblits, two widely used search methods based on sequence-profile and profile-profile comparison. Due to their closeness to actual protein sequences, clade-centered models are shown to be more specific and functionally predictive than the broadly used consensus models. Based on them, we improved annotation of Plasmodium falciparum protein sequences on a scale not previously possible. We successfully predict at least one domain for 72% of P. falciparum proteins against 63% achieved previously, corresponding to a 30% improvement over the total number of Pfam domain predictions on the whole genome. 
The method is applicable to any genome and opens new avenues to tackle evolutionary questions such as the reconstruction of ancient domain duplications, the reconstruction of the history of protein architectures, and the estimation of protein domain age. Website and software: http://www.lcqb.upmc.fr/CLADE. PMID:27472895
2011-01-01
Background Big sagebrush (Artemisia tridentata) is one of the most widely distributed and ecologically important shrub species in western North America. This species serves as a critical habitat and food resource for many animals and invertebrates. Habitat loss due to a combination of disturbances followed by establishment of invasive plant species is a serious threat to big sagebrush ecosystem sustainability. Lack of genomic data has limited our understanding of the evolutionary history and ecological adaptation in this species. Here, we report on the sequencing of expressed sequence tags (ESTs) and detection of single nucleotide polymorphism (SNP) and simple sequence repeat (SSR) markers in subspecies of big sagebrush. Results cDNA of A. tridentata sspp. tridentata and vaseyana were normalized and sequenced using the 454 GS FLX Titanium pyrosequencing technology. Assembly of the reads resulted in 20,357 contig consensus sequences in ssp. tridentata and 20,250 contigs in ssp. vaseyana. A BLASTx search against the non-redundant (NR) protein database using 29,541 consensus sequences obtained from a combined assembly resulted in 21,436 sequences with significant blast alignments (≤ 1e-15). A total of 20,952 SNPs and 119 polymorphic SSRs were detected between the two subspecies. SNPs were validated through various methods including sequence capture. Validation of SNPs in different individuals uncovered a high level of nucleotide variation in EST sequences. EST sequences of a third, tetraploid subspecies (ssp. wyomingensis) obtained by Illumina sequencing were mapped to the consensus sequences of the combined 454 EST assembly. Approximately one-third of the SNPs between sspp. tridentata and vaseyana identified in the combined assembly were also polymorphic within the two geographically distant ssp. wyomingensis samples. Conclusion We have produced a large EST dataset for Artemisia tridentata, which contains a large sample of the big sagebrush leaf transcriptome. 
SNP mapping among the three subspecies suggest the origin of ssp. wyomingensis via mixed ancestry. A large number of SNP and SSR markers provide the foundation for future research to address questions in big sagebrush evolution, ecological genetics, and conservation using genomic approaches. PMID:21767398
Hyder, S M; Stancel, G M; Nawaz, Z; McDonnell, D P; Loose-Mitchell, D S
1992-09-05
We have used transient transfection assays with reporter plasmids expressing chloramphenicol acetyltransferase, linked to regions of mouse c-fos, to identify a specific estrogen response element (ERE) in this protooncogene. This element is located in the untranslated 3'-flanking region of the c-fos gene, 5 kilobases (kb) downstream from the c-fos promoter and 1.5 kb downstream of the poly(A) signal. This element confers estrogen responsiveness to chloramphenicol acetyltransferase reporters linked to both the herpes simplex virus thymidine kinase promoter and the homologous c-fos promoter. Deletion analysis localized the response element to a 200-base pair fragment which contains the element GGTCACCACAGCC that resembles the consensus ERE sequence GGTCACAGTGACC originally identified in Xenopus vitellogenin A2 gene. A synthetic 36-base pair oligodeoxynucleotide containing this c-fos sequence conferred estrogen inducibility to the thymidine kinase promoter. The corresponding sequence also induced reporter activity when present in the c-fos gene fragment 3 kb from the thymidine kinase promoter. Gel-shift experiments demonstrated that synthetic oligonucleotides containing either the consensus ERE or the c-fos element bind human estrogen receptor obtained from a yeast expression system. However, the mobility of the shifted band is faster for the fos-ERE-complex than the consensus ERE complex suggesting that the three-dimensional structure of the protein-DNA complexes is different or that other factors are differentially involved in the two reactions. When the 5'-GGTCA sequence present in the c-fos ERE is mutated to 5'-TTTCA, transcriptional activation and receptor binding activities are both lost. Mutation of the CAGCC-3' element corresponding to the second half-site of the c-fos sequence also led to the loss of receptor binding activity, suggesting that both half-sites of this element are involved in this function. 
The estrogen induction mediated by either the c-fos or the consensus ERE was blunted by the antiestrogen tamoxifen. Based on these studies, we believe the 3'-fos ERE sequence we have identified may be a major cis-acting element involved in the physiological regulation of the gene by estrogens in vivo.
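To make the comparison between the two elements concrete, the short sketch below lines up the consensus ERE and the c-fos element quoted in the abstract and checks how closely each 3' half-site matches the reverse complement of the shared 5'-GGTCA half-site. The helper names are illustrative assumptions; only the two 13-base sequences come from the text.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]

def mismatch_positions(a: str, b: str) -> list[int]:
    """Return 1-based positions at which two equal-length sequences differ."""
    return [i + 1 for i, (x, y) in enumerate(zip(a, b)) if x != y]

consensus_ere = "GGTCACAGTGACC"  # Xenopus vitellogenin A2 consensus, from the abstract
fos_ere = "GGTCACCACAGCC"        # c-fos element, from the abstract

# Positions where the c-fos element departs from the consensus.
print(mismatch_positions(consensus_ere, fos_ere))  # [7, 8, 9, 10, 11]

# A perfect inverted repeat has a 3' half-site equal to the reverse
# complement of the 5' half-site (GGTCA -> TGACC).
ideal_3prime = reverse_complement(consensus_ere[:5])
print(consensus_ere[-5:] == ideal_3prime)                  # True
print(mismatch_positions(fos_ere[-5:], ideal_3prime))      # [1, 2, 3]
```

Under this simple comparison the c-fos element keeps the 5' half-site intact and carries an imperfect 3' half-site (CAGCC), which is consistent with the abstract's description of the element as resembling, not matching, the consensus.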
Banyuls, N; Hernández-Rodríguez, C S; Van Rie, J; Ferré, J
2018-05-15
Vip3 vegetative insecticidal proteins from Bacillus thuringiensis are an important tool for crop protection against caterpillar pests in IPM strategies. While there is wide consensus on their general mode of action, the details of their mode of action are not completely elucidated and their structure remains unknown. In this work the alanine scanning technique was performed on 558 out of the total of 788 amino acids of the Vip3Af1 protein. Of the 558 residue substitutions, 19 impaired protein expression and another 19 severely compromised the insecticidal activity against Spodoptera frugiperda. The latter 19 substitutions mainly clustered in two regions of the protein sequence (amino acids 167-272 and amino acids 689-741). Most of these substitutions also decreased the activity against Agrotis segetum. The characterisation of the sensitivity to proteases of the mutant proteins displaying decreased insecticidal activity revealed 6 different band patterns as evaluated by SDS-PAGE. The study of the intrinsic fluorescence of most selected mutants revealed only slight shifts in the emission peak, likely indicating only minor changes in the tertiary structure. An in silico modelled 3D structure of Vip3Af1 is proposed for the first time.
Sequences of Zika Virus Genomes from a Pediatric Cohort in Nicaragua.
Oldfield, Lauren M; Fedorova, Nadia; Puri, Vinita; Shrivastava, Susmita; Amedeo, Paolo; Durbin, Alan; Rocchi, Iara; Williams, Torrey; Shabman, Reed S; Tan, Gene S; Balmaseda, Angel; Kuan, Guillermina; Saborio, Saira; Gordon, Aubree; Harris, Eva; Pickett, Brett E
2018-06-14
We report here the whole-genome sequence of 11 Zika virus (ZIKV) samples from six pediatric patients in Nicaragua. Serum samples were collected, and ZIKV was isolated in tissue culture. Both serum and virus isolates were sequenced. The consensus ZIKV genomes are greater than 99% identical to each other. Copyright © 2018 Oldfield et al.
Zhang, Lin-Lin; Tan, Mei-Juan; Liu, Guang-Lei; Chi, Zhe; Wang, Guang-Yuan; Chi, Zhen-Ming
2015-04-01
The INU1 gene encoding an exo-inulinase from the marine-derived yeast Candida membranifaciens subsp. flavinogenie W14-3 was cloned and characterized. It had an open reading frame of 1,536 bp encoding an inulinase. The coding region was not interrupted by any intron. The cloned gene encoded a protein of 512 amino acid residues with a putative signal peptide of 23 amino acids and a calculated molecular mass of 57.8 kDa. The protein sequence deduced from the inulinase gene contained the inulinase consensus sequences (WMNDPNGL), (RDP), ECP FS and Q. The protein also had six conserved putative N-glycosylation sites. The deduced inulinase from the yeast strain W14-3 was found to be closely related to that from Candida kutaonensis sp. nov. KRF1, Kluyveromyces marxianus, and Cryptococcus aureus G7a. The inulinase gene with its signal peptide encoding sequence was subcloned into the pMIRSC11 expression vector and expressed in Saccharomyces sp. W0. The resulting recombinant yeast strain W14-3-INU-112 could produce 16.8 U/ml of inulinase activity and 12.5 % (v/v) ethanol from 250 g/l of inulin within 168 h. Monosaccharides were detected after the hydrolysis of inulin with the crude inulinase (the yeast culture). All the results indicated that the cloned gene and the recombinant yeast strain W14-3-INU-112 had potential applications in biotechnology.
Trung, Le Quang; VAN Puyvelde, Karolien; Triest, Ludwig
2008-03-01
Consensus primers, based on exon sequences of the cyp73 gene family coding for cinnamate 4-hydroxylase (C4H) of the lignin biosynthesis pathway, were designed for the tetraploid willow species Salix alba and Salix fragilis. Diagnostic alleles at species level were observed among introns of three cyp73 genes and allowed unambiguous detection of the first generation and introgressed hybrids in populations. Progeny analysis of a female S. alba with a male introgressed hybrid confirmed the codominant inheritance of each intron. Sequences of the diagnostic alleles of both species were similar to those found in the hybrids. © 2007 The Authors.
2013-01-01
Background Cucumber is an important vegetable crop that is susceptible to many pathogens, but no disease resistance (R) genes have been cloned. The availability of whole genome sequences provides an excellent opportunity for systematic identification and characterization of the nucleotide binding and leucine-rich repeat (NB-LRR) type R gene homolog (RGH) sequences in the genome. Cucumber has a very narrow genetic base making it difficult to construct high-density genetic maps. Development of a consensus map by synthesizing information from multiple segregating populations is a method of choice to increase marker density. As such, the objectives of the present study were to identify and characterize NB-LRR type RGHs, and to develop a high-density, integrated cucumber genetic-physical map anchored with RGH loci. Results From the Gy14 draft genome, 70 NB-containing RGHs were identified and characterized. Most RGHs were in clusters with uneven distribution across seven chromosomes. In silico analysis indicated that all 70 RGHs had EST support for gene expression. Phylogenetic analysis classified 58 RGHs into two clades: CNL and TNL. Comparative analysis revealed high-degree sequence homology and synteny in chromosomal locations of these RGH members between the cucumber and melon genomes. Fifty-four molecular markers were developed to delimit 67 of the 70 RGHs, which were integrated into a genetic map through linkage analysis. A 1,681-locus cucumber consensus map including 10 gene loci and spanning 730.0 cM in seven linkage groups was developed by integrating three component maps with a bin-mapping strategy. Physically, 308 scaffolds with 193.2 Mbp total DNA sequences were anchored onto this consensus map that covered 52.6% of the 367 Mbp cucumber genome. Conclusions Cucumber contains relatively few NB-LRR RGHs that are clustered and unevenly distributed in the genome. 
All RGHs seem to be transcribed and shared significant sequence homology and synteny with the melon genome suggesting conservation of these RGHs in the Cucumis lineage. The 1,681-locus consensus genetic-physical map developed and the RGHs identified and characterized herein are valuable genomics resources that may have many applications such as quantitative trait loci identification, map-based gene cloning, association mapping, marker-assisted selection, as well as assembly of a more complete cucumber genome. PMID:23531125
Nyaga, Martin M.; Stucker, Karla M.; Esona, Mathew D.; Jere, Khuzwayo C.; Mwinyi, Bakari; Shonhai, Annie; Tsolenyanu, Enyonam; Mulindwa, Augustine; Chibumbya, Julia N.; Adolfine, Hokororo; Halpin, Rebecca A.; Roy, Sunando; Stockwell, Timothy B.; Berejena, Chipo; Seheri, Mapaseka L.; Mwenda, Jason M.; Steele, A. Duncan; Wentworth, David E.
2018-01-01
Group A rotaviruses (RVAs) with distinct G and P genotype combinations have been reported globally. We report the genome composition and possible origin of seven G8P[4] and five G2P[4] human RVA strains based on the genetic evolution of all 11 genome segments at the nucleotide level. Twelve RVA ELISA positive stool samples collected in the representative countries of Eastern, Southern and West Africa during the 2007–2012 surveillance seasons were subjected to sequencing using the Ion Torrent PGM and Illumina MiSeq platforms. A reference-based assembly was performed using CLC Bio’s clc_ref_assemble_long program, and full-genome consensus sequences were obtained. With the exception of the neutralising antigen, VP7, all study strains exhibited the DS-1-like genome constellation (P[4]-I2-R2-C2-M2-A2-N2-T2-E2-H2) and clustered phylogenetically with reference strains having a DS-1-like genetic backbone. Comparison of the nucleotide and amino acid sequences with selected global cognate genome segments revealed nucleotide and amino acid sequence identities of 81.7–100 % and 90.6–100 %, respectively, with NSP4 gene segment showing the most diversity among the strains. Bayesian analyses of all gene sequences to estimate the time of divergence of the lineage indicated that divergence times ranged from 16 to 44 years, except for the NSP4 gene where the lineage seemed to arise in the more distant past at an estimated 203 years ago. However, the long-term effects of changes found within the NSP4 genome segment should be further explored, and thus we recommend continued whole-genome analyses from larger sample sets to determine the evolutionary mechanisms of the DS-1-like strains collected in Africa. PMID:24952422
Godkin, A; Friede, T; Davenport, M; Stevanovic, S; Willis, A; Jewell, D; Hill, A; Rammensee, H G
1997-06-01
HLA-DQ8 (A1*0301, B1*0302) and -DQ2 (A1*0501, B1*0201) are both associated with diseases such as insulin-dependent diabetes mellitus and coeliac disease. We used the technique of pool sequencing to look at the requirements of peptides binding to HLA-DQ8, and combined these data with naturally sequenced ligands and in vitro binding assays to describe a novel motif for HLA-DQ8. The motif, which has the same basic format as many HLA-DR molecules, consists of four or five anchor regions, in the positions from the N-terminus of the binding core of n, n + 3, n + 5/6 and n + 8, i.e. P1, P4, P6/7 and P9. P1 and P9 require negative or polar residues, with mainly aliphatic residues at P4 and P6/7. The features of the HLA-DQ8 motif were then compared to a pool sequence of peptides eluted from HLA-DQ2. A consensus motif for the binding of a common peptide which may be involved in disease pathogenesis is described. Neither of the disease-associated alleles HLA-DQ2 and -DQ8 have Asp at position 57 of the beta-chain. This Asp, if present, may form a salt bridge with an Arg at position 79 of the alpha-chain and so alter the binding specificity of P9. HLA-DQ2 and -DQ8 both appear to prefer negatively charged amino acids at P9. In contrast, HLA-DQ7 (A1*0301, B1*0301), which is not associated with diabetes, has Asp at beta 57, allowing positively charged amino acids at P9. This analysis of the sequence features of DQ-binding peptides suggests molecular characteristics which may be useful to predict epitopes involved in disease pathogenesis.
Genetic dissection of the consensus sequence for the class 2 and class 3 flagellar promoters
Wozniak, Christopher E.; Hughes, Kelly T.
2008-01-01
Summary Computational searches for DNA binding sites often utilize consensus sequences. These search models make assumptions that the frequency of a base pair in an alignment relates to the base pair’s importance in binding and presume that base pairs contribute independently to the overall interaction with the DNA binding protein. These two assumptions have generally been found to be accurate for DNA binding sites. However, these assumptions are often not satisfied for promoters, which are involved in additional steps in transcription initiation after RNA polymerase has bound to the DNA. To test these assumptions for the flagellar regulatory hierarchy, class 2 and class 3 flagellar promoters were randomly mutagenized in Salmonella. Important positions were then saturated for mutagenesis and compared to scores calculated from the consensus sequence. Double mutants were constructed to determine how mutations combined for each promoter type. Mutations in the binding site for FlhD4C2, the activator of class 2 promoters, better satisfied the assumptions for the binding model than did mutations in the class 3 promoter, which is recognized by the σ28 transcription factor. These in vivo results indicate that the activator sites within flagellar promoters can be modeled using simple assumptions but that the DNA sequences recognized by the flagellar sigma factor require more complex models. PMID:18486950
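The two assumptions described above (base frequency reflects importance, and positions contribute independently) are the ones built into a simple log-odds position weight matrix. The sketch below is a generic illustration of such a model, not the scoring scheme used in the study: the four aligned "sites", the pseudocount, and the uniform background are made-up assumptions chosen only to keep the example small.

```python
import math
from collections import Counter

def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Build a log2-odds matrix: one dict of base -> score per alignment column."""
    total = len(sites) + 4 * pseudocount
    pwm = []
    for i in range(len(sites[0])):
        counts = Counter(site[i] for site in sites)
        pwm.append({
            base: math.log2(((counts[base] + pseudocount) / total) / background)
            for base in "ACGT"
        })
    return pwm

def score(pwm, seq):
    """Additive score: each position contributes independently of the others."""
    return sum(column[base] for column, base in zip(pwm, seq))

# Hypothetical aligned binding sites (not flagellar promoter data).
sites = ["TAAA", "TAAT", "TAAA", "CAAA"]
pwm = build_pwm(sites)

wild_type = "TAAA"
single_1 = "GAAA"  # mutation at position 1
single_4 = "TAAC"  # mutation at position 4
double = "GAAC"    # both mutations together

d1 = score(pwm, single_1) - score(pwm, wild_type)
d4 = score(pwm, single_4) - score(pwm, wild_type)
d_both = score(pwm, double) - score(pwm, wild_type)

# Under the independence assumption the double-mutant effect is exactly
# the sum of the single-mutant effects.
print(math.isclose(d_both, d1 + d4))  # True
```

Comparing double mutants against the sum of single-mutant effects is the kind of check that tells whether an additive model of this sort is adequate for a given promoter class.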
Lucas, J.N.; Straume, T.; Bogen, K.T.
1998-03-24
A method is provided for detecting nucleic acid sequence aberrations using two immobilization steps. According to the method, a nucleic acid sequence aberration is detected by detecting nucleic acid sequences having both a first nucleic acid sequence type (e.g., from a first chromosome) and a second nucleic acid sequence type (e.g., from a second chromosome), the presence of the first and the second nucleic acid sequence type on the same nucleic acid sequence indicating the presence of a nucleic acid sequence aberration. In the method, immobilization of a first hybridization probe is used to isolate a first set of nucleic acids in the sample which contain the first nucleic acid sequence type. Immobilization of a second hybridization probe is then used to isolate a second set of nucleic acids from within the first set of nucleic acids which contain the second nucleic acid sequence type. The second set of nucleic acids are then detected, their presence indicating the presence of a nucleic acid sequence aberration. 14 figs.
Lucas, Joe N.; Straume, Tore; Bogen, Kenneth T.
1998-01-01
A method is provided for detecting nucleic acid sequence aberrations using two immobilization steps. According to the method, a nucleic acid sequence aberration is detected by detecting nucleic acid sequences having both a first nucleic acid sequence type (e.g., from a first chromosome) and a second nucleic acid sequence type (e.g., from a second chromosome), the presence of the first and the second nucleic acid sequence type on the same nucleic acid sequence indicating the presence of a nucleic acid sequence aberration. In the method, immobilization of a first hybridization probe is used to isolate a first set of nucleic acids in the sample which contain the first nucleic acid sequence type. Immobilization of a second hybridization probe is then used to isolate a second set of nucleic acids from within the first set of nucleic acids which contain the second nucleic acid sequence type. The second set of nucleic acids are then detected, their presence indicating the presence of a nucleic acid sequence aberration.
Method for identifying and quantifying nucleic acid sequence aberrations
Lucas, Joe N.; Straume, Tore; Bogen, Kenneth T.
1998-01-01
A method for detecting nucleic acid sequence aberrations by detecting nucleic acid sequences having both a first and a second nucleic acid sequence type, the presence of the first and second sequence type on the same nucleic acid sequence indicating the presence of a nucleic acid sequence aberration. The method uses a first hybridization probe which includes a nucleic acid sequence that is complementary to a first sequence type and a first complexing agent capable of attaching to a second complexing agent and a second hybridization probe which includes a nucleic acid sequence that selectively hybridizes to the second nucleic acid sequence type over the first sequence type and includes a detectable marker for detecting the second hybridization probe.
Method for identifying and quantifying nucleic acid sequence aberrations
Lucas, J.N.; Straume, T.; Bogen, K.T.
1998-07-21
A method is disclosed for detecting nucleic acid sequence aberrations by detecting nucleic acid sequences having both a first and a second nucleic acid sequence type, the presence of the first and second sequence type on the same nucleic acid sequence indicating the presence of a nucleic acid sequence aberration. The method uses a first hybridization probe which includes a nucleic acid sequence that is complementary to a first sequence type and a first complexing agent capable of attaching to a second complexing agent and a second hybridization probe which includes a nucleic acid sequence that selectively hybridizes to the second nucleic acid sequence type over the first sequence type and includes a detectable marker for detecting the second hybridization probe. 11 figs.
Pasion, S G; Hines, J C; Ou, X; Mahmood, R; Ray, D S
1996-01-01
Gene expression in trypanosomatids appears to be regulated largely at the posttranscriptional level and involves maturation of mRNA precursors by trans splicing of a 39-nucleotide miniexon sequence to the 5' end of the mRNA and cleavage and polyadenylation at the 3' end of the mRNA. To initiate the identification of sequences involved in the periodic expression of DNA replication genes in trypanosomatids, we have mapped splice acceptor sites in the 5' flanking region of the TOP2 gene, which encodes the kinetoplast DNA topoisomerase, and have carried out deletion analysis of this region on a plasmid-encoded TOP2 gene. Block deletions within the 5' untranslated region (UTR) identified two regions (-608 to -388 and -387 to -186) responsible for periodic accumulation of the mRNA. Deletion of one or the other of these sequences had no effect on periodic expression of the mRNA, while deletion of both regions resulted in constitutive expression of the mRNA throughout the cell cycle. Subcloning of these sequences into the 5' UTR of a construct lacking both regions of the TOP2 5' UTR has shown that an octamer consensus sequence present in the 5' UTR of the TOP2, RPA1, and DHFR-TS mRNAs is required for normal cycling of the TOP2 mRNA. Mutation of the consensus octamer sequence in the TOP2 5' UTR in a plasmid construct containing only a single consensus octamer and that shows normal cycling of the plasmid-encoded TOP2 mRNA resulted in substantial reduction of the cycling of the mRNA level. These results imply a negative regulation of TOP2 mRNA during the cell cycle by a mechanism involving redundant elements containing one or more copies of a conserved octamer sequence within the 5' UTR of TOP2 mRNA. PMID:8943327
Pasion, S G; Hines, J C; Ou, X; Mahmood, R; Ray, D S
1996-12-01
Gene expression in trypanosomatids appears to be regulated largely at the posttranscriptional level and involves maturation of mRNA precursors by trans splicing of a 39-nucleotide miniexon sequence to the 5' end of the mRNA and cleavage and polyadenylation at the 3' end of the mRNA. To initiate the identification of sequences involved in the periodic expression of DNA replication genes in trypanosomatids, we have mapped splice acceptor sites in the 5' flanking region of the TOP2 gene, which encodes the kinetoplast DNA topoisomerase, and have carried out deletion analysis of this region on a plasmid-encoded TOP2 gene. Block deletions within the 5' untranslated region (UTR) identified two regions (-608 to -388 and -387 to -186) responsible for periodic accumulation of the mRNA. Deletion of one or the other of these sequences had no effect on periodic expression of the mRNA, while deletion of both regions resulted in constitutive expression of the mRNA throughout the cell cycle. Subcloning of these sequences into the 5' UTR of a construct lacking both regions of the TOP2 5' UTR has shown that an octamer consensus sequence present in the 5' UTR of the TOP2, RPA1, and DHFR-TS mRNAs is required for normal cycling of the TOP2 mRNA. Mutation of the consensus octamer sequence in the TOP2 5' UTR in a plasmid construct containing only a single consensus octamer and that shows normal cycling of the plasmid-encoded TOP2 mRNA resulted in substantial reduction of the cycling of the mRNA level. These results imply a negative regulation of TOP2 mRNA during the cell cycle by a mechanism involving redundant elements containing one or more copies of a conserved octamer sequence within the 5' UTR of TOP2 mRNA.
Logan, Grace; Freimanis, Graham L; King, David J; Valdazo-González, Begoña; Bachanek-Bankowska, Katarzyna; Sanderson, Nicholas D; Knowles, Nick J; King, Donald P; Cottam, Eleanor M
2014-09-30
Next-Generation Sequencing (NGS) is revolutionizing molecular epidemiology by providing new approaches to undertake whole genome sequencing (WGS) in diagnostic settings for a variety of human and veterinary pathogens. Previous sequencing protocols have been subject to biases such as those encountered during PCR amplification and cell culture, or are restricted by the need for large quantities of starting material. We describe here a simple and robust methodology for the generation of whole genome sequences on the Illumina MiSeq. This protocol is specific for foot-and-mouth disease virus (FMDV) or other polyadenylated RNA viruses and circumvents both the use of PCR and the requirement for large amounts of initial template. The protocol was successfully validated using five FMDV positive clinical samples from the 2001 epidemic in the United Kingdom, as well as a panel of representative viruses from all seven serotypes. In addition, this protocol was successfully used to recover 94% of an FMDV genome that had previously been identified as cell culture negative. Genome sequences from three other non-FMDV polyadenylated RNA viruses (EMCV, ERAV, VESV) were also obtained with minor protocol amendments. We calculated that a minimum coverage depth of 22 reads was required to produce an accurate consensus sequence for FMDV O. This was achieved in 5 FMDV/O/UKG isolates and the type O FMDV from the serotype panel with the exception of the 5' genomic termini and area immediately flanking the poly(C) region. We have developed a universal WGS method for FMDV and other polyadenylated RNA viruses. This method works successfully from a limited quantity of starting material and eliminates the requirement for genome-specific PCR amplification. This protocol has the potential to generate consensus-level sequences within a routine high-throughput diagnostic environment.
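The coverage threshold reported above can be illustrated with a minimal sketch of depth-aware consensus calling. This is not the authors' pipeline: the function name `call_consensus`, the pileup representation (one string of observed bases per position), and the choice to mask low-coverage positions with `N` are all assumptions made for illustration. Only the threshold of 22 reads comes from the abstract.

```python
from collections import Counter

# Minimum depth taken from the abstract (reported for FMDV O); everything
# else in this sketch is a hypothetical illustration.
MIN_DEPTH = 22


def call_consensus(pileup, min_depth=MIN_DEPTH):
    """Return a consensus string from per-position base observations.

    pileup: list of strings, one per genome position, each holding the
    bases observed in the reads covering that position.
    Positions covered by fewer than min_depth reads are masked as 'N'.
    """
    consensus = []
    for column in pileup:
        if len(column) < min_depth:
            consensus.append("N")  # too shallow to call reliably
        else:
            base, _count = Counter(column).most_common(1)[0]
            consensus.append(base)  # majority base at this position
    return "".join(consensus)


# Toy pileup: depths 30, 25, 10 and 22.
pileup = ["A" * 30, "C" * 20 + "T" * 5, "G" * 10, "T" * 21 + "C"]
print(call_consensus(pileup))
```

In this toy input the third position has only 10 reads and is therefore masked, which mirrors how under-covered regions (such as the genomic termini mentioned in the abstract) would drop out of a consensus.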
Duda, Anja; Stange, Annett; Lüftenegger, Daniel; Stanke, Nicole; Westphal, Dana; Pietschmann, Thomas; Eastman, Scott W; Linial, Maxine L; Rethwilm, Axel; Lindemann, Dirk
2004-12-01
Analogous to cellular glycoproteins, viral envelope proteins contain N-terminal signal sequences responsible for targeting them to the secretory pathway. The prototype foamy virus (PFV) envelope (Env) shows a highly unusual biosynthesis. Its precursor protein has a type III membrane topology with both the N and C termini located in the cytoplasm. Coexpression of FV glycoprotein and interaction of its leader peptide (LP) with the viral capsid is essential for viral particle budding and egress. Processing of PFV Env into the particle-associated LP, surface (SU), and transmembrane (TM) subunits occurs posttranslationally during transport to the cell surface by yet-unidentified cellular proteases. Here we provide strong evidence that furin itself or a furin-like protease and not the signal peptidase complex is responsible for both processing events. N-terminal protein sequencing of the SU and TM subunits of purified PFV Env-immunoglobulin G immunoadhesin identified furin consensus sequences upstream of both cleavage sites. Mutagenesis analysis of two overlapping furin consensus sequences at the PFV LP/SU cleavage site in the wild-type protein confirmed the sequencing data and demonstrated utilization of only the first site. Fully processed SU was almost completely absent in viral particles of mutants having conserved arginine residues replaced by alanines in the first furin consensus sequence, but normal processing was observed upon mutation of the second motif. Although these mutants displayed a significant loss in infectivity as a result of reduced particle release, no correlation to processing inhibition was observed, since another mutant having normal LP/SU processing had a similar defect.
Interaction of the Sliding Clamp β-Subunit and Hda, a DnaA-Related Protein
Kurz, Mareike; Dalrymple, Brian; Wijffels, Gene; Kongsuwan, Kritaya
2004-01-01
In Escherichia coli, interactions between the replication initiation protein DnaA, the β subunit of DNA polymerase III (the sliding clamp protein), and Hda, the recently identified DnaA-related protein, are required to convert the active ATP-bound form of DnaA to an inactive ADP-bound form through the accelerated hydrolysis of ATP. This rapid hydrolysis of ATP is proposed to be the main mechanism that blocks multiple initiations during cell cycle and acts as a molecular switch from initiation to replication. However, the biochemical mechanism for this crucial step in DNA synthesis has not been resolved. Using purified Hda and β proteins in a plate binding assay and Ni-nitrilotriacetic acid pulldown analysis, we show for the first time that Hda directly interacts with β in vitro. A new β-binding motif, a hexapeptide with the consensus sequence QL[SP]LPL, related to the previously identified β-binding pentapeptide motif (QL[SD]LF) was found in the amino terminus of the Hda protein. Mutants of Hda with amino acid changes in the hexapeptide motif are severely defective in their ability to bind β. A 10-amino-acid peptide containing the E. coli Hda β-binding motif was shown to compete with Hda for binding to β in an Hda-β interaction assay. These results establish that the interaction of Hda with β is mediated through the hexapeptide sequence. We propose that this interaction may be crucial to the events that lead to the inactivation of DnaA and the prevention of excess initiation of rounds of replication. PMID:15150238
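The bracket notation used for the two β-binding motifs maps directly onto regular-expression character classes, so a motif scan can be sketched in a few lines. The two patterns are taken from the abstract; the helper `find_motifs` and the example peptide are hypothetical and do not correspond to the real Hda sequence.

```python
import re

# Consensus patterns quoted in the abstract; [SP] means "S or P".
HEXAPEPTIDE = re.compile(r"QL[SP]LPL")   # new beta-binding hexapeptide
PENTAPEPTIDE = re.compile(r"QL[SD]LF")   # previously identified pentapeptide


def find_motifs(protein, pattern):
    """Return (1-based start position, matched residues) for each match."""
    return [(m.start() + 1, m.group()) for m in pattern.finditer(protein)]


# Made-up peptide for illustration only (not the Hda protein).
example = "MAQLSLPLYLGGQLDLFKK"
print(find_motifs(example, HEXAPEPTIDE))
print(find_motifs(example, PENTAPEPTIDE))
```

Note that `finditer` reports non-overlapping matches only, which is sufficient for short motifs like these but would need a lookahead pattern if overlapping occurrences mattered.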
Interaction of the sliding clamp beta-subunit and Hda, a DnaA-related protein.
Kurz, Mareike; Dalrymple, Brian; Wijffels, Gene; Kongsuwan, Kritaya
2004-06-01
In Escherichia coli, interactions between the replication initiation protein DnaA, the beta subunit of DNA polymerase III (the sliding clamp protein), and Hda, the recently identified DnaA-related protein, are required to convert the active ATP-bound form of DnaA to an inactive ADP-bound form through the accelerated hydrolysis of ATP. This rapid hydrolysis of ATP is proposed to be the main mechanism that blocks multiple initiations during cell cycle and acts as a molecular switch from initiation to replication. However, the biochemical mechanism for this crucial step in DNA synthesis has not been resolved. Using purified Hda and beta proteins in a plate binding assay and Ni-nitrilotriacetic acid pulldown analysis, we show for the first time that Hda directly interacts with beta in vitro. A new beta-binding motif, a hexapeptide with the consensus sequence QL[SP]LPL, related to the previously identified beta-binding pentapeptide motif (QL[SD]LF) was found in the amino terminus of the Hda protein. Mutants of Hda with amino acid changes in the hexapeptide motif are severely defective in their ability to bind beta. A 10-amino-acid peptide containing the E. coli Hda beta-binding motif was shown to compete with Hda for binding to beta in an Hda-beta interaction assay. These results establish that the interaction of Hda with beta is mediated through the hexapeptide sequence. We propose that this interaction may be crucial to the events that lead to the inactivation of DnaA and the prevention of excess initiation of rounds of replication.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Fong, H.K.W.; Yoshimoto, K.K.; Eversole-Cire, P.
1988-05-01
Recent molecular cloning of cDNA for the α subunit of bovine transducin (a guanine nucleotide-binding regulatory protein, or G protein) has revealed the presence of two retinal-specific transducins, called T_r and T_c, which are expressed in rod or cone photoreceptor cells. In a further study of G-protein diversity and signal transduction in the retina, the authors have identified a G-protein α subunit, which they refer to as G_zα, by isolating a human retinal cDNA clone that cross-hybridizes at reduced stringency with bovine T_r α-subunit cDNA. The deduced amino acid sequence of G_zα is 41-67% identical with those of other known G-protein α subunits. However, the 355-residue G_zα lacks a consensus site for ADP-ribosylation by pertussis toxin, and its amino acid sequence varies within a number of regions that are strongly conserved among all of the other G-protein α subunits. They suggest that G_zα, which appears to be highly expressed in neural tissues, represents a member of a subfamily of G proteins that mediate signal transduction in pertussis toxin-insensitive systems.
Martínez, Lidia Mayorga; Orozco, Aurea; Villalobos, Patricia; Valverde-R, Carlos
2008-05-01
Thyroid hormone bioactivity is finely regulated at the cellular level by the peripheral iodothyronine deiodinases (D). The study of thyroid function in fish has been restricted mainly to teleosts, whereas the study and characterization of Ds have been overlooked in chondrichthyes. Here we report the cloning and operational characterization of both the native and the recombinant hepatic type 3 iodothyronine deiodinase in the tropical shark Chiloscyllium punctatum. Native and recombinant sD3 show identical catalytic activities: a strong preference for T3-inner-ring deiodination, a requirement for a high concentration of DTT, a sequential reaction mechanism, and resistance to PTU inhibition. The cloned cDNA contains 1298 nucleotides [excluding the poly(A) tail] and encodes a predicted protein of 259 amino acids. The triplet TGA coding for selenocysteine (Sec) is at position 123. The consensus selenocysteine insertion sequence (SECIS) was identified 228 bp upstream of the poly(A) tail and corresponds to form 2. The deduced amino acid sequence was 77% and 72% identical to other D3 cDNAs in fishes and other vertebrates, respectively. As in the case of other piscivore teleost species, shark expresses hepatic D3 through adulthood. This characteristic may be associated with the alimentary strategy in which the protection from an exogenous overload of thyroid hormones could be of physiological importance for thyroidal homeostasis.
Kohda, Daisuke
2018-04-01
Promiscuous recognition of ligands by proteins is as important as strict recognition in numerous biological processes. In living cells, many short, linear amino acid motifs function as targeting signals in proteins to specify the final destination of the protein transport. In general, the target signal is defined by a consensus sequence containing wild-characters, and hence represented by diverse amino acid sequences. The classical lock-and-key or induced-fit/conformational selection mechanism may not cover all aspects of the promiscuous recognition. On the basis of our crystallographic and NMR studies on the mitochondrial Tom20 protein-presequence interaction, we proposed a new hypothetical mechanism based on "a rapid equilibrium of multiple states with partial recognitions". This dynamic, multiple recognition mode enables the Tom20 receptor to recognize diverse mitochondrial presequences with nearly equal affinities. The plant Tom20 is evolutionally unrelated to the animal Tom20 in our study, but is a functional homolog of the animal/fungal Tom20. NMR studies by another research group revealed that the presequence binding by the plant Tom20 was not fully explained by simple interaction modes, suggesting the presence of a similar dynamic, multiple recognition mode. Circumstantial evidence also suggested that similar dynamic mechanisms may be applicable to other promiscuous recognitions of signal peptides by the SRP54/Ffh and SecA proteins.
Molecular characteristic and physiological role of DOPA-decarboxylase.
Guenter, Joanna; Lenartowski, Robert
2016-12-31
The enzyme DOPA decarboxylase (aromatic-L-amino-acid decarboxylase, DDC) plays an important role in the dopaminergic system and participates in the uptake and decarboxylation of amine precursors in the peripheral tissues. Apart from catecholamines, DDC catalyses the biosynthesis of serotonin and trace amines. It has been shown that the DDC amino acid sequence is highly evolutionarily conserved across many species. The activity of holoenzyme is regulated by stimulation/blockade of membrane receptors, phosphorylation of serine residues, and DDC interaction with regulatory proteins. A single gene codes for DDC both in neuronal and non-neuronal tissue, but synthesized isoforms of mRNA differ in the 5' UTR and in the presence of alternative exons. Tissue-specific expression of the DDC gene is controlled by two spatially distinct promoters - neuronal and non-neuronal. Several consensus sequences recognized by the HNF and POU family proteins have been mapped in the neuronal DDC promoter. Since DDC is located close to the imprinted gene cluster, its expression can be subjected to tightly controlled epigenetic regulation. Perturbations in DDC expression result in a range of neurodegenerative and psychiatric disorders and correlate with neoplasia. Apart from the above issues, the role of DDC in prostate cancer, bipolar affective disorder, Parkinson's disease and DDC deficiency is discussed in our review. Moreover, novel and prospective clinical treatments based on gene therapy and stem cells for the diseases mentioned above are described.
Designing probe from E6 genome region of human Papillomavirus 16 for sensing applications.
Parmin, Nor Azizah; Hashim, Uda; Gopinath, Subash C B
2018-02-01
Human Papillomavirus (HPV) is one of the most commonly reported viruses and has over 100 types; among them, genotypes 16, 18, 31 and 45 are high-risk HPVs. Herein, we designed an oligonucleotide probe for the detection of the predominant HPV type 16 for sensing applications. Conserved amino acid sequences within the E6 region of the open reading frame in the HPV genome were used as the basis to design the oligonucleotide probe to detect cervical cancer. E6 amino acid sequences from the high-risk HPVs were analysed to check the percentage of similarity and the consensus regions that cause different cancers, including cervical cancer. The Basic Local Alignment Search Tool (BLAST) provided additional statistical parameters, for example expect values (E-values) and bit scores. The probe, 'GGG GTC GGT GGA CCG GTC GAT GTA', was designed with 66.7% GC content. This oligonucleotide probe has a length of 24 mer, a GC content between 40% and 70%, and a melting temperature (Tm) above 50°C. The acceptable probe length was between 22 and 31 mer. The region identified here can be used as a probe and has implications for HPV detection techniques in biosensors, especially for the clinical determination of cervical cancer. Copyright © 2017 Elsevier B.V. All rights reserved.
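The probe figures quoted above can be checked with a short sketch. The probe sequence and the design windows (22 to 31 mer, 40 to 70% GC, Tm above 50°C) come from the abstract. The melting-temperature formula is an assumption: the abstract does not say how Tm was computed, so the commonly used basic approximation Tm = 64.9 + 41 × (G + C − 16.4) / N is swapped in here purely for illustration, and the function names are hypothetical.

```python
# Probe sequence as printed in the abstract, with the spacing removed.
PROBE = "GGG GTC GGT GGA CCG GTC GAT GTA".replace(" ", "")


def gc_percent(seq):
    """Percentage of G and C bases in the sequence."""
    seq = seq.upper()
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def basic_tm(seq):
    """Rough melting temperature in degrees C.

    Assumption: uses the basic approximation
    Tm = 64.9 + 41 * (G + C - 16.4) / N, which may differ from the
    method used by the authors.
    """
    seq = seq.upper()
    gc = sum(base in "GC" for base in seq)
    return 64.9 + 41.0 * (gc - 16.4) / len(seq)


def meets_criteria(seq, min_len=22, max_len=31,
                   min_gc=40.0, max_gc=70.0, min_tm=50.0):
    """Check the length, GC and Tm windows stated in the abstract."""
    return (min_len <= len(seq) <= max_len
            and min_gc <= gc_percent(seq) <= max_gc
            and basic_tm(seq) > min_tm)


print(len(PROBE), round(gc_percent(PROBE), 1), meets_criteria(PROBE))
```

The probe contains 16 G or C bases out of 24, which reproduces the 66.7% GC content stated in the abstract.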
DOE Office of Scientific and Technical Information (OSTI.GOV)
Schriner, J.E.; Yi, W.; Hofmann, S.L.
Palmitoyl-protein thioesterase (PPT) is a small glycoprotein that removes palmitate groups from cysteine residues in lipid-modified proteins. We recently reported mutations in PPT in patients with infantile neuronal ceroid lipofuscinosis (INCL), a severe neurodegenerative disorder. INCL is characterized by the accumulation of proteolipid storage material in brain and other tissues, suggesting that the disease is a consequence of abnormal catabolism of acylated proteins. In the current paper, we report the sequence of the human PPT cDNA and the structure of the human PPT gene. The cDNA predicts a protein of 306 amino acids that contains a 25-amino-acid signal peptide, three N-linked glycosylation sites, and consensus motifs characteristic of thioesterases. Northern analysis of a human tissue blot revealed ubiquitous expression of a single 2.5-kb mRNA, with highest expression in lung, brain, and heart. The human PPT gene spans 25 kb and is composed of seven coding exons and a large eighth exon, containing the entire 3'-untranslated region of 1388 bp. An Alu repeat and promoter elements corresponding to putative binding sites for several general transcription factors were identified in the 1060 nucleotides upstream of the transcription start site. The human PPT cDNA sequence and gene structure will provide the means for the identification of further causative mutations in INCL and facilitate genetic screening in selected high-risk populations. 31 refs., 5 figs., 1 tab.
cWINNOWER algorithm for finding fuzzy dna motifs
NASA Technical Reports Server (NTRS)
Liang, S.; Samanta, M. P.; Biegel, B. A.
2004-01-01
The cWINNOWER algorithm detects fuzzy motifs in DNA sequences rich in protein-binding signals. A signal is defined as any short nucleotide pattern having up to d mutations differing from a motif of length l. The algorithm finds such motifs if a clique consisting of a sufficiently large number of mutated copies of the motif (i.e., the signals) is present in the DNA sequence. The cWINNOWER algorithm substantially improves the sensitivity of the winnower method of Pevzner and Sze by imposing a consensus constraint, enabling it to detect much weaker signals. We studied the minimum detectable clique size qc as a function of sequence length N for random sequences. We found that qc increases linearly with N for a fast version of the algorithm based on counting three-member sub-cliques. Imposing consensus constraints reduces qc by a factor of three in this case, which makes the algorithm dramatically more sensitive. Our most sensitive algorithm, which counts four-member sub-cliques, needs a minimum of only 13 signals to detect motifs in a sequence of length N = 12,000 for (l, d) = (15, 4). Copyright Imperial College Press.
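The definition of a signal given above (a window of length l that differs from the motif in at most d positions) can be made concrete with a small sketch. This is not the cWINNOWER algorithm itself, which has to discover the motif without knowing it in advance; the code only enumerates the signals of an already known motif. The function names and the toy sequence are hypothetical.

```python
def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def find_signals(sequence, motif, d):
    """Return (0-based start, window) for every (l, d) signal of motif.

    A window of length l = len(motif) counts as a signal when it differs
    from the motif in at most d positions.
    """
    l = len(motif)
    hits = []
    for i in range(len(sequence) - l + 1):
        window = sequence[i:i + l]
        if hamming(window, motif) <= d:
            hits.append((i, window))
    return hits


# Toy example with (l, d) = (4, 1).
signals = find_signals("ACGTTTAGGT", "ACGT", d=1)
print(signals)
```

Because every signal lies within d mismatches of the motif, any two signals of the same motif differ from each other in at most 2d positions; pairwise relations of this kind are what clique-based motif finders build their graphs on.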
cWINNOWER Algorithm for Finding Fuzzy DNA Motifs
NASA Technical Reports Server (NTRS)
Liang, Shoudan
2003-01-01
The cWINNOWER algorithm detects fuzzy motifs in DNA sequences rich in protein-binding signals. A signal is defined as any short nucleotide pattern having up to d mutations differing from a motif of length l. The algorithm finds such motifs if multiple mutated copies of the motif (i.e., the signals) are present in the DNA sequence in sufficient abundance. The cWINNOWER algorithm substantially improves the sensitivity of the winnower method of Pevzner and Sze by imposing a consensus constraint, enabling it to detect much weaker signals. We studied the minimum number of detectable motifs qc as a function of sequence length N for random sequences. We found that qc increases linearly with N for a fast version of the algorithm based on counting three-member sub-cliques. Imposing consensus constraints reduces qc by a factor of three in this case, which makes the algorithm dramatically more sensitive. Our most sensitive algorithm, which counts four-member sub-cliques, needs a minimum of only 13 signals to detect motifs in a sequence of length N = 12,000 for (l, d) = (15, 4).
Reviving the Dead: History and Reactivation of an Extinct L1
Yang, Lei; Brunsfeld, John; Scott, LuAnn; Wichman, Holly
2014-01-01
Although L1 sequences are present in the genomes of all placental mammals and marsupials examined to date, their activity was lost in the megabat family, Pteropodidae, ∼24 million years ago. To examine the characteristics of L1s prior to their extinction, we analyzed the evolutionary history of L1s in the genome of a megabat, Pteropus vampyrus, and found a pattern of periodic L1 expansion and quiescence. In contrast to the well-characterized L1s in human and mouse, megabat genomes have accommodated two or more simultaneously active L1 families throughout their evolutionary history, and major peaks of L1 deposition into the genome always involved multiple families. We compared the consensus sequences of the two major megabat L1 families at the time of their extinction to consensus L1s of a variety of mammalian species. Megabat L1s are comparable to the other mammalian L1s in terms of adenosine content and conserved amino acids in the open reading frames (ORFs). However, the intergenic region (IGR) of the reconstructed element from the more active family is dramatically longer than the IGR of well-characterized human and mouse L1s. We synthesized the reconstructed element from this L1 family and tested the ability of its components to support retrotransposition in a tissue culture assay. Both ORFs are capable of supporting retrotransposition, while the IGR is inhibitory to retrotransposition, especially when combined with either of the reconstructed ORFs. We dissected the inhibitory effect of the IGR by testing truncated and shuffled versions and found that length is a key factor, but not the only one affecting inhibition of retrotransposition. Although the IGR is inhibitory to retrotransposition, this inhibition does not account for the extinction of L1s in megabats. Overall, neither the evolution of the L1 sequence nor the quiescence of L1 is likely to be the reason for L1 extinction. PMID:24968166
Pacios, Luis F; Tordesillas, Leticia; Cuesta-Herranz, Javier; Compes, Esther; Sánchez-Monge, Rosa; Palacín, Arantxa; Salcedo, Gabriel; Díaz-Perales, Araceli
2008-04-01
Lipid transfer proteins (LTPs) are the major allergens of Rosaceae fruits in the Mediterranean area. Pru p 3, the LTP and major allergen of peach, is a suitable model for studying food allergy and amino acid sequences related with its IgE-binding capacity. In this work, we sought to map IgE mimotopes on the structure of Pru p 3, using the combination of a random peptide phage display library and a three-dimensional modelling approach. Pru p 3-specific IgE was purified from 2 different pools of sera from peach allergic patients grouped by symptoms (OAS-pool or SYS-pool), and used for screening of a random dodecapeptide phage display library. Positive clones were further confirmed by ELISA assays testing individual sera from each pool. Three-dimensional modelling allowed location of mimotopes based on analysis of electrostatic properties and solvent exposure of the Pru p 3 surface. Twenty-one phage clones were selected using Pru p 3-specific IgE, 9 of which were chosen using OAS-specific IgE while the other 12 were selected with systemic-specific IgE. Peptide alignments revealed consensus sequences for each pool: L37 R39 T40 P42 D43 R44 A46 P70 S76 P78 Y79 for OAS-IgE, and N35 N36 L37 R39 T40 D43 A46 S76 I77 P78 for systemic-IgE. These 2 consensus sequences were mapped on the same surface of Pru p 3, corresponding to the helix 2-loop-helix 3 region and part of the non-structured C-terminal coil. Thus, 2 relevant conformational IgE-binding regions of Pru p 3 were identified using a random peptide phage display library. Mimotopes can be used to study the interaction between allergens and IgE, and to accelerate the process to design new vaccines and new immunotherapy strategies.
Reviving the dead: history and reactivation of an extinct l1.
Yang, Lei; Brunsfeld, John; Scott, LuAnn; Wichman, Holly
2014-06-01
Although L1 sequences are present in the genomes of all placental mammals and marsupials examined to date, their activity was lost in the megabat family, Pteropodidae, ∼24 million years ago. To examine the characteristics of L1s prior to their extinction, we analyzed the evolutionary history of L1s in the genome of a megabat, Pteropus vampyrus, and found a pattern of periodic L1 expansion and quiescence. In contrast to the well-characterized L1s in human and mouse, megabat genomes have accommodated two or more simultaneously active L1 families throughout their evolutionary history, and major peaks of L1 deposition into the genome always involved multiple families. We compared the consensus sequences of the two major megabat L1 families at the time of their extinction to consensus L1s of a variety of mammalian species. Megabat L1s are comparable to the other mammalian L1s in terms of adenosine content and conserved amino acids in the open reading frames (ORFs). However, the intergenic region (IGR) of the reconstructed element from the more active family is dramatically longer than the IGR of well-characterized human and mouse L1s. We synthesized the reconstructed element from this L1 family and tested the ability of its components to support retrotransposition in a tissue culture assay. Both ORFs are capable of supporting retrotransposition, while the IGR is inhibitory to retrotransposition, especially when combined with either of the reconstructed ORFs. We dissected the inhibitory effect of the IGR by testing truncated and shuffled versions and found that length is a key factor, but not the only one affecting inhibition of retrotransposition. Although the IGR is inhibitory to retrotransposition, this inhibition does not account for the extinction of L1s in megabats. Overall, neither the evolution of the L1 sequence nor the quiescence of L1 is likely to be the reason for L1 extinction.
Mapping and Sequencing the Human Genome
DOE R&D Accomplishments Database
1988-01-01
Numerous meetings have been held and a debate has developed in the biological community over the merits of mapping and sequencing the human genome. In response, a committee to examine the desirability and feasibility of mapping and sequencing the human genome was formed to suggest options for implementing the project. The committee asked many questions. Should the analysis of the human genome be left entirely to the traditionally uncoordinated, but highly successful, support systems that fund the vast majority of biomedical research? Or should a more focused and coordinated additional support system be developed that is limited to encouraging and facilitating the mapping and eventual sequencing of the human genome? If so, how can this be done without distorting the broader goals of biological research that are crucial for any understanding of the data generated in such a human genome project? As the committee became better informed on the many relevant issues, the opinions of its members coalesced, producing a shared consensus of what should be done. This report reflects that consensus.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Q Zhai; M Landesman; H Robinson
2011-12-31
Retroviral Gag proteins contain short late-domain motifs that recruit cellular ESCRT pathway proteins to facilitate virus budding. ALIX-binding late domains often contain the core consensus sequence YPX_nL (where X_n can vary in sequence and length). However, some simian immunodeficiency virus (SIV) Gag proteins lack this consensus sequence, yet still bind ALIX. We mapped divergent, ALIX-binding late domains within the p6^Gag proteins of SIV_MAC239 (40-SREK[P]YKE[VT]ED[L]LHLNSLF-59) and SIV_agmTan-1 (24-AAG[A]YDP[AR]KL[L]EQYAKK-41). Crystal structures revealed that anchoring tyrosines (in lightface) and nearby hydrophobic residues (underlined in the original, shown in brackets here) contact the ALIX V domain, revealing how lentiviruses employ a diverse family of late-domain sequences to bind ALIX and promote virus budding.
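The statement that the two SIV late domains lack the core YPX_nL consensus can be checked with a pattern match. The two peptide sequences are the ones given in the abstract. The regular expression is an assumption: the abstract says only that X_n varies in sequence and length, so the spacer is arbitrarily limited here to one to three residues, and the positive-control peptide is made up for illustration.

```python
import re

# Assumption: X_n is modelled as 1 to 3 residues of any type; the abstract
# does not give the allowed range.
YPXNL = re.compile(r"YP[A-Z]{1,3}L")

# Late-domain peptides quoted in the abstract (residue numbering omitted).
SIV_MAC239 = "SREKPYKEVTEDLLHLNSLF"
SIV_AGMTAN1 = "AAGAYDPARKLLEQYAKK"

# Hypothetical peptide carrying a YP-x3-L motif, used as a positive control.
CONTROL = "ELYPLTSLRS"


def has_core_motif(peptide):
    """True if the peptide contains a YPX_nL-like motif (n = 1 to 3)."""
    return YPXNL.search(peptide) is not None


for name, pep in [("SIVmac239", SIV_MAC239),
                  ("SIVagmTan-1", SIV_AGMTAN1),
                  ("control", CONTROL)]:
    print(name, has_core_motif(pep))
```

Neither SIV peptide contains a tyrosine followed directly by a proline, so no choice of spacer length would make them match, which is consistent with the abstract describing them as divergent.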
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zhai, Q.; Robinson, H.; Landesman, M. B.
2011-01-01
Retroviral Gag proteins contain short late-domain motifs that recruit cellular ESCRT pathway proteins to facilitate virus budding. ALIX-binding late domains often contain the core consensus sequence YPX{sub n}L (where X{sub n} can vary in sequence and length). However, some simian immunodeficiency virus (SIV) Gag proteins lack this consensus sequence, yet still bind ALIX. We mapped divergent, ALIX-binding late domains within the p6{sup Gag} proteins of SIV{sub mac239} ({sub 40}SREK{und P}YKE{und VT}ED{und L}LHLNSLF{sub 59}) and SIV{sub agmTan-1} ({sub 24}AAG{und A}YDP{und AR}KL{und L}EQYAKK{sub 41}). Crystal structures revealed that anchoring tyrosines (in lightface) and nearby hydrophobic residues (underlined) contact the ALIX V domain, revealing how lentiviruses employ a diverse family of late-domain sequences to bind ALIX and promote virus budding.
Freimuth, P; Anderson, C W
1993-03-01
The sequence of a 1158-base pair fragment of the human adenovirus serotype 12 (Ad12) genome was determined. This segment encodes the precursors for virion components Mu and VI. Both Ad12 precursors contain two sequences that conform to a consensus sequence motif for cleavage by the endoproteinase of adenovirus 2 (Ad2). Analysis of the amino terminus of VI and of the peptide fragments found in Ad12 virions demonstrated that these sites are cleaved during Ad12 maturation. This observation suggests that the recognition motif for adenovirus endoproteinases is highly conserved among human serotypes. The adenovirus 2 endoproteinase polypeptide requires additional co-factors for activity (C. W. Anderson, Protein Expression Purif., 1993, 4, 8-15). Synthetic Ad12 or Ad2 pVI carboxy-terminal peptides each permitted efficient cleavage of an artificial endoproteinase substrate by recombinant Ad2 endoproteinase polypeptide.
USDA-ARS's Scientific Manuscript database
The soybean Consensus Map 4.0 facilitated the anchoring of 95.6% of the soybean whole genome sequence developed by the Joint Genome Institute, Department of Energy but only properly oriented 66% of the sequence scaffolds. To find additional single nucleotide polymorphism (SNP) markers for additiona...
Berger, Cordula; Parson, Walther
2009-06-01
The degradation state of some biological traces recovered from the crime scene requires the amplification of very short fragments to attain a useful mitochondrial (mt)DNA sequence. We have previously introduced two mini-multiplex assays that amplify 10 overlapping control region (CR) fragments in two separate multiplex PCRs, which brought successful CR consensus sequences from even highly degraded DNA extracts. This procedure requires a total of 20 sequencing reactions per sample, which is laborious and cost intensive. For only moderately degraded samples that we encounter more frequently with typical mtDNA casework material, we developed two new multiplex assays that use a subset of the mini-amplicon primers but embrace larger fragments (midis) and require only 10 sequencing reactions to build a double-stranded CR consensus sequence. We used a preceding mtDNA quantitation step by real-time PCR with two different target fragments (143 and 283 bp) that roughly correspond to the average fragment sizes of the different multiplex approaches to estimate size-dependent mtDNA quantities and to aid the choice of the appropriate PCR multiplexes with respect to quality of the results and required costs.
Goh, C J; Park, D; Lee, J S; Sebastiani, F; Hahn, Y
2018-01-01
Amalgaviridae is a family of double-stranded, monosegmented RNA viruses that are associated with plants, fungi, microsporidians, and animals. A sequence contig derived from the transcriptome of a eudicot, Cistus incanus (the family Cistaceae; commonly known as hoary rockrose), was identified as the genome sequence of a novel plant RNA virus and named Cistus incanus RNA virus 1 (CiRV1). Sequence comparison and phylogenetic analysis indicated that CiRV1 is a novel species of the genus Amalgavirus in the family Amalgaviridae. The CiRV1 genome contig has two overlapping open reading frames (ORFs). ORF1 encodes a putative replication factory matrix-like protein, while ORF2 encodes a RNA-dependent RNA polymerase (RdRp) domain. An ORF1+2 fusion protein, which functions in viral RNA replication, is produced by a +1 programmed ribosomal frameshifting (PRF) mechanism. A +1 PRF motif UUU_CGU, which matches the conserved amalgavirus +1 PRF consensus sequence UUU_CGN, was found at the boundary of CiRV1 ORF1 and ORF2. Comparison of 25 amalgavirus ORF1+2 fusion proteins revealed that only three different positions within a 13-amino acid segment were recurrently used at the boundary, possibly being selected so as not to interfere with correct folding and function of the fusion protein. CiRV1 is the first virus found to be associated with the Cistus species and may be useful for studying amalgaviruses.
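As an illustration of how the +1 PRF consensus UUU_CGN could be located in a sequence, the sketch below scans an RNA (or cDNA) string for the hexamer. It is an assumption-laden example rather than the procedure used in the study: the function name and the test sequence are hypothetical, and the scan ignores reading frame, whereas the underscore in UUU_CGN marks a codon boundary in the ORF1 frame.

```python
import re

# UUU followed by CGN, where N is any of A, C, G, U.
PRF_MOTIF = re.compile(r"(?=(UUUCG[ACGU]))")

def find_plus1_prf_motifs(rna):
    """Return (0-based offset, hexamer) for every UUU_CGN occurrence.

    DNA input is accepted by converting T to U. Frame is not checked;
    a real analysis would keep only hits aligned with the ORF1 frame.
    """
    rna = rna.upper().replace("T", "U")
    return [(m.start(1), m.group(1)) for m in PRF_MOTIF.finditer(rna)]

# Hypothetical fragment containing the CiRV1-type motif UUU_CGU.
print(find_plus1_prf_motifs("AAGUUUCGUCC"))
```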
Ståhlberg, Anders; Krzyzanowski, Paul M; Jackson, Jennifer B; Egyud, Matthew; Stein, Lincoln; Godfrey, Tony E
2016-06-20
Detection of cell-free DNA in liquid biopsies offers great potential for use in non-invasive prenatal testing and as a cancer biomarker. Fetal and tumor DNA fractions however can be extremely low in these samples and ultra-sensitive methods are required for their detection. Here, we report an extremely simple and fast method for introduction of barcodes into DNA libraries made from 5 ng of DNA. Barcoded adapter primers are designed with an oligonucleotide hairpin structure to protect the molecular barcodes during the first rounds of polymerase chain reaction (PCR) and prevent them from participating in mis-priming events. Our approach enables high-level multiplexing and next-generation sequencing library construction with flexible library content. We show that uniform libraries of 1-, 5-, 13- and 31-plex can be generated. Utilizing the barcodes to generate consensus reads for each original DNA molecule reduces background sequencing noise and allows detection of variant alleles below 0.1% frequency in clonal cell line DNA and in cell-free plasma DNA. Thus, our approach bridges the gap between the highly sensitive but specific capabilities of digital PCR, which only allows a limited number of variants to be analyzed, with the broad target capability of next-generation sequencing which traditionally lacks the sensitivity to detect rare variants. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
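The idea of collapsing barcoded reads into one consensus read per original molecule can be sketched as follows. This is a simplified illustration, not the published pipeline: the function name, the minimum family size, and the per-position agreement threshold are all assumed parameters, and reads are taken to be already trimmed and aligned to the same start.

```python
from collections import Counter, defaultdict

def consensus_by_barcode(reads, min_family=3, min_fraction=0.6):
    """Collapse (barcode, sequence) pairs into one consensus per barcode.

    Families with fewer than min_family reads are dropped. At each
    position the most common base is kept if it reaches min_fraction
    of the family; otherwise the position is masked with 'N'.
    """
    families = defaultdict(list)
    for barcode, seq in reads:
        families[barcode].append(seq)

    consensus = {}
    for barcode, seqs in families.items():
        if len(seqs) < min_family:
            continue
        length = min(len(s) for s in seqs)  # compare the shared prefix only
        bases = []
        for i in range(length):
            base, count = Counter(s[i] for s in seqs).most_common(1)[0]
            bases.append(base if count / len(seqs) >= min_fraction else "N")
        consensus[barcode] = "".join(bases)
    return consensus

# Hypothetical reads: one family of three with a single sequencing error,
# and one singleton family that is discarded.
reads = [("AAC", "ACGT"), ("AAC", "ACGT"), ("AAC", "ACTT"), ("GGT", "ACGT")]
print(consensus_by_barcode(reads))
```

The isolated error in the third read is outvoted by its family, which is the mechanism by which consensus reads lower background sequencing noise.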
Hu, H M; Chuang, C K; Lee, M J; Tseng, T C; Tang, T K
2000-11-01
We previously reported two novel testis-specific serine/threonine kinases, Aie1 (mouse) and AIE2 (human), that share high amino acid identities with the kinase domains of fly aurora and yeast Ipl1. Here, we report the entire intron-exon organization of the Aie1 gene and analyze the expression patterns of Aie1 mRNA during testis development. The mouse Aie1 gene spans approximately 14 kb and contains seven exons. The sequences of the exon-intron boundaries of the Aie1 gene conform to the consensus sequences (GT/AG) of the splicing donor and acceptor sites of most eukaryotic genes. Comparative genomic sequencing revealed that the gene structure is highly conserved between mouse Aie1 and human AIE2. However, much less homology was found in the sequence outside the kinase-coding domains. The Aie1 locus was mapped to mouse chromosome 7A2-A3 by fluorescent in situ hybridization. Northern blot analysis indicates that Aie1 mRNA likely is expressed at a low level on day 14 and reaches its plateau on day 21 in the developing postnatal testis. RNA in situ hybridization indicated that the expression of the Aie1 transcript was restricted to meiotically active germ cells, with the highest levels detected in spermatocytes at the late pachytene stage. These findings suggest that Aie1 plays a role in spermatogenesis.
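Conformance of exon-intron boundaries to the GT/AG rule can be checked mechanically once intron coordinates are known. The sketch below assumes 0-based, end-exclusive intron coordinates on the sense strand; the function name and the example sequence are hypothetical and do not come from the Aie1 gene.

```python
def has_canonical_splice_sites(genomic, intron_start, intron_end):
    """True if the intron begins with GT (donor) and ends with AG (acceptor).

    Coordinates are 0-based and end-exclusive on the sense strand
    (an assumption of this sketch).
    """
    intron = genomic[intron_start:intron_end].upper()
    return len(intron) >= 4 and intron.startswith("GT") and intron.endswith("AG")

# Hypothetical mini-gene: exon (6 nt) + intron (14 nt) + exon (6 nt).
gene = "ATGGCC" + "GTAAGTCCCTTTAG" + "GGATAA"
print(has_canonical_splice_sites(gene, 6, 20))
```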
Differential processing of pro-neurotensin/neuromedin N and relationship to pro-hormone convertases.
Kitabgi, Patrick
2006-10-01
Neurotensin (NT) is synthesized as part of a larger precursor that also contains neuromedin N (NN), a six amino acid neurotensin-like peptide. NT and NN are located in the C-terminal region of the precursor (pro-NT/NN) where they are flanked and separated by three Lys-Arg sequences. A fourth dibasic sequence is present in the middle of the precursor. Dibasics are the consensus sites recognized and cleaved by endoproteases that belong to the recently identified family of pro-protein convertases (PCs). In tissues that express pro-NT/NN, the three C-terminal Lys-Arg sites are differentially processed, whereas the middle dibasic is poorly cleaved. Pro-NT/NN processing gives rise mainly to NT and NN in the brain, to NT and a large peptide ending with the NN sequence at its C-terminus (large NN) in the gut and to NT, large NN and a large peptide ending with the NT sequence (large NT) in the adrenals. Recent evidence indicates that PC1, PC2 and PC5-A are the pro-hormone convertases responsible for the processing patterns observed in the gut, brain and adrenals, respectively. As NT, NN, large NT and large NN are all endowed with biological activity, the evidence reviewed here supports the idea that post-translational processing of pro-NT/NN in tissues may generate biological diversity.
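Candidate dibasic cleavage sites of the kind described above can be listed by scanning a precursor sequence for adjacent Lys/Arg residues. This is only a sketch: the function name and the example peptide are hypothetical, and the presence of a dibasic pair does not imply that a given convertase actually cleaves there.

```python
import re

# Any pair of adjacent basic residues (K or R); lookahead allows overlaps.
DIBASIC = re.compile(r"(?=([KR]{2}))")

def find_dibasic_sites(protein):
    """Return (1-based position, pair) for each dibasic site in a protein."""
    return [(m.start(1) + 1, m.group(1)) for m in DIBASIC.finditer(protein.upper())]

# Hypothetical peptide with Lys-Arg, Arg-Arg and Lys-Lys pairs.
print(find_dibasic_sites("AKRGGRRAKK"))
```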
Gene structure and functional characterization of growth hormone in dogfish, Squalus acanthias.
Moriyama, Shunsuke; Oda, Mayumi; Yamazaki, Tomohide; Yamaguchi, Kiyoko; Amiya, Noriko; Takahashi, Akiyoshi; Amano, Masafumi; Goto, Tomoaki; Nozaki, Masumi; Meguro, Hiroshi; Kawauchi, Hiroshi
2008-06-01
Dogfish (Squalus acanthias) growth hormone (GH) was identified by cDNA cloning and protein purification from the pituitary gland. Dogfish GH cDNA encoded a prehormone of 210 amino acids (aa). Sequence analysis of purified GH revealed that the prehormone is composed of a signal peptide of 27 aa and a mature protein of 183 aa. Dogfish GH showed 94% sequence identity with blue shark GH, and also showed 37-66%, 26%, and 48-67% sequence identity with GH from osteichthyes, an agnathan, and tetrapods, respectively. The site of production was identified through immunocytochemistry to be cells of the proximal pars distalis of the pituitary gland. Dogfish GH stimulates both insulin-like growth factor-I and II mRNA levels in dogfish liver in vitro. The dogfish GH gene consisted of five exons and four introns, the same as in lamprey, teleosts such as cypriniforms and siluriforms, and tetrapods. The 5'-flanking region within 1082 bp of the transcription start site contained consensus sequences for the TATA box, Pit-1/GHF-1, CRE, TRE, and ERE. These results show that the endocrine mechanism for growth stimulation by the GH-IGF axis was established at an early stage of vertebrate evolution, and that the 5-exon-type gene organization might reflect the structure of the ancestral gene for the GH gene family.
Li, Chenxi; Liu, Hongyu; Li, Jinzhe; Liu, Dafei; Meng, Runze; Zhang, Qingshan; Shaozhou, Wulin; Bai, Xiaofei; Zhang, Tingting; Liu, Ming; Zhang, Yun
2016-01-01
Waterfowl parvovirus (WPV) infection causes high mortality and morbidity in both geese (Anser anser) and Muscovy ducks (Cairina moschata), resulting in significant losses to the waterfowl industries. The VP3 protein of WPV is a major structural protein that induces neutralizing antibodies in the waterfowl. However, B-cell epitopes on the VP3 protein of WPV have not been characterized. To understand the antigenic determinants of the VP3 protein, we used the monoclonal antibody (mAb) 4A6 to screen a set of eight partially expressed overlapping peptides spanning VP3. Using western blotting and an enzyme-linked immunosorbent assay (ELISA), we localized the VP3 epitope between amino acids (aa) 57 and 112. To identify the essential epitope residues, a phage library displaying 12-mer random peptides was screened with mAb 4A6. Phage clone peptides displayed a consensus sequence of YxRFHxH that mimicked the sequence 82Y/FNRFHCH88, which corresponded to amino acid residues 82 to 88 of VP3 protein of WPVs. mAb 4A6 binding to biotinylated fragments corresponding to amino acid residues 82 to 88 of the VP3 protein verified that the 82FxRFHxH88 was the VP3 epitope and that amino acid 82F is necessary to retain maximal binding to mAb 4A6. Parvovirus-positive goose and duck sera reacted with the epitope peptide by dot blotting assay, revealing the importance of these amino acids of the epitope in antibody-epitope binding reactivity. We identified the motif FxRFHxH as a VP3-specific B-cell epitope that is recognized by the neutralizing mAb 4A6. This finding might be valuable for understanding the antigenic topology of VP3 of WPV.
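Consensus motifs written with "x" as a wildcard, such as YxRFHxH, translate directly into regular expressions for screening peptide lists. The helper names below are hypothetical, and the sketch only tests sequence pattern matching; it says nothing about actual antibody binding.

```python
import re

def motif_to_regex(motif):
    """Compile a motif in which lowercase 'x' stands for any residue."""
    return re.compile("".join("[A-Z]" if ch == "x" else ch for ch in motif))

def peptides_matching(motif, peptides):
    """Return the peptides that contain the motif anywhere in their sequence."""
    rx = motif_to_regex(motif)
    return [p for p in peptides if rx.search(p)]

# The two native variants of VP3 residues 82-88 named in the abstract,
# plus a hypothetical non-matching peptide.
candidates = ["YNRFHCH", "FNRFHCH", "AAAAAAA"]
print(peptides_matching("YxRFHxH", candidates))
print(peptides_matching("FxRFHxH", candidates))
```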
Ruane, Karen M.; Lloyd, Adrian J.; Fülöp, Vilmos; Dowson, Christopher G.; Barreteau, Hélène; Boniface, Audrey; Dementin, Sébastien; Blanot, Didier; Mengin-Lecreulx, Dominique; Gobec, Stanislav; Dessen, Andréa; Roper, David I.
2013-01-01
Formation of the peptidoglycan stem pentapeptide requires the insertion of both l and d amino acids by the ATP-dependent ligase enzymes MurC, -D, -E, and -F. The stereochemical control of the third position amino acid in the pentapeptide is crucial to maintain the fidelity of later biosynthetic steps contributing to cell morphology, antibiotic resistance, and pathogenesis. Here we determined the x-ray crystal structure of Staphylococcus aureus MurE UDP-N-acetylmuramoyl-l-alanyl-d-glutamate:meso-2,6-diaminopimelate ligase (MurE) (E.C. 6.3.2.7) at 1.8 Å resolution in the presence of ADP and the reaction product, UDP-MurNAc-l-Ala-γ-d-Glu-l-Lys. This structure provides for the first time a molecular understanding of how this Gram-positive enzyme discriminates between l-lysine and d,l-diaminopimelic acid, the predominant amino acid that replaces l-lysine in Gram-negative peptidoglycan. Despite the presence of a consensus sequence previously implicated in the selection of the third position residue in the stem pentapeptide in S. aureus MurE, the structure shows that only part of this sequence is involved in the selection of l-lysine. Instead, other parts of the protein contribute substrate-selecting residues, resulting in a lysine-binding pocket based on charge characteristics. Despite the absolute specificity for l-lysine, S. aureus MurE binds this substrate relatively poorly. In vivo analysis and metabolomic data reveal that this is compensated for by high cytoplasmic l-lysine concentrations. Therefore, both metabolic and structural constraints maintain the structural integrity of the staphylococcal peptidoglycan. This study provides a novel focus for S. aureus-directed antimicrobials based on dual targeting of essential amino acid biogenesis and its linkage to cell wall assembly. PMID:24064214
Nanopore DNA Sequencing and Genome Assembly on the International Space Station.
Castro-Wallace, Sarah L; Chiu, Charles Y; John, Kristen K; Stahl, Sarah E; Rubins, Kathleen H; McIntyre, Alexa B R; Dworkin, Jason P; Lupisella, Mark L; Smith, David J; Botkin, Douglas J; Stephenson, Timothy A; Juul, Sissel; Turner, Daniel J; Izquierdo, Fernando; Federman, Scot; Stryke, Doug; Somasekar, Sneha; Alexander, Noah; Yu, Guixia; Mason, Christopher E; Burton, Aaron S
2017-12-21
We evaluated the performance of the MinION DNA sequencer in-flight on the International Space Station (ISS), and benchmarked its performance off-Earth against the MinION, Illumina MiSeq, and PacBio RS II sequencing platforms in terrestrial laboratories. Samples contained equimolar mixtures of genomic DNA from lambda bacteriophage, Escherichia coli (strain K12, MG1655) and Mus musculus (female BALB/c mouse). Nine sequencing runs were performed aboard the ISS over a 6-month period, yielding a total of 276,882 reads with no apparent decrease in performance over time. From sequence data collected aboard the ISS, we constructed directed assemblies of the ~4.6 Mb E. coli genome, ~48.5 kb lambda genome, and a representative M. musculus sequence (the ~16.3 kb mitochondrial genome), at 100%, 100%, and 96.7% consensus pairwise identity, respectively; de novo assembly of the E. coli genome from raw reads yielded a single contig comprising 99.9% of the genome at 98.6% consensus pairwise identity. Simulated real-time analyses of in-flight sequence data using an automated bioinformatic pipeline and laptop-based genomic assembly demonstrated the feasibility of sequencing analysis and microbial identification aboard the ISS. These findings illustrate the potential for sequencing applications including disease diagnosis, environmental monitoring, and elucidating the molecular basis for how organisms respond to spaceflight.
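Identity figures such as those quoted above are computed from an alignment of the assembly consensus against a reference. The sketch below shows one simple definition, assumed here for illustration: the percentage of alignment columns in which both sequences carry the same character, skipping columns that are gaps in both. The study's assembly software may define "consensus pairwise identity" differently.

```python
def pairwise_identity(aligned_a, aligned_b):
    """Percent identity between two equal-length aligned sequences.

    Columns where both sequences have a gap ('-') are ignored; a gap
    opposite a base counts as a mismatch (an assumption of this sketch).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0

# Hypothetical 8-column alignment with one mismatch.
print(pairwise_identity("ACGTACGT", "ACGTACGA"))
```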
NASA Astrophysics Data System (ADS)
Sui, Xin; Yang, Yongqing; Xu, Xianyun; Zhang, Shuai; Zhang, Lingzhong
2018-02-01
This paper investigates the consensus of multi-agent systems with probabilistic time-varying delays and packet losses via sampled-data control. On the one hand, a Bernoulli-distributed white sequence is employed to model random packet losses among agents. On the other hand, a switched system is used to describe packet dropouts in a deterministic way. Based on the special property of the Laplacian matrix, the consensus problem can be converted into a stabilization problem of a switched system with lower dimensions. Some mean square consensus criteria are derived in terms of constructing an appropriate Lyapunov function and using linear matrix inequalities (LMIs). Finally, two numerical examples are given to show the effectiveness of the proposed method.
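The Bernoulli packet-loss model can be illustrated with a small discrete-time simulation. This is a deliberately reduced sketch and not the paper's controller: it omits the time-varying delays and the LMI-based design, and the ring topology, step size, loss probability, and function name are all assumptions chosen for the example.

```python
import random

def simulate_consensus(x0, neighbors, step=0.3, loss_prob=0.3, n_steps=200, seed=0):
    """Sampled-data averaging where each link delivery is a Bernoulli trial.

    At every sampling instant agent i moves toward each neighbor j whose
    packet arrives (probability 1 - loss_prob). With step * degree <= 1,
    each update is a convex combination of current states.
    """
    rng = random.Random(seed)
    x = list(x0)
    for _ in range(n_steps):
        new_x = []
        for i, xi in enumerate(x):
            update = 0.0
            for j in neighbors[i]:
                if rng.random() >= loss_prob:  # packet from j received
                    update += x[j] - xi
            new_x.append(xi + step * update)
        x = new_x
    return x

# Four agents on an undirected ring (assumed topology).
ring = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
final = simulate_consensus([1.0, -2.0, 4.0, 0.5], ring)
print(max(final) - min(final))
```

Because losses are drawn independently per directed link, the agents agree on a common value but that value need not be the initial average.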
Mohammadi, Mohammad; Rasaee, Mohammad Javad; Rajabibazl, Masoumeh; Paknejad, Malihe; Zare, Mehrak; Mohammadzadeh, Sara
2007-08-01
PR81 is an anti-MUC1 monoclonal antibody (MAb) which was generated against human MUC1 mucin that reacted with breast cancerous tissue, MUC1-positive cell lines (MCF-7, BT-20, and T-47D), and synthetic peptide, including the tandem repeat sequence of MUC1. Here we characterized the binding properties of PR81 against the tandem repeat of MUC1 by two different epitope mapping techniques, namely, PEPSCAN and phage display. Epitope mapping of PR81 MAb by PEPSCAN revealed a minimal consensus binding sequence, PDTRP, which is found on MUC1 peptide as the most important epitope. Using the phage display peptide library, we identified the motif PD(T/S/G)RP as an epitope and the motif AVGLSPDGSRGV as a mimotope recognized by PR81. Results of these two methods showed that the two residues, arginine and aspartic acid, have important roles in antibody binding and threonine can be substituted by either glycine or serine. These results may be of importance in tailor-making antigens used in immunoassay.
Wu, S C; Grindley, J; Winnier, G E; Hargett, L; Hogan, B L
1998-01-01
Cloning and sequencing of mouse Mf2 (mesoderm/mesenchyme forkhead 2) cDNAs revealed an open reading frame encoding a putative protein of 492 amino acids which, after in vitro translation, binds to a DNA consensus sequence. Mf2 is expressed at high levels in the ventral region of newly formed somites, in sclerotomal derivatives, in lateral plate and cephalic mesoderm and in the first and second branchial arches. Other regions of mesodermal expression include the developing tongue, meninges, nose, whiskers, kidney, genital tubercle and limb joints. In the nervous system Mf2 is transcribed in restricted regions of the mid- and forebrain. In several tissues, including the early somite, Mf2 is expressed in cell populations adjacent to regions expressing sonic hedgehog (Shh) and in explant cultures of presomitic mesoderm Mf2 is induced by Shh secreted by COS cells. These results suggest that Mf2, like other murine forkhead genes, has multiple roles in embryogenesis, possibly mediating the response of cells to signaling molecules such as SHH.
USDA-ARS's Scientific Manuscript database
Background: Vertebrate immune systems generate diverse repertoires of antibodies capable of mediating response to a variety of antigens. Next generation sequencing methods provide unique approaches to a number of immuno-based research areas including antibody discovery and engineering, disease surve...
Liang, Yunyun; Liu, Sanyang; Zhang, Shengli
2015-01-01
Prediction of protein structural classes for low-similarity sequences is useful for understanding fold patterns, regulation, functions, and interactions of proteins. It is well known that feature extraction is significant to prediction of protein structural class and it mainly uses protein primary sequence, predicted secondary structure sequence, and position-specific scoring matrix (PSSM). Currently, prediction solely based on the PSSM has played a key role in improving the prediction accuracy. In this paper, we propose a novel method called CSP-SegPseP-SegACP by fusing consensus sequence (CS), segmented PsePSSM, and segmented autocovariance transformation (ACT) based on PSSM. Three widely used low-similarity datasets (1189, 25PDB, and 640) are adopted in this paper. Then a 700-dimensional (700D) feature vector is constructed and the dimension is decreased to 224D by using principal component analysis (PCA). To verify the performance of our method, rigorous jackknife cross-validation tests are performed on 1189, 25PDB, and 640 datasets. Comparison of our results with the existing PSSM-based methods demonstrates that our method achieves favorable and competitive performance. This will offer an important complement to other PSSM-based methods for prediction of protein structural classes for low-similarity sequences.
Zhang, Wenshan; Hu, Dandan; Raman, Rosy; Guo, Shaomin; Wei, Zili; Shen, Xueqi; Meng, Jinling; Raman, Harsh; Zou, Jun
2017-01-01
Brassica carinata (BBCC) is an allotetraploid in Brassicas with unique alleles for agronomic traits and has huge potential as a source for biodiesel production. To investigate the genome-wide molecular diversity, population structure and linkage disequilibrium (LD) pattern in this species, we genotyped a panel of 81 accessions of B. carinata with the genotyping-by-sequencing approach DArTseq, generating a total of 54,510 polymorphic markers. Two subpopulations were exhibited in the B. carinata accessions. The average distance of LD decay (r² = 0.1) in B subgenome (0.25 Mb) was shorter than that of C subgenome (0.40 Mb). Genome-wide association analysis (GWAS) identified a total of seven markers significantly associated with five seed quality traits in two experiments. To further identify the quantitative trait loci (QTL) for important agronomic and seed quality traits, we phenotyped a doubled haploid (DH) mapping population derived from the “YW” cross between two parents (Y-BcDH64 and W-BcDH76) representing the two subpopulations. The YW DH population and its parents were grown in three contrasting environments: spring (Hezheng and Xining, China), semi-winter (Wuhan, China), and spring (Wagga Wagga, Australia) across 5 years for QTL mapping. Genetic bases of phenotypic variation in seed yield and its seven related traits, and six seed quality traits were determined. A total of 282 consensus QTL accounting for these traits were identified including nine major QTL for flowering time, oleic acid, linolenic acid, pod number of main inflorescence, and seed weight. Of these, 109 and 134 QTL were specific to spring and semi-winter environment, respectively, while 39 consensus QTL were identified in both contrasting environments. Two QTL identified for linolenic acid (B3) and erucic acid (C7) were validated in the diverse lines used for GWAS.
A total of 25 QTL accounting for flowering time, erucic acid, and oleic acid were aligned to the homologous QTL or candidate gene regions in the C genome of B. napus. These results would not only provide insights for genetic improvement of this species, but will also identify useful genetic variation hidden in the Cc subgenome of B. carinata to improve canola cultivars. PMID:28484482
Method for isolating chromosomal DNA in preparation for hybridization in suspension
Lucas, Joe N.
2000-01-01
A method is provided for detecting nucleic acid sequence aberrations using two immobilization steps. According to the method, a nucleic acid sequence aberration is detected by detecting nucleic acid sequences having both a first nucleic acid sequence type (e.g., from a first chromosome) and a second nucleic acid sequence type (e.g., from a second chromosome), the presence of the first and the second nucleic acid sequence type on the same nucleic acid sequence indicating the presence of a nucleic acid sequence aberration. In the method, immobilization of a first hybridization probe is used to isolate a first set of nucleic acids in the sample which contain the first nucleic acid sequence type. Immobilization of a second hybridization probe is then used to isolate a second set of nucleic acids from within the first set of nucleic acids which contain the second nucleic acid sequence type. The second set of nucleic acids are then detected, their presence indicating the presence of a nucleic acid sequence aberration. Chromosomal DNA in a sample containing cell debris is prepared for hybridization in suspension by treating the mixture with RNase. The treated DNA can also be fixed prior to hybridization.
Kim, Seongman; Dai, Gan; O’Callaghan, Dennis J.; Kim, Seong Kee
2012-01-01
The immediate-early protein (IEP), the major regulatory protein encoded by the IE gene of equine herpesvirus 1 (EHV-1), plays a crucial role as both transcription activator and repressor during a productive lytic infection. To investigate the mechanism by which the EHV-1 IEP inhibits its own promoter, IE promoter-luciferase reporter plasmids containing wild-type and mutant IEP-binding site (IEBS) were constructed and used for luciferase reporter assays. The IEP inhibited transcription from its own promoter in the presence of a consensus IEBS (5’-ATCGT-3’) located near the transcription initiation site but did not inhibit when the consensus sequence was deleted. To determine whether the distance between the TATA box and the IEBS affects transcriptional repression, the IEBS was displaced from the original site by the insertion of synthetic DNA sequences. Luciferase reporter assays revealed that the IEP is able to repress its own promoter when the IEBS is located within 26-bp from the TATA box. We also found that the proper orientation and position of the IEBS were required for the repression by the IEP. Interestingly, the level of repression was significantly reduced when a consensus TATA sequence was deleted from the promoter region, indicating that the IEP efficiently inhibits its own promoter in a TATA box-dependent manner. Taken together, these results suggest that the EHV-1 IEP delicately modulates autoregulation of its gene through the consensus IEBS that is near the transcription initiation site and the TATA box. PMID:22265772
Kim, Seongman; Dai, Gan; O'Callaghan, Dennis J; Kim, Seong Kee
2012-04-01
The immediate-early protein (IEP), the major regulatory protein encoded by the IE gene of equine herpesvirus 1 (EHV-1), plays a crucial role as both transcription activator and repressor during a productive lytic infection. To investigate the mechanism by which the EHV-1 IEP inhibits its own promoter, IE promoter-luciferase reporter plasmids containing wild-type and mutant IEP-binding site (IEBS) were constructed and used for luciferase reporter assays. The IEP inhibited transcription from its own promoter in the presence of a consensus IEBS (5'-ATCGT-3') located near the transcription initiation site but did not inhibit when the consensus sequence was deleted. To determine whether the distance between the TATA box and the IEBS affects transcriptional repression, the IEBS was displaced from the original site by the insertion of synthetic DNA sequences. Luciferase reporter assays revealed that the IEP is able to repress its own promoter when the IEBS is located within 26-bp from the TATA box. We also found that the proper orientation and position of the IEBS were required for the repression by the IEP. Interestingly, the level of repression was significantly reduced when a consensus TATA sequence was deleted from the promoter region, indicating that the IEP efficiently inhibits its own promoter in a TATA box-dependent manner. Taken together, these results suggest that the EHV-1 IEP delicately modulates autoregulation of its gene through the consensus IEBS that is near the transcription initiation site and the TATA box. Copyright © 2012. Published by Elsevier B.V.
Kortenhoeven, Cornell; Joubert, Fourie; Bastos, Armanda D S; Abolnik, Celia
2015-02-22
Extensive focus is placed on the comparative analyses of consensus genotypes in the study of West Nile virus (WNV) emergence. Few studies account for genetic change in the underlying WNV quasispecies population variants. These variants are not discernable in the consensus genome at the time of emergence, and the maintenance of mutation-selection equilibria of population variants is greatly underestimated. The emergence of lineage 1 WNV strains has been studied extensively, but recent epidemics caused by lineage 2 WNV strains in Hungary, Austria, Greece and Italy emphasize the increasing importance of this lineage to public health. In this study we explored the quasispecies dynamics of minority variants that contribute to cell-tropism and host determination, i.e. the ability to infect different cell types or cells from different species from Next Generation Sequencing (NGS) data of a historic lineage 2 WNV strain. Minority variants contributing to host cell membrane association persist in the viral population without contributing to the genetic change in the consensus genome. Minority variants are shown to maintain a stable mutation-selection equilibrium under positive selection, particularly in the capsid gene region. This study is the first to infer positive selection and the persistence of WNV haplotype variants that contribute to viral fitness without accompanying genetic change in the consensus genotype, documented solely from NGS sequence data. The approach used in this study streamlines the experimental design seeking viral minority variants accurately from NGS data whilst minimizing the influence of associated sequence error.
Sechi, Leonardo A.; Zanetti, Stefania; Dupré, Ilaria; Delogu, Giovanni; Fadda, Giovanni
1998-01-01
The presence of enterobacterial repetitive intergenic consensus (ERIC) sequences was demonstrated for the first time in the genome of Mycobacterium tuberculosis; these sequences have been found in transcribed regions of the chromosomes of gram-negative bacteria. In this study genetic diversity among clinical isolates of M. tuberculosis was determined by PCR with ERIC primers (ERIC-PCR). The study isolates comprised 71 clinical isolates collected from Sardinia, Italy. ERIC-PCR was able to identify 59 distinct profiles. The results obtained were compared with IS6110 and PCR-GTG fingerprinting. We found that the level of differentiation obtained by ERIC-PCR is greater than that obtained by IS6110 fingerprinting and comparable to that obtained by PCR-GTG. This method of fingerprinting is rapid and sensitive and can be applied to the study of the epidemiology of M. tuberculosis infections, especially when IS6110 fingerprinting is not of any help. PMID:9431935
Ho, Cynthia K. Y.; Raghwani, Jayna; Koekkoek, Sylvie; Liang, Richard H.; Van der Meer, Jan T. M.; Van Der Valk, Marc; De Jong, Menno; Pybus, Oliver G.
2016-01-01
ABSTRACT In contrast to other available next-generation sequencing platforms, PacBio single-molecule, real-time (SMRT) sequencing has the advantage of generating long reads albeit with a relatively higher error rate in unprocessed data. Using this platform, we longitudinally sampled and sequenced the hepatitis C virus (HCV) envelope genome region (1,680 nucleotides [nt]) from individuals belonging to a cluster of sexually transmitted cases. All five subjects were coinfected with HIV-1 and a closely related strain of HCV genotype 4d. In total, 50 samples were analyzed by using SMRT sequencing. By using 7 passes of circular consensus sequencing, the error rate was reduced to 0.37%, and the median number of sequences was 612 per sample. A further reduction of insertions was achieved by alignment against a sample-specific reference sequence. However, in vitro recombination during PCR amplification could not be excluded. Phylogenetic analysis supported close relationships among HCV sequences from the four male subjects and subsequent transmission from one subject to his female partner. Transmission was characterized by a strong genetic bottleneck. Viral genetic diversity was low during acute infection and increased upon progression to chronicity but subsequently fluctuated during chronic infection, caused by the alternate detection of distinct coexisting lineages. SMRT sequencing combines long reads with sufficient depth for many phylogenetic analyses and can therefore provide insights into within-host HCV evolutionary dynamics without the need for haplotype reconstruction using statistical algorithms. IMPORTANCE Next-generation sequencing has revolutionized the study of genetically variable RNA virus populations, but for phylogenetic and evolutionary analyses, longer sequences than those generated by most available platforms, while minimizing the intrinsic error rate, are desired. 
Here, we demonstrate for the first time that PacBio SMRT sequencing technology can be used to generate full-length HCV envelope sequences at the single-molecule level, providing a data set with large sequencing depth for the characterization of intrahost viral dynamics. The selection of consensus reads derived from at least 7 full circular consensus sequencing rounds significantly reduced the intrinsic high error rate of this method. We used this method to genetically characterize a unique transmission cluster of sexually transmitted HCV infections, providing insight into the distinct evolutionary pathways in each patient over time and identifying the transmission-associated genetic bottleneck as well as fluctuations in viral genetic diversity over time, accompanied by dynamic shifts in viral subpopulations. PMID:28077634
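The idea behind multi-pass consensus reads can be illustrated with a column-wise majority vote. This is only a minimal sketch: it assumes the passes are already aligned to equal length, the function name `majority_consensus` and the toy reads are hypothetical, and the actual circular consensus algorithm used by the platform is quality-aware and is not reproduced here.

```python
from collections import Counter


def majority_consensus(reads):
    """Return the column-wise majority base over equal-length, pre-aligned reads."""
    if not reads:
        raise ValueError("need at least one read")
    length = len(reads[0])
    if any(len(read) != length for read in reads):
        raise ValueError("reads must be aligned to the same length")
    consensus = []
    for column in zip(*reads):
        # most_common(1) gives the most frequent base in this column
        base, _count = Counter(column).most_common(1)[0]
        consensus.append(base)
    return "".join(consensus)


# Seven hypothetical passes over the same molecule, each with at most one error
passes = ["ACGTAC", "ACGTAC", "ACCTAC", "ACGTAC", "ACGTTC", "ACGTAC", "ACGAAC"]
print(majority_consensus(passes))
```

Because each simulated error occurs in a different pass, every column still has a clear majority, which is the intuition for why requiring more passes lowers the consensus error rate.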
Discovery of 12-mer peptides that bind to wood lignin
Yamaguchi, Asako; Isozaki, Katsuhiro; Nakamura, Masaharu; Takaya, Hikaru; Watanabe, Takashi
2016-01-01
Lignin, an abundant terrestrial polymer, is the only large-volume renewable feedstock composed of an aromatic skeleton. Lignin has been used mostly as an energy source during paper production; however, recent interest in replacing fossil fuels with renewable resources has highlighted its potential value in providing aromatic chemicals. Highly selective degradation of lignin is pivotal for industrial production of paper, biofuels, chemicals, and materials. However, few studies have examined natural and synthetic molecular components recognizing the heterogeneous aromatic polymer. Here, we report the first identification of lignin-binding peptides possessing characteristic sequences using a phage display technique. The consensus sequence HFPSP was found in several lignin-binding peptides, and the outer amino acid sequence affected the binding affinity of the peptides. Substitution of phenylalanine7 with Ile in the lignin-binding peptide C416 (HFPSPIFQRHSH) decreased the affinity of the peptide for softwood lignin without changing its affinity for hardwood lignin, indicating that C416 recognised structural differences between the lignins. Circular dichroism spectroscopy demonstrated that this peptide adopted a highly flexible random coil structure, allowing key residues to be appropriately arranged in relation to the binding site in lignin. These results provide a useful platform for designing synthetic and biological catalysts that selectively bind to lignin. PMID:26903196
Molecular cloning and characterization of novel phytocystatin gene from turmeric, Curcuma longa.
Chan, Seow-Neng; Abu Bakar, Norliza; Mahmood, Maziah; Ho, Chai-Ling; Shaharuddin, Noor Azmi
2014-01-01
Phytocystatin, a type of protease inhibitor (PI), plays major roles in plant defense mechanisms and has been reported to show antipathogenic properties and plant stress tolerance. Recombinant plant PIs are gaining popularity as potential candidates in engineering of crop protection and in synthesizing medicine. It is therefore crucial to identify PI from novel sources like Curcuma longa as it is more effective in combating pathogens due to its novelty. In this study, a novel cDNA fragment encoding phytocystatin was isolated using degenerate PCR primers, designed from consensus regions of phytocystatin from other plant species. A full-length cDNA of the phytocystatin gene, designated CypCl, was acquired using 5′/3′ rapid amplification of cDNA ends method and it has been deposited in NCBI database (accession number KF545954.1). It has a 687 bp long open reading frame (ORF) which encodes 228 amino acids. BLAST result indicated that CypCl is similar to cystatin protease inhibitor from Cucumis sativus with 74% max identity. Sequence analysis showed that CypCl contains most of the motifs found in a cystatin, including a G residue, LARFAV-, QxVxG sequence, PW dipeptide, and SNSL sequence at C-terminal extension. Phylogenetic studies also showed that CypCl is related to phytocystatin from Elaeis guineensis. PMID:25853138
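The relationship between the reported ORF length and protein length can be checked with simple arithmetic. The sketch below assumes the 687 bp figure includes the terminal stop codon, which is an assumption that happens to reproduce the reported 228 residues; the helper name `protein_length` is hypothetical.

```python
def protein_length(orf_bp, includes_stop=True):
    """Number of amino acids encoded by an ORF of the given length in base pairs."""
    if orf_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    codons = orf_bp // 3
    # The stop codon occupies one codon but encodes no residue
    return codons - 1 if includes_stop else codons


# 687 bp -> 229 codons -> 228 residues when the stop codon is counted in the ORF
print(protein_length(687))
```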
Genetic Diversity in Oxytocin Ligands and Receptors in New World Monkeys
Ren, Dongren; Lu, Guoqing; Moriyama, Hideaki; Mustoe, Aaryn C.; Harrison, Emily B.; French, Jeffrey A.
2015-01-01
Oxytocin (OXT) is an important neurohypophyseal hormone that influences a wide spectrum of reproductive and social processes. Eutherian mammals possess a highly conserved sequence of OXT (Cys-Tyr-Ile-Gln-Asn-Cys-Pro-Leu-Gly). However, in this study, we sequenced the coding region for OXT in 22 species covering all New World monkey (NWM) genera and clades, and characterized five OXT variants, including consensus mammalian Leu8-OXT, major variant Pro8-OXT, and three previously unreported variants: Ala8-OXT, Thr8-OXT, and Phe2-OXT. Pro8-OXT shows clear structural and physicochemical differences from Leu8-OXT. We report multiple predicted amino acid substitutions in the G protein-coupled OXT receptor (OXTR), especially in the N-terminus, which is crucial for OXT recognition and binding. Genera with the same Pro8-OXT tend to cluster together on a phylogenetic tree based on OXTR sequence, and we demonstrate significant coevolution between OXT and OXTR. NWM species are characterized by a high incidence of social monogamy, and we document an association between OXTR phylogeny and social monogamy. Our results demonstrate remarkable genetic diversity in the NWM OXT/OXTR system, which can provide a foundation for molecular, pharmacological, and behavioral studies of the role of OXT signaling in regulating complex social phenotypes. PMID:25938568
Li, Jie; Overall, Christopher C.; Johnson, Rudd C.; ...
2015-09-21
The alternative sigma factor σE functions to maintain bacterial homeostasis and membrane integrity in response to extracytoplasmic stress by regulating thousands of genes both directly and indirectly. The transcriptional regulatory network governed by σE in Salmonella and E. coli has been examined using microarrays; however, a genome-wide analysis of σE-binding sites in Salmonella has not yet been reported. We infected macrophages with Salmonella Typhimurium over a select time course. Using chromatin immunoprecipitation followed by high-throughput DNA sequencing (ChIP-seq), 31 σE-binding sites were identified. Seventeen sites were new, which included outer membrane proteins, a quorum-sensing protein, a cell division factor, and a signal transduction modulator. The consensus sequence identified for σE in vivo binding was similar to the one previously reported, except for a conserved G and A between the -35 and -10 regions. One third of the σE-binding sites did not contain the consensus sequence, suggesting there may be alternative mechanisms by which σE modulates transcription. By dissecting direct and indirect modes of σE-mediated regulation, we found that σE activates gene expression through recognition of both canonical and reversed consensus sequences. Lastly, new σE-regulated genes (greA, luxS, ompA and ompX) are shown to be involved in heat shock and oxidative stress responses.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kantarski, Traci; Larson, Steve; Zhang, Xiaofei; DeHaan, Lee; Borevitz, Justin; Anderson, James; Poland, Jesse
2017-01-01
Development of the first consensus genetic map of intermediate wheatgrass gives insight into the genome and tools for molecular breeding. Intermediate wheatgrass (Thinopyrum intermedium) has been identified as a candidate for domestication and improvement as a perennial grain, forage, and biofuel crop and is actively being improved by several breeding programs. To accelerate this process using genomics-assisted breeding, efficient genotyping methods and genetic marker reference maps are needed. We present here the first consensus genetic map for intermediate wheatgrass (IWG), which confirms the species' allohexaploid nature (2n = 6x = 42) and homology to Triticeae genomes. Genotyping-by-sequencing was used to identify markers that fit expected segregation ratios and construct genetic maps for 13 heterogeneous parents of seven full-sib families. These maps were then integrated using a linear programming method to produce a consensus map with 21 linkage groups containing 10,029 markers, 3601 of which were present in at least two populations. Each of the 21 linkage groups contained between 237 and 683 markers, cumulatively covering 5061 cM (2891 cM--Kosambi) with an average distance of 0.5 cM between each pair of markers. Through mapping the sequence tags to the diploid (2n = 2x = 14) barley reference genome, we observed high colinearity and synteny between these genomes, with three homoeologous IWG chromosomes corresponding to each of the seven barley chromosomes, and mapped translocations that are known in the Triticeae. The consensus map is a valuable tool for wheat breeders to map important disease-resistance genes within intermediate wheatgrass. These genomic tools can help lead to rapid improvement of IWG and development of high-yielding cultivars of this perennial grain that would facilitate the sustainable intensification of agricultural systems.
Quick, Joshua; Grubaugh, Nathan D; Pullan, Steven T; Claro, Ingra M; Smith, Andrew D; Gangavarapu, Karthik; Oliveira, Glenn; Robles-Sikisaka, Refugio; Rogers, Thomas F; Beutler, Nathan A; Burton, Dennis R; Lewis-Ximenez, Lia Laura; de Jesus, Jaqueline Goes; Giovanetti, Marta; Hill, Sarah C; Black, Allison; Bedford, Trevor; Carroll, Miles W; Nunes, Marcio; Alcantara, Luiz Carlos; Sabino, Ester C; Baylis, Sally A; Faria, Nuno R; Loose, Matthew; Simpson, Jared T; Pybus, Oliver G; Andersen, Kristian G; Loman, Nicholas J
2017-06-01
Genome sequencing has become a powerful tool for studying emerging infectious diseases; however, genome sequencing directly from clinical samples (i.e., without isolation and culture) remains challenging for viruses such as Zika, for which metagenomic sequencing methods may generate insufficient numbers of viral reads. Here we present a protocol for generating coding-sequence-complete genomes, comprising an online primer design tool, a novel multiplex PCR enrichment protocol, optimized library preparation methods for the portable MinION sequencer (Oxford Nanopore Technologies) and the Illumina range of instruments, and a bioinformatics pipeline for generating consensus sequences. The MinION protocol does not require an Internet connection for analysis, making it suitable for field applications with limited connectivity. Our method relies on multiplex PCR for targeted enrichment of viral genomes from samples containing as few as 50 genome copies per reaction. Viral consensus sequences can be achieved in 1-2 d by starting with clinical samples and following a simple laboratory workflow. This method has been successfully used by several groups studying Zika virus evolution and is facilitating an understanding of the spread of the virus in the Americas. The protocol can be used to sequence other viral genomes using the online Primal Scheme primer designer software. It is suitable for sequencing either RNA or DNA viruses in the field during outbreaks or as an inexpensive, convenient method for use in the lab.
Torto-Alalibo, Trudy; Tian, Miaoying; Gajendran, Kamal; Waugh, Mark E; van West, Pieter; Kamoun, Sophien
2005-01-01
Background The oomycete Saprolegnia parasitica is one of the most economically important fish pathogens. There has been a dramatic recrudescence of Saprolegnia infections in aquaculture since the use of the toxic organic dye malachite green was banned in 2002. Little is known about the molecular mechanisms underlying pathogenicity in S. parasitica and other animal pathogenic oomycetes. In this study we used a genomics approach to gain a first insight into the transcriptome of S. parasitica. Results We generated 1510 expressed sequence tags (ESTs) from a mycelial cDNA library of S. parasitica. A total of 1279 consensus sequences corresponding to 525944 base pairs were assembled. About half of the unigenes showed similarities to known protein sequences or motifs. The S. parasitica sequences tended to be relatively divergent from Phytophthora sequences. Based on the sequence alignments of 18 conserved proteins, the average amino acid identity between S. parasitica and three Phytophthora species was 77% compared to 93% within Phytophthora. Several S. parasitica cDNAs, such as those with similarity to fungal type I cellulose binding domain proteins, PAN/Apple module proteins, glycosyl hydrolases, proteases, as well as serine and cysteine protease inhibitors, were predicted to encode secreted proteins that could function in virulence. Some of these cDNAs were more similar to fungal proteins than to other eukaryotic proteins, confirming that oomycetes and fungi share some virulence components despite their evolutionary distance. Conclusion We provide a first glimpse into the gene content of S. parasitica, a reemerging oomycete fish pathogen. These resources will greatly accelerate research on this important pathogen. The data are available online through the Oomycete Genomics Database [1]. PMID:16076392
Nucleotide sequence of the gag gene and gag-pol junction of feline leukemia virus.
Laprevotte, I; Hampe, A; Sherr, C J; Galibert, F
1984-01-01
The nucleotide sequence of the gag gene of feline leukemia virus and its flanking sequences were determined and compared with the corresponding sequences of two strains of feline sarcoma virus and with that of the Moloney strain of murine leukemia virus. A high degree of nucleotide sequence homology between the feline leukemia virus and murine leukemia virus gag genes was observed, suggesting that retroviruses of domestic cats and laboratory mice have a common, proximal evolutionary progenitor. The predicted structure of the complete feline leukemia virus gag gene precursor suggests that the translation of nonglycosylated and glycosylated gag gene polypeptides is initiated at two different AUG codons. These initiator codons fall in the same reading frame and are separated by a 222-base-pair segment which encodes an amino terminal signal peptide. The nucleotide sequence predicts the order of amino acids in each of the individual gag-coded proteins (p15, p12, p30, p10), all of which derive from the gag gene precursor. Stable stem-and-loop secondary structures are proposed for two regions of viral RNA. The first falls within sequences at the 5' end of the viral genome, together with adjacent palindromic sequences which may play a role in dimer linkage of RNA subunits. The second includes coding sequences at the gag-pol junction and is proposed to be involved in translation of the pol gene product. Sequence analysis of the latter region shows that the gag and pol genes are translated in different reading frames. Classical consensus splice donor and acceptor sequences could not be localized to regions which would permit synthesis of the expected gag-pol precursor protein. Alternatively, we suggest that the pol gene product (RNA-dependent DNA polymerase) could be translated by a frameshift suppressing mechanism which could involve cleavage modification of stems and loops in a manner similar to that observed in tRNA processing. PMID:6328019
Caspase-1 from the silkworm, Bombyx mori, is involved in Bombyx mori nucleopolyhedrovirus infection.
Wang, Qiang; Ju, Xiaoli; Chen, Liang; Chen, Keping
2017-03-01
Caspase-1 is one of the effector caspases in mammals that plays a central role in apoptosis. However, the lepidopteran caspase-1, especially the Bombyx mori caspase-1 (Bm-caspase-1), has not been investigated in detail. In this study, Bm-caspase-1 was identified from an expressed sequence tag database in B. mori by BLAST search. The open reading frame of Bm-caspase-1 contained 879 nucleotides and encoded 293 amino acids with a predicted molecular mass of 33 kDa. Bm-caspase-1 contained two consensus amino acid motifs of caspase cleavage sites, DEGDA and TETDG. Caspase activity assays revealed significant proteolytic activity toward the Ac-DEVD-pNA substrate. Bm-caspase-1 was detected in all tissues and developmental stages by a semiquantitative polymerase chain reaction assay. More importantly, the expression level of Bm-caspase-1 increased upon baculovirus infection and was up-regulated in BmNPV-resistant silkworms. Taken together, these results indicate that Bm-caspase-1 plays an important role during baculovirus infection.
Chaves Neto, Antonio Hernandes; Queiroz, Karla Cristiana; Milani, Renato; Paredes-Gamero, Edgar Julian; Justo, Giselle Zenker; Peppelenbosch, Maikel P; Ferreira, Carmen Veríssima
2011-01-01
Despite numerous reports on the ability of ascorbic acid and β-glycerophosphate (AA/β-GP) to induce osteoblast differentiation, little is known about the molecular mechanisms involved in this phenomenon. In this work, we used a peptide array containing specific consensus sequences (potential substrates) for protein kinases and traditional biochemical techniques to examine the signaling pathways modulated during AA/β-GP-induced osteoblast differentiation. The kinomic profile obtained after 7 days of treatment with AA/β-GP identified 18 kinase substrates with significantly enhanced or reduced phosphorylation. Peptide substrates for Akt, PI3K, PKC, BCR, ABL, PRKG1, PAK1, PAK2, ERK1, ERBB2, and SYK showed a considerable reduction in phosphorylation, whereas enhanced phosphorylation was observed in substrates for CHKB, CHKA, PKA, FAK, ATM, PKA, and VEGFR-1. These findings confirm the potential usefulness of peptide microarrays for identifying kinases known to be involved in bone development in vivo and in vitro and show that this technique can be used to investigate kinases whose function in osteoblastic differentiation is poorly understood.
A conserved mechanism for replication origin recognition and binding in archaea.
Majerník, Alan I; Chong, James P J
2008-01-15
To date, methanogens are the only group within the archaea where firing DNA replication origins have not been demonstrated in vivo. In the present study we show that a previously identified cluster of ORB (origin recognition box) sequences do indeed function as an origin of replication in vivo in the archaeon Methanothermobacter thermautotrophicus. Although the consensus sequence of ORBs in M. thermautotrophicus is somewhat conserved when compared with ORB sequences in other archaea, the Cdc6-1 protein from M. thermautotrophicus (termed MthCdc6-1) displays sequence-specific binding that is selective for the MthORB sequence and does not recognize ORBs from other archaeal species. Stabilization of in vitro MthORB DNA binding by MthCdc6-1 requires additional conserved sequences 3' to those originally described for M. thermautotrophicus. By testing synthetic sequences bearing mutations in the MthORB consensus sequence, we show that Cdc6/ORB binding is critically dependent on the presence of an invariant guanine found in all archaeal ORB sequences. Mutation of a universally conserved arginine residue in the recognition helix of the winged helix domain of archaeal Cdc6-1 shows that specific origin sequence recognition is dependent on the interaction of this arginine residue with the invariant guanine. Recognition of a mutated origin sequence can be achieved by mutation of the conserved arginine residue to a lysine or glutamine residue. Thus despite a number of differences in protein and DNA sequences between species, the mechanism of origin recognition and binding appears to be conserved throughout the archaea.
Tins, B; Cassar-Pullicino, V; Haddaway, M; Nachtrab, U
2012-08-01
Objectives The bulk of spinal imaging is still performed with conventional two-dimensional sequences. This study assesses the suitability of three-dimensional sampling perfection with application-optimised contrasts using different flip angle evolutions (SPACE) sequence for routine spinal imaging. Methods 62 MRI examinations of the spine were evaluated by 2 examiners in consensus for the depiction of anatomy and presence of artefact. We noted pathologies that might be missed using the SPACE sequence only or the SPACE and a sagittal T1 weighted sequence. The reference standards were sagittal and axial T1 weighted and T2 weighted sequences. At a later date the evaluation was repeated by one of the original examiners and an additional examiner. Results There was good agreement of the single evaluations and consensus evaluation for the conventional sequences: κ>0.8, confidence interval (CI)>0.6–1.0. For the SPACE sequence, depiction of anatomy was very good for 84% of cases, with high interobserver agreement, but there was poor interobserver agreement for other cases. For artefact assessment of SPACE, κ=0.92, CI=0.92–1.0. The SPACE sequence was superior to conventional sequences for depiction of anatomy and artefact resistance. The SPACE sequence occasionally missed bone marrow oedema. In conjunction with sagittal T1 weighted sequences, no abnormality was missed. The isotropic SPACE sequence was superior to conventional sequences in imaging difficult anatomy such as in scoliosis and spondylolysis. Conclusion The SPACE sequence allows excellent assessment of anatomy owing to high spatial resolution and resistance to artefact. The sensitivity for bone marrow abnormalities is limited. PMID:22374284
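The κ values quoted above summarise interobserver agreement corrected for chance. The abstract does not state which κ statistic was used, so the following is only a sketch of unweighted Cohen's κ for two raters; the function name `cohen_kappa` and the ratings are hypothetical and are not data from the study.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters scoring the same cases."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("raters must score the same non-empty set of cases")
    n = len(rater_a)
    # Observed proportion of cases on which both raters agree
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    freq_a = Counter(rater_a)
    freq_b = Counter(rater_b)
    # Agreement expected by chance from each rater's marginal frequencies
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical category throughout
    return (observed - expected) / (1 - expected)


# Hypothetical image-quality ratings for ten examinations
examiner_1 = ["good"] * 8 + ["poor"] * 2
examiner_2 = ["good"] * 7 + ["poor"] * 3
print(round(cohen_kappa(examiner_1, examiner_2), 3))
```

In this toy case the raters agree on 9 of 10 examinations, but chance agreement is 0.62, so κ is noticeably lower than the raw 90% agreement.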
Zhuo, Tao; Li, Yuan-Yuan; Xiang, Hai-Ying; Wu, Zhan-Yu; Wang, Xian-Bin; Wang, Ying; Zhang, Yong-Liang; Li, Da-Wei; Yu, Jia-Lin; Han, Cheng-Gui
2014-06-01
Polerovirus P0 suppressors of host gene silencing contain a consensus F-box-like motif with Leu/Pro (L/P) requirements for suppressor activity. The Inner Mongolian Potato leafroll virus (PLRV) P0 protein (P0(PL-IM)) has an unusual F-box-like motif that contains a Trp/Gly (W/G) sequence and an additional GW/WG-like motif (G139/W140/G141) that is lacking in other P0 proteins. We used Agrobacterium infiltration-mediated RNA silencing assays to establish that P0(PL-IM) has a strong suppressor activity. Mutagenesis experiments demonstrated that the P0(PL-IM) F-box-like motif encompasses amino acids 76-LPRHLHYECLEWGLLCGTHP-95, and that the suppressor activity is abolished by L76A, W87A, or G88A substitution. The suppressor activity is also weakened substantially by mutations within the G139/W140/G141 region and is eliminated by a mutation (F220R) in a C-terminal conserved sequence of P0(PL-IM). As has been observed with other P0 proteins, P0(PL-IM) suppression is correlated with reduced accumulation of the host AGO1-silencing complex protein. However, P0(PL-IM) fails to bind SKP1, which functions in a proteasome pathway that may be involved in AGO1 degradation. These results suggest that P0(PL-IM) may suppress RNA silencing by using an alternative pathway to target AGO1 for degradation. Our results help improve our understanding of the molecular mechanisms involved in PLRV infection.
Stevenson, Clare E. M.; Assaad, Aoun; Chandra, Govind; Le, Tung B. K.; Greive, Sandra J.; Bibb, Mervyn J.; Lawson, David M.
2013-01-01
Consistent with their complex lifestyles and rich secondary metabolite profiles, the genomes of streptomycetes encode a plethora of transcription factors, the vast majority of which are uncharacterized. Herein, we use Surface Plasmon Resonance (SPR) to identify and delineate putative operator sites for SCO3205, a MarR family transcriptional regulator from Streptomyces coelicolor that is well represented in sequenced actinomycete genomes. In particular, we use a novel SPR footprinting approach that exploits indirect ligand capture to vastly extend the lifetime of a standard streptavidin SPR chip. We define two operator sites upstream of sco3205, and a pseudopalindromic consensus sequence derived from these enables further potential operator sites to be identified in the S. coelicolor genome. We evaluate each of these through SPR and test the importance of the conserved bases within the consensus sequence. Informed by these results, we determine the crystal structure of a SCO3205-DNA complex at 2.8 Å resolution, enabling molecular level rationalization of the SPR data. Taken together, our observations support a DNA recognition mechanism involving both direct and indirect sequence readout. PMID:23748564
Pekosz, Andrew; Lamb, Robert A.
2000-01-01
Two mRNA species are derived from the influenza C virus RNA segment six, (i) a colinear transcript containing a 374-amino-acid residue open reading frame (referred to herein as the seg 6 ORF) which is translated to yield the p42 protein, and (ii) a spliced mRNA which encodes the influenza C virus matrix (CM1) protein consisting of the first 242 amino acids of p42. The p42 protein undergoes proteolytic cleavage at a consensus signal peptidase cleavage site after residue 259, yielding the p31 and CM2 proteins. Translocation of p42 into the endoplasmic reticulum membrane occurs cotranslationally and requires the hydrophobic internal signal peptide (residues 239 to 259), as well as the predicted transmembrane domain of CM2 (residues 285 to 308). The p31 protein was found to undergo rapid degradation after cleavage from p42. Addition of the 26S proteasome inhibitor lactacystin to influenza C virus-infected or seg 6 ORF cDNA-transfected cells drastically reduced p31 degradation. Transfer of the 17-residue C-terminal region of p31 to heterologous proteins resulted in their rapid turnover. The hydrophobic nature, but not the specific amino acid sequence of the 17-amino-acid C terminus of p31 appears to act as the signal for targeting the protein to membranes and for degradation. PMID:11044092
Rapid Multi-Locus Sequence Typing Using Microfluidic Biochips
2010-05-12
Sequence Types. The evolutionary history of all the B. cereus MLST concatenated Sequence Types (545 taxa, 2,394 nucleotide positions) was inferred using...the Neighbor-Joining method [28]. The bootstrap consensus tree inferred from 100 replicates was taken to represent the evolutionary history of the... Chlamydia (manuscript in preparation) and performed pilot studies on Staphylococcus aureus and Streptococcus pneumoniae (Data S4 and Text S2). Another potential
The sequence specificity of UV-induced DNA damage in a systematically altered DNA sequence.
Khoe, Clairine V; Chung, Long H; Murray, Vincent
2018-06-01
The sequence specificity of UV-induced DNA damage was investigated in a specifically designed DNA plasmid using two procedures: end-labelling and linear amplification. Absorption of UV photons by DNA leads to dimerisation of pyrimidine bases and produces two major photoproducts, cyclobutane pyrimidine dimers (CPDs) and pyrimidine(6-4)pyrimidone photoproducts (6-4PPs). A previous study had determined that two hexanucleotide sequences, 5'-GCTC*AC and 5'-TATT*AA, were high intensity UV-induced DNA damage sites. The UV clone plasmid was constructed by systematically altering each nucleotide of these two hexanucleotide sequences. One of the main goals of this study was to determine the influence of single nucleotide alterations on the intensity of UV-induced DNA damage. The sequence 5'-GCTC*AC was designed to examine the sequence specificity of 6-4PPs and the highest intensity 6-4PP damage sites were found at 5'-GTTC*CC nucleotides. The sequence 5'-TATT*AA was devised to investigate the sequence specificity of CPDs and the highest intensity CPD damage sites were found at 5'-TTTT*CG nucleotides. It was proposed that the tetranucleotide DNA sequence, 5'-YTC*Y (where Y is T or C), was the consensus sequence for the highest intensity UV-induced 6-4PP adduct sites; while it was 5'-YTT*C for the highest intensity UV-induced CPD damage sites. These consensus tetranucleotides are composed entirely of consecutive pyrimidines and must have a DNA conformation that is highly productive for the absorption of UV photons.
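Degenerate consensus sequences such as 5'-YTC*Y and 5'-YTT*C can be located in a DNA string by expanding IUPAC codes into character classes. The sketch below is illustrative only: the asterisk marking the damaged base is dropped from the pattern, the helper names and the test sequence are hypothetical, and only a subset of IUPAC codes is included.

```python
import re

# Subset of IUPAC nucleotide codes expanded to regex character classes
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "Y": "[CT]", "R": "[AG]", "N": "[ACGT]",
}


def consensus_to_regex(consensus):
    """Compile a degenerate consensus into a regex that also finds overlapping hits."""
    body = "".join(IUPAC[symbol] for symbol in consensus)
    # A zero-width lookahead lets overlapping sites be reported
    return re.compile("(?=(" + body + "))")


def find_sites(sequence, consensus):
    """Return (0-based start, matched tetranucleotide) for every consensus hit."""
    pattern = consensus_to_regex(consensus)
    return [(m.start(), m.group(1)) for m in pattern.finditer(sequence.upper())]


seq = "GGTTCCAGCTTTCGA"  # hypothetical test sequence
print(find_sites(seq, "YTCY"))  # candidate 6-4PP consensus sites
print(find_sites(seq, "YTTC"))  # candidate CPD consensus sites
```

A match only indicates that the sequence fits the consensus; it says nothing about the actual damage intensity at that position.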
BrEPS 2.0: Optimization of sequence pattern prediction for enzyme annotation.
Dudek, Christian-Alexander; Dannheim, Henning; Schomburg, Dietmar
2017-01-01
The prediction of gene functions is crucial for a large number of different life science areas. Faster high throughput sequencing techniques generate more and larger datasets. The manual annotation by classical wet-lab experiments is not suitable for these large amounts of data. We showed earlier that the automatic sequence pattern-based BrEPS protocol, based on manually curated sequences, can be used for the prediction of enzymatic functions of genes. The growing sequence databases provide the opportunity for more reliable patterns, but are also a challenge for the implementation of automatic protocols. We reimplemented and optimized the BrEPS pattern generation to be applicable for larger datasets in an acceptable timescale. The primary improvement of the new BrEPS protocol is the enhanced data selection step. Manually curated annotations from Swiss-Prot are used as a reliable source for function prediction of enzymes observed on protein level. The pool of sequences is extended by highly similar sequences from TrEMBL and Swiss-Prot. This allows us to restrict the selection of Swiss-Prot entries, without losing the diversity of sequences needed to generate significant patterns. Additionally, a supporting pattern type was introduced by extending the patterns at semi-conserved positions with highly similar amino acids. Extended patterns have an increased complexity, increasing the chance to match more sequences, without losing the essential structural information of the pattern. To enhance the usability of the database, we introduced enzyme function prediction based on consensus EC numbers and IUBMB enzyme nomenclature. BrEPS is part of the Braunschweig Enzyme Database (BRENDA) and is available on a completely redesigned website and as a download. The database can be downloaded and used with the BrEPScmd command line tool for large scale sequence analysis. The BrEPS website and downloads for the database creation tool, command line tool and database are freely accessible at http://breps.tu-bs.de. PMID:28750104
Event-triggered consensus tracking of multi-agent systems with Lur'e nonlinear dynamics
NASA Astrophysics Data System (ADS)
Huang, Na; Duan, Zhisheng; Wen, Guanghui; Zhao, Yu
2016-05-01
In this paper, the distributed consensus tracking problem for networked Lur'e systems is investigated based on event-triggered information interactions. An event-triggered control algorithm is designed with the advantages of reducing controller update frequency and sensor energy consumption. Using the S-procedure and the Lyapunov functional method, sufficient conditions are derived to guarantee that consensus tracking is achieved under a directed communication topology. Meanwhile, it is shown that Zeno behaviour of the triggering time sequences is excluded for the proposed event-triggered rule. Finally, numerical simulations on coupled Chua's circuits are performed to illustrate the effectiveness of the theoretical algorithms.
Jay, Z. J.; Rusch, D. B.; Tringe, S. G.; Bailey, C.; Jennings, R. M.
2014-01-01
High-temperature (>70°C) ecosystems in Yellowstone National Park (YNP) provide an unparalleled opportunity to study chemotrophic archaea and their role in microbial community structure and function under highly constrained geochemical conditions. Acidilobus spp. (order Desulfurococcales) comprise one of the dominant phylotypes in hypoxic geothermal sulfur sediment and Fe(III)-oxide environments along with members of the Thermoproteales and Sulfolobales. Consequently, the primary goals of the current study were to analyze and compare replicate de novo sequence assemblies of Acidilobus-like populations from four different mildly acidic (pH 3.3 to 6.1) high-temperature (72°C to 82°C) environments and to identify metabolic pathways and/or protein-encoding genes that provide a detailed foundation of the potential functional role of these populations in situ. De novo assemblies of the highly similar Acidilobus-like populations (>99% 16S rRNA gene identity) represent near-complete consensus genomes based on an inventory of single-copy genes, deduced metabolic potential, and assembly statistics generated across sites. Functional analysis of coding sequences and confirmation of gene transcription by Acidilobus-like populations provide evidence that they are primarily chemoorganoheterotrophs, generating acetyl coenzyme A (acetyl-CoA) via the degradation of carbohydrates, lipids, and proteins, and auxotrophic with respect to several external vitamins, cofactors, and metabolites. No obvious pathways or protein-encoding genes responsible for the dissimilatory reduction of sulfur were identified. The presence of a formate dehydrogenase (Fdh) and other protein-encoding genes involved in mixed-acid fermentation supports the hypothesis that Acidilobus spp. function as degraders of complex organic constituents in high-temperature, mildly acidic, hypoxic geothermal systems. PMID:24162572
ERIC Educational Resources Information Center
Fyfe, Emily R.; DeCaro, Marci S.; Rittle-Johnson, Bethany
2014-01-01
Background: The sequencing of learning materials greatly influences the knowledge that learners construct. Recently, learning theorists have focused on the sequencing of instruction in relation to solving related problems. The general consensus suggests explicit instruction should be provided; however, when to provide instruction remains unclear.…
Sampled-Data Consensus of Linear Multi-agent Systems With Packet Losses.
Zhang, Wenbing; Tang, Yang; Huang, Tingwen; Kurths, Jurgen
In this paper, the consensus problem is studied for a class of multi-agent systems with sampled data and packet losses, where random and deterministic packet losses are considered, respectively. For random packet losses, a Bernoulli-distributed white sequence is used to describe packet dropouts among agents in a stochastic way. For deterministic packet losses, a switched system with stable and unstable subsystems is employed to model packet dropouts in a deterministic way. The purpose of this paper is to derive consensus criteria, such that linear multi-agent systems with sampled-data and packet losses can reach consensus. By means of the Lyapunov function approach and the decomposition method, the design problem of a distributed controller is solved in terms of convex optimization. The interplay among the allowable bound of the sampling interval, the probability of random packet losses, and the rate of deterministic packet losses is explicitly derived to characterize consensus conditions. The obtained criteria are closely related to the maximum eigenvalue of the Laplacian matrix versus the second minimum eigenvalue of the Laplacian matrix, which reveals the intrinsic effect of communication topologies on consensus performance. Finally, simulations are given to show the effectiveness of the proposed results.
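The random packet-loss setting described above can be illustrated with a small simulation. The following is a minimal sketch only, under strong simplifying assumptions that are not taken from the paper: scalar single-integrator agents instead of general linear dynamics, a five-agent ring topology, a fixed sampling interval `h`, and an independent Bernoulli dropout on every link at every sampling instant. All function names and parameter values are hypothetical.

```python
import random


def ring_neighbors(n):
    """Neighbour lists for an undirected ring of n agents (assumed topology)."""
    return {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}


def simulate(neighbors, x0, h, loss_prob, steps, seed=0):
    """Sampled-data consensus with Bernoulli packet dropouts.

    At each sampling instant agent i applies
        x_i <- x_i + h * sum_j (x_j - x_i)
    where the sum runs only over neighbours j whose packet arrived.
    """
    rng = random.Random(seed)
    x = list(x0)
    for _ in range(steps):
        new_x = []
        for i, xi in enumerate(x):
            # Each neighbour's packet is lost independently with loss_prob.
            correction = sum(
                x[j] - xi for j in neighbors[i] if rng.random() >= loss_prob
            )
            new_x.append(xi + h * correction)
        x = new_x
    return x


def spread(x):
    """Disagreement measure: largest state minus smallest state."""
    return max(x) - min(x)


x0 = [1.0, -2.0, 3.0, 0.5, 4.0]
x_final = simulate(ring_neighbors(5), x0, h=0.2, loss_prob=0.3, steps=200)
print("initial spread:", spread(x0))
print("final spread:  ", spread(x_final))
```

With `h` times the largest node degree kept at or below 1, every update is a convex combination of the agent's own state and the received neighbour states, so the spread cannot grow. How large `h` and `loss_prob` may be before consensus is lost is exactly the kind of trade-off the abstract says the paper characterizes; this sketch does not reproduce those criteria.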
Nucleic acid arrays and methods of synthesis
Sabanayagam, Chandran R.; Sano, Takeshi; Misasi, John; Hatch, Anson; Cantor, Charles
2001-01-01
The present invention generally relates to high density nucleic acid arrays and methods of synthesizing nucleic acid sequences on a solid surface. Specifically, the present invention contemplates the use of stabilized nucleic acid primer sequences immobilized on solid surfaces, and circular nucleic acid sequence templates combined with the use of isothermal rolling circle amplification to thereby increase nucleic acid sequence concentrations in a sample or on an array of nucleic acid sequences.
Examination of the catalytic fitness of the hammerhead ribozyme by in vitro selection.
Tang, J; Breaker, R R
1997-01-01
We have designed a self-cleaving ribozyme construct that is rendered inactive during preparative in vitro transcription by allosteric interactions with ATP. This allosteric ribozyme was constructed by joining a hammerhead domain to an ATP-binding RNA aptamer, thereby creating a ribozyme whose catalytic rate can be controlled by ATP. Upon purification by PAGE, the engineered ribozyme undergoes rapid self-cleavage when incubated in the absence of ATP. This strategy of "allosteric delay" was used to prepare intact hammerhead ribozymes that would otherwise self-destruct during transcription. Using a similar strategy, we have prepared a combinatorial pool of RNA in order to assess the catalytic fitness of ribozymes that carry the natural consensus sequence for the hammerhead. Using in vitro selection, this comprehensive RNA pool was screened for sequence variants of the hammerhead ribozyme that also display catalytic activity. We find that sequences that comprise the core of naturally occurring hammerhead dominate the population of selected RNAs, indicating that the natural consensus sequence of this ribozyme is optimal for catalytic function. PMID:9257650
Bertaccini, Edward J.; Yoluk, Ozge; Lindahl, Erik R.; Trudell, James R.
2013-01-01
Background Anesthetics mediate portions of their activity via modulation of the γ-aminobutyric acid receptor (GABAaR). While its molecular structure remains unknown, significant progress has been made towards understanding its interactions with anesthetics via molecular modeling. Methods The structure of the torpedo acetylcholine receptor (nAChRα), the structures of the α4 and β2 subunits of the human nAChR, the structures of the eukaryotic glutamate-gated chloride channel (GluCl), and the prokaryotic pH sensing channels, from Gloeobacter violaceus and Erwinia chrysanthemi, were aligned with the SAlign and 3DMA algorithms. A multiple sequence alignment from these structures and those of the GABAaR was performed with ClustalW. The Modeler and Rosetta algorithms independently created three-dimensional constructs of the GABAaR from the GluCl template. The CDocker algorithm docked a congeneric series of propofol derivatives into the binding pocket and scored calculated binding affinities for correlation with known GABAaR potentiation EC50’s. Results Multiple structure alignments of templates revealed a clear consensus of residue locations relevant to anesthetic effects except for torpedo nAChR. Within the GABAaR models generated from GluCl, the residues notable for modulating anesthetic action within transmembrane segments 1, 2, and 3 converged on the intersubunit interface between alpha and beta subunits. Docking scores of a propofol derivative series into this binding site showed strong linear correlation with GABAaR potentiation EC50. Conclusion Consensus structural alignment based on homologous templates revealed an intersubunit anesthetic binding cavity within the transmembrane domain of the GABAaR, which showed correlation of ligand docking scores with experimentally measured GABAaR potentiation. PMID:23770602
Bertaccini, Edward J; Yoluk, Ozge; Lindahl, Erik R; Trudell, James R
2013-11-01
Anesthetics mediate portions of their activity via modulation of the γ-aminobutyric acid receptor (GABAaR). Although its molecular structure remains unknown, significant progress has been made toward understanding its interactions with anesthetics via molecular modeling. The structure of the torpedo acetylcholine receptor (nAChRα), the structures of the α4 and β2 subunits of the human nAChR, the structures of the eukaryotic glutamate-gated chloride channel (GluCl), and the prokaryotic pH-sensing channels, from Gloeobacter violaceus and Erwinia chrysanthemi, were aligned with the SAlign and 3DMA algorithms. A multiple sequence alignment from these structures and those of the GABAaR was performed with ClustalW. The Modeler and Rosetta algorithms independently created three-dimensional constructs of the GABAaR from the GluCl template. The CDocker algorithm docked a congeneric series of propofol derivatives into the binding pocket and scored calculated binding affinities for correlation with known GABAaR potentiation EC50s. Multiple structure alignments of templates revealed a clear consensus of residue locations relevant to anesthetic effects except for torpedo nAChR. Within the GABAaR models generated from GluCl, the residues notable for modulating anesthetic action within transmembrane segments 1, 2, and 3 converged on the intersubunit interface between α and β subunits. Docking scores of a propofol derivative series into this binding site showed strong linear correlation with GABAaR potentiation EC50. Consensus structural alignment based on homologous templates revealed an intersubunit anesthetic binding cavity within the transmembrane domain of the GABAaR, which showed a correlation of ligand docking scores with experimentally measured GABAaR potentiation.
Edwards, W. Barry
2013-01-01
The aim of this study was to identify potential ligands of PSMA suitable for further development as novel PSMA-targeted peptides using phage display technology. The human PSMA protein was immobilized as a target followed by incubation with a 15-mer phage display random peptide library. After one round of prescreening and two rounds of screening, high-stringency screening at the third round of panning was performed to identify the highest affinity binders. Phages that had a specific binding activity to PSMA in human prostate cancer cells were isolated and the DNA corresponding to the 15-mers was sequenced to provide three consensus sequences: GDHSPFT, SHFSVGS and EVPRLSLLAVFL as well as other sequences that did not display consensus. Two of the peptide sequences deduced from DNA sequencing of binding phages, SHSFSVGSGDHSPFT and GRFLTGGTGRLLRIS, were labeled with 5-carboxyfluorescein and shown to bind and co-internalize with PSMA on human prostate cancer cells by fluorescence microscopy. The high stringency requirements yielded peptides with affinities KD∼1 µM or greater which are suitable starting points for affinity maturation. While these values were less than anticipated, the high stringency did yield peptide sequences that apparently bound to different surfaces on PSMA. These peptide sequences could be the basis for further development of peptides for prostate cancer tumor imaging and therapy. PMID:23935860
Li, Maoyin; Bahn, Sung Chul; Guo, Liang; Musgrave, William; Berg, Howard; Welti, Ruth; Wang, Xuemin
2011-01-01
The release of fatty acids from membrane lipids has been implicated in various plant processes, and the patatin-related phospholipases (pPLAs) constitute a major enzyme family that catalyzes fatty acid release. The Arabidopsis thaliana pPLA family has 10 members that are classified into three groups. Group 3 pPLAIII has four members but lacks the canonical lipase/esterase consensus catalytic sequences, and their enzymatic activity and cellular functions have not been delineated. Here, we show that pPLAIIIβ hydrolyzes phospholipids and galactolipids and additionally has acyl-CoA thioesterase activity. Alterations of pPLAIIIβ result in changes in lipid levels and composition. pPLAIIIβ-KO plants have longer leaves, petioles, hypocotyls, primary roots, and root hairs than wild-type plants, whereas pPLAIIIβ-OE plants exhibit the opposite phenotype. In addition, pPLAIIIβ-OE plants have significantly lower cellulose content and mechanical strength than wild-type plants. Root growth of pPLAIIIβ-KO plants is less sensitive to treatment with free fatty acids, the enzymatic products of pPLAIIIβ, than wild-type plants; root growth of pPLAIIIβ-OE plants is more sensitive. These data suggest that alteration of pPLAIIIβ expression and the resulting lipid changes alter cellulose content and cell elongation in Arabidopsis. PMID:21447788
Sawada, Akihisa; Croom-Carter, Deborah; Kondo, Osamu; Yasui, Masahiro; Koyama-Sato, Maho; Inoue, Masami; Kawa, Keisei; Rickinson, Alan B; Tierney, Rosemary J
2011-05-01
Polymorphisms in Epstein-Barr virus (EBV) latent genes can identify virus strains from different human populations and individual strains within a population. An Asian EBV signature has been defined almost exclusively from Chinese viruses, with little information from other Asian countries. Here we sequenced polymorphic regions of the EBNA1, 2, 3A, 3B, 3C and LMP1 genes of 31 Japanese strains from control donors and EBV-associated T/NK-cell lymphoproliferative disease (T/NK-LPD) patients. Though identical to Chinese strains in their dominant EBNA1 and LMP1 alleles, Japanese viruses were subtly different at other loci. Thus, while Chinese viruses mainly fall into two families with strongly linked 'Wu' or 'Li' alleles at EBNA2 and EBNA3A/B/C, Japanese viruses all have the consensus Wu EBNA2 allele but fall into two families at EBNA3A/B/C. One family has variant Li-like sequences at EBNA3A and 3B and the consensus Li sequence at EBNA3C; the other family has variant Wu-like sequences at EBNA3A, variants of a low frequency Chinese allele 'Sp' at EBNA3B and a consensus Sp sequence at EBNA3C. Thus, EBNA3A/B/C allelotypes clearly distinguish Japanese from Chinese strains. Interestingly, most Japanese viruses also lack those immune-escape mutations in the HLA-A11 epitope-encoding region of EBNA3B that are so characteristic of viruses from the highly A11-positive Chinese population. Control donor-derived and T/NK-LPD-derived strains were similarly distributed across allelotypes and, by using allelic polymorphisms to track virus strains in patients pre- and post-haematopoietic stem-cell transplant, we show that a single strain can induce both T/NK-LPD and B-cell-lymphoproliferative disease in the same patient.
Myamoto, D T; Pidde-Queiroz, G; Pedroso, A; Gonçalves-de-Andrade, R M; van den Berg, C W; Tambourgi, D V
2016-09-01
A transcriptome analysis of the venom glands of the spider Loxosceles laeta, performed by our group in a previous study (Fernandes-Pedrosa et al., 2008), revealed a transcript with a sequence similar to the human complement component C3. Here we present the analysis of this transcript. cDNA fragments encoding the C3 homologue (Lox-C3) were amplified from total RNA isolated from the venom glands of L. laeta by RACE-PCR. Lox-C3 is a 5178-bp cDNA sequence encoding a 190 kDa protein, with a domain configuration similar to human C3. Multiple alignments of C3-like proteins revealed two processing sites, suggesting that Lox-C3 is composed of three chains. Furthermore, the amino acid consensus sequence for the thioester was found, in addition to putative sequences responsible for FB binding. The phylogenetic analysis showed that Lox-C3 belongs to the same group as two C3 isoforms from the spider Hasarius adansoni (Family Salticidae), showing 53% homology with these. This is the first characterization of a Loxosceles cDNA sequence encoding a human C3 homologue, and this finding, together with our previous finding of the expression of a FB-like molecule, suggests that this spider species also has a complement system. This work will help to improve our understanding of the innate immune system in these spiders and the ancestral structure of C3. Copyright © 2016 Elsevier GmbH. All rights reserved.
Nagao, K; Taguchi, Y; Arioka, M; Kadokura, H; Takatsuki, A; Yoda, K; Yamasaki, M
1995-01-01
We have isolated a Schizosaccharomyces pombe gene, bfr1+, which on a multicopy plasmid vector, pDB248', confers resistance to brefeldin A (BFA), an inhibitor of intracellular protein transport. This gene encodes a novel protein of 1,531 amino acids with an intramolecular duplicated structure, each half containing a single ATP-binding consensus sequence and a set of six transmembrane sequences. This structural characteristic of bfr1+ protein resembles that of mammalian P-glycoprotein, which, by exporting a variety of anticancer drugs, has been shown to be responsible for multidrug resistance in tumor cells. Consistent with this is that S. pombe cells harboring bfr1+ on pDB248' are resistant to actinomycin D, cerulenin, and cytochalasin B, as well as to BFA. The relative positions of the ATP-binding sequences and the clusters of transmembrane sequences within the bfr1+ protein are, however, transposed in comparison with those in P-glycoprotein; the bfr1+ protein has N-terminal ATP-binding sequence followed by transmembrane segments in each half of the molecule. The bfr1+ protein exhibited significant homology in primary and secondary structures with two recently identified multidrug resistance gene products of Saccharomyces cerevisiae, Snq2 and Sts1/Pdr5/Ydr1. The bfr1+ gene is not essential for cell growth or mating, but a delta bfr1 mutant exhibited hypersensitivity to BFA. We propose that the bfr1+ protein is another member of the ATP-binding cassette superfamily and serves as an efflux pump of various antibiotics. PMID:7883711
Sequence, molecular properties, and chromosomal mapping of mouse lumican
NASA Technical Reports Server (NTRS)
Funderburgh, J. L.; Funderburgh, M. L.; Hevelone, N. D.; Stech, M. E.; Justice, M. J.; Liu, C. Y.; Kao, W. W.; Conrad, G. W.; Spooner, B. S. (Principal Investigator)
1995-01-01
PURPOSE. Lumican is a major proteoglycan of vertebrate cornea. This study characterizes mouse lumican, its molecular form, cDNA sequence, and chromosomal localization. METHODS. Lumican sequence was determined from cDNA clones selected from a mouse corneal cDNA expression library using a bovine lumican cDNA probe. Tissue expression and size of lumican mRNA were determined using Northern hybridization. Glycosidase digestion followed by Western blot analysis provided characterization of molecular properties of purified mouse corneal lumican. Chromosomal mapping of the lumican gene (Lcn) used Southern hybridization of a panel of genomic DNAs from an interspecific murine backcross. RESULTS. Mouse lumican is a 338-amino acid protein with high-sequence identity to bovine and chicken lumican proteins. The N-terminus of the lumican protein contains consensus sequences for tyrosine sulfation. A 1.9-kb lumican mRNA is present in cornea and several other tissues. Antibody against bovine lumican reacted with recombinant mouse lumican expressed in Escherichia coli and also detected high molecular weight proteoglycans in extracts of mouse cornea. Keratanase digestion of corneal proteoglycans released lumican protein, demonstrating the presence of sulfated keratan sulfate chains on mouse corneal lumican in vivo. The lumican gene (Lcn) was mapped to the distal region of mouse chromosome 10. The Lcn map site is in the region of a previously identified developmental mutant, eye blebs, affecting corneal morphology. CONCLUSIONS. This study demonstrates sulfated keratan sulfate proteoglycan in mouse cornea and describes the tools (antibodies and cDNA) necessary to investigate the functional role of this important corneal molecule using naturally occurring and induced mutants of the murine lumican gene.
A variant Tc4 transposable element in the nematode C. elegans could encode a novel protein.
Li, W; Shaw, J E
1993-01-01
A variant C. elegans Tc4 transposable element, Tc4-rh1030, has been sequenced and is 3483 bp long. The Tc4 element that had been analyzed previously is 1605 bp long, consists of two 774-bp nearly perfect inverted terminal repeats connected by a 57-bp loop, and lacks significant open reading frames. In Tc4-rh1030, by comparison, a 2343-bp novel sequence is present in place of a 477-bp segment in one of the inverted repeats. The novel sequence of Tc4-rh1030 is present about five times per haploid genome and is invariably associated with Tc4 elements; we have used the designation Tc4v to denote this variant subfamily of Tc4 elements. Sequence analysis of three cDNA clones suggests that a Tc4v element contains at least five exons that could encode a novel basic protein of 537 amino acid residues. On northern blots, a 1.6-kb Tc4v-specific transcript was detected in the mutator strain TR679 but not in the wild-type strain N2; Tc4 elements are known to transpose in TR679 but appear to be quiescent in N2. We have analyzed transcripts produced by an unc-33 gene that has the Tc4-rh1030 insertional mutation in its transcribed region; all or almost all of the Tc4v sequence is frequently spliced out of the mutant unc-33 transcripts, sometimes by means of non-consensus splice acceptor sites. PMID:8382791
Misra, Namrata; Panda, Prasanna Kumar; Parida, Bikram Kumar
2014-12-01
Lysophosphatidyl acyltransferase (LPAT) is one of the major triacylglycerol synthesis enzymes, controlling the metabolic flow of lysophosphatidic acid to phosphatidic acid. Experimental studies in Arabidopsis have shown that LPAT activity is exhibited primarily by three distinct isoforms, namely the plastid-located LPAT1, the endoplasmic reticulum-located LPAT2, and the soluble isoform of LPAT (solLPAT). In this study, 24 putative genes representing all LPAT isoforms were identified from the analysis of 11 complete genomes including green algae, red algae, diatoms and higher plants. We observed LPAT1 and solLPAT genes to be ubiquitously present in nearly all genomes examined, whereas LPAT2 genes to have evolved more recently in the plant lineage. Phylogenetic analysis indicated that LPAT1, LPAT2 and solLPAT have convergently evolved through separate evolutionary paths and belong to three different gene families, which was further evidenced by their wide divergence at gene structure and sequence level. The genome distribution supports the hypothesis that each gene encoding a LPAT is not duplicated. Mapping of exon-intron structure of LPAT genes to the domain structure of proteins across different algal and plant species indicates that exon shuffling plays no role in the evolution of LPAT genes. Besides the previously defined motifs, several conserved consensus sequences were discovered which could be useful to distinguish different LPAT isoforms. Taken together, this study will enable the generation of experimental approximations to better understand the functional role of algal LPAT in lipid accumulation.
Iwasaki, T; Yamaguchi-Shinozaki, K; Shinozaki, K
1995-05-20
In Arabidopsis thaliana, the induction of a dehydration-responsive gene, rd22, is mediated by abscisic acid (ABA) but the gene does not include any sequence corresponding to the consensus ABA-responsive element (ABRE), RYACGTGGYR, in its promoter region. The cis-regulatory region of the rd22 promoter was identified by monitoring the expression of beta-glucuronidase (GUS) activity in leaves of transgenic tobacco plants transformed with chimeric gene fusions constructed between 5'-deleted promoters of rd22 and the coding region of the GUS reporter gene. A 67-bp nucleotide fragment corresponding to positions -207 to -141 of the rd22 promoter conferred responsiveness to dehydration and ABA on a non-responsive promoter. The 67-bp fragment contains the sequences of the recognition sites for some transcription factors, such as MYC, MYB, and GT-1. The fact that accumulation of rd22 mRNA requires protein synthesis raises the possibility that the expression of rd22 might be regulated by one of these trans-acting protein factors whose de novo synthesis is induced by dehydration or ABA. Although the structure of the RD22 protein is very similar to that of a non-storage seed protein, USP, of Vicia faba, the expression of the GUS gene driven by the rd22 promoter in non-stressed transgenic Arabidopsis plants was found mainly in flowers and bolted stems rather than in seeds.
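The observation that the rd22 promoter lacks the ABRE consensus RYACGTGGYR is the kind of statement that can be checked with a degenerate-motif scan. The sketch below assumes a simple exact match of the IUPAC consensus (R = A or G, Y = C or T) on both strands. The helper names and both test sequences are invented for illustration and are not the rd22 promoter.

```python
import re

# IUPAC codes needed for the ABRE consensus: R = purine, Y = pyrimidine.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "R": "[AG]", "Y": "[CT]"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def has_motif(seq, consensus):
    """True if the consensus occurs on either strand of seq."""
    pattern = re.compile("".join(IUPAC[c] for c in consensus))
    seq = seq.upper()
    return bool(pattern.search(seq) or pattern.search(reverse_complement(seq)))


ABRE = "RYACGTGGYR"

# Made-up sequences: the first embeds GCACGTGGCA, the second has one
# substitution inside the ACGT core.
with_abre = "TTTGCACGTGGCATTT"
without_abre = "TTTGCAAGTGGCATTT"

print(has_motif(with_abre, ABRE))
print(has_motif(without_abre, ABRE))
print(has_motif(reverse_complement(with_abre), ABRE))
```

Scanning both strands matters because a promoter element can sit in either orientation relative to the gene; a forward-only search would miss the third case.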
Structure of the coding region and mRNA variants of the apyrase gene from pea (Pisum sativum)
NASA Technical Reports Server (NTRS)
Shibata, K.; Abe, S.; Davies, E.
2001-01-01
Partial amino acid sequences of a 49 kDa apyrase (ATP diphosphohydrolase, EC 3.6.1.5) from the cytoskeletal fraction of etiolated pea stems were used to derive oligonucleotide DNA primers to generate a cDNA fragment of pea apyrase mRNA by RT-PCR and these primers were used to screen a pea stem cDNA library. Two almost identical cDNAs differing in just 6 nucleotides within the coding regions were found, and these cDNA sequences were used to clone genomic fragments by PCR. Two nearly identical gene fragments containing 8 exons and 7 introns were obtained. One of them (H-type) encoded the mRNA sequence described by Hsieh et al. (1996) (DDBJ/EMBL/GenBank Z32743), while the other (S-type) differed by the same 6 nucleotides as the mRNAs, suggesting that these genes may be alleles. The six nucleotide differences between these two alleles were found solely in the first exon, and these mutation sites had two types of consensus sequences. These mRNAs were found with varying lengths of 3' untranslated regions (3'-UTR). There are some similarities between the 3'-UTR of these mRNAs and those of actin and actin binding proteins in plants. The putative roles of the 3'-UTR and alternative polyadenylation sites are discussed in relation to their possible role in targeting the mRNAs to different subcellular compartments.
Cloning and characterization of an autonomous replication sequence from Coxiella burnetii.
Suhan, M; Chen, S Y; Thompson, H A; Hoover, T A; Hill, A; Williams, J C
1994-01-01
A Coxiella burnetii chromosomal fragment capable of functioning as an origin for the replication of a kanamycin resistance (Kanr) plasmid was isolated by use of origin search methods utilizing an Escherichia coli host. The 5.8-kb fragment was subcloned into phagemid vectors and was deleted progressively by an exonuclease III-S1 technique. Plasmids containing progressively shorter DNA fragments were then tested for their capability to support replication by transformation of an E. coli polA strain. A minimal autonomous replication sequence (ARS) was delimited to 403 bp. Sequencing of the entire 5.8-kb region revealed that the minimal ARS contained two consensus DnaA boxes, three A + T-rich 21-mers, a transcriptional promoter leading rightwards, and potential integration host factor and factor of inversion stimulation binding sites. Database comparisons of deduced amino acid sequences revealed that open reading frames located around the ARS were homologous to genes often, but not always, found near bacterial chromosomal origins; these included identities with rpmH and rnpA in E. coli and identities with the 9K protein and 60K membrane protein in E. coli and Pseudomonas species. These and direct hybridization data suggested that the ARS was chromosomal and not associated with the resident plasmid QpH1. Two-dimensional agarose gel electrophoresis did not reveal the presence of initiating intermediates, indicating that the ARS did not initiate chromosome replication during laboratory growth of C. burnetii. PMID:8071197
Pastar, Irena; Tonic, Ivana; Golic, Natasa; Kojic, Milan; van Kranenburg, Richard; Kleerebezem, Michiel; Topisirovic, Ljubisa; Jovanovic, Goran
2003-01-01
A novel proteinase, PrtR, produced by the human vaginal isolate Lactobacillus rhamnosus strain BGT10 was identified and genetically characterized. The prtR gene and flanking regions were cloned and sequenced. The deduced amino acid sequence of PrtR shares characteristics that are common for other cell envelope proteinases (CEPs) characterized to date, but in contrast to the other cell surface subtilisin-like serine proteinases, it has a smaller and somewhat different B domain and lacks the helix domain, and the anchor domain has a rare sorting signal sequence. Furthermore, PrtR lacks the insert domain, which otherwise is situated inside the catalytic serine protease domain of all CEPs, and has a different cell wall spacer (W) domain similar to that of the cell surface antigen I and II polypeptides expressed by oral and vaginal streptococci. Moreover, the PrtR W domain exhibits significant sequence homology to the consensus sequence that has been shown to be the hallmark of human intestinal mucin protein. According to its αS1- and β-casein cleavage efficacy, PrtR is an efficient proteinase at pH 6.5 and is distributed throughout all L. rhamnosus strains tested. Proteinase extracts of the BGT10 strain obtained with Ca2+-free buffer at pH 6.5 were proteolytically active. The prtR promoter-like sequence was determined, and the minimal promoter region was defined by use of prtR-gusA operon fusions. The prtR expression is Casitone dependent, emphasizing that nitrogen depletion elevates its transcription. This is in correlation with the catalytic activity of the PrtR proteinase. PMID:14532028
King, Caitriona; Barton, David E
2006-01-01
Background Hereditary haemochromatosis (HH) is a recessively-inherited disorder of iron over-absorption prevalent in Caucasian populations. Affected individuals for Type 1 HH are usually either homozygous for a cysteine to tyrosine amino acid substitution at position 282 (C282Y) of the HFE gene, or compound heterozygotes for C282Y and for a histidine to aspartic acid change at position 63 (H63D). Molecular genetic testing for these two mutations has become widespread in recent years. With diverse testing methods and reporting practices in use, there was a clear need for agreed guidelines for haemochromatosis genetic testing. The UK Clinical Molecular Genetics Society has elaborated a consensus process for the development of disease-specific best practice guidelines for genetic testing. Methods A survey of current practice in the molecular diagnosis of haemochromatosis was conducted. Based on the results of this survey, draft guidelines were prepared using the template developed by UK Clinical Molecular Genetics Society. A workshop was held to develop the draft into a consensus document. The consensus document was then posted on the Clinical Molecular Genetics Society website for broader consultation and amendment. Results Consensus or near-consensus was achieved on all points in the draft guidelines. The consensus and consultation processes worked well, and outstanding issues were documented in an appendix to the guidelines. Conclusion An agreed set of best practice guidelines were developed for diagnostic, predictive and carrier testing for hereditary haemochromatosis and for reporting the results of such testing. PMID:17134494
Westbrook, Jared W.; Chhatre, Vikram E.; Wu, Le-Shin; Chamala, Srikar; Neves, Leandro Gomide; Muñoz, Patricio; Martínez-García, Pedro J.; Neale, David B.; Kirst, Matias; Mockaitis, Keithanne; Nelson, C. Dana; Peter, Gary F.; Echt, Craig S.
2015-01-01
A consensus genetic map for Pinus taeda (loblolly pine) and Pinus elliottii (slash pine) was constructed by merging three previously published P. taeda maps with a map from a pseudo-backcross between P. elliottii and P. taeda. The consensus map positioned 3856 markers via genotyping of 1251 individuals from four pedigrees. It is the densest linkage map for a conifer to date. Average marker spacing was 0.6 cM and total map length was 2305 cM. Functional predictions of mapped genes were improved by aligning expressed sequence tags used for marker discovery to full-length P. taeda transcripts. Alignments to the P. taeda genome mapped 3305 scaffold sequences onto 12 linkage groups. The consensus genetic map was used to compare the genome-wide linkage disequilibrium in a population of distantly related P. taeda individuals (ADEPT2) used for association genetic studies and a multiple-family pedigree used for genomic selection (CCLONES). The prevalence and extent of LD was greater in CCLONES as compared to ADEPT2; however, extended LD with LGs or between LGs was rare in both populations. The average squared correlations, r2, between SNP alleles less than 1 cM apart were less than 0.05 in both populations and r2 did not decay substantially with genetic distance. The consensus map and analysis of linkage disequilibrium establish a foundation for comparative association mapping and genomic selection in P. taeda and P. elliottii. PMID:26068575
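The reported average marker spacing can be sanity-checked from the other figures in the abstract. The sketch below is only an arithmetic check; whether the authors divided by the number of markers or by the number of intervals between markers is an assumption here, so both are shown.

```python
# Figures quoted in the abstract above.
total_map_length_cm = 2305
markers = 3856
linkage_groups = 12

# Assumption 1: spacing taken as map length divided by marker count.
per_marker = total_map_length_cm / markers

# Assumption 2: spacing taken as map length divided by the number of
# intervals, where each linkage group with n markers has n - 1 intervals.
per_interval = total_map_length_cm / (markers - linkage_groups)

print(f"{per_marker:.3f} cM per marker, {per_interval:.3f} cM per interval")
```

Both conventions round to the 0.6 cM stated in the abstract.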
Radford, Devon R; Leon-Velarde, Carlos G; Chen, Shu; Hamidi Oskouei, Amir M; Balamurugan, Sampathkumar
2018-03-29
The genomes of two strains of Salmonella enterica subsp. enterica serovar Cubana and serovar Muenchen, isolated from dry hazelnuts and chia seeds, respectively, were sequenced using the Illumina MiSeq platform, assembled de novo using the overlap-layout-consensus method, and aligned to their respective most identical sequence genome scaffolds using MUMMER and BLAST searches.
2014-01-01
Background Next-generation DNA sequencing (NGS) technologies have made huge impacts in many fields of biological research, but especially in evolutionary biology. One area where NGS has shown potential is for high-throughput sequencing of complete mtDNA genomes (of humans and other animals). Despite the increasing use of NGS technologies and a better appreciation of their importance in answering biological questions, there remain significant obstacles to the successful implementation of NGS-based projects, especially for new users. Results Here we present an ‘A to Z’ protocol for obtaining complete human mitochondrial (mtDNA) genomes – from DNA extraction to consensus sequence. Although designed for use on humans, this protocol could also be used to sequence small, organellar genomes from other species, and also nuclear loci. This protocol includes DNA extraction, PCR amplification, fragmentation of PCR products, barcoding of fragments, sequencing using the 454 GS FLX platform, and a complete bioinformatics pipeline (primer removal, reference-based mapping, output of coverage plots and SNP calling). Conclusions All steps in this protocol are designed to be straightforward to implement, especially for researchers who are undertaking next-generation sequencing for the first time. The molecular steps are scalable to large numbers (hundreds) of individuals and all steps post-DNA extraction can be carried out in 96-well plate format. Also, the protocol has been assembled so that individual ‘modules’ can be swapped out to suit available resources. PMID:24460871
Bioinformatics prediction of siRNAs as potential antiviral agents against dengue viruses
Villegas-Rosales, Paula M; Méndez-Tenorio, Alfonso; Ortega-Soto, Elizabeth; Barrón, Blanca L
2012-01-01
Dengue virus (DENV 1-4) represents the major emerging arthropod-borne viral infection in the world. Currently, there is neither an available vaccine nor a specific treatment. Hence, there is a need for antiviral drugs against these viral infections; we describe the prediction of short interfering RNAs (siRNAs) as potential therapeutic agents against the four DENV serotypes. Our strategy was to carry out a series of multiple alignments using the ClustalX program to find conserved sequences among the four DENV serotype genomes and to obtain a consensus sequence for siRNA design. A highly conserved sequence among the four DENV serotypes, located in the sequence encoding the NS4B and NS5 proteins, was found. A total of 2,893 complete DENV genomes were downloaded from the NCBI, and after a clean-up procedure to identify and remove identical sequences, 220 complete DENV genomes were left. They were edited to select the NS4B and NS5 sequences, which were aligned to obtain a consensus sequence. Three different servers were used for siRNA design, and the resulting siRNAs were aligned to identify the most prevalent sequences. Three siRNAs were chosen: one targeted the genome region that codes for the NS4B protein, and the other two targeted the region coding for the NS5 protein. The predicted secondary structure of the DENV genomes was used to show that the siRNAs were able to target the viral genome, forming the double-stranded structures necessary to activate the RNA silencing machinery. PMID:22829722
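The step of deriving a consensus sequence from aligned genomes can be illustrated with a per-column majority vote. This is a minimal sketch, not the procedure used by ClustalX or by the authors: the function name, the 50% threshold, and the toy alignment are all assumptions made for illustration.

```python
from collections import Counter


def majority_consensus(aligned, min_fraction=0.5, ambiguous="N"):
    """Return a per-column majority consensus for equal-length aligned sequences.

    A column contributes its most common symbol only if that symbol makes up
    strictly more than `min_fraction` of the column; otherwise `ambiguous`
    is emitted. Threshold and placeholder symbol are illustrative choices.
    """
    if not aligned:
        raise ValueError("need at least one sequence")
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("aligned sequences must have equal length")

    consensus = []
    for column in zip(*aligned):
        symbol, count = Counter(column).most_common(1)[0]
        consensus.append(symbol if count / len(column) > min_fraction else ambiguous)
    return "".join(consensus)


# Hypothetical four-sequence alignment, not real DENV data.
toy_alignment = ["ATGGCA", "ATGGCT", "ATGACA", "ATCGCA"]
print(majority_consensus(toy_alignment))  # ATGGCA
```

Columns without a clear majority are reported as `N`, which makes poorly conserved positions easy to spot when choosing siRNA target regions.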
Xie, Qingjun; Tzfadia, Oren; Levy, Matan; Weithorn, Efrat; Peled-Zehavi, Hadas; Van Parys, Thomas; Van de Peer, Yves; Galili, Gad
2016-01-01
Most of the proteins that are specifically turned over by selective autophagy are recognized by the presence of short Atg8 interacting motifs (AIMs) that facilitate their association with the autophagy apparatus. Such AIMs can be identified by bioinformatics methods based on their defined degenerate consensus F/W/Y-X-X-L/I/V sequences in which X represents any amino acid. Achieving reliability and/or fidelity of the prediction of such AIMs on a genome-wide scale represents a major challenge. Here, we present a bioinformatics approach, high fidelity AIM (hfAIM), which uses additional sequence requirements (the presence of acidic amino acids and the absence of positively charged amino acids in certain positions) to reliably identify AIMs in proteins. We demonstrate that the use of the hfAIM method allows for in silico high fidelity prediction of AIMs in AIM-containing proteins (ACPs) on a genome-wide scale in various organisms. Furthermore, by using hfAIM to identify putative AIMs in the Arabidopsis proteome, we illustrate a potential contribution of selective autophagy to various biological processes. More specifically, we identified 9 peroxisomal PEX proteins that contain hfAIM motifs, among which AtPEX1, AtPEX6 and AtPEX10 possess evolutionary-conserved AIMs. Bimolecular fluorescence complementation (BiFC) results verified that AtPEX6 and AtPEX10 indeed interact with Atg8 in planta. In addition, we show that mutations occurring within or nearby hfAIMs in PEX1, PEX6 and PEX10 caused defects in the growth and development of various organisms. Taken together, the above results suggest that the hfAIM tool can be used to effectively perform genome-wide in silico screens of proteins that are potentially regulated by selective autophagy. The hfAIM system is a web tool that can be accessed at http://bioinformatics.psb.ugent.be/hfAIM/. PMID:27071037
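The degenerate core consensus F/W/Y-X-X-L/I/V maps directly onto a regular expression. The sketch below scans only for that core; it does not implement the additional hfAIM charge requirements, which the abstract does not specify in detail. The function name and the test peptide are hypothetical.

```python
import re

# Core AIM consensus: [F/W/Y]-X-X-[L/I/V]. The lookahead lets overlapping
# candidates be reported; [A-Z] stands in for "any amino acid" (X).
AIM_CORE = re.compile(r"(?=([FWY][A-Z]{2}[LIV]))")


def find_core_aims(protein):
    """Return (1-based position, tetrapeptide) for every core-motif match."""
    return [(m.start() + 1, m.group(1)) for m in AIM_CORE.finditer(protein.upper())]


# Hypothetical peptide, not a real PEX protein sequence.
hits = find_core_aims("MSDEWTALKKFYVVIG")
print(hits)  # [(5, 'WTAL'), (11, 'FYVV'), (12, 'YVVI')]
```

Even this short made-up peptide yields three candidates, which illustrates why the bare core motif is too permissive for genome-wide screens and why extra sequence requirements are needed.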
Investigating intra-host and intra-herd sequence diversity of foot-and-mouth disease virus.
King, David J; Freimanis, Graham L; Orton, Richard J; Waters, Ryan A; Haydon, Daniel T; King, Donald P
2016-10-01
Due to the poor fidelity of the enzymes involved in RNA genome replication, foot-and-mouth disease (FMD) virus samples comprise unique polymorphic populations. In this study, deep sequencing was utilised to characterise the diversity of FMD virus (FMDV) populations in 6 infected cattle present on a single farm during the series of outbreaks in the UK in 2007. A novel RT-PCR method was developed to amplify a 7.6 kb nucleotide fragment encompassing the polyprotein coding region of the FMDV genome. Illumina sequencing of each sample identified the fine polymorphic structures at each nucleotide position, from consensus level changes to variants present at a 0.24% frequency. These data were used to investigate population dynamics of FMDV at both herd and host levels, to evaluate the impact of host on the viral swarm structure, and to identify transmission links with viruses recovered from other farms in the same series of outbreaks. In 7 samples, from 6 different animals, a total of 5 consensus level variants were identified, in addition to 104 sub-consensus variants of which 22 were shared between 2 or more animals. Further analysis revealed differences in swarm structures from samples derived from the same animal, suggesting the presence of distinct viral populations evolving independently at different lesion sites within the same infected animal.
37 CFR 1.822 - Symbols and format to be used for nucleotide and/or amino acid sequence data.
Code of Federal Regulations, 2011 CFR
2011-07-01
... for nucleotide and/or amino acid sequence data. 1.822 Section 1.822 Patents, Trademarks, and... Amino Acid Sequences § 1.822 Symbols and format to be used for nucleotide and/or amino acid sequence data. (a) The symbols and format to be used for nucleotide and/or amino acid sequence data shall...
ERIC Educational Resources Information Center
Morey, Candice C.; Miron, Monica D.
2016-01-01
Among models of working memory, there is not yet a consensus about how to describe functions specific to storing verbal or visual-spatial memories. We presented aural-verbal and visual-spatial lists simultaneously and sometimes cued one type of information after presentation, comparing accuracy in conditions with and without informative…
Common Viral Integration Sites Identified in Avian Leukosis Virus-Induced B-Cell Lymphomas
Justice, James F.; Morgan, Robin W.
2015-01-01
Avian leukosis virus (ALV) induces B-cell lymphoma and other neoplasms in chickens by integrating within or near cancer genes and perturbing their expression. Four genes (MYC, MYB, Mir-155, and TERT) have previously been identified as common integration sites in these virus-induced lymphomas and are thought to play a causal role in tumorigenesis. In this study, we employ high-throughput sequencing to identify additional genes driving tumorigenesis in ALV-induced B-cell lymphomas. In addition to the four genes implicated previously, we identify other genes as common integration sites, including TNFRSF1A, MEF2C, CTDSPL, TAB2, RUNX1, MLL5, CXorf57, and BACH2. We also analyze the genome-wide ALV integration landscape in vivo and find increased frequency of ALV integration near transcriptional start sites and within transcripts. Previous work has shown ALV prefers a weak consensus sequence for integration in cultured human cells. We confirm this consensus sequence for ALV integration in vivo in the chicken genome. PMID:26670384
Ulloa, Mauricio; Hulse-Kemp, Amanda M; De Santiago, Luis M; Stelly, David M; Burke, John J
2017-01-01
High-density linkage maps are vital to supporting the correct placement of scaffolds and gene sequences on chromosomes and fundamental to contemporary organismal research and scientific approaches to genetic improvement, especially in paleopolyploids with exceptionally complex genomes, e.g., upland cotton (Gossypium hirsutum L., 2n = 52). Three independently developed intraspecific upland mapping populations were analyzed to generate 3 high-density genetic linkage single-nucleotide polymorphism (SNP) maps and a consensus map using the CottonSNP63K array. The populations consisted of a previously reported F2, a recombinant inbred line (RIL), and a reciprocal RIL population, from "Phytogen 72" and "Stoneville 474" cultivars. The cluster file provided 7417 genotyped SNP markers, resulting in 26 linkage groups corresponding to the 26 chromosomes (c) of the allotetraploid upland cotton (AD)1, which arose from the merging of 2 genomes ("A" Old World and "D" New World). Patterns of chromosome-specific recombination were largely consistent across mapping populations. The high-density genetic consensus map included 7244 SNP markers that spanned 3538 cM and comprised 3824 SNP bins, of which 1783 and 2041 were in the At and Dt subgenomes with 1825 and 1713 cM map lengths, respectively. Subgenome average distances were nearly identical, indicating that subgenomic differences in bin number arose due to the high numbers of SNPs on the Dt subgenome. Examination of expected recombination frequency or crossovers (COs) on the chromosomes within each population of the 2 subgenomes revealed that COs were also not affected by the SNPs or SNP bin number in these subgenomes. Comparative alignment analyses identified historical ancestral At-subgenomic translocations of c02 and c03, as well as of c04 and c05. The consensus map SNP sequences aligned with high congruency to the NBI assembly of Gossypium hirsutum.
However, the genomic comparisons revealed evidence of additional unconfirmed possible duplications, inversions and translocations, and unbalanced SNP sequence homology or SNP sequence/loci genomic dominance, or homeolog loci bias of the upland tetraploid At and Dt subgenomes. The alignments indicated that 364 SNP-associated previously unintegrated scaffolds can be placed in pseudochromosomes of the NBI G. hirsutum assembly. This is the first intraspecific SNP genetic linkage consensus map assembled in G. hirsutum with a core of reproducible Mendelian SNP markers assayed on different populations, and it provides further knowledge of chromosome arrangement of genic and nongenic SNPs. Together, the consensus map and RIL populations provide a synergistically useful platform for localizing and identifying agronomically important loci for improvement of the cotton crop.
Simonini, Sara; Roig-Villanova, Irma; Gregis, Veronica; Colombo, Bilitis; Colombo, Lucia; Kater, Martin M.
2012-01-01
BASIC PENTACYSTEINE (BPC) transcription factors have been identified in a large variety of plant species. In Arabidopsis thaliana there are seven BPC genes, which, except for BPC5, are expressed ubiquitously. BPC genes are functionally redundant in a wide range of developmental processes. Recently, we reported that BPC1 binds to guanine and adenine (GA)–rich consensus sequences in the SEEDSTICK (STK) promoter in vitro and induces conformational changes. Here we show by chromatin immunoprecipitation experiments that in vivo BPCs also bind to the consensus boxes, and when these were mutated, expression from the STK promoter was derepressed, resulting in ectopic expression in the inflorescence. We also reveal that SHORT VEGETATIVE PHASE (SVP) is a direct regulator of STK. SVP is a floral meristem identity gene belonging to the MADS box gene family. The SVP-APETALA1 (AP1) dimer recruits the SEUSS (SEU)-LEUNIG (LUG) transcriptional cosuppressor to repress floral homeotic gene expression in the floral meristem. Interestingly, we found that GA consensus sequences in the STK promoter to which BPCs bind are essential for recruitment of the corepressor complex to this promoter. Our data suggest that we have identified a new regulatory mechanism controlling plant gene expression that is probably generally used, when considering BPCs’ wide expression profile and the frequent presence of consensus binding sites in plant promoters. PMID:23054472
Modeling repetitive, non‐globular proteins
Basu, Koli; Campbell, Robert L.; Guo, Shuaiqi; Sun, Tianjun
2016-01-01
While ab initio modeling of protein structures is not routine, certain types of proteins are more straightforward to model than others. Proteins with short repetitive sequences typically exhibit repetitive structures. These repetitive sequences can be more amenable to modeling if some information is known about the predominant secondary structure or other key features of the protein sequence. We have successfully built models of a number of repetitive structures with novel folds using knowledge of the consensus sequence within the sequence repeat and an understanding of the likely secondary structures that these may adopt. Our methods for achieving this success are reviewed here. PMID:26914323
Badaut, Cyril; Bertin, Gwladys; Rustico, Tatiana; Fievet, Nadine; Massougbodji, Achille; Gaye, Alioune; Deloron, Philippe
2010-01-01
Background Placental malaria is a disease linked to the sequestration of Plasmodium falciparum infected red blood cells (IRBC) in the placenta, leading to reduced materno-fetal exchanges and to local inflammation. One of the virulence factors of P. falciparum involved in cytoadherence to chondroitin sulfate A, its placental receptor, is the adhesive protein VAR2CSA. Its localisation on the surface of IRBC makes it accessible to the immune system. VAR2CSA contains six DBL domains. The DBL6ε domain is the most variable. High variability constitutes a means for the parasite to evade the host immune response. The DBL6ε domain could constitute a very attractive basis for a vaccine candidate, but its reported variability necessitates, for antigenic characterisations, identifying and classifying commonalities across isolates. Methodology/Principal Findings Local alignment analysis of the DBL6ε domain revealed that it is not as variable as previously described. Variability is concentrated in seven regions present on the surface of the DBL6ε domain. The main goal of our work is to classify and group variable sequences in a way that will simplify further research to determine dominant epitopes. Firstly, variable sequences were grouped according to their average percent pairwise identity (APPI). Groups comprising many variable sequences sharing low variability were found. Secondly, ELISA experiments following the IgG recognition of a recombinant DBL6ε domain, and of peptides mimicking its seven variable blocks, made it possible to determine an APPI cut-off and to isolate groups represented by a single consensus sequence. Conclusions/Significance A new sequence approach is used to compare variable regions in sequences that have extensive segmental gene relationship. Using this approach, the VAR2CSA DBL6 domain is composed of 7 variable blocks with limited polymorphism. Each variable block is composed of a limited number of consensus types.
Based on peptide-based ELISA, variable blocks with 85% or greater sequence identity are expected to be recognized equally well by antibody and can be considered the same consensus type. Therefore, the analysis of the antibody response against the classified small number of sequences should be helpful to determine epitopes. PMID:20585655
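Grouping variable blocks by average percent pairwise identity (APPI) with an 85% cut-off can be sketched as follows. The abstract does not define how APPI handles gaps or unequal lengths, so this sketch assumes pre-aligned, equal-length sequences and skips positions where either sequence has a gap; the function names and the toy peptides are hypothetical.

```python
from itertools import combinations


def percent_identity(a, b):
    """Percent identity over aligned positions where neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not compared:
        return 0.0
    matches = sum(x == y for x, y in compared)
    return 100.0 * matches / len(compared)


def average_percent_pairwise_identity(seqs):
    """Mean percent identity over all unordered pairs of sequences."""
    pairs = list(combinations(seqs, 2))
    if not pairs:
        raise ValueError("need at least two sequences")
    return sum(percent_identity(a, b) for a, b in pairs) / len(pairs)


# Hypothetical variable-block peptides, not real DBL6 sequences.
block = ["KDGTYEL", "KDGTYEL", "KDGSYEL"]
appi = average_percent_pairwise_identity(block)
same_consensus_type = appi >= 85.0  # cut-off quoted in the abstract
print(round(appi, 2), same_consensus_type)  # 90.48 True
```

Under these assumptions the three toy peptides would be treated as a single consensus type, since their APPI exceeds the 85% threshold.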
Human Immunodeficiency Virus type 1 group M consensus and mosaic envelope glycoproteins
Korber, Bette T.; Fischer, William; Liao, Hua-Xin; Haynes, Barton F.; Letvin, Norman; Hahn, Beatrice H.
2017-11-21
The disclosure relates to nucleic acids encoding mosaic clade M HIV-1 Env polypeptides and to compositions and vectors comprising same. The nucleic acids are suitable for use in inducing an immune response to HIV-1 in a human.
Nanoplatforms for highly sensitive fluorescence detection of cancer-related proteases.
Wang, Hongwang; Udukala, Dinusha N; Samarakoon, Thilani N; Basel, Matthew T; Kalita, Mausam; Abayaweera, Gayani; Manawadu, Harshi; Malalasekera, Aruni; Robinson, Colette; Villanueva, David; Maynez, Pamela; Bossmann, Leonie; Riedy, Elizabeth; Barriga, Jenny; Wang, Ni; Li, Ping; Higgins, Daniel A; Zhu, Gaohong; Troyer, Deryl L; Bossmann, Stefan H
2014-02-01
Numerous proteases are known to be necessary for cancer development and progression including matrix metalloproteinases (MMPs), tissue serine proteases, and cathepsins. The goal of this research is to develop an Fe/Fe3O4 nanoparticle-based system for clinical diagnostics, which has the potential to measure the activity of cancer-associated proteases in biospecimens. Nanoparticle-based "light switches" for measuring protease activity consist of fluorescent cyanine dyes and porphyrins that are attached to Fe/Fe3O4 nanoparticles via consensus sequences. These consensus sequences can be cleaved in the presence of the correct protease, thus releasing a fluorescent dye from the Fe/Fe3O4 nanoparticle, resulting in highly sensitive (down to 1 × 10^-16 mol l^-1 for 12 proteases), selective, and fast nanoplatforms (required time: 60 min).
Carlson, Jonathan M.; Chan, Benjamin; Chopera, Denis R.; Brumme, Chanson J.; Markle, Tristan J.; Martin, Eric; Shahid, Aniqa; Anmole, Gursev; Mwimanzi, Philip; Nassab, Pauline; Penney, Kali A.; Rahman, Manal A.; Milloy, M.-J.; Schechter, Martin T.; Markowitz, Martin; Carrington, Mary; Walker, Bruce D.; Wagner, Theresa; Buchbinder, Susan; Fuchs, Jonathan; Koblin, Beryl; Mayer, Kenneth H.; Harrigan, P. Richard; Brockman, Mark A.; Poon, Art F. Y.; Brumme, Zabrina L.
2014-01-01
HLA-restricted immune escape mutations that persist following HIV transmission could gradually spread through the viral population, thereby compromising host antiviral immunity as the epidemic progresses. To assess the extent and phenotypic impact of this phenomenon in an immunogenetically diverse population, we genotypically and functionally compared linked HLA and HIV (Gag/Nef) sequences from 358 historic (1979–1989) and 382 modern (2000–2011) specimens from four key cities in the North American epidemic (New York, Boston, San Francisco, Vancouver). Inferred HIV phylogenies were star-like, with approximately two-fold greater mean pairwise distances in modern versus historic sequences. The reconstructed epidemic ancestral (founder) HIV sequence was essentially identical to the North American subtype B consensus. Consistent with gradual diversification of a “consensus-like” founder virus, the median “background” frequencies of individual HLA-associated polymorphisms in HIV (in individuals lacking the restricting HLA[s]) were ∼2-fold higher in modern versus historic HIV sequences, though these remained notably low overall (e.g. in Gag, medians were 3.7% in the 2000s versus 2.0% in the 1980s). HIV polymorphisms exhibiting the greatest relative spread were those restricted by protective HLAs. Despite these increases, when HIV sequences were analyzed as a whole, their total average burden of polymorphisms that were “pre-adapted” to the average host HLA profile was only ∼2% greater in modern versus historic eras. Furthermore, HLA-associated polymorphisms identified in historic HIV sequences were consistent with those detectable today, with none identified that could explain the few HIV codons where the inferred epidemic ancestor differed from the modern consensus. Results are therefore consistent with slow HIV adaptation to HLA, but at a rate unlikely to yield imminent negative implications for cellular immunity, at least in North America. 
Intriguingly, temporal changes in protein activity of patient-derived Nef (though not Gag) sequences were observed, suggesting functional implications of population-level HIV evolution on certain viral proteins. PMID:24762668
Solid phase sequencing of double-stranded nucleic acids
Fu, Dong-Jing; Cantor, Charles R.; Koster, Hubert; Smith, Cassandra L.
2002-01-01
This invention relates to methods for detecting and sequencing of target double-stranded nucleic acid sequences, to nucleic acid probes and arrays of probes useful in these methods, and to kits and systems which contain these probes. Useful methods involve hybridizing the nucleic acids or nucleic acids which represent complementary or homologous sequences of the target to an array of nucleic acid probes. These probes comprise a single-stranded portion, an optional double-stranded portion and a variable sequence within the single-stranded portion. The molecular weights of the hybridized nucleic acids of the set can be determined by mass spectroscopy, and the sequence of the target determined from the molecular weights of the fragments. Nucleic acids whose sequences can be determined include nucleic acids in biological samples such as patient biopsies and environmental samples. Probes may be fixed to a solid support such as a hybridization chip to facilitate automated determination of molecular weights and identification of the target sequence.
Wezner-Ptasinska, Magdalena; Otlewski, Jacek
2015-12-01
Variable lymphocyte receptors (VLRs) are non-immunoglobulin components of adaptive immunity in jawless vertebrates. These proteins composed of leucine-rich repeat modules offer some advantages over antibodies in target binding and therefore are attractive candidates for biotechnological applications. In this paper we report the design and characterization of a phage display library based on a previously proposed dVLR scaffold containing six LRR modules [Wezner-Ptasinska et al., 2011]. Our library was designed based on a consensus approach in which the randomization scheme reflects the frequencies of amino acids naturally occurring in respective positions responsible for antigen recognition. We demonstrate general applicability of the scaffold by selecting dVLRs specific for lysozyme and S100A7 protein with KD values in the micromolar range. The dVLR library could be used as a convenient alternative to antibodies for effective isolation of high affinity binders.
Schwannomatosis associated with multiple meningiomas due to a familial SMARCB1 mutation.
Bacci, Costanza; Sestini, Roberta; Provenzano, Aldesia; Paganini, Irene; Mancini, Irene; Porfirio, Berardino; Vivarelli, Rossella; Genuardi, Maurizio; Papi, Laura
2010-02-01
Schwannomatosis (MIM 162091) is a condition predisposing to the development of central and peripheral schwannomas; most cases are sporadic without a clear family history but a few families with a clear autosomal dominant pattern of transmission have been described. Germline mutations in SMARCB1 are associated with schwannomatosis. We report a family with multiple schwannomas and meningiomas. A SMARCB1 germline mutation in exon 1 was identified. The mutation, c.92A>T (p.Glu31Val), occurs in a highly conserved amino acid in the SMARCB1 protein. In addition, in silico analysis demonstrated that the mutation disrupts the donor consensus sequence of exon 1. RNA studies verified the absence of mRNA transcribed by the mutant allele. This is the first report of a SMARCB1 germline mutation in a family with schwannomatosis characterized by the development of multiple meningiomas.
Ciolkowski, Ingo; Wanke, Dierk; Birkenbihl, Rainer P; Somssich, Imre E
2008-09-01
WRKY transcription factors have been shown to play a major role in regulating, both positively and negatively, the plant defense transcriptome. Nearly all studied WRKY factors appear to have a stereotypic binding preference to one DNA element termed the W-box. How specificity for certain promoters is accomplished therefore remains completely unknown. In this study, we tested five distinct Arabidopsis WRKY transcription factor subfamily members for their DNA binding selectivity towards variants of the W-box embedded in neighboring DNA sequences. These studies revealed for the first time differences in their binding site preferences, which are partly dependent on additional adjacent DNA sequences outside of the TTGACY-core motif. A consensus WRKY binding site derived from these studies was used for in silico analysis to identify potential target genes within the Arabidopsis genome. Furthermore, we show that even subtle amino acid substitutions within the DNA binding region of AtWRKY11 strongly impinge on its binding activity. Additionally, all five factors were found localized exclusively to the plant cell nucleus and to be capable of trans-activating expression of a reporter gene construct in vivo.
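An in silico scan for the TTGACY core motif amounts to expanding the IUPAC ambiguity code Y (C or T) and searching a sequence. The sketch below is an illustrative assumption about how such a scan could look, not the authors' method: it searches the forward strand only, ignores the flanking-sequence preferences the study describes, and uses a made-up promoter fragment.

```python
import re

# Standard IUPAC nucleotide ambiguity codes as regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def find_sites(consensus, dna):
    """Return (0-based start, matched site) for each forward-strand match."""
    body = "".join(IUPAC[symbol] for symbol in consensus.upper())
    pattern = re.compile(f"(?=({body}))")  # lookahead keeps overlapping hits
    return [(m.start(), m.group(1)) for m in pattern.finditer(dna.upper())]


# Hypothetical promoter fragment, not a real Arabidopsis sequence.
w_box_hits = find_sites("TTGACY", "AATTGACCGGTTGACTAAGTTGACA")
print(w_box_hits)  # [(2, 'TTGACC'), (10, 'TTGACT')]
```

The third TTGAC in the fragment is followed by A and is therefore not reported. A fuller scan would also search the reverse complement.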
A Family of at Least Seven β-Galactosidase Genes Is Expressed during Tomato Fruit Development
Smith, David L.; Gross, Kenneth C.
2000-01-01
During our search for a cDNA encoding β-galactosidase II, a β-galactosidase/exogalactanase (EC 3.2.1.23) present during tomato (Lycopersicon esculentum Mill.) fruit ripening, a family of seven tomato β-galactosidase (TBG) cDNAs was identified. The shared amino acid sequence identity among the seven TBG clones ranged from 33% to 79%. All contained the putative active site-containing consensus sequence pattern G-G-P-[LIVM]-x-Q-x-E-N-E-[FY] belonging to glycosyl hydrolase family 35. Six of the seven single-copy genes were mapped using restriction fragment length polymorphisms of recombinant inbred lines. RNA gel-blot analysis was used to evaluate TBG mRNA levels throughout fruit development, in different fruit tissues, and in various plant tissues. RNA gel-blot analysis was also used to reveal TBG mRNA levels in fruit of the rin, nor, and Nr tomato mutants. The TBG4-encoded protein, known to correspond to β-galactosidase II, was expressed in yeast and exo-galactanase activity was confirmed via a quantified release of galactosyl residues from cell wall fractions containing β(1→4)-d-galactan purified from tomato fruit. PMID:10889266
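The consensus pattern quoted above is written in PROSITE-style syntax, which maps directly onto a regular expression. The sketch below assumes only the three element types that occur in this pattern (literal residues, bracketed residue classes, and the wildcard x); the helper name `prosite_to_regex` and the test peptide are illustrative, not taken from the TBG sequences.

```python
import re


def prosite_to_regex(pattern):
    """Translate a simple PROSITE-style pattern into a regex string.

    Supports literal residues, [..] residue classes and the x wildcard only.
    """
    parts = []
    for element in pattern.split("-"):
        if element == "x":
            parts.append(".")          # any residue
        elif element.startswith("[") and element.endswith("]"):
            parts.append(element)      # residue class, same syntax in regex
        elif len(element) == 1 and element.isalpha():
            parts.append(element)      # single literal residue
        else:
            raise ValueError(f"unsupported element: {element!r}")
    return "".join(parts)


GH35_PATTERN = "G-G-P-[LIVM]-x-Q-x-E-N-E-[FY]"
GH35_REGEX = re.compile(prosite_to_regex(GH35_PATTERN))

# Made-up peptide containing one copy of the pattern.
match = GH35_REGEX.search("AAGGPIIQVENEYAA")
print(match.start(), match.group())
```

Full PROSITE syntax also includes repeat counts, exclusions and anchors, which this sketch deliberately leaves out.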
Eukaryotic tRNAs fingerprint invertebrates vis-à-vis vertebrates.
Mitra, Sanga; Das, Pijush; Samadder, Arpa; Das, Smarajit; Betai, Rupal; Chakrabarti, Jayprokas
2015-01-01
During translation, aminoacyl-tRNA synthetases recognize the identities of the tRNAs to charge them with their respective amino acids. The conserved identities of 58,244 eukaryotic tRNAs of 24 invertebrates and 45 vertebrates in genomic tRNA database were analyzed and their novel features extracted. The internal promoter sequences, namely, A-Box and B-Box, were investigated and evidence gathered that the intervention of optional nucleotides at 17a and 17b correlated with the optimal length of the A-Box. The presence of canonical transcription terminator sequences at the immediate vicinity of tRNA genes was ventured. Even though non-canonical introns had been reported in red alga, green alga, and nucleomorph so far, fairly motivating evidence of their existence emerged in tRNA genes of other eukaryotes. Non-canonical introns were seen to interfere with the internal promoters in two cases, questioning their transcription fidelity. In a first of its kind, phylogenetic constructs based on tRNA molecules delineated and built the trees of the vast and diverse invertebrates and vertebrates. Finally, two tRNA models representing the invertebrates and the vertebrates were drawn, by isolating the dominant consensus in the positional fluctuations of nucleotide compositions.
Modular structural elements in the replication origin region of Tetrahymena rDNA.
Du, C; Sanzgiri, R P; Shaiu, W L; Choi, J K; Hou, Z; Benbow, R M; Dobbs, D L
1995-01-01
Computer analyses of the DNA replication origin region in the amplified rRNA genes of Tetrahymena thermophila identified a potential initiation zone in the 5'NTS [Dobbs, Shaiu and Benbow (1994), Nucleic Acids Res. 22, 2479-2489]. This region consists of a putative DNA unwinding element (DUE) aligned with predicted bent DNA segments, nuclear matrix or scaffold associated region (MAR/SAR) consensus sequences, and other common modular sequence elements previously shown to be clustered in eukaryotic chromosomal origin regions. In this study, two mung bean nuclease-hypersensitive sites in super-coiled plasmid DNA were localized within the major DUE-like element predicted by thermodynamic analyses. Three restriction fragments of the 5'NTS region predicted to contain bent DNA segments exhibited anomalous migration characteristic of bent DNA during electrophoresis on polyacrylamide gels. Restriction fragments containing the 5'NTS region bound Tetrahymena nuclear matrices in an in vitro binding assay, consistent with an association of the replication origin region with the nuclear matrix in vivo. The direct demonstration in a protozoan origin region of elements previously identified in Drosophila, chick and mammalian origin regions suggests that clusters of modular structural elements may be a conserved feature of eukaryotic chromosomal origins of replication. PMID:7784181
Human Splice-Site Prediction with Deep Neural Networks.
Naito, Tatsuhiko
2018-04-18
Accurate splice-site prediction is essential to delineate gene structures from sequence data. Several computational techniques have been applied to create a system to predict canonical splice sites. For classification tasks, deep neural networks (DNNs) have achieved record-breaking results and often outperformed other supervised learning techniques. In this study, a new method of splice-site prediction using DNNs was proposed. The proposed system receives input sequence data and returns a prediction of whether the input is a splice site. The length of the input is 140 nucleotides, with the consensus sequence (i.e., "GT" and "AG" for the donor and acceptor sites, respectively) in the middle. Each input sequence is passed to the pretrained DNN model, which determines the probability that the input is a splice site. The model consists of convolutional layers and bidirectional long short-term memory network layers. The pretraining and validation were conducted using the data set tested in previously reported methods. The performance evaluation results showed that the proposed method can outperform the previous methods. In addition, the pattern learned by the DNNs was visualized as position frequency matrices (PFMs). Some of the PFMs were very similar to the consensus sequence. The trained DNN model and the brief source code for the prediction system are uploaded. Further improvement will be achieved following the further development of DNNs.
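The input format described above (a 140-nucleotide window with the consensus dinucleotide in the middle) can be sketched as a window-extraction step that would precede the classifier. This is not the paper's code: the exact placement of the dinucleotide is an assumption (69 flanking nucleotides on each side, so that 69 + 2 + 69 = 140), and `candidate_windows` is a hypothetical helper name.

```python
def candidate_windows(seq, dinucleotide, window=140):
    """Yield (position, window_sequence) for each dinucleotide occurrence.

    Only occurrences with complete flanks on both sides are reported.
    Assumes the dinucleotide sits exactly in the centre of the window.
    """
    seq = seq.upper()
    flank = (window - len(dinucleotide)) // 2
    # Start searching where a full upstream flank is available.
    start = seq.find(dinucleotide, flank)
    while start != -1 and start + len(dinucleotide) + flank <= len(seq):
        yield start, seq[start - flank:start + len(dinucleotide) + flank]
        start = seq.find(dinucleotide, start + 1)


# Toy example with a 4-nt window so the result is easy to read.
print(list(candidate_windows("AAGTCCGTAA", "GT", window=4)))
```

Each yielded window would then be encoded (for example one-hot) and scored by the trained model; that part depends on the released model and is not reproduced here.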
[Prediction of ETA oligopeptides antagonists from Glycine max based on in silico proteolysis].
Qiao, Lian-Sheng; Jiang, Lu-di; Luo, Gang-Gang; Lu, Fang; Chen, Yan-Kun; Wang, Ling-Zhi; Li, Gong-Yu; Zhang, Yan-Ling
2017-02-01
Oligopeptides are among the key pharmaceutically effective constituents of traditional Chinese medicine (TCM). Systematic study of the composition and efficacy of TCM oligopeptides is essential for the analysis of the material basis and mechanism of TCM. In this study, the potential anti-hypertensive oligopeptides from Glycine max and their endothelin receptor A (ETA) antagonistic activity were discovered and predicted based on in silico technologies. Main protein sequences of G. max were collected and oligopeptides were obtained using in silico gastrointestinal tract proteolysis. Then, the pharmacophore of ETA antagonistic peptides was constructed and included one hydrophobic feature, one ionizable negative feature, one ring aromatic feature and five excluded volumes. Meanwhile, a three-dimensional structure of ETA was developed by homology modeling methods for further docking studies. According to docking analysis and consensus score, the key amino acid GLN165 was identified for ETA antagonistic activity, and 27 oligopeptides from G. max were predicted as potential ETA antagonists by pharmacophore and docking studies. In silico proteolysis could be used to analyze the protein sequences from TCM. By combining in silico proteolysis and molecular simulation, the biological activities of oligopeptides could be predicted rapidly based on the known TCM protein sequence. It might provide the methodological basis for rapidly and efficiently implementing the mechanism analysis of TCM oligopeptides. Copyright© by the Chinese Pharmaceutical Association.
Evidence for Horizontal Gene Transfer in Evolution of Elongation Factor Tu in Enterococci
Ke, Danbing; Boissinot, Maurice; Huletsky, Ann; Picard, François J.; Frenette, Johanne; Ouellette, Marc; Roy, Paul H.; Bergeron, Michel G.
2000-01-01
The elongation factor Tu, encoded by tuf genes, is a GTP binding protein that plays a central role in protein synthesis. One to three tuf genes per genome are present, depending on the bacterial species. Most low-G+C-content gram-positive bacteria carry only one tuf gene. We have designed degenerate PCR primers derived from consensus sequences of the tuf gene to amplify partial tuf sequences from 17 enterococcal species and other phylogenetically related species. The amplified DNA fragments were sequenced either by direct sequencing or by sequencing cloned inserts containing putative amplicons. Two different tuf genes (tufA and tufB) were found in 11 enterococcal species, including Enterococcus avium, Enterococcus casseliflavus, Enterococcus dispar, Enterococcus durans, Enterococcus faecium, Enterococcus gallinarum, Enterococcus hirae, Enterococcus malodoratus, Enterococcus mundtii, Enterococcus pseudoavium, and Enterococcus raffinosus. For the other six enterococcal species (Enterococcus cecorum, Enterococcus columbae, Enterococcus faecalis, Enterococcus sulfureus, Enterococcus saccharolyticus, and Enterococcus solitarius), only the tufA gene was present. Based on 16S rRNA gene sequence analysis, the 11 species having two tuf genes all have a common ancestor, while the six species having only one copy diverged from the enterococcal lineage before that common ancestor. The presence of one or two copies of the tuf gene in enterococci was confirmed by Southern hybridization. Phylogenetic analysis of tuf sequences demonstrated that the enterococcal tufA gene branches with the Bacillus, Listeria, and Staphylococcus genera, while the enterococcal tufB gene clusters with the genera Streptococcus and Lactococcus. Primary structure analysis showed that four amino acid residues encoded within the sequenced regions are conserved and unique to the enterococcal tufB genes and the tuf genes of streptococci and Lactococcus lactis. 
The data suggest that an ancestral streptococcus or a streptococcus-related species may have horizontally transferred a tuf gene to the common ancestor of the 11 enterococcal species which now carry two tuf genes. PMID:11092850
Vera-Cabrera, L; Johnson, W M; Welsh, O; Resendiz-Uresti, F L; Salinas-Carmona, M C
1999-06-01
An immunodominant protein from Nocardia brasiliensis, P61, was subjected to amino-terminal and internal sequence analysis. Three sequences of 22, 17, and 38 residues, respectively, were obtained and compared with the protein database from GenBank by using the BLAST system. The sequences showed homology to some eukaryotic catalases and to a bromoperoxidase-catalase from Streptomyces violaceus. Its identity as a catalase was confirmed by analysis of its enzymatic activity on H2O2 and by a double-staining method on a nondenaturing polyacrylamide gel with 3,3'-diaminobenzidine and ferricyanide; the result showed only catalase activity, but no peroxidase. By using one of the internal amino acid sequences and a consensus catalase motif (VGNNTP), we were able to design a PCR assay that generated a 500-bp PCR product. The amplicon was analyzed, and the nucleotide sequence was compared to the GenBank database with the observation of high homology to other bacterial and eukaryotic catalases. A PCR assay based on this target sequence was performed with primers NB10 and NB11 to confirm the presence of the NB10-NB11 gene fragment in several N. brasiliensis strains isolated from mycetoma. The same assay was used to determine whether there were homologous sequences in several type strains from the genera Nocardia, Rhodococcus, Gordona, and Streptomyces. All of the N. brasiliensis strains presented a positive result but only some of the actinomycetes species tested were positive in the PCR assay. In order to confirm these findings, genomic DNA was subjected to Southern blot analysis. A 1.7-kbp band was observed in the N. brasiliensis strains, and bands of different molecular weight were observed in cross-reacting actinomycetes. Sequence analysis of the amplicons of selected actinomycetes showed high homology in this catalase fragment, thus demonstrating that this protein is highly conserved in this group of bacteria.
Amexis, Georgios; Oeth, Paul; Abel, Kenneth; Ivshina, Anna; Pelloquin, Francois; Cantor, Charles R.; Braun, Andreas; Chumakov, Konstantin
2001-01-01
RNA viruses exist as quasispecies, heterogeneous and dynamic mixtures of mutants having one or more consensus sequences. An adequate description of the genomic structure of such viral populations must include the consensus sequence(s) plus a quantitative assessment of sequence heterogeneities. For example, in quality control of live attenuated viral vaccines, the presence of even small quantities of mutants or revertants may indicate incomplete or unstable attenuation that may influence vaccine safety. Previously, we demonstrated the monitoring of oral poliovirus vaccine with the use of mutant analysis by PCR and restriction enzyme cleavage (MAPREC). In this report, we investigate genetic variation in live attenuated mumps virus vaccine by using both MAPREC and a platform (DNA MassArray) based on matrix-assisted laser desorption/ionization time-of-flight (MALDI-TOF) mass spectrometry. Mumps vaccines prepared from the Jeryl Lynn strain typically contain at least two distinct viral substrains, JL1 and JL2, which have been characterized by full length sequencing. We report the development of assays for characterizing sequence variants in these substrains and demonstrate their use in quantitative analysis of substrains and sequence variations in mixed virus cultures and mumps vaccines. The results obtained from both the MAPREC and MALDI-TOF methods showed excellent correlation. This suggests the potential utility of MALDI-TOF for routine quality control of live viral vaccines and for assessment of genetic stability and quantitative monitoring of genetic changes in other RNA viruses of clinical interest. PMID:11593021
Colinet, Anne-Sophie; Thines, Louise; Deschamps, Antoine; Flémal, Gaëlle; Demaegd, Didier; Morsomme, Pierre
2017-07-01
The UPF0016 family is a recently identified group of poorly characterized membrane proteins whose function is conserved through evolution and that are defined by the presence of 1 or 2 copies of the E-φ-G-D-[KR]-[TS] consensus motif in their transmembrane domain. We showed that 2 members of this family, the human TMEM165 and the budding yeast Gdt1p, are functionally related and are likely to form a new group of Ca2+ transporters. Mutations in TMEM165 have been demonstrated to cause a new type of rare human genetic diseases denominated as Congenital Disorders of Glycosylation. Using site-directed mutagenesis, we generated 17 mutations in the yeast Golgi-localized Ca2+ transporter Gdt1p. Single alanine substitutions were targeted to the highly conserved consensus motifs, 4 acidic residues localized in the central cytosolic loop, and the arginine at position 71. The mutants were screened in a yeast strain devoid of both the endogenous Gdt1p exchanger and Pmr1p, the Ca2+-ATPase of the Golgi apparatus. We show here that acidic and polar uncharged residues of the consensus motifs play a crucial role in calcium tolerance and calcium transport activity and are therefore likely to be architectural components of the cation binding site of Gdt1p. Importantly, we confirm the essential role of the E53 residue whose mutation in humans triggers congenital disorders of glycosylation. © 2017 John Wiley & Sons Ltd.
Accurate Typing of Human Leukocyte Antigen Class I Genes by Oxford Nanopore Sequencing.
Liu, Chang; Xiao, Fangzhou; Hoisington-Lopez, Jessica; Lang, Kathrin; Quenzel, Philipp; Duffy, Brian; Mitra, Robi David
2018-04-03
Oxford Nanopore Technologies' MinION has expanded the current DNA sequencing toolkit by delivering long read lengths and extreme portability. The MinION has the potential to enable expedited point-of-care human leukocyte antigen (HLA) typing, an assay routinely used to assess the immunologic compatibility between organ donors and recipients, but the platform's high error rate makes it challenging to type alleles with accuracy. We developed and validated accurate typing of HLA by Oxford nanopore (Athlon), a bioinformatic pipeline that i) maps nanopore reads to a database of known HLA alleles, ii) identifies candidate alleles with the highest read coverage at different resolution levels that are represented as branching nodes and leaves of a tree structure, iii) generates consensus sequences by remapping the reads to the candidate alleles, and iv) calls the final diploid genotype by blasting consensus sequences against the reference database. Using two independent data sets generated on the R9.4 flow cell chemistry, Athlon achieved a 100% accuracy in class I HLA typing at the two-field resolution. Copyright © 2018 American Society for Investigative Pathology and the Association for Molecular Pathology. Published by Elsevier Inc. All rights reserved.
CHROMA: consensus-based colouring of multiple alignments for publication.
Goodstadt, L; Ponting, C P
2001-09-01
CHROMA annotates multiple protein sequence alignments by consensus to produce formatted and coloured text suitable for incorporation into other documents for publication. The package is designed to be flexible and reliable, and has a simple-to-use graphical user interface running under Microsoft Windows. Both the executables and source code for CHROMA running under Windows and Linux (portable command-line only) are freely available at http://www.lg.ndirect.co.uk/chroma. Software enquiries should be directed to CHROMA@lg.ndirect.co.uk.
Solid phase sequencing of biopolymers
Cantor, Charles; Koster, Hubert
2010-09-28
This invention relates to methods for detecting and sequencing target nucleic acid sequences, to mass modified nucleic acid probes and arrays of probes useful in these methods, and to kits and systems which contain these probes. Useful methods involve hybridizing the nucleic acids or nucleic acids which represent complementary or homologous sequences of the target to an array of nucleic acid probes. These probes comprise a single-stranded portion, an optional double-stranded portion and a variable sequence within the single-stranded portion. The molecular weights of the hybridized nucleic acids of the set can be determined by mass spectroscopy, and the sequence of the target determined from the molecular weights of the fragments. Nucleic acids whose sequences can be determined include DNA or RNA in biological samples such as patient biopsies and environmental samples. Probes may be fixed to a solid support such as a hybridization chip to facilitate automated molecular weight analysis and identification of the target sequence.
The TOPCONS web server for consensus prediction of membrane protein topology and signal peptides.
Tsirigos, Konstantinos D; Peters, Christoph; Shu, Nanjiang; Käll, Lukas; Elofsson, Arne
2015-07-01
TOPCONS (http://topcons.net/) is a widely used web server for consensus prediction of membrane protein topology. We hereby present a major update to the server, with some substantial improvements, including the following: (i) TOPCONS can now efficiently separate signal peptides from transmembrane regions. (ii) The server can now differentiate more successfully between globular and membrane proteins. (iii) The server now is even slightly faster, although a much larger database is used to generate the multiple sequence alignments. For most proteins, the final prediction is produced in a matter of seconds. (iv) The user-friendly interface is retained, with the additional feature of submitting batch files and accessing the server programmatically using standard interfaces, making it thus ideal for proteome-wide analyses. Indicatively, the user can now scan the entire human proteome in a few days. (v) For proteins with homology to a known 3D structure, the homology-inferred topology is also displayed. (vi) Finally, the combination of methods currently implemented achieves an overall increase in performance by 4% as compared to the currently available best-scoring methods and TOPCONS is the only method that can identify signal peptides and still maintain a state-of-the-art performance in topology predictions. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
Aali, Mohsen; Moradi-Shahrbabak, Mohammad; Moradi-Shahrbabak, Hosein; Sadeghi, Mostafa
2014-03-01
Calpastatin has been introduced as a potential candidate gene for growth and meat quality traits. In this study, genetic variability was investigated in the exon 6 and its intron boundaries of ovine CAST gene by PCR-SSCP analysis and DNA sequencing. Also a protein sequence and structural analysis were performed to predict the possible impact of amino acid substitutions on physicochemical properties and structure of the CAST protein. A total of 487 animals belonging to four ancient Iranian sheep breeds with different fat metabolisms, Lori-Bakhtiari and Chall (fat-tailed), Zel-Atabay cross-bred (medium fat-tailed) and Zel (thin-tailed), were analyzed. Eight unique SSCP patterns, representing eight different sequences or haplotypes, CAST-1, CAST-2 and CAST-6 to CAST-11, were identified. Haplotypes CAST-1 and CAST-2 were most common with frequency of 0.365 and 0.295. The novel haplotype CAST-8 had considerable frequency in Iranian sheep breeds (0.129). All the consensus sequences showed 98-99%, 94-98%, 92-93% and 82-83% similarity to the published ovine, caprine, bovine and porcine CAST locus sequences, respectively. Sequence analysis revealed four SNPs in intron 5 (C24T, G62A, G65T and T69-) and three SNPs in exon 6 (c.197A>T, c.282G>T and c.296C>G). All three SNPs in exon 6 were missense mutations which would result in p.Gln66Leu, p.Glu94Asp and p.Pro99Arg substitutions, respectively, in CAST protein. All three amino acid substitutions affected the physicochemical properties of ovine CAST protein including hydrophobicity, amphiphilicity and net charge and subsequently might influence its structure and effect on the activity of Ca2+ channels; hence, they might regulate calpain activity and afterwards meat tenderness and growth rate. The Lori-Bakhtiari population showed the highest heterozygosity in the ovine CAST locus (0.802). 
Frequency difference of haplotypes CAST-10 and CAST-8 between Lori-Bakhtiari (fat-tailed) and Zel (thin-tailed) breeds was highly significant (P<0.001), indicating that these two haplotypes might be breed-specific haplotypes that distinguish between fat-tailed and thin-tailed sheep breeds. Copyright © 2013 Elsevier B.V. All rights reserved.
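The correspondence between the three exon 6 coding-position changes and the reported amino acid substitutions can be checked with simple codon arithmetic: coding position p falls in codon (p - 1) // 3 + 1. The sketch below uses the standard genetic code; the reference codons passed in (CAG, GAG, CCT) are illustrative assumptions chosen to be consistent with the reported residues, since the abstract does not give the actual CAST codons, and the helper names are hypothetical.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1) in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def codon_position(cdna_pos):
    """Map a 1-based coding position to (codon number, 0-based offset in codon)."""
    return (cdna_pos - 1) // 3 + 1, (cdna_pos - 1) % 3


def apply_substitution(ref_codon, cdna_pos, ref, alt):
    """Return (reference residue, mutant residue) for a single-base change."""
    _, offset = codon_position(cdna_pos)
    if ref_codon[offset] != ref:
        raise ValueError("reference base does not match the codon")
    mutant = ref_codon[:offset] + alt + ref_codon[offset + 1:]
    return CODON_TABLE[ref_codon], CODON_TABLE[mutant]


# Reference codons here are assumed for illustration, not taken from CAST.
for codon, pos, ref, alt in [("CAG", 197, "A", "T"),
                             ("GAG", 282, "G", "T"),
                             ("CCT", 296, "C", "G")]:
    print(pos, codon_position(pos)[0], apply_substitution(codon, pos, ref, alt))
```

The codon numbers obtained this way (66, 94 and 99) agree with the residue positions given for the three substitutions.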
Rouanet, Carine; Reverchon, Sylvie; Rodionov, Dmitry A; Nasser, William
2004-07-16
In Erwinia chrysanthemi, production of pectic enzymes is modulated by a complex network involving several regulators. One of them, PecS, which belongs to the MarR family, also controls the synthesis of various other virulence factors, such as cellulases and indigoidine. Here, the PecS consensus-binding site is defined by combining a systematic evolution of ligands by an exponential enrichment approach and mutational analyses. The consensus consists of a 23-base pair palindromic-like sequence (C(-11)G(-10)A(-9)N(-8)W(-7)T(-6)C(-5)G(-4)T(-3)A(-2))T(-1)A(0)T(1)(T(2)A(3)C(4)G(5)A(6)N(7)N(8)N(9)C(10)G(11)). Mutational experiments revealed that (i) the palindromic organization is required for the binding of PecS, (ii) the very conserved part of the consensus (-6 to 6) allows for a specific interaction with PecS, but the presence of the relatively degenerated bases located apart significantly increases PecS affinity, (iii) the four bases G, A, T, and C are required for efficient binding of PecS, and (iv) the presence of several binding sites on the same promoter increases the affinity of PecS. This consensus is detected in the regions involved in PecS binding on the previously characterized target genes. This variable consensus is in agreement with the observation that the members of the MarR family are able to bind various DNA targets as dimers by means of a winged helix DNA-binding motif. Binding of PecS on a promoter region containing the defined consensus results in a repression of gene transcription in vitro. Preliminary scanning of the E. chrysanthemi genome sequence with the consensus revealed the presence of strong PecS-binding sites in the intergenic region between fliE and fliFGHIJKLMNOPQR which encode proteins involved in the biogenesis of flagellum. Accordingly, PecS directly represses fliE expression. Thus, PecS seems to control the synthesis of virulence factors required for the key steps of plant infection.
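The genome scan mentioned above can be approximated, very roughly, by expanding the 23-base consensus into a regular expression using IUPAC ambiguity codes (N = any base, W = A or T). This is a simplified sketch under stated assumptions: it reports exact matches only, whereas a real scan for PecS sites would more likely score partial matches with a weight matrix, and the names `consensus_to_regex` and `scan` are hypothetical.

```python
import re

# IUPAC nucleotide ambiguity codes expanded to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}

# Positions -11 to +11 of the consensus as written in the abstract.
PECS_CONSENSUS = "CGANWTCGTATATTACGANNNCG"


def consensus_to_regex(consensus):
    """Compile an IUPAC consensus into a regex that reports overlapping hits."""
    body = "".join(IUPAC[base] for base in consensus)
    return re.compile("(?=(" + body + "))")


def scan(seq, consensus=PECS_CONSENSUS):
    """Return (0-based start, matched site) for each exact consensus match."""
    pattern = consensus_to_regex(consensus)
    return [(m.start(), m.group(1)) for m in pattern.finditer(seq.upper())]


# Made-up test sequence containing one site that fits the consensus.
site = "CGAGATCGTATATTACGACCCCG"
print(scan("TT" + site + "AA"))
```

Because the consensus is only palindromic-like rather than a perfect palindrome, a complete scan would also search the reverse complement of the target sequence.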
Diversity of acetic acid bacteria present in healthy grapes from the Canary Islands.
Valera, Maria José; Laich, Federico; González, Sara S; Torija, Maria Jesús; Mateo, Estibaliz; Mas, Albert
2011-11-15
The identification of acetic acid bacteria (AAB) from sound grapes from the Canary Islands is reported in the present study. No direct recovery of bacteria was possible in the most commonly used medium, so microvinifications were performed on grapes from Tenerife, La Palma and Lanzarote islands. Up to 396 AAB were isolated from those microvinifications and identified by 16S rRNA gene sequencing and phylogenetic analysis. With this method, Acetobacter pasteurianus, Acetobacter tropicalis, Gluconobacter japonicus and Gluconacetobacter saccharivorans were identified. However, no discrimination between the closely related species Acetobacter malorum and Acetobacter cerevisiae was possible. As previously described, 16S-23S rRNA gene internal transcribed spacer (ITS) region phylogenetic analysis was required to classify isolates as one of those species. These two species were the most frequently occurring, accounting for more than 60% of the isolates. For typing the AAB isolates, both the Enterobacterial Repetitive Intergenic Consensus (ERIC)-PCR and (GTG)5-PCR techniques gave similar resolution. A total of 60 profiles were identified. Thirteen of these profiles were found in more than one vineyard, and only one profile was found on two different islands (Tenerife and La Palma). Copyright © 2011 Elsevier B.V. All rights reserved.
Yatuv, Rivka; Robinson, Micah; Dayan, Inbal; Baru, Moshe
2010-02-01
Improving the pharmacodynamics of protein drugs has the potential to improve the care and the quality of life of patients suffering from a variety of diseases. Four approaches to improve protein drugs are described: PEGylation, amino acid substitution, fusion to carrier proteins and encapsulation. A new platform technology based on the binding of proteins/peptides to the outer surface of PEGylated liposomes (PEGLip) is then presented. Binding of proteins to PEGLip is non-covalent, highly specific and dependent on an amino acid consensus sequence within the proteins. Association of proteins with PEGLip results in substantial enhancement of the pharmacodynamic properties of proteins following administration. This has been demonstrated in preclinical studies and clinical trials with coagulation factors VIII and VIIa. It has also been demonstrated in preclinical studies with granulocyte colony-stimulating factor. A mechanism is presented that explains the improvements in hemostatic efficacy of PEGLip-formulated coagulation factors VIII and VIIa. The reader will gain an understanding of the advantages and disadvantages of each of the approaches discussed. PEGLip formulation is an important new approach to improve the pharmacodynamics of protein drugs. This approach may be applied to further therapeutic proteins in the future.
Predictive Structure and Topology of Peroxisomal ATP-Binding Cassette (ABC) Transporters
Andreoletti, Pierre; Raas, Quentin; Gondcaille, Catherine; Cherkaoui-Malki, Mustapha; Trompier, Doriane; Savary, Stéphane
2017-01-01
The peroxisomal ATP-binding Cassette (ABC) transporters, which are called ABCD1, ABCD2 and ABCD3, are transmembrane proteins involved in the transport of various lipids that allow their degradation inside the organelle. Defective ABCD1 leads to the accumulation of very long-chain fatty acids and is associated with a complex and severe neurodegenerative disorder called X-linked adrenoleukodystrophy (X-ALD). Although the nucleotide-binding domain is highly conserved and characterized within the ABC transporters family, solid data are missing for the transmembrane domain (TMD) of ABCD proteins. The lack of a clear consensus on the secondary and tertiary structure of the TMDs weakens any structure-function hypothesis based on the very diverse ABCD1 mutations found in X-ALD patients. Therefore, we first reinvestigated thoroughly the structure-function data available and performed refined alignments of ABCD protein sequences. Based on the 2.85 Å resolution crystal structure of the mitochondrial ABC transporter ABCB10, here we propose a structural model of peroxisomal ABCD proteins that specifies the position of the transmembrane and coupling helices, and highlight functional motifs and putative important amino acid residues. PMID:28737695
Sattler, Ursula; Khosravi, Mojtaba; Avila, Mislay; Pilo, Paola; Langedijk, Johannes P; Ader-Ebert, Nadine; Alves, Lisa A; Plattet, Philippe; Origgi, Francesco C
2014-07-01
The hemagglutinin (H) gene of canine distemper virus (CDV) encodes the receptor-binding protein. This protein, together with the fusion (F) protein, is pivotal for infectivity since it contributes to the fusion of the viral envelope with the host cell membrane. Of the two receptors currently known for CDV (nectin-4 and the signaling lymphocyte activation molecule [SLAM]), SLAM is considered the most relevant for host susceptibility. To investigate how evolution might have impacted the host-CDV interaction, we examined the functional properties of a series of missense single nucleotide polymorphisms (SNPs) naturally accumulating within the H-gene sequences during the transition between two distinct but related strains. The two strains, a wild-type strain and a consensus strain, were part of a single continental outbreak in European wildlife and occurred in distinct geographical areas 2 years apart. The deduced amino acid sequence of the two H genes differed at 5 residues. A panel of mutants carrying all the combinations of the SNPs was obtained by site-directed mutagenesis. The selected mutant, wild type, and consensus H proteins were functionally evaluated according to their surface expression, SLAM binding, fusion protein interaction, and cell fusion efficiencies. The results highlight that the most detrimental functional effects are associated with specific sets of SNPs. Strikingly, an efficient compensational system driven by additional SNPs appears to come into play, virtually neutralizing the negative functional effects. This system seems to contribute to the maintenance of the tightly regulated function of the H-gene-encoded attachment protein. Importance: To investigate how evolution might have impacted the host-canine distemper virus (CDV) interaction, we examined the functional properties of naturally occurring single nucleotide polymorphisms (SNPs) in the hemagglutinin gene of two related but distinct strains of CDV. 
The hemagglutinin gene encodes the attachment protein, which is pivotal for infection. Our results show that few SNPs have a relevant detrimental impact and they generally appear in specific combinations (molecular signatures). These drastic negative changes are neutralized by compensatory mutations, which contribute to maintenance of an overall constant bioactivity of the attachment protein. This compensational mechanism might reflect the reaction of the CDV machinery to the changes occurring in the virus following antigenic variations critical for virulence. Copyright © 2014, American Society for Microbiology. All Rights Reserved.
Yang, V W; Marks, J A; Davis, B P; Jeffries, T W
1994-01-01
This paper describes the first high-efficiency transformation system for the xylose-fermenting yeast Pichia stipitis. The system includes integrating and autonomously replicating plasmids based on the gene for orotidine-5'-phosphate decarboxylase (URA3) and an autonomous replicating sequence (ARS) element (ARS2) isolated from P. stipitis CBS 6054. Ura- auxotrophs were obtained by selecting for resistance to 5-fluoroorotic acid and were identified as ura3 mutants by transformation with P. stipitis URA3. P. stipitis URA3 was cloned by its homology to Saccharomyces cerevisiae URA3, with which it is 69% identical in the coding region. P. stipitis ARS elements were cloned functionally through plasmid rescue. These sequences confer autonomous replication when cloned into vectors bearing the P. stipitis URA3 gene. P. stipitis ARS2 has features similar to those of the consensus ARS of S. cerevisiae and other ARS elements. Circular plasmids bearing the P. stipitis URA3 gene with various amounts of flanking sequences produced 600 to 8,600 Ura+ transformants per microgram of DNA by electroporation. Most transformants obtained with circular vectors arose without integration of vector sequences. One vector yielded 5,200 to 12,500 Ura+ transformants per microgram of DNA after it was linearized at various restriction enzyme sites within the P. stipitis URA3 insert. Transformants arising from linearized vectors produced stable integrants, and integration events were site specific for the genomic ura3 in 20% of the transformants examined. Plasmids bearing the P. stipitis URA3 gene and ARS2 element produced more than 30,000 transformants per microgram of plasmid DNA. Autonomously replicating plasmids were stable for at least 50 generations in selection medium and were present at an average of 10 copies per nucleus. PMID:7811063
Altet, Laura; Francino, Olga; Solano-Gallego, Laia; Renier, Corinne; Sánchez, Armand
2002-01-01
The NRAMP1 gene (Slc11a1) encodes an ion transporter protein involved in the control of intraphagosomal replication of parasites and in macrophage activation. It has been described in mice as the determinant of natural resistance or susceptibility to infection with antigenically unrelated pathogens, including Leishmania. Our aims were to sequence and map the canine Slc11a1 gene and to identify mutations that may be associated with resistance or susceptibility to Leishmania infection. The canine Slc11a1 gene has been mapped to dog chromosome CFA37 and covers 9 kb, including a 700-bp promoter region, 15 exons, and a polymorphic microsatellite in intron 1. It encodes a 547-amino-acid protein that has over 87% identity with the Slc11a1 proteins of different mammalian species. A case-control study with 33 resistant and 84 susceptible dogs showed an association between allele 145 of the microsatellite and susceptible dogs. Sequence variant analysis was performed by direct sequencing of the cDNA and the promoter region of four unrelated beagles experimentally infected with Leishmania infantum to search for possible functional mutations. Two of the dogs were classified as susceptible and the other two were classified as resistant based on their immune responses. Two important mutations were found in susceptible dogs: a G-rich region in the promoter that was common to both animals and a complete deletion of exon 11, which encodes the consensus transport motif of the protein, in the unique susceptible dog that needed an additional and prolonged treatment to avoid continuous relapses. A study with a larger dog population would be required to prove the association of these sequence variants with disease susceptibility. PMID:12010961
Optimal treatment sequence in COPD: Can a consensus be found?
Ferreira, J; Drummond, M; Pires, N; Reis, G; Alves, C; Robalo-Cordeiro, C
2016-01-01
There is currently no consensus on the treatment sequence in chronic obstructive pulmonary disease (COPD), although it is recognized that early diagnosis is of paramount importance to start treatment in the early stages of the disease. It is fairly consensual that initial treatment should be with an inhaled short-acting beta-agonist, a short-acting muscarinic antagonist, a long-acting beta-agonist or a long-acting muscarinic antagonist. As the disease progresses, several therapeutic options are available, and which to choose at each disease stage remains controversial. When and in which patients to use dual bronchodilation? When to use inhaled corticosteroids? And triple therapy? Are the existing non-inhaled therapies, such as mucolytic agents, antibiotics, phosphodiesterase-4 inhibitors, methylxanthines and immunostimulating agents, useful? If so, which patients would benefit? Should co-morbidities be taken into account when choosing COPD therapy for a patient? This paper reviews current guidelines and available evidence and proposes a therapeutic scheme for COPD patients. We also propose a treatment algorithm in the hope that it will help physicians to decide the best approach for their patients. The authors conclude that, at present, a full consensus on optimal treatment sequence in COPD cannot be found, mainly due to disease heterogeneity and lack of biomarkers to guide treatment. For the time being, and although some therapeutic approaches are consensual, treatment of COPD should be patient-oriented. Copyright © 2015 Sociedade Portuguesa de Pneumologia. Published by Elsevier España, S.L.U. All rights reserved.
Sampled-data consensus in switching networks of integrators based on edge events
NASA Astrophysics Data System (ADS)
Xiao, Feng; Meng, Xiangyu; Chen, Tongwen
2015-02-01
This paper investigates the event-driven sampled-data consensus in switching networks of multiple integrators and studies both the bidirectional interaction and leader-following passive reaction topologies in a unified framework. In these topologies, each information link is modelled by an edge of the information graph and assigned a sequence of edge events, which activate the mutual data sampling and controller updates of the two linked agents. Two kinds of edge-event-detecting rules are proposed for the general asynchronous data-sampling case and the synchronous periodic event-detecting case. They are implemented in a distributed fashion, and their effectiveness in reducing communication costs and solving consensus problems under a jointly connected topology condition is shown by both theoretical analysis and simulation examples.
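The edge-event idea in the abstract above can be illustrated with a minimal sketch. The abstract does not state the paper's actual event-detecting rules, so the fixed-threshold rule used here, together with the function name `simulate_edge_event_consensus` and the parameters `threshold`, `dt`, and `steps`, are assumptions made purely for illustration. The sketch covers only the synchronous, periodically checked case on a fixed bidirectional graph of single integrators.

```python
def simulate_edge_event_consensus(x0, edges, threshold=0.05, dt=0.01, steps=2000):
    """Simulate single integrators x_i' = u_i with edge-event sampling.

    Each undirected edge (i, j) stores the last sampled difference x_j - x_i.
    An edge event fires when the true difference has drifted from the stored
    sample by more than `threshold` (an assumed rule, not the paper's);
    only then do the two linked agents resample and update their controllers.
    """
    x = list(x0)
    # Sample every edge once at start-up.
    sampled = {(i, j): x[j] - x[i] for i, j in edges}
    events = 0
    for _ in range(steps):
        # Synchronous periodic event detection, once per time step.
        for (i, j) in edges:
            diff = x[j] - x[i]
            if abs(diff - sampled[(i, j)]) > threshold:
                sampled[(i, j)] = diff
                events += 1
        # Each controller uses only the sampled (possibly stale) differences.
        u = [0.0] * len(x)
        for (i, j), d in sampled.items():
            u[i] += d   # agent i moves toward agent j
            u[j] -= d   # agent j moves toward agent i
        x = [xi + dt * ui for xi, ui in zip(x, u)]
    return x, events


if __name__ == "__main__":
    # Hypothetical path graph 0-1-2-3 with arbitrary initial states.
    final, n_events = simulate_edge_event_consensus(
        [1.0, 4.0, -2.0, 7.0], [(0, 1), (1, 2), (2, 3)]
    )
    print(final, n_events)
```

Because every sampled difference is added to one agent and subtracted from the other, the state average is preserved. With a fixed threshold the states reach a small neighbourhood of the average rather than exact agreement, while edges resample far less often than once per step.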
Fontán-Gabás, Lorena; Oliemuller, Erik; Martínez-Irujo, Juan José; de Miguel, Carlos; Rouzaut, Ana
2007-01-01
Neurons are highly polarized cells composed of two structurally and functionally distinct parts, the axon and the dendrite. The establishment of this asymmetric structure is a tightly regulated process. In fact, alterations in the proteins involved in the configuration of the microtubule lattice are frequent in neuro-oncologic diseases. One of these cytoplasmic mediators is the protein known as collapsin response mediator protein-2, which interacts with and promotes tubulin polymerization. In this study, we investigated collapsin response mediator protein-2 transcriptional regulation during all-trans-retinoic acid-induced differentiation of SH-SY5Y neuroblastoma cells. All-trans-retinoic acid is considered to be a potential preventive and therapeutic agent, and has been extensively used to differentiate neuroblastoma cells in vitro. Therefore, we first demonstrated that collapsin response mediator protein-2 mRNA levels are downregulated during the differentiation process. After completion of deletion construct analysis and mutagenesis and mobility shift assays, we concluded that collapsin response mediator protein-2 basal promoter activity is regulated by the transcription factors AP-2 and Pax-3, whereas E2F, Sp1 and NeuroD1 seem not to participate in its regulation. Furthermore, we finally established that reduced expression of collapsin response mediator protein-2 after all-trans-retinoic acid exposure is associated with impaired Pax-3 and AP-2 binding to their consensus sequences in the collapsin response mediator protein-2 promoter. Decreased attachment of AP-2 is a consequence of its accumulation in the cytoplasm. On the other hand, Pax-3 shows lower binding due to all-trans-retinoic acid-mediated transcriptional repression. Unraveling the molecular mechanisms behind the action of all-trans-retinoic acid on neuroblastoma cells may well offer new perspectives for its clinical application.
Evolution of amino acid metabolism inferred through cladistic analysis.
Cunchillos, Chomin; Lecointre, Guillaume
2003-11-28
Because free amino acids were most probably available in primitive abiotic environments, their metabolism is likely to have provided some of the very first metabolic pathways of life. What were the first enzymatic reactions to emerge? A cladistic analysis of metabolic pathways of the 16 aliphatic amino acids and 2 portions of the Krebs cycle was performed using four criteria of homology. The analysis is not based on sequence comparisons but, rather, on coding similarities in enzyme properties. The properties used are shared specific enzymatic activity, shared enzymatic function without substrate specificity, shared coenzymes, and shared functional family. The tree shows that the earliest pathways to emerge are not portions of the Krebs cycle but metabolisms of aspartate, asparagine, glutamate, and glutamine. The views of Horowitz (Horowitz, N. H. (1945) Proc. Natl. Acad. Sci. U. S. A. 31, 153-157) and Cordón (Cordón, F. (1990) Tratado Evolucionista de Biologia, Aguilar, Madrid, Spain), according to which the upstream reactions in the catabolic pathways and the downstream reactions in the anabolic pathways are the earliest in evolution, are globally corroborated; however, with some exceptions. These are due to later opportunistic connections of pathways (actually already suggested by these authors). Earliest enzymatic functions are mostly catabolic; they were deaminations, transaminations, and decarboxylations. From the consensus tree we extracted four time spans for amino acid metabolism development. For some amino acids catabolism and biosynthesis occurred at the same time (Asp, Glu, Lys, Leu, Ala, Val, Ile, Pro, Arg). For others ultimate reactions that use amino acids as a substrate or as a product are distinct in time, with catabolism preceding anabolism for Asn, Gln, and Cys and anabolism preceding catabolism for Ser, Met, and Thr. Cladistic analysis of the structure of biochemical pathways makes hypotheses in biochemical evolution explicit and parsimonious.
Detection of nucleic acid sequences by invader-directed cleavage
Brow, Mary Ann D.; Hall, Jeff Steven Grotelueschen; Lyamichev, Victor; Olive, David Michael; Prudent, James Robert
1999-01-01
The present invention relates to means for the detection and characterization of nucleic acid sequences, as well as variations in nucleic acid sequences. The present invention also relates to methods for forming a nucleic acid cleavage structure on a target sequence and cleaving the nucleic acid cleavage structure in a site-specific manner. The 5' nuclease activity of a variety of enzymes is used to cleave the target-dependent cleavage structure, thereby indicating the presence of specific nucleic acid sequences or specific variations thereof. The present invention further relates to methods and devices for the separation of nucleic acid molecules based on charge.
Standardised neonatal parenteral nutrition formulations – an Australasian group consensus 2012
2014-01-01
Standardised parenteral nutrition formulations are routinely used in the neonatal intensive care units in Australia and New Zealand. In 2010, a multidisciplinary group was formed to achieve a consensus on the formulations acceptable to the majority of the neonatal intensive care units. A literature review was undertaken for each nutrient and recommendations were developed in a series of meetings held between November 2010 and April 2011. Three standard and 2 optional amino acid/dextrose formulations and one lipid emulsion were agreed by the majority of participants in the consensus. This has the potential to standardise neonatal parenteral nutrition guidelines and reduce costs and prescription errors. PMID:24548745
37 CFR 1.823 - Requirements for nucleotide and/or amino acid sequences as part of the application.
Code of Federal Regulations, 2011 CFR
2011-07-01
... and/or amino acid sequences as part of the application. 1.823 Section 1.823 Patents, Trademarks, and... Amino Acid Sequences § 1.823 Requirements for nucleotide and/or amino acid sequences as part of the... incorporation-by-reference of the Sequence Listing as required by § 1.52(e)(5). The presentation of the...
37 CFR 1.823 - Requirements for nucleotide and/or amino acid sequences as part of the application.
Code of Federal Regulations, 2013 CFR
2013-07-01
... and/or amino acid sequences as part of the application. 1.823 Section 1.823 Patents, Trademarks, and... Amino Acid Sequences § 1.823 Requirements for nucleotide and/or amino acid sequences as part of the... incorporation-by-reference of the Sequence Listing as required by § 1.52(e)(5). The presentation of the...
37 CFR 1.823 - Requirements for nucleotide and/or amino acid sequences as part of the application.
Code of Federal Regulations, 2012 CFR
2012-07-01
... and/or amino acid sequences as part of the application. 1.823 Section 1.823 Patents, Trademarks, and... Amino Acid Sequences § 1.823 Requirements for nucleotide and/or amino acid sequences as part of the... incorporation-by-reference of the Sequence Listing as required by § 1.52(e)(5). The presentation of the...
37 CFR 1.823 - Requirements for nucleotide and/or amino acid sequences as part of the application.
Code of Federal Regulations, 2010 CFR
2010-07-01
... and/or amino acid sequences as part of the application. 1.823 Section 1.823 Patents, Trademarks, and... Amino Acid Sequences § 1.823 Requirements for nucleotide and/or amino acid sequences as part of the... incorporation-by-reference of the Sequence Listing as required by § 1.52(e)(5). The presentation of the...
37 CFR 1.823 - Requirements for nucleotide and/or amino acid sequences as part of the application.
Code of Federal Regulations, 2014 CFR
2014-07-01
... and/or amino acid sequences as part of the application. 1.823 Section 1.823 Patents, Trademarks, and... Amino Acid Sequences § 1.823 Requirements for nucleotide and/or amino acid sequences as part of the... incorporation-by-reference of the Sequence Listing as required by § 1.52(e)(5). The presentation of the...
Reliable Detection of Herpes Simplex Virus Sequence Variation by High-Throughput Resequencing.
Morse, Alison M; Calabro, Kaitlyn R; Fear, Justin M; Bloom, David C; McIntyre, Lauren M
2017-08-16
High-throughput sequencing (HTS) has resulted in data for a number of herpes simplex virus (HSV) laboratory strains and clinical isolates. The knowledge of these sequences has been critical for investigating viral pathogenicity. However, the assembly of complete herpesviral genomes, including HSV, is complicated due to the existence of large repeat regions and arrays of smaller reiterated sequences that are commonly found in these genomes. In addition, the inherent genetic variation in populations of isolates for viruses and other microorganisms presents an additional challenge to many existing HTS sequence assembly pipelines. Here, we evaluate two approaches for the identification of genetic variants in HSV1 strains using Illumina short read sequencing data. The first, a reference-based approach, identifies variants from reads aligned to a reference sequence and the second, a de novo assembly approach, identifies variants from reads aligned to de novo assembled consensus sequences. Of critical importance for both approaches is the reduction in the number of low complexity regions through the construction of a non-redundant reference genome. We compared variants identified in the two methods. Our results indicate that approximately 85% of variants are identified regardless of the approach. The reference-based approach to variant discovery captures an additional 15% representing variants divergent from the HSV1 reference possibly due to viral passage. Reference-based approaches are significantly less labor-intensive and identify variants across the genome where de novo assembly-based approaches are limited to regions where contigs have been successfully assembled. In addition, regions of poor quality assembly can lead to false variant identification in de novo consensus sequences. For viruses with a well-assembled reference genome, a reference-based approach is recommended.
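The comparison of variants found by the two approaches can be sketched as a set operation. This is only an illustration: the abstract does not describe how the authors performed the comparison, and the function name `compare_variant_calls`, the `(position, ref_allele, alt_allele)` tuple representation, and the toy data are all hypothetical.

```python
def compare_variant_calls(reference_based, de_novo):
    """Summarise the overlap between two collections of variant calls.

    Each variant is a hashable tuple such as (position, ref_allele, alt_allele).
    Returns counts of shared and approach-specific variants, plus the shared
    fraction relative to the union of both call sets.
    """
    ref_set = set(reference_based)
    dn_set = set(de_novo)
    shared = ref_set & dn_set
    union = ref_set | dn_set
    return {
        "shared": len(shared),
        "reference_only": len(ref_set - dn_set),
        "de_novo_only": len(dn_set - ref_set),
        "shared_fraction": len(shared) / len(union) if union else 0.0,
    }


if __name__ == "__main__":
    # Toy calls, invented for illustration only.
    ref_calls = [(101, "A", "G"), (250, "C", "T"), (999, "G", "A"), (1500, "T", "C")]
    dn_calls = [(101, "A", "G"), (250, "C", "T"), (999, "G", "A")]
    print(compare_variant_calls(ref_calls, dn_calls))
```

Keying each variant on position and both alleles means that two calls at the same site with different alternate alleles are counted as distinct, which is one reasonable convention among several.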
Papandreou, Nikos C.; Iconomidou, Vassiliki A.; Willis, Judith H.; Hamodrakas, Stavros J.
2010-01-01
The physical properties of cuticle are determined by the structure of its two major components, cuticular proteins (CPs) and chitin, and, also, by their interactions. A common consensus region (extended R&R Consensus) found in the majority of cuticular proteins, the CPRs, binds to chitin. Previous work established that β-pleated sheet predominates in the Consensus region and we proposed that it is responsible for the formation of helicoidal cuticle. Remote sequence similarity between CPRs and a lipocalin, bovine plasma retinol binding protein (RBP), led us to suggest an antiparallel β-sheet half-barrel structure as the basic folding motif of the R&R Consensus. There are several other families of cuticular proteins. One of the best defined is CPF. Its four members in Anopheles gambiae are expressed during the early stages of either pharate pupal or pharate adult development, suggesting that the proteins contribute to the outer regions of the cuticle, the epi- and/or exocuticle. These proteins did not bind to chitin in the same assay used successfully for CPRs. Although CPFs are distinct in sequence from CPRs, the same lipocalin could also be used to derive homology models for one Anopheles gambiae and one Drosophila melanogaster CPF. For the CPFs, the basic folding motif predicted is an eight-stranded, antiparallel β-sheet, full-barrel structure. Possible implications of this structure are discussed and docking experiments were carried out with one possible Drosophila ligand, 7(Z), 11(Z)-heptacosadiene. PMID:20417215
Pyrin gene and mutants thereof, which cause familial Mediterranean fever
Kastner, Daniel L [Bethesda, MD; Aksentijevich, Ivona [Bethesda, MD; Centola, Michael [Tacoma Park, MD; Deng, Zuoming [Gaithersburg, MD; Sood, Ramen [Rockville, MD; Collins, Francis S [Rockville, MD; Blake, Trevor [Laytonsville, MD; Liu, P Paul [Ellicott City, MD; Fischel-Ghodsian, Nathan [Los Angeles, CA; Gumucio, Deborah L [Ann Arbor, MI; Richards, Robert I [North Adelaide, AU; Ricke, Darrell O [San Diego, CA; Doggett, Norman A [Santa Cruz, NM; Pras, Mordechai [Tel-Hashomer, IL
2003-09-30
The invention provides the nucleic acid sequence encoding the protein associated with familial Mediterranean fever (FMF). The cDNA sequence is designated as MEFV. The invention is also directed towards fragments of the DNA sequence, as well as the corresponding sequence for the RNA transcript and fragments thereof. Another aspect of the invention provides the amino acid sequence for a protein (pyrin) associated with FMF. The invention is directed towards the full-length amino acid sequence, fusion proteins containing the amino acid sequence, and fragments thereof. The invention is also directed towards mutants of the nucleic acid and amino acid sequences associated with FMF. In particular, the invention discloses three missense mutations, clustered within about 40 to 50 amino acids, in the highly conserved rfp (B30.2) domain at the C-terminal of the protein. These mutants include M680I, M694V, K695R, and V726A. Additionally, the invention includes methods for diagnosing a patient at risk for having FMF and kits therefor.
Ofran, Yanay; Schlessinger, Avner; Rost, Burkhard
2008-11-01
Exact identification of complementarity determining regions (CDRs) is crucial for understanding and manipulating antigenic interactions. One way to do this is by marking residues on the antibody that interact with B cell epitopes on the antigen. This, of course, requires identification of B cell epitopes, which could be done by marking residues on the antigen that bind to CDRs, thus requiring identification of CDRs. To circumvent this vicious circle, existing tools for identifying CDRs are based on sequence analysis or general biophysical principles. Often, these tools, which are based on partial data, fail to agree on the boundaries of the CDRs. Herein we present an automated procedure for identifying CDRs and B cell epitopes using consensus structural regions that interact with the antigens in all known antibody-protein complexes. Consequently, we provide the first comprehensive analysis of all CDR-epitope complexes of known three-dimensional structure. The CDRs we identify only partially overlap with the regions suggested by existing methods. We found that the general physicochemical properties of both CDRs and B cell epitopes are rather peculiar. In particular, only four amino acids account for most of the sequence of CDRs, and several types of amino acids almost never appear in them. The secondary structure content and the conservation of B cell epitopes are found to be different than previously thought. These characteristics of CDRs and epitopes may be instrumental in choosing which residues to mutate in experimental search for epitopes. They may also assist in computational design of antibodies and in predicting B cell epitopes.
Hammoudi, D; Moubareck, C Ayoub; Hakime, N; Houmani, M; Barakat, A; Najjar, Z; Suleiman, M; Fayad, N; Sarraf, R; Sarkis, D Karam
2015-07-01
The acquisition of carbapenemases by Acinetobacter baumannii is reported increasingly worldwide, but data from Lebanon are limited. The aims of this study were to evaluate the prevalence of imipenem-resistant A. baumannii in Lebanon, identify resistance determinants, and detect clonal relatedness. Imipenem-resistant A. baumannii were collected from nine Lebanese hospitals during 2012. Antimicrobial susceptibility, the cloxacillin effect, and ethylenediaminetetraacetic acid (EDTA) synergy were determined. Genes encoding carbapenemases and insertion sequence ISAba1 were screened via PCR sequencing. ISAba1 position relative to genes encoding Acinetobacter-derived cephalosporinases (ADCs) and OXA-23 was studied by PCR mapping. Clonal linkage was examined by enterobacterial repetitive intergenic consensus PCR (ERIC-PCR). Out of 724 A. baumannii isolated in 2012, 638 (88%) were imipenem-resistant. Of these, 142 were analyzed. Clavulanic acid-imipenem synergy suggested carbapenem-hydrolyzing extended-spectrum β-lactamase. A positive cloxacillin test indicated ADCs, while EDTA detection strips were negative. Genotyping indicated that 90% of isolates co-harbored blaOXA-23 and blaGES-11. The remaining strains had blaOXA-23, blaOXA-24, blaGES-11, or blaOXA-24 with blaGES-11. ISAba1 was located upstream of blaADC and blaOXA-23 in 97% and 100% of isolates, respectively. ERIC-PCR fingerprinting revealed 18 pulsotypes spread via horizontal gene transfer and clonal dissemination. This survey established baseline evidence of OXA-23 and GES-11-producing A. baumannii in Lebanon, indicating the need for further surveillance. Copyright © 2015 The Authors. Published by Elsevier Ltd. All rights reserved.
Barnes, Anna; Alonzi, Roberto; Blackledge, Matthew; Charles-Edwards, Geoff; Collins, David J; Cook, Gary; Coutts, Glynn; Goh, Vicky; Graves, Martin; Kelly, Charles; Koh, Dow-Mu; McCallum, Hazel; Miquel, Marc E; O'Connor, James; Padhani, Anwar; Pearson, Rachel; Priest, Andrew; Rockall, Andrea; Stirling, James; Taylor, Stuart; Tunariu, Nina; van der Meulen, Jan; Walls, Darren; Winfield, Jessica; Punwani, Shonit
2018-01-01
Applications of whole body diffusion-weighted MRI (WB-DWI) for oncology are rapidly increasing within both research and routine clinical domains. However, WB-DWI as a quantitative imaging biomarker (QIB) has seen significantly slower adoption. To date, challenges relating to accuracy and reproducibility, essential criteria for a good QIB, have limited widespread clinical translation. In recognition, a UK workgroup was established in 2016 to provide technical consensus guidelines (to maximise accuracy and reproducibility of WB-MRI QIBs) and accelerate the clinical translation of quantitative WB-DWI applications for oncology. A panel of experts convened from cancer centres around the UK with subspecialty expertise in quantitative imaging and/or the use of WB-MRI with DWI. A formal consensus method was used to obtain consensus agreement regarding best practice. Questions were asked about the appropriateness or otherwise of scanner hardware and software, sequence optimisation, acquisition protocols, reporting, and ongoing quality control programs to monitor precision and accuracy and agreement on quality control. The consensus panel was able to reach consensus on 73% (255/351) of items and, based on consensus areas, made recommendations to maximise accuracy and reproducibility of quantitative WB-DWI studies performed at 1.5T. The panel were unable to reach consensus on the majority of items related to quantitative WB-DWI performed at 3T. This UK Quantitative WB-DWI Technical Workgroup consensus provides guidance on maximising accuracy and reproducibility of quantitative WB-DWI for oncology. The consensus guidance can be used by researchers and clinicians to harmonise WB-DWI protocols which will accelerate clinical translation of WB-DWI-derived QIBs.
USDA-ARS's Scientific Manuscript database
One-hundred-thirty-six expressed sequence tags (ESTs) encoding alpha gliadins from Triticum aestivum cv Butte 86 were identified in public databases and assembled into 19 contigs. Consensus sequences for 12 of the contigs encoded complete alpha gliadin proteins, but only two were identical to protei...
Simone, Domenico; Bay, Denice C.; Leach, Thorin; Turner, Raymond J.
2013-01-01
Background The twin-arginine translocation (Tat) protein export system enables the transport of fully folded proteins across a membrane. This system is composed of two integral membrane proteins belonging to TatA and TatC protein families and in some systems a third component, TatB, a homolog of TatA. TatC participates in substrate protein recognition through its interaction with a twin arginine leader peptide sequence. Methodology/Principal Findings The aim of this study was to explore TatC diversity, evolution and sequence conservation in bacteria to identify how TatC is evolving and diversifying in various bacterial phyla. Surveying bacterial genomes revealed that 77% of all species possess one or more tatC loci and half of these classes possessed only tatC and tatA genes. Phylogenetic analysis of diverse TatC homologues showed that they were primarily inherited but identified a small subset of taxonomically unrelated bacteria that exhibited evidence supporting lateral gene transfer within an ecological niche. Examination of bacilli tatCd/tatCy isoform operons identified a number of known and potentially new Tat substrate genes based on their frequent association to tatC loci. Evolutionary analysis of these Bacilli isoforms determined that TatCy was the progenitor of TatCd. A bacterial TatC consensus sequence was determined and highlighted conserved and variable regions within a three dimensional model of the Escherichia coli TatC protein. Comparative analysis between the TatC consensus sequence and Bacilli TatCd/y isoform consensus sequences revealed unique sites that may contribute to isoform substrate specificity or make TatA specific contacts. Synonymous to non-synonymous nucleotide substitution analyses of bacterial tatC homologues determined that tatC sequence variation differs dramatically between various classes and suggests TatC specialization in these species. 
Conclusions/Significance TatC proteins appear to be diversifying within particular bacterial classes and its specialization may be driven by the substrates it transports and the environment of its host. PMID:24236045
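Deriving a consensus sequence from aligned homologues, and flagging conserved versus variable positions, can be sketched with a per-column majority rule. The abstract does not say which method or software the authors used, so the function `majority_consensus`, the `min_fraction` cutoff, the `"X"` placeholder for variable positions, and the toy alignment are all assumptions for illustration.

```python
from collections import Counter


def majority_consensus(aligned, min_fraction=0.5, ambiguous="X"):
    """Build a majority-rule consensus from equal-length aligned sequences.

    For each alignment column the most frequent symbol is used if its
    frequency exceeds `min_fraction`; otherwise `ambiguous` is emitted.
    Gap characters are counted like any other symbol.
    Returns the consensus string and the per-column top-symbol frequency.
    """
    if not aligned:
        raise ValueError("at least one sequence is required")
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("all aligned sequences must have the same length")

    consensus = []
    conservation = []
    for column in zip(*aligned):
        symbol, count = Counter(column).most_common(1)[0]
        fraction = count / len(column)
        consensus.append(symbol if fraction > min_fraction else ambiguous)
        conservation.append(fraction)
    return "".join(consensus), conservation


if __name__ == "__main__":
    # Toy alignment, invented for illustration only.
    seq, scores = majority_consensus(["MKLV", "MKIV", "MRLV"])
    print(seq, scores)
```

The per-column frequencies give a crude conservation profile: columns near 1.0 are conserved, and lower values mark variable sites of the kind the abstract contrasts between the general TatC consensus and the isoform consensus sequences.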
Federal Register 2010, 2011, 2012, 2013, 2014
2012-10-29
... DEPARTMENT OF COMMERCE Patent and Trademark Office Requirements for Patent Applications Containing Nucleotide Sequence and/or Amino Acid Sequence Disclosures ACTION: Proposed collection; comment request... Patent applications that contain nucleotide and/or amino acid sequence disclosures must include a copy of...
Prudent, James R.; Hall, Jeff G.; Lyamichev, Victor I.; Brow, Mary Ann D.; Dahlberg, James E.
2007-12-11
The present invention relates to means for the detection and characterization of nucleic acid sequences, as well as variations in nucleic acid sequences. The present invention also relates to methods for forming a nucleic acid cleavage structure on a target sequence and cleaving the nucleic acid cleavage structure in a site-specific manner. The structure-specific nuclease activity of a variety of enzymes is used to cleave the target-dependent cleavage structure, thereby indicating the presence of specific nucleic acid sequences or specific variations thereof.
Invasive cleavage of nucleic acids
Prudent, James R.; Hall, Jeff G.; Lyamichev, Victor I.; Brow, Mary Ann D.; Dahlberg, James E.
1999-01-01
The present invention relates to means for the detection and characterization of nucleic acid sequences, as well as variations in nucleic acid sequences. The present invention also relates to methods for forming a nucleic acid cleavage structure on a target sequence and cleaving the nucleic acid cleavage structure in a site-specific manner. The structure-specific nuclease activity of a variety of enzymes is used to cleave the target-dependent cleavage structure, thereby indicating the presence of specific nucleic acid sequences or specific variations thereof.
Invasive cleavage of nucleic acids
Prudent, James R.; Hall, Jeff G.; Lyamichev, Victor I.; Brow, Mary Ann D.; Dahlberg, James E.
2002-01-01
The present invention relates to means for the detection and characterization of nucleic acid sequences, as well as variations in nucleic acid sequences. The present invention also relates to methods for forming a nucleic acid cleavage structure on a target sequence and cleaving the nucleic acid cleavage structure in a site-specific manner. The structure-specific nuclease activity of a variety of enzymes is used to cleave the target-dependent cleavage structure, thereby indicating the presence of specific nucleic acid sequences or specific variations thereof.
Prudent, James R.; Hall, Jeff G.; Lyamichev, Victor I.; Brow, Mary Ann D.; Dahlberg, James E.
2010-11-09
The present invention relates to means for the detection and characterization of nucleic acid sequences, as well as variations in nucleic acid sequences. The present invention also relates to methods for forming a nucleic acid cleavage structure on a target sequence and cleaving the nucleic acid cleavage structure in a site-specific manner. The structure-specific nuclease activity of a variety of enzymes is used to cleave the target-dependent cleavage structure, thereby indicating the presence of specific nucleic acid sequences or specific variations thereof.
Prudent, James R.; Hall, Jeff G.; Lyamichev, Victor I.; Brow, Mary Ann D.; Dahlberg, James E.
2000-01-01
The present invention relates to means for the detection and characterization of nucleic acid sequences, as well as variations in nucleic acid sequences. The present invention also relates to methods for forming a nucleic acid cleavage structure on a target sequence and cleaving the nucleic acid cleavage structure in a site-specific manner. The structure-specific nuclease activity of a variety of enzymes is used to cleave the target-dependent cleavage structure, thereby indicating the presence of specific nucleic acid sequences or specific variations thereof.
Prudent, James R.; Hall, Jeff G.; Lyamichev, Victor I.; Brow, Mary Ann; Dahlberg, James E.
2005-04-05
The present invention relates to means for the detection and characterization of nucleic acid sequences, as well as variations in nucleic acid sequences. The present invention also relates to methods for forming a nucleic acid cleavage structure on a target sequence and cleaving the nucleic acid cleavage structure in a site-specific manner. The structure-specific nuclease activity of a variety of enzymes is used to cleave the target-dependent cleavage structure, thereby indicating the presence of specific nucleic acid sequences or specific variations thereof.
Jones, Frank R.; Gabitzsch, Elizabeth S.; Xu, Younong; Balint, Joseph P.; Borisevich, Viktoriya; Smith, Jennifer; Smith, Jeanon; Peng, Bi-Hung; Walker, Aida; Salazar, Magda; Paessler, Slobodan
2013-01-01
Vaccines against emerging pathogens such as the 2009 H1N1 pandemic virus can benefit from current technologies such as rapid genomic sequencing to construct the most biologically relevant vaccine. A novel platform (Ad5 [E1-, E2b-]) has been utilized to induce immune responses to various antigenic targets. We employed this vector platform to express hemagglutinin (HA) and neuraminidase (NA) genes from 2009 H1N1 pandemic viruses. Inserts were consensus sequences designed from viral isolate sequences and the vaccine was rapidly constructed and produced. Vaccination induced H1N1 immune responses in mice, which afforded protection from lethal virus challenge. In ferrets, vaccination protected from disease development and significantly reduced viral titers in nasal washes. H1N1 cell-mediated immunity as well as antibody induction correlated with the prevention of disease symptoms and reduction of virus replication. The Ad5 [E1-, E2b-] platform should be evaluated for the rapid development of effective vaccines against infectious diseases. PMID:21821082
Pessôa, Rodrigo; Watanabe, Jaqueline Tomoko; Nukui, Youko; Pereira, Juliana; Casseb, Jorge; Kasseb, Jorge; de Oliveira, Augusto César Penalva; Segurado, Aluisio Cotrim; Sanabani, Sabri Saeed
2014-01-01
Here, we report on the partial and full-length genomic (FLG) variability of HTLV-1 sequences from 90 well-characterized subjects, including 48 HTLV-1 asymptomatic carriers (ACs), 35 HTLV-1-associated myelopathy/tropical spastic paraparesis (HAM/TSP) and 7 adult T-cell leukemia/lymphoma (ATLL) patients, using an Illumina paired-end protocol. Blood samples were collected from 90 individuals, and DNA was extracted from the PBMCs to measure the proviral load and to amplify the HTLV-1 FLG from two overlapping fragments. The amplified PCR products were subjected to deep sequencing. The sequencing data were assembled, aligned, and mapped against the HTLV-1 genome with sufficient genetic resemblance and utilized for further phylogenetic analysis. A high-throughput sequencing-by-synthesis instrument was used to obtain an average of 3210- and 5200-fold coverage of the partial (n = 14) and FLG (n = 76) data from the HTLV-1 strains, respectively. The results based on the phylogenetic trees of consensus sequences from partial and FLGs revealed that 86 (95.5%) individuals were infected with the transcontinental sub-subtypes of the cosmopolitan subtype (aA) and that 4 individuals (4.5%) were infected with the Japanese sub-subtypes (aB). A comparison of the nucleotide and amino acids of the FLG between the three clinical settings yielded no correlation between the sequenced genotype and clinical outcomes. The evolutionary relationships among the HTLV sequences were inferred from nucleotide sequence, and the results are consistent with the hypothesis that there were multiple introductions of the transcontinental subtype in Brazil. This study has increased the number of subtype aA full-length genomes from 8 to 81 and HTLV-1 aB from 2 to 5 sequences. The overall data confirmed that the cosmopolitan transcontinental sub-subtypes were the most prevalent in the Brazilian population. 
It is hoped that this valuable genomic data will add to our current understanding of the evolutionary history of this medically important virus.
Tasaki, E; Hirayama, J; Tazumi, A; Hayashi, K; Hara, Y; Ueno, H; Moore, J E; Millar, B C; Matsuda, M
2012-02-01
A novel clustered regularly interspaced short palindromic repeats (CRISPRs) locus [7,500 base pairs (bp) in length] was found in the urease-positive thermophilic Campylobacter (UPTC) Japanese isolate CF89-12. The 7,500 bp locus consisted of the 5'-methylaminomethyl-2-thiouridylate methyltransferase gene, putative (P) CRISPR associated (p-Cas), putative open reading frames, Cas1 and Cas2, a leader sequence region (146 bp), 12 CRISPRs consensus sequence repeats (each 36 bp) separated by non-repetitive unique spacer regions of similar length (26-31 bp), and the phosphatidyl glycerophosphatase A gene. When the CRISPRs loci in the UPTC CF89-12 and five C. jejuni isolates were compared with one another, all six isolates contained p-Cas, Cas1 and Cas2 within the loci. Four to 12 CRISPRs consensus sequence repeats separated by non-repetitive unique spacer regions occurred in the six isolates, and the nucleotide sequences of those repeats showed approximately 92-100% similarity with each other. However, no sequence similarity occurred in the unique spacer regions among these isolates. The putative σ(70) transcriptional promoter and the hypothetical ρ-independent terminator structures for the CRISPRs and Cas were detected. No in vivo transcription of p-Cas, Cas1 and Cas2 was confirmed in the UPTC cells.
NASA Astrophysics Data System (ADS)
El-Assaad, Atlal; Dawy, Zaher; Nemer, Georges; Kobeissy, Firas
2017-01-01
The crucial biological role of proteases has become visible with the development of the degradomics discipline, which is concerned with determining the protease/substrate pairs that give rise to breakdown products (BDPs) that can be utilized as putative biomarkers associated with different biological-clinical significance. In the field of cancer biology, matrix metalloproteinases (MMPs) have been shown to generate protein BDPs that are indicative of malignant growth in cancer, while in the field of neural injury, calpain-2 and caspase-3 proteases generate BDP fragments that are indicative of different neural cell death mechanisms in different injury scenarios. Advanced proteomic techniques have shown remarkable progress in identifying these BDPs experimentally. In this work, we present a bioinformatics-based prediction method that identifies protease-associated BDPs with high precision and efficiency. The method utilizes state-of-the-art sequence matching and alignment algorithms. It starts by locating consensus sequence occurrences and their variants in any set of protein substrates, generating all fragments resulting from cleavage. The method has O(mn) space complexity and O(Nmn) time complexity, where N, m, and n are the number of protein sequences, the length of the consensus sequence, and the length of each protein sequence, respectively. Finally, the proposed methodology is validated against βII-spectrin protein, a validated brain injury biomarker.
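As a rough illustration of the fragment-generation step described in this abstract, the Python sketch below scans each substrate for a consensus pattern and cuts at every match. The pattern syntax (X as a wildcard), the cut offset, the function names, and the toy sequence are all assumptions made here for illustration; none of them is taken from the published method. The naive scan does O(mn) work per sequence, which is in line with the O(Nmn) time bound quoted above.

```python
def matches_at(seq, pattern, i):
    """Return True if pattern matches seq at position i ('X' matches any residue)."""
    return all(p == "X" or p == seq[i + k] for k, p in enumerate(pattern))


def find_sites(seq, pattern):
    """Naive O(m*n) scan for every start position of the consensus pattern."""
    m = len(pattern)
    return [i for i in range(len(seq) - m + 1) if matches_at(seq, pattern, i)]


def cleave(seq, pattern, cut_offset):
    """Cut seq `cut_offset` residues into every match and return the fragments."""
    cuts = sorted({i + cut_offset for i in find_sites(seq, pattern)})
    bounds = [0] + [c for c in cuts if 0 < c < len(seq)] + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]


def predict_fragments(substrates, pattern, cut_offset):
    """Apply the same consensus pattern to every substrate (N sequences)."""
    return {name: cleave(seq, pattern, cut_offset) for name, seq in substrates.items()}


# Hypothetical toy substrate and pattern, for illustration only.
substrates = {"toy_substrate": "MKDEVDGAALDEADSTT"}
fragments = predict_fragments(substrates, "DEXD", cut_offset=4)
print(fragments)
```

Handling the "variants" mentioned in the abstract would require approximate matching, which this sketch deliberately leaves out.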
Method for nucleic acid hybridization using single-stranded DNA binding protein
Tabor, Stanley; Richardson, Charles C.
1996-01-01
Method of nucleic acid hybridization for detecting the presence of a specific nucleic acid sequence in a population of different nucleic acid sequences using a nucleic acid probe. The nucleic acid probe hybridizes with the specific nucleic acid sequence but not with other nucleic acid sequences in the population. The method includes contacting a sample (potentially including the nucleic acid sequence) with the nucleic acid probe under hybridizing conditions in the presence of a single-stranded DNA binding protein provided in an amount which stimulates renaturation of a dilute solution (i.e., one in which the t1/2 of renaturation is longer than 3 weeks) of single-stranded DNA greater than 500 fold (i.e., to a t1/2 less than 60 min, preferably less than 5 min, and most preferably about 1 min) in the absence of nucleotide triphosphates.
Sequence quality analysis tool for HIV type 1 protease and reverse transcriptase.
Delong, Allison K; Wu, Mingham; Bennett, Diane; Parkin, Neil; Wu, Zhijin; Hogan, Joseph W; Kantor, Rami
2012-08-01
Access to antiretroviral therapy is increasing globally and drug resistance evolution is anticipated. Currently, protease (PR) and reverse transcriptase (RT) sequence generation is increasing, including the use of in-house sequencing assays, and quality assessment prior to sequence analysis is essential. We created a computational HIV PR/RT Sequence Quality Analysis Tool (SQUAT) that runs in the R statistical environment. Sequence quality thresholds are calculated from a large dataset (46,802 PR and 44,432 RT sequences) from the published literature ( http://hivdb.Stanford.edu ). Nucleic acid sequences are read into SQUAT, identified, aligned, and translated. Nucleic acid sequences are flagged if they contain >five 1-2-base insertions; >one 3-base insertion; >one deletion; >six PR or >18 RT ambiguous bases; >three consecutive PR or >four RT nucleic acid mutations; >zero stop codons; >three PR or >six RT ambiguous amino acids; >three consecutive PR or >four RT amino acid mutations; >zero unique amino acids; or <0.5% or >15% genetic distance from another submitted sequence. Thresholds are user modifiable. SQUAT output includes a summary report with detailed comments for troubleshooting of flagged sequences, histograms of pairwise genetic distances, neighbor joining phylogenetic trees, and aligned nucleic and amino acid sequences. SQUAT is a stand-alone, free, web-independent tool to ensure use of high-quality HIV PR/RT sequences in interpretation and reporting of drug resistance, while increasing awareness and expertise and facilitating troubleshooting of potentially problematic sequences.
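To make the threshold-based flagging concrete, here is a minimal Python sketch of two of the checks listed above: the ambiguous-base limits (>six for PR, >18 for RT) and the stop-codon rule (>zero). SQUAT itself runs in R; the function names and the assumption that the input is ungapped and already in reading frame are hypothetical choices for this example, not SQUAT's actual interface.

```python
# Thresholds taken from the abstract: flag when the count EXCEEDS these values.
AMBIGUOUS_LIMIT = {"PR": 6, "RT": 18}
STOP_CODONS = {"TAA", "TAG", "TGA"}


def count_ambiguous(seq):
    """Count characters that are not an unambiguous A, C, G, or T."""
    return sum(1 for base in seq.upper() if base not in "ACGT")


def count_stop_codons(seq):
    """Count stop codons reading in frame from the first base (frame is assumed)."""
    seq = seq.upper()
    return sum(1 for i in range(0, len(seq) - 2, 3) if seq[i:i + 3] in STOP_CODONS)


def flag_sequence(seq, region):
    """Return a list of human-readable flags for one PR or RT nucleotide sequence."""
    flags = []
    n_ambiguous = count_ambiguous(seq)
    if n_ambiguous > AMBIGUOUS_LIMIT[region]:
        flags.append(f"{n_ambiguous} ambiguous bases")
    n_stops = count_stop_codons(seq)
    if n_stops > 0:
        flags.append(f"{n_stops} stop codon(s)")
    return flags


print(flag_sequence("CCTTAGATC", "PR"))
```

Keeping the limits in a dictionary mirrors the abstract's statement that thresholds are user modifiable.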
Singh, Vinod Kumar; Krishnamachari, Annangarachari
2016-09-01
Genome-wide experimental studies in Saccharomyces cerevisiae reveal that an autonomous replicating sequence (ARS) requires an essential consensus sequence (ACS) for replication activity. Computational studies identified thousands of ACS-like patterns in the genome. However, only a few hundred of these sites act as replicating sites and the rest are considered dormant or evolving sites. In a bid to understand the sequence makeup of replication sites, a content and context-based analysis was performed on a set of replicating ACS sequences that bind to the origin-recognition complex (ORC), denoted as ORC-ACS, and non-replicating ACS sequences (nrACS) that are not bound by ORC. In this study, DNA properties such as base composition, correlation, sequence dependent thermodynamic and DNA structural profiles, and their positions have been considered for characterizing ORC-ACS and nrACS. Analysis reveals that ORC-ACS show marked differences in nucleotide composition and context features in their vicinity compared to nrACS. Interestingly, an A-rich motif was also discovered in ORC-ACS sequences within the nucleosome-free region. Profound changes in the conformational features, such as DNA helical twist, inclination angle and stacking energy, between ORC-ACS and nrACS were observed. The distribution of ACS motifs in the non-coding segments shows that ORC-ACS are located farther from the adjacent gene start position than nrACS, thereby enabling an accessible environment for ORC proteins. Our attempt is novel in considering the contextual view of ACS and its flanking region along with nucleosome positioning in the S. cerevisiae genome and may be useful for any computational prediction scheme.
Consensus pan-genome assembly of the specialised wine bacterium Oenococcus oeni.
Sternes, Peter R; Borneman, Anthony R
2016-04-27
Oenococcus oeni is a lactic acid bacterium that is specialised for growth in the ecological niche of wine, where it is noted for its ability to perform the secondary, malolactic fermentation that is often required for many types of wine. Expanding the understanding of strain-dependent genetic variations in its small and streamlined genome is important for realising its full potential in industrial fermentation processes. Whole genome comparison was performed on 191 strains of O. oeni; from this rich source of genomic information consensus pan-genome assemblies of the invariant (core) and variable (flexible) regions of this organism were established. Genetic variation in amino acid biosynthesis and sugar transport and utilisation was found to be common between strains. Furthermore, we characterised previously-unreported intra-specific genetic variations in the natural competence of this microbe. By assembling a consensus pan-genome from a large number of strains, this study provides a tool for researchers to readily compare protein-coding genes across strains and infer functional relationships between genes in conserved syntenic regions. This establishes a foundation for further genetic, and thus phenotypic, research of this industrially-important species.
2010-01-01
Background Epimedium sagittatum (Sieb. Et Zucc.) Maxim, a traditional Chinese medicinal plant species, has been used extensively as genuine medicinal materials. Certain Epimedium species are endangered due to commercial overexploitation, while sustainable application studies, conservation genetics, systematics, and marker-assisted selection (MAS) of Epimedium are less studied due to the lack of molecular markers. Here, we report a set of expressed sequence tags (ESTs) and simple sequence repeats (SSRs) identified in these ESTs for E. sagittatum. Results cDNAs of E. sagittatum are sequenced using 454 GS-FLX pyrosequencing technology. The raw reads are cleaned and assembled into a total of 76,459 consensus sequences comprising 17,231 contigs and 59,228 singlets. About 38.5% (29,466) of the consensus sequences significantly match to the non-redundant protein database (E-value < 1e-10), 22,295 of which are further annotated using Gene Ontology (GO) terms. A total of 2,810 EST-SSRs is identified from the Epimedium EST dataset. Trinucleotide SSR is the dominant repeat type (55.2%) followed by dinucleotide (30.4%), tetranucleotide (7.3%), hexanucleotide (4.9%), and pentanucleotide (2.2%) SSR. The dominant repeat motif is AAG/CTT (23.6%) followed by AG/CT (19.3%), ACC/GGT (11.1%), AT/AT (7.5%), and AAC/GTT (5.9%). Thirty-two SSR-ESTs are randomly selected and primer pairs are synthesized for testing the transferability across 52 Epimedium species. Eighteen primer pairs (85.7%) could be successfully transferred to Epimedium species and sixteen of those show high genetic diversity with 0.35 of observed heterozygosity (Ho) and 0.65 of expected heterozygosity (He) and high number of alleles per locus (11.9). Conclusion A large EST dataset with a total of 76,459 consensus sequences is generated, aiming to provide sequence information for deciphering secondary metabolism, especially for flavonoid pathway in Epimedium.
A total of 2,810 EST-SSRs is identified from EST dataset and ~1580 EST-SSR markers are transferable. E. sagittatum EST-SSR transferability to the major Epimedium germplasm is up to 85.7%. Therefore, this EST dataset and EST-SSRs will be a powerful resource for further studies such as taxonomy, molecular breeding, genetics, genomics, and secondary metabolism in Epimedium species. PMID:20141623
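The SSR identification step summarized above can be sketched with a regular-expression scan for perfect tandem repeats of di- to hexanucleotide units. The minimum repeat counts below are illustrative assumptions (the abstract does not state the criteria used), and the function names and test sequence are hypothetical; this is not the software used in the study.

```python
import re

# Minimum repeat counts per motif length: illustrative assumptions only.
MIN_REPEATS = {2: 6, 3: 5, 4: 4, 5: 4, 6: 4}


def is_primitive(unit):
    """True if the unit is not itself a repetition of a shorter unit (e.g. 'ATAT')."""
    return not any(
        len(unit) % k == 0 and unit == unit[:k] * (len(unit) // k)
        for k in range(1, len(unit))
    )


def find_ssrs(seq):
    """Return (start, unit, repeat_count) for each perfect SSR found in seq."""
    seq = seq.upper()
    hits = []
    for size, min_rep in MIN_REPEATS.items():
        # One captured unit followed by at least (min_rep - 1) further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (size, min_rep - 1))
        for match in pattern.finditer(seq):
            unit = match.group(1)
            if is_primitive(unit):
                hits.append((match.start(), unit, len(match.group(0)) // size))
    return sorted(hits)


# Hypothetical sequence containing one AAG repeat and one AG repeat.
example = "GGC" + "AAG" * 6 + "TTCA" + "AG" * 7 + "C"
print(find_ssrs(example))
```

Motif classes such as AAG/CTT in the abstract pool a unit with its reverse complement; grouping hits that way would be a natural next step but is not shown here.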
Single molecule sequencing of the M13 virus genome without amplification
Zhao, Luyang; Deng, Liwei; Li, Gailing; Jin, Huan; Cai, Jinsen; Shang, Huan; Li, Yan; Wu, Haomin; Xu, Weibin; Zeng, Lidong; Zhang, Renli; Zhao, Huan; Wu, Ping; Zhou, Zhiliang; Zheng, Jiao; Ezanno, Pierre; Yang, Andrew X.; Yan, Qin; Deem, Michael W.; He, Jiankui
2017-01-01
Next generation sequencing (NGS) has revolutionized life sciences research. However, GC bias and costly, time-intensive library preparation make NGS an ill fit for increasing sequencing demands in the clinic. A new class of third-generation sequencing platforms has arrived to meet this need, capable of directly measuring DNA and RNA sequences at the single-molecule level without amplification. Here, we use the new GenoCare single-molecule sequencing platform from Direct Genomics to sequence the genome of the M13 virus. Our platform detects single-molecule fluorescence by total internal reflection microscopy, with sequencing-by-synthesis chemistry. We sequenced the genome of M13 to a depth of 316x, with 100% coverage. We determined a consensus sequence accuracy of 100%. In contrast to GC bias inherent to NGS results, we demonstrated that our single-molecule sequencing method yields minimal GC bias. PMID:29253901
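The depth, coverage, and consensus-accuracy figures reported above are all derived from a pileup of reads against the reference. The Python sketch below shows one simple way such numbers can be computed, assuming reads have already been aligned without gaps and are given as (start, sequence) pairs. The majority-vote consensus, the function names, and the toy data are assumptions for illustration and do not describe the authors' pipeline.

```python
from collections import Counter


def pileup(reference_length, aligned_reads):
    """Collect base counts per reference position from (start, sequence) reads."""
    columns = [Counter() for _ in range(reference_length)]
    for start, read in aligned_reads:
        for offset, base in enumerate(read):
            pos = start + offset
            if 0 <= pos < reference_length:
                columns[pos][base] += 1
    return columns


def summarize(reference, aligned_reads):
    """Majority-vote consensus plus mean depth, coverage, and consensus accuracy."""
    columns = pileup(len(reference), aligned_reads)
    depths = [sum(col.values()) for col in columns]
    consensus = "".join(col.most_common(1)[0][0] if col else "N" for col in columns)
    covered = [i for i, depth in enumerate(depths) if depth > 0]
    matches = sum(1 for i in covered if consensus[i] == reference[i])
    return {
        "consensus": consensus,
        "mean_depth": sum(depths) / len(reference),
        "coverage": len(covered) / len(reference),
        "accuracy": matches / len(covered) if covered else 0.0,
    }


# Toy reference and reads; the last read carries one error at position 2.
reference = "ACGTAC"
reads = [(0, "ACGT"), (1, "CGTA"), (2, "GTAC"), (0, "ACTT")]
summary = summarize(reference, reads)
print(summary)
```

In the toy data a single erroneous base is outvoted by three correct ones, which is the basic reason consensus accuracy can exceed per-read accuracy at high depth.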
Qiu, Jing; Kleineidam, Anna; Gouraud, Sabine; Yao, Song Tieng; Greenwood, Mingkwan; Hoe, See Ziau; Hindmarch, Charles
2014-01-01
The supraoptic nucleus (SON) of the hypothalamus is responsible for maintaining osmotic stability in mammals through its elaboration of the antidiuretic hormone arginine vasopressin. Upon dehydration, the SON undergoes a function-related plasticity, which includes remodeling of morphology, electrical properties, and biosynthetic activity. This process occurs alongside alterations in steady state transcript levels, which might be mediated by changes in the activity of transcription factors. In order to identify which transcription factors might be involved in changing patterns of gene expression, an Affymetrix protein-DNA array analysis was carried out. Nuclear extracts of SON from dehydrated and control male rats were analyzed for binding to the 345 consensus DNA transcription factor binding sequences of the array. Statistical analysis revealed significant changes in binding to 26 consensus elements, of which EMSA confirmed increased binding to signal transducer and activator of transcription (Stat) 1/Stat3, cellular Myelocytomatosis virus-like cellular proto-oncogene (c-Myc)-Myc-associated factor X (Max), and pre-B cell leukemia transcription factor 1 sequences after dehydration. Focusing on c-Myc and Max, we used quantitative PCR to confirm previous transcriptomic analysis that had suggested an increase in c-Myc, but not Max, mRNA levels in the SON after dehydration, and we demonstrated c-Myc- and Max-like immunoreactivities in SON arginine vasopressin-expressing cells. Finally, by comparing new data obtained from Roche-NimbleGen chromatin immunoprecipitation arrays with previously published transcriptomic data, we have identified putative c-Myc target genes whose expression changes in the SON after dehydration. These include known c-Myc targets, such as the Slc7a5 gene, which encodes the L-type amino acid transporter 1, ribosomal protein L24, histone deacetylase 2, and the Rat sarcoma proto-oncogene (Ras)-related nuclear GTPase. PMID:25144923
Martínez, José M.; Kok, Jan; Sanders, Jan W.; Hernández, Pablo E.
2000-01-01
Antibodies against enterocin A were obtained by immunization of rabbits with synthetic peptides PH4 and PH5 designed, respectively, on the N- and C-terminal amino acid sequences of enterocin A and conjugated to the carrier protein KLH. Anti-PH4-KLH antibodies not only recognized enterocin A but also pediocin PA-1, enterocin P, and sakacin A, three bacteriocins which share the N-terminal class IIa consensus motif (YGNGVXC) that is contained in the sequence of the peptide PH4. In contrast, anti-PH5-KLH antibodies only reacted with enterocin A because the amino acid sequences of the C-terminal parts of class IIa bacteriocins are highly variable. Enterocin A and/or pediocin PA-1 structural and immunity genes were introduced in Lactococcus lactis IL1403 to achieve (co)production of the bacteriocins. The level of production of the two bacteriocins was significantly lower than that obtained by the wild-type producers, a fact that suggests a low efficiency of transport and/or maturation of these bacteriocins by the chromosomally encoded bacteriocin translocation machinery of IL1403. Despite the low production levels, both bacteriocins could be specifically detected and quantified with the anti-PH5-KLH (anti-enterocin A) antibodies isolated in this study and the anti-PH2-KLH (anti-pediocin PA-1) antibodies previously generated (J. M. Martínez, M. I. Martínez, A. M. Suárez, C. Herranz, P. Casaus, L. M. Cintas, J. M. Rodríguez, and P. E. Hernández, Appl. Environ. Microbiol. 64:4536–4545, 1998). In this work, the availability of antibodies for the specific detection and quantification of enterocin A and pediocin PA-1 was crucial to demonstrate coproduction of both bacteriocins by L. lactis IL1403(pJM04), because indicator strains that are selectively inhibited by each bacteriocin are not available. PMID:10919819
Salton, S R; Fischberg, D J; Dong, K W
1991-05-01
Nerve growth factor (NGF) plays a critical role in the development and survival of neurons in the peripheral nervous system. Following treatment with NGF but not epidermal growth factor, rat pheochromocytoma (PC12) cells undergo neural differentiation. We have cloned a nervous system-specific mRNA, NGF33.1, that is rapidly and relatively selectively induced by treatment of PC12 cells with NGF and basic fibroblast growth factor in comparison with epidermal growth factor. Analysis of the nucleic acid and predicted amino acid sequences of the NGF33.1 cDNA clone suggested that this clone corresponded to the NGF-inducible mRNA called VGF (A. Levi, J. D. Eldridge, and B. M. Paterson, Science 229:393-395, 1985; R. Possenti, J. D. Eldridge, B. M. Paterson, A. Grasso, and A. Levi, EMBO J. 8:2217-2223, 1989). We have used the NGF33.1 cDNA clone to isolate and characterize the VGF gene, and in this paper we report the complete sequence of the VGF gene, including 853 bases of 5' flanking sequence, which revealed TATAA and CCAAT elements, several GC boxes, and a consensus cyclic AMP response element-binding protein binding site. The VGF promoter contains sequences homologous to other NGF-inducible, neuronal promoters. We further show that VGF mRNA is induced in PC12 cells to a greater extent by depolarization and by phorbol-12-myristate-13-acetate treatment than by 8-bromo-cyclic AMP treatment. By Northern (RNA) and RNase protection analysis, VGF mRNA is detectable in embryonic and postnatal central and peripheral nervous tissues but not in a number of nonneural tissues. In the cascade of events which ultimately leads to the neural differentiation of NGF-treated PC12 cells, the VGF gene encodes the most rapidly and selectively regulated, nervous-system specific mRNA yet identified.
Proteome-wide inference of human endophilin 1-binding peptides.
Wu, Gang; Zhang, Zeng-Li; Fu, Chun-Jiang; Lv, Feng-Lin; Tian, Fei-Fei
2012-10-01
Human endophilin 1 (hEndo1) is a multifunctional protein that was found to bind a wide spectrum of proline-rich endocytic proteins through its Src homology 3 (SH3) domain. In order to elucidate the unknown biological functions of hEndo1, it is essential to find out the cytoplasmic components that hEndo1 recognizes and binds. However, it is too time-consuming and expensive to synthesize all peptide candidates found in the human proteome and to perform hEndo1 SH3-peptide affinity assays to identify the hEndo1-binding partners. In the present work, we describe a structure/sequence-hybrid approach to perform proteome-wide inference of hEndo1-binding peptides using the information gained from both the primary sequence of affinity-known peptides and the interaction profile involved in hEndo1 SH3-peptide complex three-dimensional structures. Modeling results show that (i) different residue positions contribute distinctly to peptide affinity and specificity; P-1, P2 and P4 are most important, P1 and P3 are also effective, and P-3, P-2, P0, P5 and P6 are relatively insignificant, (ii) the consensus core PXXP motif is necessary but not sufficient for determining high affinity of peptides, and some other positions must also be essential in hEndo1 SH3-peptide binding, and (iii) the alternating arrangement of polar and nonpolar amino acids along the peptide sequence is critical for the high specificity of peptide recognition by the hEndo1 SH3 domain. In addition, we also find that the residue type at a specific position of hEndo1-binding peptides is not stringently invariable; amino acids that possess similar polarity could replace each other without substantial influence on peptide affinity. In this way, hEndo1 presents a broad specificity in the peptide ligands that it binds.
Cotmore, S F; Christensen, J; Nüesch, J P; Tattersall, P
1995-01-01
A DNA fragment containing the minute virus of mice 3' replication origin was specifically coprecipitated in immune complexes containing the virally coded NS1, but not the NS2, polypeptide. Antibodies directed against the amino- or carboxy-terminal regions of NS1 precipitated the NS1-origin complexes, but antibodies directed against NS1 amino acids 284 to 459 blocked complex formation. Using affinity-purified histidine-tagged NS1 preparations, we have shown that the specific protein-DNA interaction is of moderate affinity, being stable in 0.1 M salt but rapidly lost at higher salt concentrations. In contrast, generalized (or nonspecific) DNA binding by NS1 could be demonstrated only in low salt. Addition of ATP or gamma S-ATP enhanced specific DNA binding by wild-type NS1 severalfold, but binding was lost under conditions which favored ATP hydrolysis. NS1 molecules with mutations in a critical lysine residue (amino acid 405) in the consensus ATP-binding site bound to the origin, but this binding could not be enhanced by ATP addition. DNase I protection assays carried out with wild-type NS1 in the presence of gamma S-ATP gave footprints which extended over 43 nucleotides on both DNA strands, from the middle of the origin bubble sequence to a position some 14 bp beyond the nick site. The DNA-binding site for NS1 was mapped to a 22-bp fragment from the middle of the 3' replication origin which contains the sequence ACCAACCA. This conforms to a reiterated motif (ACCA)2-3, which occurs, in more or less degenerate form, at many sites throughout the minute virus of mice genome (J. W. Bodner, Virus Genes 2:167-182, 1989). Insertion of a single copy of the sequence (ACCA)3 was shown to be sufficient to confer NS1 binding on an otherwise unrecognized plasmid fragment. The functions of NS1 in the viral life cycle are reevaluated in the light of this result. PMID:7853501
Cotmore, S F; Christensen, J; Nüesch, J P; Tattersall, P
1995-03-01
A DNA fragment containing the minute virus of mice 3' replication origin was specifically coprecipitated in immune complexes containing the virally coded NS1, but not the NS2, polypeptide. Antibodies directed against the amino- or carboxy-terminal regions of NS1 precipitated the NS1-origin complexes, but antibodies directed against NS1 amino acids 284 to 459 blocked complex formation. Using affinity-purified histidine-tagged NS1 preparations, we have shown that the specific protein-DNA interaction is of moderate affinity, being stable in 0.1 M salt but rapidly lost at higher salt concentrations. In contrast, generalized (or nonspecific) DNA binding by NS1 could be demonstrated only in low salt. Addition of ATP or gamma S-ATP enhanced specific DNA binding by wild-type NS1 severalfold, but binding was lost under conditions which favored ATP hydrolysis. NS1 molecules with mutations in a critical lysine residue (amino acid 405) in the consensus ATP-binding site bound to the origin, but this binding could not be enhanced by ATP addition. DNase I protection assays carried out with wild-type NS1 in the presence of gamma S-ATP gave footprints which extended over 43 nucleotides on both DNA strands, from the middle of the origin bubble sequence to a position some 14 bp beyond the nick site. The DNA-binding site for NS1 was mapped to a 22-bp fragment from the middle of the 3' replication origin which contains the sequence ACCAACCA. This conforms to a reiterated motif (ACCA)2-3, which occurs, in more or less degenerate form, at many sites throughout the minute virus of mice genome (J. W. Bodner, Virus Genes 2:167-182, 1989). Insertion of a single copy of the sequence (ACCA)3 was shown to be sufficient to confer NS1 binding on an otherwise unrecognized plasmid fragment. The functions of NS1 in the viral life cycle are reevaluated in the light of this result.
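The reiterated (ACCA)2-3 motif described in this abstract lends itself to a simple strand-aware search. The Python sketch below looks for two or three tandem copies of ACCA on both strands and reports hits in forward-strand coordinates. It matches exact copies only, so the "more or less degenerate" forms mentioned above are not detected; the function names and the test sequence are hypothetical.

```python
import re

COMPLEMENT = str.maketrans("ACGT", "TGCA")
MOTIF = re.compile(r"(?:ACCA){2,3}")  # two or three exact tandem copies of ACCA


def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def find_motif_sites(seq):
    """Return (strand, start, matched_text); start is in forward-strand coordinates."""
    seq = seq.upper()
    sites = [("+", m.start(), m.group(0)) for m in MOTIF.finditer(seq)]
    rc = reverse_complement(seq)
    for m in MOTIF.finditer(rc):
        # Convert the match position on the reverse strand back to the forward strand.
        sites.append(("-", len(seq) - m.end(), m.group(0)))
    return sites


# Hypothetical sequence with (ACCA)2 on the forward strand and (ACCA)3 on the reverse.
example = "TTACCAACCAGGTGGTTGGTTGGTCC"
print(find_motif_sites(example))
```

Scanning the reverse complement separately keeps the pattern itself simple, at the cost of one extra pass over the sequence.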
The role of recombination in the origin and evolution of Alu subfamilies.
Teixeira-Silva, Ana; Silva, Raquel M; Carneiro, João; Amorim, António; Azevedo, Luísa
2013-01-01
Alus are the most abundant and successful short interspersed nuclear elements found in primate genomes. In humans, they represent about 10% of the genome, although few are retrotransposition-competent and are clustered into subfamilies according to the source gene from which they evolved. Recombination between them can lead to genomic rearrangements of clinical and evolutionary significance. In this study, we have addressed the role of recombination in the origin of chimeric Alu source genes by the analysis of all known consensus sequences of human Alus. From the allelic diversity of Alu consensus sequences, validated in extant elements resulting from whole genome searches, distinct events of recombination were detected in the origin of particular subfamilies of AluS and AluY source genes. These results demonstrate that at least two subfamilies are likely to have emerged from ectopic Alu-Alu recombination, which stimulates further research regarding the potential of chimeric active Alus to punctuate the genome.
σ54-Dependent Response to Nitrogen Limitation and Virulence in Burkholderia cenocepacia Strain H111
Lardi, Martina; Aguilar, Claudio; Pedrioli, Alessandro; Omasits, Ulrich; Suppiger, Angela; Cárcamo-Oyarce, Gerardo; Schmid, Nadine; Ahrens, Christian H.
2015-01-01
Members of the genus Burkholderia are versatile bacteria capable of colonizing highly diverse environmental niches. In this study, we investigated the global response of the opportunistic pathogen Burkholderia cenocepacia H111 to nitrogen limitation at the transcript and protein expression levels. In addition to a classical response to nitrogen starvation, including the activation of glutamine synthetase, PII proteins, and the two-component regulatory system NtrBC, B. cenocepacia H111 also upregulated polyhydroxybutyrate (PHB) accumulation and exopolysaccharide (EPS) production in response to nitrogen shortage. A search for consensus sequences in promoter regions of nitrogen-responsive genes identified a σ54 consensus sequence. The mapping of the σ54 regulon as well as the characterization of a σ54 mutant suggests an important role of σ54 not only in control of nitrogen metabolism but also in the virulence of this organism. PMID:25841012
Deans, Zandra C; Costa, Jose Luis; Cree, Ian; Dequeker, Els; Edsjö, Anders; Henderson, Shirley; Hummel, Michael; Ligtenberg, Marjolijn Jl; Loddo, Marco; Machado, Jose Carlos; Marchetti, Antonio; Marquis, Katherine; Mason, Joanne; Normanno, Nicola; Rouleau, Etienne; Schuuring, Ed; Snelson, Keeda-Marie; Thunnissen, Erik; Tops, Bastiaan; Williams, Gareth; van Krieken, Han; Hall, Jacqueline A
2017-01-01
The clinical demand for mutation detection within multiple genes from a single tumour sample requires molecular diagnostic laboratories to develop rapid, high-throughput, highly sensitive, accurate and parallel testing within tight budget constraints. To meet this demand, many laboratories employ next-generation sequencing (NGS) based on small amplicons. Building on existing publications and general guidance for the clinical use of NGS and learnings from germline testing, the following guidelines establish consensus standards for somatic diagnostic testing, specifically for identifying and reporting mutations in solid tumours. These guidelines cover the testing strategy, implementation of testing within clinical service, sample requirements, data analysis and reporting of results. In conjunction with appropriate staff training and international standards for laboratory testing, these consensus standards for the use of NGS in molecular pathology of solid tumours will assist laboratories in implementing NGS in clinical services.
Stephen, Alexa A; Leone, Angelique M; Toplon, David E; Archer, Linda L; Wellehan, James F X
2016-12-01
A juvenile female bald eagle (Haliaeetus leucocephalus) was presented with emaciation and proliferative periocular lesions. The eagle did not respond to supportive therapy and was euthanatized. Histopathologic examination of the skin lesions revealed plaques of marked epidermal hyperplasia, parakeratosis, marked acanthosis and spongiosis, and eosinophilic intracytoplasmic inclusion bodies. Novel polymerase chain reaction (PCR) assays were done to amplify and sequence DNA polymerase and rpo147 genes. The 4b gene was also analyzed by a previously developed assay. Bayesian and maximum likelihood phylogenetic analyses of the obtained sequences found the virus to be a poxvirus of the genus Avipoxvirus that clustered with other raptor isolates. Better phylogenetic resolution was found in rpo147 than in the commonly used DNA polymerase. The novel consensus rpo147 PCR assay will create more accurate phylogenetic trees and allow better insight into poxvirus history.
Saito, T; Ochiai, H
1999-10-01
cDNA fragments putatively encoding amino acid sequences characteristic of the fatty acid desaturase were obtained using expressed sequence tag (EST) information of the Dictyostelium cDNA project. Using this sequence, we have determined the cDNA sequence and genomic sequence of a desaturase. The cloned cDNA is 1489 nucleotides long and the deduced amino acid sequence comprises 464 amino acid residues containing an N-terminal cytochrome b5 domain. The whole sequence was 38.6% identical to the initially identified Delta5-desaturase of Mortierella alpina. We have confirmed its function as a Delta5-desaturase by an overexpression mutation in D. discoideum and also a gain-of-function mutation in the yeast Saccharomyces cerevisiae. Analysis of the lipids from transformed D. discoideum and yeast demonstrated the accumulation of Delta5-desaturated products. This is the first report concerning a fatty acid desaturase in cellular slime molds.
Ardui, Simon; Ameur, Adam; Vermeesch, Joris R; Hestand, Matthew S
2018-01-01
Short read massive parallel sequencing has emerged as a standard diagnostic tool in the medical setting. However, short read technologies have inherent limitations such as GC bias, difficulties mapping to repetitive elements, trouble discriminating paralogous sequences, and difficulties in phasing alleles. Long read single molecule sequencers resolve these obstacles. Moreover, they offer higher consensus accuracies and can detect epigenetic modifications from native DNA. The first commercially available long read single molecule platform was the RS system based on PacBio's single molecule real-time (SMRT) sequencing technology, which has since evolved into their RSII and Sequel systems. Here we capsulize how SMRT sequencing is revolutionizing constitutional, reproductive, cancer, microbial and viral genetic testing. PMID:29401301
Rasmussen, C.; Purcell, M.K.; Gregg, J.L.; LaPatra, S.E.; Winton, J.R.; Hershberger, P.K.
2010-01-01
The mesomycetozoean parasite Ichthyophonus hoferi is most commonly associated with marine fish hosts but also occurs in some components of the freshwater rainbow trout Oncorhynchus mykiss aquaculture industry in Idaho, USA. It is not certain how the parasite was introduced into rainbow trout culture, but it might have been associated with the historical practice of feeding raw, ground common carp Cyprinus carpio that were caught by commercial fishermen. Here, we report a major genetic division between west coast freshwater and marine isolates of Ichthyophonus hoferi. Sequence differences were not detected in 2 regions of the highly conserved small subunit (18S) rDNA gene; however, nucleotide variation was seen in internal transcribed spacer loci (ITS1 and ITS2), both within and among the isolates. Intra-isolate variation ranged from 2.4 to 7.6 nucleotides over a region consisting of ~740 bp. Majority consensus sequences from marine/anadromous hosts differed in only 0 to 3 nucleotides (99.6 to 100% nucleotide identity), while those derived from freshwater rainbow trout had no nucleotide substitutions relative to each other. However, the consensus sequences between isolates from freshwater rainbow trout and those from marine/anadromous hosts differed in 13 to 16 nucleotides (97.8 to 98.2% nucleotide identity).
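The identity percentages quoted in this abstract follow from the difference counts and the length of the compared region. The sketch below reproduces that arithmetic under two assumptions that are not stated in the source: that identity is computed as (length - differences) / length, and that the region is exactly 740 bp (the abstract says "~740 bp"). The function name `percent_identity` is illustrative only.

```python
def percent_identity(differences: int, length: int) -> float:
    """Percent nucleotide identity for `differences` mismatches over `length` aligned bases."""
    if length <= 0 or not 0 <= differences <= length:
        raise ValueError("need length > 0 and 0 <= differences <= length")
    return 100.0 * (length - differences) / length


REGION_BP = 740  # assumption: the abstract gives the region only as "~740 bp"

# Difference counts reported in the abstract: 0-3 (marine vs marine),
# 13-16 (freshwater vs marine/anadromous).
for diffs in (0, 3, 13, 16):
    print(diffs, round(percent_identity(diffs, REGION_BP), 1))
```

Under these assumptions, rounding to one decimal place gives 100.0, 99.6, 98.2 and 97.8, which matches the ranges reported in the abstract.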
Accurate multiplex polony sequencing of an evolved bacterial genome.
Shendure, Jay; Porreca, Gregory J; Reppas, Nikos B; Lin, Xiaoxia; McCutcheon, John P; Rosenbaum, Abraham M; Wang, Michael D; Zhang, Kun; Mitra, Robi D; Church, George M
2005-09-09
We describe a DNA sequencing technology in which a commonly available, inexpensive epifluorescence microscope is converted to rapid nonelectrophoretic DNA sequencing automation. We apply this technology to resequence an evolved strain of Escherichia coli at less than one error per million consensus bases. A cell-free, mate-paired library provided single DNA molecules that were amplified in parallel to 1-micrometer beads by emulsion polymerase chain reaction. Millions of beads were immobilized in a polyacrylamide gel and subjected to automated cycles of sequencing by ligation and four-color imaging. Cost per base was roughly one-ninth as much as that of conventional sequencing. Our protocols were implemented with off-the-shelf instrumentation and reagents.
Composition for nucleic acid sequencing
Korlach, Jonas [Ithaca, NY; Webb, Watt W [Ithaca, NY; Levene, Michael [Ithaca, NY; Turner, Stephen [Ithaca, NY; Craighead, Harold G [Ithaca, NY; Foquet, Mathieu [Ithaca, NY
2008-08-26
The present invention is directed to a method of sequencing a target nucleic acid molecule having a plurality of bases. In its principle, the temporal order of base additions during the polymerization reaction is measured on a molecule of nucleic acid, i.e. the activity of a nucleic acid polymerizing enzyme on the template nucleic acid molecule to be sequenced is followed in real time. The sequence is deduced by identifying which base is being incorporated into the growing complementary strand of the target nucleic acid by the catalytic activity of the nucleic acid polymerizing enzyme at each step in the sequence of base additions. A polymerase on the target nucleic acid molecule complex is provided in a position suitable to move along the target nucleic acid molecule and extend the oligonucleotide primer at an active site. A plurality of labelled types of nucleotide analogs are provided proximate to the active site, with each distinguishable type of nucleotide analog being complementary to a different nucleotide in the target nucleic acid sequence. The growing nucleic acid strand is extended by using the polymerase to add a nucleotide analog to the nucleic acid strand at the active site, where the nucleotide analog being added is complementary to the nucleotide of the target nucleic acid at the active site. The nucleotide analog added to the oligonucleotide primer as a result of the polymerizing step is identified. The steps of providing labelled nucleotide analogs, polymerizing the growing nucleic acid strand, and identifying the added nucleotide analog are repeated so that the nucleic acid strand is further extended and the sequence of the target nucleic acid is determined.
Method for sequencing nucleic acid molecules
Korlach, Jonas; Webb, Watt W.; Levene, Michael; Turner, Stephen; Craighead, Harold G.; Foquet, Mathieu
2006-06-06
The present invention is directed to a method of sequencing a target nucleic acid molecule having a plurality of bases. In its principle, the temporal order of base additions during the polymerization reaction is measured on a molecule of nucleic acid, i.e. the activity of a nucleic acid polymerizing enzyme on the template nucleic acid molecule to be sequenced is followed in real time. The sequence is deduced by identifying which base is being incorporated into the growing complementary strand of the target nucleic acid by the catalytic activity of the nucleic acid polymerizing enzyme at each step in the sequence of base additions. A polymerase on the target nucleic acid molecule complex is provided in a position suitable to move along the target nucleic acid molecule and extend the oligonucleotide primer at an active site. A plurality of labelled types of nucleotide analogs are provided proximate to the active site, with each distinguishable type of nucleotide analog being complementary to a different nucleotide in the target nucleic acid sequence. The growing nucleic acid strand is extended by using the polymerase to add a nucleotide analog to the nucleic acid strand at the active site, where the nucleotide analog being added is complementary to the nucleotide of the target nucleic acid at the active site. The nucleotide analog added to the oligonucleotide primer as a result of the polymerizing step is identified. The steps of providing labelled nucleotide analogs, polymerizing the growing nucleic acid strand, and identifying the added nucleotide analog are repeated so that the nucleic acid strand is further extended and the sequence of the target nucleic acid is determined.
Method for sequencing nucleic acid molecules
Korlach, Jonas; Webb, Watt W.; Levene, Michael; Turner, Stephen; Craighead, Harold G.; Foquet, Mathieu
2006-05-30
The present invention is directed to a method of sequencing a target nucleic acid molecule having a plurality of bases. In its principle, the temporal order of base additions during the polymerization reaction is measured on a molecule of nucleic acid, i.e. the activity of a nucleic acid polymerizing enzyme on the template nucleic acid molecule to be sequenced is followed in real time. The sequence is deduced by identifying which base is being incorporated into the growing complementary strand of the target nucleic acid by the catalytic activity of the nucleic acid polymerizing enzyme at each step in the sequence of base additions. A polymerase on the target nucleic acid molecule complex is provided in a position suitable to move along the target nucleic acid molecule and extend the oligonucleotide primer at an active site. A plurality of labelled types of nucleotide analogs are provided proximate to the active site, with each distinguishable type of nucleotide analog being complementary to a different nucleotide in the target nucleic acid sequence. The growing nucleic acid strand is extended by using the polymerase to add a nucleotide analog to the nucleic acid strand at the active site, where the nucleotide analog being added is complementary to the nucleotide of the target nucleic acid at the active site. The nucleotide analog added to the oligonucleotide primer as a result of the polymerizing step is identified. The steps of providing labelled nucleotide analogs, polymerizing the growing nucleic acid strand, and identifying the added nucleotide analog are repeated so that the nucleic acid strand is further extended and the sequence of the target nucleic acid is determined.
Dipeptide Sequence Determination: Analyzing Phenylthiohydantoin Amino Acids by HPLC
NASA Astrophysics Data System (ADS)
Barton, Janice S.; Tang, Chung-Fei; Reed, Steven S.
2000-02-01
Amino acid composition and sequence determination, important techniques for characterizing peptides and proteins, are essential for predicting conformation and studying sequence alignment. This experiment presents improved, fundamental methods of sequence analysis for an upper-division biochemistry laboratory. Working in pairs, students use the Edman reagent to prepare phenylthiohydantoin derivatives of amino acids for determination of the sequence of an unknown dipeptide. With a single HPLC technique, students identify both the N-terminal amino acid and the composition of the dipeptide. This method yields good precision of retention times and allows use of a broad range of amino acids as components of the dipeptide. Students learn fundamental principles and techniques of sequence analysis and HPLC.
Interactive web-based identification and visualization of transcript shared sequences.
Azhir, Alaleh; Merino, Louis-Henri; Nauen, David W
2018-05-12
We have developed TraC (Transcript Consensus), a web-based tool for detecting and visualizing shared sequences among two or more mRNA transcripts such as splice variants. Results including exon-exon boundaries are returned in a highly intuitive, data-rich, interactive plot that permits users to explore the similarities and differences of multiple transcript sequences. The online tool (http://labs.pathology.jhu.edu/nauen/trac/) is free to use. The source code is freely available for download (https://github.com/nauenlab/TraC). Copyright © 2018 Elsevier Inc. All rights reserved.
EGVII endoglucanase and nucleic acids encoding the same
Dunn-Coleman, Nigel; Goedegebuur, Frits; Ward, Michael; Yao, Jian
2014-02-25
The present invention provides a novel endoglucanase nucleic acid sequence, designated egl7, and the corresponding EGVII amino acid sequence. The invention also provides expression vectors and host cells comprising a nucleic acid sequence encoding EGVII, recombinant EGVII proteins and methods for producing the same.
EGVII endoglucanase and nucleic acids encoding the same
Dunn-Coleman, Nigel; Goedegebuur, Frits; Ward, Michael; Yao, Jian
2006-05-16
The present invention provides a novel endoglucanase nucleic acid sequence, designated egl7, and the corresponding EGVII amino acid sequence. The invention also provides expression vectors and host cells comprising a nucleic acid sequence encoding EGVII, recombinant EGVII proteins and methods for producing the same.
EGVI endoglucanase and nucleic acids encoding the same
Dunn-Coleman, Nigel [Los Gatos, CA; Goedegebuur, Frits [Vlaardingen, NL; Ward, Michael [San Francisco, CA; Yao, Jian [Sunnyvale, CA
2008-04-01
The present invention provides a novel endoglucanase nucleic acid sequence, designated egl6, and the corresponding EGVI amino acid sequence. The invention also provides expression vectors and host cells comprising a nucleic acid sequence encoding EGVI, recombinant EGVI proteins and methods for producing the same.
EGVI endoglucanase and nucleic acids encoding the same
Dunn-Coleman, Nigel; Goedegebuur, Frits; Ward, Michael; Yao, Jian
2010-10-12
The present invention provides a novel endoglucanase nucleic acid sequence, designated egl6, and the corresponding EGVI amino acid sequence. The invention also provides expression vectors and host cells comprising a nucleic acid sequence encoding EGVI, recombinant EGVI proteins and methods for producing the same.
EGVIII endoglucanase and nucleic acids encoding the same
Dunn-Coleman, Nigel; Goedegebuur, Frits; Ward, Michael; Yao, Jian
2006-05-23
The present invention provides a novel endoglucanase nucleic acid sequence, designated egl8, and the corresponding EGVIII amino acid sequence. The invention also provides expression vectors and host cells comprising a nucleic acid sequence encoding EGVIII, recombinant EGVIII proteins and methods for producing the same.
EGVI endoglucanase and nucleic acids encoding the same
Dunn-Coleman, Nigel; Goedegebuur, Frits; Ward, Michael; Yao, Jian
2010-10-05
The present invention provides a novel endoglucanase nucleic acid sequence, designated egl6, and the corresponding EGVI amino acid sequence. The invention also provides expression vectors and host cells comprising a nucleic acid sequence encoding EGVI, recombinant EGVI proteins and methods for producing the same.
EGVI endoglucanase and nucleic acids encoding the same
Dunn-Coleman, Nigel; Goedegebuur, Frits; Ward, Michael; Yao, Jian
2006-06-06
The present invention provides a novel endoglucanase nucleic acid sequence, designated egl6, and the corresponding EGVI amino acid sequence. The invention also provides expression vectors and host cells comprising a nucleic acid sequence encoding EGVI, recombinant EGVI proteins and methods for producing the same.
EGVII endoglucanase and nucleic acids encoding the same
Dunn-Coleman, Nigel [Los Gatos, CA; Goedegebuur, Frits [Vlaardingen, NL; Ward, Michael [San Francisco, CA; Yao, Jian [Sunnyvale, CA
2009-05-05
The present invention provides an endoglucanase nucleic acid sequence, designated egl7, and the corresponding EGVII amino acid sequence. The invention also provides expression vectors and host cells comprising a nucleic acid sequence encoding EGVII, recombinant EGVII proteins and methods for producing the same.
EGVII endoglucanase and nucleic acids encoding the same
Dunn-Coleman, Nigel; Goedegebuur, Frits; Ward, Michael; Yao, Jian
2013-07-16
The present invention provides a novel endoglucanase nucleic acid sequence, designated egl7, and the corresponding EGVII amino acid sequence. The invention also provides expression vectors and host cells comprising a nucleic acid sequence encoding EGVII, recombinant EGVII proteins and methods for producing the same.
EGVII endoglucanase and nucleic acids encoding the same
Dunn-Coleman, Nigel [Los Gatos, CA; Goedegebuur, Frits [Vlaardingen, NL; Ward, Michael [San Francisco, CA; Yao, Jian [Sunnyvale, CA
2012-02-14
The present invention provides a novel endoglucanase nucleic acid sequence, designated egl7, and the corresponding EGVII amino acid sequence. The invention also provides expression vectors and host cells comprising a nucleic acid sequence encoding EGVII, recombinant EGVII proteins and methods for producing the same.
EGVII endoglucanase and nucleic acids encoding the same
Dunn-Coleman, Nigel; Goedegebuur, Frits; Ward, Michael; Yao, Jian
2015-04-14
The present invention provides a novel endoglucanase nucleic acid sequence, designated egl7, and the corresponding EGVII amino acid sequence. The invention also provides expression vectors and host cells comprising a nucleic acid sequence encoding EGVII, recombinant EGVII proteins and methods for producing the same.
Roman, S; Gyawali, C P; Savarino, E; Yadlapati, R; Zerbib, F; Wu, J; Vela, M; Tutuian, R; Tatum, R; Sifrim, D; Keller, J; Fox, M; Pandolfino, J E; Bredenoord, A J
2017-10-01
An international group of experts evaluated and revised recommendations for ambulatory reflux monitoring for the diagnosis of gastro-esophageal reflux disease (GERD). The literature search was focused on indications and technical recommendations for GERD testing and phenotype definitions. Statements were proposed and discussed during several structured meetings. Reflux testing should be performed after cessation of acid suppressive medication in patients with a low likelihood of GERD. In this setting, testing can be either catheter-based or wireless pH-monitoring or pH-impedance monitoring. In patients with a high probability of GERD (esophagitis grade C and D, histology proven Barrett's mucosa >1 cm, peptic stricture, previous positive pH monitoring) and persistent symptoms, pH-impedance monitoring should be performed on treatment. Recommendations are provided for data acquisition and analysis. Esophageal acid exposure is considered pathological if acid exposure time (AET) is greater than 6% on pH testing. Number of reflux episodes and baseline impedance are exploratory metrics that may complement AET. Positive symptom-reflux association is defined as symptom index (SI) >50% or symptom association probability (SAP) >95%. A positive symptom-reflux association in the absence of pathological AET defines hypersensitivity to reflux. The consensus group determined that grade C or D esophagitis, peptic stricture, histology proven Barrett's mucosa >1 cm, and esophageal acid exposure >6% are sufficient to define pathological GERD. Further testing should be considered when none of these criteria are fulfilled. © 2017 John Wiley & Sons Ltd.
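The numeric thresholds in this abstract (AET >6%, SI >50%, SAP >95%) combine into a simple decision rule. The sketch below encodes only those three thresholds as an illustration. The class `RefluxStudy`, the function names and the label strings are hypothetical, the endoscopic and histologic criteria are deliberately left out, and this is not a clinical tool.

```python
from dataclasses import dataclass

# Thresholds taken from the abstract; all comparisons are strict (">").
AET_THRESHOLD = 6.0    # acid exposure time, percent
SI_THRESHOLD = 50.0    # symptom index, percent
SAP_THRESHOLD = 95.0   # symptom association probability, percent


@dataclass
class RefluxStudy:
    """Hypothetical container for one pH or pH-impedance study."""
    aet_percent: float
    si_percent: float
    sap_percent: float


def pathological_aet(study: RefluxStudy) -> bool:
    return study.aet_percent > AET_THRESHOLD


def positive_symptom_association(study: RefluxStudy) -> bool:
    return study.si_percent > SI_THRESHOLD or study.sap_percent > SAP_THRESHOLD


def classify(study: RefluxStudy) -> str:
    """Label a study using only the three numeric metrics above."""
    if pathological_aet(study):
        return "pathological acid exposure"
    if positive_symptom_association(study):
        return "reflux hypersensitivity"
    return "no numeric criterion met"


print(classify(RefluxStudy(aet_percent=3.0, si_percent=60.0, sap_percent=50.0)))
```

The order of the checks mirrors the text: hypersensitivity is only assigned when symptom association is positive and AET is not pathological.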
Kit for detecting nucleic acid sequences using competitive hybridization probes
Lucas, Joe N.; Straume, Tore; Bogen, Kenneth T.
2001-01-01
A kit is provided for detecting a target nucleic acid sequence in a sample, the kit comprising: a first hybridization probe which includes a nucleic acid sequence that is sufficiently complementary to selectively hybridize to a first portion of the target sequence, the first hybridization probe including a first complexing agent for forming a binding pair with a second complexing agent; and a second hybridization probe which includes a nucleic acid sequence that is sufficiently complementary to selectively hybridize to a second portion of the target sequence to which the first hybridization probe does not selectively hybridize, the second hybridization probe including a detectable marker; a third hybridization probe which includes a nucleic acid sequence that is sufficiently complementary to selectively hybridize to a first portion of the target sequence, the third hybridization probe including the same detectable marker as the second hybridization probe; and a fourth hybridization probe which includes a nucleic acid sequence that is sufficiently complementary to selectively hybridize to a second portion of the target sequence to which the third hybridization probe does not selectively hybridize, the fourth hybridization probe including the first complexing agent for forming a binding pair with the second complexing agent; wherein the first and second hybridization probes are capable of simultaneously hybridizing to the target sequence and the third and fourth hybridization probes are capable of simultaneously hybridizing to the target sequence, the detectable marker is not present on the first or fourth hybridization probes and the first, second, third, and fourth hybridization probes each include a competitive nucleic acid sequence which is sufficiently complementary to a third portion of the target sequence that the competitive sequences of the first, second, third, and fourth hybridization probes compete with each other to hybridize to the third portion of the target sequence.
Congenital analbuminemia caused by a novel aberrant splicing in the albumin gene.
Caridi, Gianluca; Dagnino, Monica; Erdeve, Omer; Di Duca, Marco; Yildiz, Duran; Alan, Serdar; Atasay, Begum; Arsan, Saadet; Campagnoli, Monica; Galliano, Monica; Minchiotti, Lorenzo
2014-01-01
Congenital analbuminemia is a rare autosomal recessive disorder manifested by the presence of a very low amount of circulating serum albumin. It is an allelic heterogeneous defect, caused by a variety of mutations within the albumin gene in the homozygous or compound heterozygous state. Herein we report the clinical and molecular characterization of a new case of congenital analbuminemia diagnosed in a female newborn of consanguineous (first degree cousins) parents from Ankara, Turkey, who presented with a low albumin concentration (< 8 g/L) and severe clinical symptoms. The albumin gene of the index case was screened by single-strand conformation polymorphism, heteroduplex analysis, and direct DNA sequencing. The effect of the splicing mutation was evaluated by examining the cDNA obtained by reverse transcriptase-polymerase chain reaction (RT-PCR) from the albumin mRNA extracted from the proband's leukocytes. DNA sequencing revealed that the proband is homozygous, and both parents are heterozygous, for a novel G>A transition at position c.1652+1, the first base of intron 12, which inactivates the strongly conserved GT dinucleotide at the 5' splice site consensus sequence of this intron. The splicing defect results in the complete skipping of the preceding exon (exon 12) and in a frame-shift within exon 13 with a premature stop codon after the translation of three mutant amino acid residues. Our results confirm the clinical diagnosis of congenital analbuminemia in the proband and the inheritance of the trait, and help to shed light on the molecular genetics of analbuminemia.
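The effect of a +1 G>A change on the donor splice site can be illustrated with a few lines of code. The sketch below checks for the conserved GT dinucleotide at the start of an intron and applies a substitution at intronic position +1. The intron string `GTAAGTAT` is a made-up toy sequence, not the real albumin intron 12, and the helper names are hypothetical.

```python
def has_canonical_donor(intron: str) -> bool:
    """True if the intron starts with the conserved GT dinucleotide (5' splice site)."""
    return intron.upper().startswith("GT")


def substitute(intron: str, position: int, ref: str, alt: str) -> str:
    """Apply a single-base substitution at a 1-based intronic position (+1 is the first base)."""
    index = position - 1
    if intron[index].upper() != ref.upper():
        raise ValueError(f"expected {ref} at +{position}, found {intron[index]}")
    return intron[:index] + alt + intron[index + 1:]


toy_intron = "GTAAGTAT"  # assumption: toy sequence for illustration only
mutant = substitute(toy_intron, 1, "G", "A")  # analogous to a +1 G>A transition

print(toy_intron, has_canonical_donor(toy_intron))
print(mutant, has_canonical_donor(mutant))
```

The check covers only the GT dinucleotide named in the abstract. It does not model the wider donor consensus or predict the exon skipping that the authors observed experimentally.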
Ullah, Ihsan; Jang, Eun-Kyung; Kim, Min-Sung; Shin, Jin-Ho; Park, Gun-Seok; Khan, Abdur Rahim; Hong, Sung-Jun; Jung, Byung-Kwon; Choi, JungBae; Park, YeongJun; Kwak, Yunyoung; Shin, Jae-Ho
2014-01-01
Photorhabdus temperata is an entomopathogenic enterobacterium; it is a nematode symbiont that possesses pathogenicity islands involved in insect virulence. Herein, we constructed a P. temperata M1021 cosmid library in Escherichia coli XL1-Blue MRF' and obtained 7.14 × 10^5 clones. However, only 1020 physiologically active clones were screened for insect virulence factors by injection of each E. coli cosmid clone into Galleria mellonella and Tenebrio molitor larvae. A single cosmid clone, PtC1015, was consequently selected due to its characteristic virulent properties, e.g., loss of body turgor followed by death of larvae when the clone was injected into the hemocoel. Sequence alignment against the available sequences in the Swiss-Prot and NCBI databases confirmed the presence of the mcf gene homolog in the genome of P. temperata M1021, showing 85% homology and 98% query coverage with the P. luminescens counterpart. Furthermore, a 2932 amino acid long Mcf protein revealed limited similarity with three protein domains. The N-terminus of Mcf encompassed a consensus sequence for a BH3 domain, the central region revealed similarity to toxin B, and the C-terminus of Mcf revealed similarity to the bacterial export domain of ApxIVA, an RTX-like toxin. In short, the Mcf toxin is likely to play a role in the elimination of insect pests, making it a promising model for use in the agricultural field. PMID:25014195
Chen, W L; Luo, D F; Gao, C; Ding, Y; Wang, S Y
2015-07-01
The familial acute myeloid leukemia related factor gene (FAMLF) was previously identified from a familial AML subtractive cDNA library and shown to undergo alternative splicing. This study used real-time quantitative PCR to investigate the expression of the FAMLF alternative-splicing transcript consensus sequence (FAMLF-CS) in peripheral blood mononuclear cells (PBMCs) from 119 patients with de novo acute leukemia (AL) and 104 healthy controls, as well as in CD34+ cells from 12 AL patients and 10 healthy donors. A 429-bp fragment from a novel splicing variant of FAMLF was obtained, and a 363-bp consensus sequence was targeted to quantify total FAMLF expression. Kruskal-Wallis, Nemenyi, Spearman's correlation, and Mann-Whitney U-tests were used to analyze the data. FAMLF-CS expression in PBMCs from AL patients and CD34+ cells from AL patients and controls was significantly higher than in control PBMCs (P < 0.0001). Moreover, FAMLF-CS expression in PBMCs from the AML group was positively correlated with red blood cell count (rs = 0.317, P = 0.006), hemoglobin levels (rs = 0.210, P = 0.049), and percentage of peripheral blood blasts (rs = 0.256, P = 0.027), but inversely correlated with hemoglobin levels in the control group (rs = -0.391, P < 0.0001). AML patients with high CD34+ expression showed significantly higher FAMLF-CS expression than those with low CD34+ expression (P = 0.041). Our results showed that FAMLF is highly expressed in both normal and malignant immature hematopoietic cells, but that expression is lower in normal mature PBMCs.
Khalil, Farghama; Yueyu, Xu; Naiyan, Xiao; Di, Liu; Tayyab, Muhammad; Hengbo, Wang; Islam, Waqar; Rauf, Saeed; Pinghua, Chen
2018-05-04
Sugarcane is an essential crop for sugar and biofuel. Globally, its production is severely affected by sugarcane yellow leaf disease (SCYLD) caused by Sugarcane Yellow Leaf Virus (SCYLV). Many aphid vectors are involved in the spread of the disease, which reduces the effectiveness of cultural and chemical management. Empirical methods of plant breeding such as introgression from wild and cultivated germplasm were not possible, or at least challenging, due to the absence of resistance in cultivated and wild germplasm of sugarcane. RNA interference (RNAi) transformation is an effective method to create virus-resistant varieties. Nevertheless, limited progress has been made due to the lack of a comprehensive research program on SCYLV based on the RNAi technique. In order to show improvement and to propose future strategies for the feasibility of the RNAi technique to cope with SCYLV, genome-wide consensus sequences of SCYLV were analyzed through GenBank. The coverage rates of every consensus sequence in SCYLV isolates were calculated to evaluate their practicability. Our analysis showed that a single consensus sequence from SCYLV could not work well for RNAi-based sugarcane breeding programs. This may be due to the high mutation rate and continuous recombination within and between various viral strains. An alternative multi-target RNAi strategy is suggested to combat several strains of the virus and to reduce silencing escape. The multi-target small interfering RNAs (siRNAs) can be used together to construct an RNAi plant expression plasmid and to transform sugarcane tissues to develop new sugarcane varieties resistant to SCYLV. Copyright © 2018 Elsevier Ltd. All rights reserved.
Bobes, Raúl J.; Navarrete-Perea, José; Ochoa-Leyva, Adrián; Anaya, Víctor Hugo; Hernández, Marisela; Cervantes-Torres, Jacquelynne; Estrada, Karel; Sánchez-Lopez, Filiberto; Soberón, Xavier; Rosas, Gabriela; Nunes, Cáris Maroni; García-Varela, Martín; Sotelo-Mundo, Rogerio Rafael; López-Zavala, Alonso Alexis; Gevorkian, Goar; Acero, Gonzalo; Laclette, Juan P.; Fragoso, Gladis
2017-01-01
Taenia solium cysticercosis, a parasitic disease that affects human health in various regions of the world, is preventable by vaccination. Both the 97-amino-acid-long KETc7 peptide and its carboxyl-terminal, 18-amino-acid-long sequence (GK-1) are found in Taenia crassiceps. Both peptides have proven protective capacity against cysticercosis and are part of the highly conserved, cestode-native, 264-amino-acid long protein KE7. KE7 belongs to a ubiquitously distributed family of proteins associated with membrane processes and may participate in several vital cell pathways. The aim of this study was to identify the T. solium KE7 (TsKE7) full-length protein and to determine its immunogenic properties. Recombinant TsKE7 (rTsKE7) was expressed in Escherichia coli Rosetta2 cells and used to obtain mouse polyclonal antibodies. Anti-rTsKE7 antibodies detected the expected native protein among the 350 spots developed from T. solium cyst vesicular fluid in a mass spectrometry-coupled immune proteomic analysis. These antibodies were then used to screen a phage-displayed 7-random-peptide library to map B-cell epitopes. The recognized phages displayed 9 peptides, with the consensus motif Y(F/Y)PS sequence, which includes YYYPS (named GK-1M, for being a GK-1 mimotope), exactly matching a part of GK-1. GK-1M was recognized by 58% of serum samples from cysticercotic pigs with 100% specificity but induced weak protection against murine cysticercosis. In silico analysis revealed a universal T-cell epitope(s) in native TsKE7 potentially capable of stimulating cytotoxic T lymphocytes and helper T lymphocytes under different major histocompatibility complex class I and class II mouse haplotypes. Altogether, these results provide a rationale for the efficacy of the KETc7, rTsKE7, and GK-1 peptides as vaccines. PMID:28923896
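The consensus motif Y(F/Y)PS reported in this abstract can be written as a regular expression and searched for in peptide strings. The sketch below uses Python's standard `re` module. The helper name `find_motif` and the second test peptide `AYFPSK` are hypothetical illustrations, while `YYYPS` is the GK-1M sequence named in the abstract.

```python
import re

# Y, then F or Y, then P, then S: the consensus motif Y(F/Y)PS.
MOTIF = re.compile(r"Y[FY]PS")


def find_motif(peptide: str) -> list[tuple[int, str]]:
    """Return (0-based start, matched text) for each non-overlapping motif occurrence."""
    return [(m.start(), m.group()) for m in MOTIF.finditer(peptide.upper())]


print(find_motif("YYYPS"))   # GK-1M from the abstract
print(find_motif("AYFPSK"))  # hypothetical peptide carrying the F variant
print(find_motif("YAPS"))    # no match: A is not allowed at position 2
```

In `YYYPS` the motif matches from the second residue onward (`YYPS`), which is consistent with the abstract's statement that the consensus motif includes YYYPS.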
Bobes, Raúl J; Navarrete-Perea, José; Ochoa-Leyva, Adrián; Anaya, Víctor Hugo; Hernández, Marisela; Cervantes-Torres, Jacquelynne; Estrada, Karel; Sánchez-Lopez, Filiberto; Soberón, Xavier; Rosas, Gabriela; Nunes, Cáris Maroni; García-Varela, Martín; Sotelo-Mundo, Rogerio Rafael; López-Zavala, Alonso Alexis; Gevorkian, Goar; Acero, Gonzalo; Laclette, Juan P; Fragoso, Gladis; Sciutto, Edda
2017-12-01
Taenia solium cysticercosis, a parasitic disease that affects human health in various regions of the world, is preventable by vaccination. Both the 97-amino-acid-long KETc7 peptide and its carboxyl-terminal, 18-amino-acid-long sequence (GK-1) are found in Taenia crassiceps. Both peptides have proven protective capacity against cysticercosis and are part of the highly conserved, cestode-native, 264-amino-acid long protein KE7. KE7 belongs to a ubiquitously distributed family of proteins associated with membrane processes and may participate in several vital cell pathways. The aim of this study was to identify the T. solium KE7 (TsKE7) full-length protein and to determine its immunogenic properties. Recombinant TsKE7 (rTsKE7) was expressed in Escherichia coli Rosetta2 cells and used to obtain mouse polyclonal antibodies. Anti-rTsKE7 antibodies detected the expected native protein among the 350 spots developed from T. solium cyst vesicular fluid in a mass spectrometry-coupled immune proteomic analysis. These antibodies were then used to screen a phage-displayed 7-random-peptide library to map B-cell epitopes. The recognized phages displayed 9 peptides, with the consensus motif Y(F/Y)PS sequence, which includes YYYPS (named GK-1M, for being a GK-1 mimotope), exactly matching a part of GK-1. GK-1M was recognized by 58% of serum samples from cysticercotic pigs with 100% specificity but induced weak protection against murine cysticercosis. In silico analysis revealed a universal T-cell epitope(s) in native TsKE7 potentially capable of stimulating cytotoxic T lymphocytes and helper T lymphocytes under different major histocompatibility complex class I and class II mouse haplotypes. Altogether, these results provide a rationale for the efficacy of the KETc7, rTsKE7, and GK-1 peptides as vaccines. Copyright © 2017 American Society for Microbiology.
Molecular genetic basis for fluoroquinolone-induced retinal degeneration in cats.
Ramirez, Christina J; Minch, Jonathan D; Gay, John M; Lahmers, Sunshine M; Guerra, Dan J; Haldorson, Gary J; Schneider, Terri; Mealey, Katrina L
2011-02-01
Distribution of fluoroquinolones to the retina is normally restricted by ABCG2 at the blood-retinal barrier. As the cat develops a species-specific adverse reaction to photoreactive fluoroquinolones, our goal was to investigate ABCG2 as a candidate gene for fluoroquinolone-induced retinal degeneration and blindness in cats. Feline ABCG2 was sequenced and the consensus amino acid sequence was compared with that of 10 other mammalian species. Expression of ABCG2 in feline retina was assessed by immunoblot. cDNA constructs for feline and human ABCG2 were constructed in a pcDNA3 expression vector and expressed in HEK-293 cells, and ABCG2 expression was analyzed by western blot and immunofluorescence. Mitoxantrone and BODIPY-prazosin efflux measured by flow cytometry and a phototoxicity assay were used to assess feline and human ABCG2 function. Four feline-specific (compared with 10 other mammalian species) amino acid changes in conserved regions of ABCG2 were identified. Expression of ABCG2 on plasma membranes was confirmed in feline retina and in cells transfected with human and feline ABCG2, although some intracellular expression of feline ABCG2 was detected by immunofluorescence. Function of feline ABCG2, compared with human ABCG2, was found to be deficient as determined by flow cytometric measurement of mitoxantrone and BODIPY-prazosin efflux and enrofloxacin-induced phototoxicity assays. Feline-specific amino acid changes in ABCG2 cause a functional defect of the transport protein in cats. This functional defect may be owing, in part, to defective cellular localization of feline ABCG2. Regardless, dysfunction of ABCG2 at the blood-retinal barrier likely results in accumulation of photoreactive fluoroquinolones in feline retina. Exposure of the retina to light would then generate reactive oxygen species that would cause the characteristic retinal degeneration and blindness documented in some cats receiving high doses of some fluoroquinolones. 
Pharmacological inhibition of ABCG2 in other species might result in retinal damage if fluoroquinolones are concurrently administered.
Chang, M X; Nie, P; Xie, H X; Sun, B J; Gao, Q
2005-01-01
The cDNAs and genes of two different types of leucine-rich repeat-containing proteins from grass carp (Ctenopharyngodon idellus) were cloned. Homology search revealed that the two genes, designated as GC-GARP and GC-LRG, have 37% and 32% deduced amino-acid sequence similarities with human glycoprotein A repetitions predominant precursor (GARP) and leucine-rich alpha2-glycoprotein (LRG), respectively. The cDNAs of GC-GARP and GC-LRG encoded 664 and 339 amino acid residues, respectively. GC-GARP and GC-LRG contain many distinct structural and/or functional motifs of the leucine-rich repeat (LRR) subfamily, such as multiple conserved 11-residue segments with the consensus sequence LxxLxLxxN/CxL (x can be any amino acid). The genes GC-GARP and GC-LRG consist of two exons, with 4,782 bp and 2,119 bp in total length, respectively. The first exon of each gene contains a small 5'-untranslated region and partial open reading frame. The putative promoter region of GC-GARP was found to contain transcription factor binding sites for GATA-1, IRF4, Oct-1, IRF-7, IRF-1, AP1, GATA-box and NFAT, and the promoter region of GC-LRG for MYC-MAX, MEIS1, ISRE, IK3, HOXA9 and C/EBP alpha. Phylogenetic analysis showed that GC-GARP and mammalian GARPs were clustered into one branch, while GC-LRG and mammalian LRGs were in another branch. The GC-GARP gene was only detected in head kidney, and GC-LRG in the liver, spleen and heart in the copepod (Sinergasilus major)-infected grass carp, indicating the induction of gene expression by the parasite infection. The results obtained in the present study provide insight into the structure of fish LRR genes, and further study should be carried out to understand the importance of LRR proteins in host-pathogen interactions.
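The 11-residue LRR consensus quoted above, LxxLxLxxN/CxL, maps directly onto a regular expression. The following is only a minimal sketch of how such a scan might look; the function name and the toy sequence are hypothetical and are not taken from GC-GARP or GC-LRG.

```python
import re

# Consensus from the abstract: LxxLxLxxN/CxL, where x is any amino acid.
# The lookahead wrapper lets overlapping segments be reported as well.
LRR_CONSENSUS = re.compile(r"(?=(L..L.L..[NC].L))")


def find_lrr_segments(protein):
    """Return (1-based start, 11-residue segment) for each consensus match."""
    return [(m.start() + 1, m.group(1))
            for m in LRR_CONSENSUS.finditer(protein.upper())]


# Made-up toy sequence for illustration only (not a real grass carp protein).
toy = "MKTLSALDLSGNQLAAALRRLDLSHNCLTT"

for start, segment in find_lrr_segments(toy):
    print(start, segment)
```

A strict pattern like this will miss degenerate repeats, so a real annotation would more likely rely on a profile-based search; the regex only illustrates what the consensus notation means.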
Molecular diagnosis in clinical parasitology: when and why?
Wong, Samson S Y; Fung, Kitty S C; Chau, Sandy; Poon, Rosana W S; Wong, Sally C Y; Yuen, Kwok-Yung
2014-11-01
Microscopic detection and morphological identification of parasites from clinical specimens are the gold standards for the laboratory diagnosis of parasitic infections. The limitations of such diagnostic assays include insufficient sensitivity and operator dependence. Immunoassays for parasitic antigens are not available for most parasitic infections and have not significantly improved the sensitivity of laboratory detection. Advances in molecular detection by nucleic acid amplification may improve the detection in asymptomatic infections with low parasitic burden. Rapidly accumulating genomic data on parasites allow the design of polymerase chain reaction (PCR) primers directed towards multi-copy gene targets, such as the ribosomal and mitochondrial genes, which further improve the sensitivity. Parasitic cell or its free circulating parasitic DNA can be shed from parasites into blood and excreta which may allow its detection without the whole parasite being present within the portion of clinical sample used for DNA extraction. Multiplex nucleic acid amplification technology allows the simultaneous detection of many parasitic species within a single clinical specimen. In addition to improved sensitivity, nucleic acid amplification with sequencing can help to differentiate different parasitic species at different stages with similar morphology, detect and speciate parasites from fixed histopathological sections and identify anti-parasitic drug resistance. The use of consensus primer and PCR sequencing may even help to identify novel parasitic species. The key limitation of molecular detection is the technological expertise and expense which are usually lacking in the field setting at highly endemic areas. However, such tests can be useful for screening important parasitic infections in asymptomatic patients, donors or recipients coming from endemic areas in the settings of transfusion service or tertiary institutions with transplantation service. 
Such tests can also be used for monitoring these recipients or highly immunosuppressed patients, so that early preemptive treatment can be given for reactivated parasitic infections while the parasitic burden is still low. © 2014 by the Society for Experimental Biology and Medicine.
Chip-based sequencing nucleic acids
Beer, Neil Reginald
2014-08-26
A system for fast DNA sequencing by amplification of genetic material within microreactors, denaturing, demulsifying, and then sequencing the material, while retaining it in a PCR/sequencing zone by a magnetic field. One embodiment includes sequencing nucleic acids on a microchip that includes a microchannel flow channel in the microchip. The nucleic acids are isolated and hybridized to magnetic nanoparticles or to magnetic polystyrene-coated beads. Microreactor droplets are formed in the microchannel flow channel. The microreactor droplets containing the nucleic acids and the magnetic nanoparticles are retained in a magnetic trap in the microchannel flow channel and sequenced.
LongISLND: in silico sequencing of lengthy and noisy datatypes
Lau, Bayo; Mohiyuddin, Marghoob; Mu, John C.; Fang, Li Tai; Bani Asadi, Narges; Dallett, Carolina; Lam, Hugo Y. K.
2016-01-01
Summary: LongISLND is a software package designed to simulate sequencing data according to the characteristics of third generation, single-molecule sequencing technologies. The general software architecture is easily extendable, as demonstrated by the emulation of Pacific Biosciences (PacBio) multi-pass sequencing with P5 and P6 chemistries, producing data in FASTQ, H5, and the latest PacBio BAM format. We demonstrate its utility by downstream processing with consensus building and variant calling. Availability and Implementation: LongISLND is implemented in Java and available at http://bioinform.github.io/longislnd Contact: hugo.lam@roche.com Supplementary information: Supplementary data are available at Bioinformatics online. PMID:27667791
Yefremova, Yelena; Al-Majdoub, Mahmoud; Opuni, Kwabena F M; Koy, Cornelia; Cui, Weidong; Yan, Yuetian; Gross, Michael L; Glocker, Michael O
2015-03-01
Mass spectrometric de-novo sequencing was applied to review the amino acid sequence of a commercially available recombinant protein G' with great scientific and economic importance. Substantial deviations from the published amino acid sequence (Uniprot Q54181) were found by the presence of 46 additional amino acids at the N-terminus, including a so-called "His-tag" as well as an N-terminal partial α-N-gluconoylation and α-N-phosphogluconoylation, respectively. The unexpected amino acid sequence of the commercial protein G' comprised 241 amino acids and resulted in a molecular mass of 25,998.9 ± 0.2 Da for the unmodified protein. Due to the higher mass that is caused by its extended amino acid sequence compared with the original protein G' (185 amino acids), we named this protein "protein G'e." By means of mass spectrometric peptide mapping, the suggested amino acid sequence, as well as the N-terminal partial α-N-gluconoylations, was confirmed with 100% sequence coverage. After the protein G'e sequence was determined, we were able to determine the expression vector pET-28b from Novagen with the Xho I restriction enzyme cleavage site as the best option that was used for cloning and expressing the recombinant protein G'e in E. coli. A dissociation constant (K(d)) value of 9.4 nM for protein G'e was determined thermophoretically, showing that the N-terminal flanking sequence extension did not cause significant changes in the binding affinity to immunoglobulins.
Thomsen, Martin Christen Frølund; Nielsen, Morten
2012-01-01
Seq2Logo is a web-based sequence logo generator. Sequence logos are a graphical representation of the information content stored in a multiple sequence alignment (MSA) and provide a compact and highly intuitive representation of the position-specific amino acid composition of binding motifs, active sites, etc. in biological sequences. Accurate generation of sequence logos is often compromised by sequence redundancy and low number of observations. Moreover, most methods available for sequence logo generation focus on displaying the position-specific enrichment of amino acids, discarding the equally valuable information related to amino acid depletion. Seq2logo aims at resolving these issues allowing the user to include sequence weighting to correct for data redundancy, pseudo counts to correct for low number of observations and different logotype representations each capturing different aspects related to amino acid enrichment and depletion. Besides allowing input in the format of peptides and MSA, Seq2Logo accepts input as Blast sequence profiles, providing easy access for non-expert end-users to characterize and identify functionally conserved/variable amino acids in any given protein of interest. The output from the server is a sequence logo and a PSSM. Seq2Logo is available at http://www.cbs.dtu.dk/biotools/Seq2Logo (14 May 2012, date last accessed). PMID:22638583
Khan, Ibrar; Qayyum, Sadia; Ahmed, Shehzad; Maqbool, Farhana; Tauseef, Isfahan; Haleem, Kashif Syed; Chi, Zhen-Ming
2017-03-20
In this study, a pyruvate carboxylase gene (PYC) from a marine fungus Penicillium viticola 152 isolated from marine algae was cloned and characterized by using the Genome Walking method. An open reading frame (ORF) of the PYC gene (accession number: KM593097) had 3582 bp encoding a 1193-amino-acid protein (isoelectric point: 5.01) with a calculated molecular weight of 131.2757 kDa. A putative promoter (intronless) of the gene was located at -666 bp and contained a TATA box, several CAAT boxes, the 5'-SYGGRG-3' and a 5'-HGATAR-3' sequences. A consensus polyadenylation site (AATAAA) was also observed at +10 bp downstream of the ORF. The protein deduced from the PYC gene had no signal peptide, was a homotetramer (4), and had the four functional domains. Furthermore, the PYC protein also had three potential N-linked glycosylation sites; among them, -N-S-T-I- at amino acid 36, -N-G-T-V- at amino acid 237, and -N-G-S-S- at amino acid 517 were the most likely N-glycosylation sites. After expression of the PYC gene of P. viticola 152 in medium supplemented with CSL and biotin, it was found that the specific pyruvate carboxylase activity in MA production medium supplemented with CSL was much higher (0.5 U/mg) than in MA medium supplemented with biotin (0.3 U/mg), suggesting that optimal concentration of CSL is required for increased expression of the PYC gene, which is responsible for high level production of malic acid in P. viticola 152 strain. Copyright © 2016 Elsevier B.V. All rights reserved.
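The ORF length, residue count, and molecular weight reported above can be cross-checked with simple arithmetic. The sketch below assumes that the 3582 bp figure includes the terminating stop codon (an assumption, though it is the reading that reproduces 1193 residues); the function name is hypothetical.

```python
def orf_summary(orf_bp, mass_kda):
    """Derive codon count, residue count and mean residue mass from an ORF.

    Assumption: orf_bp counts the stop codon, so residues = codons - 1.
    """
    if orf_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    codons = orf_bp // 3
    residues = codons - 1  # last codon assumed to be the stop codon
    mean_residue_da = mass_kda * 1000 / residues
    return {"codons": codons,
            "residues": residues,
            "mean_residue_da": mean_residue_da}


# Figures quoted in the abstract: 3582 bp ORF, 131.2757 kDa.
summary = orf_summary(3582, 131.2757)
print(summary)
```

Under that assumption the ORF gives 1194 codons and 1193 residues, and the mean residue mass comes out near 110 Da, which is a typical value for proteins.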
Integrated consensus genetic and physical maps of flax (Linum usitatissimum L.).
Cloutier, Sylvie; Ragupathy, Raja; Miranda, Evelyn; Radovanovic, Natasa; Reimer, Elsa; Walichnowski, Andrzej; Ward, Kerry; Rowland, Gordon; Duguid, Scott; Banik, Mitali
2012-12-01
Three linkage maps of flax (Linum usitatissimum L.) were constructed from populations CDC Bethune/Macbeth, E1747/Viking and SP2047/UGG5-5 containing between 385 and 469 mapped markers each. The first consensus map of flax was constructed incorporating 770 markers based on 371 shared markers including 114 that were shared by all three populations and 257 shared between any two populations. The 15 linkage group map corresponds to the haploid number of chromosomes of this species. The marker order of the consensus map was largely collinear in all three individual maps but a few local inversions and marker rearrangements spanning short intervals were observed. Segregation distortion was present in all linkage groups which contained 1-52 markers displaying non-Mendelian segregation. The total length of the consensus genetic map is 1,551 cM with a mean marker density of 2.0 cM. A total of 670 markers were anchored to 204 of the 416 fingerprinted contigs of the physical map corresponding to ~274 Mb or 74 % of the estimated flax genome size of 370 Mb. This high resolution consensus map will be a resource for comparative genomics, genome organization, evolution studies and anchoring of the whole genome shotgun sequence.
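The summary statistics in this abstract follow from a few of its own counts. As a small sanity-check sketch (variable names are illustrative only), the shared-marker total, the mean marker spacing, and the physical-map coverage can be recomputed as follows.

```python
# Counts quoted in the abstract.
shared_by_all_three = 114
shared_by_two = 257
consensus_markers = 770
map_length_cm = 1551
anchored_mb = 274
genome_mb = 370

# 114 + 257 should reproduce the 371 shared markers.
shared_total = shared_by_all_three + shared_by_two

# Mean spacing between markers on the consensus map, in cM per marker.
mean_spacing_cm = map_length_cm / consensus_markers

# Fraction of the estimated genome covered by anchored contigs.
coverage_pct = 100 * anchored_mb / genome_mb

print(shared_total, round(mean_spacing_cm, 1), round(coverage_pct))
```

The recomputed values agree with the reported 371 shared markers, 2.0 cM mean density, and 74 % coverage.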
Tanaka, Junko; Doi, Nobuhide; Takashima, Hideaki; Yanagawa, Hiroshi
2010-01-01
Screening of functional proteins from a random-sequence library has been used to evolve novel proteins in the field of evolutionary protein engineering. However, random-sequence proteins consisting of the 20 natural amino acids tend to aggregate, and the occurrence rate of functional proteins in a random-sequence library is low. From the viewpoint of the origin of life, it has been proposed that primordial proteins consisted of a limited set of amino acids that could have been abundantly formed early during chemical evolution. We have previously found that members of a random-sequence protein library constructed with five primitive amino acids show high solubility (Doi et al., Protein Eng Des Sel 2005;18:279–284). Although such a library is expected to be appropriate for finding functional proteins, the functionality may be limited, because they have no positively charged amino acid. Here, we constructed three libraries of 120-amino acid, random-sequence proteins using alphabets of 5, 12, and 20 amino acids by preselection using mRNA display (to eliminate sequences containing stop codons and frameshifts) and characterized and compared the structural properties of random-sequence proteins arbitrarily chosen from these libraries. We found that random-sequence proteins constructed with the 12-member alphabet (including five primitive amino acids and positively charged amino acids) have higher solubility than those constructed with the 20-member alphabet, though other biophysical properties are very similar in the two libraries. Thus, a library of moderate complexity constructed from 12 amino acids may be a more appropriate resource for functional screening than one constructed from 20 amino acids. PMID:20162614
Johnson, J A; Parra, G I; Levenson, E A; Green, K Y
2017-06-01
Historical outbreaks can be an important source of information in the understanding of norovirus evolution and epidemiology. Here, we revisit an outbreak of undiagnosed gastroenteritis that occurred in Shippensburg, Pennsylvania in 1972. Nearly 5000 people fell ill over the course of 10 days. Symptoms included diarrhea, vomiting, stomach cramps, and fever, lasting for a median of 24 h. Using current techniques, including next-generation sequencing of full-length viral genomic amplicons, we identified an unusual norovirus recombinant (GII.Pg/GII.3) in nine of 15 available stool samples from the outbreak. This particular recombinant virus has not been reported in recent decades, although GII.3 and GII.Pg genotypes have been detected individually in current epidemic strains. The consensus nucleotide sequences were nearly identical among the four viral genomes analysed, although each strain had three to seven positions in the genome with heterogenous non-synonymous nucleotide subpopulations. Two of these resulting amino acid polymorphisms were conserved in frequency among all four cases, consistent with common source exposure and successful transmission of a mixed viral population. Continued investigation of variant nucleotide populations and recombination events among ancestral norovirus strains such as the Shippensburg virus may provide unique insight into the origin of contemporary strains.
Consensus proposals for classification of the family Hepeviridae.
Smith, Donald B; Simmonds, Peter; Jameel, Shahid; Emerson, Suzanne U; Harrison, Tim J; Meng, Xiang-Jin; Okamoto, Hiroaki; Van der Poel, Wim H M; Purdy, Michael A
2014-10-01
The family Hepeviridae consists of positive-stranded RNA viruses that infect a wide range of mammalian species, as well as chickens and trout. A subset of these viruses infects humans and can cause a self-limiting acute hepatitis that may become chronic in immunosuppressed individuals. Current published descriptions of the taxonomical divisions within the family Hepeviridae are contradictory in relation to the assignment of species and genotypes. Through analysis of existing sequence information, we propose a taxonomic scheme in which the family is divided into the genera Orthohepevirus (all mammalian and avian hepatitis E virus (HEV) isolates) and Piscihepevirus (cutthroat trout virus). Species within the genus Orthohepevirus are designated Orthohepevirus A (isolates from human, pig, wild boar, deer, mongoose, rabbit and camel), Orthohepevirus B (isolates from chicken), Orthohepevirus C (isolates from rat, greater bandicoot, Asian musk shrew, ferret and mink) and Orthohepevirus D (isolates from bat). Proposals are also made for the designation of genotypes within the human and rat HEVs. This hierarchical system is congruent with hepevirus phylogeny, and the three classification levels (genus, species and genotype) are consistent with, and reflect discontinuities in the ranges of pairwise distances between amino acid sequences. Adoption of this system would include the avoidance of host names in taxonomic identifiers and provide a logical framework for the assignment of novel variants.
Acylation-dependent protein export in Leishmania.
Denny, P W; Gokool, S; Russell, D G; Field, M C; Smith, D F
2000-04-14
The surface of the protozoan parasite Leishmania is unusual in that it consists predominantly of glycosylphosphatidylinositol-anchored glycoconjugates and proteins. Additionally, a family of hydrophilic acylated surface proteins (HASPs) has been localized to the extracellular face of the plasma membrane in infective parasite stages. These surface polypeptides lack a recognizable endoplasmic reticulum secretory signal sequence, transmembrane spanning domain, or glycosylphosphatidylinositol-anchor consensus sequence, indicating that novel mechanisms are involved in their transport and localization. Here, we show that the N-terminal domain of HASPB contains primary structural information that directs both N-myristoylation and palmitoylation and is essential for correct localization of the protein to the plasma membrane. Furthermore, the N-terminal 18 amino acids of HASPB, encoding the dual acylation site, are sufficient to target the heterologous Aequorea victoria green fluorescent protein to the cell surface of Leishmania. Mutagenesis of the predicted acylated residues confirms that modification by both myristate and palmitate is required for correct trafficking. These data suggest that HASPB is a representative of a novel class of proteins whose translocation onto the surface of eukaryotic cells is dependent upon a "non-classical" pathway involving N-myristoylation/palmitoylation. Significantly, HASPB is also translocated on to the extracellular face of the plasma membrane of transfected mammalian cells, indicating that the export signal for HASPB is recognized by a higher eukaryotic export mechanism.
Jamroz, E; Paprocka, J; Sokół, M; Popowska, E; Ciara, E
2013-01-01
Ornithine transcarbamylase (OTC) deficiency, an X-linked, semidominant disorder, is the most common inherited defect in ureagenesis, resulting in hyperammonaemia type II. The OTC gene, localised on chromosome X, has been mapped to band Xp21.1, proximate to the Duchenne muscular dystrophy (DMD) gene. More than 350 different mutations, including missense, nonsense, splice-site changes, small deletions or insertions and gross deletions, have been described so far. Almost all mutations in consensus splicing sites confer a neonatal phenotype. Most mutations in the OTC gene are 'private' and are distributed throughout the gene with a paucity of mutation in the sequence encoding the leader peptide (exon 1 and beginning of exon 2) and in exon 7. They have familial origin or occur de novo. Even with sequencing of the entire reading frame and exon/intron boundaries, only about 80% of the mutations are detected in patients with proven OTC deficiency. The remainder probably occur within the introns or in regulatory domains. The authors present a 4-year-old boy with the unreported missense mutation c.802A>G. The nucleotide transition leads to amino acid substitution Met to Val at codon 268 of the OTC protein.
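The link between c.802A>G and codon 268 can be worked out from the coding-DNA position alone, since coding numbering starts at the A of the ATG start codon. The sketch below uses hypothetical helper names and a deliberately minimal codon table; it relies on ATG being the only codon for methionine in the standard genetic code.

```python
# Minimal codon table: only the two codons needed for this example.
CODON_TABLE = {"ATG": "Met", "GTG": "Val"}


def locate(cdna_pos):
    """Map a 1-based coding-DNA position to (codon number, position in codon)."""
    if cdna_pos < 1:
        raise ValueError("coding positions are 1-based")
    index = cdna_pos - 1
    return index // 3 + 1, index % 3 + 1


def substitute(codon, offset, ref, alt):
    """Apply a single-base substitution at a 1-based offset within a codon."""
    if codon[offset - 1] != ref:
        raise ValueError("reference base does not match codon")
    return codon[:offset - 1] + alt + codon[offset:]


codon_number, offset = locate(802)
mutant = substitute("ATG", offset, "A", "G")
print(codon_number, offset, CODON_TABLE["ATG"], "->", CODON_TABLE[mutant])
```

Position 802 falls on the first base of codon 268, so A>G turns ATG into GTG, consistent with the Met to Val substitution reported.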
Galeano, Carlos H.; Fernandez, Andrea C.; Franco-Herrera, Natalia; Cichy, Karen A.; McClean, Phillip E.; Vanderleyden, Jos; Blair, Matthew W.
2011-01-01
Map-based cloning and fine mapping to find genes of interest and marker assisted selection (MAS) requires good genetic maps with reproducible markers. In this study, we saturated the linkage map of the intra-gene pool population of common bean DOR364×BAT477 (DB) by evaluating 2,706 molecular markers including SSR, SNP, and gene-based markers. On average the polymorphism rate was 7.7% due to the narrow genetic base between the parents. The DB linkage map consisted of 291 markers with a total map length of 1,788 cM. A consensus map was built using the core mapping populations derived from inter-gene pool crosses: DOR364×G19833 (DG) and BAT93×JALO EEP558 (BJ). The consensus map consisted of a total of 1,010 markers mapped, with a total map length of 2,041 cM across 11 linkage groups. On average, each linkage group on the consensus map contained 91 markers of which 83% were single copy markers. Finally, a synteny analysis was carried out using our highly saturated consensus maps compared with the soybean pseudo-chromosome assembly. A total of 772 marker sequences were compared with the soybean genome. A total of 44 syntenic blocks were identified. The linkage group Pv6 presented the most diverse pattern of synteny with seven syntenic blocks, and Pv9 showed the most consistent relations with soybean with just two syntenic blocks. Additionally, a co-linear analysis using common bean transcript map information against soybean coding sequences (CDS) revealed the relationship with 787 soybean genes. The common bean consensus map has allowed us to map a larger number of markers, to obtain a more complete coverage of the common bean genome. Our results, combined with synteny relationships provide tools to increase marker density in selected genomic regions to identify closely linked polymorphic markers for indirect selection, fine mapping or for positional cloning. PMID:22174773
DOE Office of Scientific and Technical Information (OSTI.GOV)
Reiser, Steven E.; Somerville, Chris R.
The present invention relates to bacterial enzymes, in particular to an acyl-CoA reductase and a gene encoding an acyl-CoA reductase, the amino acid and nucleic acid sequences corresponding to the reductase polypeptide and gene, respectively, and to methods of obtaining such enzymes, amino acid sequences and nucleic acid sequences. The invention also relates to the use of such sequences to provide transgenic host cells capable of producing fatty alcohols and fatty aldehydes.
RNA-Seq analysis and transcriptome assembly for blackberry (Rubus sp. Var. Lochness) fruit.
Garcia-Seco, Daniel; Zhang, Yang; Gutierrez-Mañero, Francisco J; Martin, Cathie; Ramos-Solano, Beatriz
2015-01-22
There is an increasing interest in berries, especially blackberries in the diet, because of recent reports of their health benefits due to their high content of flavonoids. A broad range of genomic tools is available for other Rosaceae species but these tools are still lacking in the Rubus genus, thus limiting gene discovery and the breeding of improved varieties. De novo RNA-seq of ripe blackberries grown under field conditions was performed using Illumina HiSeq 2000. Almost 9 billion nucleotide bases were sequenced in total. Following assembly, 42,062 consensus sequences were detected. For functional annotation, 33,040 (NR), 32,762 (NT), 21,932 (Swiss-Prot), 20,134 (KEGG), 13,676 (COG), 24,168 (GO) consensus sequences were annotated using different databases; in total 34,552 annotated sequences were identified. For protein prediction analysis, the number of coding DNA sequences (CDS) that mapped to the protein database was 32,540. Non-redundant (NR) annotation showed that 25,418 genes (73.5%) have the highest similarity with Fragaria vesca subspecies vesca. Reanalysis was undertaken by aligning the reads with this reference genome for a deeper analysis of the transcriptome. We demonstrated that de novo assembly, using Trinity and later annotation with Blast using different databases, were complementary to alignment to the reference sequence using SOAPaligner/SOAP2. The Fragaria reference genome belongs to a species in the same family as blackberry (Rosaceae) but to a different genus. Since blackberries are tetraploids, the possibility of artefactual gene chimeras resulting from mis-assembly was tested with one of the genes sequenced by RNAseq, Chalcone Synthase (CHS). cDNAs encoding this protein were cloned and sequenced. Primers designed to the assembled sequences accurately distinguished different contigs, at least for chalcone synthase genes. We prepared and analysed transcriptome data from ripe blackberries, for which prior genomic information was limited.
This new sequence information will improve the knowledge of this important and healthy fruit, providing an invaluable new tool for biological research.
Viral Linkage in HIV-1 Seroconverters and Their Partners in an HIV-1 Prevention Clinical Trial
Campbell, Mary S.; Mullins, James I.; Hughes, James P.; Celum, Connie; Wong, Kim G.; Raugi, Dana N.; Sorensen, Stefanie; Stoddard, Julia N.; Zhao, Hong; Deng, Wenjie; Kahle, Erin; Panteleeff, Dana; Baeten, Jared M.; McCutchan, Francine E.; Albert, Jan; Leitner, Thomas; Wald, Anna; Corey, Lawrence; Lingappa, Jairam R.
2011-01-01
Background Characterization of viruses in HIV-1 transmission pairs will help identify biological determinants of infectiousness and evaluate candidate interventions to reduce transmission. Although HIV-1 sequencing is frequently used to substantiate linkage between newly HIV-1 infected individuals and their sexual partners in epidemiologic and forensic studies, viral sequencing is seldom applied in HIV-1 prevention trials. The Partners in Prevention HSV/HIV Transmission Study (ClinicalTrials.gov #NCT00194519) was a prospective randomized placebo-controlled trial that enrolled serodiscordant heterosexual couples to determine the efficacy of genital herpes suppression in reducing HIV-1 transmission; as part of the study analysis, HIV-1 sequences were examined for genetic linkage between seroconverters and their enrolled partners. Methodology/Principal Findings We obtained partial consensus HIV-1 env and gag sequences from blood plasma for 151 transmission pairs and performed deep sequencing of env in some cases. We analyzed sequences with phylogenetic techniques and developed a Bayesian algorithm to evaluate the probability of linkage. For linkage, we required monophyletic clustering between enrolled partners' sequences and a Bayesian posterior probability of ≥50%. Adjudicators classified each seroconversion, finding 108 (71.5%) linked, 40 (26.5%) unlinked, and 3 (2.0%) indeterminate transmissions, with linkage determined by consensus env sequencing in 91 (84%). Male seroconverters had a higher frequency of unlinked transmissions than female seroconverters. The likelihood of transmission from the enrolled partner was related to time on study, with increasing numbers of unlinked transmissions occurring after longer observation periods. Finally, baseline viral load was found to be significantly higher among linked transmitters. 
Conclusions/Significance In this first use of HIV-1 sequencing to establish endpoints in a large clinical trial, more than one-fourth of transmissions were unlinked to the enrolled partner, illustrating the relevance of these methods in the design of future HIV-1 prevention trials in serodiscordant couples. A hierarchy of sequencing techniques, analysis methods, and expert adjudication contributed to the linkage determination process. PMID:21399681
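The stated linkage requirement (monophyletic clustering plus a Bayesian posterior probability of at least 50%) and the reported proportions can be illustrated with a short sketch. The function name is hypothetical, and the study's final classifications also involved expert adjudication, which this does not model.

```python
def meets_linkage_criteria(monophyletic, posterior):
    """Apply the two sequence-based requirements quoted in the abstract."""
    return monophyletic and posterior >= 0.50


# Adjudicated counts quoted in the abstract (151 seroconversions in total).
counts = {"linked": 108, "unlinked": 40, "indeterminate": 3}
total = sum(counts.values())
percentages = {k: round(100 * v / total, 1) for k, v in counts.items()}

# Share of linked pairs determined by consensus env sequencing (91 of 108).
env_share = round(100 * 91 / counts["linked"])

print(total, percentages, env_share)
```

The recomputed figures match the reported 71.5%, 26.5%, 2.0%, and 84%.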
BGL7 beta-glucosidase and nucleic acids encoding the same
Dunn-Coleman, Nigel; Ward, Michael
2013-01-29
The present invention provides a novel β-glucosidase nucleic acid sequence, designated bgl7, and the corresponding BGL7 amino acid sequence. The invention also provides expression vectors and host cells comprising a nucleic acid sequence encoding BGL7, recombinant BGL7 proteins and methods for producing the same.
BGL6 β-glucosidase and nucleic acids encoding the same
Dunn-Coleman, Nigel; Ward, Michael
2012-10-02
The present invention provides a novel β-glucosidase nucleic acid sequence, designated bgl6, and the corresponding BGL6 amino acid sequence. The invention also provides expression vectors and host cells comprising a nucleic acid sequence encoding BGL6, recombinant BGL6 proteins and methods for producing the same.
BGL5 β-glucosidase and nucleic acids encoding the same
Dunn-Coleman, Nigel; Goedegebuur, Frits; Ward, Michael; Yao, Jian
2006-02-28
The present invention provides a novel β-glucosidase nucleic acid sequence, designated bgl5, and the corresponding BGL5 amino acid sequence. The invention also provides expression vectors and host cells comprising a nucleic acid sequence encoding BGL5, recombinant BGL5 proteins and methods for producing the same.
BGL5 β-glucosidase and nucleic acids encoding the same
Dunn-Coleman, Nigel [Los Gatos, CA]; Goedegebuur, Frits [Vlaardingen, NL]; Ward, Michael [San Francisco, CA]; Yao, Jian [Sunnyvale, CA]
2008-03-18
The present invention provides a novel β-glucosidase nucleic acid sequence, designated bgl5, and the corresponding BGL5 amino acid sequence. The invention also provides expression vectors and host cells comprising a nucleic acid sequence encoding BGL5, recombinant BGL5 proteins and methods for producing the same.
BGL6 beta-glucosidase and nucleic acids encoding the same
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dunn-Coleman, Nigel; Ward, Michael
The present invention provides a novel β-glucosidase nucleic acid sequence, designated bgl6, and the corresponding BGL6 amino acid sequence. The invention also provides expression vectors and host cells comprising a nucleic acid sequence encoding BGL6, recombinant BGL6 proteins and methods for producing the same.
BGL6 beta-glucosidase and nucleic acids encoding the same
Dunn-Coleman, Nigel; Ward, Michael
2014-03-04
The present invention provides a novel β-glucosidase nucleic acid sequence, designated bgl6, and the corresponding BGL6 amino acid sequence. The invention also provides expression vectors and host cells comprising a nucleic acid sequence encoding BGL6, recombinant BGL6 proteins and methods for producing the same.