Sample records for residues sequence analysis

  1. PFAAT version 2.0: a tool for editing, annotating, and analyzing multiple sequence alignments.

    PubMed

    Caffrey, Daniel R; Dana, Paul H; Mathur, Vidhya; Ocano, Marco; Hong, Eun-Jong; Wang, Yaoyu E; Somaroo, Shyamal; Caffrey, Brian E; Potluri, Shobha; Huang, Enoch S

    2007-10-11

    By virtue of their shared ancestry, homologous sequences are similar in their structure and function. Consequently, multiple sequence alignments are routinely used to identify trends that relate to function. This type of analysis is particularly productive when it is combined with structural and phylogenetic analysis. Here we describe the release of PFAAT version 2.0, a tool for editing, analyzing, and annotating multiple sequence alignments. Support for multiple annotations is a key component of this release as it provides a framework for most of the new functionalities. The sequence annotations are accessible from the alignment and tree, where they are typically used to label sequences or hyperlink them to related databases. Sequence annotations can be created manually or extracted automatically from UniProt entries. Once a multiple sequence alignment is populated with sequence annotations, sequences can be easily selected and sorted through a sophisticated search dialog. The selected sequences can be further analyzed using statistical methods that explicitly model relationships between the sequence annotations and residue properties. Residue annotations are accessible from the alignment viewer and are typically used to designate binding sites or properties for a particular residue. Residue annotations are also searchable, and allow one to quickly select alignment columns for further sequence analysis, e.g. computing percent identities. Other features include: novel algorithms to compute sequence conservation, mapping conservation scores to a 3D structure in Jmol, displaying secondary structure elements, and sorting sequences by residue composition. PFAAT provides a framework whereby end-users can specify knowledge for a protein family in the form of annotation. The annotations can be combined with sophisticated analysis to test hypotheses that relate to sequence, structure and function.
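
    The abstract mentions selecting annotated alignment columns and computing percent identities over them. The following is a minimal sketch of that idea, not PFAAT's implementation: the function name, the toy alignment, and the column indices are hypothetical, and skipping gapped columns is just one common convention.

```python
from itertools import combinations


def percent_identity(seq_a, seq_b, columns=None, gap="-"):
    """Percent identity between two aligned sequences over chosen columns.

    Columns where either sequence has a gap are skipped (an assumed
    convention; other tools count gaps differently).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    cols = range(len(seq_a)) if columns is None else columns
    compared = 0
    matches = 0
    for i in cols:
        a, b = seq_a[i], seq_b[i]
        if a == gap or b == gap:
            continue
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Toy alignment; names and sequences are made up for illustration.
alignment = {
    "seq1": "MKT-LLV",
    "seq2": "MKSALLV",
    "seq3": "MRT-LIV",
}
annotated_columns = [0, 1, 2, 5]  # hypothetical "binding site" columns

for (name_a, a), (name_b, b) in combinations(alignment.items(), 2):
    print(name_a, name_b, percent_identity(a, b, annotated_columns))
```

    Restricting the comparison to annotated columns lets the identity reflect only the positions of interest rather than the whole alignment.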

  2. Annotating Protein Functional Residues by Coupling High-Throughput Fitness Profile and Homologous-Structure Analysis

    PubMed Central

    Du, Yushen; Wu, Nicholas C.; Jiang, Lin; Zhang, Tianhao; Gong, Danyang; Shu, Sara; Wu, Ting-Ting

    2016-01-01

    Identification and annotation of functional residues are fundamental questions in protein sequence analysis. Sequence and structure conservation provides valuable information to tackle these questions. It is, however, limited by the incomplete sampling of sequence space in natural evolution. Moreover, proteins often have multiple functions, with overlapping sequences that present challenges to accurate annotation of the exact functions of individual residues by conservation-based methods. Using the influenza A virus PB1 protein as an example, we developed a method to systematically identify and annotate functional residues. We used saturation mutagenesis and high-throughput sequencing to measure the replication capacity of single nucleotide mutations across the entire PB1 protein. After predicting protein stability upon mutations, we identified functional PB1 residues that are essential for viral replication. To further annotate the functional residues important to the canonical or noncanonical functions of viral RNA-dependent RNA polymerase (vRdRp), we performed a homologous-structure analysis with 16 different vRdRp structures. We achieved high sensitivity in annotating the known canonical polymerase functional residues. Moreover, we identified a cluster of noncanonical functional residues located in the loop region of the PB1 β-ribbon. We further demonstrated that these residues were important for PB1 protein nuclear import through the interaction with Ran-binding protein 5. In summary, we developed a systematic and sensitive method to identify and annotate functional residues that are not restrained by sequence conservation. Importantly, this method is generally applicable to other proteins about which homologous-structure information is available. PMID:27803181

  3. Active Site Characterization of Proteases Sequences from Different Species of Aspergillus.

    PubMed

    Morya, V K; Yadav, Virendra K; Yadav, Sangeeta; Yadav, Dinesh

    2016-09-01

    A total of 129 protease sequences comprising 43 serine proteases, 36 aspartic proteases, 24 cysteine proteases, 21 metalloproteases, and 5 neutral proteases from different Aspergillus species were analyzed for catalytically active site residues using the MEROPS database and various bioinformatics tools. Different proteases show a predominance of different active site residues. For the 24 cysteine proteases of Aspergilli, the predominant active site residues observed were Gln193, Cys199, His364, and Asn384, while for the 43 serine proteases, the active site residues Asp164, His193, Asn284, Ser349 and Asp325, His357, Asn454, Ser519 were frequently observed. The analysis of the 21 metalloproteases of Aspergilli revealed Glu298 and Glu388, Tyr476 as predominant active site residues. In general, Aspergilli species-specific active site residues were observed for the different types of protease sequences analyzed. The phylogenetic analysis of these 129 protease sequences revealed 14 different clans representing different types of proteases with diverse active site residues.

  4. Sequence Complexity of Amyloidogenic Regions in Intrinsically Disordered Human Proteins

    PubMed Central

    Das, Swagata; Pal, Uttam; Das, Supriya; Bagga, Khyati; Roy, Anupam; Mrigwani, Arpita; Maiti, Nakul C.

    2014-01-01

    An amyloidogenic region (AR) in a protein sequence plays a significant role in protein aggregation and amyloid formation. We have investigated the sequence complexity of ARs that are present in intrinsically disordered human proteins. More than 80% of the human proteins in the disordered protein databases (DisProt+IDEAL) contained one or more ARs. As protein disorder decreased, the AR content of the protein sequence also decreased. A probability density distribution analysis and discrete analysis of AR sequences showed that ∼8% of the residues in a protein sequence were in ARs and that these regions were on average 8 residues long. The residues in the ARs were high in sequence complexity, and the ARs seldom overlapped with low complexity regions (LCRs), which were abundant in disordered proteins. The sequences in the ARs showed mixed conformational adaptability towards α-helix, β-sheet/strand and coil conformations. PMID:24594841

  5. Annotating Protein Functional Residues by Coupling High-Throughput Fitness Profile and Homologous-Structure Analysis.

    PubMed

    Du, Yushen; Wu, Nicholas C; Jiang, Lin; Zhang, Tianhao; Gong, Danyang; Shu, Sara; Wu, Ting-Ting; Sun, Ren

    2016-11-01

    Identification and annotation of functional residues are fundamental questions in protein sequence analysis. Sequence and structure conservation provides valuable information to tackle these questions. It is, however, limited by the incomplete sampling of sequence space in natural evolution. Moreover, proteins often have multiple functions, with overlapping sequences that present challenges to accurate annotation of the exact functions of individual residues by conservation-based methods. Using the influenza A virus PB1 protein as an example, we developed a method to systematically identify and annotate functional residues. We used saturation mutagenesis and high-throughput sequencing to measure the replication capacity of single nucleotide mutations across the entire PB1 protein. After predicting protein stability upon mutations, we identified functional PB1 residues that are essential for viral replication. To further annotate the functional residues important to the canonical or noncanonical functions of viral RNA-dependent RNA polymerase (vRdRp), we performed a homologous-structure analysis with 16 different vRdRp structures. We achieved high sensitivity in annotating the known canonical polymerase functional residues. Moreover, we identified a cluster of noncanonical functional residues located in the loop region of the PB1 β-ribbon. We further demonstrated that these residues were important for PB1 protein nuclear import through the interaction with Ran-binding protein 5. In summary, we developed a systematic and sensitive method to identify and annotate functional residues that are not restrained by sequence conservation. Importantly, this method is generally applicable to other proteins about which homologous-structure information is available. To fully comprehend the diverse functions of a protein, it is essential to understand the functionality of individual residues. 
Current methods are highly dependent on evolutionary sequence conservation, which is usually limited by sampling size. Sequence conservation-based methods are further confounded by structural constraints and multifunctionality of proteins. Here we present a method that can systematically identify and annotate functional residues of a given protein. We used a high-throughput functional profiling platform to identify essential residues. Coupling it with homologous-structure comparison, we were able to annotate multiple functions of proteins. We demonstrated the method with the PB1 protein of influenza A virus and identified novel functional residues in addition to its canonical function as an RNA-dependent RNA polymerase. Not limited to virology, this method is generally applicable to other proteins that can be functionally selected and about which homologous-structure information is available. Copyright © 2016 Du et al.

  6. Characterizing protein conformations by correlation analysis of coarse-grained contact matrices.

    PubMed

    Lindsay, Richard J; Siess, Jan; Lohry, David P; McGee, Trevor S; Ritchie, Jordan S; Johnson, Quentin R; Shen, Tongye

    2018-01-14

    We have developed a method to capture the essential conformational dynamics of folded biopolymers using statistical analysis of coarse-grained segment-segment contacts. Previously, the residue-residue contact analysis of simulation trajectories was successfully applied to the detection of conformational switching motions in biomolecular complexes. However, the application to large protein systems (larger than 1000 amino acid residues) is challenging using the description of residue contacts. Also, the residue-based method cannot be used to compare proteins with different sequences. To expand the scope of the method, we have tested several coarse-graining schemes that group a collection of consecutive residues into a segment. The definition of these segments may be derived from structural and sequence information, while the interaction strength of the coarse-grained segment-segment contacts is a function of the residue-residue contacts. We then perform covariance calculations on these coarse-grained contact matrices. We monitored how well the principal components of the contact matrices are preserved using various rendering functions. The new method was demonstrated to assist the reduction of the degrees of freedom for describing the conformation space, and it potentially allows for the analysis of a system that is approximately tenfold larger compared with the corresponding residue contact-based method. This method can also render a family of similar proteins into the same conformational space, and thus can be used to compare the structures of proteins with different sequences.
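
    The coarse-graining step described above (grouping consecutive residues into segments, then deriving segment-segment contact strength from residue-residue contacts) can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the rendering function used here (fraction of residue pairs in contact) is only one plausible choice, and the function names and toy frames are hypothetical.

```python
def coarse_grain(contacts, segments):
    """Collapse a residue-residue contact matrix into a segment-segment matrix.

    contacts: square list of lists holding 0/1 residue contacts.
    segments: list of (start, stop) half-open residue index ranges.
    Each entry is the fraction of residue pairs in contact between the two
    segments (an assumed rendering function; others are possible).
    """
    n = len(segments)
    cg = [[0.0] * n for _ in range(n)]
    for i, (a0, a1) in enumerate(segments):
        for j, (b0, b1) in enumerate(segments):
            total = sum(contacts[r][s]
                        for r in range(a0, a1)
                        for s in range(b0, b1))
            cg[i][j] = total / ((a1 - a0) * (b1 - b0))
    return cg


def contact_covariance(cg_frames):
    """Population covariance of flattened coarse-grained contacts over frames."""
    flat = [[v for row in frame for v in row] for frame in cg_frames]
    n_frames = len(flat)
    n_feat = len(flat[0])
    means = [sum(f[k] for f in flat) / n_frames for k in range(n_feat)]
    return [[sum((f[p] - means[p]) * (f[q] - means[q]) for f in flat) / n_frames
             for q in range(n_feat)]
            for p in range(n_feat)]


# Two toy trajectory frames for a 4-residue chain (made-up data).
frame_a = [[0, 1, 1, 0],
           [1, 0, 0, 1],
           [1, 0, 0, 1],
           [0, 1, 1, 0]]
frame_b = [[0, 1, 0, 0],
           [1, 0, 0, 0],
           [0, 0, 0, 1],
           [0, 0, 1, 0]]
segments = [(0, 2), (2, 4)]  # two segments of two residues each

cg_frames = [coarse_grain(f, segments) for f in (frame_a, frame_b)]
print(cg_frames[0])  # [[0.5, 0.5], [0.5, 0.5]]
print(contact_covariance(cg_frames)[1][2])  # 0.0625
```

    The principal components mentioned in the abstract would come from diagonalizing this covariance matrix; that step is omitted here to keep the sketch dependency-free.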

  7. Characterizing protein conformations by correlation analysis of coarse-grained contact matrices

    NASA Astrophysics Data System (ADS)

    Lindsay, Richard J.; Siess, Jan; Lohry, David P.; McGee, Trevor S.; Ritchie, Jordan S.; Johnson, Quentin R.; Shen, Tongye

    2018-01-01

    We have developed a method to capture the essential conformational dynamics of folded biopolymers using statistical analysis of coarse-grained segment-segment contacts. Previously, the residue-residue contact analysis of simulation trajectories was successfully applied to the detection of conformational switching motions in biomolecular complexes. However, the application to large protein systems (larger than 1000 amino acid residues) is challenging using the description of residue contacts. Also, the residue-based method cannot be used to compare proteins with different sequences. To expand the scope of the method, we have tested several coarse-graining schemes that group a collection of consecutive residues into a segment. The definition of these segments may be derived from structural and sequence information, while the interaction strength of the coarse-grained segment-segment contacts is a function of the residue-residue contacts. We then perform covariance calculations on these coarse-grained contact matrices. We monitored how well the principal components of the contact matrices are preserved using various rendering functions. The new method was demonstrated to assist the reduction of the degrees of freedom for describing the conformation space, and it potentially allows for the analysis of a system that is approximately tenfold larger compared with the corresponding residue contact-based method. This method can also render a family of similar proteins into the same conformational space, and thus can be used to compare the structures of proteins with different sequences.

  8. Direct identification of non-polio enteroviruses in residual paralysis cases by analysis of VP1 sequences.

    PubMed

    Rahimi, Pooneh; Tabatabaie, H; Gouya, Mohammad M; Mahmudi, M; Musavi, T; Rad, K Samimi; Azad, T Mokhtari; Nategh, R

    2009-06-01

    The 66 serotypes of human enteroviruses (EVs) are classified into four species A-D, based on phylogenetic relationships in multiple genome regions. Partial VP1 amplification and sequence analysis are reliable methods for identifying non-polio enterovirus serotypes, especially in negative cell culture specimens from patients with residual paralysis. In Iran during the years 2000-2002, there were 29 residual paralysis cases with negative cell (RD, HEp-2 and L20B) culture results. The genomic RNA was extracted from stool specimens from cases of residual paralysis and detected by amplification of the 5'-nontranslated region using RT-PCR with Pan-EV primers. Partial VP1 amplification by semi-nested RT-PCR (snRT-PCR) and sequence analysis were done. Specimens from the 29 culture-negative cases contained echoviruses of six different serotypes. The global eradication of wild polioviruses is near and study of non-polio enteroviruses, which can cause poliomyelitis, is increasingly important to understand their pathogenesis. The VP1 sequences, derived from the snRT-PCR products, allowed rapid molecular analysis of these non-polio strains.

  9. Combining protein sequence, structure, and dynamics: A novel approach for functional evolution analysis of PAS domain superfamily.

    PubMed

    Dong, Zheng; Zhou, Hongyu; Tao, Peng

    2018-02-01

    PAS domains are widespread in archaea, bacteria, and eukaryota, and play important roles in various functions. In this study, we aim to explore the functional evolutionary relationships among proteins in the PAS domain superfamily in view of the sequence-structure-dynamics-function relationship. We collected protein sequences and crystal structure data from the RCSB Protein Data Bank for PAS domain superfamily members belonging to three biological functions (nucleotide binding, photoreceptor activity, and transferase activity). Protein sequences were aligned and then used to select sequence-conserved residues and build a phylogenetic tree. Three-dimensional structure alignment was also applied to obtain structure-conserved residues. The protein dynamics were analyzed using an elastic network model (ENM) and validated by molecular dynamics (MD) simulation. The results showed that proteins with the same function could be grouped by sequence similarity, and proteins in different functional groups displayed statistically significant differences in their vibrational patterns. Interestingly, in all three functional groups, conserved amino acid residues identified by sequence and structure conservation analysis generally have a lower fluctuation than other residues. In addition, the fluctuation of conserved residues in each biological function group was strongly correlated with the corresponding biological function. This research suggested a direct connection in which the protein sequences were related to various functions through structural dynamics. This is a new attempt to delineate functional evolution of proteins using the integrated information of sequence, structure, and dynamics. © 2017 The Protein Society.

  10. Complete amino acid sequence of the myoglobin from the Pacific sei whale, Balaenoptera borealis.

    PubMed

    Jones, B N; Rothgeb, T M; England, R D; Gurd, F R

    1979-04-25

    The complete amino acid sequence of the major component myoglobin from Pacific sei whale, Balaenoptera borealis, was determined by specific cleavage of the protein to obtain large peptides which are readily degraded by the automatic sequencer. The acetimidated apomyoglobin was selectively cleaved at its two methionyl residues with cyanogen bromide and at its three arginyl residues by trypsin. From the sequence analysis of four of these peptides and the apomyoglobin, over 75% of the covalent structure of the protein was obtained. The remainder of the primary structure was determined by the sequence analysis of peptides that resulted from further digestion of the amino-terminal and central cyanogen bromide fragments. The amino-terminal fragment was specifically cleaved at its two tryptophanyl residues with N-chlorosuccinimide and the central cyanogen bromide fragment was cleaved at its glutamyl residues with staphylococcal protease and at its single tyrosyl residue with N-bromosuccinimide. The primary structure of this myoglobin proved identical with that from the gray whale but differs from that of the finback whale at four positions, from that of the minke whale at three positions and from the myoglobin of the humpback whale at one position. The above sequence identities and differences reflect the close taxonomic relationship of these five species of Cetacea.

  11. Conservation of coevolving protein interfaces bridges prokaryote–eukaryote homologies in the twilight zone

    PubMed Central

    Rodriguez-Rivas, Juan; Marsili, Simone; Juan, David; Valencia, Alfonso

    2016-01-01

    Protein–protein interactions are fundamental for the proper functioning of the cell. As a result, protein interaction surfaces are subject to strong evolutionary constraints. Recent developments have shown that residue coevolution provides accurate predictions of heterodimeric protein interfaces from sequence information. So far these approaches have been limited to the analysis of families of prokaryotic complexes for which large multiple sequence alignments of homologous sequences can be compiled. We explore the hypothesis that coevolution points to structurally conserved contacts at protein–protein interfaces, which can be reliably projected to homologous complexes with distantly related sequences. We introduce a domain-centered protocol to study the interplay between residue coevolution and structural conservation of protein–protein interfaces. We show that sequence-based coevolutionary analysis systematically identifies residue contacts at prokaryotic interfaces that are structurally conserved at the interface of their eukaryotic counterparts. In turn, this allows the prediction of conserved contacts at eukaryotic protein–protein interfaces with high confidence using solely mutational patterns extracted from prokaryotic genomes. Even in the context of high divergence in sequence (the twilight zone), where standard homology modeling of protein complexes is unreliable, our approach provides sequence-based accurate information about specific details of protein interactions at the residue level. Selected examples of the application of prokaryotic coevolutionary analysis to the prediction of eukaryotic interfaces further illustrate the potential of this approach. PMID:27965389

  12. Conservation of coevolving protein interfaces bridges prokaryote-eukaryote homologies in the twilight zone.

    PubMed

    Rodriguez-Rivas, Juan; Marsili, Simone; Juan, David; Valencia, Alfonso

    2016-12-27

    Protein-protein interactions are fundamental for the proper functioning of the cell. As a result, protein interaction surfaces are subject to strong evolutionary constraints. Recent developments have shown that residue coevolution provides accurate predictions of heterodimeric protein interfaces from sequence information. So far these approaches have been limited to the analysis of families of prokaryotic complexes for which large multiple sequence alignments of homologous sequences can be compiled. We explore the hypothesis that coevolution points to structurally conserved contacts at protein-protein interfaces, which can be reliably projected to homologous complexes with distantly related sequences. We introduce a domain-centered protocol to study the interplay between residue coevolution and structural conservation of protein-protein interfaces. We show that sequence-based coevolutionary analysis systematically identifies residue contacts at prokaryotic interfaces that are structurally conserved at the interface of their eukaryotic counterparts. In turn, this allows the prediction of conserved contacts at eukaryotic protein-protein interfaces with high confidence using solely mutational patterns extracted from prokaryotic genomes. Even in the context of high divergence in sequence (the twilight zone), where standard homology modeling of protein complexes is unreliable, our approach provides sequence-based accurate information about specific details of protein interactions at the residue level. Selected examples of the application of prokaryotic coevolutionary analysis to the prediction of eukaryotic interfaces further illustrate the potential of this approach.

  13. Potential ligand-binding residues in rat olfactory receptors identified by correlated mutation analysis

    NASA Technical Reports Server (NTRS)

    Singer, M. S.; Oliveira, L.; Vriend, G.; Shepherd, G. M.

    1995-01-01

    A family of G-protein-coupled receptors is believed to mediate the recognition of odor molecules. In order to identify potential ligand-binding residues, we have applied correlated mutation analysis to receptor sequences from the rat. This method identifies pairs of sequence positions where residues remain conserved or mutate in tandem, thereby suggesting structural or functional importance. The analysis supported molecular modeling studies in suggesting several residues in positions that were consistent with ligand-binding function. Two of these positions, dominated by histidine residues, may play important roles in ligand binding and could confer broad specificity to mammalian odor receptors. The presence of positive (overdominant) selection at some of the identified positions provides additional evidence for roles in ligand binding. Higher-order groups of correlated residues were also observed. Each group may interact with an individual ligand determinant, and combinations of these groups may provide a multi-dimensional mechanism for receptor diversity.
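
    To illustrate the idea of scoring pairs of alignment positions that mutate in tandem, here is a small sketch that uses mutual information between columns. This is a swapped-in, commonly used covariation measure and not necessarily the correlated mutation score used in the study. Note also that mutual information is zero for fully conserved columns, so it captures only the "mutate in tandem" half of the description above. The toy alignment and function names are hypothetical.

```python
from collections import Counter
from math import log2


def mutual_information(col_i, col_j):
    """Mutual information (in bits) between two alignment columns."""
    n = len(col_i)
    p_i = Counter(col_i)
    p_j = Counter(col_j)
    p_ij = Counter(zip(col_i, col_j))
    return sum((c / n) * log2((c / n) / ((p_i[a] / n) * (p_j[b] / n)))
               for (a, b), c in p_ij.items())


def column(alignment, index):
    """Extract one column from a list of equal-length aligned sequences."""
    return [seq[index] for seq in alignment]


# Toy alignment (made-up): columns 0 and 1 change together, column 2 is fixed.
alignment = ["AHK", "AHK", "GDK", "GDK"]

print(mutual_information(column(alignment, 0), column(alignment, 1)))  # 1.0
print(mutual_information(column(alignment, 0), column(alignment, 2)))  # 0.0
```

    In practice, such scores are computed for all column pairs and the top-ranked pairs are inspected; small alignments inflate these scores, so real analyses usually apply corrections that are not shown here.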

  14. StralSV: assessment of sequence variability within similar 3D structures and application to polio RNA-dependent RNA polymerase.

    PubMed

    Zemla, Adam T; Lang, Dorothy M; Kostova, Tanya; Andino, Raul; Ecale Zhou, Carol L

    2011-06-02

    Most of the currently used methods for protein function prediction rely on sequence-based comparisons between a query protein and those for which a functional annotation is provided. A serious limitation of sequence similarity-based approaches for identifying residue conservation among proteins is the low confidence in assigning residue-residue correspondences among proteins when the level of sequence identity between the compared proteins is poor. Multiple sequence alignment methods are more satisfactory--still, they cannot provide reliable results at low levels of sequence identity. Our goal in the current work was to develop an algorithm that could help overcome these difficulties by facilitating the identification of structurally (and possibly functionally) relevant residue-residue correspondences between compared protein structures. Here we present StralSV (structure-alignment sequence variability), a new algorithm for detecting closely related structure fragments and quantifying residue frequency from tight local structure alignments. We apply StralSV in a study of the RNA-dependent RNA polymerase of poliovirus, and we demonstrate that the algorithm can be used to determine regions of the protein that are relatively unique, or that share structural similarity with proteins that would be considered distantly related. By quantifying residue frequencies among many residue-residue pairs extracted from local structural alignments, one can infer potential structural or functional importance of specific residues that are determined to be highly conserved or that deviate from a consensus. We further demonstrate that considerable detailed structural and phylogenetic information can be derived from StralSV analyses. StralSV is a new structure-based algorithm for identifying and aligning structure fragments that have similarity to a reference protein. 
StralSV analysis can be used to quantify residue-residue correspondences and identify residues that may be of particular structural or functional importance, as well as unusual or unexpected residues at a given sequence position. StralSV is provided as a web service at http://proteinmodel.org/AS2TS/STRALSV/.
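
    The residue-frequency step described above (quantifying residue frequencies at corresponding positions and flagging residues that deviate from a consensus) can be sketched as below. This is not the StralSV algorithm: it assumes the structurally aligned fragments are already available as equal-length strings, and the function names, threshold, and toy fragments are hypothetical.

```python
from collections import Counter


def residue_frequencies(fragments, gap="-"):
    """Per-position residue frequencies across equal-length aligned fragments."""
    profile = []
    for col in zip(*fragments):
        counts = Counter(r for r in col if r != gap)
        total = sum(counts.values())
        profile.append({r: c / total for r, c in counts.items()} if total else {})
    return profile


def deviations(reference, profile, threshold=0.5):
    """Positions where the reference residue differs from a dominant consensus.

    Returns (position, reference residue, consensus residue, consensus frequency).
    The 0.5 threshold is an arbitrary illustrative default.
    """
    flagged = []
    for pos, (res, freqs) in enumerate(zip(reference, profile)):
        if not freqs:
            continue
        consensus, freq = max(freqs.items(), key=lambda kv: kv[1])
        if freq >= threshold and res != consensus:
            flagged.append((pos, res, consensus, freq))
    return flagged


# Made-up fragments standing in for residues extracted from local structure alignments.
fragments = ["GDD", "GDD", "GDN", "ADD"]
reference = "GDN"

profile = residue_frequencies(fragments)
print(profile[0])                      # {'G': 0.75, 'A': 0.25}
print(deviations(reference, profile))  # [(2, 'N', 'D', 0.75)]
```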

  15. ComplexContact: a web server for inter-protein contact prediction using deep learning.

    PubMed

    Zeng, Hong; Wang, Sheng; Zhou, Tianming; Zhao, Feifeng; Li, Xiufeng; Wu, Qing; Xu, Jinbo

    2018-05-22

    ComplexContact (http://raptorx2.uchicago.edu/ComplexContact/) is a web server for sequence-based interfacial residue-residue contact prediction of a putative protein complex. Interfacial residue-residue contacts are critical for understanding how proteins form complexes and interact at the residue level. When receiving a pair of protein sequences, ComplexContact first searches for their sequence homologs and builds two paired multiple sequence alignments (MSAs); it then applies co-evolution analysis and a CASP-winning deep learning (DL) method to predict interfacial contacts from the paired MSAs and visualizes the prediction as an image. The DL method was originally developed for intra-protein contact prediction and performed the best in CASP12. Our large-scale experimental test further shows that ComplexContact greatly outperforms pure co-evolution methods for inter-protein contact prediction, regardless of the species.

  16. Identification and Analysis of Novel Amino-Acid Sequence Repeats in Bacillus anthracis str. Ames Proteome Using Computational Tools

    PubMed Central

    Hemalatha, G. R.; Rao, D. Satyanarayana; Guruprasad, L.

    2007-01-01

    We have identified four repeats and ten domains that are novel in proteins encoded by the Bacillus anthracis str. Ames proteome using automated in silico methods. A “repeat” corresponds to a region comprising less than 55-amino-acid residues that occur more than once in the protein sequence and sometimes present in tandem. A “domain” corresponds to a conserved region with greater than 55-amino-acid residues and may be present as single or multiple copies in the protein sequence. These correspond to (1) 57-amino-acid-residue PxV domain, (2) 122-amino-acid-residue FxF domain, (3) 111-amino-acid-residue YEFF domain, (4) 109-amino-acid-residue IMxxH domain, (5) 103-amino-acid-residue VxxT domain, (6) 84-amino-acid-residue ExW domain, (7) 104-amino-acid-residue NTGFIG domain, (8) 36-amino-acid-residue NxGK repeat, (9) 95-amino-acid-residue VYV domain, (10) 75-amino-acid-residue KEWE domain, (11) 59-amino-acid-residue AFL domain, (12) 53-amino-acid-residue RIDVK repeat, (13) (a) 41-amino-acid-residue AGQF repeat and (b) 42-amino-acid-residue GSAL repeat. A repeat or domain type is characterized by specific conserved sequence motifs. We discuss the presence of these repeats and domains in proteins from other genomes and their probable secondary structure. PMID:17538688
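
    The length rule stated in the abstract (a repeat has fewer than 55 residues, a domain more than 55) can be checked against the listed lengths with a short sketch. The lengths are taken from the abstract; the function name is hypothetical, and the treatment of exactly 55 residues is an assumption because the abstract does not cover that case.

```python
from collections import Counter

# Region lengths (in amino acid residues) as listed in the abstract.
REGION_LENGTHS = {
    "PxV": 57, "FxF": 122, "YEFF": 111, "IMxxH": 109, "VxxT": 103,
    "ExW": 84, "NTGFIG": 104, "NxGK": 36, "VYV": 95, "KEWE": 75,
    "AFL": 59, "RIDVK": 53, "AGQF": 41, "GSAL": 42,
}


def classify_region(length, cutoff=55):
    """Label a conserved region by length using the abstract's 55-residue cutoff."""
    if length < cutoff:
        return "repeat"
    if length > cutoff:
        return "domain"
    return "unspecified"  # assumption: the abstract does not define this case


labels = {name: classify_region(n) for name, n in REGION_LENGTHS.items()}
print(Counter(labels.values()))  # Counter({'domain': 10, 'repeat': 4})
```

    The resulting counts match the four repeats and ten domains reported in the abstract's first sentence.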

  17. Sequence analysis of serum albumins reveals the molecular evolution of ligand recognition properties.

    PubMed

    Fanali, Gabriella; Ascenzi, Paolo; Bernardi, Giorgio; Fasano, Mauro

    2012-01-01

    Serum albumin (SA) is a circulating protein providing a depot and carrier for many endogenous and exogenous compounds. At least seven major binding sites have been identified by structural and functional investigations mainly in human SA. SA is conserved in vertebrates, with at least 49 entries in protein sequence databases. The multiple sequence analysis of this set of entries leads to the definition of a cladistic tree for the molecular evolution of SA orthologs in vertebrates, thus showing the clustering of the considered species, with lamprey SAs (Lethenteron japonicum and Petromyzon marinus) in a separate outgroup. Sequence analysis aimed at searching conserved domains revealed that most SA sequences are made up of three repeated domains (about 600 residues), as extensively characterized for human SA. On the contrary, lamprey SAs are giant proteins (about 1400 residues) comprising seven repeated domains. The phylogenetic analysis of the SA family reveals a stringent correlation with the taxonomic classification of the species available in sequence databases. A focused inspection of the sequences of ligand binding sites in SA revealed that in all sites most residues involved in ligand binding are conserved, although the versatility towards different ligands could be peculiar to higher organisms. Moreover, the analysis of molecular links between the different sites suggests that allosteric modulation mechanisms could be restricted to higher vertebrates.

  18. StralSV: assessment of sequence variability within similar 3D structures and application to polio RNA-dependent RNA polymerase

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zemla, A; Lang, D; Kostova, T

    2010-11-29

    Most of the currently used methods for protein function prediction rely on sequence-based comparisons between a query protein and those for which a functional annotation is provided. A serious limitation of sequence similarity-based approaches for identifying residue conservation among proteins is the low confidence in assigning residue-residue correspondences among proteins when the level of sequence identity between the compared proteins is poor. Multiple sequence alignment methods are more satisfactory; still, they cannot provide reliable results at low levels of sequence identity. Our goal in the current work was to develop an algorithm that could overcome these difficulties and facilitate the identification of structurally (and possibly functionally) relevant residue-residue correspondences between compared protein structures. Here we present StralSV, a new algorithm for detecting closely related structure fragments and quantifying residue frequency from tight local structure alignments. We apply StralSV in a study of the RNA-dependent RNA polymerase of poliovirus and demonstrate that the algorithm can be used to determine regions of the protein that are relatively unique or that share structural similarity with structures that are distantly related. By quantifying residue frequencies among many residue-residue pairs extracted from local alignments, one can infer potential structural or functional importance of specific residues that are determined to be highly conserved or that deviate from a consensus. We further demonstrate that considerable detailed structural and phylogenetic information can be derived from StralSV analyses. StralSV is a new structure-based algorithm for identifying and aligning structure fragments that have similarity to a reference protein. StralSV analysis can be used to quantify residue-residue correspondences and identify residues that may be of particular structural or functional importance, as well as unusual or unexpected residues at a given sequence position.
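
The idea of quantifying residue frequencies per position and flagging residues that deviate from a consensus can be illustrated with a minimal sketch. This is not the StralSV algorithm: it assumes the local alignments are already available as equal-length strings, and the function names, the gap character '-', and the 0.5 threshold are all hypothetical choices made for illustration.

```python
from collections import Counter


def column_frequencies(aligned_fragments):
    """Return one dict per alignment column mapping residue -> relative frequency.

    Gap characters ('-') are ignored; a column made only of gaps yields {}.
    """
    if not aligned_fragments:
        return []
    length = len(aligned_fragments[0])
    if any(len(seq) != length for seq in aligned_fragments):
        raise ValueError("all aligned fragments must have the same length")
    freqs = []
    for column in zip(*aligned_fragments):
        counts = Counter(res for res in column if res != "-")
        total = sum(counts.values())
        freqs.append({res: n / total for res, n in counts.items()} if total else {})
    return freqs


def deviations_from_consensus(reference, freqs, threshold=0.5):
    """List (position, reference_residue, consensus_residue) tuples (1-based)
    where the most frequent residue reaches `threshold` and differs from the reference."""
    deviations = []
    for pos, (ref, column) in enumerate(zip(reference, freqs), start=1):
        if not column:
            continue
        consensus, freq = max(column.items(), key=lambda item: item[1])
        if freq >= threshold and consensus != ref:
            deviations.append((pos, ref, consensus))
    return deviations


# Toy fragments (made up for illustration, not taken from the study).
fragments = ["GDD", "GDD", "GDE", "ADD"]
freqs = column_frequencies(fragments)
unusual = deviations_from_consensus("ADD", freqs)
```

Here `unusual` reports that position 1 of the toy reference carries A where the fragments mostly carry G.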

  19. The production of Multiple Small Peptaibol Families by Single 14-Module Peptide Synthetases in Trichoderma/Hypocrea

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Degenkolb, Thomas; Aghchehb, Razieh Karimi; Dieckmann, Ralf

    2012-03-01

    The most common peptaibiotic structures are 11-residue peptaibols found widely distributed in the genus Trichoderma/Hypocrea. Frequently associated are 14-residue peptaibols sharing partial sequence identity. Genome sequencing projects of 3 Trichoderma strains of the major clades reveal the presence of up to 3 types of nonribosomal peptide synthetases with 7, 14, or 18-20 amino acid adding modules. We here provide evidence that the 14-module NRPS type found in T. virens, T. reesei (teleomorph Hypocrea jecorina) and T. atroviride produces both 11- and 14-residue peptaibols based on the disruption of the respective NRPS gene of T. reesei, and bioinformatic analysis of their amino acid activating domains and modules. The structures of these peptides may be predicted from the gene structures and have been confirmed by analysis of families of 11- and 14-residue peptaibols from the strain 618, termed hypojecorins A (23 sequences determined, 4 new) and B (3 new sequences), and the recently established trichovirins A from T. virens. The distribution of 11- and 14-residue products is strain-specific and depends on growth conditions as well. Possible mechanisms of module skipping are discussed.

  20. Synthetic signal sequences that enable efficient secretory protein production in the yeast Kluyveromyces marxianus.

    PubMed

    Yarimizu, Tohru; Nakamura, Mikiko; Hoshida, Hisashi; Akada, Rinji

    2015-02-14

    Targeting of cellular proteins to the extracellular environment is directed by a secretory signal sequence located at the N-terminus of a secretory protein. These signal sequences usually contain an N-terminal basic amino acid followed by a stretch containing hydrophobic residues, although no consensus signal sequence has been identified. In this study, simple modeling of signal sequences was attempted using Gaussia princeps secretory luciferase (GLuc) in the yeast Kluyveromyces marxianus, which allowed comprehensive recombinant gene construction to substitute synthetic signal sequences. Mutational analysis of the GLuc signal sequence revealed that the GLuc hydrophobic peptide length was the lower limit for effective secretion and that the N-terminal basic residue was indispensable. Deletion of the 16th Glu caused enhanced levels of secreted protein, suggesting that this hydrophilic residue defined the boundary of a hydrophobic peptide stretch. Consequently, we redesigned this domain as a repeat of a single hydrophobic amino acid between the N-terminal Lys and C-terminal Glu. Stretches consisting of Phe, Leu, Ile, or Met were effective for secretion but the number of residues affected secretory activity. A stretch containing sixteen consecutive methionine residues (M16) showed the highest activity; the M16 sequence was therefore utilized for the secretory production of human leukemia inhibitory factor protein in yeast, resulting in enhanced secreted protein yield. We present a new concept for the provision of secretory signal sequence ability in the yeast K. marxianus, determined by the number of residues of a single hydrophobic residue located between N-terminal basic and C-terminal acidic amino acid boundaries.

  1. Isolation and characterization of full-length putative alcohol dehydrogenase genes from Polygonum minus

    NASA Astrophysics Data System (ADS)

    Hamid, Nur Athirah Abd; Ismail, Ismanizan

    2013-11-01

    Polygonum minus, locally known as Kesum, is an aromatic herb that is high in secondary metabolite content. Alcohol dehydrogenase is an important enzyme that catalyzes the reversible oxidation of alcohol and aldehyde in the presence of NAD(P)(H) as a cofactor. The main focus of this research is to identify the gene of ADH. The total RNA was extracted from leaves of P. minus which was treated with 150 μM jasmonic acid. Full-length cDNA sequence of ADH was isolated via rapid amplification of cDNA ends (RACE). Subsequently, in silico analysis was conducted on the full-length cDNA sequence and PCR was done on genomic DNA to determine the exon and intron organization. Two sequences of ADH, designated as PmADH1 and PmADH2, were successfully isolated. Both sequences have an ORF of 801 bp, which encodes 266 amino acid residues. Nucleotide sequence comparison of PmADH1 and PmADH2 indicated that both sequences are highly similar in the ORF region but divergent in the 3' untranslated regions (UTR). The amino acid sequences differ at residue 107; PmADH1 contains a Gly (G) residue while PmADH2 contains a Cys (C) residue. The intron-exon organization pattern of both sequences is also the same, with 3 introns and 4 exons. Based on in silico analysis, both sequences contain the "classical" short chain alcohol dehydrogenases/reductases ((c) SDRs) conserved domain. The results suggest that both sequences are members of the short chain alcohol dehydrogenase family.
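
The arithmetic behind "801 bp ORF, 266 residues" and the single-residue comparison can be checked with a short sketch. It assumes the reported ORF length includes the stop codon, which is what makes 801 / 3 - 1 = 266. The function names and the toy sequences are hypothetical and are not the PmADH sequences.

```python
def orf_protein_length(orf_bp, includes_stop=True):
    """Number of amino acids encoded by an ORF of `orf_bp` base pairs.

    Assumes the ORF length is a whole number of codons; when the stop codon
    is counted in the ORF it contributes no residue.
    """
    if orf_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    codons = orf_bp // 3
    return codons - 1 if includes_stop else codons


def differing_positions(seq_a, seq_b):
    """Return (position, residue_a, residue_b) for each mismatch (1-based)
    between two protein sequences of equal length."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    return [
        (pos, a, b)
        for pos, (a, b) in enumerate(zip(seq_a, seq_b), start=1)
        if a != b
    ]


# 801 bp = 267 codons = 266 residues plus one stop codon.
n_residues = orf_protein_length(801)

# Toy sequences differing by a single Gly/Cys substitution (illustrative only).
mismatches = differing_positions("MAGSK", "MACSK")
```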

  2. Identification of the sequence motif of glycoside hydrolase 13 family members

    PubMed Central

    Kumar, Vikash

    2011-01-01

    A bioinformatics analysis of sequences of enzymes of the glycoside hydrolase (GH) 13 family, such as α-amylase, cyclodextrin glycosyltransferase (CGTase), branching enzyme and cyclomaltodextrinase, has been carried out using hidden Markov model (HMM) profiles in order to find the sequence motifs that govern the reaction specificities of these enzymes. This analysis suggests the existence of such sequence motifs, with residues of these motifs constituting the −1 to +3 catalytic subsites of the enzyme. Hence, by introducing mutations in the residues of these four subsites, one can change the reaction specificities of the enzymes. In general, it has been observed that the α-amylase sequence motifs have lower sequence conservation than the rest of the motifs of the GH13 family members. PMID:21544166

  3. The primary structure of stinging nettle (Urtica dioica) agglutinin. A two-domain member of the hevein family.

    PubMed

    Beintema, J J; Peumans, W J

    1992-03-09

    The primary structure of stinging nettle (Urtica dioica) agglutinin has been determined by sequence analysis of peptides obtained from three overlapping proteolytic digests. The sequence of 80 residues consists of two hevein-like domains with the same spacing of half-cystine residues and several other conserved residues as observed earlier in other proteins with hevein-like domains. The hinge region between the two domains is four residues longer than those between the four domains in cereal lectins like wheat germ agglutinin.

  4. The cDNA sequence of mouse Pgp-1 and homology to human CD44 cell surface antigen and proteoglycan core/link proteins.

    PubMed

    Wolffe, E J; Gause, W C; Pelfrey, C M; Holland, S M; Steinberg, A D; August, J T

    1990-01-05

    We describe the isolation and sequencing of a cDNA encoding mouse Pgp-1. An oligonucleotide probe corresponding to the NH2-terminal sequence of the purified protein was synthesized by the polymerase chain reaction and used to screen a mouse macrophage lambda gt11 library. A cDNA clone with an insert of 1.2 kilobases was selected and sequenced. In Northern blot analysis, only cells expressing Pgp-1 contained mRNA species that hybridized with this Pgp-1 cDNA. The nucleotide sequence of the cDNA has a single open reading frame that yields a protein-coding sequence of 1076 base pairs followed by a 132-base pair 3'-untranslated sequence that includes a putative polyadenylation signal but no poly(A) tail. The translated sequence comprises a 13-amino acid signal peptide followed by a polypeptide core of 345 residues corresponding to an Mr of 37,800. Portions of the deduced amino acid sequence were identical to those obtained by amino acid sequence analysis from the purified glycoprotein, confirming that the cDNA encodes Pgp-1. The predicted structure of Pgp-1 includes an NH2-terminal extracellular domain (residues 14-265), a transmembrane domain (residues 266-286), and a cytoplasmic tail (residues 287-358). Portions of the mouse Pgp-1 sequence are highly similar to that of the human CD44 cell surface glycoprotein implicated in cell adhesion. The protein also shows sequence similarity to the proteoglycan tandem repeat sequences found in cartilage link protein and cartilage proteoglycan core protein which are thought to be involved in binding to hyaluronic acid.

  5. Sequence, structure and function relationships in flaviviruses as assessed by evolutive aspects of its conserved non-structural protein domains.

    PubMed

    da Fonseca, Néli José; Lima Afonso, Marcelo Querino; Pedersolli, Natan Gonçalves; de Oliveira, Lucas Carrijo; Andrade, Dhiego Souto; Bleicher, Lucas

    2017-10-28

    Flaviviruses are responsible for serious diseases such as dengue, yellow fever, and Zika fever. Their genomes encode a polyprotein which, after cleavage, results in three structural and seven non-structural proteins. Homologous proteins can be studied by conservation and coevolution analysis as detected in multiple sequence alignments, usually reporting positions which are strictly necessary for the structure and/or function of all members in a protein family or which are involved in a specific sub-class feature requiring the coevolution of residue sets. This study provides a complete conservation and coevolution analysis of all flavivirus non-structural proteins, with results mapped on all well-annotated available sequences. A literature review on the residues found in the analysis enabled us to compile available information on their roles and distribution among different flaviviruses. Also, we provide the mapping of conserved and coevolved residues for all sequences currently in SwissProt as supplementary material, so that particularities in different viruses can be easily analyzed. Copyright © 2017 Elsevier Inc. All rights reserved.

  6. Analysis of correlated mutations in HIV-1 protease using spectral clustering.

    PubMed

    Liu, Ying; Eyal, Eran; Bahar, Ivet

    2008-05-15

    The ability of human immunodeficiency virus-1 (HIV-1) protease to develop mutations that confer multi-drug resistance (MDR) has been a major obstacle in designing rational therapies against HIV. Resistance is usually imparted by a cooperative mechanism that can be elucidated by a covariance analysis of sequence data. Identification of such correlated substitutions of amino acids may be obscured by evolutionary noise. HIV-1 protease sequences from patients subjected to different specific treatments (set 1), and from untreated patients (set 2) were subjected to sequence covariance analysis by evaluating the mutual information (MI) between all residue pairs. Spectral clustering of the resulting covariance matrices disclosed two distinctive clusters of correlated residues: the first, observed in set 1 but absent in set 2, contained residues involved in MDR acquisition; and the second, included those residues differentiated in the various HIV-1 protease subtypes, shortly referred to as the phylogenetic cluster. The MDR cluster occupies sites close to the central symmetry axis of the enzyme, which overlap with the global hinge region identified from coarse-grained normal-mode analysis of the enzyme structure. The phylogenetic cluster, on the other hand, occupies solvent-exposed and highly mobile regions. This study demonstrates (i) the possibility of distinguishing between the correlated substitutions resulting from neutral mutations and those induced by MDR upon appropriate clustering analysis of sequence covariance data and (ii) a connection between global dynamics and functional substitution of amino acids.
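
The mutual information (MI) step described above can be sketched as follows. This is a minimal illustration with plain frequency estimates and no correction for evolutionary noise or small sample size; it is not the authors' pipeline and it stops before the spectral clustering stage. The function names and the toy alignment are hypothetical.

```python
from collections import Counter
from math import log2


def mutual_information(col_i, col_j):
    """Mutual information (in bits) between two alignment columns,
    estimated from observed residue and residue-pair frequencies."""
    n = len(col_i)
    counts_i = Counter(col_i)
    counts_j = Counter(col_j)
    counts_ij = Counter(zip(col_i, col_j))
    mi = 0.0
    for (a, b), count in counts_ij.items():
        p_ab = count / n
        mi += p_ab * log2(p_ab / ((counts_i[a] / n) * (counts_j[b] / n)))
    return mi


def mi_matrix(alignment):
    """Symmetric matrix of MI values for all column pairs of an alignment
    given as equal-length strings. The diagonal is left at 0.0."""
    columns = list(zip(*alignment))
    size = len(columns)
    matrix = [[0.0] * size for _ in range(size)]
    for i in range(size):
        for j in range(i + 1, size):
            matrix[i][j] = matrix[j][i] = mutual_information(columns[i], columns[j])
    return matrix


# Toy alignment: columns 1 and 2 covary perfectly, column 3 is invariant.
toy_alignment = ["AKL", "AKL", "GEL", "GEL"]
mi = mi_matrix(toy_alignment)
```

In the toy alignment the covarying pair scores 1 bit, while any pair involving the invariant column scores 0.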

  7. From Principal Component to Direct Coupling Analysis of Coevolution in Proteins: Low-Eigenvalue Modes are Needed for Structure Prediction

    PubMed Central

    Cocco, Simona; Monasson, Remi; Weigt, Martin

    2013-01-01

    Various approaches have explored the covariation of residues in multiple-sequence alignments of homologous proteins to extract functional and structural information. Among those are principal component analysis (PCA), which identifies the most correlated groups of residues, and direct coupling analysis (DCA), a global inference method based on the maximum entropy principle, which aims at predicting residue-residue contacts. In this paper, inspired by the statistical physics of disordered systems, we introduce the Hopfield-Potts model to naturally interpolate between these two approaches. The Hopfield-Potts model allows us to identify relevant ‘patterns’ of residues from the knowledge of the eigenmodes and eigenvalues of the residue-residue correlation matrix. We show how the computation of such statistical patterns makes it possible to accurately predict residue-residue contacts with a much smaller number of parameters than DCA. This dimensional reduction allows us to avoid overfitting and to extract contact information from multiple-sequence alignments of reduced size. In addition, we show that low-eigenvalue correlation modes, discarded by PCA, are important to recover structural information: the corresponding patterns are highly localized, that is, they are concentrated in few sites, which we find to be in close contact in the three-dimensional protein fold. PMID:23990764

  8. Adenine specific DNA chemical sequencing reaction.

    PubMed Central

    Iverson, B L; Dervan, P B

    1987-01-01

    Reaction of DNA with K2PdCl4 at pH 2.0 followed by a piperidine workup produces specific cleavage at adenine (A) residues. Product analysis revealed the K2PdCl4 reaction involves selective depurination at adenine, affording an excision reaction analogous to the other chemical DNA sequencing reactions. Adenine residues methylated at the exocyclic amine (N6) react with lower efficiency than unmethylated adenine in an identical sequence. This simple protocol specific for A may be a useful addition to current chemical sequencing reactions. PMID:3671067

  9. A mechanistic insight into the amyloidogenic structure of hIAPP peptide revealed from sequence analysis and molecular dynamics simulation.

    PubMed

    Chakraborty, Sandipan; Chatterjee, Barnali; Basu, Soumalee

    2012-07-01

    A collective approach of sequence analysis, phylogenetic tree construction and in silico prediction of amyloidogenicity using bioinformatics tools has been used to correlate the observed species-specific variations in IAPP sequences with the amyloid-forming propensity. Observed substitution patterns indicate that probable changes in local hydrophobicity are instrumental in altering the aggregation propensity of the peptide. In particular, residues at the 17th, 22nd and 23rd positions of the IAPP peptide are found to be crucial for amyloid formation. Proline25 primarily dictates the observed non-amyloidogenicity in rodents. Furthermore, an extensive molecular dynamics simulation of 0.24 μs has been carried out with human IAPP (hIAPP) fragment 19-27, the portion showing maximum sequence variation across different species, to understand the native folding characteristic of this region. Principal component analysis in combination with free energy landscape analysis illustrates a four-residue turn spanning from residue 22 to 25. The results provide a structural insight into the intramolecular β-sheet structure of amylin which probably is the template for nucleation of fibril formation and growth, a pathogenic feature of type II diabetes. Copyright © 2012 Elsevier B.V. All rights reserved.

  10. Streptococcal phosphoenolpyruvate-sugar phosphotransferase system: amino acid sequence and site of ATP-dependent phosphorylation of HPr

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Deutscher, J.; Pevec, B.; Beyreuther, K.

    1986-10-21

    The amino acid sequence of histidine-containing protein (HPr) from Streptococcus faecalis has been determined by direct Edman degradation of intact HPr and by amino acid sequence analysis of tryptic peptides, V8 proteolytic peptides, thermolytic peptides, and cyanogen bromide cleavage products. HPr from S. faecalis was found to contain 89 amino acid residues, corresponding to a molecular weight of 9438. The amino acid sequence of HPr from S. faecalis shows extended homology to the primary structure of HPr proteins from other bacteria. Besides the phosphoenolpyruvate-dependent phosphorylation of a histidyl residue in HPr, catalyzed by enzyme I of the bacterial phosphotransferase system, HPr was also found to be phosphorylated at a seryl residue in an ATP-dependent protein kinase catalyzed reaction. The site of ATP-dependent phosphorylation in HPr of S. faecalis has now been determined. [32P]P-Ser-HPr was digested with three different proteases, and in each case, a single labeled peptide was isolated. Following digestion with subtilisin, they obtained a peptide with the sequence -(P)Ser-Ile-Met-. Using chymotrypsin, they isolated a peptide with the sequence -Ser-Val-Asn-Leu-Lys-(P)Ser-Ile-Met-Gly-Val-Met-. The longest labeled peptide was obtained with V8 staphylococcal protease. According to amino acid analysis, this peptide contained 36 out of the 89 amino acid residues of HPr. The following sequence of 12 amino acid residues of the V8 peptide was determined: -Tyr-Lys-Gly-Lys-Ser-Val-Asn-Leu-Lys-(P)Ser-Ile-Met-. Thus, the site of ATP-dependent phosphorylation was determined to be Ser-46 within the primary structure of HPr.

  11. Turn stability in beta-hairpin peptides: Investigation of peptides containing 3:5 type I G1 bulge turns.

    PubMed

    Blandl, Tamas; Cochran, Andrea G; Skelton, Nicholas J

    2003-02-01

    The turn-forming ability of a series of three-residue sequences was investigated by substituting them into a well-characterized beta-hairpin peptide. The starting scaffold, bhpW, is a disulfide-cyclized 10-residue peptide that folds into a stable beta-hairpin with two antiparallel strands connected by a two-residue reverse turn. Substitution of the central two residues with the three-residue test sequences leads to less stable hairpins, as judged by thiol-disulfide equilibrium measurements. However, analysis of NMR parameters indicated that each molecule retains a significant folded population, and that the type of turn adopted by the three-residue sequence is the same in all cases. The solution structure of a selected peptide with a PDG turn contained an antiparallel beta-hairpin with a 3:5 type I + G1 bulge turn. Analysis of the energetic contributions of individual turn residues in the series of peptides indicates that substitution effects have significant context dependence, limiting the predictive power of individual amino acid propensities for turn formation. The most stable and least stable sequences were also substituted into a more stable disulfide-cyclized scaffold and a linear beta-hairpin scaffold. The relative stabilities remained the same, suggesting that experimental measurements in the bhpW context are a useful way to evaluate turn stability for use in protein design projects. Moreover, these scaffolds are capable of displaying a diverse set of turns, which can be exploited for the mimicry of protein loops or for generating libraries of reverse turns.

  12. CCR2 and CCR5 receptor-binding properties of herpesvirus-8 vMIP-II based on sequence analysis and its solution structure.

    PubMed

    Shao, W; Fernandez, E; Sachpatzidis, A; Wilken, J; Thompson, D A; Schweitzer, B I; Lolis, E

    2001-05-01

    Human herpesvirus-8 (HHV-8) is the infectious agent responsible for Kaposi's sarcoma and encodes a protein, macrophage inflammatory protein-II (vMIP-II), which shows sequence similarity to the human CC chemokines. vMIP-II has broad receptor specificity that crosses chemokine receptor subfamilies, and inhibits HIV-1 viral entry mediated by numerous chemokine receptors. In this study, the solution structure of chemically synthesized vMIP-II was determined by nuclear magnetic resonance. The protein is a monomer and possesses the chemokine fold consisting of a flexible N-terminus, three antiparallel beta strands, and a C-terminal alpha helix. Except for the N-terminal residues (residues 1-13) and the last two C-terminal residues (residues 73-74), the structure of vMIP-II is well-defined, exhibiting average rmsd of 0.35 and 0.90 Å for the backbone heavy atoms and all heavy atoms of residues 14-72, respectively. Taking into account the sequence differences between the various CC chemokines and comparing their three-dimensional structures allows us to implicate residues that influence the quaternary structure and receptor binding and activation of these proteins in solution. The analysis of the sequence and three-dimensional structure of vMIP-II indicates the presence of epitopes involved in binding two receptors, CCR2 and CCR5. We propose that vMIP-II was initially specific for CCR5 and acquired receptor-binding properties to CCR2 and other chemokine receptors.

  13. Sequence composition and environment effects on residue fluctuations in protein structures

    NASA Astrophysics Data System (ADS)

    Ruvinsky, Anatoly M.; Vakser, Ilya A.

    2010-10-01

    Structure fluctuations in proteins affect a broad range of cell phenomena, including stability of proteins and their fragments, allosteric transitions, and energy transfer. This study presents a statistical-thermodynamic analysis of relationship between the sequence composition and the distribution of residue fluctuations in protein-protein complexes. A one-node-per-residue elastic network model accounting for the nonhomogeneous protein mass distribution and the interatomic interactions through the renormalized inter-residue potential is developed. Two factors, a protein mass distribution and a residue environment, were found to determine the scale of residue fluctuations. Surface residues undergo larger fluctuations than core residues in agreement with experimental observations. Ranking residues over the normalized scale of fluctuations yields a distinct classification of amino acids into three groups: (i) highly fluctuating-Gly, Ala, Ser, Pro, and Asp, (ii) moderately fluctuating-Thr, Asn, Gln, Lys, Glu, Arg, Val, and Cys, and (iii) weakly fluctuating-Ile, Leu, Met, Phe, Tyr, Trp, and His. The structural instability in proteins possibly relates to the high content of the highly fluctuating residues and a deficiency of the weakly fluctuating residues in irregular secondary structure elements (loops), chameleon sequences, and disordered proteins. Strong correlation between residue fluctuations and the sequence composition of protein loops supports this hypothesis. Comparing fluctuations of binding site residues (interface residues) with other surface residues shows that, on average, the interface is more rigid than the rest of the protein surface and Gly, Ala, Ser, Cys, Leu, and Trp have a propensity to form more stable docking patches on the interface. The findings have broad implications for understanding mechanisms of protein association and stability of protein structures.

  14. Coevolution analysis of Hepatitis C virus genome to identify the structural and functional dependency network of viral proteins

    NASA Astrophysics Data System (ADS)

    Champeimont, Raphaël; Laine, Elodie; Hu, Shuang-Wei; Penin, Francois; Carbone, Alessandra

    2016-05-01

    A novel computational approach of coevolution analysis allowed us to reconstruct the protein-protein interaction network of the Hepatitis C Virus (HCV) at the residue resolution. For the first time, coevolution analysis of an entire viral genome was realized, based on a limited set of protein sequences with high sequence identity within genotypes. The identified coevolving residues constitute highly relevant predictions of protein-protein interactions for further experimental identification of HCV protein complexes. The method can be used to analyse other viral genomes and to predict the associated protein interaction networks.

  15. Complete cDNA sequence and amino acid analysis of a bovine ribonuclease K6 gene.

    PubMed

    Pietrowski, D; Förster, M

    2000-01-01

    The complete cDNA sequence of a ribonuclease k6 gene of Bos taurus has been determined. It codes for a protein with 154 amino acids and contains the invariant cysteine, histidine and lysine residues as well as the characteristic motifs specific to ribonuclease active sites. The deduced protein sequence is 27 residues longer than other known ribonucleases k6 and shows amino acid exchanges which could reflect a strain specificity or polymorphism within the bovine genome. Based on sequence similarity we have termed the identified gene bovine ribonuclease k6 b (brk6b).

  16. Residual Stresses and Critical Initial Flaw Size Analyses of Welds

    NASA Technical Reports Server (NTRS)

    Brust, Frederick W.; Raju, Ivatury S.; Dawocke, David S.; Cheston, Derrick

    2009-01-01

    An independent assessment was conducted to determine the critical initial flaw size (CIFS) for the flange-to-skin weld in the Ares I-X Upper Stage Simulator (USS). A series of weld analyses are performed to determine the residual stresses in a critical region of the USS. Weld residual stresses both increase constraint and mean stress thereby having an important effect on the fatigue life. The purpose of the weld analyses was to model the weld process using a variety of sequences to determine the 'best' sequence in terms of weld residual stresses and distortions. The many factors examined in this study include weld design (single-V, double-V groove), weld sequence, boundary conditions, and material properties, among others. The results of this weld analysis are included with service loads to perform a fatigue and critical initial flaw size evaluation.

  17. Structure-related statistical singularities along protein sequences: a correlation study.

    PubMed

    Colafranceschi, Mauro; Colosimo, Alfredo; Zbilut, Joseph P; Uversky, Vladimir N; Giuliani, Alessandro

    2005-01-01

    A data set composed of 1141 proteins representative of all eukaryotic protein sequences in the Swiss-Prot Protein Knowledge base was coded by seven physicochemical properties of amino acid residues. The resulting numerical profiles were submitted to correlation analysis after the application of a linear (simple mean) and a nonlinear (Recurrence Quantification Analysis, RQA) filter. The main RQA variables, Recurrence and Determinism, were subsequently analyzed by Principal Component Analysis. The RQA descriptors showed that (i) within protein sequences is embedded specific information neither present in the codes nor in the amino acid composition and (ii) the most sensitive code for detecting ordered recurrent (deterministic) patterns of residues in protein sequences is the Miyazawa-Jernigan hydrophobicity scale. The most deterministic proteins in terms of autocorrelation properties of primary structures were found (i) to be involved in protein-protein and protein-DNA interactions and (ii) to display a significantly higher proportion of structural disorder with respect to the average data set. A study of the scaling behavior of the average determinism with the setting parameters of RQA (embedding dimension and radius) allows for the identification of patterns of minimal length (six residues) as possible markers of zones specifically prone to inter- and intramolecular interactions.

  18. An intuitive graphical webserver for multiple-choice protein sequence search.

    PubMed

    Banky, Daniel; Szalkai, Balazs; Grolmusz, Vince

    2014-04-10

    Every day tens of thousands of sequence searches and sequence alignment queries are submitted to webservers. The capitalized word "BLAST" has become a verb, describing the act of performing sequence search and alignment. However, if one needs to search for sequences that contain, for example, two hydrophobic and three polar residues at five given positions, the query formation on the most frequently used webservers will be difficult. Some servers support the formation of queries with regular expressions, but most of the users are unfamiliar with their syntax. Here we present an intuitive, easily applicable webserver, the Protein Sequence Analysis server, that allows the formation of multiple choice queries by simply drawing the residues to their positions; if more than one residue is drawn to the same position, then they will be nicely stacked on the user interface, indicating the multiple choice at the given position. This computer-game-like interface is natural and intuitive, and the coloring of the residues makes it possible to form queries requiring not just certain amino acids in the given positions, but also small nonpolar, negatively charged, hydrophobic, positively charged, or polar ones. The webserver is available at http://psa.pitgroup.org. Copyright © 2014 Elsevier B.V. All rights reserved.
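
The regular-expression route mentioned in the abstract, for a query such as "two hydrophobic and three polar residues at five given positions", can be sketched as below. This is not the webserver's implementation. The residue groupings in HYDROPHOBIC and POLAR are one common convention chosen for illustration, and the function names and test sequence are hypothetical.

```python
import re

# Illustrative residue classes (an assumption; other groupings are in use).
HYDROPHOBIC = "AVLIMFW"
POLAR = "STNQ"


def build_query(choices):
    """Compile a multiple-choice query into a regular expression.

    `choices` has one entry per position: a string of allowed one-letter
    residue codes, or None to accept any residue at that position.
    """
    parts = []
    for allowed in choices:
        if allowed is None:
            parts.append(".")
        else:
            parts.append("[" + "".join(sorted(set(allowed))) + "]")
    return re.compile("".join(parts))


def search(pattern, sequence):
    """Return (start_position, matched_segment) for every hit, 1-based.

    A lookahead is used so that overlapping hits are also reported.
    """
    overlapping = re.compile("(?=(" + pattern.pattern + "))")
    return [(m.start() + 1, m.group(1)) for m in overlapping.finditer(sequence)]


# Two hydrophobic residues followed by three polar residues.
query = build_query([HYDROPHOBIC, HYDROPHOBIC, POLAR, POLAR, POLAR])
hits = search(query, "GKLVSTNAE")
```

The compiled pattern is `[AFILMVW][AFILMVW][NQST][NQST][NQST]`, and the toy sequence contains one hit, LVSTN, starting at position 3.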

  19. A critical analysis of computational protein design with sparse residue interaction graphs

    PubMed Central

    Georgiev, Ivelin S.

    2017-01-01

    Protein design algorithms enumerate a combinatorial number of candidate structures to compute the Global Minimum Energy Conformation (GMEC). To efficiently find the GMEC, protein design algorithms must methodically reduce the conformational search space. By applying distance and energy cutoffs, the protein system to be designed can thus be represented using a sparse residue interaction graph, where the number of interacting residue pairs is less than all pairs of mutable residues, and the corresponding GMEC is called the sparse GMEC. However, ignoring some pairwise residue interactions can lead to a change in the energy, conformation, or sequence of the sparse GMEC vs. the original or the full GMEC. Despite the widespread use of sparse residue interaction graphs in protein design, the above-mentioned effects of their use have not been previously analyzed. To analyze the costs and benefits of designing with sparse residue interaction graphs, we computed the GMECs for 136 different protein design problems both with and without distance and energy cutoffs, and compared their energies, conformations, and sequences. Our analysis shows that the differences between the GMECs depend critically on whether or not the design includes core, boundary, or surface residues. Moreover, neglecting long-range interactions can alter local interactions and introduce large sequence differences, both of which can result in significant structural and functional changes. Designs on proteins with experimentally measured thermostability show it is beneficial to compute both the full and the sparse GMEC accurately and efficiently. To this end, we show that a provable, ensemble-based algorithm can efficiently compute both GMECs by enumerating a small number of conformations, usually fewer than 1000. This provides a novel way to combine sparse residue interaction graphs with provable, ensemble-based algorithms to reap the benefits of sparse residue interaction graphs while avoiding their potential inaccuracies. PMID:28358804
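
How distance and energy cutoffs turn a full set of residue pairs into a sparse residue interaction graph can be shown with a small sketch. The pruning rule used here (keep a pair only if the residues are within the distance cutoff and the magnitude of their pairwise energy reaches the energy cutoff) is an assumption made for illustration, as are the single-point residue coordinates, the function name, and all numeric values; the paper's exact criteria may differ.

```python
from itertools import combinations
from math import dist


def sparse_interaction_graph(coords, pair_energy, distance_cutoff, energy_cutoff):
    """Return the set of residue pairs (i, j), i < j, kept as graph edges.

    coords:      dict residue_id -> (x, y, z), one representative point per residue
    pair_energy: dict (i, j) -> pairwise energy, keyed with i < j;
                 missing pairs are treated as 0.0
    A pair is kept only if it passes both cutoffs (assumed rule).
    """
    edges = set()
    for i, j in combinations(sorted(coords), 2):
        close_enough = dist(coords[i], coords[j]) <= distance_cutoff
        strong_enough = abs(pair_energy.get((i, j), 0.0)) >= energy_cutoff
        if close_enough and strong_enough:
            edges.add((i, j))
    return edges


# Toy system of three mutable residues (values are made up).
coords = {1: (0.0, 0.0, 0.0), 2: (3.0, 0.0, 0.0), 3: (10.0, 0.0, 0.0)}
energies = {(1, 2): -1.5, (1, 3): -0.01, (2, 3): -0.8}
edges = sparse_interaction_graph(coords, energies, distance_cutoff=8.0, energy_cutoff=0.1)
```

In this toy system the long-range pair (1, 3) is dropped, so the sparse graph has two of the three possible edges.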

  20. Protein sequence analysis, cloning, and expression of flammutoxin, a pore-forming cytolysin from Flammulina velutipes. Maturation of dimeric precursor to monomeric active form by carboxyl-terminal truncation.

    PubMed

    Tomita, Toshio; Mizumachi, Yoshihiro; Chong, Kang; Ogawa, Kanako; Konishi, Norihide; Sugawara-Tomita, Noriko; Dohmae, Naoshi; Hashimoto, Yohichi; Takio, Koji

    2004-12-24

    Flammutoxin (FTX), a 31-kDa pore-forming cytolysin from Flammulina velutipes, is specifically expressed during fruiting body formation. We cloned and expressed the cDNA encoding a 272-residue protein with an N-terminal sequence identical to that of FTX but failed to obtain hemolytically active protein. This, together with the presence of multiple FTX family proteins in the mushroom, prompted us to determine the complete primary structure of FTX by protein sequence analysis. The N-terminal 72 and C-terminal 107 residues were sequenced by Edman degradation of the fragments generated from the alkylated FTX by enzymatic digestions with Achromobacter protease I or Staphylococcus aureus V8 protease and by chemical cleavages with CNBr, hydroxylamine, or 1% formic acid. The central part of FTX was sequenced with a surface-adhesive 7-kDa fragment, which was generated by a tryptic digestion of FTX and recovered by rinsing the wall of a test tube with 6 M guanidine HCl. The 7-kDa peptide was cleaved with 12 M HCl, thermolysin, or S. aureus V8 protease to produce smaller peptides for sequence analysis. The results showed that FTX consists of 251 residues, and that the protein and nucleotide sequences were in accord except for the lack of the initial Met and the C-terminal 20 residues in the protein. Recombinant FTX (rFTX) with or without the C-terminal 20 residues (rFTX271 or rFTX251, respectively) was prepared to study the maturation process of FTX. Like natural FTX, rFTX251 existed as a monomer in solution and assembled into an SDS-stable, ring-shaped pore complex on human erythrocytes, causing hemolysis. In contrast, rFTX271, existing as a dimer in solution, bound to the cells but failed to form a pore complex. The dimeric rFTX271 was converted to hemolytically active monomers upon cleavage between Lys(251) and Met(252) by trypsin.

  1. Deep Sequencing of Random Mutant Libraries Reveals the Active Site of the Narrow Specificity CphA Metallo-β-Lactamase is Fragile to Mutations.

    PubMed

    Sun, Zhizeng; Mehta, Shrenik C; Adamski, Carolyn J; Gibbs, Richard A; Palzkill, Timothy

    2016-09-12

    CphA is a Zn(2+)-dependent metallo-β-lactamase that efficiently hydrolyzes only carbapenem antibiotics. To understand the sequence requirements for CphA function, single codon random mutant libraries were constructed for residues in and near the active site and mutants were selected for E. coli growth on increasing concentrations of imipenem, a carbapenem antibiotic. At high concentrations of imipenem that select for phenotypically wild-type mutants, the active-site residues exhibit stringent sequence requirements in that nearly all residues in positions that contact zinc, the substrate, or the catalytic water do not tolerate amino acid substitutions. In addition, at high imipenem concentrations a number of residues that do not directly contact zinc or substrate are also essential and do not tolerate substitutions. Biochemical analysis confirmed that amino acid substitutions at essential positions decreased the stability or catalytic activity of the CphA enzyme. Therefore, the CphA active site is fragile to substitutions, suggesting active-site residues are optimized for imipenem hydrolysis. These results also suggest that resistance to inhibitors targeted to the CphA active site would be slow to develop because of the strong sequence constraints on function.

  2. Implication of the cause of differences in 3D structures of proteins with high sequence identity based on analyses of amino acid sequences and 3D structures.

    PubMed

    Matsuoka, Masanari; Sugita, Masatake; Kikuchi, Takeshi

    2014-09-18

    Proteins that share a high sequence homology while exhibiting drastically different 3D structures are investigated in this study. Recently, artificial proteins related to the sequences of the GA and IgG binding GB domains of human serum albumin have been designed. These artificial proteins, referred to as GA and GB, share 98% amino acid sequence identity but exhibit different 3D structures, namely, a 3α bundle versus a 4β + α structure. Discriminating between their 3D structures based on their amino acid sequences is a very difficult problem. In the present work, in addition to using bioinformatics techniques, an analysis based on inter-residue average distance statistics is used to address this problem. It was hard to distinguish which structure a given sequence would take only with the results of ordinary analyses like BLAST and conservation analyses. However, in addition to these analyses, with the analysis based on the inter-residue average distance statistics and our sequence tendency analysis, we could infer which part would play an important role in its structural formation. The results suggest possible determinants of the different 3D structures for sequences with high sequence identity. The possibility of discriminating between the 3D structures based on the given sequences is also discussed.

  3. In silico analysis of L-asparaginase from different source organisms.

    PubMed

    Dwivedi, Vivek Dhar; Mishra, Sarad Kumar

    2014-06-01

    L-asparaginases are enzymes widely distributed among plants, fungi and bacteria. This enzyme catalyzes the conversion of l-asparagine to l-aspartate and ammonia and, to a lesser extent, the formation of l-glutamate from l-glutamine. In the present study, forty-five full-length amino acid sequences of L-asparaginases from bacteria, fungi and plants were collected and subjected to multiple sequence alignment (MSA), domain identification, determination of individual amino acid composition, and phylogenetic tree construction. MSA revealed that two glycine residues were identical in all analyzed species, two further glycine residues were identical in all the fungal and bacterial sources, and three glycine residues were identical in all plant and bacterial sources, while no residue was identical only between plant and fungal L-asparaginases. Phylogenetic analysis produced two major sequence clusters. One cluster contains eleven species of fungi, twelve species of bacteria, and one species of plant, whereas the other contains fourteen species of plant, four species of fungi and three species of bacteria. The amino acid composition analysis revealed that the average frequency of alanine is 10.77 percent, which is very high in comparison to other amino acids in all analyzed species.
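
    An average amino acid frequency such as the alanine figure above can be computed along the following lines. This is a minimal sketch, not the authors' pipeline: the two toy sequences and the helper names `composition` and `mean_frequency` are invented for illustration.

```python
from collections import Counter


def composition(seq):
    """Percentage of each residue type in one sequence."""
    counts = Counter(seq.upper())
    total = sum(counts.values())
    return {aa: 100.0 * n / total for aa, n in counts.items()}


def mean_frequency(seqs, aa):
    """Average percentage of residue `aa` across several sequences."""
    return sum(composition(s).get(aa, 0.0) for s in seqs) / len(seqs)


# Toy sequences, not real L-asparaginase data.
toy = ["AAGT", "AGGG"]
print(mean_frequency(toy, "A"))  # 50% in the first, 25% in the second
```

    Averaging per-sequence percentages weights every sequence equally regardless of length. Pooling all residues before dividing would be a different, equally defensible choice.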

  4. RAMONA-3B application to Browns Ferry ATWS

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Slovik, G.C.; Neymotin, L.Y.; Saha, P.

    1985-01-01

    The Anticipated Transient Without Scram (ATWS) is known to be a dominant accident sequence for possible core melt in a Boiling Water Reactor (BWR). A recent Probabilistic Risk Assessment (PRA) analysis for the Browns Ferry nuclear power plant indicates that ATWS is the second most dominant transient for core melt in BWR/4 with Mark I containment. The most dominant sequence is the failure of the long-term decay heat removal function of the Residual Heat Removal (RHR) system. Of all the various ATWS scenarios, the Main Steam Isolation Valve (MSIV) closure ATWS sequence was chosen for the present analysis because of its relatively high frequency of occurrence and its challenge to the residual heat removal system and containment integrity. The objective of this paper is to discuss four MSIV closure ATWS calculations using the RAMONA-3B code. The paper is a summary of a report being prepared for the USNRC Severe Accident Sequence Analysis (SASA) program, which should be referred to for details. 10 refs., 20 figs., 3 tabs.

  5. Selective Loss of Cysteine Residues and Disulphide Bonds in a Potato Proteinase Inhibitor II Family

    PubMed Central

    Li, Xiu-Qing; Zhang, Tieling; Donnelly, Danielle

    2011-01-01

    Disulphide bonds between cysteine residues in proteins play a key role in protein folding, stability, and function. Loss of a disulphide bond is often associated with functional differentiation of the protein. The evolution of disulphide bonds is still actively debated; analysis of naturally occurring variants can promote understanding of the protein evolutionary process. One of the disulphide bond-containing protein families is the potato proteinase inhibitor II (PI-II, or Pin2, for short) superfamily, which is found in most solanaceous plants and participates in plant development, stress response, and defence. Each PI-II domain contains eight cysteine residues (8C), and two similar PI-II domains form a functional protein that has eight disulphide bonds and two non-identical reaction centres. It is still unclear which patterns and processes affect cysteine residue loss in PI-II. Through cDNA sequencing and data mining, we found six natural variants missing cysteine residues involved in one or two disulphide bonds at the first reaction centre. We named these variants Pi7C and Pi6C for the proteins missing one or two pairs of cysteine residues, respectively. This PI-II-7C/6C family was found exclusively in potato. The missing cysteine residues were in bonding pairs but distant from one another at the nucleotide/protein sequence level. The non-synonymous/synonymous substitution (Ka/Ks) ratio analysis suggested a positive evolutionary gene selection for Pi6C and various Pi7C. The selective deletion of the first reaction centre cysteine residues that are structure-level-paired but sequence-level-distant in PI-II illustrates the flexibility of PI-II domains and suggests the functionality of their transient gene versions during evolution. PMID:21494600

  6. Nucleotide sequence analysis of the gene encoding the Deinococcus radiodurans surface protein, derived amino acid sequence, and complementary protein chemical studies

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Peters, J.; Peters, M.; Lottspeich, F.

    1987-11-01

    The complete nucleotide sequence of the gene encoding the surface (hexagonally packed intermediate (HPI))-layer polypeptide of Deinococcus radiodurans Sark was determined and found to encode a polypeptide of 1036 amino acids. Amino acid sequence analysis of about 30% of the residues revealed that the mature polypeptide consists of at least 978 amino acids. The N terminus was blocked to Edman degradation. The results of proteolytic modification of the HPI layer in situ and Mr estimations of the HPI polypeptide expressed in Escherichia coli indicated that there is a leader sequence. The N-terminal region contained a very high percentage (29%) of threonine and serine, including a cluster of nine consecutive serine or threonine residues, whereas a stretch near the C terminus was extremely rich in aromatic amino acids (29%). The protein contained at least two disulfide bridges, as well as tightly bound reducing sugars and fatty acids.

  7. An Amino Acid Code for β-sheet Packing Structure

    PubMed Central

    Joo, Hyun; Tsai, Jerry

    2014-01-01

    To understand the relationship between protein sequence and structure, this work extends the knob-socket model in an investigation of β-sheet packing. Over a comprehensive set of β-sheet folds, the contacts between residues were used to identify packing cliques: sets of residues that all contact each other. These packing cliques were then classified based on size and contact order. From this analysis, the 2 types of 4 residue packing cliques necessary to describe β-sheet packing were characterized. Both occur between 2 adjacent hydrogen bonded β-strands. First, defining the secondary structure packing within β-sheets, the combined socket or XY:HG pocket consists of 4 residues i,i+2 on one strand and j,j+2 on the other. Second, characterizing the tertiary packing between β-sheets, the knob-socket XY:H+B consists of a 3 residue XY:H socket (i,i+2 on one strand and j on the other) packed against a knob B residue (residue k distant in sequence). Depending on the packing depth of the knob B residue, 2 types of knob-sockets are found: side-chain and main-chain sockets. The amino acid composition of the pockets and knob-sockets reveal the sequence specificity of β-sheet packing. For β-sheet formation, the XY:HG pocket clearly shows sequence specificity of amino acids. For tertiary packing, the XY:H+B side-chain and main-chain sockets exhibit distinct amino acid preferences at each position. These relationships define an amino acid code for β-sheet structure and provide an intuitive topological mapping of β-sheet packing. PMID:24668690

  8. Molecular cloning and sequence analysis of the gene coding for the 57kDa soluble antigen of the salmonid fish pathogen Renibacterium salmoninarum

    USGS Publications Warehouse

    Chien, Maw-Sheng; Gilbert, Teresa L.; Huang, Chienjin; Landolt, Marsha L.; O'Hara, Patrick J.; Winton, James R.

    1992-01-01

    The complete sequence coding for the 57-kDa major soluble antigen of the salmonid fish pathogen, Renibacterium salmoninarum, was determined. The gene contained an open reading frame of 1671 nucleotides coding for a protein of 557 amino acids with a calculated Mr value of 57190. The first 26 amino acids constituted a signal peptide. The deduced sequence for amino acid residues 27–61 was in agreement with the 35 N-terminal amino acid residues determined by microsequencing, suggesting the protein is synthesized as a 557-amino acid precursor and processed to produce a mature protein of Mr 54505. Two regions of the protein contained imperfect direct repeats. The first region contained two copies of an 81-residue repeat; the second contained five copies of an unrelated 25-residue repeat. Also, a perfect inverted repeat (including three in-frame UAA stop codons) was observed at the carboxyl-terminus of the gene.
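
    The lengths quoted in this abstract are mutually consistent, as a short check shows. The variable names are illustrative, and the mature-protein length is derived here from the reported numbers rather than stated in the abstract.

```python
# Figures reported in the abstract.
orf_nucleotides = 1671
precursor_residues = 557
signal_peptide_residues = 26
first_sequenced, last_sequenced = 27, 61

# Three nucleotides per codon: 1671 / 3 = 557 residues.
codons, remainder = divmod(orf_nucleotides, 3)

# Residues 27 through 61 inclusive span 35 positions.
microsequenced = last_sequenced - first_sequenced + 1

# Derived, not reported: residues left after the signal peptide is removed.
mature_residues = precursor_residues - signal_peptide_residues

print(codons, remainder, microsequenced, mature_residues)
```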

  9. Unique Structural Features and Sequence Motifs of Proline Utilization A (PutA)

    PubMed Central

    Singh, Ranjan K.; Tanner, John J.

    2013-01-01

    Proline utilization A proteins (PutAs) are bifunctional enzymes that catalyze the oxidation of proline to glutamate using spatially separated proline dehydrogenase and pyrroline-5-carboxylate dehydrogenase active sites. Here we use the crystal structure of the minimalist PutA from Bradyrhizobium japonicum (BjPutA) along with sequence analysis to identify unique structural features of PutAs. This analysis shows that PutAs have secondary structural elements and domains not found in the related monofunctional enzymes. Some of these extra features are predicted to be important for substrate channeling in BjPutA. Multiple sequence alignment analysis shows that some PutAs have a 17-residue conserved motif in the C-terminal 20–30 residues of the polypeptide chain. The BjPutA structure shows that this motif helps seal the internal substrate-channeling cavity from the bulk medium. Finally, it is shown that some PutAs have a 100–200 residue domain of unknown function in the C-terminus that is not found in minimalist PutAs. Remote homology detection suggests that this domain is homologous to the oligomerization beta-hairpin and Rossmann fold domain of BjPutA. PMID:22201760

  10. Large-scale analysis of intrinsic disorder flavors and associated functions in the protein sequence universe.

    PubMed

    Necci, Marco; Piovesan, Damiano; Tosatto, Silvio C E

    2016-12-01

    Intrinsic disorder (ID) in proteins has been extensively described for the last decade; a large-scale classification of ID in proteins is mostly missing. Here, we provide an extensive analysis of ID in the protein universe on the UniProt database derived from sequence-based predictions in MobiDB. Almost half the sequences contain an ID region of at least five residues. About 9% of proteins have a long ID region of over 20 residues which are more abundant in Eukaryotic organisms and most frequently cover less than 20% of the sequence. A small subset of about 67,000 (out of over 80 million) proteins is fully disordered and mostly found in Viruses. Most proteins have only one ID, with short ID evenly distributed along the sequence and long ID overrepresented in the center. The charged residue composition of Das and Pappu was used to classify ID proteins by structural propensities and corresponding functional enrichment. Swollen Coils seem to be used mainly as structural components and in biosynthesis in both Prokaryotes and Eukaryotes. In Bacteria, they are confined in the nucleoid and in Viruses provide DNA binding function. Coils & Hairpins seem to be specialized in ribosome binding and methylation activities. Globules & Tadpoles bind antigens in Eukaryotes but are involved in killing other organisms and cytolysis in Bacteria. The Undefined class is used by Bacteria to bind toxic substances and mediate transport and movement between and within organisms in Viruses. Fully disordered proteins behave similarly, but are enriched for glycine residues and extracellular structures. © 2016 The Protein Society.

  11. Biochemical and genetic characterization of enterocin A from Enterococcus faecium, a new antilisterial bacteriocin in the pediocin family of bacteriocins.

    PubMed Central

    Aymerich, T; Holo, H; Håvarstein, L S; Hugas, M; Garriga, M; Nes, I F

    1996-01-01

    A new bacteriocin has been isolated from an Enterococcus faecium strain. The bacteriocin, termed enterocin A, was purified to homogeneity as judged by sodium dodecyl sulfate-polyacrylamide gel electrophoresis, N-terminal amino acid sequencing, and mass spectrometry analysis. By combining the data obtained from amino acid and DNA sequencing, the primary structure of enterocin A was determined. It consists of 47 amino acid residues, and the molecular weight was calculated to be 4,829, assuming that the four cysteine residues form intramolecular disulfide bridges. This molecular weight was confirmed by mass spectrometry analysis. The amino acid sequence of enterocin A shared significant homology with a group of bacteriocins (now termed pediocin-like bacteriocins) isolated from a variety of lactic acid-producing bacteria, which include members of the genera Lactobacillus, Pediococcus, Leuconostoc, and Carnobacterium. Sequencing of the structural gene of enterocin A, which is located on the bacterial chromosome, revealed an N-terminal leader sequence of 18 amino acid residues, which was removed during the maturation process. The enterocin A leader belongs to the double-glycine leaders which are found among most other small nonlantibiotic bacteriocins, some lantibiotics, and colicin V. Downstream of the enterocin A gene was located a second open reading frame, encoding a putative protein of 103 amino acid residues. This gene may encode the immunity factor of enterocin A, and it shares 40% identity with a similar open reading frame in the operon of leucocin AUL 187, another pediocin-like bacteriocin. PMID:8633865

  12. Effects of stacking sequence on impact damage resistance and residual strength for quasi-isotropic laminates

    NASA Technical Reports Server (NTRS)

    Dost, Ernest F.; Ilcewicz, Larry B.; Avery, William B.; Coxon, Brian R.

    1991-01-01

    Residual strength of an impacted composite laminate is dependent on details of the damage state. Stacking sequence was varied to judge its effect on damage caused by low-velocity impact. This was done for quasi-isotropic layups of a toughened composite material. Experimental observations on changes in the impact damage state and postimpact compressive performance were presented for seven different laminate stacking sequences. The applicability and limitations of analysis compared to experimental results were also discussed. Postimpact compressive behavior was found to be a strong function of the laminate stacking sequence. This relationship was found to depend on thickness, stacking sequence, size, and location of sublaminates that comprise the impact damage state. The postimpact strength for specimens with a relatively symmetric distribution of damage through the laminate thickness was accurately predicted by models that accounted for sublaminate stability and in-plane stress redistribution. An asymmetric distribution of damage in some laminate stacking sequences tended to alter specimen stability. Geometrically nonlinear finite element analysis was used to predict this behavior.

  13. Replacement of all arginine residues with canavanine in MazF-bs mRNA interferase changes its specificity.

    PubMed

    Ishida, Yojiro; Park, Jung-Ho; Mao, Lili; Yamaguchi, Yoshihiro; Inouye, Masayori

    2013-03-15

    Replacement of a specific amino acid residue in a protein with nonnatural analogues is highly challenging because of their cellular toxicity. We demonstrate for the first time the replacement of all arginine (Arg) residues in a protein with canavanine (Can), a toxic Arg analogue. All Arg residues in the 5-base specific (UACAU) mRNA interferase from Bacillus subtilis (MazF-bs(arg)) were replaced with Can by using the single-protein production system in Escherichia coli. The resulting MazF-bs(can) gained a 6-base recognition sequence, UACAUA, for RNA cleavage instead of the 5-base sequence, UACAU, for MazF-bs(arg). Mass spectrometry analysis confirmed that all Arg residues were replaced with Can. The present system offers a novel approach to create new functional proteins by replacing a specific amino acid in a protein with its analogues.

  14. ProfileGrids: a sequence alignment visualization paradigm that avoids the limitations of Sequence Logos.

    PubMed

    Roca, Alberto I

    2014-01-01

    The 2013 BioVis Contest provided an opportunity to evaluate different paradigms for visualizing protein multiple sequence alignments. Such data sets are becoming extremely large and thus taxing current visualization paradigms. Sequence Logos represent consensus sequences but have limitations for protein alignments. As an alternative, ProfileGrids are a new protein sequence alignment visualization paradigm that represents an alignment as a color-coded matrix of the residue frequency occurring at every homologous position in the aligned protein family. The JProfileGrid software program was used to analyze the BioVis contest data sets to generate figures for comparison with the Sequence Logo reference images. The ProfileGrid representation allows for the clear and effective analysis of protein multiple sequence alignments. This includes both a general overview of the conservation and diversity sequence patterns as well as the interactive ability to query the details of the protein residue distributions in the alignment. The JProfileGrid software is free and available from http://www.ProfileGrid.org.
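
    The matrix underlying a ProfileGrid, as described above, holds the frequency of each residue at every alignment position. The sketch below shows one way to build such a matrix. It is not taken from JProfileGrid: the function name and toy alignment are hypothetical, and the gap character is counted as just another symbol.

```python
from collections import Counter


def residue_frequency_matrix(alignment):
    """Per-column residue frequencies for equal-length aligned sequences."""
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("aligned sequences must all have the same length")
    matrix = []
    for column in zip(*alignment):
        counts = Counter(column)
        matrix.append({res: n / len(column) for res, n in counts.items()})
    return matrix


# Toy alignment of four sequences; '-' marks a gap.
toy_alignment = ["MKV-", "MRVA", "MKIA", "MKVA"]
matrix = residue_frequency_matrix(toy_alignment)
for position, freqs in enumerate(matrix, start=1):
    print(position, freqs)
```

    A visualization would then map each frequency to a cell color, with one row per residue type and one column per alignment position.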

  15. Molecular cloning, sequence and structural analysis of dehairing Mn(2+) dependent alkaline serine protease (MASPT) of Bacillus pumilus TMS55.

    PubMed

    Ibrahim, Kalibulla Syed; Muniyandi, Jeyaraj; Pandian, Shunmugiah Karutha

    2011-10-01

    Leather industries release large amounts of pollution-causing chemicals, making them one of the major sources of industrial pollution. The development of enzyme-based processes as a potent alternative to pollution-causing chemicals is useful to overcome this issue. Proteases are enzymes which have extensive applications in leather processing and in several bioremediation processes due to their high alkaline protease activity and dehairing efficacy. In the present study, we report the cloning and characterization of a Mn2+ dependent alkaline serine protease gene (MASPT) of Bacillus pumilus TMS55. The gene encoding the protease from B. pumilus TMS55 was cloned and its nucleotide sequence was determined. This gene has an open reading frame (ORF) of 1,149 bp that encodes a polypeptide of 383 amino acid residues. Our analysis showed that this polypeptide is composed of a 29-residue N-terminal signal peptide, a propeptide of 79 residues and a mature protein of 275 amino acids. We performed bioinformatics analysis to compare the MASPT enzyme with other proteases. Homology modeling was employed to model the three-dimensional structure of MASPT. Structural analysis showed that the MASPT structure is composed of nine α-helices and nine β-strands. It has 3 catalytic residues and 14 metal binding residues. Docking analysis showed that residues S223, A260, N263, T328 and S329 interact with Mn2+. This study allows initial inferences about the structure of the protease and will allow the rational design of its derivatives for structure-function studies and also for further improvement of the enzyme.

  16. Isolation and characterization of the chicken trypsinogen gene family.

    PubMed Central

    Wang, K; Gan, L; Lee, I; Hood, L

    1995-01-01

    Based on genomic Southern hybridizations and cDNA sequence analyses, the chicken trypsinogen gene family can be divided into two multi-member subfamilies, a six-member trypsinogen I subfamily which encodes the cationic trypsin isoenzymes and a three-member trypsinogen II subfamily which encodes the anionic trypsin isoenzymes. The chicken cDNA and genomic clones containing these two subfamilies were isolated and characterized by DNA sequence analysis. The results indicated that the chicken trypsinogen genes encoded a signal peptide of 15 to 16 amino acid residues, an activation peptide of 9 to 10 residues and a trypsin of 223 amino acid residues. The chicken trypsinogens contain all the common catalytic and structural features for trypsins, including the catalytic triad His, Asp and Ser and the six disulphide bonds. The trypsinogen I and II subfamilies share approximately 70% sequence identity at the nucleotide and amino acid level. The sequence comparison among chicken trypsinogen subfamily members and trypsin sequences from other species suggested that the chicken trypsinogen genes may have evolved in coincidental or concerted fashion. PMID:7733885

  17. Mechanism of degradation of 2'-deoxycytidine by formamide: implications for chemical DNA sequencing procedures.

    PubMed

    Saladino, R; Crestini, C; Mincione, E; Costanzo, G; Di Mauro, E; Negri, R

    1997-11-01

    We describe the reaction of formamide with 2'-deoxycytidine to give pyrimidine ring opening by nucleophilic addition on the electrophilic C(6) and C(4) positions. This information is confirmed by the analysis of the products of formamide attack on 2'-deoxycytidine, 5-methyl-2'-deoxycytidine, and 5-bromo-2'-deoxycytidine residues when the latter are incorporated into oligonucleotides by DNA polymerase-driven polymerization and the solid-phase phosphoramidite procedure. The increased sensitivity of 5-bromo-2'-deoxycytidine relative to that of 2'-deoxycytidine is pivotal for the improvement of the one-lane chemical DNA sequencing procedure based on the base-selective reaction of formamide with DNA. In many DNA sequencing cases it will in fact be possible to incorporate this base analogue into the DNA to be sequenced, thus providing a complete discrimination between its UV absorption signal and that of the thymidine residues. The wide spectrum of different sensitivities to formamide displayed by the 2'-deoxycytidine analogues solves, in the DNA single-lane chemical sequencing procedure, the possible source of errors due to low discrimination between C and T residues.

  18. Prediction of Spontaneous Protein Deamidation from Sequence-Derived Secondary Structure and Intrinsic Disorder.

    PubMed

    Lorenzo, J Ramiro; Alonso, Leonardo G; Sánchez, Ignacio E

    2015-01-01

    Asparagine residues in proteins undergo spontaneous deamidation, a post-translational modification that may act as a molecular clock for the regulation of protein function and turnover. Asparagine deamidation is modulated by protein local sequence, secondary structure and hydrogen bonding. We present NGOME, an algorithm able to predict non-enzymatic deamidation of internal asparagine residues in proteins in the absence of structural data, using sequence-based predictions of secondary structure and intrinsic disorder. Compared to previous algorithms, NGOME does not require three-dimensional structures yet yields better predictions than available sequence-only methods. Four case studies of specific proteins show how NGOME may help the user identify deamidation-prone asparagine residues, often related to protein gain of function, protein degradation or protein misfolding in pathological processes. A fifth case study applies NGOME at a proteomic scale and unveils a correlation between asparagine deamidation and protein degradation in yeast. NGOME is freely available as a webserver at the National EMBnet node Argentina, URL: http://www.embnet.qb.fcen.uba.ar/ in the subpage "Protein and nucleic acid structure and sequence analysis".

  19. Tn5401, a new class II transposable element from Bacillus thuringiensis.

    PubMed Central

    Baum, J A

    1994-01-01

    A new class II (Tn3-like) transposable element, designated Tn5401, was recovered from a sporulation-deficient variant of Bacillus thuringiensis subsp. morrisoni EG2158 following its insertion into a recombinant plasmid. Sequence analysis of the insert revealed a 4,837-bp transposon with two large open reading frames, in the same orientation, encoding proteins of 36 kDa (306 residues) and 116 kDa (1,005 residues) and 53-bp terminal inverted repeats. The deduced amino acid sequence for the 36-kDa protein shows 24% sequence identity with the TnpI recombinase of the B. thuringiensis transposon Tn4430, a member of the phage integrase family of site-specific recombinases. The deduced amino acid sequence for the 116-kDa protein shows 42% sequence identity with the transposase of Tn3 but only 28% identity with the TnpA transposase of Tn4430. Two small open reading frames of unknown function, designated orf1 (85 residues) and orf2 (74 residues), were also identified. Southern blot analysis indicated that Tn5401, in contrast to Tn4430, is not commonly found among different subspecies of B. thuringiensis and is not typically associated with known insecticidal crystal protein genes. Transposition was studied with B. thuringiensis by using plasmid pEG922, a temperature-sensitive shuttle vector containing Tn5401. Tn5401 transposed to both chromosomal and plasmid target sites but displayed an apparent preference for plasmid sites. Transposition was replicative and resulted in the generation of a 5-bp duplication at the target site. Transcriptional start sites within Tn5401 were mapped by primer extension analysis. Two promoters, designated PL and PR, direct the transcription of orf1-orf2 and tnpI-tnpA, respectively, and are negatively regulated by TnpI. Sequence comparison of the promoter regions of Tn5401 and Tn4430 suggests that the conserved sequence element ATGTCCRCTAAY mediates TnpI binding and cointegrate resolution. 
The same element is contained within the 53-bp terminal inverted repeats, thus accounting for their unusual lengths and suggesting an additional role for TnpI in regulating Tn5401 transposition. PMID:7514590
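
    The conserved element ATGTCCRCTAAY is written with IUPAC ambiguity codes (R = A or G, Y = C or T). One way to scan a sequence for such a motif is to expand it into a regular expression, as sketched below. The helper name and the test sequence are made up for illustration, and only the forward strand is searched.

```python
import re

# Standard IUPAC nucleotide codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def motif_to_regex(motif):
    """Compile an IUPAC nucleotide motif into a regular expression."""
    return re.compile("".join(IUPAC[base] for base in motif.upper()))


pattern = motif_to_regex("ATGTCCRCTAAY")

# Made-up sequence containing two variants of the element.
sequence = "GGATGTCCACTAATCCATGTCCGCTAACGG"
hits = [m.start() for m in pattern.finditer(sequence)]
print(hits)  # 0-based start positions of non-overlapping matches
```

    Because `finditer` reports non-overlapping matches only, a scan for overlapping sites or for the reverse strand would need a lookahead pattern or a second pass over the reverse complement.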

  20. Reinventing Cell Penetrating Peptides Using Glycosylated Methionine Sulfonium Ion Sequences.

    PubMed

    Kramer, Jessica R; Schmidt, Nathan W; Mayle, Kristine M; Kamei, Daniel T; Wong, Gerard C L; Deming, Timothy J

    2015-05-27

    Cell penetrating peptides (CPPs) are intriguing molecules that have received much attention, both in terms of mechanistic analysis and as transporters for intracellular therapeutic delivery. Most CPPs contain an abundance of cationic charged residues, typically arginine, where the amino acid compositions, rather than specific sequences, tend to determine their ability to enter cells. Hydrophobic residues are often added to cationic sequences to create efficient CPPs, but typically at the penalty of increased cytotoxicity. Here, we examined polypeptides containing glycosylated, cationic derivatives of methionine, where we found these hydrophilic polypeptides to be surprisingly effective as CPPs and to also possess low cytotoxicity. X-ray analysis of how these new polypeptides interact with lipid membranes revealed that the incorporation of sterically demanding hydrophilic cationic groups in polypeptides is an unprecedented new concept for design of potent CPPs.

  1. Reinventing cell penetrating peptides using glycosylated methionine sulfonium ion sequences

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kramer, Jessica R.; Schmidt, Nathan W.; Mayle, Kristine M.

    2015-04-15

    Cell penetrating peptides (CPPs) are intriguing molecules that have received much attention, both in terms of mechanistic analysis and as transporters for intracellular therapeutic delivery. Most CPPs contain an abundance of cationic charged residues, typically arginine, where the amino acid compositions, rather than specific sequences, tend to determine their ability to enter cells. Hydrophobic residues are often added to cationic sequences to create efficient CPPs, but typically at the penalty of increased cytotoxicity. Here, we examined polypeptides containing glycosylated, cationic derivatives of methionine, where we found these hydrophilic polypeptides to be surprisingly effective as CPPs and to also possess low cytotoxicity. X-ray analysis of how these new polypeptides interact with lipid membranes revealed that the incorporation of sterically demanding hydrophilic cationic groups in polypeptides is an unprecedented new concept for design of potent CPPs.

  2. STING Millennium: a web-based suite of programs for comprehensive and simultaneous analysis of protein structure and sequence

    PubMed Central

    Neshich, Goran; Togawa, Roberto C.; Mancini, Adauto L.; Kuser, Paula R.; Yamagishi, Michel E. B.; Pappas, Georgios; Torres, Wellington V.; Campos, Tharsis Fonseca e; Ferreira, Leonardo L.; Luna, Fabio M.; Oliveira, Adilton G.; Miura, Ronald T.; Inoue, Marcus K.; Horita, Luiz G.; de Souza, Dimas F.; Dominiquini, Fabiana; Álvaro, Alexandre; Lima, Cleber S.; Ogawa, Fabio O.; Gomes, Gabriel B.; Palandrani, Juliana F.; dos Santos, Gabriela F.; de Freitas, Esther M.; Mattiuz, Amanda R.; Costa, Ivan C.; de Almeida, Celso L.; Souza, Savio; Baudet, Christian; Higa, Roberto H.

    2003-01-01

    STING Millennium Suite (SMS) is a new web-based suite of programs and databases providing visualization and a complex analysis of molecular sequence and structure for the data deposited at the Protein Data Bank (PDB). SMS operates with a collection of both publicly available data (PDB, HSSP, Prosite) and its own data (contacts, interface contacts, surface accessibility). Biologists find SMS useful because it provides a variety of algorithms and validated data, wrapped-up in a user friendly web interface. Using SMS it is now possible to analyze sequence to structure relationships, the quality of the structure, nature and volume of atomic contacts of intra and inter chain type, relative conservation of amino acids at the specific sequence position based on multiple sequence alignment, indications of folding essential residue (FER) based on the relationship of the residue conservation to the intra-chain contacts and Cα–Cα and Cβ–Cβ distance geometry. Specific emphasis in SMS is given to interface forming residues (IFR)—amino acids that define the interactive portion of the protein surfaces. SMS may simultaneously display and analyze previously superimposed structures. PDB updates trigger SMS updates in a synchronized fashion. SMS is freely accessible for public data at http://www.cbi.cnptia.embrapa.br, http://mirrors.rcsb.org/SMS and http://trantor.bioc.columbia.edu/SMS. PMID:12824333

  3. A Novel Cylindrical Representation for Characterizing Intrinsic Properties of Protein Sequences.

    PubMed

    Yu, Jia-Feng; Dou, Xiang-Hua; Wang, Hong-Bo; Sun, Xiao; Zhao, Hui-Ying; Wang, Ji-Hua

    2015-06-22

    The composition and sequence order of amino acid residues are the two most important characteristics to describe a protein sequence. Graphical representations facilitate visualization of biological sequences and produce biologically useful numerical descriptors. In this paper, we propose a novel cylindrical representation by placing the 20 amino acid residue types in a circle and sequence positions along the z axis. This representation allows visualization of the composition and sequence order of amino acids at the same time. Ten numerical descriptors and one weighted numerical descriptor have been developed to quantitatively describe intrinsic properties of protein sequences on the basis of the cylindrical model. Their applications to similarity/dissimilarity analysis of nine ND5 proteins indicated that these numerical descriptors are more effective than several classical numerical matrices. Thus, the cylindrical representation obtained here provides a new useful tool for visualizing and characterizing protein sequences. An online server is available at http://biophy.dzu.edu.cn:8080/CNumD/input.jsp.
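    The geometric idea in this abstract (residue types on a circle, sequence position along the z axis) can be sketched as follows. The alphabetical ordering of residue types around the circle, the unit radius, and the function name are assumptions made for illustration; the abstract does not state the ordering the authors used, and their numerical descriptors are not reproduced here.

```python
import math

# Assumed ordering of the 20 residue types around the circle (alphabetical).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def cylindrical_coords(sequence, radius=1.0):
    """Map each residue to (x, y, z): angle from residue type, z from position."""
    n_types = len(AMINO_ACIDS)
    coords = []
    for z, residue in enumerate(sequence.upper(), start=1):
        k = AMINO_ACIDS.index(residue)  # raises ValueError for unknown symbols
        theta = 2 * math.pi * k / n_types
        coords.append((radius * math.cos(theta), radius * math.sin(theta), float(z)))
    return coords


for point in cylindrical_coords("ACGA"):
    print(tuple(round(v, 3) for v in point))
```

    Identical residues share the same (x, y) pair, so composition shows up as stacking along a vertical line while sequence order shows up as the path through successive z levels.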

  4. Principal sequence pattern analysis of episodes of excess mortality due to heat in the Barcelona metropolitan area.

    PubMed

    Peña, Juan Carlos; Aran, Montserrat; Raso, José Miguel; Pérez-Zanón, Nuria

    2015-04-01

    The aim of the study is to classify the synoptic sequences associated with excess mortality during the warm season in the Barcelona metropolitan area. To achieve this purpose, we undertook a principal sequence pattern analysis that incorporates different atmospheric levels, in an attempt at identifying the main features that account for dynamic and thermodynamic atmospheric processes. The sequence length was determined by the short-term displacement between temperature and mortality. To detect this lag, we applied the cross-correlation function to the residuals obtained from the modelling of the daily temperature and mortality series of summer. These residuals were estimated by means of an autoregressive integrated moving average (ARIMA) model. A 7-day sequence emerged as the basic temporal unit for evaluating the synoptic background that triggers the temperature related to excess mortality in the Barcelona metropolitan area. The principal sequence pattern analysis distinguished three main synoptic patterns: two dynamic configurations produced by southern fluxes related to an Atlantic low, which can be associated with heat waves recorded in southern Europe, and a third pattern identified by a stagnation situation associated with the persistence of a blocking anticyclone over Europe, related to heat waves recorded in northern and central western Europe.

  5. Amino acid sequence of tyrosinase from Neurospora crassa.

    PubMed Central

    Lerch, K

    1978-01-01

    The amino-acid sequence of tyrosinase from Neurospora crassa (monophenol,dihydroxyphenylalanine:oxygen oxidoreductase, EC 1.14.18.1) is reported. This copper-containing oxidase consists of a single polypeptide chain of 407 amino acids. The primary structure was determined by automated and manual sequence analysis on fragments produced by cleavage with cyanogen bromide and on peptides obtained by digestion with trypsin, pepsin, thermolysin, or chymotrypsin. The amino terminus of the protein is acetylated and the single cysteinyl residue 96 is covalently linked via a thioether bridge to histidyl residue 94. The formation and the possible role of this unusual structure in Neurospora tyrosinase is discussed. Dye-sensitized photooxidation of apotyrosinase and active-site-directed inactivation of the native enzyme indicate the possible involvement of histidyl residues 188, 192, 289, and 305 or 306 as ligands to the active-site copper as well as in the catalytic mechanism of this monooxygenase. PMID:151279

  6. Molecular cloning of pepsinogens A and C from adult newt (Cynops pyrrhogaster) stomach.

    PubMed

    Inokuchi, Tomofumi; Ikuzawa, Masayuki; Yamazaki, Shin; Watanabe, Yukari; Shiota, Koushiro; Katoh, Takuma; Kobayashi, Ken-Ichiro

    2013-08-01

    The full-length cDNAs of three pepsinogens (Pgs) were cloned from the stomach of newt, Cynops pyrrhogaster, and nucleotide sequences of the full-length cDNAs were determined. Molecular phylogenetic analysis showed that two Pgs, named PgC1 and PgC2, belong to the pepsinogen C group, and one Pg, named PgA, belongs to the pepsinogen A group. The sequences contain an open reading frame (ORF) encoding 385 amino acid residues for PgC1, 383 amino acid residues for PgC2 and 377 amino acid residues for PgA. In addition, all three amino acid sequences conserve some unique characteristics such as six cysteine residues and two putative active-site aspartic acid residues. All of the pepsinogen mRNAs were detected in the stomach by RT-PCR but not in other organs. Although a slight difference in the time of the start of expression was seen among the three pepsinogen genes, all of them were expressed in the larval stage after hatching. This is the first report on cloning of pepsinogens from urodele stomach. Copyright © 2013 Elsevier Inc. All rights reserved.

  7. Combined sequence and structure analysis of the fungal laccase family.

    PubMed

    Kumar, S V Suresh; Phale, Prashant S; Durani, S; Wangikar, Pramod P

    2003-08-20

    Plant and fungal laccases belong to the family of multi-copper oxidases and show much broader substrate specificity than other members of the family. Laccases have consequently been of interest for potential industrial applications. We have analyzed the essential sequence features of fungal laccases based on multiple sequence alignments of more than 100 laccases. This has resulted in identification of a set of four ungapped sequence regions, L1-L4, as the overall signature sequences that can be used to identify the laccases, distinguishing them within the broader class of multi-copper oxidases. The 12 amino acid residues in the enzymes serving as the copper ligands are housed within these four identified conserved regions, of which L2 and L4 conform to the earlier reported copper signature sequences of multi-copper oxidases while L1 and L3 are distinctive to the laccases. The mapping of regions L1-L4 onto the three-dimensional structure of the Coprinus cinereus laccase indicates that many of the non-copper-ligating residues of the conserved regions could be critical in maintaining a specific, more or less C-2 symmetric, protein conformational motif characterizing the active site apparatus of the enzymes. The observed intraprotein homologies between L1 and L3 and between L2 and L4 at both the structure and the sequence levels suggest that the quasi C-2 symmetric active site conformational motif may have arisen from a structural duplication event that neither the sequence homology analysis nor the structure homology analysis alone would have unraveled. Although the sequence and structure homology is not detectable in the rest of the protein, the relative orientation of region L1 with L2 is similar to that of L3 with L4. 
The structure duplication of first-shell and second-shell residues has become cryptic because the intraprotein sequence homology noticeable for a given laccase becomes significant only after comparing the conservation pattern in several fungal laccases. The identified motifs, L1-L4, can be useful in searching the newly sequenced genomes for putative laccase enzymes. Copyright 2003 Wiley Periodicals, Inc. Biotechnol Bioeng 83: 386-394, 2003.

  8. Primary and secondary structural analyses of glutathione S-transferase pi from human placenta.

    PubMed

    Ahmad, H; Wilson, D E; Fritz, R R; Singh, S V; Medh, R D; Nagle, G T; Awasthi, Y C; Kurosky, A

    1990-05-01

    The primary structure of glutathione S-transferase (GST) pi from a single human placenta was determined. The structure was established by chemical characterization of tryptic and cyanogen bromide peptides as well as automated sequence analysis of the intact enzyme. The structural analysis indicated that the protein is comprised of 209 amino acid residues and gave no evidence of post-translational modifications. The amino acid sequence differed from that of the deduced amino acid sequence determined by nucleotide sequence analysis of a cDNA clone (Kano, T., Sakai, M., and Muramatsu, M., 1987, Cancer Res. 47, 5626-5630) at position 104 which contained both valine and isoleucine whereas the deduced sequence from nucleotide sequence analysis identified only isoleucine at this position. These results demonstrated that in the one individual placenta studied at least two GST pi genes are coexpressed, probably as a result of allelomorphism. Computer assisted consensus sequence evaluation identified a hydrophobic region in GST pi (residues 155-181) that was predicted to be either a buried transmembrane helical region or a signal sequence region. The significance of this hydrophobic region was interpreted in relation to the mode of action of the enzyme especially in regard to the potential involvement of a histidine in the active site mechanism. A comparison of the chemical similarity of five known human GST complete enzyme structures, one of pi, one of mu, two of alpha, and one microsomal, gave evidence that all five enzymes have evolved by a divergent evolutionary process after gene duplication, with the microsomal enzyme representing the most divergent form.

  9. Multivariate sequence analysis reveals additional function impacting residues in the SDR superfamily.

    PubMed

    Tiwari, Pratibha; Singh, Noopur; Dixit, Aparna; Choudhury, Devapriya

    2014-10-01

    The "extended" type of short chain dehydrogenases/reductases (SDR) share a remarkable similarity in their tertiary structures in spite of being highly divergent in their functions and sequences. We have carried out principal component analysis (PCA) on structurally equivalent residue positions of 10 SDR families using information theoretic measures like Jensen-Shannon divergence and average Shannon entropy as variables. The results classify residue positions in the SDR fold into six groups, one of which is characterized by low Shannon entropies but high Jensen-Shannon divergence against the reference family SDR1E, suggesting that these positions are responsible for the specific functional identities of individual SDR families, distinguishing them from the reference family SDR1E. Site directed mutagenesis of three residues from this group in the enzyme UDP-Galactose 4-epimerase belonging to SDR1E shows that the mutants promote the formation of NADH containing abortive complexes. Finally, molecular dynamics simulations have been used to suggest a mechanism by which the mutants interfere with the re-oxidation of NADH leading to the formation of abortive complexes. © 2014 Wiley Periodicals, Inc.
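    The two information-theoretic measures named in this abstract can be computed per alignment column as in the sketch below. This is a generic illustration using the standard definitions (entropy in bits, Jensen-Shannon divergence with equal weights); the function names and example columns are hypothetical, and the authors' exact averaging, gap handling, and PCA steps are not shown.

```python
import math
from collections import Counter

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"


def column_distribution(column):
    """Residue frequencies for one alignment column; gaps and unknowns are ignored."""
    counts = Counter(c for c in column.upper() if c in ALPHABET)
    total = sum(counts.values())
    if total == 0:
        return {a: 0.0 for a in ALPHABET}
    return {a: counts[a] / total for a in ALPHABET}


def shannon_entropy(p):
    """Shannon entropy in bits of a frequency dictionary."""
    return -sum(v * math.log2(v) for v in p.values() if v > 0)


def jensen_shannon_divergence(p, q):
    """JSD(p, q) = H((p + q) / 2) - (H(p) + H(q)) / 2, in bits."""
    mixture = {a: 0.5 * (p[a] + q[a]) for a in p}
    return shannon_entropy(mixture) - 0.5 * (shannon_entropy(p) + shannon_entropy(q))


# Hypothetical columns: the same position in a family and in a reference family.
family = column_distribution("DDDD")
reference = column_distribution("NNNN")
print(shannon_entropy(family))                       # low entropy: fully conserved
print(jensen_shannon_divergence(family, reference))  # high divergence: 1.0 bit
```

    A position that is conserved within each family but differs between families gives low entropy and high divergence, which is the signature the abstract associates with family-specific function.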

  10. Using noble gas tracers to estimate residual CO2 saturation in the field: results from the CO2CRC Otway residual saturation and dissolution test

    NASA Astrophysics Data System (ADS)

    LaForce, T.; Ennis-King, J.; Paterson, L.

    2013-12-01

    Residual CO2 saturation is a critically important parameter in CO2 storage as it can have a large impact on the available secure storage volume and post-injection CO2 migration. A suite of single-well tests to measure residual trapping was conducted at the Otway test site in Victoria, Australia during 2011. One or more of these tests could be conducted at a prospective CO2 storage site before large-scale injection. The test involved injection of 150 tonnes of pure carbon dioxide followed by 454 tonnes of CO2-saturated formation water to drive the carbon dioxide to residual saturation. This work presents a brief overview of the full test sequence, followed by the analysis and interpretation of the tests using noble gas tracers. Prior to CO2 injection krypton (Kr) and xenon (Xe) tracers were injected and back-produced to characterise the aquifer under single-phase conditions. After CO2 had been driven to residual the two tracers were injected and produced again. The noble gases act as non-partitioning aqueous-phase tracers in the undisturbed aquifer and as partitioning tracers in the presence of residual CO2. To estimate residual saturation from the tracer test data a one-dimensional radial model of the near-well region is used. In the model there are only two independent parameters: the apparent dispersivity of each tracer and the residual CO2 saturation. Independent analysis of the Kr and Xe tracer production curves gives the same estimate of residual saturation to within the accuracy of the method. Furthermore the residual from the noble gas tracer tests is consistent with other measurements in the sequence of tests.

  11. RECOVIR Software for Identifying Viruses

    NASA Technical Reports Server (NTRS)

    Chakravarty, Sugoto; Fox, George E.; Zhu, Dianhui

    2013-01-01

    Most single-stranded RNA (ssRNA) viruses mutate rapidly to generate a large number of strains with highly divergent capsid sequences. Determining the capsid residues or nucleotides that uniquely characterize these strains is critical in understanding the strain diversity of these viruses. RECOVIR (an acronym for "recognize viruses") software predicts the strains of some ssRNA viruses from their limited sequence data. Novel phylogenetic-tree-based databases of protein or nucleic acid residues that uniquely characterize these virus strains are created. Strains of input virus sequences (partial or complete) are predicted through residue-wise comparisons with the databases. RECOVIR uses unique characterizing residues to identify automatically strains of partial or complete capsid sequences of picorna and caliciviruses, two of the most highly diverse ssRNA virus families. Partition-wise comparisons of the database residues with the corresponding residues of more than 300 complete and partial sequences of these viruses resulted in correct strain identification for all of these sequences. This study shows the feasibility of creating databases of hitherto unknown residues uniquely characterizing the capsid sequences of two of the most highly divergent ssRNA virus families. These databases enable automated strain identification from partial or complete capsid sequences of these human and animal pathogens.

  12. ProfileGrids: a sequence alignment visualization paradigm that avoids the limitations of Sequence Logos

    PubMed Central

    2014-01-01

    Background The 2013 BioVis Contest provided an opportunity to evaluate different paradigms for visualizing protein multiple sequence alignments. Such data sets are becoming extremely large and thus taxing current visualization paradigms. Sequence Logos represent consensus sequences but have limitations for protein alignments. As an alternative, ProfileGrids are a new protein sequence alignment visualization paradigm that represents an alignment as a color-coded matrix of the residue frequency occurring at every homologous position in the aligned protein family. Results The JProfileGrid software program was used to analyze the BioVis contest data sets to generate figures for comparison with the Sequence Logo reference images. Conclusions The ProfileGrid representation allows for the clear and effective analysis of protein multiple sequence alignments. This includes both a general overview of the conservation and diversity sequence patterns as well as the interactive ability to query the details of the protein residue distributions in the alignment. The JProfileGrid software is free and available from http://www.ProfileGrid.org. PMID:25237393
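    The underlying data structure of a ProfileGrid, as described above, is a matrix of residue frequencies per alignment position. A minimal sketch of building such a matrix follows; it is not the JProfileGrid implementation, the function name and toy alignment are hypothetical, and gaps are counted as an ordinary symbol for simplicity.

```python
from collections import Counter


def residue_frequency_grid(alignment):
    """Return one {symbol: frequency} dictionary per alignment column."""
    if len({len(seq) for seq in alignment}) != 1:
        raise ValueError("alignment must be non-empty with equal-length sequences")
    grid = []
    for column in zip(*alignment):
        counts = Counter(column)
        grid.append({symbol: n / len(column) for symbol, n in counts.items()})
    return grid


# Hypothetical toy alignment of four sequences.
toy_alignment = ["MKV", "MRV", "M-V", "MKI"]
for position, freqs in enumerate(residue_frequency_grid(toy_alignment), start=1):
    print(position, freqs)
```

    Color-coding each cell by its frequency would give the grid view; the frequencies themselves are all that this sketch computes.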

  13. Large‐scale analysis of intrinsic disorder flavors and associated functions in the protein sequence universe

    PubMed Central

    Necci, Marco; Piovesan, Damiano

    2016-01-01

    Abstract Intrinsic disorder (ID) in proteins has been extensively described for the last decade; a large‐scale classification of ID in proteins is mostly missing. Here, we provide an extensive analysis of ID in the protein universe on the UniProt database derived from sequence‐based predictions in MobiDB. Almost half the sequences contain an ID region of at least five residues. About 9% of proteins have a long ID region of over 20 residues which are more abundant in Eukaryotic organisms and most frequently cover less than 20% of the sequence. A small subset of about 67,000 (out of over 80 million) proteins is fully disordered and mostly found in Viruses. Most proteins have only one ID, with short ID evenly distributed along the sequence and long ID overrepresented in the center. The charged residue composition of Das and Pappu was used to classify ID proteins by structural propensities and corresponding functional enrichment. Swollen Coils seem to be used mainly as structural components and in biosynthesis in both Prokaryotes and Eukaryotes. In Bacteria, they are confined in the nucleoid and in Viruses provide DNA binding function. Coils & Hairpins seem to be specialized in ribosome binding and methylation activities. Globules & Tadpoles bind antigens in Eukaryotes but are involved in killing other organisms and cytolysis in Bacteria. The Undefined class is used by Bacteria to bind toxic substances and mediate transport and movement between and within organisms in Viruses. Fully disordered proteins behave similarly, but are enriched for glycine residues and extracellular structures. PMID:27636733
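    The length thresholds used in this abstract (regions of at least five residues, long regions of over 20 residues) can be applied to per-residue predictions as sketched below. The input encoding ("D" for disordered, "S" for structured), the function name, and the example string are assumptions for illustration and do not reflect the MobiDB data format.

```python
from itertools import groupby


def disordered_regions(states, min_length=5):
    """Return (start, end) half-open intervals of 'D' runs of at least min_length."""
    regions = []
    position = 0
    for state, run in groupby(states):
        length = len(list(run))
        if state == "D" and length >= min_length:
            regions.append((position, position + length))
        position += length
    return regions


# Hypothetical per-residue prediction for a 40-residue protein.
states = "SS" + "D" * 6 + "SSS" + "D" * 3 + "S" + "D" * 25
print(disordered_regions(states))                 # [(2, 8), (15, 40)]
print(disordered_regions(states, min_length=21))  # [(15, 40)]
```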

  14. Iterative refinement of structure-based sequence alignments by Seed Extension

    PubMed Central

    Kim, Changhoon; Tai, Chin-Hsien; Lee, Byungkook

    2009-01-01

    Background Accurate sequence alignment is required in many bioinformatics applications but, when sequence similarity is low, it is difficult to obtain accurate alignments based on sequence similarity alone. The accuracy improves when the structures are available, but current structure-based sequence alignment procedures still mis-align substantial numbers of residues. In order to correct such errors, we previously explored the possibility of replacing the residue-based dynamic programming algorithm in structure alignment procedures with the Seed Extension algorithm, which does not use a gap penalty. Here, we describe a new procedure called RSE (Refinement with Seed Extension) that iteratively refines a structure-based sequence alignment. Results RSE uses SE (Seed Extension) in its core, which is an algorithm that we reported recently for obtaining a sequence alignment from two superimposed structures. The RSE procedure was evaluated by comparing the correctly aligned fractions of residues before and after the refinement of the structure-based sequence alignments produced by popular programs. CE, DaliLite, FAST, LOCK2, MATRAS, MATT, TM-align, SHEBA and VAST were included in this analysis and the NCBI's CDD root node set was used as the reference alignments. RSE improved the average accuracy of sequence alignments for all programs tested when no shift error was allowed. The amount of improvement varied depending on the program. The average improvements were small for DaliLite and MATRAS but about 5% for CE and VAST. More substantial improvements have been seen in many individual cases. The additional computation times required for the refinements were negligible compared to the times taken by the structure alignment programs. Conclusion RSE is a computationally inexpensive way of improving the accuracy of a structure-based sequence alignment. 
It can be used as a standalone procedure following a regular structure-based sequence alignment or to replace the traditional iterative refinement procedures based on residue-level dynamic programming algorithm in many structure alignment programs. PMID:19589133

  15. Sequence analysis of PROTEOLYSIS 6 from Solanum lycopersicum

    NASA Astrophysics Data System (ADS)

    Roslan, Nur Farhana; Chew, Bee Lyn; Goh, Hoe-Han; Isa, Nurulhikma Md

    2018-04-01

    The N-end rule pathway is a protein degradation pathway that relates the half-life of a protein to the identity of its N-terminal residue. A destabilizing N-terminal residue is created by enzymatic reaction or chemical modification. The destabilized substrate is recognized by the PROTEOLYSIS 6 (PRT6) protein, an E3 ligase, which results in degradation of the substrate by the proteasome. PRT6 has been studied in Arabidopsis thaliana and barley but has not yet been studied in fleshy fruit plants. Hence, this study was carried out in tomato, which is known as the model for fleshy fruit plants. BLASTX analysis identified Solyc09g010830 as the gene encoding PRT6 in tomato, based on its sequence similarity with PRT6 in A. thaliana. In silico gene expression analysis shows that the PRT6 gene is highly expressed in tomato fruits at breaker +5. Co-expression analysis shows that PRT6 may be involved not only in abiotic stresses but also in biotic stresses. The objective is to analyze the sequence and characterize the PRT6 gene in tomato.

  16. Identification and expression analysis of a novel R-type lectin from the coleopteran beetle, Tenebrio molitor.

    PubMed

    Kim, Dong Hyun; Patnaik, Bharat Bhusan; Seo, Gi Won; Kang, Seong Min; Lee, Yong Seok; Lee, Bok Luel; Han, Yeon Soo

    2013-11-01

    We have identified a novel ricin-type (R-type) lectin by sequencing random clones from a cDNA library of the coleopteran beetle, Tenebrio molitor. The cDNA sequence comprises 495 bp encoding a protein of 164 amino acid residues and shows 49% identity with galectin of Tribolium castaneum. Bioinformatics analysis shows that amino acid residues 35 to 162 belong to a ricin-type beta-trefoil structure. The transcript was significantly upregulated in the early hours after injection with peptidoglycans derived from Gram (+) and Gram (-) bacteria, beta-1,3-glucan from fungi, and an intracellular pathogen, Listeria monocytogenes, suggesting a putative function in innate immunity. Copyright © 2013 Elsevier Inc. All rights reserved.

  17. Flanking signal and mature peptide residues influence signal peptide cleavage

    PubMed Central

    Choo, Khar Heng; Ranganathan, Shoba

    2008-01-01

    Background Signal peptides (SPs) mediate the targeting of secretory precursor proteins to the correct subcellular compartments in prokaryotes and eukaryotes. Identifying these transient peptides is crucial to the medical, food and beverage and biotechnology industries yet our understanding of these peptides remains limited. This paper examines the most common type of signal peptides cleavable by the endoprotease signal peptidase I (SPase I), and the residues flanking the cleavage sites of three groups of signal peptide sequences, namely (i) eukaryotes (Euk) (ii) Gram-positive (Gram+) bacteria, and (iii) Gram-negative (Gram-) bacteria. Results In this study, 2352 secretory peptide sequences from a variety of organisms with amino-terminal SPs are extracted from the manually curated SPdb database for analysis based on physicochemical properties such as pI, aliphatic index, GRAVY score, hydrophobicity, net charge and position-specific residue preferences. Our findings show that the three groups share several similarities in general, but they display distinctive features upon examination in terms of their amino acid compositions and frequencies, and various physico-chemical properties. Thus, analysis or prediction of their sequences should be separated and treated as distinct groups. Conclusion We conclude that the peptide segment recognized by SPase I extends to the start of the mature protein to a limited extent, upon our survey of the amino acid residues surrounding the cleavage processing site. These flanking residues possibly influence the cleavage processing and contribute to non-canonical cleavage sites. Our findings are applicable in defining more accurate prediction tools for recognition and identification of cleavage site of SPs. PMID:19091014
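    Two of the physicochemical properties listed in this abstract, the GRAVY score and net charge, can be computed as in the sketch below. GRAVY is taken as the mean Kyte-Doolittle hydropathy over the sequence. The net charge shown is a deliberately crude assumption (+1 for K and R, -1 for D and E, ignoring histidine, termini, and pH), and the function names and example peptide are hypothetical rather than drawn from SPdb or the authors' pipeline.

```python
# Kyte-Doolittle hydropathy values per residue.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence):
    """Grand average of hydropathy: mean Kyte-Doolittle value over all residues."""
    sequence = sequence.upper()
    return sum(KYTE_DOOLITTLE[residue] for residue in sequence) / len(sequence)


def crude_net_charge(sequence):
    """Simplified net charge: count of K and R minus count of D and E."""
    sequence = sequence.upper()
    positive = sum(sequence.count(residue) for residue in "KR")
    negative = sum(sequence.count(residue) for residue in "DE")
    return positive - negative


# Hypothetical signal-peptide-like sequence for illustration only.
peptide = "MKKLLLALAVLA"
print(round(gravy(peptide), 3), crude_net_charge(peptide))
```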

  18. Functional Analysis of the Accessory Protein TapA in Bacillus subtilis Amyloid Fiber Assembly

    PubMed Central

    Romero, Diego; Vlamakis, Hera; Losick, Richard

    2014-01-01

    Bacillus subtilis biofilm formation relies on the assembly of a fibrous scaffold formed by the protein TasA. TasA polymerizes into highly stable fibers with biochemical and morphological features of functional amyloids. Previously, we showed that assembly of TasA fibers requires the auxiliary protein TapA. In this study, we investigated the roles of TapA sequences from the C-terminal and N-terminal ends and TapA cysteine residues in its ability to promote the assembly of TasA amyloid-like fibers. We found that the cysteine residues are not essential for the formation of TasA fibers, as their replacement by alanine residues resulted in only minor defects in biofilm formation. Mutating sequences in the C-terminal half had no effect on biofilm formation. However, we identified a sequence of 8 amino acids in the N terminus that is key for TasA fiber formation. Strains expressing TapA lacking these 8 residues were completely defective in biofilm formation. In addition, this TapA mutant protein exhibited a dominant negative effect on TasA fiber formation. Even in the presence of wild-type TapA, the mutant protein inhibited fiber assembly in vitro and delayed biofilm formation in vivo. We propose that this 8-residue sequence is crucial for the formation of amyloid-like fibers on the cell surface, perhaps by mediating the interaction between TapA or TapA and TasA molecules. PMID:24488317

  19. Functional analysis of the accessory protein TapA in Bacillus subtilis amyloid fiber assembly.

    PubMed

    Romero, Diego; Vlamakis, Hera; Losick, Richard; Kolter, Roberto

    2014-04-01

    Bacillus subtilis biofilm formation relies on the assembly of a fibrous scaffold formed by the protein TasA. TasA polymerizes into highly stable fibers with biochemical and morphological features of functional amyloids. Previously, we showed that assembly of TasA fibers requires the auxiliary protein TapA. In this study, we investigated the roles of TapA sequences from the C-terminal and N-terminal ends and TapA cysteine residues in its ability to promote the assembly of TasA amyloid-like fibers. We found that the cysteine residues are not essential for the formation of TasA fibers, as their replacement by alanine residues resulted in only minor defects in biofilm formation. Mutating sequences in the C-terminal half had no effect on biofilm formation. However, we identified a sequence of 8 amino acids in the N terminus that is key for TasA fiber formation. Strains expressing TapA lacking these 8 residues were completely defective in biofilm formation. In addition, this TapA mutant protein exhibited a dominant negative effect on TasA fiber formation. Even in the presence of wild-type TapA, the mutant protein inhibited fiber assembly in vitro and delayed biofilm formation in vivo. We propose that this 8-residue sequence is crucial for the formation of amyloid-like fibers on the cell surface, perhaps by mediating the interaction between TapA or TapA and TasA molecules.

  20. Characterization of a marsupial sperm protamine gene and its transcripts from the North American opossum (Didelphis marsupialis).

    PubMed

    Winkfein, R J; Nishikawa, S; Connor, W; Dixon, G H

    1993-07-01

    A synthetic oligonucleotide primer, designed from marsupial protamine protein-sequence data [Balhorn, R., Corzett, M., Matrimas, J. A., Cummins, J. & Faden, B. (1989) Analysis of protamines isolated from two marsupials, the ring-tailed wallaby and gray short-tailed opossum, J. Cell. Biol. 107] was used to amplify, via the polymerase chain reaction, protamine sequences from a North American opossum (Didelphis marsupialis) cDNA. Using the amplified sequences as probes, several protamine cDNA clones were isolated. The protein sequence, predicted from the cDNA sequences, consisted of 57 amino acids, contained a large number of arginine residues and exhibited the sequence ARYR at its amino terminus, which is conserved in avian and most eutherian mammal protamines. Like the true protamines of trout and chicken, the opossum protamine lacked cysteine residues, distinguishing it from placental mammalian protamine 1 (P1 or stable) protamines. Examination of the protamine gene, isolated by polymerase-chain-reaction amplification of genomic DNA, revealed the presence of an intron dividing the protamine-coding region, a common characteristic of all mammalian P1 genes. In addition, extensive sequence identity in the 5' and 3' flanking regions between mouse and opossum sequences classifies the marsupial protamine as being closely related to placental mammal P1. Protamine transcripts, in both birds and mammals, are present in two size classes, differing by the length of their poly(A) tails (either short or long). Examination of opossum protamine transcripts by Northern hybridization revealed four distinct mRNA species in the total RNA fraction, two of which were enriched in the poly(A)-rich fraction. Northern-blot analysis, using an intron-specific probe, revealed the presence of intron sequences in two of the four protamine transcripts. If expressed, the corresponding protein from intron-containing transcripts would differ from that of spliced transcripts in length (49 versus 57 amino acids) and would contain a cysteine residue.

  1. Molecular characterization of canine parvovirus in Vientiane, Laos.

    PubMed

    Vannamahaxay, Soulasack; Vongkhamchanh, Souliya; Intanon, Montira; Tangtrongsup, Sahatchai; Tiwananthagorn, Saruda; Pringproa, Kidsadagon; Chuammitri, Phongsakorn

    2017-05-01

    The global emergence of canine parvovirus type 2c (CPV-2c) has been well documented. In the present study, 139 rectal swab samples collected from diarrheic dogs living in Vientiane, Laos, in 2016 were tested for the presence of the canine parvovirus (CPV) VP2 gene by PCR. The results showed that 82.73% (115/139) of dogs were CPV positive by PCR. The partial VP2 gene was sequenced in 94 of the positive samples; 91 samples belonged to CPV-2c (426Glu) subtype, while 3 samples belonged to the CPV-2a (426Asn) subtype. Notably, phylogenetic analysis of amino acid sequences revealed a close relationship between Laotian isolates and novel Chinese CPV-2c isolates. In Laotian CPV isolates, aligned protein sequences indicated a high rate of residue substitutions at positions 305, 324, 345, 370, 375, and 426 in the GH loop. The mutation at residue 370 (Q370R), a single mutation, was characterized as a unique mutant residue specific to the Laotian CPV-2c variant.
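
    The subtype assignment and the percentages quoted above can be reproduced with a few lines of Python. This is only a sketch: the function names are hypothetical, and the Asp-426 (CPV-2b) entry is general background on CPV-2 typing rather than something stated in this abstract.

```python
# Hypothetical lookup: VP2 residue 426 (one-letter code) -> CPV-2 subtype.
# Asn and Glu come from the abstract; Asp (CPV-2b) is added as background.
SUBTYPE_BY_RESIDUE_426 = {"N": "CPV-2a", "D": "CPV-2b", "E": "CPV-2c"}


def classify_cpv2(residue_426: str) -> str:
    """Return the CPV-2 subtype implied by the amino acid at VP2 position 426."""
    return SUBTYPE_BY_RESIDUE_426.get(residue_426.upper(), "unclassified")


def percent(part: int, whole: int) -> float:
    """Percentage of `part` in `whole`, rounded to two decimals."""
    return round(100.0 * part / whole, 2)


pcr_positive_rate = percent(115, 139)  # PCR-positive dogs among those sampled
cpv2c_share = percent(91, 94)          # CPV-2c among the sequenced samples
cpv2a_share = percent(3, 94)           # CPV-2a among the sequenced samples
```

    With the counts from the abstract, `pcr_positive_rate` evaluates to the reported 82.73.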

  2. Genome-Wide Analysis of Oleosin Gene Family in 22 Tree Species: An Accelerator for Metabolic Engineering of BioFuel Crops and Agrigenomics Industrial Applications?

    PubMed Central

    2015-01-01

    Trees contribute to enormous plant oil reserves because many trees contain 50%–80% of oil (triacylglycerols, TAGs) in the fruits and kernels. TAGs accumulate in subcellular structures called oil bodies/droplets, in which TAGs are covered by low-molecular-mass hydrophobic proteins called oleosins (OLEs). The OLEs/TAGs ratio determines the size and shape of intracellular oil bodies. There is a lack of comprehensive sequence analysis and structural information of OLEs among diverse trees. The objectives of this study were to identify OLEs from 22 tree species (e.g., tung tree, tea-oil tree, castor bean), perform genome-wide analysis of OLEs, classify OLEs, identify conserved sequence motifs and amino acid residues, and predict secondary and three-dimensional structures in tree OLEs and OLE subfamilies. Data mining identified 65 OLEs with perfect conservation of the “proline knot” motif (PX5SPX3P) from 19 trees. These OLEs contained >40% hydrophobic amino acid residues. They displayed similar properties and amino acid composition. Genome-wide phylogenetic analysis and multiple sequence alignment demonstrated that these proteins could be classified into five OLE subfamilies. There were distinct patterns of sequence conservation among the OLE subfamilies and within individual tree species. Computational modeling indicated that OLEs were composed of at least three α-helixes connected with short coils without any β-strand and that they exhibited distinct 3D structures and ligand binding sites. These analyses provide fundamental information in the similarity and specificity of diverse OLE isoforms within the same subfamily and among the different species, which should facilitate studying the structure-function relationship and identify critical amino acid residues in OLEs for metabolic engineering of tree TAGs. PMID:26258573

  3. Genome-Wide Analysis of Oleosin Gene Family in 22 Tree Species: An Accelerator for Metabolic Engineering of BioFuel Crops and Agrigenomics Industrial Applications?

    PubMed

    Cao, Heping

    2015-09-01

    Trees contribute to enormous plant oil reserves because many trees contain 50%-80% of oil (triacylglycerols, TAGs) in the fruits and kernels. TAGs accumulate in subcellular structures called oil bodies/droplets, in which TAGs are covered by low-molecular-mass hydrophobic proteins called oleosins (OLEs). The OLEs/TAGs ratio determines the size and shape of intracellular oil bodies. There is a lack of comprehensive sequence analysis and structural information of OLEs among diverse trees. The objectives of this study were to identify OLEs from 22 tree species (e.g., tung tree, tea-oil tree, castor bean), perform genome-wide analysis of OLEs, classify OLEs, identify conserved sequence motifs and amino acid residues, and predict secondary and three-dimensional structures in tree OLEs and OLE subfamilies. Data mining identified 65 OLEs with perfect conservation of the "proline knot" motif (PX5SPX3P) from 19 trees. These OLEs contained >40% hydrophobic amino acid residues. They displayed similar properties and amino acid composition. Genome-wide phylogenetic analysis and multiple sequence alignment demonstrated that these proteins could be classified into five OLE subfamilies. There were distinct patterns of sequence conservation among the OLE subfamilies and within individual tree species. Computational modeling indicated that OLEs were composed of at least three α-helixes connected with short coils without any β-strand and that they exhibited distinct 3D structures and ligand binding sites. These analyses provide fundamental information in the similarity and specificity of diverse OLE isoforms within the same subfamily and among the different species, which should facilitate studying the structure-function relationship and identify critical amino acid residues in OLEs for metabolic engineering of tree TAGs.
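
    The "proline knot" motif PX5SPX3P reads naturally as a regular expression: P, any five residues, S, P, any three residues, P. The sketch below searches a sequence for it and computes a hydrophobic fraction. The toy sequence and the set of residues counted as hydrophobic are assumptions made for illustration; the abstract does not say which residues the author counted.

```python
import re

# PX5SPX3P: P, five arbitrary residues, S, P, three arbitrary residues, P.
PROLINE_KNOT = re.compile(r"P.{5}SP.{3}P")

# Assumed hydrophobic residue set (illustrative; not taken from the abstract).
HYDROPHOBIC = set("AVILMFWP")


def find_proline_knot(sequence: str):
    """Return the 0-based start of the first proline-knot match, or None."""
    match = PROLINE_KNOT.search(sequence.upper())
    return match.start() if match else None


def hydrophobic_fraction(sequence: str) -> float:
    """Fraction of residues that belong to the assumed hydrophobic set."""
    sequence = sequence.upper()
    return sum(aa in HYDROPHOBIC for aa in sequence) / len(sequence)


toy_oleosin = "MAAPLLLLLSPAAAPGG"  # made-up sequence, not a real oleosin
knot_start = find_proline_knot(toy_oleosin)
fraction = hydrophobic_fraction(toy_oleosin)
```

    For the toy sequence the motif starts at index 3, and the hydrophobic fraction is above the 40% level mentioned in the abstract.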

  4. UniDrug-target: a computational tool to identify unique drug targets in pathogenic bacteria.

    PubMed

    Chanumolu, Sree Krishna; Rout, Chittaranjan; Chauhan, Rajinder S

    2012-01-01

    Targeting conserved proteins of bacteria through antibacterial medications has resulted in both the development of resistant strains and changes to human health by destroying beneficial microbes which eventually become breeding grounds for the evolution of resistances. Despite the availability of more than 800 genome sequences, 430 pathways, 4743 enzymes, 9257 metabolic reactions and three-dimensional (3D) protein structures in bacteria, no pathogen-specific computational drug target identification tool has been developed. A web server, UniDrug-Target, which combines bacterial biological information and computational methods to stringently identify pathogen-specific proteins as drug targets, has been designed. Besides predicting pathogen-specific proteins' essentiality, chokepoint property, etc., three new algorithms were developed and implemented by using protein sequences, domains, structures, and metabolic reactions for construction of partial metabolic networks (PMNs), determination of conservation in critical residues, and variation analysis of residues forming similar cavities in protein sequences. First, PMNs are constructed to determine the extent of disturbances in metabolite production caused by targeting a protein as a drug target. Second, conservation of the pathogen-specific protein's critical residues involved in cavity formation and biological function is determined at the domain level with low-matching sequences. Last, variation analysis of residues forming similar cavities in protein sequences from pathogenic versus non-pathogenic bacteria and humans is performed. The server is capable of predicting drug targets for any sequenced pathogenic bacteria having fasta sequences and annotated information. The utility of the UniDrug-Target server was demonstrated for Mycobacterium tuberculosis (H37Rv). UniDrug-Target identified 265 mycobacteria pathogen-specific proteins, including 17 essential proteins which can be potential drug targets. UniDrug-Target is expected to accelerate pathogen-specific drug target identification, which should increase the targets' success and durability, because drugs developed against them are less likely to provoke resistance or to have an adverse impact on the environment. The server is freely available at http://117.211.115.67/UDT/main.html. The standalone application (source codes) is available at http://www.bioinformatics.org/ftp/pub/bioinfojuit/UDT.rar.

  5. A strategy for detecting the conservation of folding-nucleus residues in protein superfamilies.

    PubMed

    Michnick, S W; Shakhnovich, E

    1998-01-01

    Nucleation-growth theory predicts that fast-folding peptide sequences fold to their native structure via structures in a transition-state ensemble that share a small number of native contacts (the folding nucleus). Experimental and theoretical studies of proteins suggest that residues participating in folding nuclei are conserved among homologs. We attempted to determine if this is true in proteins with highly diverged sequences but identical folds (superfamilies). We describe a strategy based on comparisons of residue conservation in natural superfamily sequences with simulated sequences (generated with a Monte-Carlo sequence design strategy) for the same proteins. The basic assumptions of the strategy were that natural sequences will conserve residues needed for folding and stability plus function, the simulated sequences contain no functional conservation, and nucleus residues make native contacts with each other. Based on these assumptions, we identified seven potential nucleus residues in ubiquitin superfamily members. Non-nucleus conserved residues were also identified; these are proposed to be involved in stabilizing native interactions. We found that all superfamily members conserved the same potential nucleus residue positions, except those for which the structural topology is significantly different. Our results suggest that the conservation of the nucleus of a specific fold can be predicted by comparing designed simulated sequences with natural highly diverged sequences that fold to the same structure. We suggest that such a strategy could be used to help plan protein folding and design experiments, to identify new superfamily members, and to subdivide superfamilies further into classes having a similar folding mechanism.
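
    The comparison described above can be sketched as follows: score each alignment column for conservation in the natural and in the simulated alignment, then treat positions conserved in both as candidates for folding or stability, and positions conserved only in the natural set as candidates for function. This sketch uses Shannon entropy with an arbitrary cutoff as a stand-in score; the authors' actual conservation measure, their contact criterion, and the toy alignments are not taken from the abstract.

```python
import math
from collections import Counter


def column_entropy(column) -> float:
    """Shannon entropy (bits) of the residues in one alignment column."""
    counts = Counter(column)
    total = len(column)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def conserved_positions(alignment, max_entropy: float = 0.5) -> set:
    """0-based indices of columns whose entropy is at or below the cutoff."""
    return {
        index
        for index, column in enumerate(zip(*alignment))
        if column_entropy(column) <= max_entropy
    }


# Toy alignments of equal length (made up for illustration).
natural = ["MKVLA", "MRVLG", "MKVIA", "MQVLS"]
simulated = ["AKVLD", "GEVLN", "SKVLQ", "TRVLE"]

natural_conserved = conserved_positions(natural)
simulated_conserved = conserved_positions(simulated)

folding_candidates = natural_conserved & simulated_conserved   # conserved in both
function_candidates = natural_conserved - simulated_conserved  # natural only
```

    A fuller version would also require the candidate positions to be in native contact with each other, as the abstract's third assumption states.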

  6. The primary structure of rat liver ribosomal protein L37. Homology with yeast and bacterial ribosomal proteins.

    PubMed

    Lin, A; McNally, J; Wool, I G

    1983-09-10

    The covalent structure of the rat liver 60 S ribosomal subunit protein L37 was determined. Twenty-four tryptic peptides were purified and the sequence of each was established; they accounted for all 111 residues of L37. The sequence of the first 30 residues of L37, obtained previously by automated Edman degradation of the intact protein, provided the alignment of the first 9 tryptic peptides. Three peptides (CN1, CN2, and CN3) were produced by cleavage of protein L37 with cyanogen bromide. The sequence of CN1 (65 residues) was established from the sequence of secondary peptides resulting from cleavage with trypsin and chymotrypsin. The sequence of CN1 in turn served to order tryptic peptides 1 through 14. The sequence of CN2 (15 residues) was determined entirely by a micromanual procedure and allowed the alignment of tryptic peptides 14 through 18. The sequence of the NH2-terminal 28 amino acids of CN3 (31 residues) was determined; in addition the complete sequences of the secondary tryptic and chymotryptic peptides were done. The sequence of CN3 provided the order of tryptic peptides 18 through 24. Thus the sequence of the three cyanogen bromide peptides also accounted for the 111 residues of protein L37. The carboxyl-terminal amino acids were identified after carboxypeptidase A treatment. There is a disulfide bridge between half-cystinyl residues at positions 40 and 69. Rat liver ribosomal protein L37 is homologous with yeast YP55 and with Escherichia coli L34. Moreover, there is a segment of 17 residues in rat L37 that occurs, albeit with modifications, in yeast YP55 and in E. coli S4, L20, and L34.
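
    The fragment bookkeeping in this abstract can be illustrated with a simple in-silico digest. The sketch assumes the usual textbook specificities (cyanogen bromide cleaves after Met, trypsin after Lys or Arg) and ignores exceptions such as the reduced tryptic cleavage before proline. The toy sequence is made up and is not the L37 sequence.

```python
def cleave_after(sequence: str, residues: str) -> list:
    """Split `sequence` after every occurrence of any residue in `residues`."""
    fragments = []
    start = 0
    for index, amino_acid in enumerate(sequence):
        # Cleaving after the final residue would only add an empty fragment.
        if amino_acid in residues and index < len(sequence) - 1:
            fragments.append(sequence[start:index + 1])
            start = index + 1
    fragments.append(sequence[start:])
    return fragments


toy_protein = "GAMKRLMSTK"  # hypothetical sequence for illustration only

cnbr_fragments = cleave_after(toy_protein, "M")      # cyanogen bromide
tryptic_fragments = cleave_after(toy_protein, "KR")  # trypsin (simplified)

# Consistency check from the abstract: CN1 + CN2 + CN3 account for all residues.
cn_lengths = {"CN1": 65, "CN2": 15, "CN3": 31}
total_residues = sum(cn_lengths.values())
```

    The three reported cyanogen bromide peptide lengths sum to 111, matching the stated length of L37.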

  7. Partial amino acid sequence of the branched chain amino acid aminotransferase (TmB) of E. coli JA199 pDU11

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Feild, M.J.; Armstrong, F.B.

    1987-05-01

    E. coli JA199 pDU11 harbors a multicopy plasmid containing the ilv GEDAY gene cluster of S. typhimurium. TmB, gene product of ilv E, was purified, crystallized, and subjected to Edman degradation using a gas phase sequencer. The intact protein yielded an amino terminal 31 residue sequence. Both carboxymethylated apoenzyme and (³H)-NaBH-reduced holoenzyme were then subjected to digestion by trypsin. The digests were fractionated using reversed phase HPLC, and the peptides isolated were sequenced. The borohydride-treated holoenzyme was used to isolate the cofactor-binding peptide. The peptide is 27 residues long and a comparison with known sequences of other aminotransferases revealed limited homology. Peptides accounting for 211 of 288 predicted residues have been sequenced, including 9 residues of the carboxyl terminus. Comparison of peptides with the inferred amino acid sequence of the E. coli K-12 enzyme has helped determine the sequence of the amino terminal 59 residues; only two differences between the sequences are noted in this region.

  8. Nonenzymatic template-directed synthesis on hairpin oligonucleotides. 3. Incorporation of adenosine and uridine residues

    NASA Technical Reports Server (NTRS)

    Wu, T.; Orgel, L. E.

    1992-01-01

    We have used [32P]-labeled hairpin oligonucleotides to study template-directed synthesis on templates containing one or more A or T residues within a run of C residues. When nucleoside-5'-phosphoro(2-methyl)imidazolides are used as substrates, isolated A and T residues function efficiently in facilitating the incorporation of U and A, respectively. The reactions are regiospecific, producing mainly 3'-5'-phosphodiester bonds. Pairs of consecutive non-C residues are copied much less efficiently. Limited synthesis of CA and AC sequences on templates containing TG and GT sequences was observed along with some synthesis of the AA sequences on templates containing TT sequences. The other dimer sequences investigated, AA, AG, GA, TA, and AT, could not be copied. If A is absent from the reaction mixture, misincorporation of G residues is a significant reaction on templates containing an isolated T residue or two consecutive T residues. However, if both A and G are present, A is incorporated to a much greater extent than G. We believe that wobble-pairing between T and G is responsible for misincorporation when only G is present.

  9. Identification of two allelic IgG1 C(H) coding regions (Cgamma1) of cat.

    PubMed

    Kanai, T H; Ueda, S; Nakamura, T

    2000-01-31

    Two types of cDNA encoding IgG1 heavy chain (gamma1) were isolated from a single domestic short-hair cat. Sequence analysis indicated a higher level of similarity of these Cgamma1 sequences to human Cgamma1 sequence (76.9 and 77.0%) than to mouse sequence (70.0 and 69.7%) at the nucleotide level. Predicted primary structures of both the feline Cgamma1 genes, designated as Cgamma1a and Cgamma1b, were similar to that of human Cgamma1 gene, for instance, as to the size of constant domains, the presence of six conserved cysteine residues involved in formation of the domain structure, and the location of a conserved N-linked glycosylation site. Sequence comparison between the two alleles showed that 7 out of 10 nucleotide differences were within the C(H)3 domain coding region, all leading to nonsynonymous changes in amino acid residues. Partial sequence analysis of genomic clones showed three nucleotide substitutions between the two Cgamma1 alleles in the intron between the CH2 and C(H)3 domain coding regions. In 12 domestic short-hair cats used in this study, the frequency of Cgamma1a allele (62.5%) was higher than that of the Cgamma1b allele (37.5%).

  10. The Structure of Rauvolfia serpentina Strictosidine Synthase Is a Novel Six-Bladed β-Propeller Fold in Plant Proteins

    PubMed Central

    Ma, Xueyan; Panjikar, Santosh; Koepke, Juergen; Loris, Elke; Stöckigt, Joachim

    2006-01-01

    The enzyme strictosidine synthase (STR1) from the Indian medicinal plant Rauvolfia serpentina is of primary importance for the biosynthetic pathway of the indole alkaloid ajmaline. Moreover, STR1 initiates all biosynthetic pathways leading to the entire monoterpenoid indole alkaloid family representing an enormous structural variety of ∼2000 compounds in higher plants. The crystal structures of STR1 in complex with its natural substrates tryptamine and secologanin provide structural understanding of the observed substrate preference and identify residues lining the active site surface that contact the substrates. STR1 catalyzes a Pictet-Spengler–type reaction and represents a novel six-bladed β-propeller fold in plant proteins. Structure-based sequence alignment revealed a common repetitive sequence motif (three hydrophobic residues are followed by a small residue and a hydrophilic residue), indicating a possible evolutionary relationship between STR1 and several sequence-unrelated six-bladed β-propeller structures. Structural analysis and site-directed mutagenesis experiments demonstrate the essential role of Glu-309 in catalysis. The data will aid in deciphering the details of the reaction mechanism of STR1 as well as other members of this enzyme family. PMID:16531499

  11. The structure of Rauvolfia serpentina strictosidine synthase is a novel six-bladed beta-propeller fold in plant proteins.

    PubMed

    Ma, Xueyan; Panjikar, Santosh; Koepke, Juergen; Loris, Elke; Stöckigt, Joachim

    2006-04-01

    The enzyme strictosidine synthase (STR1) from the Indian medicinal plant Rauvolfia serpentina is of primary importance for the biosynthetic pathway of the indole alkaloid ajmaline. Moreover, STR1 initiates all biosynthetic pathways leading to the entire monoterpenoid indole alkaloid family representing an enormous structural variety of approximately 2000 compounds in higher plants. The crystal structures of STR1 in complex with its natural substrates tryptamine and secologanin provide structural understanding of the observed substrate preference and identify residues lining the active site surface that contact the substrates. STR1 catalyzes a Pictet-Spengler-type reaction and represents a novel six-bladed beta-propeller fold in plant proteins. Structure-based sequence alignment revealed a common repetitive sequence motif (three hydrophobic residues are followed by a small residue and a hydrophilic residue), indicating a possible evolutionary relationship between STR1 and several sequence-unrelated six-bladed beta-propeller structures. Structural analysis and site-directed mutagenesis experiments demonstrate the essential role of Glu-309 in catalysis. The data will aid in deciphering the details of the reaction mechanism of STR1 as well as other members of this enzyme family.

  12. Connecting Active-Site Loop Conformations and Catalysis in Triosephosphate Isomerase: Insights from a Rare Variation at Residue 96 in the Plasmodial Enzyme.

    PubMed

    Pareek, Vidhi; Samanta, Moumita; Joshi, Niranjan V; Balaram, Hemalatha; Murthy, Mathur R N; Balaram, Padmanabhan

    2016-04-01

    Despite extensive research into triosephosphate isomerases (TIMs), there exists a gap in understanding of the remarkable conjunction between catalytic loop-6 (residues 166-176) movement and the conformational flip of Glu165 (catalytic base) upon substrate binding that primes the active site for efficient catalysis. The overwhelming occurrence of serine at position 96 (98% of the 6277 unique TIM sequences), spatially proximal to E165 and the loop-6 residues, raises questions about its role in catalysis. Notably, Plasmodium falciparum TIM has an extremely rare residue--phenylalanine--at this position whereas, curiously, the mutant F96S was catalytically defective. We have obtained insights into the influence of residue 96 on the loop-6 conformational flip and E165 positioning by combining kinetic and structural studies on the PfTIM F96 mutants F96Y, F96A, F96S/S73A, and F96S/L167V with sequence conservation analysis and comparative analysis of the available apo and holo structures of the enzyme from diverse organisms. © 2016 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  13. Multi-Harmony: detecting functional specificity from sequence alignment

    PubMed Central

    Brandt, Bernd W.; Feenstra, K. Anton; Heringa, Jaap

    2010-01-01

    Many protein families contain sub-families with functional specialization, such as binding different ligands or being involved in different protein–protein interactions. A small number of amino acids generally determine functional specificity. The identification of these residues can aid the understanding of protein function and help find targets for experimental analysis. Here, we present multi-Harmony, an interactive web server for detecting sub-type-specific sites in proteins starting from a multiple sequence alignment. Combining our Sequence Harmony (SH) and multi-Relief (mR) methods in one web server allows simultaneous analysis and comparison of specificity residues; furthermore, both methods have been significantly improved and extended. SH has been extended to cope with more than two sub-groups. mR has been changed from a sampling implementation to a deterministic one, making it more consistent and user friendly. For both methods Z-scores are reported. The multi-Harmony web server produces a dynamic output page, which includes interactive connections to the Jalview and Jmol applets, thereby allowing interactive analysis of the results. Multi-Harmony is available at http://www.ibi.vu.nl/programs/shmrwww. PMID:20525785

  14. The complete sequence and structural analysis of human apolipoprotein B-100: relationship between apoB-100 and apoB-48 forms.

    PubMed Central

    Cladaras, C; Hadzopoulou-Cladaras, M; Nolte, R T; Atkinson, D; Zannis, V I

    1986-01-01

    We have isolated and sequenced overlapping cDNA clones covering the entire sequence of human apolipoprotein B-100 (apoB-100). DNA sequence analysis and determination of the mRNA transcription initiation site by S1 nuclease mapping showed that the apoB mRNA consists of 14,112 nucleotides including the 5' and 3' untranslated regions which are 128 and 301 nucleotides respectively. The DNA-derived protein sequence shows that apoB-100 is 513,000 daltons and contains 4560 amino acids including a 24-amino-acid-long signal peptide. The mol. wt of apoB-100 implies that there is one apoB molecule per LDL particle. Computer analysis of the predicted secondary structure of the protein showed that some of the potential alpha helical and beta sheet structures are amphipathic, whereas others have non-amphipathic neutral to apolar character. These latter regions may contribute to the formation of the lipid-binding domains of apoB-100. The protein contains 25 cysteines and 20 potential N-glycosylation sites. The majority of cysteines are distributed in the amino terminal portion of the protein. Four of the potential glycosylation sites are in predicted beta turn structures and may represent true glycosylation positions. ApoB lacks the tandem repeats which are characteristic of other apolipoproteins. The mean hydrophobicity ⟨H1⟩ and helical hydrophobic moment ⟨μH⟩ profiles of apoB showed the presence of several potential helical regions with strong polar character and high hydrophobic moment. The region with the highest hydrophobic moment, between amino acid residues 3352 and 3369, contains five closely spaced, positively charged residues, and has sequence homology to the LDL receptor binding site of apoE. This region is flanked by three neighbouring regions with positively charged amino acids and high hydrophobic moment that are located between residues 3174 and 3681. One or more of these closely spaced apoB sequences may be involved in the formation of the LDL receptor-binding domain of apoB-100. Blotting analysis of intestinal RNA and hybridization of the blots with carboxy apoB cDNA probes produced a single 15-kb hybridization band whereas hybridization with amino terminal probes produced two hybridization bands of 15 and 8 kb. Our data indicate that both forms of apoB mRNA contain common sequences which extend from the amino terminal of apoB-100 to the vicinity of nucleotide residue 6300. These two messages may have resulted from differential splicing of the same primary apoB mRNA transcript. PMID:3030729

  15. On the relationship between residue structural environment and sequence conservation in proteins.

    PubMed

    Liu, Jen-Wei; Lin, Jau-Ji; Cheng, Chih-Wen; Lin, Yu-Feng; Hwang, Jenn-Kang; Huang, Tsun-Tsao

    2017-09-01

    Residues that are crucial to protein function or structure are usually evolutionarily conserved. To identify the important residues in a protein, sequence conservation is estimated, and current methods rely upon the unbiased collection of homologous sequences. Surprisingly, our previous studies have shown that sequence conservation is closely correlated with the weighted contact number (WCN), a measure of the packing density of a residue's structural environment, calculated based only on the Cα positions of a protein structure. Moreover, studies have shown that sequence conservation is correlated with environment-related structural properties calculated based on different protein substructures, such as a protein's all atoms, backbone atoms, side-chain atoms, or side-chain centroid. To determine whether the Cα atomic positions are adequate to show the relationship between residue environment and sequence conservation, we compared Cα atoms with other substructures in their contributions to the sequence conservation. Our results show that Cα positions are substantially equivalent to the other substructures in calculations of various measures of residue environment. As a result, the overlapping contributions between Cα atoms and the other substructures are high, yielding a similar structure-conservation relationship. Taking the WCN as an example, the average overlapping contribution to sequence conservation is 87% between Cα and all-atom substructures. These results indicate that the Cα atoms of a protein structure alone can reflect sequence conservation at the residue level. © 2017 Wiley Periodicals, Inc.
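
    A weighted contact number can be computed from Cα coordinates alone. The sketch below assumes the commonly used inverse-square form, WCN_i = Σ_{j≠i} 1/r_ij², where r_ij is the distance between the Cα atoms of residues i and j. The abstract does not spell out the formula, and the coordinates are a made-up three-residue chain.

```python
def weighted_contact_numbers(ca_coords):
    """Return one WCN per residue from a list of (x, y, z) C-alpha coordinates.

    Assumes the inverse-square definition: the sum of 1 / r_ij**2 over all
    other residues j.
    """
    wcn = []
    for i, (xi, yi, zi) in enumerate(ca_coords):
        total = 0.0
        for j, (xj, yj, zj) in enumerate(ca_coords):
            if i != j:
                squared_distance = (xi - xj) ** 2 + (yi - yj) ** 2 + (zi - zj) ** 2
                total += 1.0 / squared_distance
        wcn.append(total)
    return wcn


# Toy chain: three C-alpha atoms on a line, 3.8 angstroms apart.
toy_coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
toy_wcn = weighted_contact_numbers(toy_coords)
```

    In the toy chain the middle residue has the largest WCN, reflecting its denser environment. In a real analysis these values would then be correlated with per-residue conservation scores.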

  16. Sequence analysis of the L protein of the Ebola 2014 outbreak: Insight into conserved regions and mutations.

    PubMed

    Ayub, Gohar; Waheed, Yasir

    2016-06-01

    The 2014 Ebola outbreak was one of the largest that have occurred; it started in Guinea and spread to Nigeria, Liberia and Sierra Leone. Phylogenetic analysis of the current virus species indicated that this outbreak is the result of a divergent lineage of the Zaire ebolavirus. The L protein of Ebola virus (EBOV) is the catalytic subunit of the RNA‑dependent RNA polymerase complex, which, with VP35, is key for the replication and transcription of viral RNA. Earlier sequence analysis demonstrated that the L protein of all non‑segmented negative‑sense (NNS) RNA viruses consists of six domains containing conserved functional motifs. The aim of the present study was to analyze the presence of these motifs in 2014 EBOV isolates and to highlight their function and how they may contribute to the overall pathogenicity of the isolates. For this purpose, 81 2014 EBOV L protein sequences were aligned with 475 other NNS RNA viruses, including Paramyxoviridae and Rhabdoviridae viruses. Phylogenetic analysis of all EBOV outbreak L protein sequences was also performed. Analysis of the amino acid substitutions in the 2014 EBOV outbreak was conducted using sequence analysis. The alignment demonstrated the presence of previously conserved motifs in the 2014 EBOV isolates and novel residues. Notably, all the mutations identified in the 2014 EBOV isolates were tolerant; they were pathogenic, with certain examples occurring within previously determined functional conserved motifs, possibly altering viral pathogenicity, replication and virulence. The phylogenetic analysis demonstrated that all sequences with the exception of the 2014 EBOV sequences were clustered together. The 2014 EBOV outbreak has acquired a great number of mutations, which may explain the reasons behind this unprecedented outbreak. Certain residues critical to the function of the polymerase remain conserved and may be targets for the development of antiviral therapeutic agents.

  17. Can natural proteins designed with 'inverted' peptide sequences adopt native-like protein folds?

    PubMed

    Sridhar, Settu; Guruprasad, Kunchur

    2014-01-01

    We have carried out a systematic computational analysis on a representative dataset of proteins of known three-dimensional structure, in order to evaluate whether it would be possible to 'swap' certain short peptide sequences in naturally occurring proteins with their corresponding 'inverted' peptides and generate 'artificial' proteins that are predicted to retain a native-like protein fold. The analysis of 3,967 representative proteins from the Protein Data Bank revealed 102,677 unique identical inverted peptide sequence pairs that vary in sequence length between 5-12 and 18 amino acid residues. Our analysis illustrates with examples that such 'artificial' proteins may be generated by identifying peptides with 'similar structural environment' and by using comparative protein modeling and validation studies. Our analysis suggests that natural proteins may be tolerant to accommodating such peptides.
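
    One way to picture the search for inverted peptide pairs is to collect every peptide of a given length from a set of sequences and keep those whose reversed sequence also occurs. The sketch below does this for a fixed length. It is an illustrative reading of "inverted peptide pair", not the authors' pipeline, and the two toy sequences are made up.

```python
def inverted_peptide_pairs(sequences, length: int = 5) -> set:
    """Return unordered (peptide, reversed peptide) pairs present in the data.

    Palindromic peptides are skipped because they equal their own inverse.
    """
    peptides = set()
    for sequence in sequences:
        for start in range(len(sequence) - length + 1):
            peptides.add(sequence[start:start + length])

    pairs = set()
    for peptide in peptides:
        inverted = peptide[::-1]
        # `peptide < inverted` keeps each pair once and drops palindromes.
        if inverted in peptides and peptide < inverted:
            pairs.add((peptide, inverted))
    return pairs


toy_sequences = ["MKTAYIAKQR", "GGRQKAIYLL"]  # hypothetical sequences
toy_pairs = inverted_peptide_pairs(toy_sequences, length=5)
```

    For the toy input, two pairs are found: IAKQR with RQKAI, and QKAIY with YIAKQ. Deciding whether a swap preserves the fold would still require the structural-environment comparison and modeling steps the abstract describes.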

  18. Molecular evolution of miraculin-like proteins in soybean Kunitz super-family.

    PubMed

    Selvakumar, Purushotham; Gahloth, Deepankar; Tomar, Prabhat Pratap Singh; Sharma, Nidhi; Sharma, Ashwani Kumar

    2011-12-01

    Miraculin-like proteins (MLPs) belong to soybean Kunitz super-family and have been characterized from many plant families like Rutaceae, Solanaceae, Rubiaceae, etc. Many of them possess trypsin inhibitory activity and are involved in plant defense. MLPs exhibit significant sequence identity (~30-95%) to native miraculin protein, also belonging to Kunitz super-family compared with a typical Kunitz family member (~30%). The sequence and structure-function comparison of MLPs with that of a classical Kunitz inhibitor have demonstrated that MLPs have evolved to form a distinct group within Kunitz super-family. Sequence analysis of new genes along with available MLP sequences in the literature revealed three major groups for these proteins. A significant feature of Rutaceae MLP type 2 sequences is the presence of phosphorylation motif. Subtle changes are seen in putative reactive loop residues among different MLPs suggesting altered specificities to specific proteases. In phylogenetic analysis, Rutaceae MLP type 1 and type 2 proteins clustered together on separate branches, whereas native miraculin along with other MLPs formed distinct clusters. Site-specific positive Darwinian selection was observed at many sites in both the groups of Rutaceae MLP sequences with most of the residues undergoing positive selection located in loop regions. The results demonstrate the sequence and thereby the structure-function divergence of MLPs as a distinct group within soybean Kunitz super-family due to biotic and abiotic stresses of local environment.

  19. ChAy/Bx, a novel chimeric high-molecular-weight glutenin subunit gene apparently created by homoeologous recombination in Triticum turgidum ssp. dicoccoides.

    PubMed

    Guo, Xiao-Hui; Bi, Zhe-Guang; Wu, Bi-Hua; Wang, Zhen-Zhen; Hu, Ji-Liang; Zheng, You-Liang; Liu, Deng-Cai

    2013-12-01

    High-molecular-weight glutenin subunits (HMW-GSs) are of considerable interest, because they play a crucial role in determining dough viscoelastic properties and end-use quality of wheat flour. In this paper, ChAy/Bx, a novel chimeric HMW-GS gene from Triticum turgidum ssp. dicoccoides (AABB, 2n=4x=28) accession D129, was isolated and characterized. Sodium dodecyl sulfate polyacrylamide gel electrophoresis (SDS-PAGE) analysis revealed that the electrophoretic mobility of the glutenin subunit encoded by ChAy/Bx was slightly faster than that of 1Dy12. The complete ORF of ChAy/Bx contained 1,671 bp encoding a deduced polypeptide of 555 amino acid residues (or 534 amino acid residues for the mature protein), making it the smallest HMW-GS gene known from Triticum species. Sequence analysis showed that ChAy/Bx was neither a conventional x-type nor a conventional y-type subunit gene, but a novel chimeric gene. Its first 1305 nt sequence was highly homologous with the corresponding sequence of 1Ay type genes, while its final 366 nt sequence was highly homologous with the corresponding sequence of 1Bx type genes. The mature ChAy/Bx protein consisted of the N-terminus of 1Ay type subunit (the first 414 amino acid residues) and the C-terminus of 1Bx type subunit (the final 120 amino acid residues). Secondary structure prediction showed that ChAy/Bx contained some domains of 1Ay subunit and some domains of 1Bx subunit. The special structure of this HMW glutenin chimera ChAy/Bx subunit might have unique effects on the end-use quality of wheat flour. Here we propose that homoeologous recombination might be a novel pathway for allelic variation or molecular evolution of HMW-GSs. © 2013.

  20. An inter-residue network model to identify mutational-constrained regions on the Ebola coat glycoprotein

    PubMed Central

    Quinlan, Devin S.; Raman, Rahul; Tharakaraman, Kannan; Subramanian, Vidya; del Hierro, Gabriella; Sasisekharan, Ram

    2017-01-01

    Recently, progress has been made in the development of vaccines and monoclonal antibody cocktails that target the Ebola coat glycoprotein (GP). Based on the mutation rates for Ebola virus given its natural sequence evolution, these treatment strategies are likely to impose additional selection pressure to drive acquisition of mutations in GP that escape neutralization. Given the high degree of sequence conservation among GP of Ebola viruses, it would be challenging to determine the propensity of acquiring mutations in response to vaccine or treatment with one or a cocktail of monoclonal antibodies. In this study, we analyzed the mutability of each residue using an approach that captures the structural constraints on mutability based on the extent of its inter-residue interaction network within the three-dimensional structure of the trimeric GP. This analysis showed two distinct clusters of highly networked residues along the GP1-GP2 interface, part of which overlapped with epitope surfaces of known neutralizing antibodies. This network approach also permitted us to identify additional residues in the network of the known hotspot residues of different anti-Ebola antibodies that would impact antibody-epitope interactions. PMID:28397835

  1. Computational approaches for identification of conserved/unique binding pockets in the A chain of ricin

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ecale Zhou, C L; Zemla, A T; Roe, D

    2005-01-29

    Specific and sensitive ligand-based protein detection assays that employ antibodies or small molecules such as peptides or aptamers require that the corresponding surface region of the protein be accessible and that there be minimal cross-reactivity with non-target proteins. To reduce the time and cost of laboratory screening efforts for diagnostic reagents, we developed new methods for evaluating and selecting protein surface regions for ligand targeting. We devised combined structure- and sequence-based methods for identifying 3D epitopes and binding pockets on the surface of the A chain of ricin that are conserved with respect to a set of ricin A chains and unique with respect to other proteins. We (1) used structure alignment software to detect structural deviations and extracted from this analysis the residue-residue correspondence, (2) devised a method to compare corresponding residues across sets of ricin structures and structures of closely related proteins, (3) devised a sequence-based approach to determine residue infrequency in local sequence context, and (4) modified a pocket-finding algorithm to identify surface crevices in close proximity to residues determined to be conserved/unique based on our structure- and sequence-based methods. In applying this combined informatics approach to ricin A we identified a conserved/unique pocket in close proximity to (but not overlapping) the active site that is suitable for bi-dentate ligand development. These methods are generally applicable to identification of surface epitopes and binding pockets for development of diagnostic reagents, therapeutics, and vaccines.

  2. Computational analysis of histidine mutations on the structural stability of human tyrosinases leading to albinism insurgence.

    PubMed

    Hassan, Mubashir; Abbas, Qamar; Raza, Hussain; Moustafa, Ahmed A; Seo, Sung-Yum

    2017-07-25

    Misfolding and structural alteration in proteins lead to serious malfunctions and cause various diseases in humans. Mutations at the active binding site in tyrosinase impair structural stability and cause lethal albinism by abolishing copper binding. To evaluate the histidine mutational effect, all mutated structures were built using homology modelling. The protein sequence was retrieved from the UniProt database, and 3D models of original and mutated human tyrosinase sequences were predicted by changing the residual positions within the target sequence separately. Structural and mutational analyses were performed to interpret the significance of mutated residues (N180, R202, Q202, R211, Y363, R367, Y367 and D390) at the active binding site of tyrosinases. CSpritz analysis depicted that 23.25% of residues actively participate in the instability of tyrosinase. The accuracy of predicted models was confirmed through online servers ProSA-web, ERRAT score and VERIFY 3D values. The theoretical pI and GRAVY generated results also showed the accuracy of the predicted models. The CCA negative correlation results depicted that the replacement of mutated residues at His within the active binding site disturbs the structural stability of tyrosinases. The predicted CCA scores of Tyr367 (-0.079) and Q/R202 (0.032) revealed that both mutations have more potential to disturb the structural stability. MD simulation analyses of all predicted models justified that Gln202, Arg202, Tyr367 and D390 replacement made the protein structures more susceptible to destabilization. Mutational results showed that the replacement of His with Q/R202 and Y/R363 has a lethal effect and may cause melanin associated diseases such as OCA1. Taken together, our computational analysis depicts that the mutated residues such as Q/R202 and Y/R363 actively participate in instability and misfolding of tyrosinases, which may govern OCA1 through disturbing the melanin biosynthetic pathway.

  3. Comparative analysis of seven viral nuclear export signals (NESs) reveals the crucial role of nuclear export mediated by the third NES consensus sequence of nucleoprotein (NP) in influenza A virus replication.

    PubMed

    Chutiwitoonchai, Nopporn; Kakisaka, Michinori; Yamada, Kazunori; Aida, Yoko

    2014-01-01

    The assembly of influenza virus progeny virions requires machinery that exports viral genomic ribonucleoproteins from the cell nucleus. Currently, seven nuclear export signal (NES) consensus sequences have been identified in different viral proteins, including NS1, NS2, M1, and NP. The present study examined the roles of viral NES consensus sequences and their significance in terms of viral replication and nuclear export. Mutation of the NP-NES3 consensus sequence resulted in a failure to rescue viruses using a reverse genetics approach, whereas mutation of the NS2-NES1 and NS2-NES2 sequences led to a strong reduction in viral replication kinetics compared with the wild-type sequence. While the viral replication kinetics for other NES mutant viruses were also lower than those of the wild-type, the difference was not so marked. Immunofluorescence analysis after transient expression of NP-NES3, NS2-NES1, or NS2-NES2 proteins in host cells showed that they accumulated in the cell nucleus. These results suggest that the NP-NES3 consensus sequence is mostly required for viral replication. Therefore, each of the hydrophobic (Φ) residues within this NES consensus sequence (Φ1, Φ2, Φ3, or Φ4) was mutated, and its viral replication and nuclear export function were analyzed. No viruses harboring NP-NES3 Φ2 or Φ3 mutants could be rescued. Consistent with this, the NP-NES3 Φ2 and Φ3 mutants showed reduced binding affinity with CRM1 in a pull-down assay, and both accumulated in the cell nucleus. Indeed, a nuclear export assay revealed that these mutant proteins showed lower nuclear export activity than the wild-type protein. Moreover, the Φ2 and Φ3 residues (along with other Φ residues) within the NP-NES3 consensus were highly conserved among different influenza A viruses, including human, avian, and swine. Taken together, these results suggest that the Φ2 and Φ3 residues within the NP-NES3 protein are important for its nuclear export function during viral replication.

  4. Binding properties of SUMO-interacting motifs (SIMs) in yeast.

    PubMed

    Jardin, Christophe; Horn, Anselm H C; Sticht, Heinrich

    2015-03-01

    Small ubiquitin-like modifier (SUMO) conjugation and interaction play an essential role in many cellular processes. A large number of yeast proteins are known to interact non-covalently with SUMO via short SUMO-interacting motifs (SIMs), but the structural details of this interaction are still poorly characterized. In the present work, sequence analysis of a large dataset of 148 yeast SIMs revealed the existence of a hydrophobic core binding motif and a preference for acidic residues either within or adjacent to the core motif. Thus the sequence properties of yeast SIMs are highly similar to those described for human. Molecular dynamics simulations were performed to investigate the binding preferences for four representative SIM peptides differing in the number and distribution of acidic residues. Furthermore, the relative stability of two previously observed alternative binding orientations (parallel, antiparallel) was assessed. For all SIMs investigated, the antiparallel binding mode remained stable in the simulations and the SIMs were tightly bound via their hydrophobic core residues supplemented by polar interactions of the acidic residues. In contrast, the stability of the parallel binding mode is more dependent on the sequence features of the SIM, such as the number and position of acidic residues or the presence of additional adjacent interaction motifs. This information should be helpful to enhance the prediction of SIMs and their binding properties in different organisms to facilitate the reconstruction of the SUMO interactome.

  5. The diversity of H3 loops determines the antigen-binding tendencies of antibody CDR loops.

    PubMed

    Tsuchiya, Yuko; Mizuguchi, Kenji

    2016-04-01

    Of the complementarity-determining regions (CDRs) of antibodies, H3 loops, with varying amino acid sequences and loop lengths, adopt particularly diverse loop conformations. The diversity of H3 conformations produces an array of antigen recognition patterns involving all the CDRs, in which the residue positions actually in contact with the antigen vary considerably. Therefore, for a deeper understanding of antigen recognition, it is necessary to relate the sequence and structural properties of each residue position in each CDR loop to its ability to bind antigens. In this study, we proposed a new method for characterizing the structural features of the CDR loops and obtained the antigen-binding ability of each residue position in each CDR loop. This analysis led to a simple set of rules for identifying probable antigen-binding residues. We also found that the diversity of H3 loop lengths and conformations affects the antigen-binding tendencies of all the CDR loops. © 2016 The Protein Society.

  6. Structural and sequencing analysis of local target DNA recognition by MLV integrase.

    PubMed

    Aiyer, Sriram; Rossi, Paolo; Malani, Nirav; Schneider, William M; Chandar, Ashwin; Bushman, Frederic D; Montelione, Gaetano T; Roth, Monica J

    2015-06-23

    Target-site selection by retroviral integrase (IN) proteins profoundly affects viral pathogenesis. We describe the solution nuclear magnetic resonance structure of the Moloney murine leukemia virus IN (M-MLV) C-terminal domain (CTD) and a structural homology model of the catalytic core domain (CCD). In solution, the isolated MLV IN CTD adopts an SH3 domain fold flanked by a C-terminal unstructured tail. We generated a concordant MLV IN CCD structural model using SWISS-MODEL, MMM-tree and I-TASSER. Using the X-ray crystal structure of the prototype foamy virus IN target capture complex together with our MLV domain structures, residues within the CCD α2 helical region and the CTD β1-β2 loop were predicted to bind target DNA. The role of these residues was analyzed in vivo through point mutants and motif interchanges. Viable viruses with substitutions at the IN CCD α2 helical region and the CTD β1-β2 loop were tested for effects on integration target site selection. Next-generation sequencing and analysis of integration target sequences indicate that the CCD α2 helical region, in particular P187, interacts with the sequences distal to the scissile bonds whereas the CTD β1-β2 loop binds to residues proximal to it. These findings validate our structural model and disclose IN-DNA interactions relevant to target site selection. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.

  7. A generalized theoretical framework for the description of spin decoupling in solid-state MAS NMR: Offset effect on decoupling performance

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Tan, Kong Ooi; Meier, Beat H.; Ernst, Matthias

    2016-09-07

    We present a generalized theoretical framework that allows the approximate but rapid analysis of residual couplings of arbitrary decoupling sequences in solid-state NMR under magic-angle spinning conditions. It is a generalization of the tri-modal Floquet analysis of TPPM decoupling [Scholz et al., J. Chem. Phys. 130, 114510 (2009)] where three characteristic frequencies are used to describe the pulse sequence. Such an approach can be used to describe arbitrary periodic decoupling sequences that differ only in the magnitude of the Fourier coefficients of the interaction-frame transformation. It allows a ∼100 times faster calculation of second-order residual couplings as a function of pulse sequence parameters than full spin-dynamics simulations. By comparing the theoretical calculations with full numerical simulations, we show the potential of the new approach to examine the performance of decoupling sequences. We exemplify the usefulness of this framework by analyzing the performance of commonly used high-power decoupling sequences and low-power decoupling sequences such as amplitude-modulated XiX (AM-XiX) and its super-cycled variant SC-AM-XiX. In addition, the effect of chemical-shift offset is examined for both high- and low-power decoupling sequences. The results show that the cross-terms between the dipolar couplings are the main contributions to the line broadening when offset is present. We also show that the SC-AM-XIX shows a better offset compensation.

  8. A generalized theoretical framework for the description of spin decoupling in solid-state MAS NMR: Offset effect on decoupling performance.

    PubMed

    Tan, Kong Ooi; Agarwal, Vipin; Meier, Beat H; Ernst, Matthias

    2016-09-07

    We present a generalized theoretical framework that allows the approximate but rapid analysis of residual couplings of arbitrary decoupling sequences in solid-state NMR under magic-angle spinning conditions. It is a generalization of the tri-modal Floquet analysis of TPPM decoupling [Scholz et al., J. Chem. Phys. 130, 114510 (2009)] where three characteristic frequencies are used to describe the pulse sequence. Such an approach can be used to describe arbitrary periodic decoupling sequences that differ only in the magnitude of the Fourier coefficients of the interaction-frame transformation. It allows a ∼100 times faster calculation of second-order residual couplings as a function of pulse sequence parameters than full spin-dynamics simulations. By comparing the theoretical calculations with full numerical simulations, we show the potential of the new approach to examine the performance of decoupling sequences. We exemplify the usefulness of this framework by analyzing the performance of commonly used high-power decoupling sequences and low-power decoupling sequences such as amplitude-modulated XiX (AM-XiX) and its super-cycled variant SC-AM-XiX. In addition, the effect of chemical-shift offset is examined for both high- and low-power decoupling sequences. The results show that the cross-terms between the dipolar couplings are the main contributions to the line broadening when offset is present. We also show that the SC-AM-XIX shows a better offset compensation.

  9. Genomic organization, sequence characterization and expression analysis of Tenebrio molitor apolipophorin-III in response to an intracellular pathogen, Listeria monocytogenes.

    PubMed

    Noh, Ju Young; Patnaik, Bharat Bhusan; Tindwa, Hamisi; Seo, Gi Won; Kim, Dong Hyun; Patnaik, Hongray Howrelia; Jo, Yong Hun; Lee, Yong Seok; Lee, Bok Luel; Kim, Nam Jung; Han, Yeon Soo

    2014-01-25

    Apolipophorin III (apoLp-III) is a well-known hemolymph protein having a functional role in lipid transport and immune response of insects. We cloned full-length cDNA encoding putative apoLp-III from larvae of the coleopteran beetle, Tenebrio molitor (TmapoLp-III), by identifying clones corresponding to the partial sequence of TmapoLp-III, followed by full-length sequencing using a clone-by-clone primer walking method. The complete cDNA consists of 890 nucleotides, including an ORF encoding 196 amino acid residues. Excluding a putative signal peptide of the first 20 amino acid residues, the 176-residue mature apoLp-III has a calculated molecular mass of 19,146 Da. Genomic sequence analysis with respect to its cDNA showed that TmapoLp-III was organized into four exons interrupted by three introns. Several immune-related transcription factor binding sites were discovered in the putative 5'-flanking region. BLAST and phylogenetic analyses reveal that TmapoLp-III has high sequence identity (88%) with Tribolium castaneum apoLp-III but shares little sequence homology (<26%) with other apoLp-IIIs. Homology modeling of TmapoLp-III shows a bundle of five amphipathic alpha helices, including a short helix 3'. The 'helix-short helix-helix' motif was predicted to be implicated in lipid binding interactions, through reversible conformational changes and accommodating the hydrophobic residues to the exterior for stability. The highest level of TmapoLp-III mRNA was detected at late pupal stages, although it is also expressed in the larval and adult stages at lower levels. The tissue specific expression of the transcripts showed significantly higher numbers in larval fat body and adult integument. In addition, TmapoLp-III mRNA was found to be highly upregulated in late stages of L. monocytogenes or E. coli challenge. These results indicate that TmapoLp-III may play an important role in innate immune responses against bacterial pathogens in T. molitor. Copyright © 2013 Elsevier B.V. All rights reserved.

  10. Conformational analysis of the N-terminal sequence Met1-Val60 of the tyrosine hydroxylase

    NASA Astrophysics Data System (ADS)

    Alieva, Irada N.; Mustafayeva, Narmina N.; Gojayev, Niftali M.

    2006-03-01

    Molecular mechanics and molecular dynamics (MD) simulation techniques are used to study the effect of amino acid substitutions on the structure and dynamics of the Met1-Val60 segment of the N-terminal regulatory domain of tyrosine hydroxylase (TH) and of its mutants, in which the positively charged arginine residues at positions 37 and 38 were replaced by electrically neutral Gly or negatively charged Glu, and the serine residue at position 40 was replaced by Ala or Asp. Our study allowed us to draw the following conclusions: (i) the Met1-Arg16 sequence shows higher conformational flexibility than the rest of the N-terminus; (ii) the stretch of amino acid residues Met30-Ser40 within the N-terminus forms a β-turn so that the two α-helices (residues 16-29 and residues 41-60) are parallel to one another; (iii) the significant differences observed for the Arg37→Gly37 and Arg37-Arg38→Glu37-Glu38 mutant segments indicate that the positive charge of the Arg37 and Arg38 residues is one of the main factors that maintain the character of the turn; (iv) no major conformational changes are observed between the Ser40→Ala40 and Ser40→Asp40 mutant segments.

  11. Sequencing of the amylopullulanase (apu) gene of Thermoanaerobacter ethanolicus 39E, and identification of the active site by site-directed mutagenesis.

    PubMed

    Mathupala, S P; Lowe, S E; Podkovyrov, S M; Zeikus, J G

    1993-08-05

    The complete nucleotide sequence of the gene encoding the dual active amylopullulanase of Thermoanaerobacter ethanolicus 39E (formerly Clostridium thermohydrosulfuricum) was determined. The structural gene (apu) contained a single open reading frame 4443 base pairs in length, corresponding to 1481 amino acids, with an estimated molecular weight of 162,780. Analysis of the deduced sequence of apu with sequences of alpha-amylases and alpha-1,6 debranching enzymes enabled the identification of four conserved regions putatively involved in substrate binding and in catalysis. The conserved regions were localized within a 2.9-kilobase pair gene fragment, which encoded a M(r) 100,000 protein that maintained the dual activities and thermostability of the native enzyme. The catalytic residues of amylopullulanase were tentatively identified by using hydrophobic cluster analysis for comparison of amino acid sequences of amylopullulanase and other amylolytic enzymes. Asp597, Glu626, and Asp703 were individually modified to their respective amide form, or the alternate acid form, and in all cases both alpha-amylase and pullulanase activities were lost, suggesting the possible involvement of 3 residues in a catalytic triad, and the presence of a putative single catalytic site within the enzyme. These findings substantiate amylopullulanase as a new type of amylosaccharidase.
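    The length figures quoted in this abstract can be checked with simple arithmetic. The sketch below is only an illustration: the helper name `codon_count` is hypothetical, and it assumes the quoted 4443 bp does not include a stop codon, since 4443 / 3 equals the 1481 residues reported.

```python
def codon_count(orf_length_bp: int) -> int:
    """Return the number of codons in an ORF, rejecting lengths not divisible by 3."""
    if orf_length_bp % 3 != 0:
        raise ValueError("ORF length is not a whole number of codons")
    return orf_length_bp // 3

# Figures quoted in the abstract above.
residues = codon_count(4443)            # 1481
mean_residue_mass = 162_780 / residues  # average mass per residue, in Da

print(residues, round(mean_residue_mass, 1))
```

    The resulting average of roughly 110 Da per residue is in the range commonly used as a rule of thumb for proteins, so the three quoted numbers are mutually consistent.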

  12. Fast and Accurate Multivariate Gaussian Modeling of Protein Families: Predicting Residue Contacts and Protein-Interaction Partners

    PubMed Central

    Feinauer, Christoph; Procaccini, Andrea; Zecchina, Riccardo; Weigt, Martin; Pagnani, Andrea

    2014-01-01

    In the course of evolution, proteins show a remarkable conservation of their three-dimensional structure and their biological function, leading to strong evolutionary constraints on the sequence variability between homologous proteins. Our method aims at extracting such constraints from rapidly accumulating sequence data, and thereby at inferring protein structure and function from sequence information alone. Recently, global statistical inference methods (e.g. direct-coupling analysis, sparse inverse covariance estimation) have achieved a breakthrough towards this aim, and their predictions have been successfully implemented into tertiary and quaternary protein structure prediction methods. However, due to the discrete nature of the underlying variable (amino-acids), exact inference requires exponential time in the protein length, and efficient approximations are needed for practical applicability. Here we propose a very efficient multivariate Gaussian modeling approach as a variant of direct-coupling analysis: the discrete amino-acid variables are replaced by continuous Gaussian random variables. The resulting statistical inference problem is efficiently and exactly solvable. We show that the quality of inference is comparable or superior to the one achieved by mean-field approximations to inference with discrete variables, as done by direct-coupling analysis. This is true for (i) the prediction of residue-residue contacts in proteins, and (ii) the identification of protein-protein interaction partners in bacterial signal transduction. An implementation of our multivariate Gaussian approach is available at the website http://areeweb.polito.it/ricerca/cmp/code. PMID:24663061
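    The core idea of treating one-hot encoded alignment columns as Gaussian variables can be sketched in a few lines. This is a simplified illustration, not the authors' implementation: the alphabet, the ridge regularisation, the Frobenius-norm score, and the function names are assumptions, and refinements such as sequence reweighting are omitted.

```python
import numpy as np

ALPHABET = "ACDEFGHIKLMNPQRSTVWY-"  # assumed encoding: 20 amino acids plus gap

def one_hot(msa):
    """Encode an alignment (equal-length strings) as a 0/1 matrix of shape (M, L*q)."""
    q = len(ALPHABET)
    index = {letter: i for i, letter in enumerate(ALPHABET)}
    n_seq, n_col = len(msa), len(msa[0])
    X = np.zeros((n_seq, n_col * q))
    for m, seq in enumerate(msa):
        for i, letter in enumerate(seq):
            X[m, i * q + index[letter]] = 1.0
    return X

def gaussian_coupling_scores(msa, ridge=0.5):
    """Score column pairs by the norm of their block in the negative inverse covariance."""
    q = len(ALPHABET)
    n_col = len(msa[0])
    C = np.cov(one_hot(msa), rowvar=False, bias=True)
    C += ridge * np.eye(C.shape[0])   # assumed regulariser; makes C invertible
    J = -np.linalg.inv(C)             # Gaussian couplings
    scores = np.zeros((n_col, n_col))
    for i in range(n_col):
        for j in range(i + 1, n_col):
            block = J[i * q:(i + 1) * q, j * q:(j + 1) * q]
            scores[i, j] = scores[j, i] = np.linalg.norm(block)
    return scores

# Toy alignment: columns 0 and 1 covary perfectly, column 2 is independent.
toy = ["AAK", "AAR", "CCK", "CCR"]
print(gaussian_coupling_scores(toy).round(3))
```

    In this toy case the pair (0, 1) receives a clearly positive score while pairs involving column 2 score essentially zero, which is the behaviour one would want from a contact predictor.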

  13. Research on wind field algorithm of wind lidar based on BP neural network and grey prediction

    NASA Astrophysics Data System (ADS)

    Chen, Yong; Chen, Chun-Li; Luo, Xiong; Zhang, Yan; Yang, Ze-hou; Zhou, Jie; Shi, Xiao-ding; Wang, Lei

    2018-01-01

    This paper uses a BP neural network and the grey algorithm to forecast and study the radar wind field. To reduce the residual error of the wind field prediction, the minimum of the residual error function is sought as follows: the residuals of the grey algorithm are used to train the BP neural network, the trained network model forecasts the residual sequence, and the predicted residual sequence is used to correct the forecast sequence of the grey algorithm. The test data show that the grey algorithm corrected by the BP neural network can effectively reduce the residual value and improve the prediction precision.
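    The correction scheme described in this abstract can be sketched as a grey forecast plus an additive residual term. The code below assumes the standard GM(1,1) grey model, which the abstract does not name explicitly, and it leaves the residual predictor abstract: in the paper that role is played by a BP neural network, whereas here the residuals are simply passed in. All function names are hypothetical.

```python
import numpy as np

def gm11(x0, horizon=0):
    """Fit a GM(1,1) grey model to x0 and return fitted values plus `horizon` forecasts."""
    x0 = np.asarray(x0, dtype=float)
    n = len(x0)
    x1 = np.cumsum(x0)                      # accumulated generating sequence
    z1 = 0.5 * (x1[1:] + x1[:-1])           # background values
    B = np.column_stack((-z1, np.ones(n - 1)))
    (a, b), *_ = np.linalg.lstsq(B, x0[1:], rcond=None)
    k = np.arange(n + horizon)
    x1_hat = (x0[0] - b / a) * np.exp(-a * k) + b / a
    x0_hat = np.empty(n + horizon)
    x0_hat[0] = x0[0]
    x0_hat[1:] = np.diff(x1_hat)            # restore the original series
    return x0_hat

def correct_with_residuals(grey_forecast, predicted_residuals):
    """Add predicted residuals (e.g. from a trained network) to the grey forecast."""
    return np.asarray(grey_forecast) + np.asarray(predicted_residuals)

series = 100 * 1.1 ** np.arange(6)          # synthetic, roughly exponential data
fitted = gm11(series)
residuals = series - fitted                 # what a residual model would learn
print(np.abs(residuals).max())
print(correct_with_residuals(fitted, residuals))
```

    With a perfect residual predictor the corrected forecast reproduces the data exactly; in practice the gain depends on how well the residual model generalises.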

  14. Analysis of sequence repeats of proteins in the PDB.

    PubMed

    Mary Rajathei, David; Selvaraj, Samuel

    2013-12-01

    Internal repeats in protein sequences play a significant role in the evolution of protein structure and function. Applications of different bioinformatics tools help in the identification and characterization of these repeats. In the present study, we analyzed sequence repeats in a non-redundant set of proteins available in the Protein Data Bank (PDB). We used RADAR for detecting internal repeats in a protein, PDBeFOLD for assessing structural similarity, PDBsum for finding functional involvement and Pfam for domain assignment of the repeats in a protein. Through the analysis of sequence repeats, we found that the identity of the sequence repeats falls in the range of 20-40%, and that the superimposed structures of most of the sequence repeats maintain similar overall folding. Analysis of sequence repeats at the functional level reveals that most of the sequence repeats are involved in the function of the protein through functionally involved residues in the repeat regions. We also found that sequence repeats in single and two domain proteins often contained conserved sequence motifs for the function of the domain. Copyright © 2013 Elsevier Ltd. All rights reserved.
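    The identity figure quoted here is a pairwise measure between aligned repeat units. A minimal sketch of one common definition follows; the abstract does not say which definition was used, so the choice to ignore gapped positions, the function name, and the example sequences are all assumptions.

```python
def percent_identity(a: str, b: str, gap: str = "-") -> float:
    """Percent of identical residues over aligned positions where neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(x, y) for x, y in zip(a, b) if x != gap and y != gap]
    if not pairs:
        return 0.0
    return 100.0 * sum(x == y for x, y in pairs) / len(pairs)

# Hypothetical aligned repeat units.
print(percent_identity("ACDEF", "ACDQF"))   # 80.0
print(percent_identity("AC-EF", "ACDQF"))   # 75.0
```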

  15. Application of advanced cytometric and molecular technologies to minimal residual disease monitoring

    NASA Astrophysics Data System (ADS)

    Leary, James F.; He, Feng; Reece, Lisa M.

    2000-04-01

    Minimal residual disease monitoring presents a number of theoretical and practical challenges. Recently it has been possible to meet some of these challenges by combining a number of new advanced biotechnologies. To monitor the number of residual tumor cells requires complex cocktails of molecular probes that collectively provide sensitivities of detection on the order of one residual tumor cell per million total cells. Ultra-high-speed, multiparameter flow cytometry is capable of analyzing cells at rates in excess of 100,000 cells/sec. Residual tumor selection marker cocktails can be optimized by use of receiver operating characteristic analysis. New data minimizing techniques when combined with multivariate statistical or neural network classifications of tumor cells can more accurately predict residual tumor cell frequencies. The combination of these techniques can, under at least some circumstances, detect frequencies of tumor cells as low as one cell in a million with an accuracy of over 98 percent correct classification. Detection of mutations in tumor suppressor genes requires isolation of these rare tumor cells and single-cell DNA sequencing. Rare residual tumor cells can be isolated at the single-cell level by high-resolution single-cell sorting. Molecular characterization of tumor suppressor gene mutations can be accomplished using a combination of single-cell polymerase chain reaction amplification of specific gene sequences followed by TA cloning techniques and DNA sequencing. Mutations as small as a single base pair in a tumor suppressor gene of a single sorted tumor cell have been detected using these methods. Using new amplification procedures and DNA microarrays it should be possible to extend the capabilities shown in this paper to screening of multiple DNA mutations in tumor suppressor and other genes on small numbers of sorted metastatic tumor cells.

  16. ADOMA: A Command Line Tool to Modify ClustalW Multiple Alignment Output.

    PubMed

    Zaal, Dionne; Nota, Benjamin

    2016-01-01

    We present ADOMA, a command line tool that produces alternative outputs from ClustalW multiple alignments of nucleotide or protein sequences. ADOMA can simplify the output of alignments by showing only the different residues between sequences, which is often desirable when only small differences such as single nucleotide polymorphisms are present (e.g., between different alleles). Another feature of ADOMA is that it can enhance the ClustalW output by coloring the residues in the alignment. This tool is easily integrated into automated Linux pipelines for next-generation sequencing data analysis, and may be useful for researchers in a broad range of scientific disciplines including evolutionary biology and biomedical sciences. The source code is freely available at https://sourceforge.net/projects/adoma/. © 2016 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
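    The idea of showing only the residues that differ between aligned sequences can be illustrated as follows. This is not ADOMA's code or its output format: the function name, the choice of the first sequence as reference, and the `.` placeholder are assumptions made for the sketch.

```python
def mask_identical(reference: str, sequences: list[str], same: str = ".") -> list[str]:
    """Replace every residue that matches the reference with a placeholder character."""
    masked = []
    for seq in sequences:
        if len(seq) != len(reference):
            raise ValueError("all aligned sequences must have the same length")
        masked.append("".join(same if r == s else s for r, s in zip(reference, seq)))
    return masked

# Hypothetical aligned alleles differing by a single nucleotide.
alleles = ["ACGTAC", "ACCTAC"]
print(alleles[0])
for line in mask_identical(alleles[0], alleles[1:]):
    print(line)   # ..C...
```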

  17. Divergence of Structure and Function in the Haloacid Dehalogenase Enzyme Superfamily: Bacteroides thetaiotaomicron BT2127 is an Inorganic Pyrophosphatase

    PubMed Central

    Huang, Hua; Yury, Patskovsky; Toro, Rafael; Farelli, Jeremiah D.; Pandya, Chetanya; Almo, Steven C.; Allen, Karen N.; Dunaway-Mariano, Debra

    2012-01-01

    The explosion of protein sequence information requires that current strategies for function assignment must evolve to complement experimental approaches with computationally-based function prediction. This necessitates the development of strategies based on the identification of sequence markers in the form of specificity determinants and a more informed definition of orthologues. Herein, we have undertaken the function assignment of the unknown Haloalkanoate Dehalogenase superfamily member BT2127 (Uniprot accession # Q8A5V9) from Bacteroides thetaiotaomicron using an integrated bioinformatics/structure/mechanism approach. The substrate specificity profile and steady-state rate constants of BT2127 (with a kcat/Km value for pyrophosphate of ∼1 × 10^5 M^−1 s^−1), together with the gene context, support the assigned in vivo function as an inorganic pyrophosphatase. The X-ray structural analysis of the wild-type BT2127 and several variants generated by site-directed mutagenesis shows that substrate discrimination is based, in part, on active site space restrictions imposed by the cap domain (specifically by residues Tyr76 and Glu47). Structure guided site directed mutagenesis coupled with kinetic analysis of the mutant enzymes identified the residues required for catalysis, substrate binding, and domain-domain association. Based on this structure-function analysis, the catalytic residues Asp11, Asp13, Thr113, and Lys147 as well as the metal binding residues Asp171, Asn172 and Glu47 were used as markers to confirm BT2127 orthologues identified via sequence searches. This bioinformatic analysis demonstrated that the biological range of BT2127 orthologues is restricted to the phylum Bacteroidetes/Chlorobi. The key structural determinants in the divergence of BT2127 and its closest homologue β-phosphoglucomutase control the leaving group size (phosphate vs. glucose-phosphate) and the position of the Asp acid/base in the open vs. closed conformations. HADSF pyrophosphatases represent a third mechanistic and fold type for bacterial pyrophosphatases. PMID:21894910

  18. Two different groups of signal sequence in M-superfamily conotoxins.

    PubMed

    Wang, Qi; Jiang, Hui; Han, Yu-Hong; Yuan, Duo-Duo; Chi, Cheng-Wu

    2008-04-01

    M-superfamily conotoxins can be divided into four branches (M-1, M-2, M-3 and M-4) according to the number of amino acid residues in the third Cys loop. In general, it is widely accepted that the conotoxin signal peptides of each superfamily are strictly conserved. Recently, we cloned six cDNAs of novel M-superfamily conotoxins from Conus leopardus, Conus marmoreus and Conus quercinus, belonging to either the M-1 or the M-3 branch. These conotoxins, judging from the putative peptide sequences deduced from the cDNAs, are rich in acidic residues and share highly conserved signal and pro-peptide regions. However, they are quite different from the reported conotoxins of the M-2 and M-4 branches even in their signal peptides, which in general are considered highly conserved for each superfamily of conotoxins. The signal sequences of M-1 and M-3 conotoxins, composed of 24 residues, start with MLKMGVVL-, while those of M-2 and M-4 conotoxins, composed of 25 residues, start with MMSKLGVL-. This is another example, besides the I-conotoxin superfamily, showing that different types of signal peptides can exist within a superfamily. In addition to the different disulfide connectivity of M-1 conotoxins from that of M-4 or M-2 conotoxins, the sequence alignment, preferential Cys codon usage and phylogenetic tree analysis suggest that M-1 and M-3 conotoxins have a much closer relationship and differ from the conotoxins of the other two branches (M-4 and M-2) of the M-superfamily.
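    The two signal-sequence groups described here differ in both their leading residues and their length, so a precursor can be assigned to a group by a prefix test. The sketch below uses only the two prefixes and lengths quoted in the abstract; the function name and the example precursor (a prefix padded with alanines) are hypothetical.

```python
# Prefix -> (branches, signal peptide length), as quoted in the abstract.
SIGNAL_GROUPS = {
    "MLKMGVVL": ("M-1/M-3", 24),
    "MMSKLGVL": ("M-2/M-4", 25),
}

def classify_signal(precursor: str):
    """Return (branches, putative signal peptide) or None if no known prefix matches."""
    sequence = precursor.upper()
    for prefix, (branches, length) in SIGNAL_GROUPS.items():
        if sequence.startswith(prefix):
            return branches, sequence[:length]
    return None

# Made-up precursor for illustration only; not a real conotoxin sequence.
example = "MLKMGVVL" + "A" * 30
print(classify_signal(example))
```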

  19. Analysis of the linker region joining the adenylation and carrier protein domains of the modular nonribosomal peptide synthetases.

    PubMed

    Miller, Bradley R; Sundlov, Jesse A; Drake, Eric J; Makin, Thomas A; Gulick, Andrew M

    2014-10-01

    Nonribosomal peptide synthetases (NRPSs) are multimodular proteins capable of producing important peptide natural products. Using an assembly line process, the amino acid substrate and peptide intermediates are passed between the active sites of different catalytic domains of the NRPS while bound covalently to a peptidyl carrier protein (PCP) domain. Examination of the linker sequences that join the NRPS adenylation and PCP domains identified several conserved proline residues that are not found in standalone adenylation domains. We examined the roles of these proline residues and neighboring conserved sequences through mutagenesis and biochemical analysis of the reaction catalyzed by the adenylation domain and the fully reconstituted NRPS pathway. In particular, we identified a conserved LPxP motif at the start of the adenylation-PCP linker. The LPxP motif interacts with a region on the adenylation domain to stabilize a critical catalytic lysine residue belonging to the A10 motif that immediately precedes the linker. Further, this interaction with the C-terminal subdomain of the adenylation domain may coordinate movement of the PCP with the conformational change of the adenylation domain. Through this work, we extend the conserved A10 motif of the adenylation domain and identify residues that enable proper adenylation domain function. © 2014 Wiley Periodicals, Inc.
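
    A motif such as LPxP (where x is any residue) can be located in a protein sequence with a regular expression. The following is a minimal sketch, not the authors' method: the function name and the example sequence are hypothetical, and a lookahead is used so that overlapping occurrences are also reported.

```python
import re

# "LP.P" = Leu, Pro, any residue, Pro. The lookahead wrapper lets
# re.finditer report overlapping matches as well.
LPXP_PATTERN = re.compile(r"(?=(LP.P))")


def find_lpxp(sequence: str) -> list[tuple[int, str]]:
    """Return (0-based start index, matched tetrapeptide) for each LPxP hit."""
    seq = sequence.upper()
    return [(m.start(), m.group(1)) for m in LPXP_PATTERN.finditer(seq)]


if __name__ == "__main__":
    # Hypothetical sequence for illustration only.
    print(find_lpxp("GGLPAPTTLPLPQ"))
```

    A sequence hit alone says nothing about function; in the abstract the motif's role was established through mutagenesis and biochemical analysis.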

  20. Schematic representation of residue-based protein context-dependent data: an application to transmembrane proteins.

    PubMed

    Campagne, F; Weinstein, H

    1999-01-01

    An algorithmic method for drawing residue-based schematic diagrams of proteins on a 2D page is presented and illustrated. The method allows the creation of rendering engines dedicated to a given family of sequences, or fold. The initial implementation provides an engine that can produce a 2D diagram representing secondary structure for any transmembrane protein sequence. We present the details of the strategy for automating the drawing of these diagrams. The most important part of this strategy is the development of an algorithm for laying out residues of a loop that connects to arbitrary points of a 2D plane. As implemented, this algorithm is suitable for real-time modification of the loop layout. This work is of interest for the representation and analysis of data from (1) protein databases, (2) mutagenesis results, or (3) various kinds of protein context-dependent annotations or data.

  1. The structure of the SBP-Tag–streptavidin complex reveals a novel helical scaffold bridging binding pockets on separate subunits

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Barrette-Ng, Isabelle H.; Wu, Sau-Ching; Tjia, Wai-Mui

    2013-05-01

    The structure of the SBP-Tag–streptavidin complex reveals a novel mode of peptide recognition in which a single peptide binds simultaneously to biotin-binding pockets from adjacent subunits of streptavidin. The molecular details of peptide recognition suggest how the SBP-Tag can be further modified to become an even more useful tag for a wider range of biotechnological applications. The 38-residue SBP-Tag binds to streptavidin more tightly (K_d ≃ 2.5–4.9 nM) than most if not all other known peptide sequences. Crystallographic analysis at 1.75 Å resolution shows that the SBP-Tag binds to streptavidin in an unprecedented manner by simultaneously interacting with biotin-binding pockets from two separate subunits. An N-terminal HVV peptide sequence (residues 12–14) and a C-terminal HPQ sequence (residues 31–33) form the bulk of the direct interactions between the SBP-Tag and the two biotin-binding pockets. Surprisingly, most of the peptide spanning these two sites (residues 17–28) adopts a regular α-helical structure that projects three leucine side chains into a groove formed at the interface between two streptavidin protomers. The crystal structure shows that residues 1–10 and 35–38 of the original SBP-Tag identified through in vitro selection and deletion analysis do not appear to contact streptavidin and thus may not be important for binding. A 25-residue peptide comprising residues 11–34 (SBP-Tag2) was synthesized and shown using surface plasmon resonance to bind streptavidin with very similar affinity and kinetics when compared with the SBP-Tag. The SBP-Tag2 was also added to the C-terminus of β-lactamase and was shown to be just as effective as the full-length SBP-Tag in affinity purification. These results validate the molecular structure of the SBP-Tag–streptavidin complex and establish a minimal bivalent streptavidin-binding tag from which further rational design and optimization can proceed.

  2. Viral morphogenesis is the dominant source of sequence censorship in M13 combinatorial peptide phage display.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Rodi, D. J.; Soares, A. S.; Makowski, L.

    Novel statistical methods have been developed and used to quantitate and annotate the sequence diversity within combinatorial peptide libraries on the basis of small numbers (1-200) of sequences selected at random from commercially available M13 p3-based phage display libraries. These libraries behave statistically as though they correspond to populations containing roughly 4.0±1.6% of the random dodecapeptides and 7.9±2.6% of the random constrained heptapeptides that are theoretically possible within the phage populations. Analysis of amino acid residue occurrence patterns shows no demonstrable influence on sequence censorship by Escherichia coli tRNA isoacceptor profiles or either overall codon or Class II codon usage patterns, suggesting no metabolic constraints on recombinant p3 synthesis. There is an overall depression in the occurrence of cysteine, arginine and glycine residues and an overabundance of proline, threonine and histidine residues. The majority of position-dependent amino acid sequence bias is clustered at three positions within the inserted peptides of the dodecapeptide library, +1, +3 and +12 downstream from the signal peptidase cleavage site. Conformational tendency measures of the peptides indicate a significant preference for inserts favoring a β-turn conformation. The observed protein sequence limitations can primarily be attributed to genetic codon degeneracy and signal peptidase cleavage preferences. These data suggest that for applications in which maximal sequence diversity is essential, such as epitope mapping or novel receptor identification, combinatorial peptide libraries should be constructed using codon-corrected trinucleotide cassettes within vector-host systems designed to minimize morphogenesis-related censorship.

  3. Sequence determination and analysis of S-adenosyl-L-homocysteine hydrolase from yellow lupine (Lupinus luteus).

    PubMed

    Brzeziński, K; Janowski, R; Podkowiński, J; Jaskólski, M

    2001-01-01

    The coding sequences of two S-adenosyl-L-homocysteine hydrolases (SAHases) were identified in yellow lupine by screening of a cDNA library. One of them, corresponding to the complete protein, was sequenced and compared with 52 other SAHase sequences. Phylogenetic analysis of these proteins identified three groups of the enzymes. Group A comprises only bacterial sequences. Group B is subdivided into two subgroups, one of which (B1) is formed by animal sequences. Subgroup B2 consists of two distinct clusters, B2a and B2b. Cluster B2b comprises all known plant sequences, including the yellow lupine enzyme, which are distinguished by a 50-residue insert. Group C is heterogeneous and contains SAHases from Archaea as well as a new class of animal enzymes, distinctly different from those in group B1.

  4. Sequence co-evolution gives 3D contacts and structures of protein complexes

    PubMed Central

    Hopf, Thomas A; Schärfe, Charlotta P I; Rodrigues, João P G L M; Green, Anna G; Kohlbacher, Oliver; Sander, Chris; Bonvin, Alexandre M J J; Marks, Debora S

    2014-01-01

    Protein–protein interactions are fundamental to many biological processes. Experimental screens have identified tens of thousands of interactions, and structural biology has provided detailed functional insight for select 3D protein complexes. An alternative rich source of information about protein interactions is the evolutionary sequence record. Building on earlier work, we show that analysis of correlated evolutionary sequence changes across proteins identifies residues that are close in space with sufficient accuracy to determine the three-dimensional structure of the protein complexes. We evaluate prediction performance in blinded tests on 76 complexes of known 3D structure, predict protein–protein contacts in 32 complexes of unknown structure, and demonstrate how evolutionary couplings can be used to distinguish between interacting and non-interacting protein pairs in a large complex. With the current growth of sequences, we expect that the method can be generalized to genome-wide elucidation of protein–protein interaction networks and used for interaction predictions at residue resolution. DOI: http://dx.doi.org/10.7554/eLife.03430.001 PMID:25255213

  5. Dissection of a nuclear localization signal.

    PubMed

    Hodel, M R; Corbett, A H; Hodel, A E

    2001-01-12

    The regulated process of protein import into the nucleus of a eukaryotic cell is mediated by specific nuclear localization signals (NLSs) that are recognized by protein import receptors. This study seeks to decipher the energetic details of NLS recognition by the receptor importin alpha through quantitative analysis of variant NLSs. The relative importance of each residue in two monopartite NLS sequences was determined using an alanine scanning approach. These measurements yield an energetic definition of a monopartite NLS sequence where a required lysine residue is followed by two other basic residues in the sequence K(K/R)X(K/R). In addition, the energetic contributions of the second basic cluster in a bipartite NLS (approximately 3 kcal/mol) as well as the energy of inhibition of the importin alpha importin beta-binding domain (approximately 3 kcal/mol) were also measured. These data allow the generation of an energetic scale of nuclear localization sequences based on a peptide's affinity for the importin alpha-importin beta complex. On this scale, a functional NLS has a binding constant of approximately 10 nM, whereas a nonfunctional NLS has a 100-fold weaker affinity of 1 µM. Further correlation between the current in vitro data and in vivo function will provide the foundation for a comprehensive quantitative model of protein import.
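
    The 100-fold affinity difference quoted in this abstract can be expressed as a free-energy difference with the standard relation ΔΔG = RT ln(Kd_weak / Kd_strong). The sketch below is an illustration only: the temperature of 298.15 K is an assumption (the abstract does not state one), and the function name is hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def ddg_kcal_per_mol(kd_weak: float, kd_strong: float,
                     temperature_k: float = 298.15) -> float:
    """Free-energy difference between two binders from their Kd values.

    Both Kd values must be in the same units. The default temperature is
    an assumed 25 degrees C, not a value taken from the abstract.
    """
    if kd_weak <= 0 or kd_strong <= 0:
        raise ValueError("dissociation constants must be positive")
    return R_KCAL * temperature_k * math.log(kd_weak / kd_strong)


if __name__ == "__main__":
    # Kd values from the abstract: about 10 nM (functional NLS) versus
    # about 1 uM (nonfunctional NLS), both expressed in molar units.
    print(f"{ddg_kcal_per_mol(1e-6, 10e-9):.2f} kcal/mol")
```

    Under that temperature assumption, a 100-fold change in Kd corresponds to a little under 3 kcal/mol, the same scale as the energetic contributions reported in the abstract.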

  6. Structural protein descriptors in 1-dimension and their sequence-based predictions.

    PubMed

    Kurgan, Lukasz; Disfani, Fatemeh Miri

    2011-09-01

    The last few decades have seen an increasing interest in the development and application of 1-dimensional (1D) descriptors of protein structure. These descriptors project 3D structural features onto 1D strings of residue-wise structural assignments. They cover a wide range of structural aspects including conformation of the backbone, burying depth/solvent exposure and flexibility of residues, and inter-chain residue-residue contacts. We perform a first-of-its-kind comprehensive comparative review of the existing 1D structural descriptors. We define, review and categorize ten structural descriptors and we also describe, summarize and contrast over eighty computational models that are used to predict these descriptors from the protein sequences. We show that the majority of the recent sequence-based predictors utilize machine learning models, with the most popular being neural networks, support vector machines, hidden Markov models, and support vector and linear regressions. These methods provide high-throughput predictions and most of them are accessible to a non-expert user via web servers and/or stand-alone software packages. We empirically evaluate several recent sequence-based predictors of secondary structure, disorder, and solvent accessibility descriptors using a benchmark set based on CASP8 targets. Our analysis shows that the secondary structure can be predicted with over 80% accuracy and segment overlap (SOV), disorder with over 0.9 AUC, 0.6 Matthews Correlation Coefficient (MCC), and 75% SOV, and relative solvent accessibility with PCC of 0.7 and MCC of 0.6 (0.86 when homology is used). We demonstrate that the secondary structure predicted from sequence without the use of homology modeling is as good as the structure extracted from the 3D folds predicted by top-performing template-based methods.

  7. Evolutionary Diversification of Aminopeptidase N in Lepidoptera by Conserved Clade-specific Amino Acid Residues

    PubMed Central

    Hughes, Austin L.

    2015-01-01

    Members of the aminopeptidase N (APN) gene family of the insect order Lepidoptera (moths and butterflies) bind the naturally insecticidal Cry toxins produced by the bacterium Bacillus thuringiensis. Phylogenetic analysis of amino acid sequences of seven lepidopteran APN classes provided strong support for the hypothesis that lepidopteran APN2 class arose by gene duplication prior to the most recent common ancestor of Lepidoptera and Diptera. The Cry toxin-binding region (BR) of lepidopteran and dipteran APNs was subject to stronger purifying selection within APN classes than was the remainder of the molecule, reflecting conservation of catalytic site and adjoining residues within the BR. Of lepidopteran APN classes, APN2, APN6, and APN8 showed the strongest evidence of functional specialization, both in expression patterns and in the occurrence of conserved derived amino acid residues. The latter three APN classes also shared a convergently evolved conserved residue close to the catalytic site. APN8 showed a particularly strong tendency towards class-specific conserved residues, including one of the catalytic site residues in the BR and ten others in close vicinity to the catalytic site residues. The occurrence of class-specific sequences along with the conservation of enzymatic function is consistent with the hypothesis that the presence of Cry toxins in the environment has been a factor shaping the evolution of this multi-gene family. PMID:24675701

  8. Structural and sequence features of two residue turns in beta-hairpins.

    PubMed

    Madan, Bharat; Seo, Sung Yong; Lee, Sun-Gu

    2014-09-01

    Beta-turns in beta-hairpins have been implicated as important sites in protein folding. In particular, two residue β-turns, the most abundant connecting elements in beta-hairpins, have been a major target for engineering protein stability and folding. In this study, we attempted to investigate and update the structural and sequence properties of two residue turns in beta-hairpins with a large data set. For this, 3977 beta-turns were extracted from 2394 nonhomologous protein chains and analyzed. First, the distribution, dihedral angles and twists of two residue turn types were determined, and compared with previous data. The trend of turn type occurrence and most structural features of the turn types were similar to previous results, but for the first time Type II turns in beta-hairpins were identified. Second, sequence motifs for the turn types were devised based on amino acid positional potentials of two-residue turns, and their distributions were examined. From this study, we could identify code-like sequence motifs for the two residue beta-turn types. Finally, structural and sequence properties of beta-strands in the beta-hairpins were analyzed, which revealed that the beta-strands showed no specific sequence and structural patterns for turn types. The analytical results in this study are expected to be a reference in the engineering or design of beta-hairpin turn structures and sequences. © 2014 Wiley Periodicals, Inc.

  9. Interaction of the p85 subunit of PI 3-kinase and its N-terminal SH2 domain with a PDGF receptor phosphorylation site: structural features and analysis of conformational changes.

    PubMed Central

    Panayotou, G; Bax, B; Gout, I; Federwisch, M; Wroblowski, B; Dhand, R; Fry, M J; Blundell, T L; Wollmer, A; Waterfield, M D

    1992-01-01

    Circular dichroism and fluorescence spectroscopy were used to investigate the structure of the p85 alpha subunit of the PI 3-kinase, a closely related p85 beta protein, and a recombinant SH2 domain-containing fragment of p85 alpha. Significant spectral changes, indicative of a conformational change, were observed on formation of a complex with a 17 residue peptide containing a phosphorylated tyrosine residue. The sequence of this peptide is identical to the sequence surrounding Tyr751 in the kinase-insert region of the platelet-derived growth factor beta-receptor (beta PDGFR). The rotational correlation times measured by fluorescence anisotropy decay indicated that phosphopeptide binding changed the shape of the SH2 domain-containing fragment. The CD and fluorescence spectroscopy data support the secondary structure prediction based on sequence analysis and provide evidence for flexible linker regions between the various domains of the p85 proteins. The significance of these results for SH2 domain-containing proteins is discussed. PMID:1330535

  10. New tyrosinase inhibitory decapeptide: Molecular insights into the role of tyrosine residues.

    PubMed

    Ochiai, Akihito; Tanaka, Seiya; Imai, Yuta; Yoshida, Hisashi; Kanaoka, Takumi; Tanaka, Takaaki; Taniguchi, Masayuki

    2016-06-01

    Tyrosinase, a rate-limiting enzyme in melanin biosynthesis, catalyzes the hydroxylation of l-tyrosine to 3,4-dihydroxy-l-phenylalanine (l-dopa) (monophenolase reaction) and the subsequent oxidation of l-dopa to l-dopaquinone (diphenolase reaction). Thus, tyrosinase inhibitors have been proposed as skin-lightening agents; however, many of the existing inhibitors cannot be widely used in the cosmetic industry due to their high cytotoxicity and instability. On the other hand, some tyrosinase inhibitory peptides have been reported as safe. In this study, we found that the peptide TH10, which has a similar sequence to the characterized inhibitory peptide P4, strongly inhibits the monophenolase reaction with a half-maximal inhibitory concentration of 102 μM. Seven of the ten amino acid residues in TH10 were identical to P4; however, TH10 possesses one N-terminal tyrosine, whereas P4 contains three tyrosine residues located at its N-terminus, center, and C-terminus. Subsequent analysis using sequence-shuffled variants indicated that the tyrosine residues located at the N-terminus and center of P4 have little to no contribution to its inhibitory activity. Furthermore, docking simulation analysis of these peptides with mushroom tyrosinase demonstrated that the active tyrosine residue was positioned close to copper ions, suggesting that TH10 and P4 bind to tyrosinase as a substrate analogue. Copyright © 2015 The Society for Biotechnology, Japan. Published by Elsevier B.V. All rights reserved.

  11. Sequence analysis of a canine parvovirus isolated from a red panda (Ailurus fulgens) in China.

    PubMed

    Qin, Qin; Loeffler, I Kati; Li, Ming; Tian, Kegong; Wei, Fuwen

    2007-06-01

    Canine parvovirus (CPV) was first recognized in the late 1970s in dogs and has mutated and spread throughout the world in canid and felid species since then. In this study, a novel CPV was isolated from the endangered red panda (Ailurus fulgens) in China. Nucleotide and phylogenetic analysis of the capsid protein VP2 gene classified the red panda parvovirus (RPPV) as a CPV-2a type. Substitution of Val for Gly at the conserved 300 residue in RPPV presents an unusual variation in the CPV-2a amino acid sequence and is further evidence for the continuing evolution of the virus. The 300 residue is important in distinguishing the antigenicity and host range of CPVs. The clinical significance and population impact of RPPV infection in captive red pandas in China is unknown and is an important topic for future research.

  12. Phenotypic analysis of NS5A variant from liver transplant patient with increased cyclosporine susceptibility

    PubMed Central

    Ansari, Israr-ul H.; Allen, Todd; Berical, Andrew; Stock, Peter G.; Barin, Burc; Striker, Rob

    2013-01-01

    Hepatitis C virus (HCV) replication is limited by cyclophilin inhibitors but it remains unclear how viral genetic variations influence susceptibility to cyclosporine (cyclosporine A, CsA), a cyclophilin inhibitor. In this study HCV from liver transplant patients was sequenced before and after CsA exposure. Phenotypic analysis of the NS5A sequence was performed using an HCV subgenomic replicon to determine CsA susceptibility. The data indicate that an atypical proline at position 328 in NS5A causes increased CsA sensitivity in the context of both genotype 1a and 1b residues. Point mutants mimicking other naturally occurring residues at this position also increased (Ala) or decreased (Arg) replicon sensitivity to CsA relative to the typical threonine (genotype 1a) or serine (genotype 1b) at this position. This work has implications for treatment of HCV by cyclophilin inhibitors. PMID:23290631

  13. Genetic and structural analyses of cytochrome P450 hydroxylases in sex hormone biosynthesis: Sequential origin and subsequent coevolution.

    PubMed

    Goldstone, Jared V; Sundaramoorthy, Munirathinam; Zhao, Bin; Waterman, Michael R; Stegeman, John J; Lamb, David C

    2016-01-01

    Biosynthesis of steroid hormones in vertebrates involves three cytochrome P450 hydroxylases, CYP11A1, CYP17A1 and CYP19A1, which catalyze sequential steps in steroidogenesis. These enzymes are conserved in the vertebrates, but their origin and existence in other chordate subphyla (Tunicata and Cephalochordata) have not been clearly established. In this study, selected protein sequences of CYP11A1, CYP17A1 and CYP19A1 were compiled and analyzed using multiple sequence alignment and phylogenetic analysis. Our analyses show that cephalochordates have sequences orthologous to vertebrate CYP11A1, CYP17A1 or CYP19A1, and that echinoderms and hemichordates possess CYP11-like but not CYP19 genes. While the cephalochordate sequences have low identity with the vertebrate sequences, reflecting evolutionary distance, the data show apparent origin of CYP11 prior to the evolution of CYP19 and possibly CYP17, thus indicating a sequential origin of these functionally related steroidogenic CYPs. Co-occurrence of the three CYPs in early chordates suggests that the three genes may have coevolved thereafter, and that functional conservation should be reflected in functionally important residues in the proteins. CYP19A1 has the largest number of conserved residues while CYP11A1 sequences are less conserved. Structural analyses of human CYP11A1, CYP17A1 and CYP19A1 show that critical substrate binding site residues are highly conserved in each enzyme family. The results emphasize that the steroidogenic pathways producing glucocorticoids and reproductive steroids are several hundred million years old and that the catalytic structural elements of the enzymes have been conserved over the same period of time. Analysis of these elements may help to identify when precursor functions linked to these enzymes first arose. Copyright © 2015 Elsevier Inc. All rights reserved.

  14. Cloning and sequence analysis of Hemonchus contortus HC58cDNA.

    PubMed

    Muleke, Charles I; Ruofeng, Yan; Lixin, Xu; Xinwen, Bo; Xiangrui, Li

    2007-06-01

    The complete coding sequence of Hemonchus contortus HC58cDNA was generated by rapid amplification of cDNA ends and polymerase chain reaction using primers based on the 5' and 3' ends of the parasite mRNA, accession no. AF305964. The HC58cDNA gene was 851 bp long, with an open reading frame of 717 bp encoding 239 amino acids, corresponding to a protein of approximately 27 kDa. Analysis of the amino acid sequence revealed conserved residues of cysteine, histidine, asparagine, occluding loop pattern, hemoglobinase motif and glutamine of the oxyanion hole characteristic of cathepsin B-like proteases (CBL). Comparison of the predicted amino acid sequences showed the protein shared 33.5-58.7% identity to cathepsin B homologues in the papain clan CA family (family C1). Phylogenetic analysis revealed close evolutionary proximity of the protein sequence to counterpart sequences in the CBL, suggesting that HC58cDNA was a member of the papain family.

  15. Structure, synthesis, and molecular cloning of dermaseptins B, a family of skin peptide antibiotics.

    PubMed

    Charpentier, S; Amiche, M; Mester, J; Vouille, V; Le Caer, J P; Nicolas, P; Delfour, A

    1998-06-12

    Analysis of antimicrobial activities that are present in the skin secretions of the South American frog Phyllomedusa bicolor revealed six polycationic (lysine-rich) and amphipathic alpha-helical peptides, 24-33 residues long, termed dermaseptins B1 to B6, respectively. Prepro-dermaseptins B all contain an almost identical signal peptide, which is followed by a conserved acidic propiece, a processing signal Lys-Arg, and a dermaseptin progenitor sequence. The 22-residue signal peptide plus the first 3 residues of the acidic propiece are encoded by conserved nucleotides encompassed by the first coding exon of the dermaseptin genes. The 25-residue amino-terminal region of prepro-dermaseptins B shares 50% identity with the corresponding region of precursors for D-amino acid containing opioid peptides or for antimicrobial peptides originating from the skin of distantly related frog species. The remarkable similarity found between prepro-proteins that encode end products with strikingly different sequences, conformations, biological activities and modes of action suggests that the corresponding genes have evolved through dissemination of a conserved "secretory cassette" exon.

  16. Functional specificity of a Hox protein mediated by the recognition of minor groove structure.

    PubMed

    Joshi, Rohit; Passner, Jonathan M; Rohs, Remo; Jain, Rinku; Sosinsky, Alona; Crickmore, Michael A; Jacob, Vinitha; Aggarwal, Aneel K; Honig, Barry; Mann, Richard S

    2007-11-02

    The recognition of specific DNA-binding sites by transcription factors is a critical yet poorly understood step in the control of gene expression. Members of the Hox family of transcription factors bind DNA by making nearly identical major groove contacts via the recognition helices of their homeodomains. In vivo specificity, however, often depends on extended and unstructured regions that link Hox homeodomains to a DNA-bound cofactor, Extradenticle (Exd). Using a combination of structure determination, computational analysis, and in vitro and in vivo assays, we show that Hox proteins recognize specific Hox-Exd binding sites via residues located in these extended regions that insert into the minor groove but only when presented with the correct DNA sequence. Our results suggest that these residues, which are conserved in a paralog-specific manner, confer specificity by recognizing a sequence-dependent DNA structure instead of directly reading a specific DNA sequence.

  17. Cloning and nucleotide sequence of the Pseudomonas aeruginosa glucose-selective OprB porin gene and distribution of OprB within the family Pseudomonadaceae.

    PubMed

    Wylie, J L; Worobec, E A

    1994-03-01

    OprB is a glucose-selective porin known to be produced by Pseudomonas aeruginosa and Pseudomonas putida. We have cloned and sequenced the oprB gene of P. aeruginosa and obtained expression of OprB in Escherichia coli. The mature protein consists of 423 amino acid residues with a deduced molecular mass of 47597 Da. Several clusters of amino acid residues, potentially involved in the structure or function of the protein, were identified. An area of regional homology with E. coli LamB was also identified. Carbohydrate-inducible proteins, potentially homologous to OprB, were identified in several rRNA homology-group-I pseudomonads by sodium dodecyl sulfate/polyacrylamide gel electrophoresis analysis, Western immunoblotting and N-terminal amino acid sequencing. These species also contained DNA that hybridized to a P. aeruginosa oprB gene probe.

  18. The primary structure of the thymidine kinase gene of fish lymphocystis disease virus.

    PubMed

    Schnitzler, P; Handermann, M; Szépe, O; Darai, G

    1991-06-01

    The DNA nucleotide sequence of the thymidine kinase (TK) gene of fish lymphocystis disease virus (FLDV), which has been localized between the coordinates 0.678 to 0.688 of the viral genome, was determined. The analysis of the DNA nucleotide sequence located between the recognition sites of HindIII (0.669 map unit; nucleotide position 1) and AccI (nucleotide position 2032) revealed the presence of an open reading frame of 954 bp on the lower strand of this region between nucleotide positions 1868 (ATG) and 915 (TAA). It encodes a protein of 318 amino acid residues. The evolutionary relationships of the TK gene of FLDV to the other known TK genes were investigated using the method of progressive sequence alignment. These analyses revealed a high degree of diversity between the protein sequence of FLDV TK gene and the amino acid composition of other TKs tested. However, significant conservation was detected at several regions of amino acid residues of the FLDV TK protein when compared to the amino acid sequence of TKs of African swine fever virus, fowlpox virus, Shope fibroma virus, and vaccinia virus and to the amino acid sequences of the cellular cytoplasmic TK of chicken, mouse, and man.

  19. Naturally selected hepatitis C virus polymorphisms confer broad neutralizing antibody resistance.

    PubMed

    Bailey, Justin R; Wasilewski, Lisa N; Snider, Anna E; El-Diwany, Ramy; Osburn, William O; Keck, Zhenyong; Foung, Steven K H; Ray, Stuart C

    2015-01-01

    For hepatitis C virus (HCV) and other highly variable viruses, broadly neutralizing mAbs are an important guide for vaccine development. The development of resistance to anti-HCV mAbs is poorly understood, in part due to a lack of neutralization testing against diverse, representative panels of HCV variants. Here, we developed a neutralization panel expressing diverse, naturally occurring HCV envelopes (E1E2s) and used this panel to characterize neutralizing breadth and resistance mechanisms of 18 previously described broadly neutralizing anti-HCV human mAbs. The observed mAb resistance could not be attributed to polymorphisms in E1E2 at known mAb-binding residues. Additionally, hierarchical clustering analysis of neutralization resistance patterns revealed relationships between mAbs that were not predicted by prior epitope mapping, identifying 3 distinct neutralization clusters. Using this clustering analysis and envelope sequence data, we identified polymorphisms in E2 that confer resistance to multiple broadly neutralizing mAbs. These polymorphisms, which are not at mAb contact residues, also conferred resistance to neutralization by plasma from HCV-infected subjects. Together, our method of neutralization clustering with sequence analysis reveals that polymorphisms at noncontact residues may be a major immune evasion mechanism for HCV, facilitating viral persistence and presenting a challenge for HCV vaccine development.

  20. A novel peptide from the ACEI/BPP-CNP precursor in the venom of Crotalus durissus collilineatus.

    PubMed

    Higuchi, Shigesada; Murayama, Nobuhiro; Saguchi, Ken-ichi; Ohi, Hiroaki; Fujita, Yoshiaki; da Silva, Nelson Jorge; de Siqueira, Rodrigo José Bezerra; Lahlou, Saad; Aird, Steven D

    2006-10-01

    In crotaline venoms, angiotensin-converting enzyme inhibitors [ACEIs, also known as bradykinin potentiating peptides (BPPs)], are products of a gene coding for an ACEI/BPP-C-type natriuretic peptide (CNP) precursor. In the genes from Bothrops jararaca and Gloydius blomhoffii, ACEI/BPP sequences are repeated. Sequencing of a cDNA clone from venom glands of Crotalus durissus collilineatus showed that two ACEIs/BPPs are located together at the N-terminus, but without repeats. An additional sequence for CNP was unexpectedly found at the C-terminus. Homologous genes for the ACEI/BPP-CNP precursor suggest that most crotaline venoms contain both ACEIs/BPPs and CNP. The sequence of ACEIs/BPPs is separated from the CNP sequence by a long spacer sequence. Previously, there was no evidence that this spacer actually coded any expressed peptides. Aird and Kaiser (1986, unpublished) previously isolated and sequenced a peptide of 11 residues (TPPAGPDVGPR) from Crotalus viridis viridis venom. In the present study, analysis of the cDNA clone from C. d. collilineatus revealed a nearly identical sequence in the ACEI/BPP-CNP spacer. Fractionation of the crude venom by reverse phase HPLC (C(18)), and analysis of the fractions by mass spectrometry (MS) indicated a component of 1020.5 Da. Amino acid sequencing by MS/MS confirmed that C. d. collilineatus venom contains the peptide TPPAGPDGGPR. Its high proline content and paired proline residues are typical of venom hypotensive peptides, although it lacks the usual N-terminal pyroglutamate. It has no demonstrable hypotensive activity when injected intravenously in rats; however, its occurrence in the venoms of dissimilar species suggests that its presence is not accidental. Evidence suggests that these novel toxins probably activate anaphylatoxin C3a receptors.

  1. Molecular cloning and characterization of an acetylcholinesterase cDNA in the brown planthopper, Nilaparvata lugens.

    PubMed

    Yang, Zhifan; Chen, Jun; Chen, Yongqin; Jiang, Sijing

    2010-01-01

    A full cDNA encoding an acetylcholinesterase (AChE, EC 3.1.1.7) was cloned and characterized from the brown planthopper, Nilaparvata lugens Stål (Hemiptera: Delphacidae). The complete cDNA (2467 bp) contains a 1938-bp open reading frame encoding 646 amino acid residues. The amino acid sequence of the AChE deduced from the cDNA consists of 30 residues for a putative signal peptide and 616 residues for the mature protein with a predicted molecular weight of 69,418. The three residues (Ser242, Glu371, and His485) that putatively form the catalytic triad and the six Cys that form intra-subunit disulfide bonds are completely conserved, and 10 out of the 14 aromatic residues lining the active site gorge of the AChE are also conserved. Northern blot analysis of poly(A)+ RNA showed an approximately 2.6-kb transcript, and Southern blot analysis revealed there likely was just a single copy of this gene in N. lugens. The deduced protein sequence is most similar to AChE of Nephotettix cincticeps with 83% amino acid identity. Phylogenetic analysis constructed with 45 AChEs from 30 species showed that the deduced N. lugens AChE formed a cluster with the other 8 insect AChE2s. Additionally, the hypervariable region and amino acids specific to insect AChE2 also existed in the AChE of N. lugens. The results revealed that the AChE cDNA cloned in this work belongs to the insect AChE2 subgroup, which is orthologous to Drosophila AChE. Comparison of the AChEs between the susceptible and resistant strains revealed that a point mutation, Gly185Ser, is likely responsible for the insensitivity of the AChE to methamidophos in the resistant strain.

  2. PDNAsite: Identification of DNA-binding Site from Protein Sequence by Incorporating Spatial and Sequence Context

    PubMed Central

    Zhou, Jiyun; Xu, Ruifeng; He, Yulan; Lu, Qin; Wang, Hongpeng; Kong, Bing

    2016-01-01

    Protein-DNA interactions are involved in many fundamental biological processes essential for cellular function. Most of the existing computational approaches employed only the sequence context of the target residue for its prediction. In the present study, for each target residue, we applied both the spatial context and the sequence context to construct the feature space. Subsequently, Latent Semantic Analysis (LSA) was applied to remove the redundancies in the feature space. Finally, a predictor (PDNAsite) was developed through the integration of the support vector machines (SVM) classifier and ensemble learning. Results on the PDNA-62 and the PDNA-224 datasets demonstrate that features extracted from spatial context provide more information than those from sequence context and that combining them gives a further performance gain. An analysis of the number of binding sites in the spatial context of the target site indicates that the interactions between binding sites next to each other are important for protein-DNA recognition and their binding ability. The comparison between our proposed PDNAsite method and the existing methods indicates that PDNAsite outperforms most of the existing methods and is a useful tool for DNA-binding site identification. A web-server of our predictor (http://hlt.hitsz.edu.cn:8080/PDNAsite/) is freely accessible to the biological research community. PMID:27282833

  3. Embedding strategies for effective use of information from multiple sequence alignments.

    PubMed Central

    Henikoff, S.; Henikoff, J. G.

    1997-01-01

    We describe a new strategy for utilizing multiple sequence alignment information to detect distant relationships in searches of sequence databases. A single sequence representing a protein family is enriched by replacing conserved regions with position-specific scoring matrices (PSSMs) or consensus residues derived from multiple alignments of family members. In comprehensive tests of these and other family representations, PSSM-embedded queries produced the best results overall when used with a special version of the Smith-Waterman searching algorithm. Moreover, embedding consensus residues instead of PSSMs improved performance with readily available single sequence query searching programs, such as BLAST and FASTA. Embedding PSSMs or consensus residues into a representative sequence improves searching performance by extracting multiple alignment information from motif regions while retaining single sequence information where alignment is uncertain. PMID:9070452
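
    The consensus-embedding idea in this abstract can be illustrated with a minimal Python sketch. It assumes an ungapped aligned block and a known start offset in the representative sequence; the function names, the toy block, and the majority-vote consensus rule are illustrative assumptions, not the authors' implementation (which also covers PSSM embedding with a special Smith-Waterman search program).

```python
from collections import Counter


def column_consensus(block):
    """Return the most common residue in each column of an ungapped block."""
    if len({len(row) for row in block}) != 1:
        raise ValueError("all rows of the block must have the same length")
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*block))


def embed_consensus(representative, block, start):
    """Overwrite the conserved region of a representative sequence
    (beginning at index `start`) with the block's consensus residues."""
    consensus = column_consensus(block)
    end = start + len(consensus)
    if start < 0 or end > len(representative):
        raise ValueError("block does not fit inside the representative sequence")
    return representative[:start] + consensus + representative[end:]


# Hypothetical toy data: a three-column conserved block from three family members.
block = ["ACD", "ACE", "GCE"]
query = embed_consensus("MKTAAAWW", block, start=3)
print(query)
```

    The resulting string is still an ordinary single sequence, which is why, as the abstract notes, it can be submitted to single-sequence search programs such as BLAST and FASTA.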

  4. Elman RNN based classification of proteins sequences on account of their mutual information.

    PubMed

    Mishra, Pooja; Nath Pandey, Paras

    2012-10-21

    In the present work we estimate residue correlation within protein sequences using the mutual information (MI) of adjacent residues, based on structural and solvent accessibility properties of amino acids. The long-range correlation between nonadjacent residues is improved by constructing a mutual information vector (MIV) for each protein sequence, so that each protein sequence is associated with its corresponding MIVs. These MIVs are given to an Elman RNN to obtain the classification of protein sequences. The modeling power of MIV was shown to be significantly better, giving a new approach towards alignment-free classification of protein sequences. We also conclude that MIVs based on sequence structural and solvent accessibility properties are better predictors. Copyright © 2012 Elsevier Ltd. All rights reserved.
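
    As a rough illustration of the adjacent-residue mutual information mentioned above, the following Python sketch computes MI (in bits) between each residue and its immediate successor within one sequence. The optional `classes` mapping, which would group amino acids by a structural or solvent-accessibility property, is a hypothetical parameter; the abstract does not specify the groupings or how the MIV is assembled.

```python
from collections import Counter
from math import log2


def adjacent_mutual_information(seq, classes=None):
    """Mutual information (bits) between symbols at positions i and i+1.

    `classes` is an optional dict mapping residues to property classes
    (an assumption for illustration); unmapped residues are kept as-is.
    """
    if classes is not None:
        seq = [classes.get(res, res) for res in seq]
    pairs = list(zip(seq, seq[1:]))
    if not pairs:
        return 0.0
    n = len(pairs)
    pair_counts = Counter(pairs)
    left_counts = Counter(a for a, _ in pairs)
    right_counts = Counter(b for _, b in pairs)
    mi = 0.0
    for (a, b), count in pair_counts.items():
        p_ab = count / n
        p_a = left_counts[a] / n
        p_b = right_counts[b] / n
        mi += p_ab * log2(p_ab / (p_a * p_b))
    return mi


# A strictly alternating sequence: each residue fully determines its neighbour.
print(adjacent_mutual_information("ABABABAB"))
# A constant sequence carries no information about neighbours.
print(adjacent_mutual_information("AAAA"))
```

    Repeating the calculation for several property-based groupings would give one MI value per property, which is one plausible way to assemble a vector of the kind the abstract calls an MIV.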

  5. Catalysis by the second class of tRNA(m1G37) methyl transferase requires a conserved proline.

    PubMed

    Christian, Thomas; Evilia, Caryn; Hou, Ya-Ming

    2006-06-20

    The enzyme tRNA(m1G37) methyl transferase catalyzes the transfer of a methyl group from S-adenosyl methionine (AdoMet) to the N1 position of G37, which is 3' to the anticodon sequence and whose modification is important for maintaining the reading frame fidelity. While the enzyme in bacteria is highly conserved and is encoded by the trmD gene, recent studies show that the counterpart of this enzyme in archaea and eukarya, encoded by the trm5 gene, is unrelated to trmD both in sequence and in structure. To further test this prediction, we seek to identify residues in the second class of tRNA(m1G37) methyl transferase that are required for catalysis. Such residues should provide mechanistic insights into the distinct structural origins of the two classes. Using the Trm5 enzyme of the archaeon Methanocaldococcus jannaschii (previously MJ0883) as an example, we have created mutants to test many conserved residues for their catalytic potential and substrate-binding capabilities with respect to both AdoMet and tRNA. We identified that the proline at position 267 (P267) is a critical residue for catalysis, because substitution of this residue severely decreases the kcat of the methylation reaction in steady-state kinetic analysis, and the k(chem) in single turnover kinetic analysis. However, substitution of P267 has milder effect on the Km and little effect on the Kd of either substrate. Because P267 has no functional side chain that can directly participate in the chemistry of methyl transfer, we suggest that its role in catalysis is to stabilize conformations of enzyme and substrates for proper alignment of reactive groups at the enzyme active site. Sequence analysis shows that P267 is embedded in a peptide motif that is conserved among the Trm5 family, but absent from the TrmD family, supporting the notion that the two families are descendants of unrelated protein structures.

  6. Mining protein loops using a structural alphabet and statistical exceptionality

    PubMed Central

    2010-01-01

    Background: Protein loops encompass 50% of protein residues in available three-dimensional structures. These regions are often involved in protein functions, e.g. binding sites and catalytic pockets. However, the description of protein loops with conventional tools is a difficult task. Regular secondary structures, helices and strands, have been widely studied whereas loops, because they are highly variable in terms of sequence and structure, are difficult to analyze. Due to data sparsity, long loops have rarely been systematically studied. Results: We developed a simple and accurate method that allows the description and analysis of the structures of short and long loops using structural motifs without restriction on loop length. This method is based on the structural alphabet HMM-SA. HMM-SA allows the simplification of a three-dimensional protein structure into a one-dimensional string of states, where each state is a four-residue prototype fragment, called a structural letter. The difficult task of the structural grouping of huge data sets is thus easily accomplished by handling structural letter strings as in conventional protein sequence analysis. We systematically extracted all seven-residue fragments in a bank of 93000 protein loops and grouped them according to the structural-letter sequence, named a structural word. This approach permits a systematic analysis of loops of all sizes since we consider the structural motifs of seven residues rather than complete loops. We focused the analysis on highly recurrent words of loops (observed more than 30 times). Our study reveals that 73% of loop-lengths are covered by only 3310 highly recurrent structural words (out of 28274 observed words). These structural words have low structural variability (mean RMSd of 0.85 Å). As expected, half of these motifs display a flanking-region preference but interestingly, two thirds are shared by short (less than 12 residues) and long loops. Moreover, half of recurrent motifs exhibit a significant level of amino-acid conservation with at least four significant positions and 87% of long loops contain at least one such word. We complement our analysis with the detection of statistically over-represented patterns of structural letters as in conventional DNA sequence analysis. About 30% (930) of structural words are over-represented, and cover about 40% of loop lengths. Interestingly, these words exhibit lower structural variability and higher sequential specificity, suggesting structural or functional constraints. Conclusions: We developed a method to systematically decompose and study protein loops using recurrent structural motifs. This method is based on the structural alphabet HMM-SA and not on structural alignment and geometrical parameters. We extracted meaningful structural motifs that are found in both short and long loops. To our knowledge, it is the first time that pattern mining helps to increase the signal-to-noise ratio in protein loops. This finding helps to better describe protein loops and might help to decrease the complexity of long-loop analysis. Detailed results are available at http://www.mti.univ-paris-diderot.fr/publication/supplementary/2009/ACCLoop/. PMID:20132552

  7. Mining protein loops using a structural alphabet and statistical exceptionality.

    PubMed

    Regad, Leslie; Martin, Juliette; Nuel, Gregory; Camproux, Anne-Claude

    2010-02-04

    Protein loops encompass 50% of protein residues in available three-dimensional structures. These regions are often involved in protein functions, e.g. binding sites and catalytic pockets. However, the description of protein loops with conventional tools is a difficult task. Regular secondary structures, helices and strands, have been widely studied whereas loops, because they are highly variable in terms of sequence and structure, are difficult to analyze. Due to data sparsity, long loops have rarely been systematically studied. We developed a simple and accurate method that allows the description and analysis of the structures of short and long loops using structural motifs without restriction on loop length. This method is based on the structural alphabet HMM-SA. HMM-SA allows the simplification of a three-dimensional protein structure into a one-dimensional string of states, where each state is a four-residue prototype fragment, called a structural letter. The difficult task of the structural grouping of huge data sets is thus easily accomplished by handling structural letter strings as in conventional protein sequence analysis. We systematically extracted all seven-residue fragments in a bank of 93000 protein loops and grouped them according to the structural-letter sequence, named a structural word. This approach permits a systematic analysis of loops of all sizes since we consider the structural motifs of seven residues rather than complete loops. We focused the analysis on highly recurrent words of loops (observed more than 30 times). Our study reveals that 73% of loop-lengths are covered by only 3310 highly recurrent structural words (out of 28274 observed words). These structural words have low structural variability (mean RMSd of 0.85 Å). As expected, half of these motifs display a flanking-region preference but interestingly, two thirds are shared by short (less than 12 residues) and long loops. Moreover, half of recurrent motifs exhibit a significant level of amino-acid conservation with at least four significant positions and 87% of long loops contain at least one such word. We complement our analysis with the detection of statistically over-represented patterns of structural letters as in conventional DNA sequence analysis. About 30% (930) of structural words are over-represented, and cover about 40% of loop lengths. Interestingly, these words exhibit lower structural variability and higher sequential specificity, suggesting structural or functional constraints. We developed a method to systematically decompose and study protein loops using recurrent structural motifs. This method is based on the structural alphabet HMM-SA and not on structural alignment and geometrical parameters. We extracted meaningful structural motifs that are found in both short and long loops. To our knowledge, it is the first time that pattern mining helps to increase the signal-to-noise ratio in protein loops. This finding helps to better describe protein loops and might help to decrease the complexity of long-loop analysis. Detailed results are available at http://www.mti.univ-paris-diderot.fr/publication/supplementary/2009/ACCLoop/.
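
    The word-counting step described in this abstract can be sketched in a few lines of Python. The sketch assumes that each loop has already been encoded as a string of structural letters and that consecutive four-residue letters overlap by three residues, so that a seven-residue fragment corresponds to a word of four letters; that overlap, the function names, and the toy data are assumptions for illustration, not details confirmed by the abstract.

```python
from collections import Counter

# Assumption: four overlapping 4-residue letters span 4 + 3 = 7 residues.
WORD_LENGTH = 4


def count_structural_words(loops, word_length=WORD_LENGTH):
    """Count every sliding-window word across all encoded loops."""
    counts = Counter()
    for letters in loops:
        for i in range(len(letters) - word_length + 1):
            counts[letters[i:i + word_length]] += 1
    return counts


def recurrent_words(counts, min_occurrences):
    """Keep words observed more than `min_occurrences` times."""
    return {word: n for word, n in counts.items() if n > min_occurrences}


# Hypothetical toy bank of two encoded loops.
counts = count_structural_words(["ABCDE", "ABCD"])
print(counts)
print(recurrent_words(counts, min_occurrences=1))
```

    With the threshold quoted in the abstract, the call would be `recurrent_words(counts, min_occurrences=30)`. Because words have a fixed length, short and long loops contribute to the same counts, which is what lets the analysis ignore loop length.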

  8. Collagenolytic Matrix Metalloproteinase Activities toward Peptomeric Triple-Helical Substrates.

    PubMed

    Stawikowski, Maciej J; Stawikowska, Roma; Fields, Gregg B

    2015-05-19

    Although collagenolytic matrix metalloproteinases (MMPs) possess common domain organizations, there are subtle differences in their processing of collagenous triple-helical substrates. In this study, we have incorporated peptoid residues into collagen model triple-helical peptides and examined MMP activities toward these peptomeric chimeras. Several different peptoid residues were incorporated into triple-helical substrates at subsites P3, P1, P1', and P10' individually or in combination, and the effects of the peptoid residues were evaluated on the activities of full-length MMP-1, MMP-8, MMP-13, and MMP-14/MT1-MMP. Most peptomers showed little discrimination between MMPs. However, a peptomer containing N-methyl Gly (sarcosine) in the P1' subsite and N-isobutyl Gly (NLeu) in the P10' subsite was hydrolyzed efficiently only by MMP-13 [nomenclature relative to the α1(I)772-786 sequence]. Cleavage site analysis showed hydrolysis at the Gly-Gln bond, indicating a shifted binding of the triple helix compared to the parent sequence. Favorable hydrolysis by MMP-13 was not due to sequence specificity or instability of the substrate triple helix but rather was based on the specific interactions of the P7' peptoid residue with the MMP-13 hemopexin-like domain. A fluorescence resonance energy transfer triple-helical peptomer was constructed and found to be readily processed by MMP-13, not cleaved by MMP-1 and MMP-8, and weakly hydrolyzed by MT1-MMP. The influence of the triple-helical structure containing peptoid residues on the interaction between MMP subsites and individual substrate residues may provide additional information about the mechanism of collagenolysis, the understanding of collagen specificity, and the design of selective MMP probes.

  9. Filtrates and Residues: Qualitative Analysis of Some Transition Metals.

    ERIC Educational Resources Information Center

    Kilner, Cary

    1985-01-01

    Describes a qualitative analysis laboratory in which students examine specific precipitates that can be used to identify copper, cobalt, nickel, and iron cations. The objective of the laboratory is to determine which test or sequence of tests unambiguously identifies each cation and to use the results to identify several unknowns. (JN)

  10. Combining modelling and mutagenesis studies of synaptic vesicle protein 2A to identify a series of residues involved in racetam binding.

    PubMed

    Shi, Jiye; Anderson, Dina; Lynch, Berkley A; Castaigne, Jean-Gabriel; Foerch, Patrik; Lebon, Florence

    2011-10-01

    LEV (levetiracetam), an antiepileptic drug which possesses a unique profile in animal models of seizure and epilepsy, has as its unique binding site in brain, SV2A (synaptic vesicle protein 2A). Previous studies have used a chimaeric and site-specific mutagenesis approach to identify three residues in the putative tenth transmembrane helix of SV2A that, when mutated, alter binding of LEV and related racetam derivatives to SV2A. In the present paper, we report a combined modelling and mutagenesis study that successfully identifies another 11 residues in SV2A that appear to be involved in ligand binding. Sequence analysis and modelling of SV2A suggested residues equivalent to critical functional residues of other MFS (major facilitator superfamily) transporters. Alanine scanning of these and other SV2A residues resulted in the identification of residues affecting racetam binding, including Ile273 which differentiated between racetam analogues, when mutated to alanine. Integrating mutagenesis results with docking analysis led to the construction of a mutant in which six SV2A residues were replaced with corresponding SV2B residues. This mutant showed racetam ligand-binding affinity intermediate to the affinities observed for SV2A and SV2B.

  11. Redesigning the type II' β-turn in green fluorescent protein to type I': implications for folding kinetics and stability.

    PubMed

    Madan, Bharat; Sokalingam, Sriram; Raghunathan, Govindan; Lee, Sun-Gu

    2014-10-01

    Both Type I' and Type II' β-turns have the same sense of the β-turn twist that is compatible with the β-sheet twist. They occur predominantly in two residue β-hairpins, but the occurrence of Type I' β-turns is two times higher than Type II' β-turns. This suggests that Type I' β-turns may be more stable than Type II' β-turns, and Type I' β-turn sequence and structure can be more favorable for protein folding than Type II' β-turns. Here, we redesigned the native Type II' β-turn in GFP to Type I' β-turn, and investigated its effect on protein folding and stability. The Type I' β-turns were designed based on the statistical analysis of residues in natural Type I' β-turns. The substitution of the native "GD" sequence of i+1 and i+2 residues with Type I' preferred "(N/D)G" sequence motif increased the folding rate by 50% and slightly improved the thermodynamic stability. Despite the enhancement of in vitro refolding kinetics and stability of the redesigned mutants, they showed poor soluble expression level compared to wild type. To overcome this problem, i and i + 3 residues of the designed Type I' β-turn were further engineered. The mutation of Thr to Lys at i + 3 could restore the in vivo soluble expression of the Type I' mutant. This study indicates that Type II' β-turns in natural β-hairpins can be further optimized by converting the sequence to Type I'. © 2014 Wiley Periodicals, Inc.

  12. Differential signatures of bacterial and mammalian IMP dehydrogenase enzymes.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zhang, R.; Evans, G.; Rotella, F.

    1999-06-01

    IMP dehydrogenase (IMPDH) is an essential enzyme of de novo guanine nucleotide synthesis. IMPDH inhibitors have clinical utility as antiviral, anticancer or immunosuppressive agents. The essential nature of this enzyme suggests its therapeutic applications may be extended to the development of antimicrobial agents. Bacterial IMPDH enzymes show biochemical and kinetic characteristics that are different than the mammalian IMPDH enzymes, suggesting IMPDH may be an attractive target for the development of antimicrobial agents. We suggest that the biochemical and kinetic differences between bacterial and mammalian enzymes are a consequence of the variance of specific, identifiable amino acid residues. Identification of these residues or combination of residues that impart this mammalian or bacterial enzyme signature is a prerequisite for the rational identification of agents that specifically target the bacterial enzyme. We used sequence alignments of IMPDH proteins to identify sequence signatures associated with bacterial or eukaryotic IMPDH enzymes. These selections were further refined to discern those likely to have a role in catalysis using information derived from the bacterial and mammalian IMPDH crystal structures and site-specific mutagenesis. Candidate bacterial sequence signatures identified by this process include regions involved in subunit interactions, the active site flap and the NAD binding region. Analysis of sequence alignments in these regions indicates a pattern of catalytic residues conserved in all enzymes and a secondary pattern of amino acid conservation associated with the major phylogenetic groups. Elucidation of the basis for this mammalian/bacterial IMPDH signature will provide insight into the catalytic mechanism of this enzyme and the foundation for the development of highly specific inhibitors.

  13. Protein interface classification by evolutionary analysis

    PubMed Central

    2012-01-01

    Background: Distinguishing biologically relevant interfaces from lattice contacts in protein crystals is a fundamental problem in structural biology. Despite efforts towards the computational prediction of interface character, many issues are still unresolved. Results: We present here a protein-protein interface classifier that relies on evolutionary data to detect the biological character of interfaces. The classifier uses a simple geometric measure, number of core residues, and two evolutionary indicators based on the sequence entropy of homolog sequences. Both aim at detecting differential selection pressure between interface core and rim or rest of surface. The core residues, defined as fully buried residues (>95% burial), appear to be fundamental determinants of biological interfaces: their number is in itself a powerful discriminator of interface character and together with the evolutionary measures it is able to clearly distinguish evolved biological contacts from crystal ones. We demonstrate that this definition of core residues leads to distinctively better results than earlier definitions from the literature. The stringent selection and quality filtering of structural and sequence data was key to the success of the method. Most importantly we demonstrate that a more conservative selection of homolog sequences - with relatively high sequence identities to the query - is able to produce a clearer signal than previous attempts. Conclusions: An evolutionary approach like the one presented here is key to the advancement of the field, which so far was missing an effective method exploiting the evolutionary character of protein interfaces. Its coverage and performance will only improve over time thanks to the incessant growth of sequence databases. Currently our method reaches an accuracy of 89% in classifying interfaces of the Ponstingl 2003 datasets and it lends itself to a variety of useful applications in structural biology and bioinformatics. We made the corresponding software implementation available to the community as an easy-to-use graphical web interface at http://www.eppic-web.org. PMID:23259833
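
    The sequence-entropy measure underlying the evolutionary indicators above can be illustrated with a short Python sketch. It computes the Shannon entropy of alignment columns and compares the mean entropy of core positions with that of rim positions. The function names, the toy alignment, and the choice of a plain mean are assumptions for illustration; the abstract does not give the exact form of the two indicators.

```python
from collections import Counter
from math import log2


def column_entropy(column):
    """Shannon entropy (bits) of the residues in one alignment column."""
    if not column:
        raise ValueError("column must not be empty")
    n = len(column)
    return sum((c / n) * log2(n / c) for c in Counter(column).values())


def mean_entropy(alignment, positions):
    """Mean column entropy over the given column indices of an alignment."""
    columns = ["".join(row[i] for row in alignment) for i in positions]
    return sum(column_entropy(col) for col in columns) / len(columns)


# Hypothetical homolog alignment: columns 0 and 2 conserved, column 1 variable.
alignment = ["ACA", "ADA", "AEA", "AFA"]
core_positions = [0, 2]  # assumed core residues of the interface
rim_positions = [1]      # assumed rim residues

print(mean_entropy(alignment, core_positions))
print(mean_entropy(alignment, rim_positions))
```

    Lower entropy in the core than in the rim is the kind of signal of differential selection pressure that the abstract describes for biological interfaces.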

  14. Sequence characterization of cDNA sequence of encoding of an antimicrobial Peptide with no disulfide bridge from the Iranian mesobuthus eupeus venomous glands.

    PubMed

    Farajzadeh-Sheikh, Ahmad; Jolodar, Abbas; Ghaemmaghami, Shamsedin

    2013-01-01

    Scorpion venom glands produce antimicrobial peptides (AMPs) that can rapidly kill a broad range of microbes and have additional activities that affect the quality and effectiveness of innate responses and inflammation. In this study, we report the identification of a cDNA sequence encoding cysteine-free antimicrobial peptides isolated from the venom glands of this species. Total RNA was extracted from the venom glands of Iranian Mesobuthus eupeus, and cDNA was synthesized using the modified oligo(dT). The cDNA was used as the template for semi-nested RT-PCR. PCR products were used for direct nucleotide sequencing and the results were compared with the GenBank database. A 213-bp cDNA fragment encoding the entire coding region of an antimicrobial toxin from the venom glands of the Iranian scorpion M. eupeus was isolated. The full-length coding region was 210 bp and contained an open reading frame of 70 amino acids with a predicted molecular mass of 7970.48 Da and a theoretical pI of 9.10. The open reading frame encodes a precursor of 70 amino acid residues, including a signal peptide of 23 residues, a propeptide of 7 residues, and a mature peptide of 34 residues with no disulfide bridge. The peptide has detectable sequence identity to the Lesser Asian Mesobuthus eupeus MeVAMP-2 (98%), MeVAMP-9 (60%) and several previously described AMPs from other scorpion venoms including Mesobuthus martensii (94%) and Buthus occitanus israelis (82%). The secondary structure of the peptide consisted mainly of α-helical structure, which was generally conserved among previously reported scorpion counterparts. The phylogenetic analysis showed that the Iranian MeAMP-like toxin was similar but not identical to the venom antimicrobial peptides from the Lesser Asian scorpion Mesobuthus eupeus.

  15. The complete amino acid sequence of human skeletal-muscle fructose-bisphosphate aldolase.

    PubMed Central

    Freemont, P S; Dunbar, B; Fothergill-Gilmore, L A

    1988-01-01

    The complete amino acid sequence of human skeletal-muscle fructose-bisphosphate aldolase, comprising 363 residues, was determined. The sequence was deduced by automated sequencing of CNBr-cleavage, o-iodosobenzoic acid-cleavage, trypsin-digest and staphylococcal-proteinase-digest fragments. Comparison of the sequence with other class I aldolase sequences shows that the mammalian muscle isoenzyme is one of the most highly conserved enzymes known, with only about 2% of the residues changing per 100 million years. Non-mammalian aldolases appear to be evolving at the same rate as other glycolytic enzymes, with about 4% of the residues changing per 100 million years. Secondary-structure predictions are analysed in an accompanying paper [Sawyer, Fothergill-Gilmore & Freemont (1988) Biochem. J. 249, 789-793]. PMID:3355497

  16. Evidence for N- and C-terminal processing of a plant defense-related enzyme: Primary structure of tobacco prepro-β-1,3-glucanase

    PubMed Central

    Shinshi, H.; Wenzler, H.; Neuhaus, J.-M.; Felix, G.; Hofsteenge, J.; Meins, F.

    1988-01-01

    Tobacco glucan endo-1,3-β-glucosidase (β-1,3-glucanase; 1,3-β-D-glucan glucanohydrolase; EC 3.2.1.39) exhibits complex hormonal and developmental regulation and is induced when plants are infected with pathogens. We determined the primary structure of this enzyme from the nucleotide sequence of five partial cDNA clones and the amino acid sequence of five peptides covering a total of 70 residues. β-1,3-Glucanase is produced as a 359-residue preproenzyme with an N-terminal hydrophobic signal peptide of 21 residues and a C-terminal extension of 22 residues containing a putative N-glycosylation site. The results of pulse-chase experiments with tunicamycin provide evidence that the first step in processing is loss of the signal peptide and addition of an oligosaccharide side chain. The glycosylated intermediate is further processed with the loss of the oligosaccharide side chain and C-terminal extension to give the mature enzyme. Heterogeneity in the sequences of cDNA clones and of mature protein and in Southern blot analysis of restriction endonuclease fragments indicates that tobacco β-1,3-glucanase is encoded by a small gene family. Two or three members of this family appear to have their evolutionary origin in each of the progenitors of tobacco, Nicotiana sylvestris and Nicotiana tomentosiformis. PMID:16593965

  17. Sequence swapping does not result in conformation swapping for the beta4/beta5 and beta8/beta9 beta-hairpin turns in human acidic fibroblast growth factor.

    PubMed

    Kim, Jaewon; Lee, Jihun; Brych, Stephen R; Logan, Timothy M; Blaber, Michael

    2005-02-01

    The beta-turn is the most common type of nonrepetitive structure in globular proteins, comprising ~25% of all residues; however, a detailed understanding of effects of specific residues upon beta-turn stability and conformation is lacking. Human acidic fibroblast growth factor (FGF-1) is a member of the beta-trefoil superfold and contains a total of five beta-hairpin structures (antiparallel beta-sheets connected by a reverse turn). beta-Turns related by the characteristic threefold structural symmetry of this superfold exhibit different primary structures, and in some cases, different secondary structures. As such, they represent a useful system with which to study the role that turn sequences play in determining structure, stability, and folding of the protein. Two turns related by the threefold structural symmetry, the beta4/beta5 and beta8/beta9 turns, were subjected to both sequence-swapping and poly-glycine substitution mutations, and the effects upon stability, folding, and structure were investigated. In the wild-type protein these turns are of identical length, but exhibit different conformations. These conformations were observed to be retained during sequence-swapping and glycine substitution mutagenesis. The results indicate that the beta-turn structure at these positions is not determined by the turn sequence. Structural analysis suggests that residues flanking the turn are a primary structural determinant of the conformation within the turn.

  18. GFam: a platform for automatic annotation of gene families.

    PubMed

    Sasidharan, Rajkumar; Nepusz, Tamás; Swarbreck, David; Huala, Eva; Paccanaro, Alberto

    2012-10-01

    We have developed GFam, a platform for automatic annotation of gene/protein families. GFam provides a framework for genome initiatives and model organism resources to build domain-based families and derive meaningful functional labels, and offers a seamless approach to propagating functional annotation across periodic genome updates. GFam is a hybrid approach that uses a greedy algorithm to chain component domains from InterPro annotation provided by its 12 member resources, followed by a sequence-based connected component analysis of un-annotated sequence regions, to derive a consensus domain architecture for each sequence and subsequently generate families based on common architectures. For the proteome of Arabidopsis, our integrated approach increases sequence coverage by 7.2 percentage points and residue coverage by 14.6 percentage points relative to the best single-constituent database within InterPro. The true power of GFam lies in maximizing annotation provided by the different InterPro data sources that offer resource-specific coverage for different regions of a sequence. GFam's capability to capture higher sequence and residue coverage can be useful for genome annotation, comparative genomics and functional studies. GFam is general-purpose software and can be used for any collection of protein sequences. The software is open source and can be obtained from http://www.paccanarolab.org/software/gfam/.

  19. Functional region prediction with a set of appropriate homologous sequences-an index for sequence selection by integrating structure and sequence information with spatial statistics

    PubMed Central

    2012-01-01

    Background The detection of conserved residue clusters on a protein structure is one of the effective strategies for the prediction of functional protein regions. Various methods, such as Evolutionary Trace, have been developed based on this strategy. In such approaches, the conserved residues are identified through comparisons of homologous amino acid sequences. Therefore, the selection of homologous sequences is a critical step. It is empirically known that a certain degree of sequence divergence in the set of homologous sequences is required for the identification of conserved residues. However, the development of a method to select homologous sequences appropriate for the identification of conserved residues has not been sufficiently addressed. An objective and general method to select appropriate homologous sequences is desired for the efficient prediction of functional regions. Results We have developed a novel index to select the sequences appropriate for the identification of conserved residues, and implemented the index within our method to predict the functional regions of a protein. The implementation of the index improved the performance of the functional region prediction. The index represents the degree of conserved residue clustering on the tertiary structure of the protein. For this purpose, the structure and sequence information were integrated within the index by the application of spatial statistics. Spatial statistics is a field of statistics in which not only the attributes but also the geometrical coordinates of the data are considered simultaneously. Higher degrees of clustering generate larger index scores. We adopted the set of homologous sequences with the highest index score, under the assumption that the best prediction accuracy is obtained when the degree of clustering is the maximum. 
The set of sequences selected by the index led to higher functional region prediction performance than the sets of sequences selected by other sequence-based methods. Conclusions Appropriate homologous sequences are selected automatically and objectively by the index. Such sequence selection improved the performance of functional region prediction. As far as we know, this is the first approach in which spatial statistics have been applied to protein analyses. Such integration of structure and sequence information would be useful for other bioinformatics problems. PMID:22643026

  20. Sequencing of T-superfamily conotoxins from Conus virgo: pyroglutamic acid identification and disulfide arrangement by MALDI mass spectrometry.

    PubMed

    Mandal, Amit Kumar; Ramasamy, Mani Ramakrishnan Santhana; Sabareesh, Varatharajan; Openshaw, Matthew E; Krishnan, Kozhalmannom S; Balaram, Padmanabhan

    2007-08-01

    De novo mass spectrometric sequencing of two Conus peptides, Vi1359 and Vi1361, from the vermivorous cone snail Conus virgo, found off the southern Indian coast, is presented. The peptides, whose masses differ only by 2 Da, possess two disulfide bonds and an amidated C-terminus. Simple chemical modifications and enzymatic cleavage coupled with matrix assisted laser desorption ionization (MALDI) mass spectrometric analysis aided in establishing the sequences of Vi1359, ZCCITIPECCRI-NH(2), and Vi1361, ZCCPTMPECCRI-NH(2), which differ only at residues 4 and 6 (Z = pyroglutamic acid). The presence of the pyroglutamyl residue at the N-terminus was unambiguously identified by chemical hydrolysis of the cyclic amide, followed by esterification. The presence of Ile residues in both the peptides was confirmed from high-energy collision induced dissociation (CID) studies, using the observation of w(n)- and d(n)-ions as a diagnostic. Differential cysteine labeling, in conjunction with MALDI-MS/MS, permitted establishment of disulfide connectivity in both peptides as Cys2-Cys9 and Cys3-Cys10. The cysteine pattern clearly reveals that the peptides belong to the class of T-superfamily conotoxins, in particular the T-1 superfamily.
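
    The sequence-level statements in this abstract can be checked with a short sketch. The Python below is only an illustration and not part of the reported mass spectrometric workflow: the helper name `differing_positions` is hypothetical, and the residue masses are standard monoisotopic values supplied here as an assumption, not taken from the abstract.

```python
# Monoisotopic residue masses (Da) for the residues that differ between
# the two peptides. Assumed standard values, not quoted in the abstract.
RESIDUE_MASS = {"I": 113.08406, "P": 97.05276, "M": 131.04049}


def differing_positions(a, b):
    """Return the 1-based positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return [i for i, (x, y) in enumerate(zip(a, b), start=1) if x != y]


# Sequences as given in the abstract (Z = pyroglutamic acid).
vi1359 = "ZCCITIPECCRI"
vi1361 = "ZCCPTMPECCRI"

diffs = differing_positions(vi1359, vi1361)

# Net residue mass change going from Vi1359 to Vi1361.
mass_shift = sum(
    RESIDUE_MASS[vi1361[i - 1]] - RESIDUE_MASS[vi1359[i - 1]] for i in diffs
)

# 1-based positions of the cysteines that form the two disulfide bonds.
cys_positions = [i for i, r in enumerate(vi1359, start=1) if r == "C"]

print(diffs, round(mass_shift, 3), cys_positions)
```

    Under these assumptions the two sequences differ at positions 4 and 6, the cysteines sit at positions 2, 3, 9 and 10, and the net residue mass shift is about 1.93 Da, which is consistent with the 2 Da difference and the Cys2-Cys9 and Cys3-Cys10 pairing stated in the abstract.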

  1. Q-learning residual analysis: application to the effectiveness of sequences of antipsychotic medications for patients with schizophrenia.

    PubMed

    Ertefaie, Ashkan; Shortreed, Susan; Chakraborty, Bibhas

    2016-06-15

    Q-learning is a regression-based approach that uses longitudinal data to construct dynamic treatment regimes, which are sequences of decision rules that use patient information to inform future treatment decisions. An optimal dynamic treatment regime is composed of a sequence of decision rules that indicate how to optimally individualize treatment using the patients' baseline and time-varying characteristics to optimize the final outcome. Constructing optimal dynamic regimes using Q-learning depends heavily on the assumption that regression models at each decision point are correctly specified; yet model checking in the context of Q-learning has been largely overlooked in the current literature. In this article, we show that residual plots obtained from standard Q-learning models may fail to adequately check the quality of the model fit. We present a modified Q-learning procedure that accommodates residual analyses using standard tools. We present simulation studies showing the advantage of the proposed modification over standard Q-learning. We illustrate this new Q-learning approach using data collected from a sequential multiple assignment randomized trial of patients with schizophrenia. Copyright © 2016 John Wiley & Sons, Ltd.

  2. A Modified LS+AR Model to Improve the Accuracy of the Short-term Polar Motion Prediction

    NASA Astrophysics Data System (ADS)

    Wang, Z. W.; Wang, Q. X.; Ding, Y. Q.; Zhang, J. J.; Liu, S. S.

    2017-03-01

    The LS (Least Squares)+AR (AutoRegressive) model has two problems in polar motion forecasting: the inner residual of the LS fit is reasonable, but the residual of the LS extrapolation is poor; and the LS fitting residual sequence is non-linear. It is therefore unsuitable to establish an AR model for the residual sequence to be forecasted based on the residual sequence before the forecast epoch. In this paper, we address these two problems in two steps. First, restrictions are added to the two endpoints of the LS fitting data to fix them on the LS fitting curve, so that the fitting values next to the two endpoints are very close to the observation values. Second, we select the interpolation residual sequence of an inward LS fitting curve, which has a variation trend similar to that of the LS extrapolation residual sequence, as the modeling object of AR for the residual forecast. Calculation examples show that this solution can effectively improve the short-term polar motion prediction accuracy of the LS+AR model. In addition, comparisons with the forecast models RLS (Robustified Least Squares)+AR, RLS+ARIMA (AutoRegressive Integrated Moving Average), and LS+ANN (Artificial Neural Network) confirm the feasibility and effectiveness of the solution for polar motion forecasting. The results, especially for forecasts of 1-10 days, show that the forecast accuracy of the proposed model can reach the world level.
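
    For orientation, the baseline LS+AR idea that this abstract starts from can be sketched in a few lines of Python. This is a deliberately simplified illustration under stated assumptions: the LS part is a straight-line fit (real polar motion models also include periodic terms), the AR part is first order, the epochs are evenly spaced, and the function names are hypothetical. It does not implement the endpoint constraints or the inward-fit residual selection proposed in the paper.

```python
def fit_line(t, y):
    """Ordinary least-squares fit y ~ a + b*t; returns (a, b)."""
    n = len(t)
    t_mean = sum(t) / n
    y_mean = sum(y) / n
    sxx = sum((ti - t_mean) ** 2 for ti in t)
    sxy = sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, y))
    b = sxy / sxx
    a = y_mean - b * t_mean
    return a, b


def fit_ar1(residuals):
    """Least-squares AR(1) coefficient phi for r[k] ~ phi * r[k-1]."""
    num = sum(r0 * r1 for r0, r1 in zip(residuals, residuals[1:]))
    den = sum(r0 * r0 for r0 in residuals[:-1])
    return num / den if den else 0.0


def ls_ar_forecast(t, y, steps):
    """Extrapolate the LS line and add an AR(1) forecast of its residuals."""
    a, b = fit_line(t, y)
    residuals = [yi - (a + b * ti) for ti, yi in zip(t, y)]
    phi = fit_ar1(residuals)
    dt = t[-1] - t[-2]  # assumes evenly spaced epochs
    r = residuals[-1]
    forecasts = []
    for k in range(1, steps + 1):
        r = phi * r  # AR(1) residual forecast decays geometrically
        forecasts.append(a + b * (t[-1] + k * dt) + r)
    return forecasts


# Toy series (made-up numbers, not polar motion data).
epochs = list(range(8))
values = [1.0, 1.4, 1.7, 2.3, 2.4, 3.1, 3.3, 3.9]
print(ls_ar_forecast(epochs, values, steps=3))
```

    The sketch makes the dependence visible: the AR model is trained only on fitting residuals before the forecast epoch, which is the step the abstract identifies as problematic when extrapolation residuals behave differently from fitting residuals.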

  3. PREvaIL, an integrative approach for inferring catalytic residues using sequence, structural, and network features in a machine-learning framework.

    PubMed

    Song, Jiangning; Li, Fuyi; Takemoto, Kazuhiro; Haffari, Gholamreza; Akutsu, Tatsuya; Chou, Kuo-Chen; Webb, Geoffrey I

    2018-04-14

    Determining the catalytic residues in an enzyme is critical to our understanding the relationship between protein sequence, structure, function, and enhancing our ability to design novel enzymes and their inhibitors. Although many enzymes have been sequenced, and their primary and tertiary structures determined, experimental methods for enzyme functional characterization lag behind. Because experimental methods used for identifying catalytic residues are resource- and labor-intensive, computational approaches have considerable value and are highly desirable for their ability to complement experimental studies in identifying catalytic residues and helping to bridge the sequence-structure-function gap. In this study, we describe a new computational method called PREvaIL for predicting enzyme catalytic residues. This method was developed by leveraging a comprehensive set of informative features extracted from multiple levels, including sequence, structure, and residue-contact network, in a random forest machine-learning framework. Extensive benchmarking experiments on eight different datasets based on 10-fold cross-validation and independent tests, as well as side-by-side performance comparisons with seven modern sequence- and structure-based methods, showed that PREvaIL achieved competitive predictive performance, with an area under the receiver operating characteristic curve and area under the precision-recall curve ranging from 0.896 to 0.973 and from 0.294 to 0.523, respectively. We demonstrated that this method was able to capture useful signals arising from different levels, leveraging such differential but useful types of features and allowing us to significantly improve the performance of catalytic residue prediction. We believe that this new method can be utilized as a valuable tool for both understanding the complex sequence-structure-function relationships of proteins and facilitating the characterization of novel enzymes lacking functional annotations. 
Copyright © 2018 Elsevier Ltd. All rights reserved.

  4. Polymorphisms in the phosducin (PDC) gene on chromosome 1q25-32

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Humphries, P.; Mansergh, F.C.; Farrar, G.J.

    1994-09-01

    Phosducin (33 kDa protein or MEKA) is a principal water-soluble phosphoprotein in the rod and cone photoreceptor cells and pinealocytes. This protein modulates the phototransduction cascade by binding to the beta and gamma subunit complexes of transducin. The PDC gene has been mapped to 1q25-32, the region of linkage of two hereditary retinal degenerative disorders: autosomal dominant juvenile-onset open-angle glaucoma and one form of autosomal recessive RP. Using previously published sequence data, PCR primers were designed to amplify the coding and 5′ flanking regions of the PDC gene. Direct sequencing revealed three polymorphisms in the 5′ flanking region, two of which were in regions highly homologous between humans and mice. Analysis of the polymorphisms was then extended to larger population samples using SSCPE and denaturing gel analysis. The first polymorphism PDC1 resulted from an insertion of a G residue at position -653/4. Allele frequencies were determined to be 0.51 (insG) and 0.49 (normal) giving a PIC value of 0.50. A deletion of a T residue at position -488 was the basis of the PDC2 polymorphism with allele frequencies of 0.88 (normal) and 0.12 (delT) and a PIC value of 0.21. Interestingly, the allele with an inserted G residue in PDC1 always segregated with the deleted T allele in PDC2. The third polymorphism PDC3 was caused by a T or G residue at position -1083. Allele frequencies of 0.26 (G residue) and 0.74 (T residue) were determined from an analysis of 80 individuals with an overall PIC value of 0.39. The identification of these three polymorphisms in the PDC gene will be useful for future genetic linkage studies of chromosome 1q in inherited retinopathies.
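
    Informativeness measures of the kind quoted in this abstract are computed from allele frequencies. The Python sketch below shows two commonly used formulas, expected heterozygosity and a Botstein-style polymorphism information content (PIC). The function names are hypothetical, and the abstract does not state which formula produced its PIC values, so the sketch makes no claim to reproduce them.

```python
def expected_heterozygosity(freqs):
    """Expected heterozygosity: 1 - sum(p_i^2)."""
    return 1.0 - sum(p * p for p in freqs)


def pic(freqs):
    """Botstein-style PIC: 1 - sum(p_i^2) - sum_{i<j} 2 * p_i^2 * p_j^2."""
    cross = sum(
        2.0 * freqs[i] ** 2 * freqs[j] ** 2
        for i in range(len(freqs))
        for j in range(i + 1, len(freqs))
    )
    return expected_heterozygosity(freqs) - cross


# Allele frequencies as quoted in the abstract.
loci = {
    "PDC1": (0.51, 0.49),
    "PDC2": (0.88, 0.12),
    "PDC3": (0.26, 0.74),
}

for name, freqs in loci.items():
    print(name, round(expected_heterozygosity(freqs), 2), round(pic(freqs), 2))
```

    For the quoted frequencies, the heterozygosity expression evaluates to roughly 0.50, 0.21 and 0.38, while the Botstein-style expression gives lower values, so which definition underlies a reported PIC is worth checking before comparing markers across studies.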

  5. Statistical Linkage Analysis of Substitutions in Patient-Derived Sequences of Genotype 1a Hepatitis C Virus Nonstructural Protein 3 Exposes Targets for Immunogen Design

    PubMed Central

    Quadeer, Ahmed A.; Louie, Raymond H. Y.; Shekhar, Karthik; Chakraborty, Arup K.; Hsing, I-Ming

    2014-01-01

    ABSTRACT Chronic hepatitis C virus (HCV) infection is one of the leading causes of liver failure and liver cancer, affecting around 3% of the world's population. The extreme sequence variability of the virus resulting from error-prone replication has thwarted the discovery of a universal prophylactic vaccine. It is known that vigorous and multispecific cellular immune responses, involving both helper CD4+ and cytotoxic CD8+ T cells, are associated with the spontaneous clearance of acute HCV infection. Escape mutations in viral epitopes can, however, abrogate protective T-cell responses, leading to viral persistence and associated pathologies. Despite the propensity of the virus to mutate, there might still exist substitutions that incur a fitness cost. In this paper, we identify groups of coevolving residues within HCV nonstructural protein 3 (NS3) by analyzing diverse sequences of this protein using ideas from random matrix theory and associated methods. Our analyses indicate that one of these groups comprises a large percentage of residues for which HCV appears to resist multiple simultaneous substitutions. Targeting multiple residues in this group through vaccine-induced immune responses should either lead to viral recognition or elicit escape substitutions that compromise viral fitness. Our predictions are supported by published clinical data, which suggested that immune genotypes associated with spontaneous clearance of HCV preferentially recognized and targeted this vulnerable group of residues. Moreover, mapping the sites of this group onto the available protein structure provided insight into its functional significance. An epitope-based immunogen is proposed as an alternative to the NS3 epitopes in the peptide-based vaccine IC41. IMPORTANCE Despite much experimental work on HCV, a thorough statistical study of the HCV sequences for the purpose of immunogen design was missing in the literature. 
Such a study is vital to identify epistatic couplings among residues that can provide useful insights for designing a potent vaccine. In this work, ideas from random matrix theory were applied to characterize the statistics of substitutions within the diverse publicly available sequences of the genotype 1a HCV NS3 protein, leading to a group of sites for which HCV appears to resist simultaneous substitutions possibly due to deleterious effect on viral fitness. Our analysis leads to completely novel immunogen designs for HCV. In addition, the NS3 epitopes used in the recently proposed peptide-based vaccine IC41 were analyzed in the context of our framework. Our analysis predicts that alternative NS3 epitopes may be worth exploring as they might be more efficacious. PMID:24760894

  6. The Treacher Collins syndrome (TCOF1) gene product, treacle, is targeted to the nucleolus by signals in its C-terminus.

    PubMed

    Winokur, S T; Shiang, R

    1998-11-01

    The TCOF1 gene product, treacle, responsible for the craniofacial disorder Treacher Collins syndrome, has been predicted to be a member of a class of nucleolar phosphoproteins based on its primary amino acid sequence. Treacle is a low complexity protein with ten repeating units of acidic and basic residues, each of which contains a large number of putative casein kinase 2 and protein kinase C phosphorylation sites. In addition, the C-terminus of treacle contains multiple putative nuclear localization signals. The overall structure of treacle, as well as sequence similarity to several nucleolar phosphoproteins, predicts that treacle is a member of this class of proteins. Using green fluorescent protein fusion constructs with the full-length and deleted domains of the murine homolog of treacle, we demonstrate that the cellular localization of treacle is nucleolar. This localization is mediated by the last 41 residues of the C-terminus (residues 1262-1302). At least two functional nuclear localization signals have been identified in the protein, one between residues 1176 and 1270 and the second within the last 32 residues of the protein (1271-1302). The nucleolar localization signal is disrupted by two constructs that split the C-terminal region between residues 1270 and 1271. This study provides the first direct analysis of treacle and demonstrates that the protein involved in TCOF1 is a nucleolar protein.

  7. Protein Sectors: Statistical Coupling Analysis versus Conservation

    PubMed Central

    Teşileanu, Tiberiu; Colwell, Lucy J.; Leibler, Stanislas

    2015-01-01

    Statistical coupling analysis (SCA) is a method for analyzing multiple sequence alignments that was used to identify groups of coevolving residues termed “sectors”. The method applies spectral analysis to a matrix obtained by combining correlation information with sequence conservation. It has been asserted that the protein sectors identified by SCA are functionally significant, with different sectors controlling different biochemical properties of the protein. Here we reconsider the available experimental data and note that it involves almost exclusively proteins with a single sector. We show that in this case sequence conservation is the dominating factor in SCA, and can alone be used to make statistically equivalent functional predictions. Therefore, we suggest shifting the experimental focus to proteins for which SCA identifies several sectors. Correlations in protein alignments, which have been shown to be informative in a number of independent studies, would then be less dominated by sequence conservation. PMID:25723535
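
    Because this abstract argues that per-position conservation alone can drive the functional predictions, a minimal conservation score may help make the idea concrete. The Python sketch below uses a normalized Shannon entropy per alignment column. It is a swapped-in, simpler measure chosen for illustration, not the SCA computation or the conservation measure used by the authors, and the function names and toy alignment are hypothetical.

```python
import math
from collections import Counter


def column_conservation(column, alphabet_size=20):
    """Return 1 - H/log2(alphabet_size), where H is the column's Shannon entropy.

    A fully conserved column scores 1.0. Gap characters, if present,
    are simply counted as one more symbol in this sketch.
    """
    counts = Counter(column)
    n = len(column)
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
    return 1.0 - entropy / math.log2(alphabet_size)


def alignment_conservation(alignment):
    """Score every column of an alignment given as equal-length strings."""
    if len({len(seq) for seq in alignment}) != 1:
        raise ValueError("alignment rows must be non-empty and of equal length")
    return [column_conservation(col) for col in zip(*alignment)]


# Toy alignment (made-up sequences).
toy_alignment = ["MKVL", "MKIL", "MRVL", "MKAL"]
scores = alignment_conservation(toy_alignment)
print([round(s, 3) for s in scores])
```

    Ranking positions by such a score and comparing the top-ranked set with a coupling-based sector is one simple way to probe how much of a sector is explained by conservation alone.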

  8. Identification of the regulatory autophosphorylation site of autophosphorylation-dependent protein kinase (auto-kinase). Evidence that auto-kinase belongs to a member of the p21-activated kinase family.

    PubMed

    Yu, J S; Chen, W J; Ni, M H; Chan, W H; Yang, S D

    1998-08-15

    Autophosphorylation-dependent protein kinase (auto-kinase) was identified from pig brain and liver on the basis of its unique autophosphorylation/activation property [Yang, Fong, Yu and Liu (1987) J. Biol. Chem. 262, 7034-7040; Yang, Chang and Soderling (1987) J. Biol. Chem. 262, 9421-9427]. Its substrate consensus sequence motif was determined as being -R-X-(X)-S*/T*-X3-S/T-. To characterize auto-kinase further, we partly sequenced the kinase purified from pig liver. The N-terminal sequence (VDGGAKTSDKQKKKAXMTDE) and two internal peptide sequences (EKLRTIV and LQNPEK/ILTP/FI) of auto-kinase were obtained. These sequences identify auto-kinase as a C-terminal catalytic fragment of p21-activated protein kinase 2 (PAK2 or gamma-PAK) lacking its N-terminal regulatory region. Auto-kinase can be recognized by an antibody raised against the C-terminal peptide of human PAK2 by immunoblotting. Furthermore the autophosphorylation site sequence of auto-kinase was successfully predicted on the basis of its substrate consensus sequence motif and the known PAK2 sequence, and was further demonstrated to be RST(P)MVGTPYWMAPEVVTR by phosphoamino acid analysis, manual Edman degradation and phosphopeptide mapping via the help of phosphorylation site analysis of a synthetic peptide corresponding to the sequence of PAK2 from residues 396 to 418. During the activation process, auto-kinase autophosphorylates mainly on a single threonine residue Thr402 (according to the sequence numbering of human PAK2). In addition, a phospho-specific antibody against a synthetic phosphopeptide containing this identified sequence was generated and shown to be able to differentially recognize the activated auto-kinase autophosphorylated at Thr402 but not the non-phosphorylated/inactive auto-kinase. 
Immunoblot analysis with this phospho-specific antibody further revealed that the change in phosphorylation level of Thr402 of auto-kinase was well correlated with the activity change of the kinase during both autophosphorylation/activation and protein phosphatase-mediated dephosphorylation/inactivation processes. Taken together, our results identify Thr402 as the regulatory autophosphorylation site of auto-kinase, which is a C-terminal catalytic fragment of PAK2.
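
    The substrate consensus motif quoted in this abstract, -R-X-(X)-S*/T*-X3-S/T-, can be expressed as a regular expression and scanned against a sequence. The Python sketch below is one possible reading of that notation (X as any residue, the parenthesized X as optional, S*/T* as the phospho-acceptor). The pattern and the function name `find_acceptor_sites` are assumptions made for illustration and not the authors' prediction procedure.

```python
import re

# Arg, one or two arbitrary residues, the phospho-acceptor Ser/Thr (captured),
# three arbitrary residues, then Ser/Thr.
CONSENSUS = re.compile(r"R.{1,2}([ST]).{3}[ST]")


def find_acceptor_sites(sequence):
    """Return (matched motif, 0-based acceptor index) for each non-overlapping match."""
    return [(m.group(0), m.start(1)) for m in CONSENSUS.finditer(sequence)]


# Autophosphorylation site sequence from the abstract, written without
# the (P) marker on the phosphorylated threonine.
site_peptide = "RSTMVGTPYWMAPEVVTR"
hits = find_acceptor_sites(site_peptide)
print(hits)
```

    Under this reading, the scan finds the motif RSTMVGT with the acceptor at the third residue of the peptide, the same threonine that the abstract marks as phosphorylated.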

  9. Identification of the regulatory autophosphorylation site of autophosphorylation-dependent protein kinase (auto-kinase). Evidence that auto-kinase belongs to a member of the p21-activated kinase family.

    PubMed Central

    Yu, J S; Chen, W J; Ni, M H; Chan, W H; Yang, S D

    1998-01-01

    Autophosphorylation-dependent protein kinase (auto-kinase) was identified from pig brain and liver on the basis of its unique autophosphorylation/activation property [Yang, Fong, Yu and Liu (1987) J. Biol. Chem. 262, 7034-7040; Yang, Chang and Soderling (1987) J. Biol. Chem. 262, 9421-9427]. Its substrate consensus sequence motif was determined as being -R-X-(X)-S*/T*-X3-S/T-. To characterize auto-kinase further, we partly sequenced the kinase purified from pig liver. The N-terminal sequence (VDGGAKTSDKQKKKAXMTDE) and two internal peptide sequences (EKLRTIV and LQNPEK/ILTP/FI) of auto-kinase were obtained. These sequences identify auto-kinase as a C-terminal catalytic fragment of p21-activated protein kinase 2 (PAK2 or gamma-PAK) lacking its N-terminal regulatory region. Auto-kinase can be recognized by an antibody raised against the C-terminal peptide of human PAK2 by immunoblotting. Furthermore the autophosphorylation site sequence of auto-kinase was successfully predicted on the basis of its substrate consensus sequence motif and the known PAK2 sequence, and was further demonstrated to be RST(P)MVGTPYWMAPEVVTR by phosphoamino acid analysis, manual Edman degradation and phosphopeptide mapping via the help of phosphorylation site analysis of a synthetic peptide corresponding to the sequence of PAK2 from residues 396 to 418. During the activation process, auto-kinase autophosphorylates mainly on a single threonine residue Thr402 (according to the sequence numbering of human PAK2). In addition, a phospho-specific antibody against a synthetic phosphopeptide containing this identified sequence was generated and shown to be able to differentially recognize the activated auto-kinase autophosphorylated at Thr402 but not the non-phosphorylated/inactive auto-kinase. 
Immunoblot analysis with this phospho-specific antibody further revealed that the change in phosphorylation level of Thr402 of auto-kinase was well correlated with the activity change of the kinase during both autophosphorylation/activation and protein phosphatase-mediated dephosphorylation/inactivation processes. Taken together, our results identify Thr402 as the regulatory autophosphorylation site of auto-kinase, which is a C-terminal catalytic fragment of PAK2. PMID:9693111

  10. ArrayPitope: Automated Analysis of Amino Acid Substitutions for Peptide Microarray-Based Antibody Epitope Mapping.

    PubMed

    Hansen, Christian Skjødt; Østerbye, Thomas; Marcatili, Paolo; Lund, Ole; Buus, Søren; Nielsen, Morten

    2017-01-01

    Identification of epitopes targeted by antibodies (B cell epitopes) is of critical importance for the development of many diagnostic and therapeutic tools. For clinical usage, such epitopes must be extensively characterized in order to validate specificity and to document potential cross-reactivity. B cell epitopes are typically classified as either linear epitopes, i.e. short consecutive segments from the protein sequence or conformational epitopes adapted through native protein folding. Recent advances in high-density peptide microarrays enable high-throughput, high-resolution identification and characterization of linear B cell epitopes. Using exhaustive amino acid substitution analysis of peptides originating from target antigens, these microarrays can be used to address the specificity of polyclonal antibodies raised against such antigens containing hundreds of epitopes. However, the interpretation of the data provided in such large-scale screenings is far from trivial and in most cases it requires advanced computational and statistical skills. Here, we present an online application for automated identification of linear B cell epitopes, allowing the non-expert user to analyse peptide microarray data. The application takes as input quantitative peptide data of fully or partially substituted overlapping peptides from a given antigen sequence and identifies epitope residues (residues that are significantly affected by substitutions) and visualize the selectivity towards each residue by sequence logo plots. Demonstrating utility, the application was used to identify and address the antibody specificity of 18 linear epitope regions in Human Serum Albumin (HSA), using peptide microarray data consisting of fully substituted peptides spanning the entire sequence of HSA and incubated with polyclonal rabbit anti-HSA (and mouse anti-rabbit-Cy3). The application is made available at: www.cbs.dtu.dk/services/ArrayPitope.

  11. ArrayPitope: Automated Analysis of Amino Acid Substitutions for Peptide Microarray-Based Antibody Epitope Mapping

    PubMed Central

    Hansen, Christian Skjødt; Østerbye, Thomas; Marcatili, Paolo; Lund, Ole; Buus, Søren

    2017-01-01

    Identification of epitopes targeted by antibodies (B cell epitopes) is of critical importance for the development of many diagnostic and therapeutic tools. For clinical usage, such epitopes must be extensively characterized in order to validate specificity and to document potential cross-reactivity. B cell epitopes are typically classified as either linear epitopes, i.e. short consecutive segments from the protein sequence or conformational epitopes adapted through native protein folding. Recent advances in high-density peptide microarrays enable high-throughput, high-resolution identification and characterization of linear B cell epitopes. Using exhaustive amino acid substitution analysis of peptides originating from target antigens, these microarrays can be used to address the specificity of polyclonal antibodies raised against such antigens containing hundreds of epitopes. However, the interpretation of the data provided in such large-scale screenings is far from trivial and in most cases it requires advanced computational and statistical skills. Here, we present an online application for automated identification of linear B cell epitopes, allowing the non-expert user to analyse peptide microarray data. The application takes as input quantitative peptide data of fully or partially substituted overlapping peptides from a given antigen sequence and identifies epitope residues (residues that are significantly affected by substitutions) and visualize the selectivity towards each residue by sequence logo plots. Demonstrating utility, the application was used to identify and address the antibody specificity of 18 linear epitope regions in Human Serum Albumin (HSA), using peptide microarray data consisting of fully substituted peptides spanning the entire sequence of HSA and incubated with polyclonal rabbit anti-HSA (and mouse anti-rabbit-Cy3). The application is made available at: www.cbs.dtu.dk/services/ArrayPitope. PMID:28095436

  12. Honey bee (Apis mellifera) transferrin-gene structure and the role of ecdysteroids in the developmental regulation of its expression.

    PubMed

    do Nascimento, Adriana Mendes; Cuvillier-Hot, Virginie; Barchuk, Angel Roberto; Simões, Zilá Luz Paulino; Hartfelder, Klaus

    2004-05-01

    Social life is prone to invasion by microorganisms, and binding of ferric ions by transferrin is an efficient strategy to restrict their access to iron. In this study, we isolated cDNA and genomic clones encoding an Apis mellifera transferrin (AmTRF) gene. It has an open reading frame (ORF) of 2136 bp spread over nine exons. The deduced protein sequence comprises 686 amino acid residues plus a 26-residue signal sequence, giving a predicted molecular mass of 76 kDa. Comparison of the deduced AmTRF amino acid sequence with known insect transferrins revealed significant similarity extending over the entire sequence. It clusters with monoferric transferrins, with which it shares putative iron-binding residues in the N-terminal lobe. In a functional analysis of AmTRF expression in honey bee development, we monitored its expression profile in the larval and pupal stages. The negative regulation of AmTRF by ecdysteroids deduced from the developmental expression profile was confirmed by experimental treatment of spinning-stage honey bee larvae with 20-hydroxyecdysone, and of fourth-instar larvae with juvenile hormone. A juvenile hormone application to spinning-stage larvae, in contrast, had only a minor effect on AmTRF transcript levels. This is the first study implicating ecdysteroids in the developmental regulation of transferrin expression in an insect species.

  13. InterProSurf: a web server for predicting interacting sites on protein surfaces

    PubMed Central

    Negi, Surendra S.; Schein, Catherine H.; Oezguen, Numan; Power, Trevor D.; Braun, Werner

    2009-01-01

    Summary A new web server, InterProSurf, predicts interacting amino acid residues in proteins that are most likely to interact with other proteins, given the 3D structures of subunits of a protein complex. The prediction method is based on solvent accessible surface area of residues in the isolated subunits, a propensity scale for interface residues and a clustering algorithm to identify surface regions with residues of high interface propensities. Here we illustrate the application of InterProSurf to determine which areas of Bacillus anthracis toxins and measles virus hemagglutinin protein interact with their respective cell surface receptors. The computationally predicted regions overlap with those regions previously identified as interface regions by sequence analysis and mutagenesis experiments. PMID:17933856

  14. Common fold in helix–hairpin–helix proteins

    PubMed Central

    Shao, Xuguang; Grishin, Nick V.

    2000-01-01

    Helix–hairpin–helix (HhH) is a widespread motif involved in non-sequence-specific DNA binding. The majority of HhH motifs function as DNA-binding modules; however, some of them are used to mediate protein–protein interactions or have acquired enzymatic activity by incorporating catalytic residues (DNA glycosylases). From sequence and structural analysis of HhH-containing proteins we conclude that most HhH motifs are integrated as a part of a five-helical domain, termed (HhH)2 domain here. It typically consists of two consecutive HhH motifs that are linked by a connector helix and displays pseudo-2-fold symmetry. (HhH)2 domains show clear structural integrity and a conserved hydrophobic core composed of seven residues, one residue from each α-helix and each hairpin, and deserve recognition as a distinct protein fold. In addition to known HhH in the structures of RuvA, RadA, MutY and DNA-polymerases, we have detected new HhH motifs in sterile alpha motif and barrier-to-autointegration factor domains, the α-subunit of Escherichia coli RNA-polymerase, DNA-helicase PcrA and DNA glycosylases. Statistically significant sequence similarity of HhH motifs and pronounced structural conservation argue for homology between (HhH)2 domains in different protein families. Our analysis helps to clarify how non-symmetric protein motifs bind to the double helix of DNA through the formation of a pseudo-2-fold symmetric (HhH)2 functional unit. PMID:10908318

  15. Role of DNA conformation & energetic insights in Msx-1-DNA recognition as revealed by molecular dynamics studies on specific and nonspecific complexes.

    PubMed

    Kachhap, Sangita; Singh, Balvinder

    2015-01-01

    In most homeodomain-DNA complexes, glutamine or lysine is present at the 50th position and interacts with the 5th and 6th nucleotides of the core recognition region. Molecular dynamics simulations of the Msx-1-DNA complex (Q50-TG) and its variant complexes, that is, specific (Q50K-CC), nonspecific (Q50-CC) having a mutation in DNA and (Q50K-TG) in protein, have been carried out. Analysis of protein-DNA interactions and the structure of DNA in specific and nonspecific complexes shows that amino acid residues use the sequence-dependent shape of DNA to interact. The binding free energies of all four complexes were analysed to define the role of the amino acid residue at the 50th position in terms of binding strength, considering the effect of variation in DNA on the stability of protein-DNA complexes. The order of stability of protein-DNA complexes shows that specific complexes are more stable than nonspecific ones. Decomposition analysis shows that N-terminal amino acid residues contribute maximally to the binding free energy of protein-DNA complexes. Among specific protein-DNA complexes, K50 contributes more than Q50 towards binding free energy in the respective complexes. The sequence dependence of the local conformation of DNA enables Q50/Q50K to make hydrogen bonds with nucleotide(s) of DNA. The changes in the amino acid sequence of the protein are accommodated and stabilized around the TAAT core region of DNA having variation in nucleotides.

  16. Numerical Analysis of Residual Stress and Distortion Use Finite Element Method on Inner Bottom Construction of Geomarin IV Survey Ship with Welding Sequence Variations

    NASA Astrophysics Data System (ADS)

    Syahroni, N.; Hartono, A. B. W.; Murtedjo, M.

    2018-03-01

    In the ship fabrication industry, welding is the most critical stage. If the quality of welding in ship fabrication is poor, it will affect the strength and overall appearance of the structure. One of the factors that affect the quality of welding is residual stress and distortion. In this research, welding simulation is performed on the inner bottom construction of the Geomarin IV Survey Ship using shell elements with variations in welding sequence. The welding simulations produced a peak temperature of 2490 K at variation 4, while the lowest peak temperature, 2339 K, was produced by variation 2. The welding simulation was followed by simulation of residual stresses and distortion. The smallest maximum tensile residual stress found in the inner bottom construction is 375.23 MPa, and the maximum tensile pressure is -20.18 MPa. The residual stress is obtained from variation 3. The distortion occurring in the inner bottom construction for X=720 mm is 4.2 mm and for X=-720 mm, the distortion is 4.92 mm. The distortion is obtained from variation 3. Near the welding area, the distortion value reaches its minimum point. This is because the stiffeners in the form of frames serve as anchoring.

  17. SNBRFinder: A Sequence-Based Hybrid Algorithm for Enhanced Prediction of Nucleic Acid-Binding Residues.

    PubMed

    Yang, Xiaoxia; Wang, Jia; Sun, Jun; Liu, Rong

    2015-01-01

    Protein-nucleic acid interactions are central to various fundamental biological processes. Automated methods capable of reliably identifying DNA- and RNA-binding residues in protein sequence are assuming ever-increasing importance. The majority of current algorithms rely on feature-based prediction, but their accuracy remains to be further improved. Here we propose a sequence-based hybrid algorithm SNBRFinder (Sequence-based Nucleic acid-Binding Residue Finder) by merging a feature predictor SNBRFinderF and a template predictor SNBRFinderT. SNBRFinderF was established using the support vector machine whose inputs include sequence profile and other complementary sequence descriptors, while SNBRFinderT was implemented with the sequence alignment algorithm based on profile hidden Markov models to capture the weakly homologous template of query sequence. Experimental results show that SNBRFinderF was clearly superior to the commonly used sequence profile-based predictor and SNBRFinderT can achieve comparable performance to the structure-based template methods. Leveraging the complementary relationship between these two predictors, SNBRFinder reasonably improved the performance of both DNA- and RNA-binding residue predictions. More importantly, the sequence-based hybrid prediction reached competitive performance relative to our previous structure-based counterpart. Our extensive and stringent comparisons show that SNBRFinder has obvious advantages over the existing sequence-based prediction algorithms. The value of our algorithm is highlighted by establishing an easy-to-use web server that is freely accessible at http://ibi.hzau.edu.cn/SNBRFinder.

  18. Isolation and characterization of a new bacteriocin, termed enterocin M, produced by environmental isolate Enterococcus faecium AL41.

    PubMed

    Mareková, Mária; Lauková, Andrea; Skaugen, Morten; Nes, Ingolf

    2007-08-01

    The new bacteriocin, termed enterocin M, produced by Enterococcus faecium AL 41 showed a wide spectrum of inhibitory activity against the indicator organisms from different sources. It was purified by (NH4)2SO4 precipitation, cation-exchange chromatography and reverse phase chromatography (FPLC). The purified peptide was sequenced by N-terminal amino acid Edman degradation and a mass spectrometry analysis was performed. By combining the data obtained from the amino acid sequence (39 N-terminal amino acid residues were determined) and the molecular weight (determined to be 4628 Da), it was concluded that the purified enterocin M is a new bacteriocin, which is very similar to enterocin P. However, its molecular weight is different from enterocin P (4701.25). Of the first 39 N-terminal residues of enterocin M, valine was found in position 20 and a lysine in position 35, while enterocin P has tryptophan residues in these positions.

  19. An evolutionary analysis identifies a conserved pentapeptide stretch containing the two essential lysine residues for rice L-myo-inositol 1-phosphate synthase catalytic activity

    PubMed Central

    Basak, Papri; Maitra-Majee, Susmita; Das, Jayanta Kumar; Mukherjee, Abhishek; Ghosh Dastidar, Shubhra; Pal Choudhury, Pabitra

    2017-01-01

    A molecular evolutionary analysis of a well conserved protein helps to determine the essential amino acids in the core catalytic region. Based on the chemical properties of amino acid residues, phylogenetic analysis of a total of 172 homologous sequences of a highly conserved enzyme, L-myo-inositol 1-phosphate synthase or MIPS from evolutionarily diverse organisms was performed. This study revealed the presence of six phylogenetically conserved blocks, out of which four embrace the catalytic core of the functional protein. Further, specific amino acid modifications targeting the lysine residues, known to be important for MIPS catalysis, were performed at the catalytic site of a MIPS from monocotyledonous model plant, Oryza sativa (OsMIPS1). Following this study, OsMIPS mutants with deletion or replacement of lysine residues in the conserved blocks were made. Based on the enzyme kinetics performed on the deletion/replacement mutants, phylogenetic and structural comparison with the already established crystal structures from non-plant sources, an evolutionarily conserved peptide stretch was identified at the active pocket which contains the two most important lysine residues essential for catalytic activity. PMID:28950028

  20. Analysis of inter-residue contacts reveals folding stabilizers in P-loops of potassium, sodium, and TRPV channels.

    PubMed

    Korkosh, V S; Zhorov, B S; Tikhonov, D B

    2016-05-01

    The family of P-loop channels includes potassium, sodium, calcium, cyclic nucleotide-gated and TRPV channels, as well as ionotropic glutamate receptors. Despite vastly different physiological and pharmacological properties, the channels have structurally conserved folding of the pore domain. Furthermore, crystallographic data demonstrate surprisingly similar mutual disposition of transmembrane and membrane-diving helices. To understand determinants of this conservation, here we have compared available high-resolution structures of sodium, potassium, and TRPV1 channels. We found that some residues, which are in matching positions of the sequence alignment, occur in different positions in the 3D alignment. Surprisingly, we found 3D mismatches in well-packed P-helices. Analysis of energetics of individual residues in Monte Carlo minimized structures revealed cyclic patterns of energetically favorable inter- and intra-subunit contacts of P-helices with S6 helices. The inter-subunit contacts are rather conserved in all the channels, whereas the intra-subunit contacts are specific for particular types of the channels. Our results suggest that these residue-residue contacts contribute to the folding stabilization. Analysis of such contacts is important for structural and phylogenetic studies of homologous proteins.

  1. Accounting for epistatic interactions improves the functional analysis of protein structures.

    PubMed

    Wilkins, Angela D; Venner, Eric; Marciano, David C; Erdin, Serkan; Atri, Benu; Lua, Rhonald C; Lichtarge, Olivier

    2013-11-01

    The constraints under which sequence, structure and function coevolve are not fully understood. Bringing this mutual relationship to light can reveal the molecular basis of binding, catalysis and allostery, thereby identifying function and rationally guiding protein redesign. Underlying these relationships are the epistatic interactions that occur when the consequences of a mutation to a protein are determined by the genetic background in which it occurs. Based on prior data, we hypothesize that epistatic forces operate most strongly between residues nearby in the structure, resulting in smooth evolutionary importance across the structure. We find that when residue scores of evolutionary importance are distributed smoothly between nearby residues, functional site prediction accuracy improves. Accordingly, we designed a novel measure of evolutionary importance that focuses on the interaction between pairs of structurally neighboring residues. This measure that we term pair-interaction Evolutionary Trace yields greater functional site overlap and better structure-based proteome-wide functional predictions. Our data show that the structural smoothness of evolutionary importance is a fundamental feature of the coevolution of sequence, structure and function. Mutations operate on individual residues, but selective pressure depends in part on the extent to which a mutation perturbs interactions with neighboring residues. In practice, this principle led us to redefine the importance of a residue in terms of the importance of its epistatic interactions with neighbors, yielding better annotation of functional residues, motivating experimental validation of a novel functional site in LexA and refining protein function prediction. Contact: lichtarge@bcm.edu. Supplementary data are available at Bioinformatics online.

  2. Accounting for epistatic interactions improves the functional analysis of protein structures

    PubMed Central

    Wilkins, Angela D.; Venner, Eric; Marciano, David C.; Erdin, Serkan; Atri, Benu; Lua, Rhonald C.; Lichtarge, Olivier

    2013-01-01

    Motivation: The constraints under which sequence, structure and function coevolve are not fully understood. Bringing this mutual relationship to light can reveal the molecular basis of binding, catalysis and allostery, thereby identifying function and rationally guiding protein redesign. Underlying these relationships are the epistatic interactions that occur when the consequences of a mutation to a protein are determined by the genetic background in which it occurs. Based on prior data, we hypothesize that epistatic forces operate most strongly between residues nearby in the structure, resulting in smooth evolutionary importance across the structure. Methods and Results: We find that when residue scores of evolutionary importance are distributed smoothly between nearby residues, functional site prediction accuracy improves. Accordingly, we designed a novel measure of evolutionary importance that focuses on the interaction between pairs of structurally neighboring residues. This measure that we term pair-interaction Evolutionary Trace yields greater functional site overlap and better structure-based proteome-wide functional predictions. Conclusions: Our data show that the structural smoothness of evolutionary importance is a fundamental feature of the coevolution of sequence, structure and function. Mutations operate on individual residues, but selective pressure depends in part on the extent to which a mutation perturbs interactions with neighboring residues. In practice, this principle led us to redefine the importance of a residue in terms of the importance of its epistatic interactions with neighbors, yielding better annotation of functional residues, motivating experimental validation of a novel functional site in LexA and refining protein function prediction. Contact: lichtarge@bcm.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:24021383

  3. Purification and sequencing of the active site tryptic peptide from penicillin-binding protein 1b of Escherichia coli

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Nicholas, R.A.; Suzuki, H.; Hirota, Y.

    This paper reports the sequence of the active site peptide of penicillin-binding protein 1b from Escherichia coli. Purified penicillin-binding protein 1b was labeled with [14C]penicillin G, digested with trypsin, and partially purified by gel filtration. Upon further purification by high-pressure liquid chromatography, two radioactive peaks were observed, and the major peak, representing over 75% of the applied radioactivity, was submitted to amino acid analysis and sequencing. The sequence Ser-Ile-Gly-Ser-Leu-Ala-Lys was obtained. The active site nucleophile was identified by digesting the purified peptide with aminopeptidase M and separating the radioactive products on high-pressure liquid chromatography. Amino acid analysis confirmed that the serine residue in the middle of the sequence was covalently bonded to the [14C]penicilloyl moiety. A comparison of this sequence to active site sequences of other penicillin-binding proteins and beta-lactamases is presented.

  4. Engineering diverse changes in beta-turn propensities in the N-terminal beta-hairpin of ubiquitin reveals significant effects on stability and kinetics but a robust folding transition state.

    PubMed

    Simpson, Emma R; Meldrum, Jill K; Searle, Mark S

    2006-04-04

    Using the N-terminal 17-residue beta-hairpin of ubiquitin as a "host" for mutational studies, we have investigated the influence of the beta-turn sequence on protein stability and folding kinetics by replacing the native G-bulged turn (TLTGK) with more flexible analogues (TG3K and TG5K) and a series of four-residue type I' beta-turn sequences, commonly found in beta-hairpins. Although a statistical analysis of type I' turns demonstrates residue preferences at specific sites, the frequency of occurrence appears to only broadly correlate with experimentally determined protein stabilities. The subsequent engineering of context-dependent non-native tertiary contacts involving turn residues is shown to produce large changes in stability. Relatively few point mutations have been described that probe secondary structure formation in ubiquitin in a manner that is independent of tertiary contacts. To this end, we have used the more rigorous rate-equilibrium free energy relationship (Leffler analysis), rather than the two-point phi value analysis, to show for a family of engineered beta-turn mutants that stability (range of approximately 20 kJ/mol) and folding kinetics (190-fold variation in refolding rate) are linearly correlated (alpha(f) = 0.74 +/- 0.08). The data are consistent with a transition state that is robust with regard to a wide range of statistically favored and disfavored beta-turn mutations and implicate a loosely assembled beta-hairpin as a key template in transition state stabilization with the beta-turn playing a central role.
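
    The Leffler coefficient quoted above (alpha(f) = 0.74) is the slope of a linear free-energy relationship between changes in activation free energy and changes in equilibrium stability across a mutant series. The sketch below shows one way such a slope could be estimated by ordinary least squares. The function name and the numbers are illustrative assumptions, not data or code from the study.

```python
def leffler_alpha(ddg_eq, ddg_act):
    """Least-squares slope of activation vs. equilibrium free-energy changes.

    ddg_eq  : changes in equilibrium folding free energy (e.g. kJ/mol)
    ddg_act : matching changes in activation free energy (same units)
    """
    n = len(ddg_eq)
    if n < 2 or n != len(ddg_act):
        raise ValueError("need two equal-length series with at least 2 points")
    mean_x = sum(ddg_eq) / n
    mean_y = sum(ddg_act) / n
    sxx = sum((x - mean_x) ** 2 for x in ddg_eq)
    if sxx == 0:
        raise ValueError("equilibrium values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(ddg_eq, ddg_act))
    return sxy / sxx


# Illustrative values only (kJ/mol); these are NOT measurements from the paper.
example_eq = [0.0, 5.0, 10.0, 15.0, 20.0]
example_act = [0.0, 3.5, 7.5, 11.0, 15.0]
example_alpha = leffler_alpha(example_eq, example_act)
```

    A slope near 1 would suggest a transition state that responds to the mutations almost as strongly as the folded state does, and a slope near 0 one that barely responds.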

  5. Molecular cloning and characterization of RGA1 encoding a G protein alpha subunit from rice (Oryza sativa L. IR-36).

    PubMed

    Seo, H S; Kim, H Y; Jeong, J Y; Lee, S Y; Cho, M J; Bahk, J D

    1995-03-01

    A cDNA clone, RGA1, was isolated by using a GPA1 cDNA clone of Arabidopsis thaliana G protein alpha subunit as a probe from a rice (Oryza sativa L. IR-36) seedling cDNA library from roots and leaves. Sequence analysis of genomic clone reveals that the RGA1 gene has 14 exons and 13 introns, and encodes a polypeptide of 380 amino acid residues with a calculated molecular weight of 44.5 kDa. The encoded protein exhibits a considerable degree of amino acid sequence similarity to all the other known G protein alpha subunits. A putative TATA sequence (ATATGA), a potential CAAT box sequence (AGCAATAC), and a cis-acting element, CCACGTGG (ABRE), known to be involved in ABA induction are found in the promoter region. The RGA1 protein contains all the consensus regions of G protein alpha subunits except the cysteine residue near the C-terminus for ADP-ribosylation by pertussis toxin. The RGA1 polypeptide expressed in Escherichia coli was, however, ADP-ribosylated by 10 microM [adenylate-32P] NAD and activated cholera toxin. Southern analysis indicates that there are no other genes similar to the RGA1 gene in the rice genome. Northern analysis reveals that the RGA1 mRNA is 1.85 kb long and expressed in vegetative tissues, including leaves and roots, and that its expression is regulated by light.

  6. Proteolytic interconversion and N-terminal sequences of the Citrobacter diversus major beta-lactamases.

    PubMed Central

    Franceschini, N; Amicosante, G; Perilli, M; Maccarrone, M; Oratore, A; van Beeumen, J; Frère, J M

    1991-01-01

    The N-terminal sequences of the two major beta-lactamases produced by Citrobacter diversus differed only by the absence of the first residue in form II and the loss of five amino acid residues at the C-terminal end. Limited proteolysis of the homogeneous form I protein yielded a variety of enzymatically active products. In the major product obtained after the action of papain, the first three N-terminal residues of form I had been cleaved, whereas at the C-terminal end the treated enzyme lacked five residues. However, this cannot explain the different behaviours of form I, form II and papain digestion product upon chromatofocusing. Form I, which was sequenced up to position 56, exhibited a very high degree of similarity with a Klebsiella oxytoca beta-lactamase. The determined sequence, which contained the active serine residue, demonstrated that the chromosome-encoded beta-lactamase of Citrobacter diversus belongs to class A. PMID:2039443

  7. Amino acid sequence of a trypsin inhibitor from a Spirometra (Spirometra erinaceieuropaei).

    PubMed

    Sanda, A; Uchida, A; Itagaki, T; Kobayashi, H; Inokuchi, N; Koyama, T; Iwama, M; Ohgi, K; Irie, M

    2001-12-01

    A trypsin inhibitor that is highly homologous with bovine pancreatic trypsin inhibitor (BPTI) was co-purified along with RNase from Spirometra (Spirometra erinaceieuropaei). The amino acid sequence of this inhibitor (SETI) and the nucleotide sequence of the cDNA encoding this protein were determined by protein chemistry and gene technology. SETI contains 68 amino acid residues and has a molecular mass of 7,798 Da. SETI has 31 amino acid residues that are identical with BPTI's sequence, including 6 half-cystine and 5 aromatic amino acid residues. The active site Lys residue in BPTI is replaced by an Arg residue in SETI. SETI is an effective inhibitor of trypsin and moderately inhibits α-chymotrypsin, but inhibits elastase and subtilisin less effectively. SETI was expressed by E. coli containing a PelB vector carrying the SETI encoding cDNA; an expression yield of 0.68 mg/l was obtained. The phylogenetic relationship of SETI and the other BPTI-like trypsin inhibitors was analyzed using maximum likelihood inference methods.

  8. From sequence analysis of three novel ascorbate peroxidases from Arabidopsis thaliana to structure, function and evolution of seven types of ascorbate peroxidase.

    PubMed Central

    Jespersen, H M; Kjaersgård, I V; Ostergaard, L; Welinder, K G

    1997-01-01

    Ascorbate peroxidases are haem proteins that efficiently scavenge H2O2 in the cytosol and chloroplasts of plants. Database analyses retrieved 52 expressed sequence tags coding for Arabidopsis thaliana ascorbate peroxidases. Complete sequencing of non-redundant clones revealed three novel types in addition to the two cytosol types described previously in Arabidopsis. Analysis of sequence data available for all plant ascorbate peroxidases resulted in the following classification: two types of cytosol soluble ascorbate peroxidase designated cs1 and cs2; three types of cytosol membrane-bound ascorbate peroxidase, namely cm1, bound to microbodies via a C-terminal membrane-spanning segment, and cm2 and cm3, both of unknown location; two types of chloroplast ascorbate peroxidase with N-terminal transit sequences, the stromal ascorbate peroxidase (chs), and the thylakoid-bound ascorbate peroxidase showing a C-terminal transmembrane segment and designated cht. Further comparison of the patterns of conserved residues and the crystal structure of pea ascorbate peroxidase showed that active site residues are conserved, and three peptide segments implicated in interaction with reducing substrate are similar, excepting cm2 and cm3 types. A change of Phe-175 in cytosol types to Trp-175 in chloroplast types might explain the greater ascorbate specificity of chloroplast compared with cytosol ascorbate peroxidases. Residues involved in homodimeric subunit interaction are conserved only in cs1, cs2 and cm1 types. The proximal cation (K+)-binding site observed in pea ascorbate peroxidase seems to be conserved. In addition, cm1, cm2, cm3, chs and cht ascorbate peroxidases contain Asp-43, Asn-57 and Ser-59, indicative of a distal monovalent cation site. The data support the hypothesis that present-day peroxidases evolved by an early gene duplication event. PMID:9291097

  9. HomPPI: a class of sequence homology based protein-protein interface prediction methods

    PubMed Central

    2011-01-01

    Background Although homology-based methods are among the most widely used methods for predicting the structure and function of proteins, the question as to whether interface sequence conservation can be effectively exploited in predicting protein-protein interfaces has been a subject of debate. Results We studied more than 300,000 pair-wise alignments of protein sequences from structurally characterized protein complexes, including both obligate and transient complexes. We identified sequence similarity criteria required for accurate homology-based inference of interface residues in a query protein sequence. Based on these analyses, we developed HomPPI, a class of sequence homology-based methods for predicting protein-protein interface residues. We present two variants of HomPPI: (i) NPS-HomPPI (Non partner-specific HomPPI), which can be used to predict interface residues of a query protein in the absence of knowledge of the interaction partner; and (ii) PS-HomPPI (Partner-specific HomPPI), which can be used to predict the interface residues of a query protein with a specific target protein. Our experiments on a benchmark dataset of obligate homodimeric complexes show that NPS-HomPPI can reliably predict protein-protein interface residues in a given protein, with an average correlation coefficient (CC) of 0.76, sensitivity of 0.83, and specificity of 0.78, when sequence homologs of the query protein can be reliably identified. NPS-HomPPI also reliably predicts the interface residues of intrinsically disordered proteins. Our experiments suggest that NPS-HomPPI is competitive with several state-of-the-art interface prediction servers including those that exploit the structure of the query proteins. 
    The partner-specific classifier, PS-HomPPI, can, on a large dataset of transient complexes, predict the interface residues of a query protein with a specific target, with a CC of 0.65, sensitivity of 0.69, and specificity of 0.70, when homologs of both the query and the target can be reliably identified. The HomPPI web server is available at http://homppi.cs.iastate.edu/. Conclusions Sequence homology-based methods offer a class of computationally efficient and reliable approaches for predicting the protein-protein interface residues that participate in either obligate or transient interactions. For query proteins involved in transient interactions, the reliability of interface residue prediction can be improved by exploiting knowledge of putative interaction partners. PMID:21682895
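
    The correlation coefficient, sensitivity, and specificity reported above are per-residue classification metrics. The sketch below computes them from confusion-matrix counts using textbook definitions (Matthews correlation, true-positive rate, true-negative rate). This is an assumption: the abstract does not state its formulas, and some interface-prediction papers define specificity differently (for example as precision). The function name is hypothetical.

```python
import math


def interface_metrics(tp, fp, tn, fn):
    """Return (correlation coefficient, sensitivity, specificity).

    tp/fp/tn/fn are counts of residues classified as interface (positive)
    or non-interface (negative), compared against the known interface.
    Textbook definitions are assumed; a given paper may use others.
    """
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    cc = (tp * tn - fp * fn) / denom if denom else 0.0
    return cc, sensitivity, specificity


# Illustrative counts only, not taken from the HomPPI benchmark.
example_cc, example_sens, example_spec = interface_metrics(50, 10, 30, 10)
```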

  10. Building toy models of proteins using coevolutionary information

    NASA Astrophysics Data System (ADS)

    Cheng, Ryan; Raghunathan, Mohit; Onuchic, Jose

    2015-03-01

    Recent developments in global statistical methodologies have advanced the analysis of large collections of protein sequences for coevolutionary information. Coevolution between amino acids in a protein arises from compensatory mutations that are needed to maintain the stability or function of a protein over the course of evolution. This gives rise to quantifiable correlations between amino acid positions within the multiple sequence alignment of a protein family. Here, we use Direct Coupling Analysis (DCA) to infer a Potts model Hamiltonian governing the correlated mutations in a protein family to obtain the sequence-dependent interaction energies of a toy protein model. We demonstrate that this methodology predicts residue-residue interaction energies that are consistent with experimental mutational changes in protein stabilities as well as other computational methodologies. Furthermore, we demonstrate with several examples that DCA could be used to construct a structure-based model that quantitatively agrees with experimental data on folding mechanisms. This work serves as a potential framework for generating models of proteins that are enriched by evolutionary data that can potentially be used to engineer key functional motions and interactions in protein systems. This research has been supported by the NSF INSPIRE award MCB-1241332 and by the CTBP sponsored by the NSF (Grant PHY-1427654).
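
    A Potts model of the kind inferred by DCA assigns each sequence an energy built from per-position fields and pairwise couplings. The sketch below evaluates such an energy for a toy sequence, assuming the common sign convention H = -sum_i h_i(a_i) - sum_{i<j} J_ij(a_i, a_j). The data layout, the function name, and all parameter values are hypothetical and are not inferred from any alignment.

```python
def potts_energy(seq, fields, couplings):
    """Evaluate H(seq) = -sum_i h_i(a_i) - sum_{i<j} J_ij(a_i, a_j).

    fields    : list of dicts, fields[i][residue] -> h_i(residue)
    couplings : dict keyed by (i, j) with i < j, each value a dict
                mapping (residue_i, residue_j) -> J_ij
    Missing entries are treated as zero.
    """
    energy = 0.0
    for i, residue in enumerate(seq):
        energy -= fields[i].get(residue, 0.0)
    for i in range(len(seq)):
        for j in range(i + 1, len(seq)):
            pair_table = couplings.get((i, j), {})
            energy -= pair_table.get((seq[i], seq[j]), 0.0)
    return energy


# Toy parameters, invented for illustration only.
toy_fields = [{"A": 0.5}, {"K": 0.2}, {"E": 0.1}]
toy_couplings = {(1, 2): {("K", "E"): 1.0}}
energy_native = potts_energy("AKE", toy_fields, toy_couplings)
energy_mutant = potts_energy("AKK", toy_fields, toy_couplings)
```

    Under this convention a lower energy corresponds to a more probable sequence, so the difference between the two values gives a rough score for the mutation at the third position.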

  11. Structure-Based Sequence Alignment of the Transmembrane Domains of All Human GPCRs: Phylogenetic, Structural and Functional Implications

    PubMed Central

    Cvicek, Vaclav; Goddard, William A.; Abrol, Ravinder

    2016-01-01

    The understanding of G-protein coupled receptors (GPCRs) is undergoing a revolution due to increased information about their signaling and the experimental determination of structures for more than 25 receptors. The availability of at least one receptor structure for each of the GPCR classes, well separated in sequence space, enables an integrated superfamily-wide analysis to identify signatures involving the role of conserved residues, conserved contacts, and downstream signaling in the context of receptor structures. In this study, we align the transmembrane (TM) domains of all experimental GPCR structures to maximize the conserved inter-helical contacts. The resulting superfamily-wide GpcR Sequence-Structure (GRoSS) alignment of the TM domains for all human GPCR sequences is sufficient to generate a phylogenetic tree that correctly distinguishes all different GPCR classes, suggesting that the class-level differences in the GPCR superfamily are encoded at least partly in the TM domains. The inter-helical contacts conserved across all GPCR classes describe the evolutionarily conserved GPCR structural fold. The corresponding structural alignment of the inactive and active conformations, available for a few GPCRs, identifies activation hot-spot residues in the TM domains that get rewired upon activation. Many GPCR mutations, known to alter receptor signaling and cause disease, are located at these conserved contact and activation hot-spot residue positions. The GRoSS alignment places the chemosensory receptor subfamilies for bitter taste (TAS2R) and pheromones (Vomeronasal, VN1R) in the rhodopsin family, known to contain the chemosensory olfactory receptor subfamily. The GRoSS alignment also enables the quantification of the structural variability in the TM regions of experimental structures, useful for homology modeling and structure prediction of receptors. 
    Furthermore, this alignment identifies structurally and functionally important residues in all human GPCRs. These residues can be used to make testable hypotheses about the structural basis of receptor function and about the molecular basis of disease-associated single nucleotide polymorphisms. PMID:27028541

  12. Overcoming Sequence Misalignments with Weighted Structural Superposition

    PubMed Central

    Khazanov, Nickolay A.; Damm-Ganamet, Kelly L.; Quang, Daniel X.; Carlson, Heather A.

    2012-01-01

    An appropriate structural superposition identifies similarities and differences between homologous proteins that are not evident from sequence alignments alone. We have coupled our Gaussian-weighted RMSD (wRMSD) tool with a sequence aligner and seed extension (SE) algorithm to create a robust technique for overlaying structures and aligning sequences of homologous proteins (HwRMSD). HwRMSD overcomes errors in the initial sequence alignment that would normally propagate into a standard RMSD overlay. SE can generate a corrected sequence alignment from the improved structural superposition obtained by wRMSD. HwRMSD’s robust performance and its superiority over standard RMSD are demonstrated over a range of homologous proteins. Its better overlay results in corrected sequence alignments with good agreement to HOMSTRAD. Finally, HwRMSD is compared to established structural alignment methods: FATCAT, SSM, CE, and Dalilite. Most methods are comparable at placing residue pairs within 2 Å, but HwRMSD places many more residue pairs within 1 Å, providing a clear advantage. Such high accuracy is essential in drug design, where small distances can have a large impact on computational predictions. This level of accuracy is also needed to correct sequence alignments in an automated fashion, especially for omics-scale analysis. HwRMSD can align homologs with low sequence identity and large conformational differences, cases where both sequence-based and structural-based methods may fail. The HwRMSD pipeline overcomes the dependency of structural overlays on initial sequence pairing and removes the need to determine the best sequence-alignment method, substitution matrix, and gap parameters for each unique pair of homologs. PMID:22733542
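
    The idea behind a Gaussian-weighted RMSD is that residue pairs that already lie close together dominate the score, while distant pairs (for example, misaligned or flexible regions) are down-weighted. The sketch below evaluates such a weighted deviation for two coordinate sets that are assumed to be already paired and superposed. It does not reproduce the iterative superposition or the seed-extension step of HwRMSD, and the weight form exp(-d^2 / scale) and the default scale are assumptions made for illustration.

```python
import math


def gaussian_weighted_rmsd(coords_a, coords_b, scale=5.0):
    """Weighted RMSD with weights w_i = exp(-d_i^2 / scale).

    coords_a, coords_b : equal-length lists of (x, y, z) tuples, paired
                         by index and assumed to be already superposed.
    scale              : hypothetical width parameter (squared length units).
    """
    if not coords_a or len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must be non-empty and equal length")
    sq_dists = [
        sum((p - q) ** 2 for p, q in zip(a, b))
        for a, b in zip(coords_a, coords_b)
    ]
    weights = [math.exp(-d2 / scale) for d2 in sq_dists]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("all weights underflowed; structures are too far apart")
    return math.sqrt(sum(w * d2 for w, d2 in zip(weights, sq_dists)) / total)


# Toy coordinates: one pair 1 unit apart, one pair 10 units apart.
toy_a = [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)]
toy_b = [(1.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
toy_wrmsd = gaussian_weighted_rmsd(toy_a, toy_b)
```

    In the toy case the distant pair receives a negligible weight, so the weighted value stays close to the deviation of the well-matched pair, whereas an unweighted RMSD would be dominated by the outlier.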

  13. GENETIC CHARACTERIZATION OF CANINE PARVOVIRUS IN SYMPATRIC FREE-RANGING WILD CARNIVORES IN PORTUGAL.

    PubMed

    Miranda, Carla; Santos, Nuno; Parrish, Colin; Thompson, Gertrude

    2017-10-01

    Since its emergence in the 1970s, canine parvovirus (CPV) has been reported in domestic and nondomestic carnivores worldwide with severe implications on their health and survival. Here, we aim to better understand CPV circulation in multihost-pathogen systems by characterizing CPV DNA or viruses in 227 free-ranging wild carnivores of 12 species from Portugal. Samples collected during 1995-2011 were analyzed by PCR and sequence analysis. The canine parvovirus DNA was detected in 4 (2%) animals of two species, namely in wolves (Canis lupus; 3/63, 5%, 95% confidence interval=1.6-3.15) and in a stone marten (Martes foina; 1/36, 3%, 95% confidence interval=0.5-14.2). Viruses in two wolves had VP2 residue 426 as aspartic acid (so-called CPV-2b) and the third had VP2 residue 426 as asparagine (CPV-2a), while the virus in the stone marten uniquely had VP2 residue 426 as glutamic acid (CPV-2c). The comparative analysis of the full-length VP2 gene of our isolates showed other nonsynonymous mutations. The phylogenetic analysis demonstrated that the sequences from wolves clustered together, showing a close relationship with European domestic dogs (Canis lupus familiaris) and wolf strains while the viral sequence from the stone marten grouped with other viruses containing the glutamic acid VP2 426 along with raccoon (Procyon lotor), bobcat (Lynx rufus), and domestic dog strains. This study confirmed that wild carnivores in Portugal are infected by CPV variants, strongly suggesting viral transmission between the wild and domestic populations and suggesting a need for a better understanding of the epidemiology of the disease and its management in wild populations.
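
    Prevalence estimates from small samples, such as 1 positive out of 36 animals, are usually reported with a confidence interval. The abstract does not say which interval method was used. The sketch below uses the Wilson score interval as one common choice for small counts, and the function name is hypothetical.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion.

    successes : number of positive samples
    n         : number of samples tested
    z         : normal quantile (1.96 for an approximate 95% interval)
    Returns (lower, upper) as fractions between 0 and 1.
    """
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    p = successes / n
    z2 = z * z
    denom = 1.0 + z2 / n
    center = (p + z2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z2 / (4 * n * n)) / denom
    return max(0.0, center - half), min(1.0, center + half)


# Counts taken from the abstract: 1 positive stone marten out of 36.
marten_low, marten_high = wilson_interval(1, 36)
```

    For 1/36 this gives roughly 0.5% to 14.2%, in line with the interval quoted for the stone marten.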

  14. Characterization of milled solid residue from cypress liquefaction in sub- and super ethanol.

    PubMed

    Liu, Hua-Min; Liu, Yu-Lan

    2014-01-01

    Cypress liquefaction in sub- and super ethanol was carried out in an autoclave at various temperatures. Milled solid residue (MSR) was isolated from the solid residue remaining from the liquefaction process, and its chemical characteristics were compared with those of milled wood lignin (MWL) of cypress by sugar analysis, elemental analysis, FT-IR analysis, gel permeation chromatography, and NMR analysis. Results showed that there were two reactions (de-polymerization and re-polymerization) during cypress liquefaction in sub- and super ethanol and that re-polymerization was the main reaction at 220-260°C. The stability of the lignin side-chain linkages in cypress during liquefaction in ethanol could be ranked as follows: β-5>β-β'>β-O-4'. The MSR was mainly derived from the decomposition and re-polymerization of lignin. This study suggests that characterization of MSR provides a promising method to investigate the mechanisms of cypress liquefaction in ethanol. Crown Copyright © 2013. Published by Elsevier Ltd. All rights reserved.

  15. Albumin Redhill (-1 Arg, 320 Ala→Thr): a glycoprotein variant of human serum albumin whose precursor has an aberrant signal peptidase cleavage site.

    PubMed

    Brennan, S O; Myles, T; Peach, R J; Donaldson, D; George, P M

    1990-01-01

    Albumin Redhill is an electrophoretically slow genetic variant of human serum albumin that does not bind 63Ni2+ and has a molecular mass 2.5 kDa higher than normal albumin. Its inability to bind Ni2+ was explained by the finding of an additional residue of Arg at position -1. This did not explain the molecular basis of the genetic variation (since proalbumin contains adjacent Arg residues at -1 and -2) or the increase in apparent molecular mass. Fractionation of tryptic digests on concanavalin A-Sepharose followed by peptide mapping of the bound and unbound fractions and sequence analysis of the glycopeptides identified a mutation of 320 Ala→Thr. This introduces an Asn-Tyr-Thr oligosaccharide attachment sequence centered on Asn-318 and explains the increase in molecular mass. This, however, did not satisfactorily explain the presence of the additional Arg residue at position -1. DNA sequencing of polymerase chain reaction-amplified genomic DNA encoding the prepro sequence of albumin indicated an additional mutation of -2 Arg→Cys. This introduces a prepro sequence, Met-Lys-Trp-Val-Thr-Phe-Ile-Ser-Leu-Leu-Phe-Leu-Phe-Ser-Ser-Ala-Tyr-Ser-Arg-Gly-Val-Phe-Cys-Arg (cf. -Tyr-Ser-Arg-Gly-Val-Phe-Arg-Arg- in normal human pre-proalbumin). We propose that the new Phe-Cys-Arg sequence in the propeptide is an aberrant signal peptidase cleavage site and that the signal peptidase cleaves the propeptide of albumin Redhill in the lumen of the endoplasmic reticulum before it reaches the Golgi vesicles, the site of the diarginyl-specific proalbumin convertase.

  16. Identification and characterization of a matrix protein (PPP-10) in the periostracum of the pearl oyster, Pinctada fucata.

    PubMed

    Nakayama, Seiji; Suzuki, Michio; Endo, Hirotoshi; Iimura, Kurin; Kinoshita, Shigeharu; Watabe, Shugo; Kogure, Toshihiro; Nagasawa, Hiromichi

    2013-01-01

    The periostracum is a layered structure that is formed as a mollusk shell grows. The shell is covered by the periostracum, which consists of organic matrices that prevent decalcification of the shell. In the present study, we discovered the presence of chitin in the periostracum and identified a novel matrix protein, Pinctada fucata periostracum protein named PPP-10. It was purified from the sodium dodecyl sulfate/dithiothreitol-soluble fraction of the periostracum of the Japanese pearl oyster, P. fucata. The deduced amino acid sequence was determined by a combination of amino acid sequence analysis and cDNA cloning. The open reading frame encoded a precursor protein of 112 amino acid residues including a 21-residue signal peptide. The 91 residues following the signal peptide contained abundant Cys and Tyr residues. PPP-10 was expressed on the outer side of the outer fold in the mantle, indicating that PPP-10 was present in the second or third layer of the periostracum. We also determined that the recombinant PPP-10 had chitin-binding activity and could incorporate chitin into the scaffolds of the periostracum. These results shed light on the early steps in mollusk shell formation.

  17. Identification and characterization of a matrix protein (PPP-10) in the periostracum of the pearl oyster, Pinctada fucata

    PubMed Central

    Nakayama, Seiji; Suzuki, Michio; Endo, Hirotoshi; Iimura, Kurin; Kinoshita, Shigeharu; Watabe, Shugo; Kogure, Toshihiro; Nagasawa, Hiromichi

    2013-01-01

    The periostracum is a layered structure that is formed as a mollusk shell grows. The shell is covered by the periostracum, which consists of organic matrices that prevent decalcification of the shell. In the present study, we discovered the presence of chitin in the periostracum and identified a novel matrix protein, Pinctada fucata periostracum protein named PPP-10. It was purified from the sodium dodecyl sulfate/dithiothreitol-soluble fraction of the periostracum of the Japanese pearl oyster, P. fucata. The deduced amino acid sequence was determined by a combination of amino acid sequence analysis and cDNA cloning. The open reading frame encoded a precursor protein of 112 amino acid residues including a 21-residue signal peptide. The 91 residues following the signal peptide contained abundant Cys and Tyr residues. PPP-10 was expressed on the outer side of the outer fold in the mantle, indicating that PPP-10 was present in the second or third layer of the periostracum. We also determined that the recombinant PPP-10 had chitin-binding activity and could incorporate chitin into the scaffolds of the periostracum. These results shed light on the early steps in mollusk shell formation. PMID:24251105

  18. Analysis of evolutionary conservation patterns and their influence on identifying protein functional sites.

    PubMed

    Fang, Chun; Noguchi, Tamotsu; Yamana, Hayato

    2014-10-01

    Evolutionary conservation information included in position-specific scoring matrices (PSSMs) has been widely adopted by sequence-based methods for identifying protein functional sites, because all functional sites, whether in ordered or disordered proteins, are found to be conserved to some extent. However, different functional sites have different conservation patterns: some are linear contextual, some are mingled with highly variable residues, and others seem to be conserved independently. Every value in a PSSM is calculated independently of the others, without carrying the contextual information of residues in the sequence. Therefore, adopting the direct output of a PSSM for prediction fails to consider the relationship between conservation patterns of residues and the distribution of conservation scores in PSSMs. In order to demonstrate the importance of combining PSSMs with the specific conservation patterns of functional sites for prediction, three different PSSM-based methods for identifying three kinds of functional sites have been analyzed. Results suggest that different PSSM-based methods differ in their capability to identify different patterns of functional sites, and that better combining PSSMs with the specific conservation patterns of residues would largely facilitate the prediction.

  19. The hypothetical protein Atu4866 from Agrobacterium tumefaciens adopts a streptavidin-like fold

    PubMed Central

    Ai, Xuanjun; Semesi, Anthony; Yee, Adelinda; Arrowsmith, Cheryl H.; Choy, Wing-Yiu; Li, Shawn S.C.

    2008-01-01

    Atu4866 is a 79-residue conserved hypothetical protein of unknown function from Agrobacterium tumefaciens. Protein sequence alignments show that it shares ≥60% sequence identity with 20 other hypothetical proteins of bacterial origin. However, the structures and functions of these proteins remain unknown so far. To gain insight into the function of this family of proteins, we have determined the structure of Atu4866 as a target of a structural genomics project using solution NMR spectroscopy. Our results reveal that Atu4866 adopts a streptavidin-like fold featuring a β-barrel/sandwich formed by eight antiparallel β-strands. Further structural analysis identified a continuous patch of conserved residues on the surface of Atu4866 that may constitute a potential ligand-binding site. PMID:18042676

  20. Epidemiology of canine distemper virus in wild raccoon dogs (Nyctereutes procyonoides) from South Korea.

    PubMed

    Cha, Se-Yeoun; Kim, Eun-Ju; Kang, Min; Jang, Sang-Ho; Lee, Hae-Beom; Jang, Hyung-Kwan

    2012-09-01

    Raccoon dogs (Nyctereutes procyonoides) are widespread and common in South Korea. In 2011, we obtained serum samples from 102 wild raccoon dogs to survey their exposure to canine distemper virus (CDV). Forty-five of the 102 animals (44.1%) were seropositive. Field cases of canine distemper in wild raccoon dogs from 2010 to 2011 were investigated. Fourteen cases of CDV infection were identified by a commercially available CDV antigen detection kit. These cases were used for virus isolation and molecular analysis. Sequence analysis of hemagglutinin genes indicated that all viruses isolated belonged to the Asia-2 genotype. H protein residues that are related to receptor and host specificity (residues 530 and 549) were analyzed. A glutamic acid (E) residue was present at 530 in all isolates. At 549, a histidine (H) residue was found in five isolates and a tyrosine (Y) residue was found in six isolates. Our study demonstrated that CDV infection was widespread in wild raccoon dogs in South Korea. Copyright © 2012 Elsevier Ltd. All rights reserved.

  1. Immunoglobulin from Antarctic fish species of Rajidae family.

    PubMed

    Coscia, Maria Rosaria; Cocca, Ennio; Giacomelli, Stefano; Cuccaro, Fausta; Oreste, Umberto

    2012-03-01

    Immunoglobulins (Ig) of Chondroichthyes have been extensively studied in sharks; in contrast, in skates investigations on Ig remain scarce and fragmentary despite the high occurrence of skates in all of the major oceans of the world. To focus on Rajidae Igμ, the most abundant heavy chain isotype, we have chosen the Antarctic species Bathyraja eatonii, Bathyraja albomaculata, Bathyraja brachyurops, and Amblyraja georgiana which live at high latitudes in the Southern Ocean, and at very low temperatures. We prepared mRNA from the spleen of individuals of each species and performed RT-PCR experiments using two oligonucleotides designed on the alignment of various elasmobranch Igμ heavy chain sequences available in GenBank. The PCR products, about 1400-nt long, were cloned and sequenced. Nucleotide sequence identities calculated for the constant region domains ranged from 88.5% to 97.5% between species, and from 91.1% to 99.7% within species. In a distance tree, including also Raja erinacea sequences, two major branches were obtained, one containing Arhynchobatinae sequences, the other one Rajinae sequences. Four presumptive D gene segments were identified in the region of the VH/D/JH recombination; two different D segments were often found in the same sequence. Moreover, 5-15 genomic fragments of different lengths, carrying the gene locus encoding Igμ chain were revealed by Southern blotting analysis. B. eatonii amino acid sequences were analyzed for the positional diversity by Shannon entropy analysis, showing CH4 as the most conserved domain, and CH3 as the most variable one. B. eatonii CDR3 region length varied between 11 and 15 amino acid residues; the mean length (13.4 aa) was greater than that of Leucoraja eglanteria sequences (7.7 aa). An alignment of representative sequences of Antarctic species and R. erinacea showed that more cysteine residues not involved in the intradomain disulfide bridges were present in Antarctic species. 
    Copyright © 2011 Elsevier B.V. All rights reserved.
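
    The positional diversity analysis mentioned in this record relies on Shannon entropy per alignment column. The following is a minimal sketch of that calculation, assuming equal-length aligned sequences, base-2 logarithms, and gaps ignored; the toy alignment and function names are hypothetical and not taken from the study.

```python
import math
from collections import Counter

def column_entropy(column, gap="-"):
    """Shannon entropy (bits) of one alignment column, ignoring gap characters."""
    residues = [r for r in column if r != gap]
    if not residues:
        return 0.0
    total = len(residues)
    counts = Counter(residues)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def alignment_entropies(aligned_sequences):
    """Entropy for every column of a list of equal-length aligned sequences."""
    return [column_entropy(col) for col in zip(*aligned_sequences)]

# Toy alignment: column 1 is invariant, column 3 differs in every sequence.
toy = ["ACD", "ACE", "AGF", "AGG"]
print(alignment_entropies(toy))
```

    Averaging the per-column values over the columns of a domain gives one simple way to compare domains such as CH3 and CH4: lower mean entropy indicates stronger conservation.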

  2. Phylogenetic and Structural Analysis of the Pluripotency Factor Sex-Determining Region Y box2 Gene of Camelus dromedarius (cSox2).

    PubMed

    Alawad, Abdullah; Alharbi, Sultan; Alhazzaa, Othman; Alagrafi, Faisal; Alkhrayef, Mohammed; Alhamdan, Ziyad; Alenazi, Abdullah; Al-Johi, Hasan; Alanazi, Ibrahim O; Hammad, Mohamed

    2016-01-01

    Although the sequencing information of Sox2 cDNA for many mammals is available, the Sox2 cDNA of Camelus dromedarius has not yet been characterized. The objective of this study was to sequence and characterize Sox2 cDNA from the brain of C. dromedarius (also known as the Arabian camel). A full coding sequence of the Sox2 gene from the brain of C. dromedarius was amplified by reverse transcription PCR and then sequenced using the 3730XL series platform Sequencer (Applied Biosystems) for the first time. The cDNA sequence displayed an open reading frame of 822 nucleotides, encoding a protein of 273 amino acids. The molecular weight and the isoelectric point of the translated protein were calculated as 29.825 kDa and 10.11, respectively, using bioinformatics analysis. The predicted cSox2 protein sequence exhibited high identity: 99% for Homo sapiens, Mus musculus, Bos taurus, and Vicugna pacos; 98% for Sus scrofa and 93% for Camelus ferus. A 3D structure was built based on the available structure of the HMG-box domain of human stem cell transcription factor Sox2 (PDB: 2LE4) with 81 residues and predictive bioinformatics software for 273 amino acid residues. The comparison confirms the presence of the HMG-box domain in the cSox2 protein. The orthologous phylogenetic analysis showed that the Sox2 isoform from C. dromedarius was grouped with humans, alpacas, cattle, and pigs. We believe that this genetic and structural information will be a helpful source for the annotation. Furthermore, Sox2 is one of the transcription factors that contribute to the generation of induced pluripotent stem cells (iPSCs), which in turn will probably help generate camel induced pluripotent stem cells (CiPSCs).
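
    The relationship between the 822-nucleotide open reading frame and the 273-residue protein in this record is simple codon arithmetic. The short sketch below assumes that the reported ORF length includes the terminal stop codon (an assumption; the abstract does not say so explicitly). The function name is hypothetical.

```python
def protein_length_from_orf(orf_nucleotides, includes_stop=True):
    """Number of encoded residues for an ORF of the given length in nucleotides."""
    if orf_nucleotides % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    codons = orf_nucleotides // 3
    # The stop codon occupies one codon but adds no amino acid.
    return codons - 1 if includes_stop else codons

print(protein_length_from_orf(822))  # 274 codons, one of which is the stop codon
```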

  3. A sequence-specific transcription activator motif and powerful synthetic variants that bind Mediator using a fuzzy protein interface.

    PubMed

    Warfield, Linda; Tuttle, Lisa M; Pacheco, Derek; Klevit, Rachel E; Hahn, Steven

    2014-08-26

    Although many transcription activators contact the same set of coactivator complexes, the mechanism and specificity of these interactions have been unclear. For example, do intrinsically disordered transcription activation domains (ADs) use sequence-specific motifs, or do ADs of seemingly different sequence have common properties that encode activation function? We find that the central activation domain (cAD) of the yeast activator Gcn4 functions through a short, conserved sequence-specific motif. Optimizing the residues surrounding this short motif by inserting additional hydrophobic residues creates very powerful ADs that bind the Mediator subunit Gal11/Med15 with high affinity via a "fuzzy" protein interface. In contrast to Gcn4, the activity of these synthetic ADs is not strongly dependent on any one residue of the AD, and this redundancy is similar to that of some natural ADs in which few if any sequence-specific residues have been identified. The additional hydrophobic residues in the synthetic ADs likely allow multiple faces of the AD helix to interact with the Gal11 activator-binding domain, effectively forming a fuzzier interface than that of the wild-type cAD.

  4. SEQATOMS: a web tool for identifying missing regions in PDB in sequence context.

    PubMed

    Brandt, Bernd W; Heringa, Jaap; Leunissen, Jack A M

    2008-07-01

    With over 46 000 proteins, the Protein Data Bank (PDB) is the most important database with structural information of biological macromolecules. PDB files contain sequence and coordinate information. Residues present in the sequence can be absent from the coordinate section, which means their position in space is unknown. Similarity searches are routinely carried out against sequences taken from PDB SEQRES. However, these searches make no distinction between residues that have a known or unknown position in the 3D protein structure. We present a FASTA sequence database that is produced by combining the sequence and coordinate information. All residues absent from the PDB coordinate section are masked with lower-case letters, thereby providing a view of these residues in the context of the entire protein sequence, which facilitates inspecting 'missing' regions. We also provide a masked version of the CATH domain database. A user-friendly BLAST interface is available for similarity searching. In contrast to standard (stand-alone) BLAST output, which only contains upper-case letters, our output retains the lower-case letters of the masked regions. Thus, our server can be used to perform BLAST searching case-sensitively. Here, we have applied it to the study of missing regions in their sequence context. SEQATOMS is available at http://www.bioinformatics.nl/tools/seqatoms/.
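
    The masking idea described in this record, lower-casing residues that lack coordinates, can be sketched in a few lines. This is an illustration only, not the SEQATOMS implementation: it assumes the SEQRES sequence and the set of 0-based positions that have coordinates are already known, and the function names and example sequence are hypothetical.

```python
def mask_missing(seqres, observed_positions):
    """Upper-case residues with coordinates, lower-case residues without them."""
    observed = set(observed_positions)
    return "".join(
        aa.upper() if i in observed else aa.lower()
        for i, aa in enumerate(seqres)
    )

def missing_regions(masked):
    """Return (start, end) 0-based inclusive ranges of lower-case (missing) stretches."""
    regions, start = [], None
    for i, ch in enumerate(masked):
        if ch.islower():
            if start is None:
                start = i
        elif start is not None:
            regions.append((start, i - 1))
            start = None
    if start is not None:
        regions.append((start, len(masked) - 1))
    return regions

masked = mask_missing("MKTAYIAK", [2, 3, 4, 5])
print(masked, missing_regions(masked))
```

    Keeping the full sequence and encoding the missing residues by case preserves residue numbering, so a hit in a similarity search can be read directly against the complete protein.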

  5. How Many Protein Sequences Fold to a Given Structure? A Coevolutionary Analysis.

    PubMed

    Tian, Pengfei; Best, Robert B

    2017-10-17

    Quantifying the relationship between protein sequence and structure is key to understanding the protein universe. A fundamental measure of this relationship is the total number of amino acid sequences that can fold to a target protein structure, known as the "sequence capacity," which has been suggested as a proxy for how designable a given protein fold is. Although sequence capacity has been extensively studied using lattice models and theory, numerical estimates for real protein structures are currently lacking. In this work, we have quantitatively estimated the sequence capacity of 10 proteins with a variety of different structures using a statistical model based on residue-residue co-evolution to capture the variation of sequences from the same protein family. Remarkably, we find that even for the smallest protein folds, such as the WW domain, the number of foldable sequences is extremely large, exceeding the Avogadro constant. In agreement with earlier theoretical work, the calculated sequence capacity is positively correlated with the size of the protein, or better, the density of contacts. This allows the absolute sequence capacity of a given protein to be approximately predicted from its structure. On the other hand, the relative sequence capacity, i.e., normalized by the total number of possible sequences, is an extremely tiny number and is strongly anti-correlated with the protein length. Thus, although there may be more foldable sequences for larger proteins, it will be much harder to find them. Lastly, we have correlated the evolutionary age of proteins in the CATH database with their sequence capacity as predicted by our model. The results suggest a trade-off between the opposing requirements of high designability and the likelihood of a novel fold emerging by chance. Published by Elsevier Inc.
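
    The distinction between absolute and relative sequence capacity in this record is easiest to see in logarithms, because both quantities span many orders of magnitude. The sketch below normalizes a capacity by the 20^L possible sequences of length L. The input values are hypothetical placeholders chosen for illustration, not results from the paper.

```python
import math

AVOGADRO = 6.02214076e23

def log10_relative_capacity(log10_capacity, length, alphabet_size=20):
    """log10 of (number of foldable sequences / number of possible sequences)."""
    return log10_capacity - length * math.log10(alphabet_size)

# Hypothetical example: a 35-residue domain with 10^25 foldable sequences.
log10_capacity = 25.0
print(log10_capacity > math.log10(AVOGADRO))         # capacity exceeds Avogadro's number
print(log10_relative_capacity(log10_capacity, 35))   # yet it is a tiny fraction of 20^35
```

    Because the denominator grows as 20^L, the relative capacity falls quickly with length unless the absolute capacity grows just as fast, which is the anti-correlation with protein length noted in the record.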

  6. Identification of succinimide sites in proteins by N-terminal sequence analysis after alkaline hydroxylamine cleavage.

    PubMed Central

    Kwong, M. Y.; Harris, R. J.

    1994-01-01

    Under favorable conditions, Asp or Asn residues can undergo rearrangement to a succinimide (cyclic imide), which may also serve as an intermediate for deamidation and/or isoaspartate formation. Direct identification of such succinimides by peptide mapping is hampered by their lability at neutral and alkaline pH. We determined that incubation in 2 M hydroxylamine, 0.2 M Tris buffer, pH 9, for 2 h at 45 °C will specifically cleave on the C-terminal side of succinimides without cleavage at Asn-Gly bonds; yields are typically approximately 50%. N-terminal sequence analysis can then be used to identify an internal sequence generated by cleavage of the succinimide, hence identifying the succinimide site. PMID:8142891

  7. Molecular characterization and distribution of a 145-bp tandem repeat family in the genus Populus.

    PubMed

    Rajagopal, J; Das, S; Khurana, D K; Srivastava, P S; Lakshmikumaran, M

    1999-10-01

    This report aims to describe the identification and molecular characterization of a 145-bp tandem repeat family that accounts for nearly 1.5% of the Populus genome. Three members of this repeat family were cloned and sequenced from Populus deltoides and P. ciliata. The dimers of the repeat were sequenced in order to confirm the head-to-tail organization of the repeat. Hybridization-based analysis using the 145-bp tandem repeat as a probe on genomic DNA gave rise to ladder patterns which were identified to be a result of methylation and (or) sequence heterogeneity. Analysis of the methylation pattern of the repeat family using methylation-sensitive isoschizomers revealed variable methylation of the C residues and lack of methylation of the A residues. Sequence comparisons between the monomers revealed a high degree of sequence divergence that ranged between 6% and 11% in P. deltoides and between 4.2% and 8.3% in P. ciliata. This indicated the presence of sub-families within the 145-bp tandem family of repeats. Divergence was mainly due to the accumulation of point mutations and was concentrated in the central region of the repeat. The 145-bp tandem repeat family did not show significant homology to known tandem repeats from plants. A short stretch of 36 bp was found to show homology of 66.7% to a centromeric repeat from Chironomus plumosus. Dot-blot analysis and Southern hybridization data revealed the presence of the repeat family in 13 of the 14 Populus species examined. The absence of the 145-bp repeat from P. euphratica suggested that this species is relatively distant from other members of the genus, which correlates with taxonomic classifications. The widespread occurrence of the tandem family in the genus indicated that this family may be of ancient origin.

  8. Isolation and primary structural analysis of two conjugated polyketone reductases from Candida parapsilosis.

    PubMed

    Hidalgo, A R; Akond, M A; Kita, K; Kataoka, M; Shimizu, S

    2001-12-01

    Two conjugated polyketone reductases (CPRs) were isolated from Candida parapsilosis IFO 0708. The primary structures of CPRs (C1 and C2) were analyzed by amino acid sequencing. The amino acid sequences of both enzymes had high similarity to those of several proteins of the aldo-keto-reductase (AKR) superfamily. However, several amino acid residues in the putative active sites of AKRs were not conserved in CPRs-C1 and -C2.

  9. Analysis of a developmentally regulated nuclear localization signal in Xenopus

    PubMed Central

    1992-01-01

    The 289 residue nuclear oncoprotein encoded by the adenovirus 5 E1a gene contains two peptide sequences that behave as nuclear localization signals (NLS). One signal, located at the carboxy terminus, is like many other known NLSs in that it consists of a short stretch of basic residues (KRPRP) and is constitutively active in cells. The second signal resides within an internal 45 residue region of E1a that contains few basic residues or sequences that resemble other known NLSs. Moreover, this internal signal functions in injected Xenopus oocytes, but not in transfected Xenopus A6 cells, suggesting that it could be regulated developmentally (Slavicek et al. 1989. J. Virol. 63:4047). In this study, we show that the activity of this signal is sensitive to ATP depletion in vivo, efficiently directs the import of a 50 kD fusion protein and can compete with the E1a carboxy-terminal NLS for nuclear import. In addition, we have delineated the precise amino acid residues that comprise the second E1a NLS, and have assessed its utilization during Xenopus embryogenesis. Using amino acid deletion and substitution analyses, we show that the signal consists of the sequence FV(X)7-20MXSLXYM(X)4MF. By expressing in Xenopus embryos a truncated E1a protein that contains only the second NLS and by monitoring its cytoplasmic/nuclear distribution during development with indirect immunofluorescence, we find that the second NLS is utilized up to the early neurula stage. In addition, there appears to be a hierarchy among the embryonic germ layers as to when the second NLS becomes nonfunctional. For this reason, we refer to this NLS as the developmentally regulated nuclear localization signal (drNLS). The implications of these findings for early development are discussed. PMID:1387407
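
    The consensus FV(X)7-20MXSLXYM(X)4MF given in this record can be expressed as a regular expression for scanning protein sequences. This is a minimal sketch that reads X as any single residue and (X)7-20 as a run of 7 to 20 residues; that reading of the notation is an assumption, and the test sequence is synthetic rather than the E1a sequence.

```python
import re

# X is taken to mean any one-letter amino acid code; (X)7-20 a run of 7 to 20 of them.
DRNLS_PATTERN = re.compile(r"FV[A-Z]{7,20}M[A-Z]SL[A-Z]YM[A-Z]{4}MF")

def find_drnls(sequence):
    """Return (0-based start, matched text) for each non-overlapping motif occurrence."""
    return [(m.start(), m.group()) for m in DRNLS_PATTERN.finditer(sequence.upper())]

# Synthetic sequence built to contain exactly one occurrence of the pattern.
synthetic = "AAA" + "FV" + "G" * 7 + "MASLAYM" + "GGGG" + "MF" + "AAA"
print(find_drnls(synthetic))
```

    A match only shows that a sequence fits the consensus; whether such a segment acts as a localization signal is an experimental question.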

  10. Sequence-Specific Model for Peptide Retention Time Prediction in Strong Cation Exchange Chromatography.

    PubMed

    Gussakovsky, Daniel; Neustaeter, Haley; Spicer, Victor; Krokhin, Oleg V

    2017-11-07

    The development of a peptide retention prediction model for strong cation exchange (SCX) separation on a Polysulfoethyl A column is reported. Off-line 2D LC-MS/MS analysis (SCX-RPLC) of S. cerevisiae whole cell lysate was used to generate a retention dataset of ∼30 000 peptides, sufficient for identifying the major sequence-specific features of peptide retention mechanisms in SCX. In contrast to RPLC/hydrophilic interaction liquid chromatography (HILIC) separation modes, where retention is driven by hydrophobic/hydrophilic contributions of all individual residues, SCX interactions depend mainly on peptide charge (number of basic residues at acidic pH) and size. An additive model (incorporating the contributions of all 20 residues into the peptide retention) combined with a peptide length correction produces a prediction accuracy of R² = 0.976, significantly higher than the additive models for either HILIC or RPLC. Position-dependent effects on peptide retention for different residues were driven by the spatial orientation of tryptic peptides upon interaction with the negatively charged surface functional groups. The positively charged N-termini serve as a primary point of interaction. For example, basic residues (Arg, His, Lys) increase peptide retention when located closer to the N-terminus. We also found that hydrophobic interactions, which could lead to a mixed-mode separation mechanism, are largely suppressed at 20-30% of acetonitrile in the eluent. The accuracy of the final Sequence-Specific Retention Calculator (SSRCalc) SCX model (R² ≈ 0.99) exceeds all previously reported predictors for peptide LC separations. This also provides a solid platform for method development in 2D LC-MS protocols in proteomics and peptide retention prediction filtering of false positive identifications.
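
    The record states that SCX retention depends mainly on peptide charge, counted as the number of basic residues at acidic pH, and on peptide size. A minimal sketch of those two descriptors follows. It assumes a free, positively charged N-terminus contributing one extra charge and treats acidic groups as neutral at acidic pH; both are simplifying assumptions, and this is not the SSRCalc model. The function names are hypothetical.

```python
BASIC_RESIDUES = frozenset("RHK")  # Arg, His, Lys, as listed in the record

def nominal_charge_acidic_ph(peptide):
    """Basic residues plus one for an assumed free, protonated N-terminus."""
    return 1 + sum(1 for aa in peptide.upper() if aa in BASIC_RESIDUES)

def scx_descriptors(peptide):
    """Charge and length, the two features the record names as dominant in SCX."""
    return {"charge": nominal_charge_acidic_ph(peptide), "length": len(peptide)}

# Hypothetical tryptic-style peptides.
for pep in ["GASPLK", "AHGLVTR", "KRHAAK"]:
    print(pep, scx_descriptors(pep))
```

    Grouping peptides by these descriptors gives a coarse ordering only; the position-dependent effects described in the record, such as basic residues near the N-terminus, are not captured here.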

  11. Analysis of the regulatory region of the protease III (ptr) gene of Escherichia coli K-12.

    PubMed

    Claverie-Martin, F; Diaz-Torres, M R; Kushner, S R

    1987-01-01

    The ptr gene of Escherichia coli encodes protease III (Mr 110,000) and a 50-kDa polypeptide, both of which are found in the periplasmic space. The gene is physically located between the recC and recB loci on the E. coli chromosome. The nucleotide sequence of a 1167-bp EcoRV-ClaI fragment of chromosomal DNA containing the promoter region and 885 bp of the ptr coding sequence has been determined. S1 nuclease mapping analysis showed that the major 5' end of the ptr mRNA was localized 127 bp upstream from the ATG start codon. The open reading frame (ORF), preceded by a Shine-Dalgarno sequence, extends to the end of the sequenced DNA. Downstream from the -35 and -10 regions is a sequence that strongly fits the consensus sequence of known nitrogen-regulated promoters. A signal peptide of 23 amino acid residues is present at the N terminus of the derived amino acid sequence. The cleavage site as well as the ORF were confirmed by sequencing the N terminus of mature protease III.

  12. Analysis of the Mutations in the Active Site of the RNA-Dependent RNA Polymerase of Human Parainfluenza Virus Type 3 (HPIV3)

    PubMed Central

    Malur, Achut G.; Gupta, Neera K.; De, Bishnu P.; Banerjee, Amiya K.

    2002-01-01

    The large protein (L) of the human parainfluenza virus type 3 (HPIV3) is the functional RNA-dependent RNA polymerase, which possesses highly conserved residues QGDNQ located within motif C of domain III comprising the putative polymerase active site. We have characterized the role of the QGDNQ residues as well as the residues flanking this region in the polymerase activity of the L protein by site-directed mutagenesis and examining the polymerase activity of the wild-type and mutant L proteins by an in vivo minigenome replication assay and an in vitro mRNA transcription assay. All mutations in the QGDNQ residues abolished transcription while mutations in the flanking residues gave rise to variable polymerase activities. These observations support the contention that the QGDNQ sequence is absolutely required for the polymerase activity of the HPIV3 RNA-dependent RNA polymerase. PMID:12064576

  13. Assessing the role of aromatic residues in the amyloid aggregation of human muscle acylphosphatase

    PubMed Central

    Bemporad, Francesco; Taddei, Niccolò; Stefani, Massimo; Chiti, Fabrizio

    2006-01-01

    Among the many parameters that have been proposed to promote amyloid fibril formation is the π-stacking of aromatic residues. We have studied the amyloid aggregation of several mutants of human muscle acylphosphatase in which an aromatic residue was substituted with a non-aromatic one. The aggregation rate was determined using the Thioflavin T test under conditions in which the variants populated initially an ensemble of partially unfolded conformations. Substitutions in aggregation-promoting fragments of the sequence result in a dramatically decreased aggregation rate of the protein, confirming the propensity of aromatic residues to promote this process. Nevertheless, a statistical analysis shows that the measured decrease of aggregation rate following mutation arises predominantly from a reduction of hydrophobicity and intrinsic β-sheet propensity. This suggests that aromatic residues favor aggregation because of these factors rather than for their aromaticity. PMID:16600970

  14. A conserved mechanism for replication origin recognition and binding in archaea.

    PubMed

    Majerník, Alan I; Chong, James P J

    2008-01-15

    To date, methanogens are the only group within the archaea where firing DNA replication origins have not been demonstrated in vivo. In the present study we show that a previously identified cluster of ORB (origin recognition box) sequences do indeed function as an origin of replication in vivo in the archaeon Methanothermobacter thermautotrophicus. Although the consensus sequence of ORBs in M. thermautotrophicus is somewhat conserved when compared with ORB sequences in other archaea, the Cdc6-1 protein from M. thermautotrophicus (termed MthCdc6-1) displays sequence-specific binding that is selective for the MthORB sequence and does not recognize ORBs from other archaeal species. Stabilization of in vitro MthORB DNA binding by MthCdc6-1 requires additional conserved sequences 3' to those originally described for M. thermautotrophicus. By testing synthetic sequences bearing mutations in the MthORB consensus sequence, we show that Cdc6/ORB binding is critically dependent on the presence of an invariant guanine found in all archaeal ORB sequences. Mutation of a universally conserved arginine residue in the recognition helix of the winged helix domain of archaeal Cdc6-1 shows that specific origin sequence recognition is dependent on the interaction of this arginine residue with the invariant guanine. Recognition of a mutated origin sequence can be achieved by mutation of the conserved arginine residue to a lysine or glutamine residue. Thus despite a number of differences in protein and DNA sequences between species, the mechanism of origin recognition and binding appears to be conserved throughout the archaea.

  15. The challenge of annotating protein sequences: The tale of eight domains of unknown function in Pfam.

    PubMed

    Goonesekere, Nalin C W; Shipely, Krysten; O'Connor, Kevin

    2010-06-01

    The Pfam database is an important tool in genome annotation, since it provides a collection of curated protein families. However, a subset of these families, known as domains of unknown function (DUFs), remains poorly characterized. We have related sequences from DUF404, DUF407, DUF482, DUF608, DUF810, DUF853, DUF976 and DUF1111 to homologs in PDB, within the midnight zone (9-20%) of sequence identity. These relationships were extended to provide functional annotation by sequence analysis and model building. Also described are examples of residue plasticity within enzyme active sites, and change of function within homologous sequences of a DUF. Copyright 2010 Elsevier Ltd. All rights reserved.
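
    The "midnight zone" mentioned above is defined by pairwise sequence identity, so a minimal sketch of how such a figure might be computed from two aligned sequences is given here. The function names, the toy sequences, and the choice of denominator (columns where neither sequence has a gap) are illustrative assumptions; the abstract does not say how identity was calculated, and other conventions exist.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over alignment columns where neither sequence has a gap.

    Assumption: the denominator is the number of ungapped columns; other
    definitions (e.g. shorter sequence length) give different values.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip columns containing a gap
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def in_midnight_zone(pid: float, low: float = 9.0, high: float = 20.0) -> bool:
    """True if the identity falls in the 9-20% range quoted in the abstract."""
    return low <= pid <= high


# Hypothetical toy alignment, not taken from any DUF family.
toy_identity = percent_identity("MKV-LAGTWQ", "MRVSLP-TYE")
```

    With the toy alignment, 4 of 8 ungapped columns match, which is far above the midnight zone; real DUF-to-PDB pairs in that zone would need profile or structure-based methods to align reliably.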

  16. Winnowing DNA for rare sequences: highly specific sequence and methylation based enrichment.

    PubMed

    Thompson, Jason D; Shibahara, Gosuke; Rajan, Sweta; Pel, Joel; Marziali, Andre

    2012-01-01

    Rare mutations in cell populations are known to be hallmarks of many diseases and cancers. Similarly, differential DNA methylation patterns arise in rare cell populations with diagnostic potential such as fetal cells circulating in maternal blood. Unfortunately, the frequency of alleles with diagnostic potential, relative to wild-type background sequence, is often well below the frequency of errors in currently available methods for sequence analysis, including very high throughput DNA sequencing. We demonstrate a DNA preparation and purification method that through non-linear electrophoretic separation in media containing oligonucleotide probes, achieves 10,000 fold enrichment of target DNA with single nucleotide specificity, and 100 fold enrichment of unmodified methylated DNA differing from the background by the methylation of a single cytosine residue.

  17. Comparative sequence analysis of acid sensitive/resistance proteins in Escherichia coli and Shigella flexneri

    PubMed Central

    Manikandan, Selvaraj; Balaji, Seetharaaman; Kumar, Anil; Kumar, Rita

    2007-01-01

    The molecular basis for the survival of bacteria under extreme conditions in which growth is inhibited is a question of great current interest. A preliminary study was carried out to determine residue pattern conservation among the antiporters of enteric bacteria, responsible for extreme acid sensitivity especially in Escherichia coli and Shigella flexneri. Here we found the molecular evidence that proved the relationship between E. coli and S. flexneri. Multiple sequence alignment of the gadC coded acid sensitive antiporter showed many conserved residue patterns at regular intervals at the N-terminal region. It was observed that as the alignment approaches the C-terminal, the number of conserved residues decreases, indicating that the N-terminal region of this protein has a much more active role than the carboxyl terminal. The motif, FHLVFFLLLGG, is well conserved within the entire gadC coded protein at the amino terminal. The motif is also partially conserved among other antiporters (which are not coded by gadC) but are involved in the acid sensitive/resistance mechanism. Phylogenetic cluster analysis proves the relationship of Escherichia coli and Shigella flexneri. The gadC coded proteins converge as a clade and diverge from other antiporters that belong to the amino acid-polyamine-organocation (APC) superfamily. PMID:21670792
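
    The distinction between a fully and a partially conserved motif can be illustrated with a short scan. This is a sketch only: the motif string comes from the abstract, but the function name and both test sequences are hypothetical, and an ungapped sliding window is an assumed simplification of what a multiple sequence alignment would show.

```python
MOTIF = "FHLVFFLLLGG"  # motif quoted in the abstract


def best_motif_match(sequence: str, motif: str = MOTIF) -> tuple[int, int]:
    """Return (start index, identical positions) of the best ungapped window.

    Ties are resolved in favour of the earliest window. Returns (-1, -1)
    if the sequence is shorter than the motif.
    """
    best = (-1, -1)
    for start in range(len(sequence) - len(motif) + 1):
        window = sequence[start:start + len(motif)]
        identities = sum(a == b for a, b in zip(window, motif))
        if identities > best[1]:
            best = (start, identities)
    return best


# Hypothetical sequences: one carries the exact motif, one a diverged copy.
exact_hit = best_motif_match("MKTAFHLVFFLLLGGSAV")
partial_hit = best_motif_match("MSSFHLIFFALLGGTT")
```

    Here the first sequence matches all 11 motif positions, while the second matches 9 of 11, which is the kind of partial conservation the abstract describes for antiporters not coded by gadC.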

  18. Stresses in Implant-Supported Fixed Complete Dentures with Different Screw-Tightening Sequences and Torque Application Modes.

    PubMed

    Barcellos, Leonardo H; Palmeiro, Marina Lobato; Naconecy, Marcos M; Geremia, Tomás; Cervieri, André; Shinkai, Rosemary S

    2018-05-17

    To compare the effects of different screw-tightening sequences and torque applications on stresses in implant-supported fixed complete dentures supported by five abutments. Strain gauges fixed to the abutments were used to test the sequences 2-4-3-1-5; 1-2-3-4-5; 3-2-4-1-5; and 2-5-4-1-3 with direct 10-Ncm torque or progressive torque (5 + 10 Ncm). Data were analyzed using analysis of variance and standardized effect size. No effects of tightening sequence or torque application were found except for the sequence 3-2-4-1-5 and some small to moderate effect sizes. Screw-tightening sequences and torque application modes have only a marginal effect on residual stresses.

  19. Sequence Alignment to Predict Across Species Susceptibility ...

    EPA Pesticide Factsheets

    Conservation of a molecular target across species can be used as a line-of-evidence to predict the likelihood of chemical susceptibility. The web-based Sequence Alignment to Predict Across Species Susceptibility (SeqAPASS) tool was developed to simplify, streamline, and quantitatively assess protein sequence/structural similarity across taxonomic groups as a means to predict relative intrinsic susceptibility. The intent of the tool is to allow for evaluation of any potential protein target, so it is amenable to variable degrees of protein characterization, depending on available information about the chemical/protein interaction and the molecular target itself. To allow for flexibility in the analysis, a layered strategy was adopted for the tool. The first level of the SeqAPASS analysis compares primary amino acid sequences to a query sequence, calculating a metric for sequence similarity (including detection of candidate orthologs), the second level evaluates sequence similarity within selected domains (e.g., ligand-binding domain, DNA binding domain), and the third level of analysis compares individual amino acid residue positions identified as being of importance for protein conformation and/or ligand binding upon chemical perturbation. Each level of the SeqAPASS analysis provides increasing evidence to apply toward rapid, screening-level assessments of probable cross species susceptibility. Such analyses can support prioritization of chemicals for further ev

  20. The role of aromatic phenylalanine residues in binding carotenoid to light-harvesting model and wild-type complexes.

    PubMed

    García-Martín, A; Pazur, A; Wilhelm, B; Silber, M; Robert, B; Braun, P

    2008-09-26

    The mode of carotenoid (Crt) binding to polypeptide and specifying its function is as yet largely unknown. Statistical analysis of major photosystems I and II suggests that aromatic residues make up a significant part of the Crt binding pockets. Phenylalanine residues ensure approximately 25%--at some carbon atoms even up to 40%--of the total contacts with Crts. By use of an alanine-leucine model transmembrane helix that replaces the native helix of the bacterial light-harvesting complex 2 (LH2) alpha-subunit, we study the effects of polypeptide residues on cofactor binding in a model sequence context. Here, it is shown that phenylalanine residues located in the close vicinity of the Crts' polyene backbone significantly contribute to the binding of the Crt to the model protein. The replacement of a phenylalanine with leucine in the model helix results in significant reduction in the complexes' Crt content. This effect is strongly enhanced by the removal of a second phenylalanine in close vicinity to the Crt, i.e., of the wild-type (WT) beta-subunit. Remarkably, the mutation of only two phenylalanine residues in the LH2 WT sequence, alpha-Phe at position -12 and beta-Phe at -8, results in the loss of nearly 50% of functional Crt. Resonance Raman spectra indicate that the Crt conformation is fundamentally altered by the absence of the phenylalanines' aromatic side chains, suggesting that they lock the Crt into a precise, well-defined configuration. Thus, binding and specific functionalisation of Crt in the model and WT light-harvesting complex is reliant on the aromatic residue phenylalanine. The use of the light-harvesting complex as a model system thus substantiates the notion that the aromatic residue phenylalanine is a key factor for the binding of Crt to transmembrane proteins.

  1. HPV positive, wild type TP53, and p16 overexpression correlate with the absence of residual tumors after chemoradiotherapy in anal squamous cell carcinoma.

    PubMed

    Soares, Paulo C; Abdelhay, Eliana S; Thuler, Luiz Claudio S; Soares, Bruno Moreira; Demachki, Samia; Ferro, Gessica Valéria Rocha; Assumpção, Paulo P; Lamarão, Leticia Martins; Ribeiro Pinto, Luis Felipe; Burbano, Rommel Mario Rodríguez

    2018-02-21

    Anal residual tumors are consensually identified within six months of chemoradiotherapy and represent a persistent lesion that may have prognostic value for overall survival. The aim of this study was to evaluate the association of HPV and HIV status, p16 expression level and TP53 mutations with the absence of residual tumors (local response) in Squamous Cell Carcinoma (SCC) of the anal canal after chemoradiotherapy. We performed a study on 78 patients with SCC of the anal canal who underwent chemoradiotherapy and were followed for a six-month period to identify the absence or presence of residual tumors. HPV DNA was identified by polymerase chain reaction and direct sequencing, HIV RNA was detected by TaqMan amplification, p16 expression was detected by western blotting, and the mutational analysis of TP53 was performed by direct sequencing; additionally, samples carrying mutations underwent fluorescent in situ hybridization. The evaluation of the tumor response to treatment was conducted six months after the conclusion of chemoradiotherapy. The following classifications were used to evaluate the outcomes: a) no response (presence of residual tumor) and b) complete response (absence of residual tumor). The significant variables associated with the absence of residual tumors were HPV positive, p16 overexpressed, wild-type TP53, female gender, and stages I and II. Only the presence of HPV was independently correlated with the clinical response; this variable increased the chances of a response within six months by 31-fold. The presence of HPV in tumor cells was correlated with the absence of a residual tumor. This correlation is valuable and can direct future therapeutic approaches in the anal canal.

  2. Modeling the Residual Strength of a Fibrous Composite Using the Residual Daniels Function

    NASA Astrophysics Data System (ADS)

    Paramonov, Yu.; Cimanis, V.; Varickis, S.; Kleinhofs, M.

    2016-09-01

    The concept of a residual Daniels function (RDF) is introduced. Together with the concept of Daniels sequence, the RDF is used for estimating the residual (after some preliminary fatigue loading) static strength of a unidirectional fibrous composite (UFC) and its S-N curve on the basis of test data. Usually, the residual strength is analyzed on the basis of a known S-N curve. In our work, an inverse approach is used: the S-N curve is derived from an analysis of the residual strength. This approach gives a good qualitative description of the process of decreasing residual strength and explains the existence of the fatigue limit. The estimates of parameters of the corresponding regression model can be interpreted as estimates of parameters of the local strength of components of the UFC. In order to approach the quantitative experimental estimates of the fatigue life, some ideas based on the mathematics of the semi-Markovian process are employed. Satisfactory results in processing experimental data on the fatigue life and residual strength of glass/epoxy laminates are obtained.

  3. A comparative analysis on the physicochemical properties of tick-borne encephalitis virus envelope protein residues that affect its antigenic properties.

    PubMed

    Bukin, Yu S; Dzhioev, Yu P; Tkachev, S E; Kozlova, I V; Paramonov, A I; Ruzek, D; Qu, Z; Zlobin, V I

    2017-06-15

    This work is dedicated to the study of the variability of the main antigenic envelope protein E among different strains of tick-borne encephalitis virus at the level of physical and chemical properties of the amino acid residues. E protein variants were extracted from the NCBI database. Four amino acid residue properties in the polypeptide sequences were investigated: the average volume of the amino acid residue in the protein tertiary structure, the number of amino acid residue hydrogen bond donors, the charge of the amino acid residue lateral radical and the dipole moment of the amino acid residue. These physico-chemical properties are involved in antigen-antibody interactions. As a result, 103 different variants of the antigenic determinants of the tick-borne encephalitis virus E protein were found, significantly different by physical and chemical properties of the amino acid residues in their structure. This means that some strains among the natural variants of tick-borne encephalitis virus can potentially escape the immune response induced by the standard vaccine. Copyright © 2017 Elsevier B.V. All rights reserved.

  4. SequenceCEROSENE: a computational method and web server to visualize spatial residue neighborhoods at the sequence level.

    PubMed

    Heinke, Florian; Bittrich, Sebastian; Kaiser, Florian; Labudde, Dirk

    2016-01-01

    To understand the molecular function of biopolymers, studying their structural characteristics is of central importance. Graphics programs are often utilized to conceive these properties, but with the increasing number of available structures in databases or structure models produced by automated modeling frameworks this process requires assistance from tools that allow automated structure visualization. In this paper a web server and its underlying method for generating graphical sequence representations of molecular structures is presented. The method, called SequenceCEROSENE (color encoding of residues obtained by spatial neighborhood embedding), retrieves the sequence of each amino acid or nucleotide chain in a given structure and produces a color coding for each residue based on three-dimensional structure information. From this, color-highlighted sequences are obtained, where residue coloring represents three-dimensional residue locations in the structure. This color encoding thus provides a one-dimensional representation, from which spatial interactions, proximity and relations between residues or entire chains can be deduced quickly and solely from color similarity. Furthermore, additional heteroatoms and chemical compounds bound to the structure, like ligands or coenzymes, are processed and reported as well. To provide free access to SequenceCEROSENE, a web server has been implemented that allows generating color codings for structures deposited in the Protein Data Bank or structure models uploaded by the user. Besides retrieving visualizations in popular graphic formats, underlying raw data can be downloaded as well. In addition, the server provides user interactivity with generated visualizations and the three-dimensional structure in question. Color encoded sequences generated by SequenceCEROSENE can help to quickly perceive the general characteristics of a structure of interest (or entire sets of complexes), thus supporting the researcher in the initial phase of structure-based studies. In this respect, the web server can be a valuable tool, as users are allowed to process multiple structures, quickly switch between results, and interact with generated visualizations in an intuitive manner. The SequenceCEROSENE web server is available at https://biosciences.hs-mittweida.de/seqcerosene.

  5. Bioinformatic prediction and in vivo validation of residue-residue interactions in human proteins

    NASA Astrophysics Data System (ADS)

    Jordan, Daniel; Davis, Erica; Katsanis, Nicholas; Sunyaev, Shamil

    2014-03-01

    Identifying residue-residue interactions in protein molecules is important for understanding both protein structure and function in the context of evolutionary dynamics and medical genetics. Such interactions can be difficult to predict using existing empirical or physical potentials, especially when residues are far from each other in sequence space. Using a multiple sequence alignment of 46 diverse vertebrate species we explore the space of allowed sequences for orthologous protein families. Amino acid changes that are known to damage protein function allow us to identify specific changes that are likely to have interacting partners. We fit the parameters of the continuous-time Markov process used in the alignment to conclude that these interactions are primarily pairwise, rather than higher order. Candidates for sites under pairwise epistasis are predicted, which can then be tested by experiment. We report the results of an initial round of in vivo experiments in a zebrafish model that verify the presence of multiple pairwise interactions predicted by our model. These experimentally validated interactions are novel, distant in sequence, and are not readily explained by known biochemical or biophysical features.

  6. Expression and characterization of a new esterase with GCSAG motif from a permafrost metagenomic library.

    PubMed

    Petrovskaya, Lada E; Novototskaya-Vlasova, Ksenia A; Spirina, Elena V; Durdenko, Ekaterina V; Lomakina, Galina Yu; Zavialova, Maria G; Nikolaev, Evgeny N; Rivkina, Elizaveta M

    2016-05-01

    As a result of construction and screening of a metagenomic library prepared from a permafrost-derived microcosm, we have isolated a novel gene coding for a putative lipolytic enzyme that belongs to the hormone-sensitive lipase family. It encodes a polypeptide of 343 amino acid residues whose amino acid sequence displays maximum likelihood with uncharacterized proteins from Sphingomonas species. A putative catalytic serine residue of PMGL2 resides in a new variant of a recently discovered GTSAG sequence in which a Thr residue is replaced by a Cys residue (GCSAG). The recombinant PMGL2 was produced in Escherichia coli cells and purified by Ni-affinity chromatography. The resulting protein preferably utilizes short-chain p-nitrophenyl esters (C4 and C8) and therefore is an esterase. It possesses maximum activity at 45°C in slightly alkaline conditions and has limited thermostability at higher temperatures. Activity of PMGL2 is stimulated in the presence of 0.25-1.5 M NaCl indicating the good salt tolerance of the new enzyme. Mass spectrometric analysis demonstrated that N-terminal methionine in PMGL2 is processed and cysteine residues do not form a disulfide bond. The results of the study demonstrate the significance of the permafrost environment as a unique genetic reservoir and its potential for metagenomic exploration. © FEMS 2016. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

  7. Characterization and functional analysis of hypoxia-inducible factor HIF1α and its inhibitor HIF1αn in tilapia.

    PubMed

    Li, Hong Lian; Gu, Xiao Hui; Li, Bi Jun; Chen, Xiao; Lin, Hao Ran; Xia, Jun Hong

    2017-01-01

    Hypoxia is a major cause of fish morbidity and mortality in the aquatic environment. Hypoxia-inducible factors are very important modulators in the transcriptional response to hypoxic stress. In this study, we characterized and conducted functional analysis of hypoxia-inducible factor HIF1α and its inhibitor HIF1αn in Nile tilapia (Oreochromis niloticus). By cloning and Sanger sequencing, we obtained the full length cDNA sequences for HIF1α (2686bp) and HIF1αn (1308bp), respectively. The CDS of HIF1α includes 15 exons encoding 768 amino acid residues and the CDS of HIF1αn contains 8 exons encoding 354 amino acid residues. The complete CDS sequences of HIF1α and HIF1αn cloned from tilapia shared very high homology with known genes from other fishes. HIF1α shows differentiated expression in different tissues (brain, heart, gill, spleen, liver) and at different hypoxia exposure times (6h, 12h, 24h). HIF1αn expression level under hypoxia is generally increased (6h, 12h, 24h) and shows extremely high upregulation in brain tissue under hypoxia. A functional determination site analysis in the protein sequences between fish and land animals identified 21 amino acid sites in HIF1α and 2 sites in HIF1αn as significantly associated sites (α = 0.05). Phylogenetic tree-based positive selection analysis suggested 22 sites in HIF1α as positively selected sites with a p-value of at least 95% for fish lineages compared to the land animals. Our study could be important for clarifying the mechanism of fish adaptation to aquatic hypoxia environment.

  8. Characterization and functional analysis of hypoxia-inducible factor HIF1α and its inhibitor HIF1αn in tilapia

    PubMed Central

    Li, Hong Lian; Gu, Xiao Hui; Li, Bi Jun; Chen, Xiao; Lin, Hao Ran; Xia, Jun Hong

    2017-01-01

    Hypoxia is a major cause of fish morbidity and mortality in the aquatic environment. Hypoxia-inducible factors are very important modulators in the transcriptional response to hypoxic stress. In this study, we characterized and conducted functional analysis of hypoxia-inducible factor HIF1α and its inhibitor HIF1αn in Nile tilapia (Oreochromis niloticus). By cloning and Sanger sequencing, we obtained the full length cDNA sequences for HIF1α (2686bp) and HIF1αn (1308bp), respectively. The CDS of HIF1α includes 15 exons encoding 768 amino acid residues and the CDS of HIF1αn contains 8 exons encoding 354 amino acid residues. The complete CDS sequences of HIF1α and HIF1αn cloned from tilapia shared very high homology with known genes from other fishes. HIF1α shows differentiated expression in different tissues (brain, heart, gill, spleen, liver) and at different hypoxia exposure times (6h, 12h, 24h). HIF1αn expression level under hypoxia is generally increased (6h, 12h, 24h) and shows extremely high upregulation in brain tissue under hypoxia. A functional determination site analysis in the protein sequences between fish and land animals identified 21 amino acid sites in HIF1α and 2 sites in HIF1αn as significantly associated sites (α = 0.05). Phylogenetic tree-based positive selection analysis suggested 22 sites in HIF1α as positively selected sites with a p-value of at least 95% for fish lineages compared to the land animals. Our study could be important for clarifying the mechanism of fish adaptation to aquatic hypoxia environment. PMID:28278251

  9. Molecular modeling and docking studies of human 5-hydroxytryptamine 2A (5-HT2A) receptor for the identification of hotspots for ligand binding.

    PubMed

    Kanagarajadurai, Karuppiah; Malini, Manoharan; Bhattacharya, Aditi; Panicker, Mitradas M; Sowdhamini, Ramanathan

    2009-12-01

    The serotonergic system has been implicated in emotional and cognitive function. In particular, 5-HT(2A) (5-hydroxytryptamine receptor 2A) is attributed to a number of disorders like schizophrenia, depression, eating disorders and anxiety. 5-HT(2A), being a GPCR (G-protein coupled receptor), is important in the pharmaceutical industry as a proven target for these disorders. Despite its extensive clinical importance, structural studies of this protein are lacking due to difficulties in determining its crystal structure. We have performed sequence analysis and molecular modeling of 5-HT(2A) that has revealed a set of conserved residues and motifs considered to play an important role in maintaining structural integrity and function of the receptor. The analysis also revealed a set of residues specific to the receptor which distinguishes them from other members of the subclass and their orthologs. Further, starting from the model structure of human 5-HT(2A) receptor, docking studies were attempted to envisage how it might interact with eight of its ligands (such as serotonin, dopamine, DOI, LSD, haloperidol, ketanserin, risperidone and clozapine). The binding studies of dopamine to 5-HT(2A) receptor can bring up better understanding in the etiology of a number of neurological disorders involving both of these receptors. Our sequence analysis and study of interactions of this receptor with other ligands reveal additional residue hotspots such as Asn 363 and Tyr 370. The function of these residues can be further analyzed by rational design of site-directed mutagenesis. Two distinct binding sites are identified which could play important roles in ligand binding and signaling.

  10. Structural studies of polypeptides: Mechanism of immunoglobin catalysis and helix propagation in hybrid sequence, disulfide containing peptides

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Storrs, Richard Wood

    1992-08-01

    Catalytic immunoglobin fragments were studied by Nuclear Magnetic Resonance spectroscopy to identify amino acid residues responsible for the catalytic activity. Small, hybrid sequence peptides were analyzed for helix propagation following covalent initiation and for activity related to the protein from which the helical sequence was derived. Hydrolysis of p-nitrophenyl carbonates and esters by specific immunoglobins is thought to involve charge complementarity. The pK of the transition state analog p-nitrophenyl phosphate bound to the immunoglobin fragment was determined by 31P-NMR to verify the juxtaposition of a positively charged amino acid to the binding/catalytic site. Optical studies of immunoglobin mediated photoreversal of cis, syn cyclobutane thymine dimers implicated tryptophan as the photosensitizing chromophore. Research shows the chemical environment of a single tryptophan residue is altered upon binding of the thymine dimer. This tryptophan residue was localized to within 20 Å of the binding site through the use of a nitroxide paramagnetic species covalently attached to the thymine dimer. A hybrid sequence peptide was synthesized based on the bee venom peptide apamin in which the helical residues of apamin were replaced with those from the recognition helix of the bacteriophage 434 repressor protein. Oxidation of the disulfide bonds occurred uniformly in the proper 1-11, 3-15 orientation, stabilizing the 434 sequence in an α-helix. The glycine residue stopped helix propagation. Helix propagation in 2,2,2-trifluoroethanol mixtures was investigated in a second hybrid sequence peptide using the apamin-derived disulfide scaffold and the S-peptide sequence. The helix-stop signal previously observed was not observed in the NMR NOESY spectrum. Helical connectivities were seen throughout the S-peptide sequence. The apamin/S-peptide hybrid bound to the S-protein (residues 21-166 of ribonuclease A) and reconstituted enzymatic activity.

  11. Structural studies of polypeptides: Mechanism of immunoglobin catalysis and helix propagation in hybrid sequence, disulfide containing peptides

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Storrs, R.W.

    1992-08-01

    Catalytic immunoglobin fragments were studied by Nuclear Magnetic Resonance spectroscopy to identify amino acid residues responsible for the catalytic activity. Small, hybrid sequence peptides were analyzed for helix propagation following covalent initiation and for activity related to the protein from which the helical sequence was derived. Hydrolysis of p-nitrophenyl carbonates and esters by specific immunoglobins is thought to involve charge complementarity. The pK of the transition state analog p-nitrophenyl phosphate bound to the immunoglobin fragment was determined by 31P-NMR to verify the juxtaposition of a positively charged amino acid to the binding/catalytic site. Optical studies of immunoglobin mediated photoreversal of cis, syn cyclobutane thymine dimers implicated tryptophan as the photosensitizing chromophore. Research shows the chemical environment of a single tryptophan residue is altered upon binding of the thymine dimer. This tryptophan residue was localized to within 20 Å of the binding site through the use of a nitroxide paramagnetic species covalently attached to the thymine dimer. A hybrid sequence peptide was synthesized based on the bee venom peptide apamin in which the helical residues of apamin were replaced with those from the recognition helix of the bacteriophage 434 repressor protein. Oxidation of the disulfide bonds occurred uniformly in the proper 1-11, 3-15 orientation, stabilizing the 434 sequence in an α-helix. The glycine residue stopped helix propagation. Helix propagation in 2,2,2-trifluoroethanol mixtures was investigated in a second hybrid sequence peptide using the apamin-derived disulfide scaffold and the S-peptide sequence. The helix-stop signal previously observed was not observed in the NMR NOESY spectrum. Helical connectivities were seen throughout the S-peptide sequence. The apamin/S-peptide hybrid bound to the S-protein (residues 21-166 of ribonuclease A) and reconstituted enzymatic activity.

  12. Molecular Cloning and Characterization of an Acetylcholinesterase cDNA in the Brown Planthopper, Nilaparvata lugens

    PubMed Central

    Yang, Zhifan; Chen, Jun; Chen, Yongqin; Jiang, Sijing

    2010-01-01

    A full cDNA encoding an acetylcholinesterase (AChE, EC 3.1.1.7) was cloned and characterized from the brown planthopper, Nilaparvata lugens Stål (Hemiptera: Delphacidae). The complete cDNA (2467 bp) contains a 1938-bp open reading frame encoding 646 amino acid residues. The amino acid sequence of the AChE deduced from the cDNA consists of 30 residues for a putative signal peptide and 616 residues for the mature protein with a predicted molecular weight of 69,418. The three residues (Ser242, Glu371, and His485) that putatively form the catalytic triad and the six Cys that form intra-subunit disulfide bonds are completely conserved, and 10 out of the 14 aromatic residues lining the active site gorge of the AChE are also conserved. Northern blot analysis of poly(A)+ RNA showed an approximately 2.6-kb transcript, and Southern blot analysis revealed there likely was just a single copy of this gene in N. lugens. The deduced protein sequence is most similar to AChE of Nephotettix cincticeps with 83% amino acid identity. Phylogenetic analysis constructed with 45 AChEs from 30 species showed that the deduced N. lugens AChE formed a cluster with the other 8 insect AChE2s. Additionally, the hypervariable region and amino acids specific to insect AChE2 also existed in the AChE of N. lugens. The results revealed that the AChE cDNA cloned in this work belongs to the insect AChE2 subgroup, which is orthologous to Drosophila AChE. Comparison of the AChEs between the susceptible and resistant strains revealed a point mutation, Gly185Ser, that is likely responsible for the insensitivity of the AChE to methamidophos in the resistant strain. PMID:20874389
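
    The length figures quoted in this abstract are mutually consistent, which a short check makes explicit. The function name is hypothetical, and the sketch assumes that the 1938-bp figure excludes the stop codon, since 1938 / 3 equals the stated 646 residues exactly; the abstract itself does not say so.

```python
def residues_from_orf(orf_bp: int, includes_stop: bool = False) -> int:
    """Number of encoded residues for a reading frame of orf_bp nucleotides.

    Assumption: one residue per three nucleotides; if the stop codon is
    counted in orf_bp, it is subtracted because it encodes no residue.
    """
    if orf_bp % 3 != 0:
        raise ValueError("reading frame length must be a multiple of 3")
    codons = orf_bp // 3
    return codons - 1 if includes_stop else codons


total_residues = residues_from_orf(1938)   # 1938 bp open reading frame
signal_peptide, mature_protein = 30, 616   # residue counts from the abstract
consistent = signal_peptide + mature_protein == total_residues
```

    The signal peptide and mature protein together account for all 646 residues, so no additional propeptide is implied by these numbers.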

  13. A highly Conserved Aspartic Acid Residue of the Chitosanase from Bacillus Sp. TS Is Involved in the Substrate Binding.

    PubMed

    Zhou, Zhanping; Zhao, Shuangzhi; Liu, Yang; Chang, Zhengying; Ma, Yanhe; Li, Jian; Song, Jiangning

    2016-11-01

    The chitosanase from Bacillus sp. TS (CsnTS) is an enzyme belonging to the glycoside hydrolase family 8. The sequence of CsnTS shares 98% identity with the chitosanase from Bacillus sp. K17. Crystallography analysis and site-directed mutagenesis of the chitosanase from Bacillus sp. K17 identified the important residues involved in the catalytic interaction and substrate binding. However, despite progress in understanding the catalytic mechanism of the chitosanase from the family GH8, the functional roles of some residues that are highly conserved throughout this family have not been fully elucidated. This study focused on one of these residues, i.e., the aspartic acid residue at position 318. We found that apart from asparagine, mutation of Asp318 resulted in significant loss of enzyme activity. In-depth investigations showed that mutation of this residue not only impaired enzymatic activity but also affected substrate binding. Taken together, our results showed that Asp318 plays an important role in CsnTS activity.

  14. Sequence search on a supercomputer.

    PubMed

    Gotoh, O; Tagashira, Y

    1986-01-10

    A set of programs was developed for searching nucleic acid and protein sequence data bases for sequences similar to a given sequence. The programs, written in FORTRAN 77, were optimized for vector processing on a Hitachi S810-20 supercomputer. A search of a 500-residue protein sequence against the entire PIR data base Ver. 1.0 (1) (0.5 M residues) is carried out in a CPU time of 45 sec. About 4 min is required for an exhaustive search of a 1500-base nucleotide sequence against all mammalian sequences (1.2M bases) in Genbank Ver. 29.0. The CPU time is reduced to about a quarter with a faster version.

  15. Saccharomyces cerevisiae SSB1 protein and its relationship to nucleolar RNA-binding proteins.

    PubMed

    Jong, A Y; Clark, M W; Gilbert, M; Oehm, A; Campbell, J L

    1987-08-01

    To better define the function of Saccharomyces cerevisiae SSB1, an abundant single-stranded nucleic acid-binding protein, we determined the nucleotide sequence of the SSB1 gene and compared it with those of other proteins of known function. The amino acid sequence contains 293 amino acid residues and has an Mr of 32,853. There are several stretches of sequence characteristic of other eucaryotic single-stranded nucleic acid-binding proteins. At the amino terminus, residues 39 to 54 are highly homologous to a peptide in calf thymus UP1 and UP2 and a human heterogeneous nuclear ribonucleoprotein. Residues 125 to 162 constitute a fivefold tandem repeat of the sequence RGGFRG, the composition of which suggests a nucleic acid-binding site. Near the C terminus, residues 233 to 245 are homologous to several RNA-binding proteins. Of 18 C-terminal residues, 10 are acidic, a characteristic of the procaryotic single-stranded DNA-binding proteins and eucaryotic DNA- and RNA-binding proteins. In addition, examination of the subcellular distribution of SSB1 by immunofluorescence microscopy indicated that SSB1 is a nuclear protein, predominantly located in the nucleolus. Sequence homologies and the nucleolar localization make it likely that SSB1 functions in RNA metabolism in vivo, although an additional role in DNA metabolism cannot be excluded.

  16. Purification of cold-shock-like proteins from Stigmatella aurantiaca - molecular cloning and characterization of the cspA gene.

    PubMed

    Stamm, I; Leclerque, A; Plaga, W

    1999-09-01

    Prominent low-molecular-weight proteins were isolated from vegetative cells of the myxobacterium Stigmatella aurantiaca and were found to be members of the cold-shock protein family. A first gene of this family (cspA) was cloned and sequenced. It encodes a protein of 68 amino acid residues that displays up to 71% sequence identity with other bacterial cold-shock(-like) proteins. A cysteine residue within the RNP-2 motif is a peculiarity of Stigmatella CspA. A cspA::(Δtrp-lacZ) fusion gene construct was introduced into Stigmatella by electroporation, a method that has not been used previously for this strain. Analysis of the resultant transformants revealed that cspA transcription occurs at high levels during vegetative growth at 20 and 32 °C, and during fruiting body formation.

  17. Accurate Sample Assignment in a Multiplexed, Ultrasensitive, High-Throughput Sequencing Assay for Minimal Residual Disease.

    PubMed

    Bartram, Jack; Mountjoy, Edward; Brooks, Tony; Hancock, Jeremy; Williamson, Helen; Wright, Gary; Moppett, John; Goulden, Nick; Hubank, Mike

    2016-07-01

    High-throughput sequencing (HTS) (next-generation sequencing) of the rearranged Ig and T-cell receptor genes promises to be less expensive and more sensitive than current methods of monitoring minimal residual disease (MRD) in patients with acute lymphoblastic leukemia. However, the adoption of new approaches by clinical laboratories requires careful evaluation of all potential sources of error and the development of strategies to ensure the highest accuracy. Timely and efficient clinical use of HTS platforms will depend on combining multiple samples (multiplexing) in each sequencing run. Here we examine the Ig heavy-chain gene HTS on the Illumina MiSeq platform for MRD. We identify errors associated with multiplexing that could potentially impact the accuracy of MRD analysis. We optimize a strategy that combines high-purity, sequence-optimized oligonucleotides, dual indexing, and an error-aware demultiplexing approach to minimize errors and maximize sensitivity. We present a probability-based, demultiplexing pipeline Error-Aware Demultiplexer that is suitable for all MiSeq strategies and accurately assigns samples to the correct identifier without excessive loss of data. Finally, using controls quantified by digital PCR, we show that HTS-MRD can accurately detect as few as 1 in 10^6 copies of specific leukemic MRD.

  18. Variant amino acid residues alter the enzyme activity of peanut type 2 Diacylglycerol Acyltransferases

    USDA-ARS's Scientific Manuscript database

    Diacylglycerol acyltransferase (DGAT) catalyzes the final, rate-limiting step in triacylglycerol (TAG) biosynthesis via the acyl-CoA-dependent acylation of diacylglycerol. In this study, type-2 DGAT2 genes were cloned from eleven peanut cultivars. Sequence analysis revealed at least eight peanut D...

  19. [Molecular and structural-biological analysis of Nicotiana plumbaginifolia mutants for identification of the site of beta-tubulins interaction with dinitroanilines and phosphorothioamidates].

    PubMed

    Emets, A I; Baiard, U V; Nyporko, A Iu; Swire-Clark, G A; Blium, Ia B

    2009-01-01

    This work describes the identification of point mutation locations on the beta-tubulin molecules of amiprophosmethyl- and trifluralin-resistant Nicotiana plumbaginifolia lines. In the first case the mutation is a substitution of a serine residue by proline at position 248; in the second case it is a substitution of phenylalanine by serine at position 317 of the beta-tubulin amino acid sequence. Three-dimensional models were reconstructed of the beta-tubulin molecule from Chlamydomonas, which has a well-known mutation conferring dinitroaniline and phosphorothioamidate resistance (substitution of a lysine residue by methionine at position 350), and of beta-tubulin from Nicotiana plumbaginifolia. Based on analysis of the site of interaction with dinitroanilines and phosphorothioamidates on the Chlamydomonas beta-tubulin molecule, it was concluded that the mutations found in Nicotiana plumbaginifolia beta-tubulin affect amino acid residues that participate in forming this site.

  20. Isolation and sequence analysis of the complete NS1 and VP2 genes of canine parvovirus from domestic dogs in 2013 and 2014 in China.

    PubMed

    Wang, Hualei; Jin, Hongli; Li, Qian; Zhao, Guoxing; Cheng, Nan; Feng, Na; Zheng, Xuexing; Wang, Jianzhong; Zhao, Yongkun; Li, Ling; Cao, Zengguo; Yan, Feihu; Wang, Lina; Wang, Tiecheng; Gao, Yuwei; Yang, Songtao; Xia, Xianzhu

    2016-02-01

    Canine parvovirus (CPV) can cause severe disease in animals and continuously generates new variant and recombinant strains in dogs that have a strong impact on sanitation. It is therefore necessary to investigate epidemic CPV strains to improve our understanding of CPV transmission and epidemic behavior. However, most studies have focused on the analysis of VP2, and therefore, information about recombination and relationships between strains is still lacking. Here, 14 strains of CPV were isolated from domestic dogs suspected of hosting CPV between 2013 and 2014 in China. The complete NS1 and VP2 genes were sequenced and analyzed. The results suggest that the new CPV-2a and new CPV-2b types are the prevalent strains in China. In addition to a few mutations (residues 19, 544, 545, 572 and 583 of NS1 and residues 267, 370, 377 and 440 of VP2) that were preserved during transmission, new mutations (residues 60, 630 of NS1, and residues 21, 310 of VP2) were found in the isolated strains. A phylogenetic tree based on VP2 sequences illustrated that the new CPV-2a and new CPV-2b strains from China form single clusters that are distinct from lineages from other countries. Moreover, recombination between the new CPV-2a and new CPV-2b types was also identified in the isolated strains. Due to differences in selection pressures or recombination, there were a small number of inconsistencies between the phylogenetic trees for VP2 and NS1, which indicated that phylogenetic relationships based on VP2 might not be representative of those based on NS1. The data indicated that mutations and recombination are constantly occurring along with the spread of CPV in China.

  1. Coevolutionary modeling of protein sequences: Predicting structure, function, and mutational landscapes

    NASA Astrophysics Data System (ADS)

    Weigt, Martin

    Over the last years, biological research has been revolutionized by experimental high-throughput techniques, in particular by next-generation sequencing technology. Unprecedented amounts of data are accumulating, and there is a growing demand for computational methods unveiling the information hidden in raw data, thereby increasing our understanding of complex biological systems. Statistical-physics models based on the maximum-entropy principle have, in the last few years, played an important role in this context. To give a specific example, proteins and many non-coding RNA show a remarkable degree of structural and functional conservation in the course of evolution, despite a large variability in amino acid sequences. We have developed a statistical-mechanics inspired inference approach - called Direct-Coupling Analysis - to link this sequence variability (easy to observe in sequence alignments, which are available in public sequence databases) to bio-molecular structure and function. In my presentation I will show how this methodology can be used (i) to infer contacts between residues and thus to guide tertiary and quaternary protein structure prediction and RNA structure prediction, (ii) to discriminate interacting from non-interacting protein families, and thus to infer conserved protein-protein interaction networks, and (iii) to reconstruct mutational landscapes and thus to predict the phenotypic effect of mutations. References [1] M. Figliuzzi, H. Jacquier, A. Schug, O. Tenaillon and M. Weigt, "Coevolutionary landscape inference and the context-dependence of mutations in beta-lactamase TEM-1", Mol. Biol. Evol. (2015), doi: 10.1093/molbev/msv211 [2] E. De Leonardis, B. Lutz, S. Ratz, S. Cocco, R. Monasson, A. Schug, M. Weigt, "Direct-Coupling Analysis of nucleotide coevolution facilitates RNA secondary and tertiary structure prediction", Nucleic Acids Research (2015), doi: 10.1093/nar/gkv932 [3] F. Morcos, A. Pagnani, B. Lunt, A. Bertolino, D. Marks, C. Sander, R. Zecchina, J.N. Onuchic, T. Hwa, M. Weigt, "Direct-coupling analysis of residue co-evolution captures native contacts across many protein families", Proc. Natl. Acad. Sci. 108, E1293-E1301 (2011).

  2. T4-Like Genome Organization of the Escherichia coli O157:H7 Lytic Phage AR1

    PubMed Central

    Liao, Wei-Chao; Ng, Wailap Victor; Lin, I-Hsuan; Syu, Wan-Jr; Liu, Tze-Tze; Chang, Chuan-Hsiung

    2011-01-01

    We report the genome organization and analysis of the first completely sequenced T4-like phage, AR1, of Escherichia coli O157:H7. Unlike most of the other sequenced phages of O157:H7, which belong to the temperate Podoviridae and Siphoviridae families, AR1 is a T4-like phage known to efficiently infect this pathogenic bacterial strain. The 167,435-bp AR1 genome is currently the largest among all the sequenced E. coli O157:H7 phages. It carries a total of 281 potential open reading frames (ORFs) and 10 putative tRNA genes. Of these, 126 predicted proteins could be classified into six viral orthologous group categories, with at least 18 proteins of the structural protein category having been detected by tandem mass spectrometry. Comparative genomic analysis of AR1 and four other completely sequenced T4-like genomes (RB32, RB69, T4, and JS98) indicated that they share a well-organized and highly conserved core genome, particularly in the regions encoding DNA replication and virion structural proteins. The major diverse features between these phages include the modules of distal tail fibers and the types and numbers of internal proteins, tRNA genes, and mobile elements. Codon usage analysis suggested that the presence of AR1-encoded tRNAs may be relevant to the codon usage of structural proteins. Furthermore, protein sequence analysis of AR1 gp37, a potential receptor binding protein, indicated that eight residues in the C terminus are unique to O157:H7 T4-like phages AR1 and PP01. These residues are known to be located in the T4 receptor recognition domain, and they may contribute to specificity for adsorption to the O157:H7 strain. PMID:21507986

  3. A Quantitative Tool to Distinguish Isobaric Leucine and Isoleucine Residues for Mass Spectrometry-Based De Novo Monoclonal Antibody Sequencing

    NASA Astrophysics Data System (ADS)

    Poston, Chloe N.; Higgs, Richard E.; You, Jinsam; Gelfanova, Valentina; Hale, John E.; Knierman, Michael D.; Siegel, Robert; Gutierrez, Jesus A.

    2014-07-01

    De novo sequencing by mass spectrometry (MS) allows for the determination of the complete amino acid (AA) sequence of a given protein based on the mass difference of detected ions from MS/MS fragmentation spectra. The technique relies on obtaining specific masses that can be attributed to characteristic theoretical masses of AAs. A major limitation of de novo sequencing by MS is the inability to distinguish between the isobaric residues leucine (Leu) and isoleucine (Ile). Incorrect identification of Ile as Leu or vice versa often results in loss of activity in recombinant antibodies. This functional ambiguity is commonly resolved with costly and time-consuming AA mutation and peptide sequencing experiments. Here, we describe a set of orthogonal biochemical protocols, which experimentally determine the identity of Ile or Leu residues in monoclonal antibodies (mAb) based on the selectivity that leucine aminopeptidase shows for n-terminal Leu residues and the cleavage preference for Leu by chymotrypsin. The resulting observations are combined with germline frequencies and incorporated into a logistic regression model, called Predictor for Xle Sites (PXleS) to provide a statistical likelihood for the identity of Leu at an ambiguous site. We demonstrate that PXleS can generate a probability for an Xle site in mAbs with 96% accuracy. The implementation of PXleS precludes the expression of several possible sequences and, therefore, reduces the overall time and resources required to go from spectra generation to a biologically active sequence for a mAb when an Ile or Leu residue is in question.

  4. A quantitative tool to distinguish isobaric leucine and isoleucine residues for mass spectrometry-based de novo monoclonal antibody sequencing.

    PubMed

    Poston, Chloe N; Higgs, Richard E; You, Jinsam; Gelfanova, Valentina; Hale, John E; Knierman, Michael D; Siegel, Robert; Gutierrez, Jesus A

    2014-07-01

    De novo sequencing by mass spectrometry (MS) allows for the determination of the complete amino acid (AA) sequence of a given protein based on the mass difference of detected ions from MS/MS fragmentation spectra. The technique relies on obtaining specific masses that can be attributed to characteristic theoretical masses of AAs. A major limitation of de novo sequencing by MS is the inability to distinguish between the isobaric residues leucine (Leu) and isoleucine (Ile). Incorrect identification of Ile as Leu or vice versa often results in loss of activity in recombinant antibodies. This functional ambiguity is commonly resolved with costly and time-consuming AA mutation and peptide sequencing experiments. Here, we describe a set of orthogonal biochemical protocols, which experimentally determine the identity of Ile or Leu residues in monoclonal antibodies (mAb) based on the selectivity that leucine aminopeptidase shows for n-terminal Leu residues and the cleavage preference for Leu by chymotrypsin. The resulting observations are combined with germline frequencies and incorporated into a logistic regression model, called Predictor for Xle Sites (PXleS) to provide a statistical likelihood for the identity of Leu at an ambiguous site. We demonstrate that PXleS can generate a probability for an Xle site in mAbs with 96% accuracy. The implementation of PXleS precludes the expression of several possible sequences and, therefore, reduces the overall time and resources required to go from spectra generation to a biologically active sequence for a mAb when an Ile or Leu residue is in question.

  5. Library analysis of SCHEMA-guided protein recombination.

    PubMed

    Meyer, Michelle M; Silberg, Jonathan J; Voigt, Christopher A; Endelman, Jeffrey B; Mayo, Stephen L; Wang, Zhen-Gang; Arnold, Frances H

    2003-08-01

    The computational algorithm SCHEMA was developed to estimate the disruption caused when amino acid residues that interact in the three-dimensional structure of a protein are inherited from different parents upon recombination. To evaluate how well SCHEMA predicts disruption, we have shuffled the distantly-related beta-lactamases PSE-4 and TEM-1 at 13 sites to create a library of 2^14 (16,384) chimeras and examined which ones retain lactamase function. Sequencing the genes from ampicillin-selected clones revealed that the percentage of functional clones decreased exponentially with increasing calculated disruption (E = the number of residue-residue contacts that are broken upon recombination). We also found that chimeras with low E have a higher probability of maintaining lactamase function than chimeras with the same effective level of mutation but chosen at random from the library. Thus, the simple distance metric used by SCHEMA to identify interactions and compute E allows one to predict which chimera sequences are most likely to retain their function. This approach can be used to evaluate crossover sites for recombination and to create highly mosaic, folded chimeras.
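
    The disruption count E described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that a contact counts as "broken" when the residue pair found in the chimera at the two contacting positions does not occur at those positions in any parent; the abstract does not spell out this rule. The function names, the four-residue toy parents, the block layout, and the contact list are all hypothetical. The last line only checks the arithmetic that 13 crossover sites give 14 blocks, and therefore 2^14 chimeras for two parents.

```python
def build_chimera(parents, block_of, parent_for_block):
    """Assemble a chimera by taking each block from the chosen parent.

    parents          -- aligned parent sequences of equal length
    block_of         -- block index for every alignment position
    parent_for_block -- index of the parent that donates each block
    """
    return "".join(
        parents[parent_for_block[block_of[pos]]][pos]
        for pos in range(len(block_of))
    )


def schema_disruption(parents, chimera, contacts):
    """Count contacts whose residue pair is absent from every parent.

    ASSUMPTION: this 'pair not seen in any parent' rule is our reading of
    'contacts that are broken upon recombination'.
    """
    broken = 0
    for i, j in contacts:
        pair = (chimera[i], chimera[j])
        if all((p[i], p[j]) != pair for p in parents):
            broken += 1
    return broken


# Hypothetical toy data: two parents, two blocks, three structural contacts.
parents = ["AKLE", "GRLD"]
block_of = [0, 0, 1, 1]
contacts = [(0, 1), (1, 2), (0, 3)]

chimera = build_chimera(parents, block_of, [0, 1])  # block 0 from parent 0, block 1 from parent 1
print(chimera, schema_disruption(parents, chimera, contacts))

# 13 crossover sites split the sequence into 14 blocks; two parents per block.
library_size = 2 ** 14
```

    In the toy case only the contact between positions 0 and 3 spans the crossover with a pair seen in neither parent, so the chimera has E = 1, while either unrecombined parent has E = 0.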

  6. Folding of polyglutamine chains

    NASA Astrophysics Data System (ADS)

    Chopra, Manan; Reddy, Allam S.; Abbott, N. L.; de Pablo, J. J.

    2008-10-01

    Long polyglutamine chains have been associated with a number of neurodegenerative diseases. These include Huntington's disease, where expanded polyglutamine (PolyQ) sequences longer than 36 residues are correlated with the onset of symptoms. In this paper we study the folding pathway of a 54-residue PolyQ chain into a β-helical structure. Transition path sampling Monte Carlo simulations are used to generate unbiased reactive pathways between unfolded configurations and the folded β-helical structure of the polyglutamine chain. The folding process is examined in both explicit water and an implicit solvent. Both models reveal that the formation of a few critical contacts is necessary and sufficient for the molecule to fold. Once the primary contacts are formed, the fate of the protein is sealed and it is largely committed to fold. We find that, consistent with emerging hypotheses about PolyQ aggregation, a stable β-helical structure could serve as the nucleus for subsequent polymerization of amyloid fibrils. Our results indicate that PolyQ sequences shorter than 36 residues cannot form that nucleus, and it is also shown that specific mutations inferred from an analysis of the simulated folding pathway exacerbate its stability.

  7. Sequence and properties of HMW subunit 1Bx20 from pasta wheat (Triticum durum) which is associated with poor end use properties.

    PubMed

    Shewry, P R; Gilbert, S M; Savage, A W J; Tatham, A S; Wan, Y-F; Belton, P S; Wellner, N; D'Ovidio, R; Békés, F; Halford, N G

    2003-02-01

    The gene encoding high-molecular-weight (HMW) subunit 1Bx20 was isolated from durum wheat cv. Lira. It encodes a mature protein of 774 amino acid residues with an M(r) of 83,913. Comparison with the sequence of subunit 1Bx7 showed over 96% identity, the main difference being the substitution of two cysteine residues in the N-terminal domain of subunit 1Bx7 with tyrosine residues in 1Bx20. Comparison of the structures and stabilities of the two subunits purified from wheat using Fourier-transform infra-red and circular dichroism spectroscopy showed no significant differences. However, incorporation of subunit 1Bx7 into a base flour gave increased dough strength and stability measured by Mixograph analysis, while incorporation of subunit 1Bx20 resulted in small positive or negative effects on the parameters measured. It is concluded that the different effects of the two subunits could relate to the differences in their cysteine contents, thereby affecting the cross-linking and hence properties of the glutenin polymers.

  8. Protein 3D Structure Computed from Evolutionary Sequence Variation

    PubMed Central

    Sheridan, Robert; Hopf, Thomas A.; Pagnani, Andrea; Zecchina, Riccardo; Sander, Chris

    2011-01-01

    The evolutionary trajectory of a protein through sequence space is constrained by its function. Collections of sequence homologs record the outcomes of millions of evolutionary experiments in which the protein evolves according to these constraints. Deciphering the evolutionary record held in these sequences and exploiting it for predictive and engineering purposes presents a formidable challenge. The potential benefit of solving this challenge is amplified by the advent of inexpensive high-throughput genomic sequencing. In this paper we ask whether we can infer evolutionary constraints from a set of sequence homologs of a protein. The challenge is to distinguish true co-evolution couplings from the noisy set of observed correlations. We address this challenge using a maximum entropy model of the protein sequence, constrained by the statistics of the multiple sequence alignment, to infer residue pair couplings. Surprisingly, we find that the strength of these inferred couplings is an excellent predictor of residue-residue proximity in folded structures. Indeed, the top-scoring residue couplings are sufficiently accurate and well-distributed to define the 3D protein fold with remarkable accuracy. We quantify this observation by computing, from sequence alone, all-atom 3D structures of fifteen test proteins from different fold classes, ranging in size from 50 to 260 residues, including a G-protein coupled receptor. These blinded inferences are de novo, i.e., they do not use homology modeling or sequence-similar fragments from known structures. The co-evolution signals provide sufficient information to determine accurate 3D protein structure to 2.7–4.8 Å Cα-RMSD error relative to the observed structure, over at least two-thirds of the protein (method called EVfold, details at http://EVfold.org). This discovery provides insight into essential interactions constraining protein evolution and will facilitate a comprehensive survey of the universe of protein structures, new strategies in protein and drug design, and the identification of functional genetic variants in normal and disease genomes. PMID:22163331
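
    The "observed correlations" that this abstract contrasts with inferred couplings can be made concrete with a small Python sketch. It computes the mutual information between two columns of a multiple sequence alignment, which is a plain pairwise correlation measure and not the maximum entropy model the authors use. The function name and the four-sequence toy alignment are hypothetical, and no sequence weighting, pseudocounts, or gap handling is applied.

```python
from collections import Counter
from math import log


def column_mutual_information(alignment, i, j):
    """Mutual information (in nats) between alignment columns i and j.

    alignment -- list of equal-length aligned sequences
    This is the raw pairwise statistic only; it does not separate direct
    couplings from indirect correlations.
    """
    n = len(alignment)
    freq_i = Counter(seq[i] for seq in alignment)
    freq_j = Counter(seq[j] for seq in alignment)
    freq_ij = Counter((seq[i], seq[j]) for seq in alignment)

    mi = 0.0
    for (a, b), count in freq_ij.items():
        p_ab = count / n
        mi += p_ab * log(p_ab / ((freq_i[a] / n) * (freq_j[b] / n)))
    return mi


# Hypothetical toy alignment: columns 0 and 1 always change together,
# column 2 varies independently of column 0, column 3 is fully conserved.
toy_msa = ["ACDA", "ACEA", "GTDA", "GTEA"]

for pair in [(0, 1), (0, 2), (0, 3)]:
    print(pair, round(column_mutual_information(toy_msa, *pair), 4))
```

    The covarying pair scores log 2, while the independent and conserved pairs score zero. In real alignments such scores also pick up indirect effects through third positions, which is the noise the maximum entropy approach is designed to remove.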

  9. [Molecular cloning and characterization of an acetylcholinesterase gene Dd-ace-2 from sweet potato stem nematode Ditylenchus destructor].

    PubMed

    Ding, Zhong; Peng, Deliang; Huang, Wenkun; He, Wenting; Gao, Bida

    2008-02-01

    A cDNA, named Dd-ace-2, encoding an acetylcholinesterase (AChE, EC 3.1.1.7), was isolated from the sweet potato stem nematode, Ditylenchus destructor. The nucleotide and amino acid sequences of different nematode species were compared and analyzed with the DNAMAN 5.0 and MEGA 3.0 software packages. The results showed that the complete nucleotide sequence of the Dd-ace-2 gene of Ditylenchus destructor contains 2425 base pairs, from which 734 amino acids were deduced (GenBank accession No. EF583058). The amino acid sequence homology of Dd-ace-2 between Ditylenchus destructor and Meloidogyne incognita, Caenorhabditis elegans, and Dictyocaulus viviparus was 48.0%, 42.7%, and 42.1%, respectively. The mature acetylcholinesterase of Ditylenchus destructor may be encoded by the first 701 residues of the deduced 734 amino acids. The conserved motifs involved in the catalytic triad, the choline binding site and 10 aromatic residues lining the catalytic gorge were present in the Dd-ace-2 deduced protein. Phylogenetic analysis based on AChEs of other nematodes and species showed that the deduced AChE clustered with the ACE-2s.

  10. Phylogenetic analysis of Austrian canine distemper virus strains from clinical samples from dogs and wild carnivores.

    PubMed

    Benetka, V; Leschnik, M; Affenzeller, N; Möstl, K

    2011-04-09

    Austrian field cases of canine distemper (14 dogs, one badger [Meles meles] and one stone marten [Martes foina]) from 2002 to 2007 were investigated and the case histories were summarised briefly. Phylogenetic analysis of fusion (F) and haemagglutinin (H) gene sequences revealed different canine distemper virus (CDV) lineages circulating in Austria. The majority of CDV strains detected from 2002 to 2004 were well embedded in the European lineage. One Austrian canine sample detected in 2003, with a high similarity to Hungarian sequences from 2005 to 2006, could be assigned to the Arctic group (phocine distemper virus type 2-like). The two canine sequences from 2007 formed a clearly distinct group flanked by sequences detected previously in China and the USA on an intermediate position between the European wildlife and the Asia-1 cluster. The Austrian wildlife strains (2006 and 2007) could be assigned to the European wildlife group and were most closely related to, yet clearly different from, the 2007 canine samples. To elucidate the epidemiological role of Austrian wildlife in the transmission of the disease to dogs and vice versa, H protein residues related to receptor and host specificity (residues 530 and 549) were analysed. All samples showed the amino acids expected for their host of origin, with the exception of a canine sequence from 2007, which had an intermediate position between wildlife and canine viral strains. In the period investigated, canine strains circulating in Austria could be assigned to four different lineages reflecting both a high diversity and probably different origins of virus introduction to Austria in different years.

  11. Comparison of topological clustering within protein networks using edge metrics that evaluate full sequence, full structure, and active site microenvironment similarity.

    PubMed

    Leuthaeuser, Janelle B; Knutson, Stacy T; Kumar, Kiran; Babbitt, Patricia C; Fetrow, Jacquelyn S

    2015-09-01

    The development of accurate protein function annotation methods has emerged as a major unsolved biological problem. Protein similarity networks, one approach to function annotation via annotation transfer, group proteins into similarity-based clusters. An underlying assumption is that the edge metric used to identify such clusters correlates with functional information. In this contribution, this assumption is evaluated by observing topologies in similarity networks using three different edge metrics: sequence (BLAST), structure (TM-Align), and active site similarity (active site profiling, implemented in DASP). Network topologies for four well-studied protein superfamilies (enolase, peroxiredoxin (Prx), glutathione transferase (GST), and crotonase) were compared with curated functional hierarchies and structure. As expected, network topology differs, depending on edge metric; comparison of topologies provides valuable information on structure/function relationships. Subnetworks based on active site similarity correlate with known functional hierarchies at a single edge threshold more often than sequence- or structure-based networks. Sequence- and structure-based networks are useful for identifying sequence and domain similarities and differences; therefore, it is important to consider the clustering goal before deciding appropriate edge metric. Further, conserved active site residues identified in enolase and GST active site subnetworks correspond with published functionally important residues. Extension of this analysis yields predictions of functionally determinant residues for GST subgroups. These results support the hypothesis that active site similarity-based networks reveal clusters that share functional details and lay the foundation for capturing functionally relevant hierarchies using an approach that is both automatable and can deliver greater precision in function annotation than current similarity-based methods. 
    © 2015 The Authors. Protein Science published by Wiley Periodicals, Inc. on behalf of The Protein Society.

  12. Purification of an alpha amylase from Aspergillus flavus NSH9 and molecular characterization of its nucleotide gene sequence.

    PubMed

    Karim, Kazi Muhammad Rezaul; Husaini, Ahmad; Sing, Ngieng Ngui; Sinang, Fazia Mohd; Roslan, Hairul Azman; Hussain, Hasnain

    2018-04-01

    In this study, an alpha-amylase enzyme from a locally isolated Aspergillus flavus NSH9 was purified and characterized. The extracellular α-amylase was purified by ammonium sulfate precipitation and anion-exchange chromatography at a final yield of 2.55-fold and recovery of 11.73%. The molecular mass of the purified α-amylase was estimated to be 54 kDa using SDS-PAGE, and the enzyme exhibited optimal catalytic activity at pH 5.0 and a temperature of 50 °C. The enzyme was also thermally stable at 50 °C, with 87% residual activity after 60 min. As a calcium-containing metalloenzyme, the purified α-amylase showed significantly increased enzyme activity in the presence of Ca2+ ions. Further gene isolation and characterization showed that the α-amylase gene of A. flavus NSH9 contains eight introns and an open reading frame that encodes 499 amino acids, with the first 21 amino acids presumed to be a signal peptide. Analysis of the deduced peptide sequence showed the presence of three conserved catalytic residues of α-amylase, two Ca2+-binding sites, seven conserved peptide sequences, and several other properties that indicate the protein belongs to glycosyl hydrolase family 13 and is capable of acting on α-1,4-bonds only. Based on sequence similarity, the deduced peptide sequence of A. flavus NSH9 α-amylase was also found to carry two potential surface/secondary-binding site (SBS) residues (Trp 237 and Tyr 409) that might play crucial roles in both the enzyme activity and the binding of starch granules.

  13. Comparison of topological clustering within protein networks using edge metrics that evaluate full sequence, full structure, and active site microenvironment similarity

    PubMed Central

    Leuthaeuser, Janelle B; Knutson, Stacy T; Kumar, Kiran; Babbitt, Patricia C; Fetrow, Jacquelyn S

    2015-01-01

    The development of accurate protein function annotation methods has emerged as a major unsolved biological problem. Protein similarity networks, one approach to function annotation via annotation transfer, group proteins into similarity-based clusters. An underlying assumption is that the edge metric used to identify such clusters correlates with functional information. In this contribution, this assumption is evaluated by observing topologies in similarity networks using three different edge metrics: sequence (BLAST), structure (TM-Align), and active site similarity (active site profiling, implemented in DASP). Network topologies for four well-studied protein superfamilies (enolase, peroxiredoxin (Prx), glutathione transferase (GST), and crotonase) were compared with curated functional hierarchies and structure. As expected, network topology differs, depending on edge metric; comparison of topologies provides valuable information on structure/function relationships. Subnetworks based on active site similarity correlate with known functional hierarchies at a single edge threshold more often than sequence- or structure-based networks. Sequence- and structure-based networks are useful for identifying sequence and domain similarities and differences; therefore, it is important to consider the clustering goal before deciding appropriate edge metric. Further, conserved active site residues identified in enolase and GST active site subnetworks correspond with published functionally important residues. Extension of this analysis yields predictions of functionally determinant residues for GST subgroups. These results support the hypothesis that active site similarity-based networks reveal clusters that share functional details and lay the foundation for capturing functionally relevant hierarchies using an approach that is both automatable and can deliver greater precision in function annotation than current similarity-based methods. PMID:26073648

  14. Cloning and Sequence Analysis of Vibrio halioticoli Genes Encoding Three Types of Polyguluronate Lyase.

    PubMed

    Sugimura; Sawabe; Ezura

    2000-01-01

    The alginate lyase-coding genes of Vibrio halioticoli IAM 14596(T), which was isolated from the gut of the abalone Haliotis discus hannai, were cloned using plasmid vector pUC 18, and expressed in Escherichia coli. Three alginate lyase-positive clones, pVHB, pVHC, and pVHE, were obtained, and all clones expressed the enzyme activity specific for polyguluronate. Three genes, alyVG1, alyVG2, and alyVG3, encoding polyguluronate lyase were sequenced: alyVG1 from pVHB was composed of a 1056-bp open reading frame (ORF) encoding 352 amino acid residues; alyVG2 gene from pVHC was composed of a 993-bp ORF encoding 331 amino acid residues; and alyVG3 gene from pVHE was composed of a 705-bp ORF encoding 235 amino acid residues. Comparison of nucleotide and deduced amino acid sequences among AlyVG1, AlyVG2, and AlyVG3 revealed low homologies. The identity value between AlyVG1 and AlyVG2 was 18.7%, and that between AlyVG2 and AlyVG3 was 17.0%. A higher identity value (26.0%) was observed between AlyVG1 and AlyVG3. Sequence comparison among known polyguluronate lyases including AlyVG1, AlyVG2, and AlyVG3 also did not reveal an identical region in these sequences. However, AlyVG1 showed the highest identity value (36.2%) and the highest similarity (73.3%) to AlyA from Klebsiella pneumoniae. A consensus region comprising nine amino acids (YFKAGXYXQ) in the carboxy-terminal region previously reported by Mallisard and colleagues was observed only in AlyVG1 and AlyVG2.
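
    The ORF lengths and residue counts reported above agree if each length is read as three nucleotides per encoded residue. A minimal Python sketch of that check follows; the dictionary layout and function names are illustrative assumptions, and the abstract does not say whether the reported lengths include the stop codon.

```python
# Reported ORF lengths (bp) and residue counts, copied from the abstract.
REPORTED = {
    "alyVG1": (1056, 352),
    "alyVG2": (993, 331),
    "alyVG3": (705, 235),
}


def codons_in_orf(orf_bp):
    """Return the number of complete codons in an ORF of the given length."""
    if orf_bp % 3 != 0:
        raise ValueError(f"ORF length {orf_bp} is not a multiple of 3")
    return orf_bp // 3


def check_reported(records):
    """Map each gene to True when bp / 3 equals the reported residue count."""
    return {
        gene: codons_in_orf(bp) == residues
        for gene, (bp, residues) in records.items()
    }


if __name__ == "__main__":
    for gene, consistent in check_reported(REPORTED).items():
        print(gene, "consistent" if consistent else "inconsistent")
```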

  15. Understanding sequence similarity and framework analysis between centromere proteins using computational biology.

    PubMed

    Doss, C George Priya; Chakrabarty, Chiranjib; Debajyoti, C; Debottam, S

    2014-11-01

    Open questions about the recruitment pathways, cell cycle regulation mechanisms, spindle checkpoint assembly, and chromosome segregation roles of centromere proteins are a centre of attention in cancer research. With established databases now available, a range of computational platforms makes it possible to examine almost all the physiological and biochemical evidence in disease-associated phenotypes. Using existing computational methods, we have utilized the amino acid residues to understand the similarity within the evolutionary variance of different associated centromere proteins. This study of sequence similarity, protein-protein networking, co-expression analysis, and evolutionary trajectory of centromere proteins will speed up the understanding of centromere biology and will create a road map for researchers who are beginning clinical sequencing work using centromere proteins.

  16. Fuzzy cluster analysis of simple physicochemical properties of amino acids for recognizing secondary structure in proteins.

    PubMed Central

    Mocz, G.

    1995-01-01

    Fuzzy cluster analysis has been applied to the 20 amino acids by using 65 physicochemical properties as a basis for classification. The clustering products, the fuzzy sets (i.e., classical sets with associated membership functions), have provided a new measure of amino acid similarities for use in protein folding studies. This work demonstrates that fuzzy sets of simple molecular attributes, when assigned to amino acid residues in a protein's sequence, can predict the secondary structure of the sequence with reasonable accuracy. An approach is presented for discriminating standard folding states, using near-optimum information splitting in half-overlapping segments of the sequence of assigned membership functions. The method is applied to a nonredundant set of 252 proteins and yields approximately 73% matching for correctly predicted and correctly rejected residues with approximately 60% overall success rate for the correctly recognized ones in three folding states: alpha-helix, beta-strand, and coil. The most useful attributes for discriminating these states appear to be related to size, polarity, and thermodynamic factors. Van der Waals volume, apparent average thickness of surrounding molecular free volume, and a measure of dimensionless surface electron density can explain approximately 95% of prediction results. Hydrogen bonding and hydrophobicity indices do not yet enable clear clustering and prediction. PMID:7549882

  17. Novel Molecular Method for Identification of Streptococcus pneumoniae Applicable to Clinical Microbiology and 16S rRNA Sequence-Based Microbiome Studies

    PubMed Central

    Scholz, Christian F. P.; Poulsen, Knud

    2012-01-01

    The close phylogenetic relationship of the important pathogen Streptococcus pneumoniae and several species of commensal streptococci, particularly Streptococcus mitis and Streptococcus pseudopneumoniae, and the recently demonstrated sharing of genes and phenotypic traits previously considered specific for S. pneumoniae hamper the exact identification of S. pneumoniae. Based on sequence analysis of 16S rRNA genes of a collection of 634 streptococcal strains, identified by multilocus sequence analysis, we detected a cytosine at position 203 present in all 440 strains of S. pneumoniae but replaced by an adenosine residue in all strains representing other species of mitis group streptococci. The S. pneumoniae-specific sequence signature could be demonstrated by sequence analysis or indirectly by restriction endonuclease digestion of a PCR amplicon covering the site. The S. pneumoniae-specific signature offers an inexpensive means for validation of the identity of clinical isolates and should be used as an integrated marker in the annotation procedure employed in 16S rRNA-based molecular studies of complex human microbiotas. This may avoid frequent misidentifications such as those we demonstrate to have occurred in previous reports and in reference sequence databases. PMID:22442329
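
    The signature described above amounts to a single-base test, which can be sketched as follows. This is an illustration only: the function name is hypothetical, and the sketch assumes the input sequence is already numbered in the same coordinate system the authors used for position 203, which the abstract does not specify.

```python
SIGNATURE_POSITION = 203  # 1-based position reported in the abstract


def has_pneumoniae_signature(seq_16s, position=SIGNATURE_POSITION):
    """Return True if the base at the given 1-based position is cytosine.

    Assumes seq_16s uses the same numbering as the reported signature.
    """
    if len(seq_16s) < position:
        raise ValueError("sequence is too short to contain the position")
    return seq_16s[position - 1].upper() == "C"


if __name__ == "__main__":
    # Synthetic placeholder sequences, not real 16S rRNA gene data.
    with_c = "A" * 202 + "C" + "A" * 10
    with_a = "A" * 213
    print(has_pneumoniae_signature(with_c), has_pneumoniae_signature(with_a))
```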

  18. Protein evolution analysis of S-hydroxynitrile lyase by complete sequence design utilizing the INTMSAlign software.

    PubMed

    Nakano, Shogo; Asano, Yasuhisa

    2015-02-03

    Development of software and methods for design of complete sequences of functional proteins could contribute to studies of protein engineering and protein evolution. To this end, we developed the INTMSAlign software, and used it to design functional proteins and evaluate their usefulness. The software could assign both consensus and correlation residues of target proteins. We generated three protein sequences with S-selective hydroxynitrile lyase (S-HNL) activity, which we call designed S-HNLs; these proteins folded as efficiently as the native S-HNL. Sequence and biochemical analysis of the designed S-HNLs suggested that accumulation of neutral mutations occurs during the process of S-HNLs evolution from a low-activity form to a high-activity (native) form. Taken together, our results demonstrate that our software and the associated methods could be applied not only to design of complete sequences, but also to predictions of protein evolution, especially within families such as esterases and S-HNLs.

  19. Protein evolution analysis of S-hydroxynitrile lyase by complete sequence design utilizing the INTMSAlign software

    NASA Astrophysics Data System (ADS)

    Nakano, Shogo; Asano, Yasuhisa

    2015-02-01

    Development of software and methods for design of complete sequences of functional proteins could contribute to studies of protein engineering and protein evolution. To this end, we developed the INTMSAlign software, and used it to design functional proteins and evaluate their usefulness. The software could assign both consensus and correlation residues of target proteins. We generated three protein sequences with S-selective hydroxynitrile lyase (S-HNL) activity, which we call designed S-HNLs; these proteins folded as efficiently as the native S-HNL. Sequence and biochemical analysis of the designed S-HNLs suggested that accumulation of neutral mutations occurs during the process of S-HNLs evolution from a low-activity form to a high-activity (native) form. Taken together, our results demonstrate that our software and the associated methods could be applied not only to design of complete sequences, but also to predictions of protein evolution, especially within families such as esterases and S-HNLs.

  20. Transmembrane Domains of Attraction on the TSH Receptor

    PubMed Central

    Ali, M. Rejwan; Mezei, Mihaly; Davies, Terry F.

    2015-01-01

    The TSH receptor (TSHR) has the propensity to form dimers and oligomers. Our data using ectodomain-truncated TSHRs indicated that the predominant interfaces for oligomerization reside in the transmembrane (TM) domain. To map the potentially interacting residues, we first performed in silico studies of the TSHR transmembrane domain using a homology model and using Brownian dynamics (BD). The cluster of dimer conformations obtained from BD analysis indicated that TM1 made contact with TM4 and two residues in TM2 made contact with TM5. To confirm the proximity of these contact residues, we then generated cysteine mutants at all six contact residues predicted by the BD analysis and performed cysteine cross-linking studies. These results showed that the predicted helices in the protomer were indeed involved in proximity interactions. Furthermore, an alternative experimental approach, receptor truncation experiments and LH receptor sequence substitution experiments, identified TM1 harboring a major region involved in TSHR oligomerization, in agreement with the conclusion from the cross-linking studies. Point mutations of the predicted interacting residues did not yield a substantial decrease in oligomerization, unlike the truncation of the TM1, so we concluded that constitutive oligomerization must involve interfaces forming domains of attraction in a cooperative manner that is not dominated by interactions between specific residues. PMID:25406938

  1. Protein structure based prediction of catalytic residues.

    PubMed

    Fajardo, J Eduardo; Fiser, Andras

    2013-02-22

    Worldwide structural genomics projects continue to release new protein structures at an unprecedented pace, so far nearly 6000, but only about 60% of these proteins have any sort of functional annotation. We explored a range of features that can be used for the prediction of functional residues given a known three-dimensional structure. These features include various centrality measures of nodes in graphs of interacting residues: closeness, betweenness and page-rank centrality. We also analyzed the distance of functional amino acids to the general center of mass (GCM) of the structure, relative solvent accessibility (RSA), and the use of relative entropy as a measure of sequence conservation. From the selected features, neural networks were trained to identify catalytic residues. We found that using distance to the GCM together with amino acid type provides a good discriminant function, when combined independently with sequence conservation. Using an independent test set of 29 annotated protein structures, the method returned 411 of the initial 9262 residues as the most likely to be involved in function. The output 411 residues contain 70 of the annotated 111 catalytic residues. This represents an approximately 14-fold enrichment of catalytic residues on the entire input set (corresponding to a sensitivity of 63% and a precision of 17%), a performance competitive with that of other state-of-the-art methods. We found that several of the graph based measures utilize the same underlying feature of protein structures, which can be simply and more effectively captured with the distance to GCM definition. This also has the added advantage of simplicity and easy implementation. Meanwhile sequence conservation remains by far the most influential feature in identifying functional residues. We also found that due to the rapid changes in size and composition of sequence databases, conservation calculations must be recalibrated for specific reference databases.
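
    The sensitivity, precision, and fold enrichment quoted above follow directly from the four counts in the abstract. The short sketch below reproduces the arithmetic; the function and variable names are illustrative, and the enrichment is taken here as precision divided by the background rate of catalytic residues, which is an assumption about how the authors defined it.

```python
def enrichment_metrics(total_residues, annotated, predicted, true_positives):
    """Return (sensitivity, precision, fold_enrichment) from raw counts."""
    sensitivity = true_positives / annotated    # recovered / all catalytic
    precision = true_positives / predicted      # recovered / all predicted
    background = annotated / total_residues     # catalytic rate in the input
    return sensitivity, precision, precision / background


if __name__ == "__main__":
    # Counts reported in the abstract: 9262 residues, 111 annotated catalytic,
    # 411 predicted, 70 of them correct.
    sens, prec, fold = enrichment_metrics(9262, 111, 411, 70)
    print(f"sensitivity={sens:.2f} precision={prec:.2f} enrichment={fold:.1f}x")
```

    With the reported counts this gives a sensitivity of about 0.63, a precision of about 0.17, and an enrichment of about 14.2-fold, matching the figures in the abstract.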

  2. Winnowing DNA for Rare Sequences: Highly Specific Sequence and Methylation Based Enrichment

    PubMed Central

    Thompson, Jason D.; Shibahara, Gosuke; Rajan, Sweta; Pel, Joel; Marziali, Andre

    2012-01-01

    Rare mutations in cell populations are known to be hallmarks of many diseases and cancers. Similarly, differential DNA methylation patterns arise in rare cell populations with diagnostic potential such as fetal cells circulating in maternal blood. Unfortunately, the frequency of alleles with diagnostic potential, relative to wild-type background sequence, is often well below the frequency of errors in currently available methods for sequence analysis, including very high throughput DNA sequencing. We demonstrate a DNA preparation and purification method that through non-linear electrophoretic separation in media containing oligonucleotide probes, achieves 10,000 fold enrichment of target DNA with single nucleotide specificity, and 100 fold enrichment of unmodified methylated DNA differing from the background by the methylation of a single cytosine residue. PMID:22355378

  3. Identification of Delta5-fatty acid desaturase from the cellular slime mold Dictyostelium discoideum.

    PubMed

    Saito, T; Ochiai, H

    1999-10-01

    cDNA fragments putatively encoding amino acid sequences characteristic of the fatty acid desaturase were obtained using expressed sequence tag (EST) information of the Dictyostelium cDNA project. Using this sequence, we have determined the cDNA sequence and genomic sequence of a desaturase. The cloned cDNA is 1489 nucleotides long and the deduced amino acid sequence comprised 464 amino acid residues containing an N-terminal cytochrome b5 domain. The whole sequence was 38.6% identical to the initially identified Delta5-desaturase of Mortierella alpina. We have confirmed its function as Delta5-desaturase by an overexpression mutation in D. discoideum and also by a gain-of-function mutation in the yeast Saccharomyces cerevisiae. Analysis of the lipids from transformed D. discoideum and yeast demonstrated the accumulation of Delta5-desaturated products. This is the first report concerning fatty acid desaturase in cellular slime molds.

  4. CombAlign: a code for generating a one-to-many sequence alignment from a set of pairwise structure-based sequence alignments.

    PubMed

    Zhou, Carol L Ecale

    2015-01-01

    In order to better define regions of similarity among related protein structures, it is useful to identify the residue-residue correspondences among proteins. Few codes exist for constructing a one-to-many multiple sequence alignment derived from a set of structure or sequence alignments, and a need was evident for creating such a tool for combining pairwise structure alignments that would allow for insertion of gaps in the reference structure. This report describes a new Python code, CombAlign, which takes as input a set of pairwise sequence alignments (which may be structure based) and generates a one-to-many, gapped, multiple structure- or sequence-based sequence alignment (MSSA). The use and utility of CombAlign was demonstrated by generating gapped MSSAs using sets of pairwise structure-based sequence alignments between structure models of the matrix protein (VP40) and pre-small/secreted glycoprotein (sGP) of Reston Ebolavirus and the corresponding proteins of several other filoviruses. The gapped MSSAs revealed structure-based residue-residue correspondences, which enabled identification of structurally similar versus differing regions in the Reston proteins compared to each of the other corresponding proteins. CombAlign is a new Python code that generates a one-to-many, gapped, multiple structure- or sequence-based sequence alignment (MSSA) given a set of pairwise sequence alignments (which may be structure based). CombAlign has utility in assisting the user in distinguishing structurally conserved versus divergent regions on a reference protein structure relative to other closely related proteins. CombAlign was developed in Python 2.6, and the source code is available for download from the GitHub code repository.
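
    The general idea of merging pairwise alignments into a one-to-many alignment that allows gaps in the reference can be sketched as below. This is not CombAlign's source code or necessarily its algorithm; it is a minimal illustration in which the function name, the input format (pairs of equal-length gapped strings sharing one reference), and the left-justified padding of insertions are all assumptions.

```python
def combine_pairwise(alignments, gap="-"):
    """Merge pairwise alignments that share one reference into gapped rows.

    alignments: list of (ref_aligned, other_aligned) string pairs. Every
    ref_aligned must spell the same reference once gaps are removed.
    Returns [reference_row, row_for_pair_1, row_for_pair_2, ...].
    """
    ref_seq = alignments[0][0].replace(gap, "")
    n = len(ref_seq)
    parsed = []
    for ref_aln, other_aln in alignments:
        if len(ref_aln) != len(other_aln):
            raise ValueError("aligned strings in a pair differ in length")
        if ref_aln.replace(gap, "") != ref_seq:
            raise ValueError("pairs do not share the same reference")
        inserts = [""] * (n + 1)  # residues inserted before ref position i
        matched = []              # character aligned to each ref position
        pos = 0
        for r, o in zip(ref_aln, other_aln):
            if r == gap:
                if o != gap:
                    inserts[pos] += o
            else:
                matched.append(o)
                pos += 1
        parsed.append((inserts, matched))

    # Each insertion slot is as wide as the longest insertion seen there.
    widths = [max(len(ins[i]) for ins, _ in parsed) for i in range(n + 1)]

    rows = ["".join(gap * widths[i] + ref_seq[i] for i in range(n))
            + gap * widths[n]]
    for inserts, matched in parsed:
        row = "".join(inserts[i].ljust(widths[i], gap) + matched[i]
                      for i in range(n))
        rows.append(row + inserts[n].ljust(widths[n], gap))
    return rows


if __name__ == "__main__":
    # Toy pairs against the reference ACDE; the first has an insertion (X).
    pairs = [("AC-DE", "ACXDE"), ("ACDE", "A-DE")]
    for row in combine_pairwise(pairs):
        print(row)
    # prints AC-DE, ACXDE, A--DE (one per line)
```

    Because insertions from different pairwise alignments are only padded into a shared slot and never aligned to each other, the columns remain meaningful only relative to the reference.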

  5. Local backbone structure prediction of proteins

    PubMed Central

    De Brevern, Alexandre G.; Benros, Cristina; Gautier, Romain; Valadié, Hélène; Hazout, Serge; Etchebest, Catherine

    2004-01-01

    A statistical analysis of the PDB structures has led us to define a new set of small 3D structural prototypes called Protein Blocks (PBs). This structural alphabet includes 16 PBs, each defined by the (φ, Ψ) dihedral angles of 5 consecutive residues. The amino acid distributions observed in sequence windows encompassing these PBs are used to predict, by a Bayesian approach, the local 3D structure of proteins from the sole knowledge of their sequences. LocPred is software that allows users to submit a protein sequence and performs a prediction in terms of PBs. The prediction results are given both textually and graphically. PMID:15724288

  6. Integrative View of the Diversity and Evolution of SWEET and SemiSWEET Sugar Transporters

    PubMed Central

    Jia, Baolei; Zhu, Xiao Feng; Pu, Zhong Ji; Duan, Yu Xi; Hao, Lu Jiang; Zhang, Jie; Chen, Li-Qing; Jeon, Che Ok; Xuan, Yuan Hu

    2017-01-01

    Sugars Will Eventually be Exported Transporter (SWEET) and SemiSWEET are recently characterized families of sugar transporters in eukaryotes and prokaryotes, respectively. SemiSWEETs contain 3 transmembrane helices (TMHs), while SWEETs contain 7. Here, we performed sequence-based comprehensive analyses for SWEETs and SemiSWEETs across the biosphere. In total, 3,249 proteins were identified and ≈60% proteins were found in green plants and Oomycota, which include a number of important plant pathogens. Protein sequence similarity networks indicate that proteins from different organisms are significantly clustered. Of note, SemiSWEETs with 3 or 4 TMHs that may fuse to SWEET were identified in plant genomes. 7-TMH SWEETs were found in bacteria, implying that SemiSWEET can be fused directly in prokaryote. 15-TMH extraSWEET and 25-TMH superSWEET were also observed in wild rice and oomycetes, respectively. The transporters can be classified into 4, 2, 2, and 2 clades in plants, Metazoa, unicellular eukaryotes, and prokaryotes, respectively. The consensus and coevolution of amino acids in SWEETs were identified by multiple sequence alignments. The functions of the highly conserved residues were analyzed by molecular dynamics analysis. The 19 most highly conserved residues in the SWEETs were further confirmed by point mutagenesis using SWEET1 from Arabidopsis thaliana. The results proved that the conserved residues located in the extrafacial gate (Y57, G58, G131, and P191), the substrate binding pocket (N73, N192, and W176), and the intrafacial gate (P43, Y83, F87, P145, M161, P162, and Q202) play important roles for substrate recognition and transport processes. Taken together, our analyses provide a foundation for understanding the diversity, classification, and evolution of SWEETs and SemiSWEETs using large-scale sequence analysis and further show that gene duplication and gene fusion are important factors driving the evolution of SWEETs. PMID:29326750
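
    Deriving a consensus from a multiple sequence alignment, as mentioned above, can be illustrated with a simple majority-residue calculation. The sketch below is not the authors' method; the function name and the choice to report, for each column, the fraction of all rows carrying the most common non-gap residue are assumptions, and the toy alignment is invented.

```python
from collections import Counter


def consensus(alignment, gap="-"):
    """Return (consensus_string, fractions) for equal-length aligned rows.

    For each column, the consensus is the most common non-gap residue and
    the fraction is its count divided by the number of rows.
    """
    if len({len(row) for row in alignment}) != 1:
        raise ValueError("alignment rows must be non-empty and equal length")
    residues, fractions = [], []
    for column in zip(*alignment):
        counts = Counter(c for c in column if c != gap)
        if not counts:
            residues.append(gap)      # all-gap column
            fractions.append(0.0)
            continue
        residue, count = counts.most_common(1)[0]
        residues.append(residue)
        fractions.append(count / len(column))
    return "".join(residues), fractions


if __name__ == "__main__":
    toy_alignment = ["MKYG", "MKFG", "MRYG"]  # hypothetical aligned rows
    print(consensus(toy_alignment))
```

    Columns whose fraction is close to 1.0 are the highly conserved positions that a study like this one would carry forward to mutagenesis.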

  7. Integrative View of the Diversity and Evolution of SWEET and SemiSWEET Sugar Transporters.

    PubMed

    Jia, Baolei; Zhu, Xiao Feng; Pu, Zhong Ji; Duan, Yu Xi; Hao, Lu Jiang; Zhang, Jie; Chen, Li-Qing; Jeon, Che Ok; Xuan, Yuan Hu

    2017-01-01

    Sugars Will Eventually be Exported Transporter (SWEET) and SemiSWEET are recently characterized families of sugar transporters in eukaryotes and prokaryotes, respectively. SemiSWEETs contain 3 transmembrane helices (TMHs), while SWEETs contain 7. Here, we performed sequence-based comprehensive analyses for SWEETs and SemiSWEETs across the biosphere. In total, 3,249 proteins were identified and ≈60% proteins were found in green plants and Oomycota, which include a number of important plant pathogens. Protein sequence similarity networks indicate that proteins from different organisms are significantly clustered. Of note, SemiSWEETs with 3 or 4 TMHs that may fuse to SWEET were identified in plant genomes. 7-TMH SWEETs were found in bacteria, implying that SemiSWEET can be fused directly in prokaryote. 15-TMH extraSWEET and 25-TMH superSWEET were also observed in wild rice and oomycetes, respectively. The transporters can be classified into 4, 2, 2, and 2 clades in plants, Metazoa, unicellular eukaryotes, and prokaryotes, respectively. The consensus and coevolution of amino acids in SWEETs were identified by multiple sequence alignments. The functions of the highly conserved residues were analyzed by molecular dynamics analysis. The 19 most highly conserved residues in the SWEETs were further confirmed by point mutagenesis using SWEET1 from Arabidopsis thaliana. The results proved that the conserved residues located in the extrafacial gate (Y57, G58, G131, and P191), the substrate binding pocket (N73, N192, and W176), and the intrafacial gate (P43, Y83, F87, P145, M161, P162, and Q202) play important roles for substrate recognition and transport processes. Taken together, our analyses provide a foundation for understanding the diversity, classification, and evolution of SWEETs and SemiSWEETs using large-scale sequence analysis and further show that gene duplication and gene fusion are important factors driving the evolution of SWEETs.

  8. Cloning and sequencing of the pheP gene, which encodes the phenylalanine-specific transport system of Escherichia coli.

    PubMed Central

    Pi, J; Wookey, P J; Pittard, A J

    1991-01-01

    The phenylalanine-specific permease gene (pheP) of Escherichia coli has been cloned and sequenced. The gene was isolated on a 6-kb Sau3AI fragment from a chromosomal library, and its presence was verified by complementation of a mutant lacking the functional phenylalanine-specific permease. Subcloning from this fragment localized the pheP gene on a 2.7-kb HindIII-HindII fragment. The nucleotide sequence of this 2.7-kb region was determined. An open reading frame was identified which extends from a putative start point of translation (GTG at position 636) to a termination signal (TAA at position 2010). The assignment of the GTG as the initiation codon was verified by site-directed mutagenesis of the initiation codon and by introducing a chain termination mutation into the pheP-lacZ fusion construct. A single initiation site of transcription 30 bp upstream of the start point of translation was identified by the primer extension analysis. The pheP structural gene consists of 1,374 nucleotides specifying a protein of 458 amino acid residues. The PheP protein is very hydrophobic (71% nonpolar residues). A topological model predicted from the sequence analysis defines 12 transmembrane segments. This protein is highly homologous with the AroP (general aromatic transport) system of E. coli (59.6% identity) and to a lesser extent with the yeast permeases CAN1 (arginine), PUT4 (proline), and HIP1 (histidine) of Saccharomyces cerevisiae. PMID:1711024

  9. Purification, cloning, characterization and essential amino acid residues analysis of a new ι-carrageenase from Cellulophaga sp. QY3.

    PubMed

    Ma, Su; Duan, Gaofei; Chai, Wengang; Geng, Cunliang; Tan, Yulong; Wang, Lushan; Le Sourd, Frédéric; Michel, Gurvan; Yu, Wengong; Han, Feng

    2013-01-01

    ι-Carrageenases belong to family 82 of glycoside hydrolases that degrade sulfated galactans in the red algae known as ι-carrageenans. The catalytic mechanism and some substrate-binding residues of family GH82 have been studied but the substrate recognition and binding mechanism of this family have not been fully elucidated. We report here the purification, cloning and characterization of a new ι-carrageenase CgiA_Ce from the marine bacterium Cellulophaga sp. QY3. CgiA_Ce was the most thermostable carrageenase described so far. It was most active at 50°C and pH 7.0 and retained more than 70% of the original activity after incubation at 50°C for 1 h at pH 7.0 or at pH 5.0-10.6 for 24 h. CgiA_Ce was an endo-type ι-carrageenase; it cleaved ι-carrageenan yielding neo-ι-carrabiose and neo-ι-carratetraose as the main end products, and neo-ι-carrahexaose was the minimum substrate. Sequence analysis and structure modeling showed that CgiA_Ce is indeed a new member of family GH82. Moreover, sequence analysis of ι-carrageenases revealed that the amino acid residues at subsites -1 and +1 were more conserved than those at other subsites. Site-directed mutagenesis followed by kinetic analysis identified three strictly conserved residues at subsites -1 and +1 of ι-carrageenases, G228, Y229 and R254 in CgiA_Ce, which played important roles for substrate binding. Furthermore, our results suggested that Y229 and R254 in CgiA_Ce interacted specifically with the sulfate groups of the sugar moieties located at subsites -1 and +1, shedding light on the mechanism of ι-carrageenan recognition in the family GH82.

  10. Purification, Cloning, Characterization and Essential Amino Acid Residues Analysis of a New ι-Carrageenase from Cellulophaga sp. QY3

    PubMed Central

    Ma, Su; Duan, Gaofei; Chai, Wengang; Geng, Cunliang; Tan, Yulong; Wang, Lushan; Le Sourd, Frédéric; Michel, Gurvan; Yu, Wengong; Han, Feng

    2013-01-01

    ι-Carrageenases belong to family 82 of glycoside hydrolases that degrade sulfated galactans in the red algae known as ι-carrageenans. The catalytic mechanism and some substrate-binding residues of family GH82 have been studied but the substrate recognition and binding mechanism of this family have not been fully elucidated. We report here the purification, cloning and characterization of a new ι-carrageenase CgiA_Ce from the marine bacterium Cellulophaga sp. QY3. CgiA_Ce was the most thermostable carrageenase described so far. It was most active at 50°C and pH 7.0 and retained more than 70% of the original activity after incubation at 50°C for 1 h at pH 7.0 or at pH 5.0–10.6 for 24 h. CgiA_Ce was an endo-type ι-carrageenase; it cleaved ι-carrageenan yielding neo-ι-carrabiose and neo-ι-carratetraose as the main end products, and neo-ι-carrahexaose was the minimum substrate. Sequence analysis and structure modeling showed that CgiA_Ce is indeed a new member of family GH82. Moreover, sequence analysis of ι-carrageenases revealed that the amino acid residues at subsites −1 and +1 were more conserved than those at other subsites. Site-directed mutagenesis followed by kinetic analysis identified three strictly conserved residues at subsites −1 and +1 of ι-carrageenases, G228, Y229 and R254 in CgiA_Ce, which played important roles for substrate binding. Furthermore, our results suggested that Y229 and R254 in CgiA_Ce interacted specifically with the sulfate groups of the sugar moieties located at subsites −1 and +1, shedding light on the mechanism of ι-carrageenan recognition in the family GH82. PMID:23741363

  11. Isolation and determination of the primary structure of a lectin protein from the serum of the American alligator (Alligator mississippiensis).

    PubMed

    Darville, Lancia N F; Merchant, Mark E; Maccha, Venkata; Siddavarapu, Vivekananda Reddy; Hasan, Azeem; Murray, Kermit K

    2012-02-01

    Mass spectrometry in conjunction with de novo sequencing was used to determine the amino acid sequence of a 35kDa lectin protein isolated from the serum of the American alligator that exhibits binding to mannose. The protein N-terminal sequence was determined using Edman degradation and enzymatic digestion with different proteases was used to generate peptide fragments for analysis by liquid chromatography tandem mass spectrometry (LC MS/MS). Separate analysis of the protein digests with multiple enzymes enhanced the protein sequence coverage. De novo sequencing was accomplished using MASCOT Distiller and PEAKS software and the sequences were searched against the NCBI database using MASCOT and BLAST to identify homologous peptides. MS analysis of the intact protein indicated that it is present primarily as monomer and dimer in vitro. The isolated 35kDa protein was ~98% sequenced and found to have 313 amino acids and nine cysteine residues and was identified as an alligator lectin. The alligator lectin sequence was aligned with other lectin sequences using DIALIGN and ClustalW software and was found to exhibit 58% and 59% similarity to both human and mouse intelectin-1. The alligator lectin exhibited strong binding affinities toward mannan and mannose as compared to other tested carbohydrates. Copyright © 2011 Elsevier Inc. All rights reserved.

  12. [Oligonucleotide derivatives in the nucleic acid hybridization analysis. III. Synthesis and investigation of properties of oligonucleotides, bearing bifunctional non-nucleotide insert].

    PubMed

    Kupriushkin, M S; Pyshnyĭ, D V

    2012-01-01

    Non-nucleotide phosphoramidites having a branched backbone with different positions of functional groups were synthesized. The phosphoramidite monomers obtained contain an intercalator moiety, 6-chloro-2-methoxyacridine, and an additional hydroxyl residue protected with a dimethoxytrityl group or with a tert-butyldimethylsilyl group for post-synthetic modification. The synthesized oligothymidylates contain one or more modified units in different positions of the sequence. The melting temperature and thermodynamic parameters (change in enthalpy and entropy) of formation of the complementary duplexes formed by the modified oligonucleotides were determined. The introduction of an intercalating residue causes a significant stabilization of DNA duplexes. It is shown that the efficiency of the fluorescence of the acridine residue in the oligonucleotide conjugate changes significantly upon hybridization with DNA.

  13. Characterization of a GHF45 cellulase, AkEG21, from the common sea hare Aplysia kurodai

    NASA Astrophysics Data System (ADS)

    Rahman, Mohammad; Inoue, Akira; Ojima, Takao

    2014-08-01

    The common sea hare Aplysia kurodai is known to be a good source for the enzymes degrading seaweed polysaccharides. Recently four cellulases, i.e., 95 kDa, 66 kDa, 45 kDa and 21 kDa enzymes, were isolated from A. kurodai (Tsuji et al., PLoS ONE, 8, e65418, 2013). The former three cellulases were regarded as glycosyl-hydrolase-family 9 (GHF9) enzymes, while the 21 kDa cellulase was suggested to be a GHF45 enzyme. The 21 kDa cellulase was significantly heat stable, and appeared to be advantageous in performing heterogeneous expression and protein-engineering study. In the present study, we determined some enzymatic properties of the 21 kDa cellulase and cloned its cDNA to provide the basis for the protein engineering study of this cellulase. The purified 21 kDa enzyme, termed AkEG21 in the present study, hydrolyzed carboxymethyl cellulose with an optimal pH and temperature at 4.5 and 40°C, respectively. AkEG21 was considerably heat-stable, i.e., it was not inactivated by the incubation at 55°C for 30 min. AkEG21 degraded phosphoric-acid-swollen cellulose producing cellotriose and cellobiose as major end products but hardly degraded oligosaccharides smaller than tetrasaccharide. This indicated that AkEG21 is an endolytic β-1,4-glucanase (EC 3.2.1.4). A cDNA of 1,013 bp encoding AkEG21 was amplified by PCR and the amino-acid sequence of 197 residues was deduced. The sequence comprised the initiation Met, the putative signal peptide of 16 residues for secretion and the catalytic domain of 180 residues, arranged in this order from the N-terminus. The sequence of the catalytic domain showed 47-62% amino-acid identities to those of GHF45 cellulases reported in other mollusks. Both the catalytic residues and the N-glycosylation residues known in other GHF45 cellulases were conserved in AkEG21. Phylogenetic analysis for the amino-acid sequences suggested the close relation between AkEG21 and fungal GHF45 cellulases.

  14. Saccharomyces cerevisiae SSB1 protein and its relationship to nucleolar RNA-binding proteins.

    PubMed Central

    Jong, A Y; Clark, M W; Gilbert, M; Oehm, A; Campbell, J L

    1987-01-01

    To better define the function of Saccharomyces cerevisiae SSB1, an abundant single-stranded nucleic acid-binding protein, we determined the nucleotide sequence of the SSB1 gene and compared it with those of other proteins of known function. The amino acid sequence contains 293 amino acid residues and has an Mr of 32,853. There are several stretches of sequence characteristic of other eucaryotic single-stranded nucleic acid-binding proteins. At the amino terminus, residues 39 to 54 are highly homologous to a peptide in calf thymus UP1 and UP2 and a human heterogeneous nuclear ribonucleoprotein. Residues 125 to 162 constitute a fivefold tandem repeat of the sequence RGGFRG, the composition of which suggests a nucleic acid-binding site. Near the C terminus, residues 233 to 245 are homologous to several RNA-binding proteins. Of 18 C-terminal residues, 10 are acidic, a characteristic of the procaryotic single-stranded DNA-binding proteins and eucaryotic DNA- and RNA-binding proteins. In addition, examination of the subcellular distribution of SSB1 by immunofluorescence microscopy indicated that SSB1 is a nuclear protein, predominantly located in the nucleolus. Sequence homologies and the nucleolar localization make it likely that SSB1 functions in RNA metabolism in vivo, although an additional role in DNA metabolism cannot be excluded. PMID:2823109

  15. Important role of N108 residue in binding of bovine foamy virus transactivator Tas to viral promoters.

    PubMed

    Bing, Tiejun; Zhang, Suzhen; Liu, Xiaojuan; Liang, Zhibin; Shao, Peng; Zhang, Song; Qiao, Wentao; Tan, Juan

    2016-06-30

    Bovine foamy virus (BFV) encodes the transactivator BTas, which enhances viral gene transcription by binding to the long terminal repeat promoter and the internal promoter. In this study, we investigated the different replication capacities of two similar BFV full-length DNA clones, pBS-BFV-Y and pBS-BFV-B. Here, functional analysis of several chimeric clones revealed a major role for the C-terminal region of the viral genome in causing this difference. Furthermore, BTas-B, which is located in this C-terminal region, exhibited a 20-fold higher transactivation activity than BTas-Y. Sequence alignment showed that these two sequences differ only at amino acid 108, with BTas-B containing N108 and BTas-Y containing D108 at this position. Results of mutagenesis studies demonstrated that residue N108 is important for BTas binding to viral promoters. In addition, the N108D mutation in pBS-BFV-B reduced the viral replication capacity by about 1.5-fold. Our results suggest that residue N108 is important for BTas binding to BFV promoters and has a major role in BFV replication. These findings not only advance our understanding of the transactivation mechanism of BTas, but also highlight the importance of certain sequence polymorphisms in modulating the replication capacity of isolated BFV clones.
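
    The alignment step described above (two sequences that differ at a single position) can be illustrated with a minimal Python sketch. The function name `point_differences` and the two placeholder sequences are assumptions made for illustration only; they are not the real BTas sequences and not the authors' method.

```python
def point_differences(ref: str, alt: str) -> list[str]:
    """List substitutions between two equal-length, pre-aligned sequences.

    Each difference is written as <ref residue><1-based position><alt residue>,
    e.g. "N108D". Gap characters are compared like any other symbol.
    """
    if len(ref) != len(alt):
        raise ValueError("sequences must be aligned to the same length")
    return [
        f"{a}{pos}{b}"
        for pos, (a, b) in enumerate(zip(ref, alt), start=1)
        if a != b
    ]


# Placeholder sequences (NOT real BTas sequences): 120 residues that are
# identical except at position 108, mimicking the N108 / D108 difference.
btas_b_like = "A" * 107 + "N" + "A" * 12
btas_y_like = "A" * 107 + "D" + "A" * 12

print(point_differences(btas_b_like, btas_y_like))
```

    With real data, the two inputs would be rows taken from an alignment produced by a dedicated alignment tool.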

  16. Optical resolution of phenylthiohydantoin-amino acids by capillary electrophoresis and identification of the phenylthiohydantoin-D-amino acid residue of [D-Ala2]-methionine enkephalin.

    PubMed

    Kurosu, Y; Murayama, K; Shindo, N; Shisa, Y; Ishioka, N

    1996-11-01

    This initial report proposes a protein sequence analysis system with DL differentiation using capillary electrophoresis (CE). This system consists of a protein sequencer and a CE system. After fractionation of phenylthiohydantoin (PTH)-amino acids using a protein sequencer, optical resolution of each PTH-amino acid is performed by CE using chiral selectors such as digitonin, beta-escin and others. [D-Ala2]-methionine enkephalin (L-Tyr-D-Ala-Gly-L-Phe-L-Met) was used as a model peptide, and its sequence with DL differentiation was determined using our proposed system, with the exception of the fourth amino acid, L-Phe.

  17. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wemmer, D.E.; Kumar, N.V.; Metrione, R.M.

    Toxin II from Radianthus paumotensis (Rp II) has been investigated by high-resolution NMR and chemical sequencing methods. Resonance assignments have been obtained for this protein by the sequential approach. NMR assignments could not be made consistent with the previously reported primary sequence for this protein, and chemical methods have been used to determine a sequence with which the NMR data are consistent. Analysis of the 2D NOE spectra shows that the protein secondary structure is comprised of two sequences of β-sheet, probably joined into a distorted continuous sheet, connected by turns and extended loops, without any regular α-helical segments. The residues previously implicated in activity in this class of proteins, D8 and R13, occur in a loop region.

  18. Endophyte Microbiome Diversity in Micropropagated Atriplex canescens and Atriplex torreyi var griffithsii

    PubMed Central

    Lucero, Mary E.; Unc, Adrian; Cooke, Peter; Dowd, Scot; Sun, Shulei

    2011-01-01

    Microbial diversity associated with micropropagated Atriplex species was assessed using microscopy, isolate culturing, and sequencing. Light, electron, and confocal microscopy revealed microbial cells in aseptically regenerated leaves and roots. Clone libraries and tag-encoded FLX amplicon pyrosequencing (TEFAP) analysis amplified sequences from callus homologous to diverse fungal and bacterial taxa. Culturing isolated some seed borne endophyte taxa which could be readily propagated apart from the host. Microbial cells were observed within biofilm-like residues associated with plant cell surfaces and intercellular spaces. Various universal primers amplified both plant and microbial sequences, with different primers revealing different patterns of fungal diversity. Bacterial and fungal TEFAP followed by alignment with sequences from curated databases revealed 7 bacterial and 17 ascomycete taxa in A. canescens, and 5 bacterial taxa in A. torreyi. Additional diversity was observed among isolates and clone libraries. Micropropagated Atriplex retains a complex, intimately associated microbiome which includes diverse strains well poised to interact in manners that influence host physiology. Microbiome analysis was facilitated by high throughput sequencing methods, but primer biases continue to limit recovery of diverse sequences from even moderately complex communities. PMID:21437280

  19. Retreatability of two endodontic sealers, EndoSequence BC Sealer and AH Plus: a micro-computed tomographic comparison.

    PubMed

    Oltra, Enrique; Cox, Timothy C; LaCourse, Matthew R; Johnson, James D; Paranjpe, Avina

    2017-02-01

    Recently, bioceramic sealers like EndoSequence BC Sealer (BC Sealer) have been introduced and are being used in endodontic practice. However, this sealer has limited research related to its retreatability. Hence, the aim of this study was to evaluate the retreatability of two sealers, BC Sealer as compared with AH Plus using micro-computed tomographic (micro-CT) analysis. Fifty-six extracted human maxillary incisors were instrumented and randomly divided into 4 groups of 14 teeth: 1A, gutta-percha, AH Plus retreated with chloroform; 1B, gutta-percha, AH Plus retreated without chloroform; 2A, gutta-percha, EndoSequence BC Sealer retreated with chloroform; 2B, gutta-percha, EndoSequence BC Sealer retreated without chloroform. Micro-CT scans were taken before and after obturation and retreatment and analyzed for the volume of residual material. The specimens were longitudinally sectioned and digitized images were taken with the dental operating microscope. Data was analyzed using an ANOVA and a post-hoc Tukey test. Fisher exact tests were performed to analyze the ability to regain patency. There was significantly less residual root canal filling material in the AH Plus groups retreated with chloroform as compared to the others. The BC Sealer samples retreated with chloroform had better results than those retreated without chloroform. Furthermore, patency could be re-established in only 14% of teeth in the BC Sealer without chloroform group. The results of this study demonstrate that the BC Sealer group had significantly more residual filling material than the AH Plus group regardless of whether or not both sealers were retreated with chloroform.

  20. Characterization of intronic uridine-rich sequence elements acting as possible targets for nuclear proteins during pre-mRNA splicing in Nicotiana plumbaginifolia.

    PubMed

    Gniadkowski, M; Hemmings-Mieszczak, M; Klahre, U; Liu, H X; Filipowicz, W

    1996-02-15

    Introns of nuclear pre-mRNAs in dicotyledonous plants, unlike introns in vertebrates or yeast, are distinctly rich in A+U nucleotides and this feature is essential for their processing. In order to define more precisely sequence elements important for intron recognition in plants, we investigated the effects of short insertions, either U-rich or A-rich, on splicing of synthetic introns in transfected protoplasts of Nicotiana plumbaginifolia. It was found that insertions of U-rich (sequence UUUUUAU) but not A-rich (AUAAAAA) segments can activate splicing of a GC-rich synthetic intron, and that U-rich segments, or multimers thereof, can function irrespective of the site of insertion within the intron. Insertions of multiple U-rich segments, either at the same or different locations, generally had an additive, stimulatory effect on splicing. Mutational analysis showed that replacement of one or two U residues in the UUUUUAU sequence with A or C residues had only a small effect on splicing, but replacement with G residues was strongly inhibitory. Proteins that interact with fragments of natural and synthetic pre-mRNAs in vitro were identified in nuclear extracts of N. plumbaginifolia by UV cross-linking. The profile of cross-linked plant proteins was considerably less complex than that obtained with a HeLa cell nuclear extract. Two major cross-linkable plant proteins had apparent molecular masses of 50 and 54 kDa and showed affinity for oligouridylates present in synGC introns or for poly(U).

  1. Characterization of intronic uridine-rich sequence elements acting as possible targets for nuclear proteins during pre-mRNA splicing in Nicotiana plumbaginifolia.

    PubMed Central

    Gniadkowski, M; Hemmings-Mieszczak, M; Klahre, U; Liu, H X; Filipowicz, W

    1996-01-01

    Introns of nuclear pre-mRNAs in dicotyledonous plants, unlike introns in vertebrates or yeast, are distinctly rich in A+U nucleotides and this feature is essential for their processing. In order to define more precisely sequence elements important for intron recognition in plants, we investigated the effects of short insertions, either U-rich or A-rich, on splicing of synthetic introns in transfected protoplasts of Nicotiana plumbaginifolia. It was found that insertions of U-rich (sequence UUUUUAU) but not A-rich (AUAAAAA) segments can activate splicing of a GC-rich synthetic intron, and that U-rich segments, or multimers thereof, can function irrespective of the site of insertion within the intron. Insertions of multiple U-rich segments, either at the same or different locations, generally had an additive, stimulatory effect on splicing. Mutational analysis showed that replacement of one or two U residues in the UUUUUAU sequence with A or C residues had only a small effect on splicing, but replacement with G residues was strongly inhibitory. Proteins that interact with fragments of natural and synthetic pre-mRNAs in vitro were identified in nuclear extracts of N. plumbaginifolia by UV cross-linking. The profile of cross-linked plant proteins was considerably less complex than that obtained with a HeLa cell nuclear extract. Two major cross-linkable plant proteins had apparent molecular masses of 50 and 54 kDa and showed affinity for oligouridylates present in synGC introns or for poly(U). PMID:8604302

  2. High-throughput sequencing in acute lymphoblastic leukemia: Follow-up of minimal residual disease and emergence of new clones.

    PubMed

    Salson, Mikaël; Giraud, Mathieu; Caillault, Aurélie; Grardel, Nathalie; Duployez, Nicolas; Ferret, Yann; Duez, Marc; Herbert, Ryan; Rocher, Tatiana; Sebda, Shéhérazade; Quief, Sabine; Villenet, Céline; Figeac, Martin; Preudhomme, Claude

    2017-02-01

    Minimal residual disease (MRD) is known to be an independent prognostic factor in patients with acute lymphoblastic leukemia (ALL). High-throughput sequencing (HTS) is currently used in routine practice for the diagnosis and follow-up of patients with hematological neoplasms. In this retrospective study, we examined the role of immunoglobulin/T-cell receptor-based MRD in patients with ALL by HTS analysis of immunoglobulin H and/or T-cell receptor gamma chain loci in bone marrow samples from 11 patients with ALL, at diagnosis and during follow-up. We assessed the clinical feasibility of using combined HTS and bioinformatics analysis with interactive visualization using Vidjil software. We discuss the advantages and drawbacks of HTS for monitoring MRD. HTS gives more complete insight into the leukemic population than conventional real-time quantitative PCR (qPCR), and allows identification of new emerging clones at each time point of the monitoring. Thus, HTS monitoring of Ig/TR-based MRD is expected to improve the management of patients with ALL. Copyright © 2016 Elsevier Ltd. All rights reserved.

  3. Prognostic value of residual fluorescent tissue in glioblastoma patients after gross total resection in 5-aminolevulinic Acid-guided surgery.

    PubMed

    Aldave, Guillermo; Tejada, Sonia; Pay, Eva; Marigil, Miguel; Bejarano, Bartolomé; Idoate, Miguel A; Díez-Valle, Ricardo

    2013-06-01

    There is evidence in the literature supporting that fluorescent tissue signal in fluorescence-guided surgery extends farther than tissue highlighted in gadolinium in T1 sequence magnetic resonance imaging (MRI), which is the standard to quantify the extent of resection. To study whether the presence of residual fluorescent tissue after surgery carries a different prognosis for glioblastoma (GBM) cases with complete resection confirmed by MRI. A retrospective review in our center found 118 consecutive patients with high-grade gliomas operated on with the use of fluorescence-guided surgery with 5-aminolevulinic acid. Within that series, the 52 patients with newly diagnosed GBM and complete resection of enhancing tumor (CRET) in early MRI were selected for analysis. We studied the influence of residual fluorescence in the surgical field on overall survival and neurological complication rate. Multivariate analysis included potential relevant factors: age, Karnofsky Performance Scale, O-methylguanine methyltransferase methylation promoter status, tumor eloquent location, preoperative tumor volume, and adjuvant therapy. The median overall survival was 27.0 months (confidence interval = 22.4-31.6) in patients with nonresidual fluorescence (n = 25) and 17.5 months (confidence interval = 12.5-22.5) for the group with residual fluorescence (n = 27) (P = .015). The influence of residual fluorescence was maintained in the multivariate analysis with all covariables, hazard ratio = 2.5 (P = .041). The neurological complication rate was 18.5% in patients with nonresidual fluorescence and 8% for the group with residual fluorescence (P = .267). GBM patients with CRET in early MRI and no fluorescent residual tissue had longer overall survival than patients with CRET and residual fluorescent tissue.

  4. Cloning and expression of a cDNA coding for catalase from zebrafish (Danio rerio).

    PubMed

    Ken, C F; Lin, C T; Wu, J L; Shaw, J F

    2000-06-01

    A full-length complementary DNA (cDNA) clone encoding a catalase was amplified by the rapid amplification of cDNA ends-polymerase chain reaction (RACE-PCR) technique from zebrafish (Danio rerio) mRNA. Nucleotide sequence analysis of this cDNA clone revealed that it comprised a complete open reading frame coding for 526 amino acid residues and that it had a molecular mass of 59 654 Da. The deduced amino acid sequence showed high similarity with the sequences of catalase from swine (86.9%), mouse (85.8%), rat (85%), human (83.7%), fruit fly (75.6%), nematode (71.1%), and yeast (58.6%). The amino acid residues for secondary structures are apparently conserved as they are present in other mammal species. Furthermore, the coding region of zebrafish catalase was introduced into an expression vector, pET-20b(+), and transformed into Escherichia coli expression host BL21(DE3)pLysS. A 60-kDa active catalase protein was expressed and detected by Coomassie blue staining as well as activity staining on polyacrylamide gel following electrophoresis.

  5. His-426 of the Pseudomonas aeruginosa exotoxin A is required for ADP-ribosylation of elongation factor II.

    PubMed Central

    Wozniak, D J; Hsu, L Y; Galloway, D R

    1988-01-01

    Exotoxin A (ETA) is recognized as the most toxic product associated with the opportunistic pathogen Pseudomonas aeruginosa. Identification of the amino acids in the polypeptide sequence that are required for toxin activity is critical for vaccine development. By defining the nucleotide sequence of the structural gene of a mutant that encodes an enzymatically inactive ETA (CRM 66), we identified an essential amino acid (His-426), which is involved in the ADP-ribosyltransferase activity associated with functional ETA. A monoclonal antibody that inhibits ETA enzymatic activity in vitro fails to react with ETA variants that have a His-426→Tyr substitution. Several mono-ADP-ribosylating toxins, including diphtheria and pertussis toxins, within the primary amino acid sequences carry a histidine residue that is conserved in spacing and in location with respect to other critical residues. Analysis of the three-dimensional structure of ETA revealed that His-426 is not associated with the proposed NAD+ binding site. These findings should be useful for the design and construction of toxin vaccines. PMID:3143111

  6. Redox Proteomics of Protein-bound Methionine Oxidation

    PubMed Central

    Ghesquière, Bart; Jonckheere, Veronique; Colaert, Niklaas; Van Durme, Joost; Timmerman, Evy; Goethals, Marc; Schymkowitz, Joost; Rousseau, Frederic; Vandekerckhove, Joël; Gevaert, Kris

    2011-01-01

    We here present a new method to measure the degree of protein-bound methionine sulfoxide formation at a proteome-wide scale. In human Jurkat cells that were stressed with hydrogen peroxide, over 2000 oxidation-sensitive methionines in more than 1600 different proteins were mapped and their extent of oxidation was quantified. Meta-analysis of the sequences surrounding the oxidized methionine residues revealed a high preference for neighboring polar residues. Using synthetic methionine sulfoxide containing peptides designed according to the observed sequence preferences in the oxidized Jurkat proteome, we discovered that the substrate specificity of the cellular methionine sulfoxide reductases is a major determinant for the steady-state of methionine oxidation. This was supported by a structural modeling of the MsrA catalytic center. Finally, we applied our method onto a serum proteome from a mouse sepsis model and identified 35 in vivo methionine oxidation events in 27 different proteins. PMID:21406390

  7. Characterization of Novel Fusaricidins Produced by Paenibacillus polymyxa-M1 Using MALDI-TOF Mass Spectrometry

    NASA Astrophysics Data System (ADS)

    Vater, Joachim; Niu, Ben; Dietel, Kristin; Borriss, Rainer

    2015-09-01

    Paenibacillus polymyxa-M1 is a potent producer of bioactive compounds, such as lipopeptides, polyketides, and lantibiotics of biotechnological and medical interest. Genome sequencing revealed nine gene clusters for nonribosomal biosynthesis of such agents. Here we report on the investigation of the fusaricidins, a complex of cyclic lipopeptides containing 15-guanidino-3-hydroxypentadecanoic acid (GHPD) as fatty acid component by matrix-assisted laser-desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS). More than 20 variants of these compounds were detected and characterized in detail. Mass spectrometric sequence analysis was performed by MALDI-LIFT-TOF/TOF fragment analysis. The obtained product ion spectra show a specific processing in the fatty acid part. GHPD is cleaved between the α- and β-position yielding two fragments a and b, one bearing the end-standing guanidine group and another one comprising the residual two C-atoms of GHPD with the attached peptide moiety. The complete sequence of all fusaricidins was derived from sets of bn- and yn-ions. The fusaricidin complex can be divided into four lipopeptide families, three of them showing variations of the amino acid in position 3, Val or Ile for the first and Tyr or Phe for families 2 and 3, respectively. A collection of novel fusaricidins was detected differing from those of families 1-3 by an additional residue of 71 Da (family 4). LIFT-TOF/TOF fragment spectra of these species imply that in their peptide moiety, an Ala-residue is attached by an ester bond to the free hydroxyl group of Thr4. More than 10 novel fusaricidins were characterized mass spectrometrically.
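
    The assignment of a 71 Da mass increment to an Ala residue can be checked against standard monoisotopic residue masses. The sketch below is only an illustration of that lookup: the helper name `residues_matching` and the 0.5 Da tolerance are assumptions, and the abstract does not describe the software actually used.

```python
# Monoisotopic residue masses (Da) of the 20 standard amino acids,
# i.e. the mass of the free amino acid minus one water molecule.
RESIDUE_MASS = {
    "Gly": 57.02146, "Ala": 71.03711, "Ser": 87.03203, "Pro": 97.05276,
    "Val": 99.06841, "Thr": 101.04768, "Cys": 103.00919, "Leu": 113.08406,
    "Ile": 113.08406, "Asn": 114.04293, "Asp": 115.02694, "Gln": 128.05858,
    "Lys": 128.09496, "Glu": 129.04259, "Met": 131.04049, "His": 137.05891,
    "Phe": 147.06841, "Arg": 156.10111, "Tyr": 163.06333, "Trp": 186.07931,
}


def residues_matching(delta: float, tolerance: float = 0.5) -> list[str]:
    """Return residues whose mass lies within `tolerance` Da of `delta`.

    The default tolerance is an illustrative assumption, not an
    instrument specification.
    """
    return sorted(
        name for name, mass in RESIDUE_MASS.items()
        if abs(mass - delta) <= tolerance
    )


# Which standard residue accounts for an extra 71 Da?
print(residues_matching(71.0))

# Mass shifts expected for the position-3 variants named in the abstract.
print(round(RESIDUE_MASS["Ile"] - RESIDUE_MASS["Val"], 5))  # Val -> Ile
print(round(RESIDUE_MASS["Tyr"] - RESIDUE_MASS["Phe"], 5))  # Phe -> Tyr
```

    Leu and Ile share the same mass, so a mass difference alone cannot tell them apart.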

  8. Functional analysis of the C-terminal region of human adenovirus E1A reveals a misidentified nuclear localization signal

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Cohen, Michael J.; King, Cason R.; Dikeakos, Jimmy D.

    The immortalizing function of the human adenovirus 5 E1A oncoprotein requires efficient localization to the nucleus. In 1987, a consensus monopartite nuclear localization sequence (NLS) was identified at the C-terminus of E1A. Since that time, various experiments have suggested that other regions of E1A influence nuclear import. In addition, a novel bipartite NLS was recently predicted at the C-terminal region of E1A in silico. In this study, we used immunofluorescence microscopy and co-immunoprecipitation analysis with importin-α to verify that full nuclear localization of E1A requires the well characterized NLS spanning residues 285–289, as well as a second basic patch situated between residues 258 and 263 (²⁵⁸RVGGRRQAVECIEDLLNEPGQPLDLSCKRPRP²⁸⁹). Thus, the originally described NLS located at the C-terminus of E1A is actually a bipartite signal, which had been misidentified in the existing literature as a monopartite signal, altering our understanding of one of the oldest documented NLSs. - Highlights: • Human adenovirus E1A is localized to the nucleus. • The C-terminus of E1A contains a bipartite nuclear localization signal (NLS). • This signal was previously misidentified to be a monopartite NLS. • Key basic amino acid residues within this sequence are highly conserved.

  9. The nonamer UUAUUUAUU is the key AU-rich sequence motif that mediates mRNA degradation.

    PubMed Central

    Zubiaga, A M; Belasco, J G; Greenberg, M E

    1995-01-01

    Labile mRNAs that encode cytokine and immediate-early gene products often contain AU-rich sequences within their 3' untranslated region (UTR). These AU-rich sequences appear to be key determinants of the short half-lives of these mRNAs, although the sequence features of these elements and the mechanism by which they target mRNAs for rapid decay have not been fully defined. We have examined the features of AU-rich elements (AREs) that are crucial for their function as determinants of mRNA instability in mammalian cells by testing the ability of various mutant c-fos AREs and synthetic AREs to direct rapid mRNA deadenylation and decay when inserted within the 3' UTR of the normally stable beta-globin mRNA. Evidence is presented that the pentamer AUUUA, which previously was suggested to be the minimal determinant of instability present in mammalian AREs, cannot direct rapid mRNA deadenylation and decay. Instead, the nonamer UUAUUUAUU is the elemental AU-rich sequence motif that destabilizes mRNA. Removal of one uridine residue from either end of the nonamer (UUAUUUAU or UAUUUAUU) results in a decrease of potency of the element, while removal of a uridine residue from both ends of the nonamer (UAUUUAU) eliminates detectable destabilizing activity. The inclusion of an additional uridine residue at both ends of the nonamer (UUUAUUUAUUU) does not further increase the efficacy of the element. Taken together, these findings suggest that the nonamer UUAUUUAUU is the minimal AU-rich motif that effectively destabilizes mRNA. Additional ARE potency is achieved by combining multiple copies of this nonamer in a single mRNA 3' UTR. Furthermore, analysis of poly(A) shortening rates for ARE-containing mRNAs reveals that the UUAUUUAUU sequence also accelerates mRNA deadenylation and suggests that the UUAUUUAUU motif targets mRNA for rapid deadenylation as an early step in the mRNA decay process. PMID:7891716
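
    The distinction between the pentamer AUUUA and the nonamer UUAUUUAUU lends itself to a short motif scan. The Python sketch below is one hedged way to locate both motifs, allowing overlapping copies, and to flag pentamers that sit outside any nonamer. The function name `find_motif` and the `utr` fragment are hypothetical; the fragment is not taken from c-fos or any real transcript.

```python
import re

NONAMER = "UUAUUUAUU"
PENTAMER = "AUUUA"


def find_motif(rna: str, motif: str) -> list[int]:
    """Return 0-based start positions of all matches, overlaps included.

    A zero-width lookahead lets the regex engine report overlapping
    copies, which a plain search would skip. DNA input is accepted by
    converting T to U.
    """
    sequence = rna.upper().replace("T", "U")
    pattern = re.compile(f"(?={re.escape(motif)})")
    return [match.start() for match in pattern.finditer(sequence)]


# Hypothetical 3' UTR fragment, assembled piece by piece for readability:
# a lone pentamer, a single nonamer, and two overlapping nonamers.
utr = "GCC" + "AUUUA" + "GGC" + "UUAUUUAUU" + "CCG" + "UUAUUUAUUUAUU" + "GC"

nonamer_starts = find_motif(utr, NONAMER)
pentamer_starts = find_motif(utr, PENTAMER)

# The pentamer occupies offsets 2-6 of the nonamer, so a pentamer at
# position p lies inside a nonamer exactly when a nonamer starts at p - 2.
lone_pentamers = [p for p in pentamer_starts if p - 2 not in nonamer_starts]

print("nonamers:", nonamer_starts)
print("pentamers:", pentamer_starts)
print("pentamers outside any nonamer:", lone_pentamers)
```

    A scan like this only counts motif copies; it says nothing about mRNA stability, which the abstract measured experimentally.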

  10. Ancient DNA sequence revealed by error-correcting codes.

    PubMed

    Brandão, Marcelo M; Spoladore, Larissa; Faria, Luzinete C B; Rocha, Andréa S L; Silva-Filho, Marcio C; Palazzo, Reginaldo

    2015-07-10

    A previously described DNA sequence generator algorithm (DNA-SGA) using error-correcting codes has been employed as a computational tool to address the evolutionary pathway of the genetic code. The code-generated sequence alignment demonstrated that a residue mutation revealed by the code can be found in the same position in sequences of distantly related taxa. Furthermore, the code-generated sequences do not promote amino acid changes in the deviant genomes through codon reassignment. A Bayesian evolutionary analysis of both code-generated and homologous sequences of the Arabidopsis thaliana malate dehydrogenase gene indicates an approximately 1 MYA divergence time from the MDH code-generated sequence node to its paralogous sequences. The DNA-SGA helps to determine the plesiomorphic state of DNA sequences because a single nucleotide alteration often occurs in distantly related taxa and can be found in the alternative codon patterns of noncanonical genetic codes. As a consequence, the algorithm may reveal an earlier stage of the evolution of the standard code.

  11. Ancient DNA sequence revealed by error-correcting codes

    PubMed Central

    Brandão, Marcelo M.; Spoladore, Larissa; Faria, Luzinete C. B.; Rocha, Andréa S. L.; Silva-Filho, Marcio C.; Palazzo, Reginaldo

    2015-01-01

    A previously described DNA sequence generator algorithm (DNA-SGA) using error-correcting codes has been employed as a computational tool to address the evolutionary pathway of the genetic code. The code-generated sequence alignment demonstrated that a residue mutation revealed by the code can be found in the same position in sequences of distantly related taxa. Furthermore, the code-generated sequences do not promote amino acid changes in the deviant genomes through codon reassignment. A Bayesian evolutionary analysis of both code-generated and homologous sequences of the Arabidopsis thaliana malate dehydrogenase gene indicates an approximately 1 MYA divergence time from the MDH code-generated sequence node to its paralogous sequences. The DNA-SGA helps to determine the plesiomorphic state of DNA sequences because a single nucleotide alteration often occurs in distantly related taxa and can be found in the alternative codon patterns of noncanonical genetic codes. As a consequence, the algorithm may reveal an earlier stage of the evolution of the standard code. PMID:26159228

  12. Amino-terminal sequence of glycoprotein D of herpes simplex virus types 1 and 2

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Eisenberg, R.J.; Long, D.; Hogue-Angeletti, R.

    1984-01-01

    Glycoprotein D (gD) of herpes simplex virus is a structural component of the virion envelope which stimulates production of high titers of herpes simplex virus type-common neutralizing antibody. The authors carried out automated N-terminal amino acid sequencing studies on radiolabeled preparations of gD-1 (gD of herpes simplex virus type 1) and gD-2 (gD of herpes simplex virus type 2). Although some differences were noted, particularly in the methionine and alanine profiles for gD-1 and gD-2, the amino acid sequence of a number of the first 30 residues of the amino terminus of gD-1 and gD-2 appears to be quite similar. For both proteins, the first residue is a lysine. When we compared our sequence data for gD-1 with those predicted by nucleic acid sequencing, the two sequences could be aligned (with one exception) starting at residue 26 (lysine) of the predicted sequence. Thus, the first 25 amino acids of the predicted sequence are absent from the polypeptides isolated from infected cells.

  13. Uncoupling cis-Acting RNA Elements from Coding Sequences Revealed a Requirement of the N-Terminal Region of Dengue Virus Capsid Protein in Virus Particle Formation

    PubMed Central

    Samsa, Marcelo M.; Mondotte, Juan A.; Caramelo, Julio J.

    2012-01-01

    Little is known about the mechanism of flavivirus genome encapsidation. Here, functional elements of the dengue virus (DENV) capsid (C) protein were investigated. Study of the N-terminal region of DENV C has been limited by the presence of overlapping cis-acting RNA elements within the protein-coding region. To dissociate these two functions, we used a recombinant DENV RNA with a duplication of essential RNA structures outside the C coding sequence. By the use of this system, the highly conserved amino acids FNML, which are encoded in the RNA cyclization sequence 5′CS, were found to be dispensable for C function. In contrast, deletion of the N-terminal 18 amino acids of C impaired DENV particle formation. Two clusters of basic residues (R5-K6-K7-R9 and K17-R18-R20-R22) were identified as important. A systematic mutational analysis indicated that a high density of positive charges, rather than particular residues at specific positions, was necessary. Furthermore, a differential requirement of N-terminal sequences of C for viral particle assembly was observed in mosquito and human cells. While no viral particles were observed in human cells with a virus lacking the first 18 residues of C, DENV propagation was detected in mosquito cells, although to a level about 50-fold less than that observed for a wild-type (WT) virus. We conclude that basic residues at the N terminus of C are necessary for efficient particle formation in mosquito cells but that they are crucial for propagation in human cells. This is the first report demonstrating that the N terminus of C plays a role in DENV particle formation. In addition, our results suggest that this function of C is differentially modulated in different host cells. PMID:22072762

  14. In silico prediction of the pathogenic effect of a novel variant of BCKDHA leading to classical maple syrup urine disease identified using clinical exome sequencing.

    PubMed

    Fernández-Lainez, Cynthia; Aláez-Verson, Carmen; Ibarra-González, Isabel; Enríquez-Flores, Sergio; Carrillo-Sanchez, Karol; Flores-Lagunes, Leonardo; Guillén-López, Sara; Belmont-Martínez, Leticia; Vela-Amieva, Marcela

    2018-04-16

    Maple syrup urine disease (MSUD) is a metabolic disorder caused by mutations in three of the branched-chain α-keto acid dehydrogenase complex (BCKDC) genes. Classical MSUD symptoms can be observed immediately after birth and include ketoacidosis, irritability, lethargy, and coma, which can lead to death or irreversible neurodevelopmental delay in survivors. The molecular diagnosis of MSUD can be time-consuming and difficult to establish using conventional Sanger sequencing because it could be due to pathogenic variants of any of the BCKDC genes. Next-generation sequencing-based methodologies have revolutionized the molecular diagnosis of inborn errors in metabolism and offer a superior approach for genotyping these patients. Here, we report an MSUD case whose molecular diagnosis was performed by clinical exome sequencing (CES), and the possible structural pathogenic effect of a novel E1α subunit pathogenic variant was analyzed using in silico analysis of α and β subunit crystallographic structure. Molecular analysis revealed a new homozygous nonsense c.1267C>T or p.Gln423Ter variant of BCKDHA. The novel BCKDHA variant is considered pathogenic because it caused a premature stop codon that probably led to the loss of the last 22 amino acid residues of the E1α subunit C-terminal end. In silico analysis of this region showed that it is in contact with several residues of the E1β subunit mainly through polar contacts, hydrogen bonds, and hydrophobic interactions. CES strategy could benefit the patients and families by offering precise and prompt diagnosis and better genetic counseling. Copyright © 2018 Elsevier B.V. All rights reserved.
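
    The correspondence between the coding-DNA description c.1267C>T and the protein description p.Gln423Ter follows from simple codon arithmetic, sketched below in Python. The helper names `codon_of` and `substitute` are assumptions for illustration, and the code does not use the actual BCKDHA sequence; it only checks that a C>T change at the first base of either glutamine codon yields a stop codon.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
GLN_CODONS = {"CAA", "CAG"}  # both glutamine codons in the standard code


def codon_of(cds_position: int) -> tuple[int, int]:
    """Map a 1-based coding-sequence position to (codon number, base 1-3)."""
    if cds_position < 1:
        raise ValueError("coding positions are 1-based")
    return (cds_position - 1) // 3 + 1, (cds_position - 1) % 3 + 1


def substitute(codon: str, base_in_codon: int, new_base: str) -> str:
    """Return the codon with one base (1-based index) replaced."""
    index = base_in_codon - 1
    return codon[:index] + new_base + codon[index + 1:]


codon_number, base_in_codon = codon_of(1267)
print(codon_number, base_in_codon)

# Which Gln codon the gene uses at this position is not stated in the
# abstract, so both possibilities are checked.
for gln in sorted(GLN_CODONS):
    mutated = substitute(gln, base_in_codon, "T")
    print(gln, "->", mutated, "stop" if mutated in STOP_CODONS else "sense")
```

    Position 1267 falls on the first base of codon 423, and both CAA and CAG become stop codons (TAA, TAG) after the change, consistent with the reported nonsense variant.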

  15. The complete amino acid sequence of human erythrocyte diphosphoglycerate mutase.

    PubMed Central

    Haggarty, N W; Dunbar, B; Fothergill, L A

    1983-01-01

    The complete amino acid sequence of human erythrocyte diphosphoglycerate mutase, comprising 239 residues, was determined. The sequence was deduced from the four cyanogen bromide fragments, and from the peptides derived from these fragments after digestion with a number of proteolytic enzymes. Comparison of this sequence with that of the yeast glycolytic enzyme, phosphoglycerate mutase, shows that these enzymes are 47% identical. Most, but not all, of the residues implicated as being important for the activity of the glycolytic mutase are conserved in the erythrocyte diphosphoglycerate mutase. PMID:6313356
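
    A figure such as "47% identical" is computed from a pairwise alignment. The sketch below shows one common way to do this, assuming two already-aligned strings of equal length and counting only columns where neither sequence has a gap. The toy sequences are invented for illustration and are not the mutase sequences; other conventions for the denominator (alignment length, shorter sequence length) exist and give different values.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of identical residues over columns where neither sequence is gapped."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip columns containing a gap
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared


# Toy alignment (hypothetical sequences): 7 ungapped columns, 6 of them identical.
print(round(percent_identity("MKT-AYIAK", "MRTGAY-AK"), 1))
```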

  16. AMP-acetyl CoA synthetase from Leishmania donovani: identification and functional analysis of 'PX4GK' motif.

    PubMed

    Soumya, Neelagiri; Kumar, I Sravan; Shivaprasad, S; Gorakh, Landage Nitin; Dinesh, Neeradi; Swamy, Kayala Kambagiri; Singh, Sushma

    2015-04-01

    An adenosine monophosphate forming acetyl CoA synthetase (AceCS) which is the key enzyme involved in the conversion of acetate to acetyl CoA has been identified from Leishmania donovani for the first time. Sequence analysis of L. donovani AceCS (LdAceCS) revealed the presence of a 'PX4GK' motif which is highly conserved throughout organisms with higher sequence identity (96%) to lower sequence identity (38%). A ~77 kDa heterologous protein with C-terminal 6X His-tag was expressed in Escherichia coli. Expression of LdAceCS in promastigotes was confirmed by western blot and RT-PCR analysis. Immunolocalization studies revealed that it is a cytosolic protein. We also report the kinetic characterization of recombinant LdAceCS with acetate, adenosine 5'-triphosphate, coenzyme A and propionate as substrates. Site directed mutagenesis of residues in conserved PX4GK motif of LdAceCS was performed to gain insight into its potential role in substrate binding, catalysis and its role in maintaining structural integrity of the protein. P646A, G651A and K652R exhibited more than 90% loss in activity signifying their indispensable role in the enzyme activity. Substitution of other residues in this motif resulted in altered substrate specificity and catalysis. However, none of them had any role in modulation of the secondary structure of the protein except G651A mutant. Copyright © 2015 Elsevier B.V. All rights reserved.

  17. Combination of the immunization with the sequence close to the consensus sequence and two DNA prime plus one VLP boost generate H5 hemagglutinin specific broad neutralizing antibodies

    PubMed Central

    Wang, Guiqin; Yin, Renfu; Zhou, Paul; Ding, Zhuang

    2017-01-01

    Hemagglutinin (HA) head has long been considered to be able to elicit only a narrow, strain-specific antibody response as it undergoes rapid antigenic drift. However, we previously showed that a heterologous prime-boost strategy, in which mice were primed twice with DNA encoding HA and boosted once with virus-like particles (VLP) from an H5N1 strain A/Thailand/1(KAN)-1/2004 (noted as TH DDV), induced anti-head broad cross-H5 neutralizing antibody response. To explain why TH DDV immunization could generate such breadth, we systematically compared the neutralization breadth and potency between TH DDV sera and immune sera elicited by TH DDD (three DNA immunizations), TH VVV (three VLP immunizations), TH DV (one DNA prime plus one VLP boost) and TK DDV (plasmid DNA and VLP derived from another H5N1 strain, A/Turkey/65596/2006). Then we determined the antigenic sites (AS) on TH HA head and the key residues of the main antigenic site. Through the comparison of different regimens, we found that the combination of immunization with a sequence close to the consensus sequence and two DNA primes plus one VLP boost is what allows TH DDV immunization to generate broad neutralizing antibodies. Antigenic analysis showed that TH DDV, TH DV, TH DDD and TH VVV sera recognize the common antigenic site AS1. Antibodies directed to AS1 contribute the largest proportion of the neutralizing activity of these immune sera. Residues 188 and 193 in AS1 are the key residues responsible for the neutralization breadth of the immune sera. Interestingly, residues 188 and 193 are located in classical antigenic sites but are relatively conserved among the 16 tested strains and 1,663 HA sequences from the NCBI database. Thus, our results strongly indicate that it is feasible to develop broad cross-H5 influenza vaccines against HA head. PMID:28542275

  18. Complete cDNA sequence of SAP-like pentraxin from Limulus polyphemus: implications for pentraxin evolution.

    PubMed

    Tharia, Hazel A; Shrive, Annette K; Mills, John D; Arme, Chris; Williams, Gwyn T; Greenhough, Trevor J

    2002-02-22

    The serum amyloid P component (SAP)-like pentraxin Limulus polyphemus SAP is a recently discovered, distinct pentraxin species, of known structure, which does not bind phosphocholine and whose N-terminal sequence has been shown to differ markedly from the highly conserved N terminus of all other known horseshoe crab pentraxins. The complete cDNA sequence of Limulus SAP, and the derived amino acid sequence, the first invertebrate SAP-like pentraxin sequence, have been determined. Two sequences were identified that differed only in the length of the 3' untranslated region. Limulus SAP is synthesised as a precursor protein of 234 amino acid residues, the first 17 residues encoding a signal peptide that is absent from the mature protein. Phylogenetic analysis clusters Limulus SAP pentraxin with the horseshoe crab C-reactive proteins (CRPs) rather than the mammalian SAPs, which are clustered with mammalian CRPs. The deduced amino acid sequence shares 22% identity with both human SAP and CRP, which are 51% identical, and 31-35% with horseshoe crab CRPs. These analyses indicate that gene duplication of CRP (or SAP), followed by sequence divergence and the evolution of CRP and/or SAP function, occurred independently along the chordate and arthropod evolutionary lines rather than in a common ancestor. They further indicate that the CRP/SAP gene duplication event in Limulus occurred before both the emergence of the Limulus CRP variants and the mammalian CRP/SAP gene duplication. Limulus SAP, which does not exhibit the CRP characteristic of calcium-dependent binding to phosphocholine, is established as a pentraxin species distinct from all other known horseshoe crab pentraxins that exist in many variant forms sharing a high level of sequence homology. Copyright 2002 Elsevier Science Ltd.

  19. Structural analysis of key gap junction domains--Lessons from genome data and disease-linked mutants.

    PubMed

    Bai, Donglin

    2016-02-01

    A gap junction (GJ) channel is formed by docking of two GJ hemichannels and each of these hemichannels is a hexamer of connexins. All connexin genes have been identified in human, mouse, and rat genomes and their homologous genes in many other vertebrates are available in public databases. The protein sequences of these connexins align well with high sequence identity in the same connexin across different species. Domains in closely related connexins and several residues in all known connexins are also well-conserved. These conserved residues form signatures (also known as sequence logos) in these domains and are likely to play important biological functions. In this review, the sequence logos of individual connexins, groups of connexins with common ancestors, and all connexins are analyzed to visualize natural evolutionary variations and the hot spots for human disease-linked mutations. Several gap junction domains are homologous, likely forming similar structures essential for their function. The availability of a high resolution Cx26 GJ structure and the subsequently-derived homology structure models for other connexin GJ channels elevated our understanding of sequence logos at the three-dimensional GJ structure level, thus facilitating the understanding of how disease-linked connexin mutants might impair GJ structure and function. This knowledge will enable the design of complementary variants to rescue disease-linked mutants. Copyright © 2015 Elsevier Ltd. All rights reserved.

  20. Isolation, cDNA cloning and gene expression of an antibacterial protein from larvae of the coconut rhinoceros beetle, Oryctes rhinoceros.

    PubMed

    Yang, J; Yamamoto, M; Ishibashi, J; Taniai, K; Yamakawa, M

    1998-08-01

    An antibacterial protein, designated rhinocerosin, was purified to homogeneity from larvae of the coconut rhinoceros beetle, Oryctes rhinoceros, immunized with Escherichia coli. Based on the amino acid sequence of the N-terminal region, a degenerate primer was synthesized and reverse-transcriptase PCR was performed to clone rhinocerosin cDNA. As a result, a 279-bp fragment was obtained. The complete nucleotide sequence was determined by sequencing the extended rhinocerosin cDNA clone by 5' rapid amplification of cDNA ends. The deduced amino acid sequence of the mature portion of rhinocerosin was composed of 72 amino acids without cysteine residues and was shown to be rich in glycine (11.1%) and proline (11.1%) residues. Comparison of the deduced amino acid sequence of rhinocerosin with those of other antibacterial proteins indicated that it has 77.8% and 44.6% identity with holotricin 2 and coleoptrecin, respectively. Rhinocerosin had strong antibacterial activity against E. coli, Streptococcus pyogenes, Staphylococcus aureus but not against Pseudomonas aeruginosa. Results of reverse-transcriptase PCR analysis of gene expression in different tissues indicated that the rhinocerosin gene is strongly expressed in the fat body and the Malpighian tubule, and weakly expressed in hemocytes and midgut. In addition, gene expression was inducible by bacteria in the fat body, the Malpighian tubule and hemocyte but constitutive expression was observed in the midgut.
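
    The composition figures in this abstract are consistent with simple counting: 11.1% of a 72-residue sequence corresponds to 8 residues. The sketch below illustrates the calculation on a placeholder string; the string is not the rhinocerosin sequence, which the abstract does not give, and the function name is hypothetical.

```python
from collections import Counter


def residue_percent(sequence: str, residue: str) -> float:
    """Percentage of positions in `sequence` occupied by the one-letter code `residue`."""
    if not sequence:
        raise ValueError("sequence must not be empty")
    return 100.0 * Counter(sequence)[residue] / len(sequence)


# Placeholder 72-residue string: 8 Gly, 8 Pro, 56 Ala, and no Cys.
placeholder = "G" * 8 + "P" * 8 + "A" * 56

print(len(placeholder))
print(round(residue_percent(placeholder, "G"), 1))
print(round(residue_percent(placeholder, "P"), 1))
print(residue_percent(placeholder, "C"))
```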

  1. Albumin Redhill (-1 Arg, 320 Ala → Thr): A glycoprotein variant of human serum albumin whose precursor has an aberrant signal peptidase cleavage site

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Brennan, S.O.; Myles, T.; Peach, R.J.

    1990-01-01

    Albumin Redhill is an electrophoretically slow genetic variant of human serum albumin that does not bind ⁶³Ni²⁺ and has a molecular mass 2.5 kDa higher than normal albumin. Its inability to bind Ni²⁺ was explained by the finding of an additional residue of Arg at position -1. This did not explain the molecular basis of the genetic variation or the increase in apparent molecular mass. Fractionation of tryptic digests on concanavalin A-Sepharose followed by peptide mapping of the bound and unbound fractions and sequence analysis of the glycopeptides identified a mutation of 320 Ala → Thr. This introduces an Asn-Tyr-Thr oligosaccharide attachment sequence centered on Asn-318 and explains the increase in molecular mass. This, however, did not satisfactorily explain the presence of the additional Arg residue at position -1. DNA sequencing of polymerase chain reaction-amplified genomic DNA encoding the prepro sequence of albumin indicated an additional mutation of -2 Arg → Cys. The authors propose that the new Phe-Cys-Arg sequence in the propeptide is an aberrant signal peptidase cleavage site and that the signal peptidase cleaves the propeptide of albumin Redhill in the lumen of the endoplasmic reticulum before it reaches the Golgi vesicles, the site of the diarginyl-specific proalbumin convertase.

  2. Sulphur Atoms from Methionines Interacting with Aromatic Residues Are Less Prone to Oxidation

    PubMed Central

    Aledo, Juan C.; Cantón, Francisco R.; Veredas, Francisco J.

    2015-01-01

    Methionine residues exhibit different degrees of susceptibility to oxidation. Although solvent accessibility is a relevant factor, oxidation at particular sites cannot be unequivocally explained by accessibility alone. To explore other possible structural determinants, we assembled different sets of oxidation-sensitive and oxidation-resistant methionines contained in human proteins. Comparisons of the proteins containing oxidized methionines with all proteins in the human proteome led to the conclusion that the former exhibit a significantly higher mean value of methionine content than the latter. Within a given protein, an examination of the sequence surrounding the non-oxidized methionine revealed a preference for neighbouring tyrosine and tryptophan residues, but not for phenylalanine residues. However, because the interaction between sulphur atoms and aromatic residues has been reported to be important for the stabilization of protein structure, we carried out an analysis of the spatial interatomic distances between methionines and aromatic residues, including phenylalanine. The results of these analyses uncovered a new determinant for methionine oxidation: the S-aromatic motif, which decreases the reactivity of the involved sulphur towards oxidants. PMID:26597773

  3. KM+, a mannose-binding lectin from Artocarpus integrifolia: amino acid sequence, predicted tertiary structure, carbohydrate recognition, and analysis of the beta-prism fold.

    PubMed Central

    Rosa, J. C.; De Oliveira, P. S.; Garratt, R.; Beltramini, L.; Resing, K.; Roque-Barreira, M. C.; Greene, L. J.

    1999-01-01

    The complete amino acid sequence of the lectin KM+ from Artocarpus integrifolia (jackfruit), which contains 149 residues/mol, is reported and compared to those of other members of the Moraceae family, particularly that of jacalin, also from jackfruit, with which it shares 52% sequence identity. KM+ presents an acetyl-blocked N-terminus and is not posttranslationally modified by proteolytic cleavage as is the case for jacalin. Rather, it possesses a short, glycine-rich linker that unites the regions homologous to the alpha- and beta-chains of jacalin. The results of homology modeling implicate the linker sequence in sterically impeding rotation of the side chain of Asp141 within the binding site pocket. As a consequence, the aspartic acid is locked into a conformation adequate only for the recognition of equatorial hydroxyl groups on the C4 epimeric center (alpha-D-mannose, alpha-D-glucose, and their derivatives). In contrast, the internal cleavage of the jacalin chain permits free rotation of the homologous aspartic acid, rendering it capable of accepting hydrogen bonds from both possible hydroxyl configurations on C4. We suggest that, together with direct recognition of epimeric hydroxyls and the steric exclusion of disfavored ligands, conformational restriction of the lectin should be considered to be a new mechanism by which selectivity may be built into carbohydrate binding sites. Jacalin and KM+ adopt the beta-prism fold already observed in two unrelated protein families. Despite presenting little or no sequence similarity, an analysis of the beta-prism reveals a canonical feature repeatedly present in all such structures, which is based on six largely hydrophobic residues within a beta-hairpin containing two classic-type beta-bulges. We suggest the term beta-prism motif to describe this feature. PMID:10210179

  4. The carbohydrate-binding module (CBM)-like sequence is crucial for rice CWA1/BC1 function in proper assembly of secondary cell wall materials.

    PubMed

    Sato, Kanna; Ito, Sachiko; Fujii, Takeo; Suzuki, Ryu; Takenouchi, Sachi; Nakaba, Satoshi; Funada, Ryo; Sano, Yuzou; Kajita, Shinya; Kitano, Hidemi; Katayama, Yoshihiro

    2010-11-01

    We recently reported that the cwa1 mutation disturbed the deposition and assembly of secondary cell wall materials in the cortical fiber of rice internodes. Genetic analysis revealed that cwa1 is allelic to bc1, which encodes glycosylphosphatidylinositol (GPI)-anchored COBRA-like protein with the highest homology to Arabidopsis COBRA-like 4 (COBL4) and maize Brittle Stalk 2 (Bk2). Our results suggested that CWA1/BC1 plays a role in assembling secondary cell wall materials at appropriate sites, enabling synthesis of highly ordered secondary cell wall structure with solid and flexible internodes in rice. The N-terminal amino acid sequence of CWA1/BC1, as well as its orthologs (COBL4, Bk2) and other BC1-like proteins in rice, shows weak similarity to a family II carbohydrate-binding module (CBM2) of several bacterial cellulases. To investigate the importance of the CBM-like sequence of CWA1/BC1 in the assembly of secondary cell wall materials, Trp residues in the CBM-like sequence, which is important for carbohydrate binding, were substituted with Val residues, and the mutated sequence was introduced into the cwa1 mutant. CWA1/BC1 with the mutated sequence did not complement the abnormal secondary cell walls seen in the cwa1 mutant, indicating that the CBM-like sequence is essential for the proper function of CWA1/BC1, including assembly of secondary cell wall materials.

  5. The amino acid sequence around the active-site cysteine and histidine residues of stem bromelain

    PubMed Central

    Husain, S. S.; Lowe, G.

    1970-01-01

    Stem bromelain that had been irreversibly inhibited with 1,3-dibromo[2-14C]-acetone was reduced with sodium borohydride and carboxymethylated with iodoacetic acid. After digestion with trypsin and α-chymotrypsin three radioactive peptides were isolated chromatographically. The amino acid sequences around the cross-linked cysteine and histidine residues were determined and showed a high degree of homology with those around the active-site cysteine and histidine residues of papain and ficin. PMID:5420046

  6. Analysis of Ribosome Inactivating Protein (RIP): A Bioinformatics Approach

    NASA Astrophysics Data System (ADS)

    Jothi, G. Edward Gnana; Majilla, G. Sahaya Jose; Subhashini, D.; Deivasigamani, B.

    2012-10-01

    In spite of the medical advances in recent years, the world is in need of different sources to encounter certain health issues. Ribosome Inactivating Proteins (RIPs) were found to be one among them. In order to get easy access to information about RIPs, there is a need to analyse RIPs towards constructing a database on RIPs. Also, multiple sequence alignment was done towards screening for homologues of significant RIPs from rare sources against RIPs from easily available sources in terms of similarity. Protein sequences were retrieved from SWISS-PROT and were further analysed using pairwise and multiple sequence alignment. Analysis shows that 151 RIPs have been characterized to date. Amongst them, there are 87 type I, 37 type II, 1 type III and 25 unknown RIPs. The sequence length information of various RIPs about the availability of full or partial sequence was also found. The multiple sequence alignment of 37 type I RIP using the online server Multalin indicates the presence of 20 conserved residues. Pairwise alignment and multiple sequence alignment of certain selected RIPs in two groups, namely Group I and Group II, were carried out and the consensus level was found to be 98%, 98% and 90% respectively.
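
    Counting "conserved residues" in a multiple sequence alignment usually means finding columns in which every sequence carries the same residue. The sketch below shows that idea on an invented three-row alignment. It is not the Multalin procedure and not the RIP data; the strict all-identical, no-gap criterion and the function name are assumptions made for illustration.

```python
def conserved_columns(alignment: list[str], gap: str = "-") -> list[int]:
    """Return 1-based indices of columns where all rows share one non-gap residue."""
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(row) != length for row in alignment):
        raise ValueError("all aligned rows must have the same length")
    conserved = []
    for col in range(length):
        residues = {row[col] for row in alignment}
        if len(residues) == 1 and gap not in residues:
            conserved.append(col + 1)
    return conserved


# Toy alignment (hypothetical sequences).
toy_alignment = ["EAGR-W", "EVGRTW", "ESGRAW"]
print(conserved_columns(toy_alignment))
```

    Alignment servers often report consensus at a chosen threshold (for example, a residue shared by most but not all rows), so a looser criterion would report more columns than this strict one.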

  7. Modeling coding-sequence evolution within the context of residue solvent accessibility.

    PubMed

    Scherrer, Michael P; Meyer, Austin G; Wilke, Claus O

    2012-09-12

    Protein structure mediates site-specific patterns of sequence divergence. In particular, residues in the core of a protein (solvent-inaccessible residues) tend to be more evolutionarily conserved than residues on the surface (solvent-accessible residues). Here, we present a model of sequence evolution that explicitly accounts for the relative solvent accessibility of each residue in a protein. Our model is a variant of the Goldman-Yang 1994 (GY94) model in which all model parameters can be functions of the relative solvent accessibility (RSA) of a residue. We apply this model to a data set comprised of nearly 600 yeast genes, and find that an evolutionary-rate ratio ω that varies linearly with RSA provides a better model fit than an RSA-independent ω or an ω that is estimated separately in individual RSA bins. We further show that the branch length t and the transition-transversion ratio κ also vary with RSA. The RSA-dependent GY94 model performs better than an RSA-dependent Muse-Gaut 1994 (MG94) model in which the synonymous and non-synonymous rates individually are linear functions of RSA. Finally, protein core size affects the slope of the linear relationship between ω and RSA, and gene expression level affects both the intercept and the slope. Structure-aware models of sequence evolution provide a significantly better fit than traditional models that neglect structure. The linear relationship between ω and RSA implies that genes are better characterized by their ω slope and intercept than by just their mean ω.
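
    The closing point of this abstract, that slope and intercept describe a gene better than its mean ω, can be illustrated with a small numerical sketch. The code below only evaluates a linear function ω(RSA) = intercept + slope × RSA. It does not fit a GY94 model, and the RSA values and the two parameter pairs are hypothetical, chosen so that both genes have the same mean ω.

```python
def omega(rsa: float, intercept: float, slope: float) -> float:
    """Evolutionary-rate ratio at a site, modelled as a linear function of RSA."""
    return intercept + slope * rsa


def mean_omega(rsa_values: list[float], intercept: float, slope: float) -> float:
    """Average omega over the sites of a gene."""
    return sum(omega(r, intercept, slope) for r in rsa_values) / len(rsa_values)


# Hypothetical per-site RSA values, from fully buried (0.0) to fully exposed (1.0).
rsa_values = [0.0, 0.25, 0.5, 0.75, 1.0]

# Hypothetical (intercept, slope) pairs for two genes.
gene_a = (0.05, 0.10)  # weak dependence of omega on RSA
gene_b = (0.00, 0.20)  # strong dependence of omega on RSA

print(round(mean_omega(rsa_values, *gene_a), 3), round(mean_omega(rsa_values, *gene_b), 3))
print(round(omega(0.0, *gene_a), 3), round(omega(0.0, *gene_b), 3))
print(round(omega(1.0, *gene_a), 3), round(omega(1.0, *gene_b), 3))
```

    Both hypothetical genes share the same mean ω, yet their buried and exposed sites evolve at different relative rates, which is information the mean alone would hide.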

  8. Active site of tripeptidyl peptidase II from human erythrocytes is of the subtilisin type.

    PubMed Central

    Tomkinson, B; Wernstedt, C; Hellman, U; Zetterqvist, O

    1987-01-01

    The present report presents evidence that the amino acid sequence around the serine of the active site of human tripeptidyl peptidase II is of the subtilisin type. The enzyme from human erythrocytes was covalently labeled at its active site with [3H]diisopropyl fluorophosphate, and the protein was subsequently reduced, alkylated, and digested with trypsin. The labeled tryptic peptides were purified by gel filtration and repeated reversed-phase HPLC, and their amino-terminal sequences were determined. Residue 9 contained the radioactive label and was, therefore, considered to be the active serine residue. The primary structure of the part of the active site (residues 1-10) containing this residue was concluded to be Xaa-Thr-Gln-Leu-Met-Asx-Gly-Thr-Ser-Met. This amino acid sequence is homologous to the sequence surrounding the active serine of the microbial peptidases subtilisin and thermitase. These data demonstrate that human tripeptidyl peptidase II represents a potentially distinct class of human peptidases and raise the question of an evolutionary relationship between the active site of a mammalian peptidase and that of the subtilisin family of serine peptidases. PMID:3313395

  9. Phosphorylation of Ser136 is critical for potent bone sialoprotein-mediated nucleation of hydroxyapatite crystals.

    PubMed

    Baht, Gurpreet S; O'Young, Jason; Borovina, Antonia; Chen, Hong; Tye, Coralee E; Karttunen, Mikko; Lajoie, Gilles A; Hunter, Graeme K; Goldberg, Harvey A

    2010-05-27

    Acidic phosphoproteins of mineralized tissues such as bone and dentin are believed to play important roles in HA (hydroxyapatite) nucleation and growth. BSP (bone sialoprotein) is the most potent known nucleator of HA, an activity that is thought to be dependent on phosphorylation of the protein. The present study identifies the role phosphate groups play in mineral formation. Recombinant BSP and peptides corresponding to residues 1-100 and 133-205 of the rat sequence were phosphorylated with CK2 (protein kinase CK2). Phosphorylation increased the nucleating activity of BSP and BSP-(133-205), but not BSP-(1-100). MS analysis revealed that the major site phosphorylated within BSP-(133-205) was Ser136, a site adjacent to the series of contiguous glutamate residues previously implicated in HA nucleation. The critical role of phosphorylated Ser136 in HA nucleation was confirmed by site-directed mutagenesis and functional analyses. Furthermore, peptides corresponding to the 133-148 sequence of rat BSP were synthesized with or without a phosphate group on Ser136. As expected, the phosphopeptide was a more potent nucleator. The mechanism of nucleation was investigated using molecular-dynamics simulations analysing BSP-(133-148) interacting with the {100} crystal face of HA. Both phosphorylated and non-phosphorylated sequences adsorbed to HA in extended conformations with alternating residues in contact with and facing away from the crystal face. However, this alternating-residue pattern was more pronounced when Ser136 was phosphorylated. These studies demonstrate a critical role for Ser136 phosphorylation in BSP-mediated HA nucleation and identify a unique mode of interaction between the nucleating site of the protein and the {100} face of HA.

  10. The NH2-terminal php domain of the alpha subunit of the Escherichia coli replicase binds the epsilon proofreading subunit.

    PubMed

    Wieczorek, Anna; McHenry, Charles S

    2006-05-05

    The alpha subunit of the replicase of all bacteria contains a php domain, initially identified by its similarity to histidinol phosphatase but of otherwise unknown function (Aravind, L., and Koonin, E. V. (1998) Nucleic Acids Res. 26, 3746-3752). Deletion of 60 residues from the NH2 terminus of the alpha php domain destroys epsilon binding. The minimal 255-residue php domain, estimated by sequence alignment with homolog YcdX, is insufficient for epsilon binding. However, a 320-residue segment including sequences that immediately precede the polymerase domain binds epsilon with the same affinity as the 1160-residue full-length alpha subunit. A subset of mutations of a conserved acidic residue (Asp43 in Escherichia coli alpha) present in the php domain of all bacterial replicases resulted in defects in epsilon binding. Using sequence alignments, we show that the prototypical gram+ Pol C, which contains the polymerase and proofreading activities within the same polypeptide chain, has an epsilon-like sequence inserted in a surface loop near the center of the homologous YcdX protein. These findings suggest that the php domain serves as a platform to enable coordination of proofreading and polymerase activities during chromosomal replication.

  11. On the Role of Aggregation Prone Regions in Protein Evolution, Stability, and Enzymatic Catalysis: Insights from Diverse Analyses

    PubMed Central

    Buck, Patrick M.; Kumar, Sandeep; Singh, Satish K.

    2013-01-01

    The various roles that aggregation prone regions (APRs) are capable of playing in proteins are investigated here via comprehensive analyses of multiple non-redundant datasets containing randomly generated amino acid sequences, monomeric proteins, intrinsically disordered proteins (IDPs) and catalytic residues. Results from this study indicate that the aggregation propensities of monomeric protein sequences have been minimized compared to random sequences with uniform and natural amino acid compositions, as observed by a lower average aggregation propensity and fewer APRs that are shorter in length and more often punctuated by gate-keeper residues. However, evidence for evolutionary selective pressure to disrupt these sequence regions among homologous proteins is inconsistent. APRs are less conserved than average sequence identity among closely related homologues (≥80% sequence identity with a parent) but APRs are more conserved than average sequence identity among homologues that have at least 50% sequence identity with a parent. Structural analyses of APRs indicate that APRs are three times more likely to contain ordered versus disordered residues and that APRs frequently contribute more towards stabilizing proteins than equal length segments from the same protein. Catalytic residues and APRs were also found to be in structural contact significantly more often than expected by random chance. Our findings suggest that proteins have evolved by optimizing their risk of aggregation for cellular environments by both minimizing aggregation prone regions and by conserving those that are important for folding and function. In many cases, these sequence optimizations are insufficient to develop recombinant proteins into commercial products. Rational design strategies aimed at improving protein solubility for biotechnological purposes should carefully evaluate the contributions made by candidate APRs, targeted for disruption, towards protein structure and activity. PMID:24146608

  12. Inference of Functionally-Relevant N-acetyltransferase Residues Based on Statistical Correlations.

    PubMed

    Neuwald, Andrew F; Altschul, Stephen F

    2016-12-01

    Over evolutionary time, members of a superfamily of homologous proteins sharing a common structural core diverge into subgroups filling various functional niches. At the sequence level, such divergence appears as correlations that arise from residue patterns distinct to each subgroup. Such a superfamily may be viewed as a population of sequences corresponding to a complex, high-dimensional probability distribution. Here we model this distribution as hierarchical interrelated hidden Markov models (hiHMMs), which describe these sequence correlations implicitly. By characterizing such correlations one may hope to obtain information regarding functionally-relevant properties that have thus far evaded detection. To do so, we infer a hiHMM distribution from sequence data using Bayes' theorem and Markov chain Monte Carlo (MCMC) sampling, which is widely recognized as the most effective approach for characterizing a complex, high dimensional distribution. Other routines then map correlated residue patterns to available structures with a view to hypothesis generation. When applied to N-acetyltransferases, this reveals sequence and structural features indicative of functionally important, yet generally unknown biochemical properties. Even for sets of proteins for which nothing is known beyond unannotated sequences and structures, this can lead to helpful insights. We describe, for example, a putative coenzyme-A-induced-fit substrate binding mechanism mediated by arginine residue switching between salt bridge and π-π stacking interactions. A suite of programs implementing this approach is available (psed.igs.umaryland.edu).

  13. Alternative dimerization interfaces in the glucocorticoid receptor-α ligand binding domain.

    PubMed

    Bianchetti, Laurent; Wassmer, Bianca; Defosset, Audrey; Smertina, Anna; Tiberti, Marion L; Stote, Roland H; Dejaegere, Annick

    2018-04-30

    Nuclear hormone receptors (NRs) constitute a large family of multi-domain ligand-activated transcription factors. Dimerization is essential for their regulation, and both DNA binding domain (DBD) and ligand binding domain (LBD) are implicated in dimerization. Intriguingly, the glucocorticoid receptor-α (GRα) presents a DBD dimeric architecture similar to that of the homologous estrogen receptor-α (ERα), but an atypical dimeric architecture for the LBD. The physiological relevance of the proposed GRα LBD dimer is a subject of debate. We analyzed all GRα LBD homodimers observed in crystals using an energetic analysis based on the PISA and on the MM/PBSA methods and a sequence conservation analysis, using the ERα LBD dimer as a reference point. Several dimeric assemblies were observed for GRα LBD. The assembly generally taken to be physiologically relevant showed weak binding free energy and no significant residue conservation at the contact interface, while an alternative homodimer mediated by both helix 9 and C-terminal residues showed significant binding free energy and residue conservation. However, none of the GRα LBD assemblies found in crystals are as stable or conserved as the canonical ERα LBD dimer. GRα C-terminal sequence (F-domain) forms a steric obstacle to the canonical dimer assembly in all available structures. Our analysis calls for a re-examination of the currently accepted GRα homodimer structure and experimental investigations of the alternative architectures. This work questions the validity of the currently accepted architecture. This has implications for interpreting physiological data and for therapeutic design pertaining to glucocorticoid research. Copyright © 2018. Published by Elsevier B.V.

  14. Phenotype–genotype correlation in Hirschsprung disease is illuminated by comparative analysis of the RET protein sequence

    PubMed Central

    Kashuk, Carl S.; Stone, Eric A.; Grice, Elizabeth A.; Portnoy, Matthew E.; Green, Eric D.; Sidow, Arend; Chakravarti, Aravinda; McCallion, Andrew S.

    2005-01-01

    The ability to discriminate between deleterious and neutral amino acid substitutions in the genes of patients remains a significant challenge in human genetics. The increasing availability of genomic sequence data from multiple vertebrate species allows sequence conservation and physicochemical properties of residues to be used for functional prediction. In this study, the RET receptor tyrosine kinase serves as a model disease gene in which a broad spectrum (≥116) of disease-associated mutations has been identified among patients with Hirschsprung disease and multiple endocrine neoplasia type 2. We report the alignment of the human RET protein sequence with the orthologous sequences of 12 non-human vertebrates (eight mammalian, one avian, and three teleost species), their comparative analysis, the evolutionary topology of the RET protein, and predicted tolerance for all published missense mutations. We show that, although evolutionary conservation alone provides significant information to predict the effect of a RET mutation, a model that combines comparative sequence data with analysis of physicochemical properties in a quantitative framework provides far greater accuracy. Although the ability to discern the impact of a mutation is imperfect, our analyses permit substantial discrimination between predicted functional classes of RET mutations and disease severity even for a multigenic disease such as Hirschsprung disease. PMID:15956201

  15. A comparison of dynamic and static economic models of uneven-aged stand management

    Treesearch

    Robert G. Haight

    1985-01-01

    Numerical techniques have been used to compute the discrete-time sequence of residual diameter distributions that maximize the present net worth (PNW) of harvestable volume from an uneven-aged stand. Results contradicted optimal steady-state diameter distributions determined with static analysis. In this paper, optimality conditions for solutions to dynamic and static...

  16. Functional analysis of (4S)-limonene synthase mutants reveals determinants of catalytic outcome in a model monoterpene synthase

    DOE PAGES

    Srividya, Narayanan; Davis, Edward M.; Croteau, Rodney B.; ...

    2015-03-02

    We used crystal structural data for (4S)-limonene synthase [(4S)-LS] of spearmint (Mentha spicata L.) to infer which amino acid residues are in close proximity to the substrate and carbocation intermediates of the enzymatic reaction. Alanine-scanning mutagenesis of 48 amino acids combined with enzyme fidelity analysis [percentage of (-)-limonene produced] indicated which residues are most likely to constitute the active site. Furthermore, the mutation of residues W324 and H579 caused a significant drop in enzyme activity and formation of products (myrcene, linalool, and terpineol) characteristic of a premature termination of the reaction. A double mutant (W324A/H579A) had no detectable enzyme activity, indicating that either substrate binding or the terminating reaction was impaired. Exchanges to other aromatic residues (W324H, W324F, W324Y, H579F, H579Y, and H579W) resulted in enzyme catalysts with significantly reduced activity. Sequence comparisons across the angiosperm lineage provided evidence that W324 is a conserved residue, whereas the position equivalent to H579 is occupied by aromatic residues (H, F, or Y). Our results are consistent with a critical role of W324 and H579 in the stabilization of carbocation intermediates. Finally, the potential of these residues to serve as the catalytic base facilitating the terminal deprotonation reaction is discussed.

  17. Analysis of deep learning methods for blind protein contact prediction in CASP12.

    PubMed

    Wang, Sheng; Sun, Siqi; Xu, Jinbo

    2018-03-01

    Here we present the results of protein contact prediction achieved in CASP12 by our RaptorX-Contact server, which is an early implementation of our deep learning method for contact prediction. On a set of 38 free-modeling target domains with a median family size of around 58 effective sequences, our server obtained an average top L/5 long- and medium-range contact accuracy of 47% and 44%, respectively (L = length). A complete implementation has an average accuracy of 59% and 57%, respectively. Our deep learning method formulates contact prediction as a pixel-level image labeling problem and simultaneously predicts all residue pairs of a protein using a combination of two deep residual neural networks, taking as input the residue conservation information, predicted secondary structure and solvent accessibility, contact potential, and coevolution information. Our approach differs from existing methods mainly in (1) formulating contact prediction as a pixel-level image labeling problem instead of an image-level classification problem; (2) simultaneously predicting all contacts of an individual protein to make effective use of contact occurrence patterns; and (3) integrating both one-dimensional and two-dimensional deep convolutional neural networks to effectively learn complex sequence-structure relationship including high-order residue correlation. This paper discusses the RaptorX-Contact pipeline, both contact prediction and contact-based folding results, and finally the strength and weakness of our method. © 2017 Wiley Periodicals, Inc.
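
    The "top L/5" accuracy quoted above can be illustrated with a short sketch. It is not the RaptorX-Contact code: the function name, the dictionary input format, and the default separation threshold (24 residues for long-range pairs, with 12 to 23 commonly used for medium-range) are assumptions made for illustration.

```python
def top_contact_accuracy(scores, contacts, length, min_sep=24, max_sep=None, k=5):
    """Fraction of the top length//k scored residue pairs that are true contacts.

    scores   -- dict mapping (i, j) with i < j to a predicted contact probability
    contacts -- set of (i, j) pairs that are true contacts
    min_sep / max_sep -- sequence-separation window (assumed convention)
    """
    candidates = [
        (pair, score) for pair, score in scores.items()
        if pair[1] - pair[0] >= min_sep
        and (max_sep is None or pair[1] - pair[0] <= max_sep)
    ]
    # Rank candidate pairs from most to least confident.
    candidates.sort(key=lambda item: item[1], reverse=True)
    top = [pair for pair, _ in candidates[:max(1, length // k)]]
    if not top:
        return 0.0
    return sum(pair in contacts for pair in top) / len(top)


# Toy example: a 10-residue chain, so the top L/5 = 2 pairs are evaluated.
toy_scores = {(0, 5): 0.9, (1, 8): 0.8, (2, 9): 0.7, (3, 4): 0.99}
toy_contacts = {(0, 5), (2, 9), (3, 4)}
accuracy = top_contact_accuracy(toy_scores, toy_contacts, length=10, min_sep=3)
```

    In the toy example the pair (3, 4) is ignored because it is closer than the separation threshold, so only one of the two top-ranked pairs is a true contact.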

  18. Quantitative functional characterization of conserved molecular interactions in the active site of mannitol 2-dehydrogenase

    PubMed Central

    Lucas, James E; Siegel, Justin B

    2015-01-01

    Enzyme active site residues are often highly conserved, indicating a significant role in function. In this study we quantitate the functional contribution for all conserved molecular interactions occurring within a Michaelis complex for mannitol 2-dehydrogenase derived from Pseudomonas fluorescens (pfMDH). Through systematic mutagenesis of active site residues, we reveal that the molecular interactions in pfMDH mediated by highly conserved residues not directly involved in reaction chemistry can be as important to catalysis as those directly involved in the reaction chemistry. This quantitative analysis of the molecular interactions within the pfMDH active site provides direct insight into the functional role of each molecular interaction, several of which were unexpected based on canonical sequence conservation and structural analyses. PMID:25752240

  19. Molecular modeling of ligand-receptor interactions in the OR5 olfactory receptor.

    PubMed

    Singer, M S; Shepherd, G M

    1994-06-02

    Olfactory receptors belong to the superfamily of seven transmembrane domain, G protein-coupled receptors. In order to begin analysis of mechanisms of receptor activation, a computer model of the OR5 olfactory receptor has been constructed and compared with other members of this superfamily. We have tested docking of the odor molecule lyral, which is known to activate the OR5 receptor. The results point to specific ligand-binding residues on helices III through VII that form a binding pocket in the receptor. Some of these residues occupy sequence positions identical to ligand-binding residues conserved among other superfamily members. The results provide new insights into possible molecular mechanisms of odor recognition and suggest hypotheses to guide future experimental studies using site-directed mutagenesis.

  20. The Minimum M3-M4 Loop Length of Neurotransmitter-activated Pentameric Receptors Is Critical for the Structural Integrity of Cytoplasmic Portals

    PubMed Central

    Baptista-Hon, Daniel T.; Deeb, Tarek Z.; Lambert, Jeremy J.; Peters, John A.; Hales, Tim G.

    2013-01-01

    The 5-HT3A receptor homology model, based on the partial structure of the nicotinic acetylcholine receptor from Torpedo marmorata, reveals an asymmetric ion channel with five portals framed by adjacent helical amphipathic (HA) stretches within the 114-residue loop between the M3 and M4 membrane-spanning domains. The positive charge of Arg-436, located within the HA stretch, is a rate-limiting determinant of single channel conductance (γ). Further analysis reveals that positive charge and volume of residue 436 are determinants of 5-HT3A receptor inward rectification, exposing an additional role for portals. A structurally unresolved stretch of 85 residues constitutes the bulk of the M3-M4 loop, leaving a >45-Å gap in the model between M3 and the HA stretch. There are no additional structural data for this loop, which is vestigial in bacterial pentameric ligand-gated ion channels and was largely removed for crystallization of the Caenorhabditis elegans glutamate-activated pentameric ligand-gated ion channels. We created 5-HT3A subunit loop truncation mutants, in which sequences framing the putative portals were retained, to determine the minimum number of residues required to maintain their functional integrity. Truncation to between 90 and 75 amino acids produced 5-HT3A receptors with unaltered rectification. Truncation to 70 residues abolished rectification and increased γ. These findings reveal a critical M3-M4 loop length required for functions attributable to cytoplasmic portals. Examination of all 44 subunits of the human neurotransmitter-activated Cys-loop receptors reveals that, despite considerable variability in their sequences and lengths, all M3-M4 loops exceed 70 residues, suggesting a fundamental requirement for portal integrity. PMID:23740249

  1. Evolutionary analysis of FAM83H in vertebrates.

    PubMed

    Huang, Wushuang; Yang, Mei; Wang, Changning; Song, Yaling

    2017-01-01

    Amelogenesis imperfecta is a group of disorders causing abnormalities in enamel formation in various phenotypes. Many mutations in the FAM83H gene have been identified to result in autosomal dominant hypocalcified amelogenesis imperfecta in different populations. However, the structure and function of FAM83H and its pathological mechanism have yet to be further explored. Evolutionary analysis is an alternative for revealing residues or motifs that are important for protein function. In the present study, we chose 50 vertebrate species in public databases representative of approximately 230 million years of evolution, including 1 amphibian, 2 fishes, 7 sauropsids and 40 mammals, and we performed evolutionary analysis on the FAM83H protein. By sequence alignment, conserved residues and motifs were indicated, and the loss of important residues and motifs in five special species (Malayan pangolin, platypus, minke whale, nine-banded armadillo and aardvark) was discovered. A phylogenetic time tree showed the divergence process of FAM83H. Positive selection sites in the C-terminus suggested that the C-terminus of FAM83H played certain adaptive roles during evolution. The results confirmed some important motifs reported in previous findings and identified some new highly conserved residues and motifs that need further investigation. The results suggest that the C-terminus of FAM83H contains key conserved regions critical to enamel formation and calcification.

  2. Protein structure based prediction of catalytic residues

    PubMed Central

    2013-01-01

    Background Worldwide structural genomics projects continue to release new protein structures at an unprecedented pace, so far nearly 6000, but only about 60% of these proteins have any sort of functional annotation. Results We explored a range of features that can be used for the prediction of functional residues given a known three-dimensional structure. These features include various centrality measures of nodes in graphs of interacting residues: closeness, betweenness and page-rank centrality. We also analyzed the distance of functional amino acids to the general center of mass (GCM) of the structure, relative solvent accessibility (RSA), and the use of relative entropy as a measure of sequence conservation. From the selected features, neural networks were trained to identify catalytic residues. We found that using distance to the GCM together with amino acid type provides a good discriminant function, when combined independently with sequence conservation. Using an independent test set of 29 annotated protein structures, the method returned 411 of the initial 9262 residues as the most likely to be involved in function. The output 411 residues contain 70 of the annotated 111 catalytic residues. This represents an approximately 14-fold enrichment of catalytic residues on the entire input set (corresponding to a sensitivity of 63% and a precision of 17%), a performance competitive with that of other state-of-the-art methods. Conclusions We found that several of the graph based measures utilize the same underlying feature of protein structures, which can be simply and more effectively captured with the distance to GCM definition. This also has the added advantage of simplicity and easy implementation. Meanwhile sequence conservation remains by far the most influential feature in identifying functional residues. We also found that due to the rapid changes in size and composition of sequence databases, conservation calculations must be recalibrated for specific reference databases. PMID:23433045
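
    The enrichment, sensitivity and precision figures reported above follow directly from the four counts given in the abstract. The sketch below reproduces the arithmetic; the function and variable names are illustrative only.

```python
def enrichment_stats(selected, true_positives, total_residues, total_catalytic):
    """Return (precision, sensitivity, fold enrichment) for a residue selection."""
    precision = true_positives / selected            # hits among returned residues
    sensitivity = true_positives / total_catalytic   # annotated residues recovered
    background = total_catalytic / total_residues    # catalytic fraction of the input
    return precision, sensitivity, precision / background


# Counts taken from the abstract: 411 residues returned out of 9262,
# containing 70 of the 111 annotated catalytic residues.
precision, sensitivity, enrichment = enrichment_stats(411, 70, 9262, 111)
print(f"precision={precision:.2f} sensitivity={sensitivity:.2f} enrichment={enrichment:.1f}x")
```

    With these counts the precision is 70/411, the sensitivity is 70/111, and the enrichment is the precision divided by the background rate 111/9262.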

  3. Primary structures of ribosomal proteins from the archaebacterium Halobacterium marismortui and the eubacterium Bacillus stearothermophilus.

    PubMed

    Arndt, E; Scholzen, T; Krömer, W; Hatakeyama, T; Kimura, M

    1991-06-01

    Approximately 40 ribosomal proteins from each of Halobacterium marismortui and Bacillus stearothermophilus have been sequenced either by direct protein sequence analysis or by DNA sequence analysis of the appropriate genes. The comparison of the amino acid sequences from the archaebacterium H. marismortui with the available ribosomal proteins from the eubacterial and eukaryotic kingdoms revealed four different groups of proteins: 24 proteins are related to both eubacterial and eukaryotic proteins. Eleven proteins are exclusively related to eukaryotic counterparts. For three proteins only eubacterial relatives could be found, and for another three proteins no counterpart could be found. The similarities of the halobacterial ribosomal proteins are in general somewhat higher to their eukaryotic than to their eubacterial counterparts. The comparison of B. stearothermophilus proteins with their E. coli homologues showed that the proteins evolved at different rates. Some proteins are highly conserved with 64-76% identity, others are poorly conserved with only 25-34% identical amino acid residues.
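
    Percent identity figures such as those above depend on the counting convention. The following minimal sketch assumes two pre-aligned sequences of equal length and counts identical residues over columns where neither sequence has a gap; other denominators (alignment length, shorter sequence length) are also in use, and the abstract does not say which was applied.

```python
def percent_identity(seq_a, seq_b, gap="-"):
    """Percent identical residues over gap-free columns of a pairwise alignment."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns in which both sequences carry a residue.
    columns = [(a, b) for a, b in zip(seq_a, seq_b) if a != gap and b != gap]
    if not columns:
        return 0.0
    matches = sum(a == b for a, b in columns)
    return 100.0 * matches / len(columns)


# Hypothetical aligned fragments, not taken from the paper.
identity = percent_identity("MKV-LA", "MRVQLA")
```

    Here five columns are gap-free and four of them match.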

  4. Cleavage sites within the poliovirus capsid protein precursors

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Larsen, G.R.; Anderson, C.W.; Dorner, A.J.

    1982-01-01

    Partial amino-terminal sequence analysis was performed on radiolabeled poliovirus capsid proteins VP1, VP2, and VP3. A computer-assisted comparison of the amino acid sequences obtained with that predicted by the nucleotide sequence of the poliovirus genome allows assignment of the amino terminus of each capsid protein to a unique position within the virus polyprotein. Sequence analysis of trypsin-digested VP4, which has a blocked amino terminus, demonstrates that VP4 is encoded at or very near to the amino terminus of the polyprotein. The gene order of the capsid proteins is VP4-VP2-VP3-VP1. Cleavage of VP0 to VP4 and VP2 is shown to occur between asparagine and serine, whereas the cleavages that separate VP2/VP3 and VP3/VP1 occur between glutamine and glycine residues. This finding supports the hypothesis that the cleavage of VP0, which occurs during virion morphogenesis, is distinct from the cleavages that separate functional regions of the polyprotein.

  5. Application of the MIDAS approach for analysis of lysine acetylation sites.

    PubMed

    Evans, Caroline A; Griffiths, John R; Unwin, Richard D; Whetton, Anthony D; Corfe, Bernard M

    2013-01-01

    Multiple Reaction Monitoring Initiated Detection and Sequencing (MIDAS™) is a mass spectrometry-based technique for the detection and characterization of specific post-translational modifications (Unwin et al. 4:1134-1144, 2005), for example acetylated lysine residues (Griffiths et al. 18:1423-1428, 2007). The MIDAS™ technique has application for discovery and analysis of acetylation sites. It is a hypothesis-driven approach that requires a priori knowledge of the primary sequence of the target protein and a proteolytic digest of this protein. MIDAS essentially performs a targeted search for the presence of modified, for example acetylated, peptides. The detection is based on the combination of the predicted molecular weight (measured as mass-charge ratio) of the acetylated proteolytic peptide and a diagnostic fragment (product ion of m/z 126.1), which is generated by specific fragmentation of acetylated peptides during collision induced dissociation performed in tandem mass spectrometry (MS) analysis. Sequence information is subsequently obtained which enables acetylation site assignment. The technique of MIDAS was later trademarked by ABSciex for targeted protein analysis where an MRM scan is combined with full MS/MS product ion scan to enable sequence confirmation.
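
    The precursor side of such a targeted search rests on a predicted mass-charge ratio for each candidate acetylated peptide. The sketch below shows that calculation under stated assumptions: it uses standard monoisotopic residue masses for a handful of amino acids only, a mass shift of 42.01057 Da per acetyl group, and a hypothetical peptide; it is not the MIDAS software workflow.

```python
# Monoisotopic residue masses in Da (subset only, for illustration).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "V": 99.06841,
    "L": 113.08406, "K": 128.09496, "R": 156.10111,
}
WATER = 18.01056    # added once per peptide (free termini)
PROTON = 1.00728    # added per positive charge
ACETYL = 42.01057   # mass shift per acetyl group


def acetylated_mz(peptide, n_acetyl=1, charge=2):
    """Predicted m/z of a peptide carrying n_acetyl acetyl groups."""
    neutral = sum(RESIDUE_MASS[aa] for aa in peptide) + WATER + n_acetyl * ACETYL
    return (neutral + charge * PROTON) / charge


# Hypothetical dipeptide with one acetylated lysine.
mz_singly = acetylated_mz("GK", n_acetyl=1, charge=1)
mz_doubly = acetylated_mz("GK", n_acetyl=1, charge=2)
```

    A targeted method would pair each such precursor m/z with the diagnostic product ion at m/z 126.1 mentioned in the abstract.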

  6. Identification of short single disulfide-containing contryphans from the venom of cone snails using de novo mass spectrometry-based sequencing methods.

    PubMed

    Franklin, Jayaseelan Benjamin; Rajesh, Rajaian Pushpabai; Vinithkumar, Nambali Valsalan; Kirubagaran, Ramalingam

    2017-06-15

    We identified 12 short single disulfide-containing conopeptides from the venom of Conus coronatus, C. leopardus, C. lividus and C. zonatus. Interestingly, we detected the shortest contryphan sequence thus far characterized which contains only six amino acid residues. We also identified three distinct contryphan sequences of C. lividus without any proline residues and one sequence with an unusual post-translational modification (bromination of tryptophan). Furthermore, we characterized venom peptides of C. zonatus for the first time. Copyright © 2017 Elsevier Ltd. All rights reserved.

  7. Isotope-labeled cross-linkers and Fourier transform ion cyclotron resonance mass spectrometry for structural analysis of a protein/peptide complex.

    PubMed

    Ihling, Christian; Schmidt, Andreas; Kalkhof, Stefan; Schulz, Daniela M; Stingl, Christoph; Mechtler, Karl; Haack, Michael; Beck-Sickinger, Annette G; Cooper, Dermot M F; Sinz, Andrea

    2006-08-01

    For structural studies of proteins and their complexes, chemical cross-linking combined with mass spectrometry presents a promising strategy to obtain structural data of protein interfaces from low quantities of proteins within a short time. We explore the use of isotope-labeled cross-linkers in combination with Fourier transform ion cyclotron resonance (FTICR) mass spectrometry for a more efficient identification of cross-linker containing species. For our studies, we chose the calcium-independent complex between calmodulin and a 25-amino acid peptide from the C-terminal region of adenylyl cyclase 8 containing an "IQ-like motif." Cross-linking reactions between calmodulin and the peptide were performed in the absence of calcium using the amine-reactive, isotope-labeled (d0 and d4) cross-linkers BS3 (bis[sulfosuccinimidyl]suberate) and BS2G (bis[sulfosuccinimidyl]glutarate). Tryptic in-gel digestion of excised gel bands from covalently cross-linked complexes resulted in complicated peptide mixtures, which were analyzed by nano-HPLC/nano-ESI-FTICR mass spectrometry. In cases where more than one reactive functional group, e.g., amine groups of lysine residues, is present in a sequence stretch, MS/MS analysis is a prerequisite for unambiguously identifying the modified residues. MS/MS experiments revealed two lysine residues in the central alpha-helix of calmodulin as well as three lysine residues both in the C-terminal and N-terminal lobes of calmodulin to be cross-linked with one single lysine residue of the adenylyl cyclase 8 peptide. Further cross-linking studies will have to be conducted to propose a structural model for the calmodulin/peptide complex, which is formed in the absence of calcium. 
The combination of using isotope-labeled cross-linkers, determining the accurate mass of intact cross-linked products, and verifying the amino acid sequences of cross-linked species by MS/MS presents a convenient approach that offers the perspective to obtain structural data of protein assemblies within a few days.

  8. Observed ground-motion variabilities and implication for source properties

    NASA Astrophysics Data System (ADS)

    Cotton, F.; Bora, S. S.; Bindi, D.; Specht, S.; Drouet, S.; Derras, B.; Pina-Valdes, J.

    2016-12-01

    One of the key challenges of seismology is to calibrate and analyse the physical factors that control earthquake and ground-motion variabilities. Within the framework of empirical ground-motion prediction equation (GMPE) developments, ground-motion residuals (differences between recorded ground motions and the values predicted by a GMPE) are computed. The exponential growth of seismological near-field records and modern regression algorithms allow these residuals to be decomposed into between-event and within-event components. The between-event term quantifies all the residual effects of the source (e.g. stress-drops) that are not accounted for by the magnitude term, which is the only source parameter of the model. Between-event residuals provide a new and rather robust way to analyse the physical factors that control earthquake source properties and associated variabilities. We will first show the correlation between classical stress-drops and between-event residuals. We will also explain why between-event residuals may be a more robust way (compared to classical stress-drop analysis) to analyse earthquake source properties. We will then calibrate between-event variabilities using recent high-quality global accelerometric datasets (NGA-West 2, RESORCE) and datasets from recent earthquake sequences (Aquila, Iquique, Kumamoto). The obtained between-event variabilities will be used to evaluate the variability of earthquake stress-drops and also the variability of source properties that cannot be explained by classical Brune stress-drop variations. We will finally use the between-event residual analysis to discuss regional variations of source properties, differences between aftershocks and mainshocks and potential magnitude dependencies of source characteristics.
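
    The residual decomposition described above can be sketched in a simplified form. This example assumes residuals are taken in natural-log units and estimates each between-event term as the mean residual of that event; published GMPE studies typically use mixed-effects regression instead, so this is an illustration of the idea and not the authors' method. All names and data are hypothetical.

```python
import math
from collections import defaultdict


def decompose_residuals(records):
    """Split log residuals into between-event and within-event parts.

    records -- iterable of (event_id, observed, predicted) with positive amplitudes
    Returns (between, within): per-event mean residual, and per-event list of
    record residuals after removing that mean.
    """
    by_event = defaultdict(list)
    for event_id, observed, predicted in records:
        by_event[event_id].append(math.log(observed) - math.log(predicted))
    between = {e: sum(r) / len(r) for e, r in by_event.items()}
    within = {e: [x - between[e] for x in r] for e, r in by_event.items()}
    return between, within


# Hypothetical recordings: event "A" is stronger than predicted on average.
toy_records = [
    ("A", math.exp(0.5), 1.0),
    ("A", math.exp(0.1), 1.0),
    ("B", math.exp(-0.2), 1.0),
]
between, within = decompose_residuals(toy_records)
```

    A positive between-event term, as for event "A" here, indicates a source that radiated more strongly than the magnitude term alone predicts.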

  9. Predicting residue-wise contact orders in proteins by support vector regression.

    PubMed

    Song, Jiangning; Burrage, Kevin

    2006-10-03

    The residue-wise contact order (RWCO) describes the sequence separations between the residue of interest and its contacting residues in a protein sequence. It is a new kind of one-dimensional protein structure that represents the extent of long-range contacts and is considered a generalization of contact order. Together with secondary structure, accessible surface area, the B factor, and contact number, RWCO provides comprehensive and indispensable information for reconstructing the protein three-dimensional structure from a set of one-dimensional structural properties. Accurately predicting RWCO values could have many important applications in protein three-dimensional structure prediction and protein folding rate prediction, and give deep insights into protein sequence-structure relationships. We developed a novel approach to predict residue-wise contact order values in proteins based on support vector regression (SVR), starting from primary amino acid sequences. We explored seven different sequence encoding schemes to examine their effects on the prediction performance, including local sequence in the form of PSI-BLAST profiles, local sequence plus amino acid composition, local sequence plus molecular weight, local sequence plus secondary structure predicted by PSIPRED, local sequence plus molecular weight and amino acid composition, local sequence plus molecular weight and predicted secondary structure, and local sequence plus molecular weight, amino acid composition and predicted secondary structure. When using local sequences with multiple sequence alignments in the form of PSI-BLAST profiles, we could predict the RWCO distribution with a Pearson correlation coefficient (CC) between the predicted and observed RWCO values of 0.55, and root mean square error (RMSE) of 0.82, based on a well-defined dataset with 680 protein sequences. 
Moreover, by incorporating global features such as molecular weight and amino acid composition we could further improve the prediction performance with the CC to 0.57 and an RMSE of 0.79. In addition, combining the predicted secondary structure by PSIPRED was found to significantly improve the prediction performance and could yield the best prediction accuracy with a CC of 0.60 and RMSE of 0.78, which provided at least comparable performance compared with the other existing methods. The SVR method shows a prediction performance competitive with or at least comparable to the previously developed linear regression-based methods for predicting RWCO values. In contrast to support vector classification (SVC), SVR is very good at estimating the raw value profiles of the samples. The successful application of the SVR approach in this study reinforces the fact that support vector regression is a powerful tool in extracting the protein sequence-structure relationship and in estimating the protein structural profiles from amino acid sequences.
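
    As a rough illustration of the quantity being predicted, the sketch below computes a per-residue contact order from a list of contacting residue pairs. It assumes RWCO is the sum of sequence separations to a residue's contact partners, optionally divided by chain length; the exact contact definition, cutoff and normalization used in the paper are not given in the abstract, so these choices are assumptions.

```python
def residue_wise_contact_order(contacts, length, normalize=True):
    """Sum of sequence separations to contacting residues, per residue.

    contacts  -- iterable of (i, j) index pairs (0-based) that are in contact
    length    -- number of residues in the chain
    normalize -- divide by chain length (an assumed convention)
    """
    rwco = [0.0] * length
    for i, j in contacts:
        separation = abs(i - j)
        # A contact contributes to both partners.
        rwco[i] += separation
        rwco[j] += separation
    if normalize:
        rwco = [value / length for value in rwco]
    return rwco


# Hypothetical 5-residue chain with three contacts.
raw_profile = residue_wise_contact_order([(0, 3), (0, 4), (1, 4)], 5, normalize=False)
```

    Residues with many long-range partners (here the two chain ends) receive the largest values, which is the sense in which RWCO reflects the extent of long-range contacts.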

  10. Applications of Mass Spectrometry to Structural Analysis of Marine Oligosaccharides

    PubMed Central

    Lang, Yinzhi; Zhao, Xia; Liu, Lili; Yu, Guangli

    2014-01-01

    Marine oligosaccharides have attracted increasing attention recently in developing potential drugs and biomaterials for their particular physical and chemical properties. However, the composition and sequence analysis of marine oligosaccharides are very challenging for their structural complexity and heterogeneity. Mass spectrometry (MS) has become an important technique for carbohydrate analysis by providing more detailed structural information, including molecular mass, sugar constituent, sequence, inter-residue linkage position and substitution pattern. This paper provides an overview of the structural analysis based on MS approaches in marine oligosaccharides, which are derived from some biologically important marine polysaccharides, including agaran, carrageenan, alginate, sulfated fucan, chitosan, glycosaminoglycan (GAG) and GAG-like polysaccharides. Applications of electrospray ionization mass spectrometry (ESI-MS) are mainly presented and the general applications of matrix-assisted laser desorption/ionization mass spectrometry (MALDI-MS) are also outlined. Some technical challenges in the structural analysis of marine oligosaccharides by MS have also been pointed out. PMID:24983643

  11. Structure of genes for dermaseptins B, antimicrobial peptides from frog skin. Exon 1-encoded prepropeptide is conserved in genes for peptides of highly different structures and activities.

    PubMed

    Vouille, V; Amiche, M; Nicolas, P

    1997-09-01

    We cloned the genes of two members of the dermaseptin family, broad-spectrum antimicrobial peptides isolated from the skin of the arboreal frog Phyllomedusa bicolor. The dermaseptin gene Drg2 has a 2-exon coding structure interrupted by a small 137-bp intron, wherein exon 1 encoded a 22-residue hydrophobic signal peptide and the first three amino acids of the acidic propiece; exon 2 contained the 18 additional acidic residues of the propiece plus a typical prohormone processing signal Lys-Arg and a 32-residue dermaseptin progenitor sequence. The dermaseptin genes Drg2 and Drg1g2 have conserved sequences at both untranslated ends and in the first and second coding exons. In contrast, Drg1g2 comprises a third coding exon for a short version of the acidic propiece and a second dermaseptin progenitor sequence. Structural conservation between the two genes suggests that Drg1g2 arose recently from an ancestral Drg2-like gene through amplification of part of the second coding exon and 3'-untranslated region. Analysis of the cDNAs coding precursors for several frog skin peptides of highly different structures and activities demonstrates that the signal peptides and part of the acidic propieces are encoded by conserved nucleotides encompassed by the first coding exon of the dermaseptin genes. The organization of the genes that belong to this family, with the signal peptide and the progenitor sequence on separate exons, permits strikingly different peptides to be directed into the secretory pathway. The recruitment of such a homologous 'secretory' exon by otherwise non-homologous genes may have been an early event in the evolution of amphibians.

  12. Retreatability of two endodontic sealers, EndoSequence BC Sealer and AH Plus: a micro-computed tomographic comparison

    PubMed Central

    Oltra, Enrique; Cox, Timothy C.; LaCourse, Matthew R.; Johnson, James D.

    2017-01-01

    Objectives Recently, bioceramic sealers like EndoSequence BC Sealer (BC Sealer) have been introduced and are being used in endodontic practice. However, this sealer has limited research related to its retreatability. Hence, the aim of this study was to evaluate the retreatability of two sealers, BC Sealer as compared with AH Plus using micro-computed tomographic (micro-CT) analysis. Materials and Methods Fifty-six extracted human maxillary incisors were instrumented and randomly divided into 4 groups of 14 teeth: 1A, gutta-percha, AH Plus retreated with chloroform; 1B, gutta-percha, AH Plus retreated without chloroform; 2A, gutta-percha, EndoSequence BC Sealer retreated with chloroform; 2B, gutta-percha, EndoSequence BC Sealer retreated without chloroform. Micro-CT scans were taken before and after obturation and retreatment and analyzed for the volume of residual material. The specimens were longitudinally sectioned and digitized images were taken with the dental operating microscope. Data was analyzed using an ANOVA and a post-hoc Tukey test. Fisher exact tests were performed to analyze the ability to regain patency. Results There was significantly less residual root canal filling material in the AH Plus groups retreated with chloroform as compared to the others. The BC Sealer samples retreated with chloroform had better results than those retreated without chloroform. Furthermore, patency could be re-established in only 14% of teeth in the BC Sealer without chloroform group. Conclusion The results of this study demonstrate that the BC Sealer group had significantly more residual filling material than the AH Plus group regardless of whether or not both sealers were retreated with chloroform. PMID:28194360

  13. Interferon-gamma of the giant panda (Ailuropoda melanoleuca): complementary DNA cloning, expression, and phylogenetic analysis.

    PubMed

    Tao, Yaqiong; Zeng, Bo; Xu, Liu; Yue, Bisong; Yang, Dong; Zou, Fangdong

    2010-01-01

    Interferon-gamma (IFN-gamma) is the only member of the type II IFNs and is vital in the regulation of immune and inflammatory responses. Herein we report the cloning, expression, and sequence analysis of IFN-gamma from the giant panda (Ailuropoda melanoleuca). The open reading frame of this gene is 501 base pairs in length and encodes a polypeptide consisting of 166 amino acids. All conserved N-linked glycosylation sites and cysteine residues among carnivores were found in the predicted amino acid sequence of the giant panda. Recombinant giant panda IFN-gamma with a V5 epitope and polyhistidine tag was expressed in HEK293 host cells and confirmed by Western blotting. Phylogenetic analysis of mammalian IFN-gamma-coding sequences indicated that the giant panda IFN-gamma was closest to that of carnivores, then to ungulates and dolphin, and shared a distant relationship with mouse and human. These results represent a first step into the study of IFN-gamma in giant panda.

  14. Sequence and structural characterization of Trx-Grx type of monothiol glutaredoxins from Ashbya gossypii.

    PubMed

    Yadav, Saurabh; Kumari, Pragati; Kushwaha, Hemant Ritturaj

    2013-01-01

    Glutaredoxins are small, ubiquitous, glutathione-dependent enzymatic antioxidants classified under the thioredoxin-fold superfamily. Glutaredoxins are classified into two types: dithiol and monothiol. Monothiol glutaredoxins, which carry the signature "CGFS" as a redox-active motif, are known for their role in oxidative stress inside the cell. In the present analysis, the 138-amino-acid monothiol glutaredoxin AgGRX1 from Ashbya gossypii was identified and analyzed. The multiple sequence alignment of the AgGRX1 protein sequence revealed the characteristic motif of typical monothiol glutaredoxins as observed in various other organisms. The proposed structure of the AgGRX1 protein was used to analyze signature folds related to the thioredoxin superfamily. Further, the study highlighted the structural features pertaining to the complex mechanism of glutathione docking and the interacting residues.

  15. Molecular Characterization of the Skate Peripherin/rds Gene: Relationship to Its Orthologues and Paralogues

    PubMed Central

    Li, Chibo; Ding, Xi-Qin; O’Brien, John; Al-Ubaidi, Muayyad R.

    2010-01-01

    PURPOSE A great deal of information about functionally significant domains of a protein may be obtained by comparison of primary sequences of gene homologues over a broad phylogenetic base. This study was designed to identify evolutionarily conserved domains of the photoreceptor disc membrane protein peripherin/rds by analysis of the homologue in a primitive vertebrate, the skate. METHODS A skate retinal cDNA library was screened using a mouse peripherin/rds clone. The 5′ and 3′ untranslated regions of the skate peripherin/rds (srds) cDNA were isolated by the rapid amplification of cDNA ends (RACE) approach. The gene structure was characterized by PCR amplification and sequencing of genomic fragments. Northern and Western blot analyses were used to identify srds transcript and protein, respectively. RESULTS A new homologue of peripherin/rds was identified from the skate retinal cDNA library. SRDS is a glycoprotein with a predicted molecular mass of 40.2 kDa. The srds gene consists of two exons and one small intron and transcribes into a single 6-kb message. Phylogenetic analysis places SRDS at the base of peripherin/rds family and near the division of that group and the branch leading to rds-like and rom-1 genes. SRDS protein is 54.5% identical with peripherin/rds across species. Identity is significantly higher (73%) in the intradiscal domains. Sequence comparison revealed the conservation of all residues that have been shown, on mutation, to associate with retinitis pigmentosa and showed conservation of most residues associated with macular dystrophies. Comparison with ROM-1 and other rds-like proteins revealed the presence of a highly conserved domain in the large intradiscal loop. CONCLUSIONS Srds represents the skate orthologue of mammalian peripherin/rds genes. Conservation of most of the residues associated with human retinal diseases indicates that these residues serve important functional roles. 
    The high degree of conservation of a short stretch within the large intradiscal loop also suggests an important function for this domain. PMID:12766040

  16. Three acidic residues are at the active site of a beta-propeller architecture in glycoside hydrolase families 32, 43, 62, and 68.

    PubMed

    Pons, Tirso; Naumoff, Daniil G; Martínez-Fleites, Carlos; Hernández, Lázaro

    2004-02-15

    Multiple-sequence alignment of glycoside hydrolase (GH) families 32, 43, 62, and 68 revealed three conserved blocks, each containing an acidic residue at an equivalent position in all the enzymes. A detailed analysis of the site-directed mutations so far performed on invertases (GH32), arabinanases (GH43), and bacterial fructosyltransferases (GH68) indicated a direct implication of the conserved residues Asp/Glu (block I), Asp (block II), and Glu (block III) in substrate binding and hydrolysis. These residues are close in space in the 5-bladed beta-propeller fold determined for Cellvibrio japonicus alpha-L-arabinanase Arb43A [Nurizzo et al., Nat Struct Biol 2002;9:665-668] and Bacillus subtilis endo-1,5-alpha-L-arabinanase. A sequence-structure compatibility search using 3D-PSSM, mGenTHREADER, INBGU, and SAM-T02 programs predicted indistinctly the 5-bladed beta-propeller fold of Arb43A and the 6-bladed beta-propeller fold of sialidase/neuraminidase (GH33, GH34, and GH83) as the most reliable topologies for GH families 32, 62, and 68. We conclude that the identified acidic residues are located at the active site of a beta-propeller architecture in GH32, GH43, GH62, and GH68, operating with a canonical reaction mechanism of either inversion (GH43 and likely GH62) or retention (GH32 and GH68) of the anomeric configuration. Also, we propose that the beta-propeller architecture accommodates distinct binding sites for the acceptor saccharide in glycosyl transfer reaction. Copyright 2003 Wiley-Liss, Inc.
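
    As a rough illustration of how conserved acidic positions can be located in a multiple-sequence alignment, the Python sketch below reports the columns in which every sequence carries Asp (D) or Glu (E). The toy alignment is made up for the example and is not taken from the GH32, GH43, GH62, or GH68 families; the function name is likewise hypothetical.

```python
ACIDIC = {"D", "E"}


def acidic_columns(alignment):
    """Return 0-based column indices where every sequence has Asp (D) or Glu (E)."""
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    return [
        col
        for col in range(length)
        if all(seq[col] in ACIDIC for seq in alignment)
    ]


# Invented three-sequence block; only column 2 is acidic in every sequence.
toy_block = ["WNDPNG", "WMDANG", "FNEPSG"]
print(acidic_columns(toy_block))
```

    A column that passes this filter is only a candidate; the abstract ties the three residues to catalysis through site-directed mutagenesis and structural data, not through conservation alone.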

  17. Genetic characterization of the non-structural protein-3 gene of bluetongue virus serotype-2 isolate from India.

    PubMed

    Pudupakam, Raghavendra Sumanth; Raghunath, Shobana; Pudupakam, Meghanath; Daggupati, Sreenivasulu

    2017-03-01

    Sequence analysis and phylogenetic studies based on non-structural protein-3 (NS3) gene are important in understanding the evolution and epidemiology of bluetongue virus (BTV). This study was aimed at characterizing the NS3 gene sequence of Indian BTV serotype-2 (BTV2) to elucidate its genetic relationship to global BTV isolates. The NS3 gene of BTV2 was amplified from infected BHK-21 cell cultures, cloned and subjected to sequence analysis. The generated NS3 gene sequence was compared with the corresponding sequences of different BTV serotypes across the world, and a phylogenetic relationship was established. The NS3 gene of BTV2 showed moderate levels of variability in comparison to different BTV serotypes, with nucleotide sequence identities ranging from 81% to 98%. The region showed high sequence homology of 93-99% at amino acid level with various BTV serotypes. The PPXY/PTAP late domain motifs, glycosylation sites, hydrophobic domains, and the amino acid residues critical for virus-host interactions were conserved in NS3 protein. Phylogenetic analysis revealed that BTV isolates segregate into four topotypes and that the Indian BTV2 in subclade IA is closely related to Asian and Australian origin strains. Analysis of the NS3 gene indicated that Indian BTV2 isolate is closely related to strains from Asia and Australia, suggesting a common origin of infection. Although the pattern of evolution of BTV2 isolate is different from other global isolates, the deduced amino acid sequence of NS3 protein demonstrated high molecular stability.
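
    Identity percentages such as those quoted above are usually computed column by column over a pairwise alignment. The Python sketch below shows one common convention, in which columns with a gap in either sequence are skipped. The abstract does not say which convention or software was used, so both the function and the toy sequences are assumptions.

```python
def percent_identity(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Percent identity over aligned columns that have no gap in either sequence."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue  # gapped columns are excluded from the denominator
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy nucleotide alignment: 7 of 8 ungapped columns match.
print(percent_identity("ATGCATGC", "ATGCATGA"))
```

    Other denominators (alignment length, or the length of the shorter sequence) give different values for the same alignment, which is worth keeping in mind when comparing identities across studies.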

  18. Genetic characterization of the non-structural protein-3 gene of bluetongue virus serotype-2 isolate from India

    PubMed Central

    Pudupakam, Raghavendra Sumanth; Raghunath, Shobana; Pudupakam, Meghanath; Daggupati, Sreenivasulu

    2017-01-01

    Aim: Sequence analysis and phylogenetic studies based on non-structural protein-3 (NS3) gene are important in understanding the evolution and epidemiology of bluetongue virus (BTV). This study was aimed at characterizing the NS3 gene sequence of Indian BTV serotype-2 (BTV2) to elucidate its genetic relationship to global BTV isolates. Materials and Methods: The NS3 gene of BTV2 was amplified from infected BHK-21 cell cultures, cloned and subjected to sequence analysis. The generated NS3 gene sequence was compared with the corresponding sequences of different BTV serotypes across the world, and a phylogenetic relationship was established. Results: The NS3 gene of BTV2 showed moderate levels of variability in comparison to different BTV serotypes, with nucleotide sequence identities ranging from 81% to 98%. The region showed high sequence homology of 93-99% at amino acid level with various BTV serotypes. The PPXY/PTAP late domain motifs, glycosylation sites, hydrophobic domains, and the amino acid residues critical for virus-host interactions were conserved in NS3 protein. Phylogenetic analysis revealed that BTV isolates segregate into four topotypes and that the Indian BTV2 in subclade IA is closely related to Asian and Australian origin strains. Conclusion: Analysis of the NS3 gene indicated that Indian BTV2 isolate is closely related to strains from Asia and Australia, suggesting a common origin of infection. Although the pattern of evolution of BTV2 isolate is different from other global isolates, the deduced amino acid sequence of NS3 protein demonstrated high molecular stability. PMID:28435199

  19. Comparative sequence analysis suggests a conserved gating mechanism for TRP channels

    PubMed Central

    Palovcak, Eugene; Delemotte, Lucie; Klein, Michael L.

    2015-01-01

    The transient receptor potential (TRP) channel superfamily plays a central role in transducing diverse sensory stimuli in eukaryotes. Although dissimilar in sequence and domain organization, all known TRP channels act as polymodal cellular sensors and form tetrameric assemblies similar to those of their distant relatives, the voltage-gated potassium (Kv) channels. Here, we investigated the related questions of whether the allosteric mechanism underlying polymodal gating is common to all TRP channels, and how this mechanism differs from that underpinning Kv channel voltage sensitivity. To provide insight into these questions, we performed comparative sequence analysis on large, comprehensive ensembles of TRP and Kv channel sequences, contextualizing the patterns of conservation and correlation observed in the TRP channel sequences in light of the well-studied Kv channels. We report sequence features that are specific to TRP channels and, based on insight from recent TRPV1 structures, we suggest a model of TRP channel gating that differs substantially from the one mediating voltage sensitivity in Kv channels. The common mechanism underlying polymodal gating involves the displacement of a defect in the H-bond network of S6 that changes the orientation of the pore-lining residues at the hydrophobic gate. PMID:26078053
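
    One simple way to quantify per-position conservation in an alignment is the Shannon entropy of each column. The Python sketch below is illustrative only: the abstract does not specify which conservation or correlation measures the authors used, and the function names and toy alignment are assumptions.

```python
import math
from collections import Counter


def column_entropy(column) -> float:
    """Shannon entropy (bits) of the residues in one alignment column."""
    counts = Counter(column)
    total = len(column)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def alignment_entropies(alignment):
    """Entropy per column; 0 bits means the column is perfectly conserved."""
    return [column_entropy(col) for col in zip(*alignment)]


# Toy alignment: column 0 is invariant, column 1 is split evenly between A and C.
print(alignment_entropies(["GA", "GC", "GA", "GC"]))
```

    Low-entropy columns are candidates for functionally constrained positions; correlation between columns would need a pairwise measure on top of this.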

  20. A sulfated alpha-L-fucan from sea cucumber.

    PubMed

    Ribeiro, A C; Vieira, R P; Mourão, P A; Mulloy, B

    1994-03-04

    A purified sulfated alpha-L-fucan from the sea cucumber body wall was studied, before and after almost complete desulfation, using methylation analysis and NMR spectroscopy. NMR analysis indicates that 2,4-di-O-sulfo-L-fucopyranose and unsubstituted fucopyranose are present in equal proportions, and that 2-O-sulfo-L-fucopyranose is present in twice that proportion. There is some NMR evidence that a regular repeating sequence of four residues comprises most or all of the polysaccharide chain.
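
    The stated 1:1:2 proportions are consistent with a four-residue repeat. The short Python sketch below works through that arithmetic, assuming the smallest whole-number ratio; it says nothing about the order of the residues within the repeat, and the function name is illustrative.

```python
from fractions import Fraction

# (relative proportion, sulfate groups per residue); the sulfate counts
# follow from the residue names given in the abstract.
RESIDUES = {
    "2,4-di-O-sulfo-L-fucopyranose": (1, 2),
    "unsubstituted fucopyranose": (1, 0),
    "2-O-sulfo-L-fucopyranose": (2, 1),
}


def repeat_unit_summary(residues):
    """Return (residues per repeat, mean sulfate groups per residue)."""
    unit_size = sum(count for count, _ in residues.values())
    sulfates = sum(count * n_sulfate for count, n_sulfate in residues.values())
    return unit_size, Fraction(sulfates, unit_size)


print(repeat_unit_summary(RESIDUES))
```

    With these proportions the repeat contains four residues and four sulfate groups, that is, one sulfate per fucose on average.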

  1. Computational analysis of the receptor binding specificity of novel influenza A/H7N9 viruses.

    PubMed

    Zhou, Xinrui; Zheng, Jie; Ivan, Fransiskus Xaverius; Yin, Rui; Ranganathan, Shoba; Chow, Vincent T K; Kwoh, Chee-Keong

    2018-05-09

    Influenza viruses are undergoing continuous and rapid evolution. The fatal influenza A/H7N9 has drawn attention since the first wave of infections in March 2013, and raised more grave concerns with its increased potential to spread among humans. Experimental studies have revealed several host and virulence markers, indicating differential host binding preferences which can help estimate the potential of causing a pandemic. Here we systematically investigate the sequence pattern and structural characteristics of novel influenza A/H7N9 using computational approaches. The sequence analysis highlighted mutations in protein functional domains of influenza viruses. Molecular docking and molecular dynamics simulation revealed that the hemagglutinin (HA) of A/Taiwan/1/2017(H7N9) strain enhanced the binding with both avian and human receptor analogs, compared with the previous A/Shanghai/02/2013(H7N9) strain. The Molecular Mechanics - Poisson Boltzmann Surface Area (MM-PBSA) calculation revealed the change of residue-ligand interaction energy and detected the residues with conspicuous binding preference. The results are novel and specific to the emerging influenza A/Taiwan/1/2017(H7N9) strain compared with A/Shanghai/02/2013(H7N9). Its enhanced ability to bind human receptor analogs, which are abundant in the human upper respiratory tract, may be responsible for the recent outbreak. Residues showing binding preference were detected, which could facilitate monitoring the circulating influenza viruses.

  2. Mass spectrometric determination of early and advanced glycation in biology.

    PubMed

    Rabbani, Naila; Ashour, Amal; Thornalley, Paul J

    2016-08-01

    Protein glycation in biological systems occurs predominantly on lysine, arginine and N-terminal residues of proteins. Major quantitative glycation adducts are found at mean extents of modification of 1-5 mol percent of proteins. These are glucose-derived fructosamine on lysine and N-terminal residues of proteins, methylglyoxal-derived hydroimidazolone on arginine residues and N(ε)-carboxymethyl-lysine residues mainly formed by the oxidative degradation of fructosamine. Total glycation adducts of different types are quantified by stable isotopic dilution analysis liquid chromatography-tandem mass spectrometry (LC-MS/MS) in multiple reaction monitoring mode. Metabolism of glycated proteins is followed by LC-MS/MS of glycation free adducts as minor components of the amino acid metabolome. Glycated proteins and sites of modification within them - amino acid residues modified by the glycating agent moiety - are identified and quantified by label-free and stable isotope labelling with amino acids in cell culture (SILAC) high resolution mass spectrometry. Sites of glycation by glucose and methylglyoxal in selected proteins are listed. Key issues in applying proteomics techniques to analysis of glycated proteins are: (i) avoiding compromise of analysis by formation, loss and relocation of glycation adducts in pre-analytic processing; (ii) specificity of immunoaffinity enrichment procedures, (iii) maximizing protein sequence coverage in mass spectrometric analysis for detection of glycation sites, and (iv) development of bioinformatics tools for prediction of protein glycation sites. Protein glycation studies have important applications in biology, ageing and translational medicine - particularly on studies of obesity, diabetes, cardiovascular disease, renal failure, neurological disorders and cancer. Mass spectrometric analysis of glycated proteins has yet to find widespread use clinically. 
    Future use in health screening, disease diagnosis and therapeutic monitoring, and drug and functional food development is expected. A protocol for high resolution mass spectrometry proteomics of glycated proteins is given.

  3. Porcine MAP3K5 analysis: molecular cloning, characterization, tissue expression pattern, and copy number variations associated with residual feed intake.

    PubMed

    Pu, L; Zhang, L C; Zhang, J S; Song, X; Wang, L G; Liang, J; Zhang, Y B; Liu, X; Yan, H; Zhang, T; Yue, J W; Li, N; Wu, Q Q; Wang, L X

    2016-08-12

    Mitogen-activated protein kinase kinase kinase 5 (MAP3K5) is essential for apoptosis, proliferation, differentiation, and immune responses, and is a candidate marker for residual feed intake (RFI) in pigs. We cloned the full-length cDNA sequence of porcine MAP3K5 by rapid-amplification of cDNA ends. The 5451-bp gene contains a 5'-untranslated region (UTR) (718 bp), a coding region (3738 bp), and a 3'-UTR (995 bp), and encodes a peptide of 1245 amino acids, which shares 97, 99, 97, 93, 91, and 84% sequence identity with cattle, sheep, human, mouse, chicken, and zebrafish MAP3K5, respectively. The deduced MAP3K5 protein sequence contains two conserved domains: a DUF4071 domain and a protein kinase domain. Phylogenetic analysis showed that porcine MAP3K5 forms a separate branch to vicugna and camel MAP3K5. Tissue expression analysis using real-time quantitative polymerase chain reaction (qRT-PCR) revealed that MAP3K5 was expressed in the heart, liver, spleen, lung, kidney, muscle, fat, pancreas, ileum, and stomach tissues. Copy number variation was detected for porcine MAP3K5 and validated by qRT-PCR. Furthermore, a significant increase in average copy number was detected in the low RFI group when compared to the high RFI group in a Duroc pig population. These results provide useful information regarding the influence of MAP3K5 on RFI in pigs.
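
    The region lengths reported above can be cross-checked against the total cDNA length and the peptide length. The Python sketch below lays the three regions end to end and derives 1-based coordinates for the coding region. It assumes the 3738-bp coding region includes the stop codon, which the abstract does not state; the function and key names are illustrative.

```python
def cdna_layout(utr5_bp: int, cds_bp: int, utr3_bp: int) -> dict:
    """Summarize a cDNA built as 5'-UTR + coding region + 3'-UTR."""
    if cds_bp % 3 != 0:
        raise ValueError("coding region length must be a multiple of 3")
    return {
        "total_bp": utr5_bp + cds_bp + utr3_bp,
        "cds_start": utr5_bp + 1,        # first coding base, 1-based
        "cds_end": utr5_bp + cds_bp,     # last coding base, 1-based
        "amino_acids": cds_bp // 3 - 1,  # assumes the stop codon is counted in cds_bp
    }


print(cdna_layout(718, 3738, 995))
```

    With the published lengths this gives a 5451-bp total and a 1245-residue peptide, matching the figures in the abstract.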

  4. A diverse family of serine proteinase genes expressed in cotton boll weevil (Anthonomus grandis): implications for the design of pest-resistant transgenic cotton plants.

    PubMed

    Oliveira-Neto, Osmundo B; Batista, João A N; Rigden, Daniel J; Fragoso, Rodrigo R; Silva, Rodrigo O; Gomes, Eliane A; Franco, Octávio L; Dias, Simoni C; Cordeiro, Célia M T; Monnerat, Rose G; Grossi-De-Sá, Maria F

    2004-09-01

    Fourteen different cDNA fragments encoding serine proteinases were isolated by reverse transcription-PCR from cotton boll weevil (Anthonomus grandis) larvae. A large diversity between the sequences was observed, with a mean pairwise identity of 22% in the amino acid sequence. The cDNAs encompassed 11 trypsin-like sequences classifiable into three families and three chymotrypsin-like sequences belonging to a single family. Using a combination of 5' and 3' RACE, the full-length sequence was obtained for five of the cDNAs, named Agser2, Agser5, Agser6, Agser10 and Agser21. The encoded proteins included amino acid sequence motifs of serine proteinase active sites, conserved cysteine residues, and both zymogen activation and signal peptides. Southern blotting analysis suggested that one or two copies of these serine proteinase genes exist in the A. grandis genome. Northern blotting analysis of Agser2 and Agser5 showed that for both genes, expression is induced upon feeding and is concentrated in the gut of larvae and adult insects. Reverse northern analysis of the 14 cDNA fragments showed that only two trypsin-like and two chymotrypsin-like were expressed at detectable levels. Under the effect of the serine proteinase inhibitors soybean Kunitz trypsin inhibitor and black-eyed pea trypsin/chymotrypsin inhibitor, expression of one of the trypsin-like sequences was upregulated while expression of the two chymotrypsin-like sequences was downregulated. Copyright 2004 Elsevier Ltd.

  5. Accurate De Novo Prediction of Protein Contact Map by Ultra-Deep Learning Model.

    PubMed

    Wang, Sheng; Sun, Siqi; Li, Zhen; Zhang, Renyu; Xu, Jinbo

    2017-01-01

    Protein contacts contain key information for the understanding of protein structure and function and thus, contact prediction from sequence is an important problem. Recently exciting progress has been made on this problem, but the predicted contacts for proteins without many sequence homologs is still of low quality and not very useful for de novo structure prediction. This paper presents a new deep learning method that predicts contacts by integrating both evolutionary coupling (EC) and sequence conservation information through an ultra-deep neural network formed by two deep residual neural networks. The first residual network conducts a series of 1-dimensional convolutional transformation of sequential features; the second residual network conducts a series of 2-dimensional convolutional transformation of pairwise information including output of the first residual network, EC information and pairwise potential. By using very deep residual networks, we can accurately model contact occurrence patterns and complex sequence-structure relationship and thus, obtain higher-quality contact prediction regardless of how many sequence homologs are available for proteins in question. Our method greatly outperforms existing methods and leads to much more accurate contact-assisted folding. Tested on 105 CASP11 targets, 76 past CAMEO hard targets, and 398 membrane proteins, the average top L long-range prediction accuracy obtained by our method, one representative EC method CCMpred and the CASP11 winner MetaPSICOV is 0.47, 0.21 and 0.30, respectively; the average top L/10 long-range accuracy of our method, CCMpred and MetaPSICOV is 0.77, 0.47 and 0.59, respectively. Ab initio folding using our predicted contacts as restraints but without any force fields can yield correct folds (i.e., TMscore>0.6) for 203 of the 579 test proteins, while that using MetaPSICOV- and CCMpred-predicted contacts can do so for only 79 and 62 of them, respectively. 
    Our contact-assisted models also have much better quality than template-based models especially for membrane proteins. The 3D models built from our contact prediction have TMscore>0.5 for 208 of the 398 membrane proteins, while those from homology modeling have TMscore>0.5 for only 10 of them. Further, even if trained mostly by soluble proteins, our deep learning method works very well on membrane proteins. In the recent blind CAMEO benchmark, our fully-automated web server implementing this method successfully folded 6 targets with a new fold and only 0.3L-2.3L effective sequence homologs, including one β protein of 182 residues, one α+β protein of 125 residues, one α protein of 140 residues, one α protein of 217 residues, one α/β of 260 residues and one α protein of 462 residues. Our method also achieved the highest F1 score on free-modeling targets in the latest CASP (Critical Assessment of Structure Prediction), although it was not fully implemented back then. http://raptorx.uchicago.edu/ContactMap/.
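
    The "top L" and "top L/10" long-range accuracies quoted in this abstract are typically computed by ranking predicted residue pairs by score and checking how many of the best-ranked pairs are true contacts. The Python sketch below is a simplified version of that metric. The sequence-separation cutoff of 24 residues for "long-range" follows the usual CASP convention and is an assumption here, since the abstract does not give it; all names and the toy data are hypothetical.

```python
def top_accuracy(scores, true_contacts, seq_len, k=1, min_sep=24):
    """Fraction of the top seq_len // k long-range predictions that are true contacts.

    scores: dict mapping (i, j) residue-index pairs (i < j) to predicted probability.
    true_contacts: set of (i, j) pairs that are contacts in the native structure.
    """
    # Keep only pairs separated by at least min_sep positions in the sequence.
    long_range = [(p, pair) for pair, p in scores.items() if pair[1] - pair[0] >= min_sep]
    long_range.sort(reverse=True)  # highest predicted probability first
    n_top = max(1, seq_len // k)
    top_pairs = [pair for _, pair in long_range[:n_top]]
    if not top_pairs:
        raise ValueError("no long-range predictions available")
    return sum(pair in true_contacts for pair in top_pairs) / len(top_pairs)


# Toy example for a 50-residue protein: top L/10 means the 5 best long-range pairs.
toy_scores = {
    (0, 30): 0.9, (1, 40): 0.8, (2, 45): 0.7, (5, 35): 0.6,
    (3, 49): 0.5, (4, 44): 0.1, (10, 12): 0.99,  # (10, 12) is short-range
}
toy_truth = {(0, 30), (2, 45), (3, 49), (10, 12)}
print(top_accuracy(toy_scores, toy_truth, seq_len=50, k=10))
```

    In this toy case three of the five top-ranked long-range pairs are true contacts. Conventions differ on how to handle proteins with fewer candidate pairs than L/k, so published numbers may use a slightly different denominator.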

  6. Accurate De Novo Prediction of Protein Contact Map by Ultra-Deep Learning Model

    PubMed Central

    Li, Zhen; Zhang, Renyu

    2017-01-01

    Motivation Protein contacts contain key information for the understanding of protein structure and function and thus, contact prediction from sequence is an important problem. Recently exciting progress has been made on this problem, but the predicted contacts for proteins without many sequence homologs is still of low quality and not very useful for de novo structure prediction. Method This paper presents a new deep learning method that predicts contacts by integrating both evolutionary coupling (EC) and sequence conservation information through an ultra-deep neural network formed by two deep residual neural networks. The first residual network conducts a series of 1-dimensional convolutional transformation of sequential features; the second residual network conducts a series of 2-dimensional convolutional transformation of pairwise information including output of the first residual network, EC information and pairwise potential. By using very deep residual networks, we can accurately model contact occurrence patterns and complex sequence-structure relationship and thus, obtain higher-quality contact prediction regardless of how many sequence homologs are available for proteins in question. Results Our method greatly outperforms existing methods and leads to much more accurate contact-assisted folding. Tested on 105 CASP11 targets, 76 past CAMEO hard targets, and 398 membrane proteins, the average top L long-range prediction accuracy obtained by our method, one representative EC method CCMpred and the CASP11 winner MetaPSICOV is 0.47, 0.21 and 0.30, respectively; the average top L/10 long-range accuracy of our method, CCMpred and MetaPSICOV is 0.77, 0.47 and 0.59, respectively. Ab initio folding using our predicted contacts as restraints but without any force fields can yield correct folds (i.e., TMscore>0.6) for 203 of the 579 test proteins, while that using MetaPSICOV- and CCMpred-predicted contacts can do so for only 79 and 62 of them, respectively. 
    Our contact-assisted models also have much better quality than template-based models especially for membrane proteins. The 3D models built from our contact prediction have TMscore>0.5 for 208 of the 398 membrane proteins, while those from homology modeling have TMscore>0.5 for only 10 of them. Further, even if trained mostly by soluble proteins, our deep learning method works very well on membrane proteins. In the recent blind CAMEO benchmark, our fully-automated web server implementing this method successfully folded 6 targets with a new fold and only 0.3L-2.3L effective sequence homologs, including one β protein of 182 residues, one α+β protein of 125 residues, one α protein of 140 residues, one α protein of 217 residues, one α/β of 260 residues and one α protein of 462 residues. Our method also achieved the highest F1 score on free-modeling targets in the latest CASP (Critical Assessment of Structure Prediction), although it was not fully implemented back then. Availability http://raptorx.uchicago.edu/ContactMap/ PMID:28056090

  7. Prediction of beta-turns from amino acid sequences using the residue-coupled model.

    PubMed

    Guruprasad, K; Shukla, S

    2003-04-01

    We evaluated the prediction of beta-turns from amino acid sequences using the residue-coupled model with an enlarged representative protein data set selected from the Protein Data Bank. Our results show that the probability values derived from a data set comprising 425 protein chains yielded an overall beta-turn prediction accuracy of 68.74%, compared with 94.7% reported earlier on a data set of 30 proteins using the same method. However, we noted that the overall beta-turn prediction accuracy using probability values derived from the 30-protein data set drops to 40.74% when tested on the data set comprising 425 protein chains. In contrast, using probability values derived from the 425-chain data set used in this analysis, the overall beta-turn prediction accuracy was consistent when tested on the 30-protein data set used earlier (64.62%), on a more recent representative data set comprising 619 protein chains (64.66%), or on a jackknife data set comprising 476 representative protein chains (63.38%). We therefore recommend the use of probability values derived from the 425 representative protein chains data set reported here, which gives more realistic and consistent predictions of beta-turns from amino acid sequences.
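
    The abstract does not define "overall prediction accuracy". A common reading is the percentage of residues whose turn or non-turn state is predicted correctly, and the Python sketch below computes that quantity under this assumption. The one-letter state strings ("t" for turn, "n" for non-turn) are an illustrative encoding, not the paper's.

```python
def overall_accuracy(predicted: str, observed: str) -> float:
    """Percentage of residues whose two-state label (turn / non-turn) is correct."""
    if len(predicted) != len(observed) or not observed:
        raise ValueError("label strings must be non-empty and of equal length")
    correct = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * correct / len(observed)


# Toy example: 6 of 8 residues are labelled correctly.
print(overall_accuracy("ttnnnnnn", "tnnnnntn"))
```

    Because non-turn residues usually outnumber turn residues, a per-residue accuracy of this kind is best read alongside class-specific measures when comparing data sets.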

  8. In silico analysis of β-mannanases and β-mannosidase from Aspergillus flavus and Trichoderma virens UKM1

    NASA Astrophysics Data System (ADS)

    Yee, Chai Sin; Murad, Abdul Munir Abdul; Bakar, Farah Diba Abu

    2013-11-01

    Genes encoding endo-β-1,4-mannanases from Trichoderma virens UKM1 (manTV) and Aspergillus flavus UKM1 (manAF) were analysed with bioinformatic tools. In addition, the A. flavus NRRL 3357 genome database was screened for a β-mannosidase gene, which was also analysed (mndA-AF). These three genes were analysed to understand their gene properties. manTV and manAF consist of 1,332 bp and 1,386 bp, encoding 443 and 461 amino acid residues, respectively. Both endo-β-1,4-mannanases belong to glycosyl hydrolase family 5 and contain a carbohydrate-binding module family 1 (CBM1). In contrast, mndA-AF is a 2,745-bp gene that encodes a protein sequence of 914 amino acid residues. This β-mannosidase belongs to glycosyl hydrolase family 2. The predicted molecular weights of manTV, manAF and mndA-AF are 47.74 kDa, 49.71 kDa and 103 kDa, respectively. All three predicted protein sequences possess a signal peptide sequence and are highly conserved among other fungal β-mannanases and β-mannosidases.

  9. Polysaccharides from heterocyst and spore envelopes of a blue-green alga. [Anabaena cylindrica]

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Cardemil, L.; Wolk, C.P.

    The polysaccharides from the envelopes of heterocysts and spores of Anabaena cylindrica consist of repeating units containing 1 mannosyl and 3 glucosyl residues, all linked by β(1→3) glucosidic bonds, with glucose, xylose, galactose, and mannose present in side branches. Degradation of the polysaccharides with specific glycosidases has permitted identification of the linkages to almost all of the branches. When the polysaccharides, from which all but two types of side branches had been cleaved, were digested with a β(1→3) endoglucanase, glucose, a tri-, and a pentasaccharide were produced. The oligosaccharide products were identified. The backbones of the polysaccharides were sequenced from the reducing terminus by a modified Smith degradation. Analysis with NaB³H₄ at each stage of the degradation showed that the backbones terminate in the sequence Man-Glc-Glc-Glc and are therefore presumed to have the structure (Man-Glc-Glc-Glc)ₙ, and that they contain an average of from 128 to 150 sugar residues. From the information obtained, the repeating sequences of the original polysaccharides from the two types of differentiated cells of A. cylindrica could be largely deduced and appeared to be identical.
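
    Given a four-residue Man-Glc-Glc-Glc repeat and a backbone of 128 to 150 sugar residues, the implied number of repeats follows from a simple division. The Python sketch below makes that explicit; the function name is illustrative, and a fractional result only reflects that the residue counts are averages.

```python
def repeat_range(min_residues: int, max_residues: int,
                 unit=("Man", "Glc", "Glc", "Glc")):
    """Return the (min, max) number of repeat units implied by a backbone length range."""
    unit_size = len(unit)
    return min_residues / unit_size, max_residues / unit_size


print(repeat_range(128, 150))
```

    For the range given in the abstract this corresponds to roughly 32 to 37.5 repeats of the tetrasaccharide unit.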

  10. Hidden Structural Codes in Protein Intrinsic Disorder.

    PubMed

    Borkosky, Silvia S; Camporeale, Gabriela; Chemes, Lucía B; Risso, Marikena; Noval, María Gabriela; Sánchez, Ignacio E; Alonso, Leonardo G; de Prat Gay, Gonzalo

    2017-10-17

    Intrinsic disorder is a major structural category in biology, accounting for more than 30% of coding regions across the domains of life, yet consists of conformational ensembles in equilibrium, a major challenge in protein chemistry. Anciently evolved papillomavirus genomes constitute an unparalleled case for sequence to structure-function correlation in cases in which there are no folded structures. E7, the major transforming oncoprotein of human papillomaviruses, is a paradigmatic example among the intrinsically disordered proteins. Analysis of a large number of sequences of the same viral protein allowed for the identification of a handful of residues with absolute conservation, scattered along the sequence of its N-terminal intrinsically disordered domain, which intriguingly are mostly leucine residues. Mutation of these led to a pronounced increase in both α-helix and β-sheet structural content, reflected by drastic effects on equilibrium propensities and oligomerization kinetics, and uncovered the existence of local structural elements that oppose canonical folding. These folding relays suggest the existence of yet undefined hidden structural codes behind intrinsic disorder in this model protein. Thus, evolution pinpoints conformational hot spots that could not have been identified by direct experimental methods for analyzing or perturbing the equilibrium of an intrinsically disordered protein ensemble.

  11. SigniSite: Identification of residue-level genotype-phenotype correlations in protein multiple sequence alignments.

    PubMed

    Jessen, Leon Eyrich; Hoof, Ilka; Lund, Ole; Nielsen, Morten

    2013-07-01

    Identifying which mutation(s) within a given genotype is responsible for an observable phenotype is important in many aspects of molecular biology. Here, we present SigniSite, an online application for subgroup-free residue-level genotype-phenotype correlation. In contrast to similar methods, SigniSite does not require any pre-definition of subgroups or binary classification. Input is a set of protein sequences where each sequence has an associated real number, quantifying a given phenotype. SigniSite will then identify which amino acid residues are significantly associated with the data set phenotype. As output, SigniSite displays a sequence logo, depicting the strength of the phenotype association of each residue and a heat-map identifying 'hot' or 'cold' regions. SigniSite was benchmarked against SPEER, a state-of-the-art method for the prediction of specificity determining positions (SDP) using a set of human immunodeficiency virus protease-inhibitor genotype-phenotype data and corresponding resistance mutation scores from the Stanford University HIV Drug Resistance Database, and a data set of protein families with experimentally annotated SDPs. For both data sets, SigniSite was found to outperform SPEER. SigniSite is available at: http://www.cbs.dtu.dk/services/SigniSite/.
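
    To make the idea of subgroup-free, residue-level association concrete, the Python sketch below ranks sequences by their phenotype value and, for each residue at each alignment position, computes a rank-sum style z-score (normal approximation, no tie correction). This is an illustrative stand-in and should not be read as SigniSite's actual statistic or significance procedure; the function name and toy data are assumptions.

```python
import math
from collections import defaultdict


def residue_rank_z_scores(sequences, phenotypes):
    """Map (position, residue) to a z-score of the mean phenotype rank of its carriers."""
    n_total = len(sequences)
    if n_total != len(phenotypes) or n_total < 2:
        raise ValueError("need one phenotype value per sequence and at least 2 sequences")
    # Rank 1 is the smallest phenotype; distinct phenotype values are assumed.
    order = sorted(range(n_total), key=lambda i: phenotypes[i])
    rank = {idx: r for r, idx in enumerate(order, start=1)}
    expected = (n_total + 1) / 2  # mean rank under no association
    scores = {}
    for pos in range(len(sequences[0])):
        ranks_by_residue = defaultdict(list)
        for idx, seq in enumerate(sequences):
            ranks_by_residue[seq[pos]].append(rank[idx])
        for residue, ranks in ranks_by_residue.items():
            n = len(ranks)
            if n == n_total:
                continue  # invariant column carries no association signal
            variance = (n_total + 1) * (n_total - n) / (12 * n)
            scores[(pos, residue)] = (sum(ranks) / n - expected) / math.sqrt(variance)
    return scores


# Toy data: residue R at position 1 co-occurs with the higher phenotype values.
toy = residue_rank_z_scores(["AK", "AK", "AR", "AR"], [0.1, 0.2, 0.9, 1.5])
print(toy)
```

    Positive scores mark residues enriched among high-phenotype sequences and negative scores the opposite; a real analysis would also need multiple-testing correction across positions.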

  12. Genome-wide analysis of esterase-like genes in the striped rice stem borer, Chilo suppressalis.

    PubMed

    Wang, Baoju; Wang, Ying; Zhang, Yang; Han, Ping; Li, Fei; Han, Zhaojun

    2015-06-01

    The striped rice stem borer, Chilo suppressalis, a destructive pest of rice, has developed high levels of resistance to certain insecticides. Esterases are reported to be involved in insecticide resistance in several insects. Therefore, this study systematically analyzed esterase-like genes in C. suppressalis. Fifty-one esterase-like genes were identified in the draft genomic sequences of the species, and 20 cDNA sequences were derived which encoded full- or nearly full-length proteins. The putative esterase proteins derived from these full-length genes are overall highly diversified. However, key residues that are functionally important including the serine residue in the active site are conserved in 18 out of the 20 proteins. Phylogenetic analysis revealed that most of these genes have homologues in other lepidoptera insects. Genes CsuEst6, CsuEst10, CsuEst11, and CsuEst51 were induced by the insecticide triazophos, and genes CsuEst9, CsuEst11, CsuEst14, and CsuEst51 were induced by the insecticide chlorantraniliprole. Our results provide a foundation for future studies of insecticide resistance in C. suppressalis and for comparative research with esterase genes from other insect species.

  13. Anomaly Detection in Moving-Camera Video Sequences Using Principal Subspace Analysis

    DOE PAGES

    Thomaz, Lucas A.; Jardim, Eric; da Silva, Allan F.; ...

    2017-10-16

    This study presents a family of algorithms based on sparse decompositions that detect anomalies in video sequences obtained from slow moving cameras. These algorithms start by computing the union of subspaces that best represents all the frames from a reference (anomaly free) video as a low-rank projection plus a sparse residue. Then, they perform a low-rank representation of a target (possibly anomalous) video by taking advantage of both the union of subspaces and the sparse residue computed from the reference video. Such algorithms provide good detection results while at the same time obviating the need for previous video synchronization. However, this is obtained at the cost of a large computational complexity, which hinders their applicability. Another contribution of this paper approaches this problem by using intrinsic properties of the obtained data representation in order to restrict the search space to the most relevant subspaces, providing computational complexity gains of up to two orders of magnitude. The developed algorithms are shown to cope well with videos acquired in challenging scenarios, as verified by the analysis of 59 videos from the VDAO database that comprises videos with abandoned objects in a cluttered industrial scenario.

  14. Anomaly Detection in Moving-Camera Video Sequences Using Principal Subspace Analysis

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Thomaz, Lucas A.; Jardim, Eric; da Silva, Allan F.

    This study presents a family of algorithms based on sparse decompositions that detect anomalies in video sequences obtained from slow moving cameras. These algorithms start by computing the union of subspaces that best represents all the frames from a reference (anomaly free) video as a low-rank projection plus a sparse residue. Then, they perform a low-rank representation of a target (possibly anomalous) video by taking advantage of both the union of subspaces and the sparse residue computed from the reference video. Such algorithms provide good detection results while at the same time obviating the need for previous video synchronization. However, this is obtained at the cost of a large computational complexity, which hinders their applicability. Another contribution of this paper approaches this problem by using intrinsic properties of the obtained data representation in order to restrict the search space to the most relevant subspaces, providing computational complexity gains of up to two orders of magnitude. The developed algorithms are shown to cope well with videos acquired in challenging scenarios, as verified by the analysis of 59 videos from the VDAO database that comprises videos with abandoned objects in a cluttered industrial scenario.

  15. A deep learning framework for improving long-range residue-residue contact prediction using a hierarchical strategy.

    PubMed

    Xiong, Dapeng; Zeng, Jianyang; Gong, Haipeng

    2017-09-01

    Residue-residue contacts are of great value for protein structure prediction, since contact information, especially from long-range residue pairs, can significantly reduce the complexity of conformational sampling for protein structure prediction in practice. Despite progress in the past decade on protein targets with abundant homologous sequences, accurate contact prediction for proteins with limited sequence information is still far from satisfactory. Methodologies for these hard targets still need further improvement. We presented a computational program DeepConPred, which includes a pipeline of two novel deep-learning-based methods (DeepCCon and DeepRCon) as well as a contact refinement step, to improve the prediction of long-range residue contacts from primary sequences. When compared with previous prediction approaches, our framework employed an effective scheme to identify optimal and important features for contact prediction, and was only trained with coevolutionary information derived from a limited number of homologous sequences to ensure robustness and usefulness for hard targets. Independent tests showed that 59.33%/49.97%, 64.39%/54.01% and 70.00%/59.81% of the top L/5, top L/10 and top 5 predictions were correct for CASP10/CASP11 proteins, respectively. In general, our algorithm ranked as one of the best methods for CASP targets. All source data and codes are available at http://166.111.152.91/Downloads.html. Contact: hgong@tsinghua.edu.cn or zengjy321@tsinghua.edu.cn. Supplementary data are available at Bioinformatics online. © The Author (2017). Published by Oxford University Press.
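
    The "top L/5" figures quoted above are precision values over the highest-scoring predicted pairs, where L is the protein length. The following is a minimal sketch of that evaluation, not the DeepConPred code. The function name, the input layout, and the long-range cutoff of 24 residues (the usual CASP convention) are assumptions made for illustration.

```python
def top_k_precision(scores, true_contacts, k, min_separation=24):
    """Fraction of the k highest-scoring long-range pairs that are true contacts.

    scores: dict mapping (i, j) residue-index pairs to predicted scores.
    true_contacts: set of (i, j) pairs observed in the native structure.
    min_separation: assumed long-range cutoff on |i - j| (hypothetical default).
    """
    # Store true contacts order-independently so (i, j) matches (j, i).
    truth = {frozenset(pair) for pair in true_contacts}
    # Keep only long-range pairs.
    long_range = [(pair, s) for pair, s in scores.items()
                  if abs(pair[0] - pair[1]) >= min_separation]
    # Rank by descending score and keep the top k.
    ranked = sorted(long_range, key=lambda item: item[1], reverse=True)[:k]
    if not ranked:
        return 0.0
    hits = sum(1 for pair, _ in ranked if frozenset(pair) in truth)
    return hits / len(ranked)


# Toy data: protein length L = 50, so "top L/5" means the top 10 predictions.
L = 50
scores = {(1, 30): 0.9, (2, 40): 0.8, (5, 45): 0.7, (3, 6): 0.99}
true_contacts = {(30, 1), (5, 45)}
print(top_k_precision(scores, true_contacts, k=L // 5))
```

    The short-range pair (3, 6) is filtered out before ranking, so a high score there does not affect the long-range precision.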

  16. Proline: The Distribution, Frequency, Positioning, and Common Functional Roles of Proline and Polyproline Sequences in the Human Proteome

    PubMed Central

    Morgan, Alexander A.; Rubenstein, Edward

    2013-01-01

    Proline is an anomalous amino acid. Its nitrogen atom is covalently locked within a ring, thus it is the only proteinogenic amino acid with a constrained phi angle. Sequences of three consecutive prolines can fold into polyproline helices, structures that join alpha helices and beta pleats as architectural motifs in protein configuration. Triproline helices are participants in protein-protein signaling interactions. Longer spans of repeat prolines also occur, containing as many as 27 consecutive proline residues. Little is known about the frequency, positioning, and functional significance of these proline sequences. Therefore we have undertaken a systematic bioinformatics study of proline residues in proteins. We analyzed the distribution and frequency of 687,434 proline residues among 18,666 human proteins, identifying single residues, dimers, trimers, and longer repeats. Proline accounts for 6.3% of the 10,882,808 protein amino acids. Of all proline residues, 4.4% are in trimers or longer spans. We detected patterns that influence function based on proline location, spacing, and concentration. We propose a classification based on proline-rich, polyproline-rich, and proline-poor status. Whereas singlet proline residues are often found in proteins that display recurring architectural patterns, trimers or longer proline sequences tend to be associated with the absence of repetitive structural motifs. Spans of 6 or more are associated with DNA/RNA processing, actin, and developmental processes. We also suggest a role for proline in Kruppel-type zinc finger protein control of DNA expression, and in the nucleation and translocation of actin by the formin complex. PMID:23372670
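
    The counting described above (single prolines, dimers, trimers, and longer repeats, plus the share of prolines in runs of three or more) can be sketched as follows. This is an illustrative sketch only, not the authors' pipeline. The function names and the toy sequences are hypothetical.

```python
import re
from collections import Counter


def proline_run_lengths(sequence):
    """Map run length -> number of maximal proline runs of that length."""
    return Counter(len(m.group()) for m in re.finditer(r"P+", sequence.upper()))


def fraction_in_long_runs(sequences, min_run=3):
    """Fraction of all proline residues that sit in runs of at least min_run."""
    total = 0
    in_long_runs = 0
    for seq in sequences:
        for length, count in proline_run_lengths(seq).items():
            total += length * count
            if length >= min_run:
                in_long_runs += length * count
    return in_long_runs / total if total else 0.0


# Toy sequences (hypothetical, not taken from the human proteome).
toy = ["APPPGPAPP", "MKTPLV"]
print(proline_run_lengths(toy[0]))
print(fraction_in_long_runs(toy))
```

    Applied to a full proteome, the second function would give the kind of figure reported in the abstract (4.4% of prolines in trimers or longer spans), provided the same run definition is used.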

  17. Cloning and characterization of the novel D-aspartyl endopeptidase, paenidase, from Paenibacillus sp. B38.

    PubMed

    Nirasawa, Satoru; Nakahara, Kazuhiko; Takahashi, Saori

    2018-02-27

    Paenidase is the first microorganism-derived D-aspartyl endopeptidase that specifically recognizes an internal D-Asp residue to cleave [D-Asp]-X peptide bonds. Using peptide sequences obtained from the protein, we performed PCR with degenerate primers to amplify the paenidase I-encoding gene. Nucleotide sequencing revealed that mature paenidase I consists of 322 amino acid residues and that the protein is encoded as a pro-protein with a 197-amino-acid N-terminal extension compared to the mature protein. Paenidase I exhibits amino acid sequence similarity to several penicillin-binding proteins. In addition, paenidase I was classified into peptidase family S12 based on a MEROPS database search. Family S12 contains serine-type D-Ala-D-Ala carboxypeptidases that have three active site residues (Ser, Lys, and Tyr) in the conserved motifs Ser-Xaa-Thr-Lys and Tyr-Xaa-Asn. These motifs were conserved in the primary structure of paenidase I, and the role of these residues was confirmed by site-directed mutagenesis.

  18. DNA-binding proteins from marine bacteria expand the known sequence diversity of TALE-like repeats

    PubMed Central

    de Lange, Orlando; Wolf, Christina; Thiel, Philipp; Krüger, Jens; Kleusch, Christian; Kohlbacher, Oliver; Lahaye, Thomas

    2015-01-01

    Transcription Activator-Like Effectors (TALEs) of Xanthomonas bacteria are programmable DNA binding proteins with unprecedented target specificity. Comparative studies into TALE repeat structure and function are hindered by the limited sequence variation among TALE repeats. More sequence-diverse TALE-like proteins are known from Ralstonia solanacearum (RipTALs) and Burkholderia rhizoxinica (Bats), but RipTAL and Bat repeats are conserved with those of TALEs around the DNA-binding residue. We study two novel marine-organism TALE-like proteins (MOrTL1 and MOrTL2), the first to date of non-terrestrial origin. We have assessed their DNA-binding properties and modelled repeat structures. We found that repeats from these proteins mediate sequence specific DNA binding conforming to the TALE code, despite low sequence similarity to TALE repeats, and with novel residues around the BSR. However, MOrTL1 repeats show greater sequence discriminating power than MOrTL2 repeats. Sequence alignments show that there are only three residues conserved between repeats of all TALE-like proteins including the two new additions. This conserved motif could prove useful as an identifier for future TALE-likes. Additionally, comparing MOrTL repeats with those of other TALE-likes suggests a common evolutionary origin for the TALEs, RipTALs and Bats. PMID:26481363

  19. Spatial Autocorrelation Approaches to Testing Residuals from Least Squares Regression.

    PubMed

    Chen, Yanguang

    2016-01-01

    In geo-statistics, the Durbin-Watson test is frequently employed to detect the presence of residual serial correlation from least squares regression analyses. However, the Durbin-Watson statistic is only suitable for ordered time or spatial series. If the variables comprise cross-sectional data coming from spatial random sampling, the test will be ineffectual because the value of Durbin-Watson's statistic depends on the sequence of data points. This paper develops two new statistics for testing serial correlation of residuals from least squares regression based on spatial samples. By analogy with the new form of Moran's index, an autocorrelation coefficient is defined with a standardized residual vector and a normalized spatial weight matrix. Then by analogy with the Durbin-Watson statistic, two types of new serial correlation indices are constructed. As a case study, the two newly presented statistics are applied to a spatial sample of 29 China's regions. These results show that the new spatial autocorrelation models can be used to test the serial correlation of residuals from regression analysis. In practice, the new statistics can make up for the deficiencies of the Durbin-Watson test.
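
    For orientation, the two ingredients named above can be written out in a few lines. The sketch below computes the classical Durbin-Watson statistic, which depends on the ordering of the residuals, and a Moran-type coefficient z'Wz built from a standardized residual vector z and a weight matrix W normalized so that its entries sum to one. Whether this matches the paper's exact definitions is an assumption; the function names and the toy data are hypothetical.

```python
import math


def durbin_watson(residuals):
    """Classical DW statistic: sum of squared successive differences / sum of squares."""
    num = sum((residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals)))
    den = sum(e ** 2 for e in residuals)
    return num / den


def moran_type_coefficient(residuals, weights):
    """z' W z with z standardized (population std) and W scaled to sum to 1."""
    n = len(residuals)
    mean = sum(residuals) / n
    std = math.sqrt(sum((e - mean) ** 2 for e in residuals) / n)
    z = [(e - mean) / std for e in residuals]
    total = sum(sum(row) for row in weights)
    return sum(weights[i][j] / total * z[i] * z[j]
               for i in range(n) for j in range(n))


# Toy residuals that alternate in sign, with neighbours defined along a chain.
residuals = [1.0, -1.0, 1.0, -1.0]
chain = [[1.0 if abs(i - j) == 1 else 0.0 for j in range(4)] for i in range(4)]
print(durbin_watson(residuals))
print(moran_type_coefficient(residuals, chain))
```

    Reordering the residuals changes the Durbin-Watson value, whereas the weight-matrix form depends only on which observations are declared neighbours. That is the motivation given in the abstract for spatial samples.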

  20. Pattern similarity study of functional sites in protein sequences: lysozymes and cystatins

    PubMed Central

    Nakai, Shuryo; Li-Chan, Eunice CY; Dou, Jinglie

    2005-01-01

    Background: Although it is generally agreed that topography is more conserved than sequences, proteins sharing the same fold can have different functions, while there are protein families with low sequence similarity. An alternative method for profile analysis of characteristic conserved positions of the motifs within the 3D structures may be needed for functional annotation of protein sequences. Using the approach of quantitative structure-activity relationships (QSAR), we have proposed a new algorithm for postulating functional mechanisms on the basis of pattern similarity and average of property values of side-chains in segments within sequences. This approach was used to search for functional sites of proteins belonging to the lysozyme and cystatin families. Results: Hydrophobicity and β-turn propensity of reference segments with 3–7 residues were used for the homology similarity search (HSS) for active sites. Hydrogen bonding was used as the side-chain property for searching the binding sites of lysozymes. The profiles of similarity constants and average values of these parameters as functions of their positions in the sequences could identify both active and substrate binding sites of the lysozyme of Streptomyces coelicolor, which has been reported as a new fold enzyme (Cellosyl). The same approach was successfully applied to cystatins, especially for postulating the mechanisms of amyloidosis of human cystatin C as well as human lysozyme. Conclusion: Pattern similarity and average index values of structure-related properties of side chains in short segments of three residues or longer were, for the first time, successfully applied for predicting functional sites in sequences. This new approach may be applicable to studying functional sites in un-annotated proteins, for which complete 3D structures are not yet available. PMID:15904486

  1. Molecular cloning and sequence analysis of full-length growth hormone cDNAs from six important economic fishes.

    PubMed

    Zhang, Jing-Nan; Song, Ping; Hu, Jia-Rui; Mo, Sai-Jun; Peng, Mao-Yu; Zhou, Wei; Zou, Ji-Xing; Hu, Yin-Chang

    2005-01-01

    In this study, full-length cDNAs of the GH (growth hormone) gene were isolated from six economically important fishes, Siniperca kneri, Epinephelus coioides, Monopterus albus, Silurus asotus, Misgurnus anguillicaudatus and Carassius auratus gibelio Bloch. This is the first cloning of these GH sequences, except for that of E. coioides. The lengths of these cDNAs are, respectively, 953 bp, 1,023 bp, 825 bp, 1,082 bp, 1,154 bp and 1,180 bp. Each sequence includes an ORF of about 600 bp, which encodes a protein of about 200 amino acids: the S. kneri, E. coioides and M. albus GHs have 204 amino acids, the S. asotus GH has 200 amino acids, and the M. anguillicaudatus and C. auratus gibelio GHs have 210 amino acids. A detailed sequence analysis of the six GHs together with many other fish GH sequences was then performed. The six sequences all showed high homology to other sequences, especially to sequences within the same order, and many conserved residues were identified, most localized in five domains. The phylogenetic trees (MP and NJ) of many fish GH ORF sequences (including the six new ones), with Amia calva as the outgroup, were generally resolved and largely congruent with the morphology-based tree, though some incongruities were observed, suggesting that the GH ORF deserves more attention in teleostean phylogeny.

  2. Cloning and sequence analysis of a cDNA clone coding for the mouse GM2 activator protein.

    PubMed Central

    Bellachioma, G; Stirling, J L; Orlacchio, A; Beccari, T

    1993-01-01

    A cDNA (1.1 kb) containing the complete coding sequence for the mouse GM2 activator protein was isolated from a mouse macrophage library using a cDNA for the human protein as a probe. There was a single ATG located 12 bp from the 5' end of the cDNA clone followed by an open reading frame of 579 bp. Northern blot analysis of mouse macrophage RNA showed that there was a single band with a mobility corresponding to a size of 2.3 kb. We deduce from this that the mouse mRNA, in common with the mRNA for the human GM2 activator protein, has a long 3' untranslated sequence of approx. 1.7 kb. Alignment of the mouse and human deduced amino acid sequences showed 68% identity overall and 75% identity for the sequence on the C-terminal side of the first 31 residues, which in the human GM2 activator protein contains the signal peptide. Hydropathicity plots showed great similarity between the mouse and human sequences even in regions of low sequence similarity. There is a single N-glycosylation site in the mouse GM2 activator protein sequence (Asn151-Phe-Thr) which differs in its location from the single site reported in the human GM2 activator protein sequence (Asn63-Val-Thr). PMID:7689829
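
    Percent identity figures such as the 68% and 75% quoted above are computed from an existing pairwise alignment. A minimal sketch follows, assuming that the two sequences are already aligned with "-" as the gap character and that columns containing a gap are excluded from the denominator. Other conventions exist, and the abstract does not say which was used. The function name and the toy alignment are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, start=0):
    """Percent of gap-free alignment columns, from column `start` on, that match."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a[start:], aligned_b[start:]):
        if a == "-" or b == "-":
            continue  # skip columns with a gap in either sequence
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Toy alignment (hypothetical sequences, not the GM2 activator proteins).
seq_a = "MKT-ALLV"
seq_b = "MRTQALIV"
print(percent_identity(seq_a, seq_b))           # whole alignment
print(percent_identity(seq_a, seq_b, start=4))  # region after the first 4 columns
```

    The `start` argument mirrors the abstract's second figure, which restricts the comparison to the region after the signal peptide.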

  3. Evolution and the Distribution of Glutaminyl and Asparaginyl Residues in Proteins

    PubMed Central

    Robinson, Arthur B.

    1974-01-01

    Recent experiments on the deamidation of glutaminyl and asparaginyl residues in peptides and proteins support the hypothesis that these residues may serve as molecular clocks that control biological processes. A hypothesis is now offered that suggests that these molecular clocks are set by rejection or accumulation of appropriate sequences of residues including a glutaminyl or asparaginyl residue during evolution. PMID:4522799

  4. Homonuclear Hartmann-Hahn transfer with reduced relaxation losses by use of the MOCCA-XY16 multiple pulse sequence

    NASA Astrophysics Data System (ADS)

    Furrer, Julien; Kramer, Frank; Marino, John P.; Glaser, Steffen J.; Luy, Burkhard

    2004-01-01

    Homonuclear Hartmann-Hahn transfer is one of the most important building blocks in modern high-resolution NMR. It constitutes a very efficient transfer element for the assignment of proteins, nucleic acids, and oligosaccharides. Nevertheless, in macromolecules exceeding ~10 kDa TOCSY-experiments can show decreasing sensitivity due to fast transverse relaxation processes that are active during the mixing periods. In this article we propose the MOCCA-XY16 multiple pulse sequence, originally developed for efficient TOCSY transfer through residual dipolar couplings, as a homonuclear Hartmann-Hahn sequence with improved relaxation properties. A theoretical analysis of the coherence transfer via scalar couplings and its relaxation behavior as well as experimental transfer curves for MOCCA-XY16 relative to the well-characterized DIPSI-2 multiple pulse sequence are given.

  5. Homonuclear Hartmann-Hahn transfer with reduced relaxation losses by use of the MOCCA-XY16 multiple pulse sequence.

    PubMed

    Furrer, Julien; Kramer, Frank; Marino, John P; Glaser, Steffen J; Luy, Burkhard

    2004-01-01

    Homonuclear Hartmann-Hahn transfer is one of the most important building blocks in modern high-resolution NMR. It constitutes a very efficient transfer element for the assignment of proteins, nucleic acids, and oligosaccharides. Nevertheless, in macromolecules exceeding approximately 10 kDa TOCSY-experiments can show decreasing sensitivity due to fast transverse relaxation processes that are active during the mixing periods. In this article we propose the MOCCA-XY16 multiple pulse sequence, originally developed for efficient TOCSY transfer through residual dipolar couplings, as a homonuclear Hartmann-Hahn sequence with improved relaxation properties. A theoretical analysis of the coherence transfer via scalar couplings and its relaxation behavior as well as experimental transfer curves for MOCCA-XY16 relative to the well-characterized DIPSI-2 multiple pulse sequence are given.

  6. Quality Control Test for Sequence-Phenotype Assignments

    PubMed Central

    Ortiz, Maria Teresa Lara; Rosario, Pablo Benjamín Leon; Luna-Nevarez, Pablo; Gamez, Alba Savin; Martínez-del Campo, Ana; Del Rio, Gabriel

    2015-01-01

    Relating a gene mutation to a phenotype is a common task in different disciplines such as protein biochemistry. In this endeavour, it is common to find false relationships arising from mutations introduced by cells that may be depurated using a phenotypic assay; yet, such phenotypic assays may introduce additional false relationships arising from experimental errors. Here we introduce the use of high-throughput DNA sequencers and statistical analysis aimed to identify incorrect DNA sequence-phenotype assignments and observed that 10–20% of these false assignments are expected in large screenings aimed to identify critical residues for protein function. We further show that this level of incorrect DNA sequence-phenotype assignments may significantly alter our understanding about the structure-function relationship of proteins. We have made available an implementation of our method at http://bis.ifc.unam.mx/en/software/chispas. PMID:25700273

  7. Characteristics of purple nonsulfur bacteria grown under Stevia residue extractions.

    PubMed

    Xu, J; Feng, Y; Wang, Y; Lin, X

    2013-11-01

    As a consequence of the large-scale cultivation of Stevia plants, releases of plant residues, the byproduct after sweetener extraction, to the environment are inevitable. Stevia residue and its effluent after batching up contain large amounts of low-molecular-weight organic matter, which is therefore a potential pollution source. At the same time, this material is a favourable substrate for microbial growth. This investigation aimed to use the simulated effluent of Stevia residue to enrich the representative purple nonsulfur bacterium (PNSB), Rhodopseudomonas palustris (Rps. palustris), which has important economic value. The growth profile and quality of Rps. palustris were characterized by spectrophotometry, compared to those grown in common PNSB mineral synthetic medium. Our results revealed that the simulated effluent of Stevia residue not only stimulated Rps. palustris growth to a greater extent, but also increased its physiologically active cytochrome concentrations and excreted indole-3-acetic acid (IAA) content. This variation in phenotype of Rps. palustris could result from the shift in its genotype, further revealed by the repetitive sequence-based PCR (rep-PCR) fingerprinting analysis. Our results showed that the effluent of Stevia residue was a promising substrate for microbial growth. © 2013 The Society for Applied Microbiology.

  8. Nucleotide sequence and transcriptional start site of the Methylobacterium organophilum XX methanol dehydrogenase structural gene

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Machlin, S.M.; Hanson, R.S.

    The nucleotide sequence of a cloned 2.5-kilobase-pair SmaI fragment containing the methanol dehydrogenase (MDH) structural gene from Methylobacterium organophilum XX was determined. A single open reading frame with a coding capacity of 626 amino acids (molecular weight, 66,000) was identified on one strand, and N-terminal sequencing of purified MDH revealed that 27 of these residues constituted a putative signal peptide. Primer extension mapping of in vivo transcripts indicated that the start of mRNA synthesis was 160 to 170 base pairs upstream of the ATG codon. Northern (RNA) blot analysis further demonstrated that the transcript was 2.1 kilobase pairs in length and therefore appeared to encode only MDH.

  9. Backbone hydration determines the folding signature of amino acid residues.

    PubMed

    Bignucolo, Olivier; Leung, Hoi Tik Alvin; Grzesiek, Stephan; Bernèche, Simon

    2015-04-08

    The relation between the sequence of a protein and its three-dimensional structure remains largely unknown. A lasting dream is to elucidate the side-chain-dependent driving forces that govern the folding process. Different structural data suggest that aromatic amino acids play a particular role in the stabilization of protein structures. To better understand the underlying mechanism, we studied peptides of the sequence EGAAXAASS (X = Gly, Ile, Tyr, Trp) through comparison of molecular dynamics (MD) trajectories and NMR residual dipolar coupling (RDC) measurements. The RDC data for aromatic substitutions provide evidence for a kink in the peptide backbone. Analysis of the MD simulations shows that the formation of internal hydrogen bonds underlying a helical turn is key to reproduce the experimental RDC values. The simulations further reveal that the driving force leading to such helical-turn conformations arises from the lack of hydration of the peptide chain on either side of the bulky aromatic side chain, which can potentially act as a nucleation point initiating the folding process.

  10. Identification of Milk Component in Ancient Food Residue by Proteomics

    PubMed Central

    Hong, Chuan; Jiang, Hongen; Lü, Enguo; Wu, Yunfei; Guo, Lihai; Xie, Yongming; Wang, Changsui; Yang, Yimin

    2012-01-01

    Background: Proteomic approaches based on mass spectrometry have recently been used in archaeological and art research, generating promising results for protein identification. Little is known about the eastward spread and eastern limits of prehistoric milking in eastern Eurasia. Methodology/Principal Finding: In this paper, an ancient visible food remain from the Subeixi Cemeteries (cal. 500 to 300 years BC) of the Turpan Basin in Xinjiang, China, preliminarily determined by ELISA to contain 0.432 mg/kg cattle casein, was analyzed using an improved method based on liquid chromatography (LC) coupled with MALDI-TOF/TOF-MS to further identify the protein origin. The specific sequence of bovine casein and the homologous sequence of goat/sheep casein were identified. Conclusions/Significance: The existence of a milk component in ancient food implies goat/sheep and cattle milking in the ancient Subeixi region, the easternmost location of prehistoric milking in the Old World known to date. It is envisioned that this work provides a new approach for ancient residue analysis and other fields of archaeometry. PMID:22615887

  11. LoopX: A Graphical User Interface-Based Database for Comprehensive Analysis and Comparative Evaluation of Loops from Protein Structures.

    PubMed

    Kadumuri, Rajashekar Varma; Vadrevu, Ramakrishna

    2017-10-01

    Due to their crucial role in function, folding, and stability, protein loops are being targeted for grafting/designing to create novel or alter existing functionality and improve stability and foldability. With a view to facilitating thorough analysis and effective search options for extracting and comparing loops for sequence and structural compatibility, we developed LoopX, a comprehensively compiled library of sequence and conformational features of ∼700,000 loops from protein structures. The database, equipped with a graphical user interface, offers diverse query tools and search algorithms, with various rendering options to visualize the sequence- and structural-level information along with hydrogen bonding patterns and backbone φ, ψ dihedral angles of both the target and candidate loops. Two new features, (i) conservation of the polar/nonpolar environment and (ii) conservation of sequence and conformation of specific residues within the loops, have also been incorporated in the search and retrieval of compatible loops for a chosen target loop. Thus, the LoopX server not only serves as a database and visualization tool for sequence and structural analysis of protein loops but also aids in extracting and comparing candidate loops for a given target loop based on user-defined search options.

  12. Uncertainty and sensitivity analysis for photovoltaic system modeling.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hansen, Clifford W.; Pohl, Andrew Phillip; Jordan, Dirk

    2013-12-01

    We report an uncertainty and sensitivity analysis for modeling DC energy from photovoltaic systems. We consider two systems, each comprised of a single module using either crystalline silicon or CdTe cells, and located either at Albuquerque, NM, or Golden, CO. Output from a PV system is predicted by a sequence of models. Uncertainty in the output of each model is quantified by empirical distributions of each model's residuals. We sample these distributions to propagate uncertainty through the sequence of models to obtain an empirical distribution for each PV system's output. We considered models that: (1) translate measured global horizontal, direct and global diffuse irradiance to plane-of-array irradiance; (2) estimate effective irradiance from plane-of-array irradiance; (3) predict cell temperature; and (4) estimate DC voltage, current and power. We found the uncertainty in PV system output to be relatively small, on the order of 1% for daily energy. Four alternative models were considered for the POA irradiance modeling step; we did not find the choice of one of these models to be of great significance. However, we observed that the POA irradiance model introduced a bias of upwards of 5% of daily energy which translates directly to a systematic difference in predicted energy. Sensitivity analyses relate uncertainty in the PV system output to uncertainty arising from each model. We found the residuals arising from the POA irradiance and the effective irradiance models to be the dominant contributors to residuals for daily energy, for either technology or location considered. This analysis indicates that efforts to reduce the uncertainty in PV system output should focus on improvements to the POA and effective irradiance models.
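
    The propagation scheme described above, in which each model's empirical residuals are resampled and carried through the chain of models, can be sketched generically. This is a minimal sketch under stated assumptions. Residuals are treated as additive and independent between steps, and the two stand-in models and their residual pools are hypothetical rather than the PV models of the report.

```python
import random


def propagate(x0, models, residual_pools, n_samples=1000, seed=0):
    """Monte Carlo propagation through a chain of models.

    models: list of functions, each mapping the previous output to the next.
    residual_pools: one list of observed residuals per model; a residual is
    drawn at random and added to that model's output (assumed additive).
    Returns the list of sampled final outputs (an empirical distribution).
    """
    rng = random.Random(seed)
    outputs = []
    for _ in range(n_samples):
        value = x0
        for model, pool in zip(models, residual_pools):
            value = model(value) + rng.choice(pool)
        outputs.append(value)
    return outputs


# Hypothetical two-step chain: irradiance -> effective irradiance -> power.
def to_effective(irradiance):
    return 0.97 * irradiance


def to_power(effective):
    return 0.2 * effective


pools = [[-5.0, 0.0, 5.0], [-1.0, 0.0, 1.0]]
samples = propagate(1000.0, [to_effective, to_power], pools, n_samples=2000)
mean = sum(samples) / len(samples)
spread = max(samples) - min(samples)
print(round(mean, 2), round(spread, 2))
```

    Setting one model's pool to [0.0] while leaving the others unchanged gives a crude sensitivity check: the reduction in spread indicates how much that model's residuals contribute to the output uncertainty.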

  13. Molecular cloning and characterization of Hymenolepis diminuta alpha-tubulin gene.

    PubMed

    Mohajer-Maghari, Behrokh; Amini-Bavil-Olyaee, Samad; Webb, Rodney A; Coe, Imogen R

    2007-02-01

    To isolate a full-length alpha-tubulin cDNA from a eucestode, Hymenolepis diminuta, a lambda phage cDNA library was constructed. The alpha-tubulin gene was cloned, sequenced and characterized. The H. diminuta alpha-tubulin consisted of 450 amino acids. This protein contained putative sites for all of the following posttranslational modifications: detyrosination/tyrosination at the carboxyl terminus of the protein, phosphorylation at residues R79 and K336, glycylation/glutamylation at residue G445 and acetylation at residue K40. Comparisons of H. diminuta alpha-tubulin with all full-length alpha-tubulin proteins revealed that H. diminuta alpha-tubulin possesses 10 distinctive residues, which are not found in any other alpha-tubulins. Phylogenetic analysis showed that H. diminuta alpha-tubulin grouped in a separate branch adjacent to the eucestode and trematode branch with a 92% bootstrap value (1000 replicates). In conclusion, this is the first report of H. diminuta cDNA library construction and of the cloning and characterization of the H. diminuta alpha-tubulin gene.

  14. Not all transmembrane helices are born equal: Towards the extension of the sequence homology concept to membrane proteins

    PubMed Central

    2011-01-01

    Background: Sequence homology considerations widely used to transfer functional annotation to uncharacterized protein sequences require special precautions in the case of non-globular sequence segments including membrane-spanning stretches composed of non-polar residues. Simple, quantitative criteria are desirable for identifying transmembrane helices (TMs) that must be included into or should be excluded from start sequence segments in similarity searches aimed at finding distant homologues. Results: We found that there are two types of TMs in membrane-associated proteins. On the one hand, there are so-called simple TMs with elevated hydrophobicity, low sequence complexity and extraordinary enrichment in long aliphatic residues. They merely serve as membrane-anchoring device. In contrast, so-called complex TMs have lower hydrophobicity, higher sequence complexity and some functional residues. These TMs have additional roles besides membrane anchoring such as intra-membrane complex formation, ligand binding or a catalytic role. Simple and complex TMs can occur both in single- and multi-membrane-spanning proteins essentially in any type of topology. Whereas simple TMs have the potential to confuse searches for sequence homologues and to generate unrelated hits with seemingly convincing statistical significance, complex TMs contain essential evolutionary information. Conclusion: For extending the homology concept onto membrane proteins, we provide a necessary quantitative criterion to distinguish simple TMs (and a sufficient criterion for complex TMs) in query sequences prior to their usage in homology searches based on assessment of hydrophobicity and sequence complexity of the TM sequence segments. Reviewers: This article was reviewed by Shamil Sunyaev, L. Aravind and Arcady Mushegian. PMID:22024092
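
    The two quantities the abstract relies on, hydrophobicity and sequence complexity of a TM segment, can be illustrated with standard stand-ins. The sketch below uses the Kyte-Doolittle hydropathy scale and the Shannon entropy of the residue composition. Both are assumptions chosen for illustration; the paper's actual scales, thresholds, and decision rule are not given in the abstract and are not reproduced here. The two example segments are hypothetical.

```python
import math
from collections import Counter

# Kyte-Doolittle hydropathy values (higher = more hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_hydropathy(segment):
    """Average Kyte-Doolittle hydropathy over the segment."""
    return sum(KYTE_DOOLITTLE[aa] for aa in segment) / len(segment)


def shannon_entropy(segment):
    """Shannon entropy (bits) of the residue composition; low = low complexity."""
    n = len(segment)
    counts = Counter(segment)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


# Hypothetical segments: an aliphatic-rich anchor versus a more varied helix.
anchor_like = "LLLVLLLALLLVLLLILLLA"
varied = "GYSWTFLNACDHMIPVQSTG"
for name, seg in [("anchor_like", anchor_like), ("varied", varied)]:
    print(name, round(mean_hydropathy(seg), 2), round(shannon_entropy(seg), 2))
```

    In the abstract's terms, a segment with high mean hydropathy and low entropy resembles a "simple" TM, while lower hydropathy with higher entropy resembles a "complex" one. Any numeric cutoff would have to come from the paper itself.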

  15. Mapping PDB chains to UniProtKB entries.

    PubMed

    Martin, Andrew C R

    2005-12-01

    UniProtKB/SwissProt is the main resource for detailed annotations of protein sequences. This database provides a jumping-off point to many other resources through the links it provides. Among others, these include other primary databases, secondary databases, the Gene Ontology and OMIM. While a large number of links are provided to Protein Data Bank (PDB) files, obtaining a regularly updated mapping between UniProtKB entries and PDB entries at the chain or residue level is not straightforward. In particular, there is no regularly updated resource which allows a UniProtKB/SwissProt entry to be identified for a given residue of a PDB file. We have created a completely automatically maintained database which maps PDB residues to residues in UniProtKB/SwissProt and UniProtKB/trEMBL entries. The protocol uses links from PDB to UniProtKB, from UniProtKB to PDB and a brute-force sequence scan to resolve PDB chains for which no annotated link is available. Finally the sequences from PDB and UniProtKB are aligned to obtain a residue-level mapping. The resource may be queried interactively or downloaded from http://www.bioinf.org.uk/pdbsws/.
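
    The final step described above, turning a pairwise alignment of a PDB chain sequence and a UniProtKB sequence into a residue-level mapping, can be sketched as follows. This is an illustrative assumption rather than the PDBSWS implementation: it takes two already-aligned strings, uses 1-based sequential positions instead of PDB author numbering or insertion codes, and the function name is hypothetical.

```python
def residue_mapping(aligned_pdb, aligned_uniprot, gap="-"):
    """Map 1-based positions in the PDB chain sequence to 1-based
    positions in the UniProtKB sequence, given two aligned strings of
    equal length. Columns with a gap in either sequence are skipped."""
    if len(aligned_pdb) != len(aligned_uniprot):
        raise ValueError("aligned sequences must have equal length")
    mapping = {}
    pdb_pos = uniprot_pos = 0
    for a, b in zip(aligned_pdb, aligned_uniprot):
        if a != gap:
            pdb_pos += 1
        if b != gap:
            uniprot_pos += 1
        if a != gap and b != gap:
            mapping[pdb_pos] = uniprot_pos
    return mapping


# Hypothetical toy alignment: the PDB chain lacks the initiator Met
# and one internal residue that is present in the UniProtKB sequence.
example = residue_mapping("-ACD-E", "MACDKE")
```

    In this toy case the fourth PDB residue maps to the sixth UniProtKB residue, because the alignment places a gap in the PDB chain at the fifth column.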

  16. Ultrahigh-resolution Fourier transform ion cyclotron resonance mass spectrometry and tandem mass spectrometry for peptide de novo amino acid sequencing for a seven-protein mixture by paired single-residue transposed Lys-N and Lys-C digestion.

    PubMed

    Guan, Xiaoyan; Brownstein, Naomi C; Young, Nicolas L; Marshall, Alan G

    2017-01-30

    Bottom-up tandem mass spectrometry (MS/MS) is regularly used in proteomics to identify proteins from a sequence database. De novo sequencing is also available for sequencing peptides with relatively short sequence lengths. We recently showed that paired Lys-C and Lys-N proteases produce peptides of identical mass and similar retention time, but different tandem mass spectra. Such parallel experiments provide complementary information, and allow for up to 100% MS/MS sequence coverage. Here, we report digestion by paired Lys-C and Lys-N proteases of a seven-protein mixture: human hemoglobin alpha, bovine carbonic anhydrase 2, horse skeletal muscle myoglobin, hen egg white lysozyme, bovine pancreatic ribonuclease, bovine rhodanese, and bovine serum albumin, followed by reversed-phase nanoflow liquid chromatography, collision-induced dissociation, and 14.5 T Fourier transform ion cyclotron resonance mass spectrometry. Matched pairs of product peptide ions of equal precursor mass and similar retention times from each digestion are compared, leveraging single-residue transposed information with independent interferences to confidently identify fragment ion types, residues, and peptides. Selected pairs of product ion mass spectra for de novo sequenced protein segments from each member of the mixture are presented. Pairs of the transposed product ions as well as complementary information from the parallel experiments allow for both high MS/MS coverage for long peptide sequences and high confidence in the amino acid identification. Moreover, the parallel experiments in the de novo sequencing reduce false-positive matches of product ions from the single-residue transposed peptides from the same segment, and thereby further improve the confidence in protein identification. Copyright © 2016 John Wiley & Sons, Ltd.
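
    The statement that paired Lys-C and Lys-N peptides have identical mass but different tandem mass spectra can be checked with a short Python sketch. The segment used here (AGLV flanked by lysines) is a hypothetical example, not one from the paper; the residue masses are standard monoisotopic values, and the function names are illustrative.

```python
# Standard monoisotopic residue masses in Da.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # added once per intact peptide
PROTON = 1.00728   # charge carrier for singly protonated ions


def peptide_mass(peptide):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def b_ions(peptide):
    """Singly charged b-ion m/z values b1 .. b(n-1)."""
    total, ions = 0.0, []
    for aa in peptide[:-1]:
        total += RESIDUE_MASS[aa]
        ions.append(total + PROTON)
    return ions


# Hypothetical segment ...K|AGLVK|... : Lys-C cleaves after K,
# Lys-N cleaves before K, so the lysine is transposed between termini.
lys_c_peptide = "AGLVK"
lys_n_peptide = "KAGLV"
```

    Both peptides have the same composition and therefore the same precursor mass, while their b-ion ladders differ from the first fragment onward, which is the single-residue transposition the abstract exploits.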

  17. Salt bridges: geometrically specific, designable interactions.

    PubMed

    Donald, Jason E; Kulp, Daniel W; DeGrado, William F

    2011-03-01

    Salt bridges occur frequently in proteins, providing conformational specificity and contributing to molecular recognition and catalysis. We present a comprehensive analysis of these interactions in protein structures by surveying a large database of protein structures. Salt bridges between Asp or Glu and His, Arg, or Lys display extremely well-defined geometric preferences. Several previously observed preferences are confirmed, and others that were previously unrecognized are discovered. Salt bridges are explored for their preferences for different separations in sequence and in space, geometric preferences within proteins and at protein-protein interfaces, co-operativity in networked salt bridges, inclusion within metal-binding sites, preference for acidic electrons, apparent conformational side chain entropy reduction on formation, and degree of burial. Salt bridges occur far more frequently between residues at close than distant sequence separations, but, at close distances, there remain strong preferences for salt bridges at specific separations. Specific types of complex salt bridges, involving three or more members, are also discovered. As we observe a strong relationship between the propensity to form a salt bridge and the placement of salt-bridging residues in protein sequences, we discuss the role that salt bridges might play in kinetically influencing protein folding and thermodynamically stabilizing the native conformation. We also develop a quantitative method to select appropriate crystal structure resolution and B-factor cutoffs. Detailed knowledge of these geometric and sequence dependences should aid de novo design and prediction algorithms. Copyright © 2010 Wiley-Liss, Inc.

  18. High precision in protein contact prediction using fully convolutional neural networks and minimal sequence features.

    PubMed

    Jones, David T; Kandathil, Shaun M

    2018-04-26

    In addition to substitution frequency data from protein sequence alignments, many state-of-the-art methods for contact prediction rely on additional sources of information, or features, of protein sequences in order to predict residue-residue contacts, such as solvent accessibility, predicted secondary structure, and scores from other contact prediction methods. It is unclear how much of this information is needed to achieve state-of-the-art results. Here, we show that using deep neural network models, simple alignment statistics contain sufficient information to achieve state-of-the-art precision. Our prediction method, DeepCov, uses fully convolutional neural networks operating on amino-acid pair frequency or covariance data derived directly from sequence alignments, without using global statistical methods such as sparse inverse covariance or pseudolikelihood estimation. Comparisons against CCMpred and MetaPSICOV2 show that using pairwise covariance data calculated from raw alignments as input allows us to match or exceed the performance of both of these methods. Almost all of the achieved precision is obtained when considering relatively local windows (around 15 residues) around any member of a given residue pairing; larger window sizes have comparable performance. Assessment on a set of shallow sequence alignments (fewer than 160 effective sequences) indicates that the new method is substantially more precise than CCMpred and MetaPSICOV2 in this regime, suggesting that improved precision is attainable on smaller sequence families. Overall, the performance of DeepCov is competitive with the state of the art, and our results demonstrate that global models, which employ features from all parts of the input alignment when predicting individual contacts, are not strictly needed in order to attain precise contact predictions. DeepCov is freely available at https://github.com/psipred/DeepCov. Contact: d.t.jones@ucl.ac.uk.
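
    The pairwise covariance input mentioned in this abstract can be illustrated for a single pair of alignment columns. This is a minimal sketch assuming the plain definition cov(a, b) = f_ij(a, b) - f_i(a) * f_j(b) with unweighted sequences and no pseudocounts; it is not the DeepCov code, the function name is hypothetical, and the neural network that consumes these values is not shown.

```python
from collections import Counter


def pair_covariance(alignment, i, j):
    """Covariance of residue-pair frequencies between alignment columns
    i and j (0-based): f_ij(a, b) - f_i(a) * f_j(b).

    `alignment` is a list of equal-length aligned sequences."""
    n = len(alignment)
    f_i = Counter(seq[i] for seq in alignment)
    f_j = Counter(seq[j] for seq in alignment)
    f_ij = Counter((seq[i], seq[j]) for seq in alignment)
    return {
        (a, b): f_ij[(a, b)] / n - (f_i[a] / n) * (f_j[b] / n)
        for a in f_i
        for b in f_j
    }


# Hypothetical toy alignment in which the two columns covary perfectly.
toy_alignment = ["AK", "AK", "DE", "DE"]
toy_cov = pair_covariance(toy_alignment, 0, 1)
```

    In the toy alignment the observed pairs (A, K) and (D, E) receive positive covariance and the unobserved combinations receive negative covariance, the kind of signal that columns in contact tend to show.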

  19. Isolation and in silico analysis of a novel H+-pyrophosphatase gene orthologue from the halophytic grass Leptochloa fusca

    NASA Astrophysics Data System (ADS)

    Rauf, Muhammad; Saeed, Nasir A.; Habib, Imran; Ahmed, Moddassir; Shahzad, Khurram; Mansoor, Shahid; Ali, Rashid

    2017-02-01

    Structure prediction can provide information about the function and active sites of a protein, which helps in designing new functional proteins. H+-pyrophosphatase is a transmembrane protein involved in establishing the proton motive force for active transport of Na+ across the membrane by Na+/H+ antiporters. A full-length novel H+-pyrophosphatase gene was isolated from the halophytic grass Leptochloa fusca using RT-PCR and RACE methods. The full-length LfVP1 gene sequence of 2292 nucleotides encodes a protein of 764 amino acids. DNA and protein sequences were characterized using bioinformatics tools. Various important potential sites were predicted by the PROSITE webserver. Primary structural analysis showed LfVP1 to be a stable protein, and the grand average of hydropathy (GRAVY) indicated that the LfVP1 protein has good hydrosolubility. Secondary structure analysis showed that the LfVP1 protein sequence contains a significant proportion of alpha helix and random coil. Protein membrane topology suggested the presence of 14 transmembrane domains and of a catalytic domain in TM3. The three-dimensional structure from the LfVP1 protein sequence also indicated the presence of 14 transmembrane domains, and a hydrophobicity surface model showed amino acid hydrophobicity. The Ramachandran plot showed that 98% of amino acid residues were predicted in the favored region.

  20. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Midura, R.J.; McQuillan, D.J.; Benham, K.J.

    The rat osteosarcoma cell line (UMR 106-01) synthesizes and secretes relatively large amounts of a sulfated glycoprotein into its culture medium (approximately 240 ng/10(6) cells/day). This glycoprotein was purified, and amino-terminal sequence analysis identified it as bone sialoprotein (BSP). (35S)Sulfate, (3H)glucosamine, and (3H)tyrosine were used as metabolic precursors to label the BSP. Sulfate esters were found on N- and O-linked oligosaccharides and on tyrosine residues, with about half of the total tyrosines in the BSP being sulfated. The proportion of 35S activity in tyrosine-O-sulfate (approximately 70%) was greater than that in N-linked (approximately 20%) and O-linked (approximately 10%) oligosaccharides. From the deduced amino acid sequence for rat BSP, the results indicate that on average approximately 12 tyrosine residues, approximately 3 N-linked, and approximately 2 O-linked oligosaccharides are sulfated/molecule. The carboxyl-terminal quarter of the BSP probably contains most, if not all, of the sulfated tyrosine residues because this region of the polypeptide contains the necessary requirements for tyrosine sulfation. Oligosaccharide analyses indicated that for every N-linked oligosaccharide on the BSP, there are also approximately 2 hexa-, approximately 5 tetra-, and approximately 2 trisaccharides O-linked to serine and threonine residues. On average, the BSP synthesized by UMR 106-01 cells would contain a total of approximately 3 N-linked and approximately 25 of the above O-linked oligosaccharides. This large number of oligosaccharides is in agreement with the known carbohydrate content (approximately 50%) of the BSP.

  1. Identification and characterization of novel reptile cathelicidins from elapid snakes.

    PubMed

    Zhao, Hui; Gan, Tong-Xiang; Liu, Xiao-Dong; Jin, Yang; Lee, Wen-Hui; Shen, Ji-Hong; Zhang, Yun

    2008-10-01

    Three cDNA sequences coding for elapid cathelicidins were cloned from constructed venom gland cDNA libraries of Naja atra, Bungarus fasciatus and Ophiophagus hannah. The open reading frames of the cloned elapid cathelicidins were all composed of 576 bp and coded for 191 amino acid residue protein precursors. Each of the deduced elapid cathelicidins has a 22 amino acid residue signal peptide, a conserved cathelin domain of 135 amino acid residues and a mature antimicrobial peptide of 34 amino acid residues. Unlike the highly divergent cathelicidins in mammals, the nucleotide and deduced protein sequences of the three cloned elapid cathelicidins were remarkably conserved. All the elapid mature cathelicidins were predicted to be cleaved at Valine157 by elastase. OH-CATH, the deduced mature cathelicidin from king cobra, was chemically synthesized and it showed strong antibacterial activity against various bacteria with minimal inhibitory concentrations of 1-20 µg/ml in the presence of 1% NaCl. Meanwhile, the synthetic peptide showed no haemolytic activity toward human red blood cells even at a high dose of 200 µg/ml. Phylogenetic analysis of cathelicidins from vertebrates suggested that elapid and viperid cathelicidins were grouped together in the tree. Snake cathelicidins were evolutionarily closely related to the neutrophilic granule proteins (NGPs) from mouse, rat and rabbit. Snake cathelicidins also showed a close relationship with avian fowlicidins (1-3) and chicken myeloid antimicrobial peptide 27. Elapid cathelicidins might be used as models for the development of novel therapeutic drugs.
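
    The lengths quoted in this abstract are mutually consistent, which a few lines of Python can confirm. The sketch assumes that the 576 bp open reading frame includes the stop codon and that the signal peptide, cathelin domain and mature peptide are contiguous in that order; neither assumption is stated explicitly in the abstract, and the function name is hypothetical.

```python
def precursor_length(orf_bp, includes_stop=True):
    """Number of amino acids encoded by an ORF of `orf_bp` base pairs.

    Assumes the ORF length counts the stop codon unless told otherwise."""
    if orf_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    codons = orf_bp // 3
    return codons - 1 if includes_stop else codons


signal_peptide, cathelin_domain, mature_peptide = 22, 135, 34

total_residues = precursor_length(576)             # 192 codons, 191 residues
cleavage_residue = signal_peptide + cathelin_domain  # last residue before the mature peptide
```

    Under these assumptions the three segments sum to the 191-residue precursor, and the signal peptide plus cathelin domain end at residue 157, which matches the predicted elastase cleavage at Valine157 and leaves residues 158-191 as the 34-residue mature peptide.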

  2. Sequence Effect on the Formation of DNA Minidumbbells.

    PubMed

    Liu, Yuan; Lam, Sik Lok

    2017-11-16

    The DNA minidumbbell (MDB) is a recently identified non-B structure. The reported MDBs contain two TTTA, CCTG, or CTTG type II loops. At present, the knowledge and understanding of the sequence criteria for MDB formation are still limited. In this study, we performed a systematic high-resolution nuclear magnetic resonance (NMR) and native gel study to investigate the effect of sequence variations in tandem repeats on the formation of MDBs. Our NMR results reveal the importance of hydrogen bonds, base-base stacking, and hydrophobic interactions from each of the participating residues. We conclude that in the MDBs formed by tandem repeats, C-G loop-closing base pairs are more stabilizing than T-A loop-closing base pairs, and thymine residues in both the second and third loop positions are more stabilizing than cytosine residues. The results from this study enrich our knowledge on the sequence criteria for the formation of MDBs, paving a path for better exploring their potential roles in biological systems and DNA nanotechnology.

  3. Effect of the sequence data deluge on the performance of methods for detecting protein functional residues.

    PubMed

    Garrido-Martín, Diego; Pazos, Florencio

    2018-02-27

    The exponential accumulation of new sequences in public databases is expected to improve the performance of all the approaches for predicting protein structural and functional features. Nevertheless, this was never assessed or quantified for some widely used methodologies, such as those aimed at detecting functional sites and functional subfamilies in protein multiple sequence alignments. Using raw protein sequences as only input, these approaches can detect fully conserved positions, as well as those with a family-dependent conservation pattern. Both types of residues are routinely used as predictors of functional sites and, consequently, understanding how the sequence content of the databases affects them is relevant and timely. In this work we evaluate how the growth and change with time in the content of sequence databases affect five sequence-based approaches for detecting functional sites and subfamilies. We do that by recreating historical versions of the multiple sequence alignments that would have been obtained in the past based on the database contents at different time points, covering a period of 20 years. Applying the methods to these historical alignments allows quantifying the temporal variation in their performance. Our results show that the number of families to which these methods can be applied sharply increases with time, while their ability to detect potentially functional residues remains almost constant. These results are informative for the methods' developers and final users, and may have implications in the design of new sequencing initiatives.

  4. Statistical discovery of site inter-dependencies in sub-molecular hierarchical protein structuring

    PubMed Central

    2012-01-01

    Background Much progress has been made in understanding the 3D structure of proteins using methods such as NMR and X-ray crystallography. The resulting 3D structures are extremely informative, but do not always reveal which sites and residues within the structure are of special importance. Recently, there are indications that multiple-residue, sub-domain structural relationships within the larger 3D consensus structure of a protein can be inferred from the analysis of the multiple sequence alignment data of a protein family. These intra-dependent clusters of associated sites are used to indicate hierarchical inter-residue relationships within the 3D structure. To reveal the patterns of associations among individual amino acids or sub-domain components within the structure, we apply a k-modes attribute (aligned site) clustering algorithm to the ubiquitin and transthyretin families in order to discover associations among groups of sites within the multiple sequence alignment. We then observe what these associations imply within the 3D structure of these two protein families. Results The k-modes site clustering algorithm we developed maximizes the intra-group interdependencies based on a normalized mutual information measure. The clusters formed correspond to sub-structural components or binding and interface locations. Applying this data-directed method to the ubiquitin and transthyretin protein family multiple sequence alignments as a test bed, we located numerous interesting associations of interdependent sites. These clusters were then arranged into cluster tree diagrams which revealed four structural sub-domains within the single domain structure of ubiquitin and a single large sub-domain within transthyretin associated with the interface among transthyretin monomers. In addition, several clusters of mutually interdependent sites were discovered for each protein family, each of which appear to play an important role in the molecular structure and/or function. 
Conclusions Our results demonstrate that the method we present here using a k-modes site clustering algorithm based on interdependency evaluation among sites obtained from a sequence alignment of homologous proteins can provide significant insights into the complex, hierarchical inter-residue structural relationships within the 3D structure of a protein family. PMID:22793672

  5. Statistical discovery of site inter-dependencies in sub-molecular hierarchical protein structuring.

    PubMed

    Durston, Kirk K; Chiu, David Ky; Wong, Andrew Kc; Li, Gary Cl

    2012-07-13

    Much progress has been made in understanding the 3D structure of proteins using methods such as NMR and X-ray crystallography. The resulting 3D structures are extremely informative, but do not always reveal which sites and residues within the structure are of special importance. Recently, there are indications that multiple-residue, sub-domain structural relationships within the larger 3D consensus structure of a protein can be inferred from the analysis of the multiple sequence alignment data of a protein family. These intra-dependent clusters of associated sites are used to indicate hierarchical inter-residue relationships within the 3D structure. To reveal the patterns of associations among individual amino acids or sub-domain components within the structure, we apply a k-modes attribute (aligned site) clustering algorithm to the ubiquitin and transthyretin families in order to discover associations among groups of sites within the multiple sequence alignment. We then observe what these associations imply within the 3D structure of these two protein families. The k-modes site clustering algorithm we developed maximizes the intra-group interdependencies based on a normalized mutual information measure. The clusters formed correspond to sub-structural components or binding and interface locations. Applying this data-directed method to the ubiquitin and transthyretin protein family multiple sequence alignments as a test bed, we located numerous interesting associations of interdependent sites. These clusters were then arranged into cluster tree diagrams which revealed four structural sub-domains within the single domain structure of ubiquitin and a single large sub-domain within transthyretin associated with the interface among transthyretin monomers. In addition, several clusters of mutually interdependent sites were discovered for each protein family, each of which appear to play an important role in the molecular structure and/or function. 
Our results demonstrate that the method we present here using a k-modes site clustering algorithm based on interdependency evaluation among sites obtained from a sequence alignment of homologous proteins can provide significant insights into the complex, hierarchical inter-residue structural relationships within the 3D structure of a protein family.
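
    The interdependency measure behind the k-modes site clustering, a normalized mutual information between aligned sites, can be sketched in Python. The abstract does not give the normalization, so dividing the mutual information by the joint entropy is an assumption made here for illustration; the function names are hypothetical and the clustering step itself is not shown.

```python
import math
from collections import Counter


def entropy(symbols):
    """Shannon entropy (bits) of a sequence of hashable symbols."""
    n = len(symbols)
    return -sum((c / n) * math.log2(c / n) for c in Counter(symbols).values())


def normalized_mi(col_x, col_y):
    """Mutual information between two alignment columns divided by
    their joint entropy (one common normalization, assumed here).
    Returns a value in [0, 1]; 0.0 when both columns are invariant."""
    h_x = entropy(col_x)
    h_y = entropy(col_y)
    h_xy = entropy(list(zip(col_x, col_y)))
    if h_xy == 0:
        return 0.0
    return (h_x + h_y - h_xy) / h_xy


# Hypothetical columns taken from a four-sequence alignment.
dependent = normalized_mi("AADD", "KKEE")    # columns vary together
independent = normalized_mi("AADD", "KEKE")  # columns vary independently
```

    Sites whose residues change together score close to 1 and unrelated sites score close to 0, so a clustering that maximizes within-group values of this measure groups interdependent sites.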

  6. Substrate specificity of mitochondrial intermediate peptidase analysed by a support-bound peptide library

    PubMed Central

    Marcondes, M.F.M.; Alves, F.M.; Assis, D.M.; Hirata, I.Y.; Juliano, L.; Oliveira, V.; Juliano, M.A.

    2015-01-01

    The substrate specificity of recombinant human mitochondrial intermediate peptidase (hMIP) using a synthetic support-bound FRET peptide library is presented. The collected fluorescent beads, which contained the hydrolysed peptides generated by hMIP, were sequenced by Edman degradation. The results showed that this peptidase presents a remarkable preference for polar uncharged residues at P1 and P1′ substrate positions: Ser = Gln > Thr at P1 and Ser > Thr at P1′. Non-polar residues were frequent at the substrate P3, P2, P2′ and P3′ positions. Analysis of the predicted MIP processing sites in imported mitochondrial matrix proteins shows these cleavages indeed occur between polar uncharged residues. Previous analysis of these processing sites indicated the importance of positions far from the MIP cleavage site, namely the presence of a hydrophobic residue (Phe or Leu) at P8 and a polar uncharged residue (Ser or Thr) at P5. To evaluate this, additional kinetic analyses were carried out, using fluorogenic substrates synthesized based on the processing sites attributed to MIP. The results described here underscore the importance of the P1 and P1′ substrate positions for the hydrolytic activity of hMIP. The information presented in this work will help in the design of new substrate-based inhibitors for this peptidase. PMID:26082885

  7. Identification of novel protein domains required for the expression of an active dehydratase fragment from a polyunsaturated fatty acid synthase.

    PubMed

    Oyola-Robles, Delise; Gay, Darren C; Trujillo, Uldaeliz; Sánchez-Parés, John M; Bermúdez, Mei-Ling; Rivera-Díaz, Mónica; Carballeira, Néstor M; Baerga-Ortiz, Abel

    2013-07-01

    Polyunsaturated fatty acids (PUFAs) are made in some strains of deep-sea bacteria by multidomain proteins that catalyze condensation, ketoreduction, dehydration, and enoyl-reduction. In this work, we have used the Udwary-Merski Algorithm sequence analysis tool to define the boundaries that enclose the dehydratase (DH) domains in a PUFA multienzyme. Sequence analysis revealed the presence of four areas of high structure in a region that was previously thought to contain only two DH domains as defined by FabA-homology. The expression of the protein fragment containing all four protein domains resulted in an active enzyme, while shorter protein fragments were not soluble. The tetradomain fragment was capable of catalyzing the conversion of crotonyl-CoA to β-hydroxybutyryl-CoA efficiently, as shown by UV absorbance change as well as by chromatographic retention of reaction products. Sequence alignments showed that the two novel domains contain as much sequence conservation as the FabA-homology domains, suggesting that they too may play a functional role in the overall reaction. Structure predictions revealed that all domains belong to the hotdog protein family: two of them contain the active site His70 residue present in FabA-like DHs, while the remaining two do not. Replacing the active site His residues in both FabA domains for Ala abolished the activity of the tetradomain fragment, indicating that the DH activity is contained within the FabA-homology regions. Taken together, these results provide a first glimpse into a rare arrangement of DH domains which constitute a defining feature of the PUFA synthases. Copyright © 2013 The Protein Society.

  8. Molecular cloning and characterization of rhesus monkey platelet glycoprotein Ibα, a major ligand-binding subunit of GPIb-IX-V complex.

    PubMed

    Qiao, Jianlin; Shen, Yang; Shi, Meimei; Lu, Yanrong; Cheng, Jingqiu; Chen, Younan

    2014-05-01

    Through binding to von Willebrand factor (VWF), platelet glycoprotein (GP) Ibα, the major ligand-binding subunit of the GPIb-IX-V complex, initiates platelet adhesion and aggregation in response to exposed VWF or elevated fluid-shear stress. There is little data regarding non-human primate platelet GPIbα. This study cloned and characterized rhesus monkey (Macaca mulatta) platelet GPIbα. DNAMAN software was used for sequence analysis and alignment. N/O-glycosylation sites and 3-D structure modelling were predicted by online OGPET v1.0, NetOGlyc 1.0 Server and SWISS-MODEL, respectively. Platelet function was evaluated by ADP- or ristocetin-induced platelet aggregation. Rhesus monkey GPIbα contains 2,268 nucleotides with an open reading frame encoding 755 amino acids. Rhesus monkey GPIbα nucleotide and protein sequences share 93.27% and 89.20% homology, respectively, with human. Sequences encoding the leucine-rich repeats of rhesus monkey GPIbα share strong similarity with human, whereas PEST sequences and N/O-glycosylated residues vary. The GPIbα-binding residues for thrombin, filamin A and 14-3-3ζ are highly conserved between rhesus monkey and human. Platelet function analysis revealed that monkey and human platelets respond similarly to ADP, but rhesus monkey platelets failed to respond to low doses of ristocetin where human platelets achieved 76% aggregation. However, monkey platelets aggregated in response to higher ristocetin doses. Monkey GPIbα shares strong homology with human GPIbα; however, there are some differences in rhesus monkey platelet activation through GPIbα engagement, which need to be considered when using rhesus monkey platelets to investigate platelet GPIbα function. Copyright © 2014 Elsevier Ltd. All rights reserved.

  9. Identification of novel protein domains required for the expression of an active dehydratase fragment from a polyunsaturated fatty acid synthase

    PubMed Central

    Oyola-Robles, Delise; Gay, Darren C; Trujillo, Uldaeliz; Sánchez-Parés, John M; Bermúdez, Mei-Ling; Rivera-Díaz, Mónica; Carballeira, Néstor M; Baerga-Ortiz, Abel

    2013-01-01

    Polyunsaturated fatty acids (PUFAs) are made in some strains of deep-sea bacteria by multidomain proteins that catalyze condensation, ketoreduction, dehydration, and enoyl-reduction. In this work, we have used the Udwary-Merski Algorithm sequence analysis tool to define the boundaries that enclose the dehydratase (DH) domains in a PUFA multienzyme. Sequence analysis revealed the presence of four areas of high structure in a region that was previously thought to contain only two DH domains as defined by FabA-homology. The expression of the protein fragment containing all four protein domains resulted in an active enzyme, while shorter protein fragments were not soluble. The tetradomain fragment was capable of catalyzing the conversion of crotonyl-CoA to β-hydroxybutyryl-CoA efficiently, as shown by UV absorbance change as well as by chromatographic retention of reaction products. Sequence alignments showed that the two novel domains contain as much sequence conservation as the FabA-homology domains, suggesting that they too may play a functional role in the overall reaction. Structure predictions revealed that all domains belong to the hotdog protein family: two of them contain the active site His70 residue present in FabA-like DHs, while the remaining two do not. Replacing the active site His residues in both FabA domains for Ala abolished the activity of the tetradomain fragment, indicating that the DH activity is contained within the FabA-homology regions. Taken together, these results provide a first glimpse into a rare arrangement of DH domains which constitute a defining feature of the PUFA synthases. PMID:23696301

  10. Transcriptional Activation Signals Found in the Epstein-Barr Virus (EBV) Latency C Promoter Are Conserved in the Latency C Promoter Sequences from Baboon and Rhesus Monkey EBV-Like Lymphocryptoviruses (Cercopithicine Herpesviruses 12 and 15)

    PubMed Central

    Fuentes-Pananá, Ezequiel M.; Swaminathan, Sankar; Ling, Paul D.

    1999-01-01

    The Epstein-Barr virus (EBV) EBNA2 protein is a transcriptional activator that controls viral latent gene expression and is essential for EBV-driven B-cell immortalization. EBNA2 is expressed from the viral C promoter (Cp) and regulates its own expression by activating Cp through interaction with the cellular DNA binding protein CBF1. Through regulation of Cp and EBNA2 expression, EBV controls the pattern of latent protein expression and the type of latency established. To gain further insight into the important regulatory elements that modulate Cp usage, we isolated and sequenced the Cp regions corresponding to nucleotides 10251 to 11479 of the EBV genome (−1079 to +144 relative to the transcription initiation site) from the EBV-like lymphocryptoviruses found in baboons (herpesvirus papio; HVP) and Rhesus macaques (RhEBV). Sequence comparison of the approximately 1,230-bp Cp regions from these primate viruses revealed that EBV and HVP Cp sequences are 64% conserved, EBV and RhEBV Cp sequences are 66% conserved, and HVP and RhEBV Cp sequences are 65% conserved relative to each other. Approximately 50% of the residues are conserved among all three sequences, yet all three viruses have retained response elements for glucocorticoids, two positionally conserved CCAAT boxes, and positionally conserved TATA boxes. The putative EBNA2 100-bp enhancers within these promoters contain 54 conserved residues, and the binding sites for CBF1 and CBF2 are well conserved. Cp usage in the HVP- and RhEBV-transformed cell lines was detected by S1 nuclease protection analysis. Transient-transfection analysis showed that promoters of both HVP and RhEBV are responsive to EBNA2 and that they bind CBF1 and CBF2 in gel mobility shift assays. These results suggest that similar mechanisms for regulation of latent gene expression are conserved among the EBV-related lymphocryptoviruses found in nonhuman primates. PMID:9847397

  11. Transcriptional activation signals found in the Epstein-Barr virus (EBV) latency C promoter are conserved in the latency C promoter sequences from baboon and Rhesus monkey EBV-like lymphocryptoviruses (cercopithicine herpesviruses 12 and 15).

    PubMed

    Fuentes-Pananá, E M; Swaminathan, S; Ling, P D

    1999-01-01

    The Epstein-Barr virus (EBV) EBNA2 protein is a transcriptional activator that controls viral latent gene expression and is essential for EBV-driven B-cell immortalization. EBNA2 is expressed from the viral C promoter (Cp) and regulates its own expression by activating Cp through interaction with the cellular DNA binding protein CBF1. Through regulation of Cp and EBNA2 expression, EBV controls the pattern of latent protein expression and the type of latency established. To gain further insight into the important regulatory elements that modulate Cp usage, we isolated and sequenced the Cp regions corresponding to nucleotides 10251 to 11479 of the EBV genome (-1079 to +144 relative to the transcription initiation site) from the EBV-like lymphocryptoviruses found in baboons (herpesvirus papio; HVP) and Rhesus macaques (RhEBV). Sequence comparison of the approximately 1,230-bp Cp regions from these primate viruses revealed that EBV and HVP Cp sequences are 64% conserved, EBV and RhEBV Cp sequences are 66% conserved, and HVP and RhEBV Cp sequences are 65% conserved relative to each other. Approximately 50% of the residues are conserved among all three sequences, yet all three viruses have retained response elements for glucocorticoids, two positionally conserved CCAAT boxes, and positionally conserved TATA boxes. The putative EBNA2 100-bp enhancers within these promoters contain 54 conserved residues, and the binding sites for CBF1 and CBF2 are well conserved. Cp usage in the HVP- and RhEBV-transformed cell lines was detected by S1 nuclease protection analysis. Transient-transfection analysis showed that promoters of both HVP and RhEBV are responsive to EBNA2 and that they bind CBF1 and CBF2 in gel mobility shift assays. These results suggest that similar mechanisms for regulation of latent gene expression are conserved among the EBV-related lymphocryptoviruses found in nonhuman primates.

  12. fLPS: Fast discovery of compositional biases for the protein universe.

    PubMed

    Harrison, Paul M

    2017-11-13

    Proteins often contain regions that are compositionally biased (CB), i.e., they are made from a small subset of amino-acid residue types. These CB regions can be functionally important, e.g., the prion-forming and prion-like regions that are rich in asparagine and glutamine residues. Here I report a new program fLPS that can rapidly annotate CB regions. It discovers both single-residue and multiple-residue biases. It works through a process of probability minimization. First, contigs are constructed for each amino-acid type out of sequence windows with a low degree of bias; second, these contigs are searched exhaustively for low-probability subsequences (LPSs); third, such LPSs are iteratively assessed for merger into possible multiple-residue biases. At each of these stages, efficiency measures are taken to avoid or delay probability calculations unless/until they are necessary. On a current desktop workstation, the fLPS algorithm can annotate the biased regions of the yeast proteome (>5700 sequences) in <1 s, and of the whole current TrEMBL database (>65 million sequences) in as little as ~1 h, which is >2 times faster than the commonly used program SEG, using default parameters. fLPS discovers both shorter CB regions (of the sort that are often termed 'low-complexity sequence'), and milder biases that may only be detectable over long tracts of sequence. fLPS can readily handle very large protein data sets, such as might come from metagenomics projects. It is useful in searching for proteins with similar CB regions, and for making functional inferences about CB regions for a protein of interest. The fLPS package is available from: http://biology.mcgill.ca/faculty/harrison/flps.html , or https://github.com/pmharrison/flps , or is a supplement to this article.
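
    The idea of searching for a low-probability subsequence (LPS) can be illustrated with a small brute-force sketch. This is not the fLPS implementation and has none of its efficiency measures. It assumes a binomial model with a user-supplied background frequency for a single residue type, and the function names, test sequence, and background value are all hypothetical.

```python
from math import comb


def binomial_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(k, n + 1))


def lowest_probability_subsequence(seq, residue, background):
    """Exhaustively find the subsequence least likely to be so rich in `residue`.

    Only subsequences that start and end on `residue` are tried, since trimming
    a non-matching end always lowers the probability.
    Returns (probability, start, end) with `end` exclusive, or None if absent.
    """
    positions = [i for i, c in enumerate(seq) if c == residue]
    best = None
    for a in range(len(positions)):
        for b in range(a, len(positions)):
            start, end = positions[a], positions[b] + 1
            n = end - start          # subsequence length
            k = b - a + 1            # occurrences of the residue inside it
            prob = binomial_tail(k, n, background)
            if best is None or prob < best[0]:
                best = (prob, start, end)
    return best


# Invented sequence with a glutamine-rich tract; 0.05 is an assumed background.
seq = "MKTAYIAKQRQQQQNQQQQGLSDEV"
prob, start, end = lowest_probability_subsequence(seq, "Q", 0.05)
biased_region = seq[start:end]
```

    The cost grows quadratically with the number of occurrences of the residue, which is why a practical tool needs the kind of shortcuts the abstract describes.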

  13. Sequence-To-Conformation Relationships of Disordered Regions Tethered to Folded Domains of Proteins.

    PubMed

    Mittal, Anuradha; Holehouse, Alex S; Cohan, Megan C; Pappu, Rohit V

    2018-05-12

    Intrinsically disordered proteins and regions (IDPs / IDRs) are characterized by well-defined sequence-to-conformation relationships (SCRs). These relationships refer to the sequence-specific preferences for average sizes, shapes, residue-specific secondary structure propensities, and amplitudes of multiscale conformational fluctuations. SCRs are discerned from the sequence-specific conformational ensembles of IDPs. A vast majority of IDPs are actually tethered to folded domains (FDs). This raises the question of whether or not SCRs inferred for IDPs are applicable to IDRs tethered to folded domains. Here, we use atomistic simulations based on a well-established forcefield paradigm and an enhanced sampling method to obtain comparative assessments of SCRs for thirteen archetypal IDRs modeled as autonomous units, as C-terminal tails connected to folded domains, and as linkers between pairs of folded domains. Our studies uncover a set of general observations regarding context-independent versus context-dependent SCRs of IDRs. SCRs are minimally perturbed upon tethering to folded domains if the IDRs are deficient in charged residues and for polyampholytic IDRs where the oppositely charged residues within the sequence of the IDR are separated into distinct blocks. In contrast, the interplay between IDRs and tethered folded domains has a significant modulatory effect on SCRs if the IDRs have intermediate fractions of charged residues or if they have sequence-intrinsic conformational preferences for canonical random coils. Our findings suggest that IDRs with context-independent SCRs might be independent evolutionary modules whereas IDRs with context-dependent intrinsic SCRs might co-evolve with the FDs to which they are tethered. Copyright © 2018. Published by Elsevier Ltd.

  14. Purification and characterization of Campylobacter rectus surface layer proteins.

    PubMed Central

    Nitta, H; Holt, S C; Ebersole, J L

    1997-01-01

    Campylobacter rectus is a putative periodontopathogen which expresses a proteinaceous surface layer (S-layer) external to the outer membrane. S-layers are considered to play a protective role for the microorganism in hostile environments. The S-layer proteins from six different C. rectus strains (five human isolates and a nonhuman primate [NHP] isolate) were isolated, purified, and characterized. The S-layer proteins of these strains varied in molecular mass (ca. 150 to 166 kDa) as determined by sodium dodecyl sulfate-polyacrylamide gel electrophoresis. They all reacted with monospecific rabbit antiserum to the purified S-layer of C. rectus 314, but a quantitative enzyme-linked immunosorbent assay demonstrated a strong antigenic relationship among the five human strains, while the NHP strain, 6250, showed weaker reactivity. Amino acid composition analysis showed that the S-layers of four C. rectus strains contained large proportions of acidic amino acids (13 to 27%) and that >34% of the amino acid residues were hydrophobic. Amino acid sequence analysis of six S-layer proteins revealed that the first 15 amino-terminal amino acids were identical and showed seven residues of identity with the amino-terminal sequence of the Campylobacter fetus S-layer protein SapA1. CNBr peptide profiles of the S-layer proteins from C. rectus 314, ATCC 33238, and 6250 confirmed that the S-layer proteins from the human strains were similar to each other and somewhat different from that of the NHP isolate (strain 6250). However, the S-layer proteins from the two human isolates do show some structural heterogeneity. For example, there was a 17-kDa fragment unique to the C. rectus 314 S-layer. The amino-terminal sequence of this peptide had homology with the C. rectus 51-kDa porin and was composed of nearly 50% hydrophobic residues. Thus, the S-layer protein from C. rectus has structural heterogeneity among different human strains and immunoheterogeneity with the NHP strain. 
    PMID:9009300

  15. A new cofactor in prokaryotic enzyme: Tryptophan tryptophylquinone as the redox prosthetic group in methylamine dehydrogenase

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    McIntire, W.S.; Wemmer, D.E.; Chistoserdov, A.

    Methylamine dehydrogenase (MADH), an α₂β₂ enzyme from numerous methylotrophic soil bacteria, contains a novel quinonoid redox prosthetic group that is covalently bound to its small β subunit through two amino acyl residues. A comparison of the amino acid sequence deduced from the gene sequence of the small subunit for the enzyme from Methylobacterium extorquens AM1 with the published amino acid sequence obtained by the Edman degradation method allowed the identification of the amino acyl constituents of the cofactor as two tryptophyl residues. This information was crucial for interpreting ¹H and ¹³C nuclear magnetic resonance and mass spectral data collected for the semicarbazide- and carboxymethyl-derivatized bis(tripeptidyl)-cofactor of MADH from bacterium W3A1. The cofactor is composed of two cross-linked tryptophyl residues. Although there are many possible isomers, only one is consistent with all the data: the first tryptophyl residue in the peptide sequence exists as an indole-6,7-dione, and is attached at its 4 position to the 2 position of the second, otherwise unmodified, indole side group. Contrary to earlier reports, the cofactor of MADH is not 2,7,9-tricarboxypyrroloquinoline quinone (PQQ), a derivative thereof, or pro-PQQ. This appears to be the only example of two cross-linked, modified amino acyl residues having a functional role in the active site of an enzyme, in the absence of other cofactors or metal ions.

  16. The importance of being kinked: role of Pro residues in the selectivity of the helical antimicrobial peptide P5.

    PubMed

    Bobone, Sara; Bocchinfuso, Gianfranco; Park, Yoonkyung; Palleschi, Antonio; Hahm, Kyung-Soo; Stella, Lorenzo

    2013-12-01

    Antimicrobial peptides (AMPs) are promising compounds for developing new antibiotic drugs against drug-resistant bacteria. Many of them kill bacteria by perturbing their membranes but exhibit no significant toxicity towards eukaryotic cells. The identification of the features responsible for this selectivity is essential for their pharmacological development. AMPs exhibit few conserved features, but a statistical analysis of an AMP sequence database indicated that many α-helical AMPs surprisingly have a helix-breaking Pro residue in the middle of their sequence. To discriminate among the different possible hypotheses for the functional role of this feature, we designed an analogue of the antimicrobial peptide P5, in which the central Pro was deleted (analogue P5Del). Pro removal resulted in a dramatic increase of toxicity. This was explained by the observation that P5Del binds both charged and neutral membranes, whereas P5 has no appreciable affinity towards neutral bilayers. CD and simulative data provided a rationalization of this behavior. In solution P5, due to the presence of Pro, attains compact conformations, in which its apolar residues are partially shielded from the solvent, whereas P5Del is more helical. These structural differences reduce the hydrophobic driving force for association of P5 to neutral membranes, whereas its binding to anionic bilayers can still take place because of electrostatic attraction. After membrane binding, the Pro residue does not preclude the attainment of a membrane-active amphiphilic helical conformation. These findings shed light on the role of Pro residues in the selectivity of AMPs and provide hints for the design of new, highly selective compounds. Copyright © 2013 European Peptide Society and John Wiley & Sons, Ltd.

  17. Partial characterization of an atypical family I inorganic pyrophosphatase from cattle tick Rhipicephalus (Boophilus) microplus.

    PubMed

    Costa, Evenilton P; Campos, Eldo; de Andrade, Caroline P; Façanha, Arnoldo R; Saramago, Luiz; Masuda, Aoi; Vaz, Itabajara da Silva; Fernandez, Jorge H; Moraes, Jorge; Logullo, Carlos

    2012-03-23

    The present paper presents the partial characterization of a family I inorganic pyrophosphatase from the hard tick Rhipicephalus (Boophilus) microplus (BmPPase). The BmPPase gene was cloned from the tick embryo and sequenced. The deduced amino acid sequence shared high similarity with other eukaryotic PPases; on the other hand, BmPPase presented some cysteine residues that are not conserved in other groups. This pyrophosphatase is inhibited by Ca(2+), and the inhibition is antagonized by Mg(2+), suggesting that the balance between free Ca(2+) and free Mg(2+) in the eggs could be involved in BmPPase activity control. We observed that the BmPPase transcripts are present in the fat body, midgut and ovary of ticks, in two developmental stages (partially and fully engorged females). However, higher transcription amounts were found in ovary from fully engorged females. BmPPase activity was considerably abolished by the thiol reagent dithionitrobenzoic acid (DTNB), suggesting that cysteine residues are exposed in its structure. Therefore, these cysteine residues play a critical role in the structural stability of BmPPase. Molecular dynamics simulation analysis indicates that BmPPase is the first Family I PPase that could promote disulfide bonds between cysteine residues 138-339 and 167-295. Finally, we believe that these cysteine residues exposed in the BmPPase structure can play an important controlling role regarding enzyme activity, which would be an interesting mechanism of redox control. The results presented here also indicate that this enzyme can be involved in embryogenesis of this arthropod, and may be useful as a target in the development of new tick control strategies. Published by Elsevier B.V.

  18. Structural Insights into Cellulolytic and Chitinolytic Enzymes Revealing Crucial Residues of Insect β-N-acetyl-D-hexosaminidase

    PubMed Central

    Liu, Tian; Zhou, Yong; Chen, Lei; Chen, Wei; Liu, Lin; Shen, Xu; Zhang, Wenqing; Zhang, Jianzhen; Yang, Qing

    2012-01-01

    The chemical similarity of cellulose and chitin supports the idea that their corresponding hydrolytic enzymes would bind β-1,4-linked glucose residues in a similar manner. A structural and mutational analysis was performed for the plant cellulolytic enzyme BGlu1 from Oryza sativa and the insect chitinolytic enzyme OfHex1 from Ostrinia furnacalis. Although BGlu1 shows little amino-acid sequence or topological similarity with OfHex1, three residues (Trp490, Glu328, Val327 in OfHex1, and Trp358, Tyr131 and Ile179 in BGlu1) were identified as being conserved in the +1 sugar binding site. OfHex1 Glu328 together with Trp490 was confirmed to be necessary for substrate binding. The mutant E328A exhibited an 8-fold increment in Km for (GlcNAc)2 and a 42-fold increment in Ki for TMG-chitotriomycin. A crystal structure of E328A in complex with TMG-chitotriomycin was resolved at 2.5 Å, revealing the obvious conformational changes of the catalytic residues (Glu368 and Asp367) and the absence of the hydrogen bond between E328A and the C3-OH of the +1 sugar. V327G exhibited the same activity as the wild-type, but acquired the ability to efficiently hydrolyse β-1,2-linked GlcNAc in contrast to the wild-type. Thus, Glu328 and Val327 were identified as important for substrate-binding and as glycosidic-bond determinants. A structure-based sequence alignment confirmed the spatial conservation of these three residues in most plant cellulolytic, insect and bacterial chitinolytic enzymes. PMID:23300622

  19. Proteome Adaptation to High Temperatures in the Ectothermic Hydrothermal Vent Pompeii Worm

    PubMed Central

    Jollivet, Didier; Mary, Jean; Gagnière, Nicolas; Tanguy, Arnaud; Fontanillas, Eric; Boutet, Isabelle; Hourdez, Stéphane; Segurens, Béatrice; Weissenbach, Jean; Poch, Olivier; Lecompte, Odile

    2012-01-01

    Taking advantage of the massive genome sequencing effort made on thermophilic prokaryotes, thermal adaptation has been extensively studied by analysing amino acid replacements and codon usage in these unicellular organisms. In most cases, adaptation to thermophily is associated with greater residue hydrophobicity and more charged residues. Both of these characteristics are positively correlated with the optimal growth temperature of prokaryotes. In contrast, little information has been collected on the molecular ‘adaptive’ strategy of thermophilic eukaryotes. The Pompeii worm A. pompejana, whose transcriptome has recently been sequenced, is currently considered as the most thermotolerant eukaryote on Earth, withstanding the greatest thermal and chemical ranges known. We investigated the amino-acid composition bias of ribosomal proteins in the Pompeii worm when compared to other lophotrochozoans and checked for putative adaptive changes during the course of evolution using codon-based Maximum likelihood analyses. We then provided a comparative analysis of codon usage and amino-acid replacements from a greater set of orthologous genes between the Pompeii worm and Paralvinella grasslei, one of its closest relatives living in a much cooler habitat. Analyses reveal that both species display the same high GC-biased codon usage and amino-acid patterns favoring both positively-charged residues and protein hydrophobicity. These patterns may be indicative of an ancestral adaptation to the deep sea and/or thermophily. In addition, the Pompeii worm displays a set of amino-acid change patterns that may explain its greater thermotolerance, with a significant increase in Tyr, Lys and Ala against Val, Met and Gly. 
    Present results indicate that, together with a high content in charged residues, greater proportion of smaller aliphatic residues, and especially alanine, may be a different path for metazoans to face relatively ‘high’ temperatures and thus a novelty in thermophilic metazoans. PMID:22348046

  1. The Replacement of 10 Non-Conserved Residues in the Core Protein of JFH-1 Hepatitis C Virus Improves Its Assembly and Secretion

    PubMed Central

    Etienne, Loïc; Blanchard, Emmanuelle; Boyer, Audrey; Desvignes, Virginie; Gaillard, Julien; Meunier, Jean-Christophe; Roingeard, Philippe; Hourioux, Christophe

    2015-01-01

    Hepatitis C virus (HCV) assembly is still poorly understood. It is thought that trafficking of the HCV core protein to the lipid droplet (LD) surface is essential for its multimerization and association with newly synthesized HCV RNA to form the viral nucleocapsid. We carried out a mapping analysis of several complete HCV genomes of all genotypes, and found that the genotype 2 JFH-1 core protein contained 10 residues different from those of other genotypes. The replacement of these 10 residues of the JFH-1 strain sequence with the most conserved residues deduced from sequence alignments greatly increased virus production. Confocal microscopy of the modified JFH-1 strain in cell culture showed that the mutated JFH-1 core protein, C10M, was present mostly at the endoplasmic reticulum (ER) membrane, but not at the surface of the LDs, even though its trafficking to these organelles was possible. The non-structural 5A protein of HCV was also redirected to ER membranes and colocalized with the C10M core protein. Using a Semliki forest virus vector to overproduce core protein, we demonstrated that the C10M core protein was able to form HCV-like particles, unlike the native JFH-1 core protein. Thus, the substitution of a few selected residues in the JFH-1 core protein modified the subcellular distribution and assembly properties of the protein. These findings suggest that the early steps of HCV assembly occur at the ER membrane rather than at the LD surface. The C10M-JFH-1 strain will be a valuable tool for further studies of HCV morphogenesis. PMID:26339783

  2. Presence of a consensus DNA motif at nearby DNA sequence of the mutation susceptible CG nucleotides.

    PubMed

    Chowdhury, Kaushik; Kumar, Suresh; Sharma, Tanu; Sharma, Ankit; Bhagat, Meenakshi; Kamai, Asangla; Ford, Bridget M; Asthana, Shailendra; Mandal, Chandi C

    2018-01-10

    Complexity in tissues affected by cancer arises from somatic mutations and epigenetic modifications in the genome. The mutation susceptible hotspots present within the genome indicate a non-random nature and/or a position specific selection of mutation. An association exists between the occurrence of mutations and epigenetic DNA methylation. This study is primarily aimed at determining mutation status, and identifying a signature for predicting mutation prone zones of tumor suppressor (TS) genes. Nearby sequences from the top five positions having a higher mutation frequency in each gene of 42 TS genes were selected from the COSMIC database and were considered as mutation prone zones. The conserved motifs present in the mutation prone DNA fragments were identified. Molecular docking studies were done to determine putative interactions between the identified conserved motifs and enzyme methyltransferase DNMT1. Collective analysis of 42 TS genes found GC as the most commonly replaced and AT as the most commonly formed residues after mutation. Analysis of the top 5 mutated positions of each gene (210 DNA segments for 42 TS genes) identified that CG nucleotides of the amino acid codons (e.g., Arginine) are most susceptible to mutation, and found a consensus DNA "T/AGC/GAGGA/TG" sequence present in these mutation prone DNA segments. Similar to TS genes, analysis of 54 oncogenes not only found CG nucleotides of the amino acid Arg as the most susceptible to mutation, but also identified the presence of similar consensus DNA motifs in the mutation prone DNA fragments (270 DNA segments for 54 oncogenes) of oncogenes. Docking studies depicted that, upon binding of DNMT1 methylates to this consensus DNA motif (C residues of CpG islands), mutation was likely to occur.
    Thus, this study proposes that DNMT1 mediated methylation in chromosomal DNA may decrease if a foreign DNA segment containing this consensus sequence along with CG nucleotides is exogenously introduced to dividing cancer cells. Copyright © 2017 Elsevier B.V. All rights reserved.

  3. Identification of human ferritin, heavy polypeptide 1 (FTH1) and yeast RGI1 (YER067W) as pro-survival sequences that counteract the effects of Bax and copper in Saccharomyces cerevisiae.

    PubMed

    Eid, Rawan; Boucher, Eric; Gharib, Nada; Khoury, Chamel; Arab, Nagla T T; Murray, Alistair; Young, Paul G; Mandato, Craig A; Greenwood, Michael T

    2016-03-01

    Ferritin is a sub-family of iron binding proteins that form multi-subunit nanotype iron storage structures and prevent oxidative stress induced apoptosis. Here we describe the identification and characterization of human ferritin, heavy polypeptide 1 (FTH1) as a suppressor of the pro-apoptotic murine Bax sequence in yeast. In addition we demonstrate that FTH1 is a general pro-survival sequence since it also prevents the cell death inducing effects of copper when heterologously expressed in yeast. Although ferritins are phylogenetically widely distributed and are present in most species of Bacteria, Archaea and Eukarya, ferritin is conspicuously absent in most fungal species including Saccharomyces cerevisiae. An in silico analysis of the yeast proteome led to the identification of the 161 residue RGI1 (YER067W) encoded protein as a candidate for being a yeast ferritin. In addition to sharing 20% sequence identity with the 183 residue FTH1, RGI1 also has similar pro-survival properties as ferritin when overexpressed in yeast. Analysis of recombinant protein by SDS-PAGE and by electron microscopy revealed the expected formation of higher-order structures for FTH1 that was not observed with Rgi1p. Further analysis revealed that cells overexpressing RGI1 do not show increased resistance to iron toxicity and do not have enhanced capacity to store iron. In contrast, cells lacking RGI1 were found to be hypersensitive to the toxic effects of iron. Overall, our results suggest that Rgi1p is a novel pro-survival protein whose function is not related to ferritin but nevertheless it may have a role in regulating yeast sensitivity to iron stress. Copyright © 2016 The Authors. Published by Elsevier Inc. All rights reserved.

  4. Roles of histidine residues in plant vacuolar H(+)-pyrophosphatase.

    PubMed

    Hsiao, Yi Y; Van, Ru C; Hung, Shu H; Lin, Hsin H; Pan, Rong L

    2004-02-15

    Vacuolar proton pumping pyrophosphatase (H(+)-PPase; EC 3.6.1.1) plays a pivotal role in electrogenic translocation of protons from cytosol to the vacuolar lumen at the expense of PP(i) hydrolysis. Alignment analysis on amino acid sequence demonstrates that vacuolar H(+)-PPase of mung bean contains six highly conserved histidine residues. Previous evidence indicated possible involvement of histidine residue(s) in enzymatic activity and H(+)-translocation of vacuolar H(+)-PPase as determined by using histidine specific modifier, diethylpyrocarbonate [J. Protein Chem. 21 (2002) 51]. In this study, we further attempted to identify the roles of histidine residues in mung bean vacuolar H(+)-PPase by site-directed mutagenesis. A line of mutants with histidine residues singly replaced by alanine was constructed, over-expressed in Saccharomyces cerevisiae, and then used to determine their enzymatic activities and proton translocations. Among the mutants scrutinized, only the mutation of H716 significantly decreased the enzymatic activity, the proton transport, and the coupling ratio of vacuolar H(+)-PPase. The enzymatic activity of H716A is relatively resistant to inhibition by diethylpyrocarbonate as compared to wild-type and other mutants, indicating that H716 is probably the target residue for the attack by this modifier. The mutation at H716 of V-PPase shifted the optimum pH value but not the T(1/2) (pretreatment temperature at which half enzymatic activity is observed) for PP(i) hydrolytic activity. Mutation of histidine residues obviously induced conformational changes of vacuolar H(+)-PPase as determined by immunoblotting analysis after limited trypsin digestion. Furthermore, mutation of these histidine residues modified the inhibitory effects of F(-) and Na(+), but not that of Ca(2+). 
    Single substitution of H704, H716 and H758 by alanine partially released the effect of K(+) stimulation, indicating possible location of K(+) binding in the vicinity of domains surrounding these residues.

  5. Combined proteomic and molecular approaches for cloning and characterization of copper-zinc superoxide dismutase (Cu, Zn-SOD2) from garlic (Allium sativum).

    PubMed

    Hadji Sfaxi, Imen; Ezzine, Aymen; Coquet, Laurent; Cosette, Pascal; Jouenne, Thierry; Marzouki, M Nejib

    2012-09-01

    Superoxide dismutases (SODs; EC 1.15.1.1) are key enzymes in the cell's protection against oxidant agents. Thus, SODs play a major role in the protection of aerobic organisms against oxygen-mediated damage. Three SOD isoforms were previously identified by zymogram staining from Allium sativum bulbs. The purified Cu, Zn-SOD2 shows an antagonist effect to an anticancer drug and alleviates cytotoxicity inside tumor cell lines B16F0 (mouse melanoma cells) and PAE (porcine aortic endothelial cells). To extend the characterization of Allium SODs and their corresponding genes, a proteomic approach was applied involving two-dimensional gel electrophoresis and LC-MS/MS analyses. From peptide sequence data obtained by mass spectrometry and sequence homologies, primers were defined and a cDNA fragment of 456 bp was amplified by RT-PCR. The cDNA nucleotide sequence analysis revealed an open reading frame coding for 152 residues. The deduced amino acid sequence showed high identity (82-87%) with sequences of Cu, Zn-SODs from other plant species. Molecular analysis was achieved by a protein 3D structural model.

  6. Generating intrinsically disordered protein conformational ensembles from a Markov chain

    NASA Astrophysics Data System (ADS)

    Cukier, Robert I.

    2018-03-01

    Intrinsically disordered proteins (IDPs) sample a diverse conformational space. They are important to signaling and regulatory pathways in cells. An entropy penalty must be paid when an IDP becomes ordered upon interaction with another protein or a ligand. Thus, the degree of conformational disorder of an IDP is of interest. We create a dichotomic Markov model that can explore entropic features of an IDP. The Markov condition introduces local (neighbor residues in a protein sequence) rotamer dependences that arise from van der Waals and other chemical constraints. A protein sequence of length N is characterized by its (information) entropy and mutual information, MIMC, the latter providing a measure of the dependence among the random variables describing the rotamer probabilities of the residues that comprise the sequence. For a Markov chain, the MIMC is proportional to the pair mutual information MI which depends on the singlet and pair probabilities of neighbor residue rotamer sampling. All 2^N sequence states are generated, along with their probabilities, and contrasted with the probabilities under the assumption of independent residues. An efficient method to generate realizations of the chain is also provided. The chain entropy, MIMC, and state probabilities provide the ingredients to distinguish different scenarios using the terminologies: MoRF (molecular recognition feature), not-MoRF, and not-IDP. A MoRF corresponds to large entropy and large MIMC (strong dependence among the residues' rotamer sampling), a not-MoRF corresponds to large entropy but small MIMC, and not-IDP corresponds to low entropy irrespective of the MIMC. We show that MoRFs are most appropriate as descriptors of IDPs. They provide a reasonable number of high-population states that reflect the dependences between neighbor residues, thus classifying them as IDPs, yet without very large entropy that might lead to a too high entropy penalty.
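
    The quantities named in this abstract can be made concrete with a small two-state chain. The sketch below is an illustration under stated assumptions, not the paper's code. It assumes a stationary, first-order chain with one binary rotamer variable per residue, and the two transition probabilities and all function names are hypothetical. It enumerates every state of a length-N chain, computes the chain entropy, and compares the multi-information (sum of singlet entropies minus chain entropy) with the neighbor-pair mutual information.

```python
from itertools import product
from math import log2


def stationary(p01, p10):
    """Singlet probabilities (P(0), P(1)) of the stationary two-state chain."""
    pi1 = p01 / (p01 + p10)
    return (1.0 - pi1, pi1)


def transition_matrix(p01, p10):
    """t[a][b] = P(next residue in state b | current residue in state a)."""
    return ((1.0 - p01, p01), (p10, 1.0 - p10))


def chain_state_probabilities(n, p01, p10):
    """Probability of each of the 2**n rotamer states of an n-residue chain."""
    pi, t = stationary(p01, p10), transition_matrix(p01, p10)
    probs = {}
    for state in product((0, 1), repeat=n):
        p = pi[state[0]]
        for a, b in zip(state, state[1:]):
            p *= t[a][b]
        probs[state] = p
    return probs


def entropy(ps):
    """Shannon entropy in bits."""
    return -sum(p * log2(p) for p in ps if p > 0.0)


def pair_mutual_information(p01, p10):
    """Mutual information between two neighboring residues."""
    pi, t = stationary(p01, p10), transition_matrix(p01, p10)
    pair = [pi[a] * t[a][b] for a in (0, 1) for b in (0, 1)]
    return 2.0 * entropy(pi) - entropy(pair)


def multi_information(n, p01, p10):
    """Entropy under independent residues minus the actual chain entropy."""
    probs = chain_state_probabilities(n, p01, p10)
    return n * entropy(stationary(p01, p10)) - entropy(probs.values())


n, p01, p10 = 6, 0.2, 0.3                       # hypothetical parameters
probs = chain_state_probabilities(n, p01, p10)
mi_chain = multi_information(n, p01, p10)
mi_pair = pair_mutual_information(p01, p10)
```

    For a stationary first-order chain the multi-information equals (N - 1) times the pair mutual information, which is one way to read the proportionality stated in the abstract. Full enumeration is only practical for small N, since the number of states doubles with each added residue.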

  7. Dissecting substrate specificities of the mitochondrial AFG3L2 protease.

    PubMed

    Ding, Bojian; Martin, Dwight W; Rampello, Anthony J; Glynn, Steven E

    2018-06-22

    Human AFG3L2 is a compartmental AAA+ protease that performs ATP-fueled degradation at the matrix face of the inner mitochondrial membrane. Identifying how AFG3L2 selects substrates from the diverse complement of matrix-localized proteins is essential for understanding mitochondrial protein biogenesis and quality control. Here, we create solubilized forms of AFG3L2 to examine the enzyme's substrate specificity mechanisms. We show that conserved residues within the pre-sequence of the mitochondrial ribosomal protein, MrpL32, target the subunit to the protease for processing into a mature form. Moreover, these residues can act as a degron, delivering diverse model proteins to AFG3L2 for degradation. By determining the sequence of degradation products from multiple substrates using mass spectrometry, we construct a peptidase specificity profile that displays constrained product lengths and is dominated by the identity of the residue at the P1' position, with a strong preference for hydrophobic and small polar residues. This specificity profile is validated by examining the cleavage of both fluorogenic reporter peptides and full polypeptide substrates bearing different P1' residues. Together, these results demonstrate that AFG3L2 contains multiple modes of specificity, discriminating between potential substrates by recognizing accessible degron sequences, and performing peptide bond cleavage at preferred patterns of residues within the compartmental chamber.
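
    A P1' profile of the kind described can be tallied from product peptides once they are mapped back onto the substrate. The sketch below is a simplified illustration, not the authors' pipeline. It assumes each product occurs exactly once in the substrate, uses the standard convention that P1' is the residue immediately C-terminal to the cleaved bond, and uses hypothetical function names with an invented substrate and peptides.

```python
from collections import Counter


def cleavage_sites(substrate, products):
    """Indices i such that the bond between substrate[i-1] and substrate[i] was cut."""
    sites = set()
    for peptide in products:
        start = substrate.find(peptide)   # assumption: first match is the true origin
        if start == -1:
            continue                      # peptide does not map to this substrate
        end = start + len(peptide)
        if start > 0:
            sites.add(start)              # cut on the N-terminal side of the product
        if end < len(substrate):
            sites.add(end)                # cut on the C-terminal side of the product
    return sites


def p1_prime_profile(substrate, products):
    """Count the residue at P1' for every distinct cleavage site."""
    return Counter(substrate[i] for i in cleavage_sites(substrate, products))


def product_lengths(products):
    """Lengths of the observed products, for the length distribution."""
    return sorted(len(p) for p in products)


substrate = "MKTAYLAKQRSSDEV"                # invented example
products = ["MKTAY", "LAKQR", "SSDEV"]       # invented products
sites = cleavage_sites(substrate, products)
profile = p1_prime_profile(substrate, products)
```

    Counting each site once, rather than once per peptide, keeps a bond shared by two adjacent products from being tallied twice.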

  8. Molecular cloning and sequence analysis of the Anticarsia gemmatalis multicapsid nuclear polyhedrosis virus GP64 glycoprotein.

    PubMed

    Pilloff, Marcela Gabriela; Bilen, Marcos Fabián; Belaich, Mariano Nicolás; Lozano, Mario Enrique; Ghiringhelli, Pablo Daniel

    2003-01-01

    The gp64 locus of Anticarsia gemmatalis multicapsid nucleopolyhedrovirus isolate Santa Fe (AgMNPV-SF) was characterised molecularly in our laboratory. To this end, we have located and cloned an AgMNPV-SF genomic DNA fragment containing the gp64 gene and sequenced the complete gp64 locus. Nucleotide sequence analysis indicated that the AgMNPV gp64 gene consists of a 1500 nucleotide open reading frame (ORF), encoding a protein of 499 amino acids. Of the seven gp64 homologues identified to date, the AgMNPV gp64 ORF shared most sequence similarity with the gp64 gene of Orgyia pseudotsugata MNPV. The GP64 from AgMNPV is the smallest baculoviral envelope glycoprotein found to date, differing in 10 or more residues from the other group I nucleopolyhedroviruses. The biological activity of AgMNPV GP64 protein was assessed by cell fusion assays in UFL-AG-286 cells using the obtained recombinant plasmids. In the upstream and downstream regions, relative to the gp64 ORF, we found different conserved transcriptional and post-transcriptional regulatory elements, respectively.
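    The ORF length and protein length quoted in this abstract are consistent if the 1500 nucleotides include the stop codon. A minimal arithmetic sketch follows; the function name and the `includes_stop` flag are hypothetical helpers introduced here for illustration.

```python
def residues_from_orf(orf_nt, includes_stop=True):
    """Number of encoded amino acids for an ORF of orf_nt nucleotides."""
    if orf_nt % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    codons = orf_nt // 3
    # The stop codon occupies one codon but encodes no residue.
    return codons - 1 if includes_stop else codons


print(residues_from_orf(1500))
```

    The same check applies to other records in this listing that report both an ORF length and a residue count, with `includes_stop` set according to whether the quoted length counts the stop codon.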

  9. Fast multiclonal clusterization of V(D)J recombinations from high-throughput sequencing.

    PubMed

    Giraud, Mathieu; Salson, Mikaël; Duez, Marc; Villenet, Céline; Quief, Sabine; Caillault, Aurélie; Grardel, Nathalie; Roumier, Christophe; Preudhomme, Claude; Figeac, Martin

    2014-05-28

    V(D)J recombinations in lymphocytes are essential for immunological diversity. They are also useful markers of pathologies. In leukemia, they are used to quantify the minimal residual disease during patient follow-up. However, the full breadth of lymphocyte diversity is not fully understood. We propose new algorithms that process high-throughput sequencing (HTS) data to extract unnamed V(D)J junctions and gather them into clones for quantification. This analysis is based on a seed heuristic and is fast and scalable because in the first phase, no alignment is performed with germline database sequences. The algorithms were applied to TR γ HTS data from a patient with acute lymphoblastic leukemia, and also on data simulating hypermutations. Our methods identified the main clone, as well as additional clones that were not identified with standard protocols. The proposed algorithms provide new insight into the analysis of high-throughput sequencing data for leukemia, and also to the quantitative assessment of any immunological profile. The methods described here are implemented in a C++ open-source program called Vidjil.

  10. Molecular Cloning and Sequence Analysis of a Phenylalanine Ammonia-Lyase Gene from Dendrobium

    PubMed Central

    Cai, Yongping; Lin, Yi

    2013-01-01

    In this study, a phenylalanine ammonia-lyase (PAL) gene was cloned from Dendrobium candidum using homology cloning and RACE. The full-length PAL cDNA of D. candidum (designated Dc-PAL1, GenBank No. JQ765748) has 2,458 bp and contains a complete open reading frame (ORF) of 2,142 bp, which encodes 713 amino acid residues. The amino acid sequence of DcPAL1 has more than 80% sequence identity with the PALs of other plants, as indicated by multiple alignments. The dominant sites and catalytic active sites, which are similar to those found in PAL proteins of Arabidopsis thaliana and Nicotiana tabacum, are also found in DcPAL1. Phylogenetic tree analysis revealed that DcPAL is more closely related to PALs from Orchidaceae plants than to those of other plants. The differential expression patterns of PAL in protocorm-like body, leaf, stem, and root suggest that the PAL gene performs multiple physiological functions in Dendrobium candidum. PMID:23638048

  11. A Lossy Compression Technique Enabling Duplication-Aware Sequence Alignment

    PubMed Central

    Freschi, Valerio; Bogliolo, Alessandro

    2012-01-01

    In spite of the recognized importance of tandem duplications in genome evolution, commonly adopted sequence comparison algorithms do not take into account complex mutation events involving more than one residue at a time, since they are not compliant with the underlying assumption of statistical independence of adjacent residues. As a consequence, the presence of tandem repeats in sequences under comparison may impair the biological significance of the resulting alignment. Although solutions have been proposed, repeat-aware sequence alignment is still considered to be an open problem and new efficient and effective methods have been advocated. The present paper describes an alternative lossy compression scheme for genomic sequences which iteratively collapses repeats of increasing length. The resulting approximate representations do not contain tandem duplications, while retaining enough information for making their comparison even more significant than the edit distance between the original sequences. This allows us to exploit traditional alignment algorithms directly on the compressed sequences. Results confirm the validity of the proposed approach for the problem of duplication-aware sequence alignment. PMID:22518086
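    The idea of iteratively collapsing tandem repeats of increasing unit length can be sketched naively as below. This is an illustrative simplification, not the authors' compression scheme; the function names and the default limit on unit length are assumptions.

```python
def collapse_pass(seq, k):
    """One left-to-right pass dropping a copy of each adjacent repeat of length k."""
    out = []
    i, n = 0, len(seq)
    while i < n:
        if i + 2 * k <= n and seq[i:i + k] == seq[i + k:i + 2 * k]:
            i += k  # skip this copy; the identical following copy is kept
        else:
            out.append(seq[i])
            i += 1
    return "".join(out)


def collapse_repeats(seq, max_unit=None):
    """Collapse tandem repeats for unit lengths 1, 2, ..., max_unit in turn."""
    if max_unit is None:
        max_unit = len(seq) // 2
    for k in range(1, max_unit + 1):
        while True:  # repeat the pass until nothing changes for this unit length
            reduced = collapse_pass(seq, k)
            if reduced == seq:
                break
            seq = reduced
    return seq


print(collapse_repeats("ACGACGACGT"))
```

    The collapsed strings could then be passed to any standard alignment routine, which is the use the abstract describes for its compressed representations.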

  12. Structure-activity analysis of synthetic alpha-thrombin-receptor-activating peptides.

    PubMed

    Van Obberghen-Schilling, E; Rasmussen, U B; Vouret-Craviari, V; Lentes, K U; Pavirani, A; Pouysségur, J

    1993-06-15

    alpha-Thrombin stimulates G-protein-coupled effectors leading to secretion and aggregation in human platelets, and to a mitogenic response in CCL39 hamster fibroblasts. alpha-Thrombin receptors can be activated by synthetic peptides corresponding to the receptor sequence starting with serine-42, at the proposed cleavage site. We have previously determined that the agonist domain of receptor-activating peptides resides within the five N-terminal residues [Vouret-Craviari, Van Obberghen-Schilling, Rasmussen, Pavirani, Lecocq and Pouysségur (1992) Mol. Biol. Cell. 3, 95-102], although the 7-residue peptide (SFFLRNP) corresponding to the hamster alpha-thrombin receptor was 10 times more potent than the 5-residue peptide for activation of human platelets. In the present study we have analysed the role of individual amino acids in receptor activation by using a series of modified hexa- or hepta-peptides derived from the human alpha-thrombin-receptor sequence. Cellular events examined here include phospholipase C activation, adenylyl cyclase inhibition and DNA synthesis stimulation in non-transformed CCL39 fibroblasts and a tumorigenic variant of that line (A71 cells). Modification of the peptide sequence had similar functional consequence for each of the assays described, indicating that either a unique receptor or pharmacologically indistinguishable receptor subtypes activate distinct G-protein signalling pathways. Furthermore, we found that: (1) the N-terminal serine can be replaced by small or intermediately sized amino acids (+/- hydroxyl groups) without loss of activity. However, its replacement by an aromatic side-chain or omission of the N-terminal amino group severely reduces activity. (2) An aromatic side-chain on the penultimate N-terminal residue appears to play a critical role since phenylalanine in this position can be substituted by tyrosine without complete loss of activity whereas an alanine in its place is not tolerated. (3) Deletion of the first, second or third N-terminal residue leads to a loss of activity, suggesting that a defined spacing of more than one structural component may be important for ligand-receptor interaction. Finally, we did not observe an antagonistic effect of the inactive peptides on phospholipase C activation or DNA synthesis induced by alpha-thrombin (1 nM) or SFLLRNP (3 microM).

  13. Structure-activity analysis of synthetic alpha-thrombin-receptor-activating peptides.

    PubMed Central

    Van Obberghen-Schilling, E; Rasmussen, U B; Vouret-Craviari, V; Lentes, K U; Pavirani, A; Pouysségur, J

    1993-01-01

    alpha-Thrombin stimulates G-protein-coupled effectors leading to secretion and aggregation in human platelets, and to a mitogenic response in CCL39 hamster fibroblasts. alpha-Thrombin receptors can be activated by synthetic peptides corresponding to the receptor sequence starting with serine-42, at the proposed cleavage site. We have previously determined that the agonist domain of receptor-activating peptides resides within the five N-terminal residues [Vouret-Craviari, Van Obberghen-Schilling, Rasmussen, Pavirani, Lecocq and Pouysségur (1992) Mol. Biol. Cell. 3, 95-102], although the 7-residue peptide (SFFLRNP) corresponding to the hamster alpha-thrombin receptor was 10 times more potent than the 5-residue peptide for activation of human platelets. In the present study we have analysed the role of individual amino acids in receptor activation by using a series of modified hexa- or hepta-peptides derived from the human alpha-thrombin-receptor sequence. Cellular events examined here include phospholipase C activation, adenylyl cyclase inhibition and DNA synthesis stimulation in non-transformed CCL39 fibroblasts and a tumorigenic variant of that line (A71 cells). Modification of the peptide sequence had similar functional consequence for each of the assays described, indicating that either a unique receptor or pharmacologically indistinguishable receptor subtypes activate distinct G-protein signalling pathways. Furthermore, we found that: (1) the N-terminal serine can be replaced by small or intermediately sized amino acids (+/- hydroxyl groups) without loss of activity. However, its replacement by an aromatic side-chain or omission of the N-terminal amino group severely reduces activity. (2) An aromatic side-chain on the penultimate N-terminal residue appears to play a critical role since phenylalanine in this position can be substituted by tyrosine without complete loss of activity whereas an alanine in its place is not tolerated. (3) Deletion of the first, second or third N-terminal residue leads to a loss of activity, suggesting that a defined spacing of more than one structural component may be important for ligand-receptor interaction. Finally, we did not observe an antagonistic effect of the inactive peptides on phospholipase C activation or DNA synthesis induced by alpha-thrombin (1 nM) or SFLLRNP (3 microM). PMID:7686363

  14. Active site characterization and molecular cloning of Tenebrio molitor midgut trehalase and comments on their insect homologs.

    PubMed

    Gomez, Ana; Cardoso, Christiane; Genta, Fernando A; Terra, Walter R; Ferreira, Clélia

    2013-08-01

    The soluble midgut trehalase from Tenebrio molitor (TmTre1) was purified after several chromatographic steps, yielding a 58 kDa enzyme with a pH optimum of 5.3 (ionizing active groups in the free enzyme: pK(e1) = 3.8 ± 0.2; pK(e2) = 7.4 ± 0.2). The purified enzyme corresponds to the deduced amino acid sequence of a cloned cDNA (TmTre1-cDNA), because a single cDNA coding a soluble trehalase was found in the T. molitor midgut transcriptome. Furthermore, the mass of the protein predicted to be coded by TmTre1-cDNA agrees with that of the purified enzyme. TmTre1 has the essential catalytic groups Asp 315 and Glu 513 and the essential Arg residues R164, R217, R282. Carbodiimide inactivation of the purified enzyme at different pH values reveals an essential carboxyl group with pKa = 3.5 ± 0.3. Phenylglyoxal modified a single Arg residue with pKa = 7.5 ± 0.2, as observed in the soluble trehalase from Spodoptera frugiperda (SfTre1). Diethylpyrocarbonate modified a His residue, which resulted in a less active enzyme with pK(e1) changed to 4.8 ± 0.2. In TmTre1 the modified His residue (putatively His 336) is more exposed than the His modified in SfTre1 (putatively His 210) and that affects the ionization of an Arg residue. The architecture of the active site of TmTre1 and SfTre1 is different, as shown by multiple inhibition analysis, the meaning of which demands further research. Trehalase sequences obtained from midgut transcriptomes (pyrosequencing and Illumina data) from 8 insects belonging to 5 different orders were used in a cladogram, together with other representative sequences. The data suggest that the trehalase gene underwent duplication and divergence prior to the separation of the paraneopteran and holometabolan orders and that the soluble trehalase derived from the membrane-bound one by losing the C-terminal transmembrane loop. Copyright © 2013 Elsevier Ltd. All rights reserved.

  15. The point mutation process in proteins

    NASA Technical Reports Server (NTRS)

    Schwartz, R. M.; Dayhoff, M. O.

    1978-01-01

    An optimized scoring matrix for residue-by-residue comparisons of distantly related protein sequences has been developed. The scoring matrix is based on observed exchanges and mutabilities of amino acids in 1572 closely related sequences derived from a cross-section of protein groups. Very few superimposed or parallel mutations are included in the data. The scoring matrix is most useful for demonstrating the relatedness of proteins between 65 and 85% different.

  16. Friedelin Synthase from Maytenus ilicifolia: Leucine 482 Plays an Essential Role in the Production of the Most Rearranged Pentacyclic Triterpene

    PubMed Central

    Souza-Moreira, Tatiana M.; Alves, Thaís B.; Pinheiro, Karina A.; Felippe, Lidiane G.; De Lima, Gustavo M. A.; Watanabe, Tatiana F.; Barbosa, Cristina C.; Santos, Vânia A. F. F. M.; Lopes, Norberto P.; Valentini, Sandro R.; Guido, Rafael V. C.; Furlan, Maysa; Zanelli, Cleslei F.

    2016-01-01

    Among the biologically active triterpenes, friedelin has the most-rearranged structure produced by the oxidosqualene cyclases and is the only one containing a ketone group. In this study, we cloned and functionally characterized friedelin synthase and one cycloartenol synthase from Maytenus ilicifolia (Celastraceae). The complete coding sequences of these 2 genes were cloned from leaf mRNA, and their functions were characterized by heterologous expression in yeast. The cycloartenol synthase sequence is very similar to other known OSCs of this type (approximately 80% identity), although the M. ilicifolia friedelin synthase amino acid sequence is more related to β-amyrin synthases (65–74% identity), which is similar to the friedelin synthase cloned from Kalanchoe daigremontiana. Multiple sequence alignments demonstrated the presence of a leucine residue two positions upstream of the friedelin synthase Asp-Cys-Thr-Ala-Glu (DCTAE) active site motif, while the vast majority of OSCs identified so far have a valine or isoleucine residue at the same position. The substitution of the leucine residue with valine, threonine or isoleucine in M. ilicifolia friedelin synthase interfered with substrate recognition and led to the production of different pentacyclic triterpenes. Hence, our data indicate a key role for the leucine residue in the structure and function of this oxidosqualene cyclase. PMID:27874020

  17. Friedelin Synthase from Maytenus ilicifolia: Leucine 482 Plays an Essential Role in the Production of the Most Rearranged Pentacyclic Triterpene

    NASA Astrophysics Data System (ADS)

    Souza-Moreira, Tatiana M.; Alves, Thaís B.; Pinheiro, Karina A.; Felippe, Lidiane G.; de Lima, Gustavo M. A.; Watanabe, Tatiana F.; Barbosa, Cristina C.; Santos, Vânia A. F. F. M.; Lopes, Norberto P.; Valentini, Sandro R.; Guido, Rafael V. C.; Furlan, Maysa; Zanelli, Cleslei F.

    2016-11-01

    Among the biologically active triterpenes, friedelin has the most-rearranged structure produced by the oxidosqualene cyclases and is the only one containing a ketone group. In this study, we cloned and functionally characterized friedelin synthase and one cycloartenol synthase from Maytenus ilicifolia (Celastraceae). The complete coding sequences of these 2 genes were cloned from leaf mRNA, and their functions were characterized by heterologous expression in yeast. The cycloartenol synthase sequence is very similar to other known OSCs of this type (approximately 80% identity), although the M. ilicifolia friedelin synthase amino acid sequence is more related to β-amyrin synthases (65-74% identity), which is similar to the friedelin synthase cloned from Kalanchoe daigremontiana. Multiple sequence alignments demonstrated the presence of a leucine residue two positions upstream of the friedelin synthase Asp-Cys-Thr-Ala-Glu (DCTAE) active site motif, while the vast majority of OSCs identified so far have a valine or isoleucine residue at the same position. The substitution of the leucine residue with valine, threonine or isoleucine in M. ilicifolia friedelin synthase interfered with substrate recognition and led to the production of different pentacyclic triterpenes. Hence, our data indicate a key role for the leucine residue in the structure and function of this oxidosqualene cyclase.

  18. The legumin gene family: structure of a B type gene of Vicia faba and a possible legumin gene specific regulatory element.

    PubMed Central

    Bäumlein, H; Wobus, U; Pustell, J; Kafatos, F C

    1986-01-01

    The field bean, Vicia faba L. var. minor, possesses two sub-families of 11 S legumin genes named A and B. We isolated from a genomic library a B-type gene (LeB4) and determined its primary DNA sequence. Gene LeB4 codes for a 484 amino acid residue prepropolypeptide, encompassing a signal peptide of 22 amino acid residues, an acidic, very hydrophilic alpha-chain of 281 residues and a basic, somewhat hydrophobic beta-chain of 181 residues. The latter two coding regions are immediately contiguous, but each is interrupted by a short intron. Type A legumin genes from soybean and pea are known to have introns in the same two positions, in addition to an extra intron (within the alpha-coding sequence). Sequence comparisons of legumin genes from these three plants revealed a highly conserved sequence element of at least 28 bp, centered at approximately 100 bp upstream of each cap site. The element is absent from the equivalent position of all non-legumin and other plant and fungal genes examined. We tentatively name this element "legumin box" and suggest that it may have a function in the regulation of legumin gene expression. PMID:3960730

  19. Prediction of active sites of enzymes by maximum relevance minimum redundancy (mRMR) feature selection.

    PubMed

    Gao, Yu-Fei; Li, Bi-Qing; Cai, Yu-Dong; Feng, Kai-Yan; Li, Zhan-Dong; Jiang, Yang

    2013-01-27

    Identification of catalytic residues plays a key role in understanding how enzymes work. Although numerous computational methods have been developed to predict catalytic residues and active sites, the prediction accuracy remains relatively low with high false positives. In this work, we developed a novel predictor based on the Random Forest algorithm (RF) aided by the maximum relevance minimum redundancy (mRMR) method and incremental feature selection (IFS). We incorporated features of physicochemical/biochemical properties, sequence conservation, residual disorder, secondary structure and solvent accessibility to predict active sites of enzymes and achieved an overall accuracy of 0.885687 and MCC of 0.689226 on an independent test dataset. Feature analysis showed that every category of the features except disorder contributed to the identification of active sites. It was also shown via the site-specific feature analysis that the features derived from the active site itself contributed most to the active site determination. Our prediction method may become a useful tool for identifying the active sites and the key features identified by the paper may provide valuable insights into the mechanism of catalysis.
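    The feature-ranking step named in this abstract can be illustrated with a generic greedy mRMR sketch for discrete features: each step picks the feature whose mutual information with the labels, minus its mean mutual information with the features already chosen, is largest. This is not the authors' pipeline (which also involves Random Forest and incremental feature selection); the function names and the toy data are hypothetical.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Mutual information (bits) between two equal-length discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(c / n * log2(c * n / (px[x] * py[y])) for (x, y), c in pxy.items())


def mrmr_select(features, labels, k):
    """Greedily select k feature names by relevance minus mean redundancy."""
    relevance = {name: mutual_information(col, labels) for name, col in features.items()}
    selected, remaining = [], list(features)

    def score(name):
        if not selected:
            return relevance[name]
        redundancy = sum(
            mutual_information(features[name], features[s]) for s in selected
        ) / len(selected)
        return relevance[name] - redundancy

    while remaining and len(selected) < k:
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical toy data: "a_copy" duplicates "a", so it adds only redundancy.
toy = {
    "a": [0, 0, 0, 0, 1, 1, 1, 1],
    "a_copy": [0, 0, 0, 0, 1, 1, 1, 1],
    "b": [0, 0, 1, 1, 0, 0, 1, 1],
}
toy_labels = [x + y for x, y in zip(toy["a"], toy["b"])]
print(mrmr_select(toy, toy_labels, 2))
```

    In the toy data the duplicated feature is as relevant as the original but fully redundant with it, so the selection prefers the complementary feature.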

  20. Metagenome enrichment approach used for selection of oil-degrading bacteria consortia for drill cutting residue bioremediation.

    PubMed

    Guerra, Alaine B; Oliveira, Jorge S; Silva-Portela, Rita C B; Araújo, Wydemberg; Carlos, Aline C; Vasconcelos, Ana Tereza R; Freitas, Ana Teresa; Domingos, Yldeney Silva; de Farias, Mirna Ferreira; Fernandes, Glauber José Turolla; Agnez-Lima, Lucymara F

    2018-04-01

    Drill cuttings leave behind thousands of tons of residues without adequate treatment, generating a large environmental liability. Therefore, knowledge about the microbial community of drilling residue may be useful for developing bioremediation strategies. In this work, samples of drilling residue were enriched in different culture media in the presence of petroleum, aiming to select potentially oil-degrading bacteria and biosurfactant producers. Total DNA was extracted directly from the drill cutting samples and from two enriched consortia and sequenced using the Ion Torrent platform. Taxonomic analysis revealed the predominance of Proteobacteria in the metagenome from the drill cuttings, while Firmicutes was enriched in consortia samples. Functional analysis using the Biosurfactants and Biodegradation Database (BioSurfDB) revealed a similar pattern among the three samples regarding hydrocarbon degradation and biosurfactant production pathways. However, some statistical differences were observed between samples. Namely, the pathways related to the degradation of fatty acids, chloroalkanes, and chloroalkenes were enriched in consortia samples. The degradation colorimetric assay using dichlorophenolindophenol as an indicator was positive for several hydrocarbon substrates. The consortia were also able to produce biosurfactants, with biosynthesis of iturin, lichenysin, and surfactin among the more abundant pathways. A microcosm assay followed by gas chromatography analysis showed the efficacy of the consortia in degrading alkanes, as we observed reductions in total alkanes of around 66% and 30% for the two consortia. These data suggest the potential use of these consortia in the bioremediation of drilling residue based on autochthonous bioaugmentation. Copyright © 2018 Elsevier Ltd. All rights reserved.

  1. DNA-binding proteins from marine bacteria expand the known sequence diversity of TALE-like repeats.

    PubMed

    de Lange, Orlando; Wolf, Christina; Thiel, Philipp; Krüger, Jens; Kleusch, Christian; Kohlbacher, Oliver; Lahaye, Thomas

    2015-11-16

    Transcription Activator-Like Effectors (TALEs) of Xanthomonas bacteria are programmable DNA binding proteins with unprecedented target specificity. Comparative studies into TALE repeat structure and function are hindered by the limited sequence variation among TALE repeats. More sequence-diverse TALE-like proteins are known from Ralstonia solanacearum (RipTALs) and Burkholderia rhizoxinica (Bats), but RipTAL and Bat repeats are conserved with those of TALEs around the DNA-binding residue. We study two novel marine-organism TALE-like proteins (MOrTL1 and MOrTL2), the first to date of non-terrestrial origin. We have assessed their DNA-binding properties and modelled repeat structures. We found that repeats from these proteins mediate sequence-specific DNA binding conforming to the TALE code, despite low sequence similarity to TALE repeats, and with novel residues around the base-specifying residue (BSR). However, MOrTL1 repeats show greater sequence discriminating power than MOrTL2 repeats. Sequence alignments show that there are only three residues conserved between repeats of all TALE-like proteins including the two new additions. This conserved motif could prove useful as an identifier for future TALE-likes. Additionally, comparing MOrTL repeats with those of other TALE-likes suggests a common evolutionary origin for the TALEs, RipTALs and Bats. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.

  2. Amino acid substitutions in subunit 9 of the mitochondrial ATPase complex of Saccharomyces cerevisiae. Sequence analysis of a series of revertants of an oli1 mit- mutant carrying an amino acid substitution in the hydrophilic loop of subunit 9.

    PubMed

    Willson, T A; Nagley, P

    1987-09-01

    This work concerns a biochemical genetic study of subunit 9 of the mitochondrial ATPase complex of Saccharomyces cerevisiae. Subunit 9, encoded by the mitochondrial oli1 gene, contains a hydrophilic loop connecting two transmembrane stems. In one particular oli1 mit- mutant 2422, the substitution of a positively charged amino acid in this loop (Arg39→Met) renders the ATPase complex non-functional. A series of 20 revertants, selected for their ability to grow on nonfermentable substrates, has been isolated from mutant 2422. The results of DNA sequence analysis of the oli1 gene in each revertant have led to the recognition of three groups of revertants. Class I revertants have undergone a same-site reversion event: the mutant Met39 is replaced either by arginine (as in wild-type) or lysine. Class II revertants maintain the mutant Met39 residue, but have undergone a second-site reversion event (Asn35→Lys). Two revertants showing an oligomycin-resistant phenotype carry this same second-site reversion in the loop region together with a further amino acid substitution in either of the two membrane-spanning segments of subunit 9 (either Gly23→Ser or Leu53→Phe). Class III revertants contain subunit 9 with the original mutant 2422 sequence, and additionally carry a recessive nuclear suppressor, demonstrated to represent a single gene. The results on the revertants in classes I and II indicate that there is a strict requirement for a positively charged residue in the hydrophilic loop close to the boundary of the lipid bilayer. The precise location of this positive charge is less stringent; in functional ATPase complexes it can be found at either residue 39 or 35. This charged residue is possibly required to interact with some other component of the mitochondrial ATPase complex. These findings, together with hydropathy plots of subunit 9 polypeptides from normal, mutant and revertant strains, led to the conclusion that the hydrophilic loop in normal subunit 9 extends further than previously suggested, with the boundary of the N-terminal membrane-embedded stem lying at residue 34. The possibility is raised that the observed suppression of the 2422 mutant phenotype in class III revertants is manifested through an accommodating change in a nuclear-encoded subunit of the ATPase complex.

  3. DNA sequence analysis, expression, distribution, and physiological role of the Xaa-prolyldipeptidyl aminopeptidase gene from Lactobacillus helveticus CNRZ32.

    PubMed

    Yüksel, G U; Steele, J L

    1996-02-01

    Lactobacillus helveticus CNRZ32 possesses an Xaa-prolyldipeptidyl aminopeptidase (PepX), which releases amino-terminal dipeptides from peptides containing proline residues in the penultimate position. The PepX gene, designated pepX, from Lb. helveticus CNRZ32 was sequenced. Analysis of the sequence identified a putative 2379-bp pepX open-reading frame, which encodes a polypeptide of 793 amino acid residues with a deduced molecular mass of 88,111 Da. The gene shows significant sequence identity with sequenced pepX genes from lactic acid bacteria. The product of the gene contains a motif that is almost identical with the active-site motif of the serine-dependent PepX from lactococci. The introduction of pepX into Lactococcus lactis LM0230 on either pGK12 (a low-copy-number plasmid vector) or pIL253 (a high-copy-number plasmid vector) did not result in a significant increase in PepX activity, while the introduction of pepX into CNRZ32 on pGK12 resulted in a four-fold increase in PepX activity. Southern hybridization experiments revealed that the pepX gene from CNRZ32 is well conserved in lactobacilli, pediococci and streptococci. The physiological role of PepX during growth in lactobacillus MRS (a rich medium containing protein hydrolysates along with other ingredients) and milk was examined by comparing growth of CNRZ32 and a CNRZ32 PepX-negative derivative. No difference in growth rate or acid production was observed between CNRZ32 and its PepX-negative derivative in MRS. However, the CNRZ32 PepX-negative derivative grew in milk at a reduced specific growth rate when compared to wild-type CNRZ32. Introduction of the cloned PepX determinant into the CNRZ32 PepX-negative derivative resulted in a construct with a specific growth rate similar to that of wild-type CNRZ32.

  4. A novel HLA-B allele, B*5214, detected in a Taiwanese volunteer bone marrow donor using a sequence-based typing method.

    PubMed

    Chen, M J; Chu, C C; Shyr, M H; Lin, C L; Lin, P Y; Yang, K L

    2010-02-01

    HLA-B*5214, a novel rare variant of HLA-B*52, was found in a Taiwanese volunteer bone marrow donor by a sequence-based typing method. The sequence of B*5214 is identical to that of B*520101 in exon 2 but differs from B*520101 in exon 3 at nucleotide positions 419 (A-->T) and 435 (A-->G). These two nucleotide changes result in an amino acid substitution at residue 116, Y-->F (TAC-->TTC), and a silent change at residue 121, K-->K (AAA-->AAG).
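    The two codon changes reported in this abstract can be checked against the standard genetic code. The sketch below builds the standard codon table and classifies each change; the function name and the output format are hypothetical choices made for illustration.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (third base fastest).
BASES = "TCAG"
AMINO = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def describe_change(residue, ref_codon, alt_codon):
    """Return (label, kind) for a single-codon change at a residue number."""
    ref_aa = CODON_TABLE[ref_codon.upper()]
    alt_aa = CODON_TABLE[alt_codon.upper()]
    kind = "silent" if ref_aa == alt_aa else "missense"
    return f"{ref_aa}{residue}{alt_aa}", kind


print(describe_change(116, "TAC", "TTC"))
print(describe_change(121, "AAA", "AAG"))
```

    The first change alters the encoded residue and the second does not, in agreement with the substitution and silent change described in the abstract.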

  5. Organization and transient expression of the gene for human U11 snRNA

    PubMed Central

    Suter-Crazzolara, Clemens; Keller, Walter

    1991-01-01

    The nucleotide sequence of U11 small nuclear RNA, a minor U RNA from HeLa cells, was determined. Computer analysis of the sequence (135 residues) predicts two strong hairpin loops which are separated by seventeen nucleotides containing an Sm binding site (AAUUUUUUGG). A synthetic gene was constructed in which the coding region of U11 RNA is under the control of a T7 promoter. This vector can be used to produce U11 RNA in vitro. Southern hybridization and PCR analysis of HeLa genomic DNA suggest that U11 RNA is encoded by a single copy gene, and that at least three genomic regions could be U11 RNA pseudogenes. A HeLa genomic copy of a U11 gene was isolated by inverse PCR. This gene contains the U11 RNA coding sequence and several sequence elements characteristic of the U RNA genes. These include a Distal Sequence Element (DSE, ATTTGCATA) present between positions −215 and −223 relative to the start of transcription; a Proximal Sequence Element (PSE, TTCACCTTTACCAAAAATG) located between positions −43 and −63; and a 3′ box (GTTAGGCGAAATATTA) between positions +150 and +166. Transfection of HeLa cells with this gene revealed that it is functional in vivo and can produce U11 RNA. PMID:1820214

  6. Metagenomic analysis reveals the influences of milk containing antibiotics on the rumen microbes of calves.

    PubMed

    Li, Wei; Han, Yunsheng; Yuan, Xue; Wang, Guan; Wang, Zhibo; Pan, Qiqi; Gao, Yan; Qu, Yongli

    2017-04-01

    Milk containing antibiotics is used as cost-effective feed for calves, which may lead to antibiotic residue-associated food safety problems. This study aims to investigate the influence of antibiotics on rumen microbes. Through metagenomic sequencing, the rumen microbial communities of calves fed with pasteurized milk containing antibiotics (B1), milk containing antibiotics (B2) and fresh milk (B3) were explored. Each milk group included calves at 2 (T1), 3 (T2) and 6 (T3) months of age. Quality control and sequence assembly of the filtered data were performed using FastQC and SOAPdenovo 2, respectively. KEGG annotation of the uploaded sequences was then conducted using KAAS software. Using R software, both species abundance analysis and differential abundance analysis were performed. In the B1 samples, the species abundance of Bacteroidetes gradually decreased along with the extension of feeding time, while that of Fibrobacteres gradually increased. The species abundances of Proteobacteria (p value = 0.01) and Spirochaetes (p value = 0.03) differed significantly among T1, T2 and T3 samples. Meanwhile, only the species abundance of Spirochaetes (p value = 0.04) differed significantly among B1, B2 and B3 samples. Cell cycle involving GSK3β, CDK2 and CDK7 was significantly enriched for the differentially expressed genes in the T1 versus T2 and T1 versus T3 comparison groups. Milk containing antibiotics might have a great influence on these rumen microbes and lead to antibiotic residue-associated food safety problems. Furthermore, GSK3β, CDK2 and CDK7 in rumen bacteria might affect milk fat metabolism in early growth stages of calves.

  7. Structural basis for ribosome protein S1 interaction with RNA in trans-translation of Mycobacterium tuberculosis.

    PubMed

    Fan, Yi; Dai, Yazhuang; Hou, Meijing; Wang, Huilin; Yao, Hongwei; Guo, Chenyun; Lin, Donghai; Liao, Xinli

    2017-05-27

    Ribosomal protein S1 (RpsA), the largest 30S protein in the ribosome, plays a significant role in translation and trans-translation. In Mycobacterium tuberculosis, the C-terminus of RpsA is known as a tuberculosis drug target of pyrazinoic acid, which inhibits the interaction between MtRpsA and tmRNA in trans-translation. However, the molecular mechanism underlying the interaction of MtRpsA with tmRNA remains unknown. We herein analyzed the interaction of the C-terminal domain of MtRpsA with three RNA fragments: poly(A), sMLD and pre-sMLD. NMR titration analysis revealed that the RNA binding sites on MtRpsA CTD are mainly located in the β2, β3 and β5 strands and the adjacent L3 loop of the S1 domain. Fluorescence experiments determined that MtRpsA CTD binds the RNAs with affinities in the micromolar range. Sequence analysis also revealed conserved residues in the mapped RNA binding region. Residues L304, V305, G308, F310, H322, I323, R357 and I358 were verified to be the key residues influencing the interaction between MtRpsA CTD and pre-sMLD. Molecular docking further confirmed that the poly(A)-like sequence and sMLD of tmRNA are all involved in the protein-RNA interaction, through charged interaction and hydrogen bonds. The results will be beneficial for designing new anti-tuberculosis drugs. Copyright © 2017 Elsevier Inc. All rights reserved.

  8. Automated identification of complementarity determining regions (CDRs) reveals peculiar characteristics of CDRs and B cell epitopes.

    PubMed

    Ofran, Yanay; Schlessinger, Avner; Rost, Burkhard

    2008-11-01

    Exact identification of complementarity determining regions (CDRs) is crucial for understanding and manipulating antigenic interactions. One way to do this is by marking residues on the antibody that interact with B cell epitopes on the antigen. This, of course, requires identification of B cell epitopes, which could be done by marking residues on the antigen that bind to CDRs, thus requiring identification of CDRs. To circumvent this vicious circle, existing tools for identifying CDRs are based on sequence analysis or general biophysical principles. Often, these tools, which are based on partial data, fail to agree on the boundaries of the CDRs. Herein we present an automated procedure for identifying CDRs and B cell epitopes using consensus structural regions that interact with the antigens in all known antibody-protein complexes. Consequently, we provide the first comprehensive analysis of all CDR-epitope complexes of known three-dimensional structure. The CDRs we identify only partially overlap with the regions suggested by existing methods. We found that the general physicochemical properties of both CDRs and B cell epitopes are rather peculiar. In particular, only four amino acids account for most of the sequence of CDRs, and several types of amino acids almost never appear in them. The secondary structure content and the conservation of B cell epitopes are found to be different than previously thought. These characteristics of CDRs and epitopes may be instrumental in choosing which residues to mutate in experimental search for epitopes. They may also assist in computational design of antibodies and in predicting B cell epitopes.

  9. The Isolation and Characterization of Glycosylated Phosphoproteins from Herring Fish Bones

    PubMed Central

    Zhou, Hai-Yan; Salih, Erdjan; Glimcher, Melvin J.

    2010-01-01

    Past studies of bone extracellular matrix phosphoproteins such as osteopontin and bone sialoprotein have yielded important biological information regarding their role in calcification and the regulation of cellular activity. Most of these studies have been limited to proteins extracted from mammalian and avian vertebrates and nonvertebrates. The present work describes the isolation and purification of two major highly glycosylated and phosphorylated extracellular matrix proteins of 70 and 22 kDa from herring fish bones. The 70-kDa phosphoprotein has some characteristics of osteopontin with respect to amino acid composition and susceptibility to thrombin cleavage. Unlike osteopontin, however, it was found to contain high levels of sialic acid similar to bone sialoprotein. The 22-kDa protein has very different properties such as very high content of phosphoserine (∼270 Ser(P) residues/1000 amino acid residues), Ala, and Asx residues. The N-terminal amino acid sequence analysis of both the 70-kDa (NPIMA(M)ETTS(M)DSKVNPLL) and the 22-kDa (NQDMAMEASSDPEAA) fish phosphoproteins indicate that these unique amino acid sequences are unlike any published in protein databases. An enzyme-linked immunosorbent assay revealed that the 70-kDa phosphoprotein was present principally in bone and in calcified scales, whereas the 22-kDa phosphoprotein was detected only in bone. Immunohistological analysis revealed diffusely positive immunostaining for both the 70- and 22-kDa phosphoproteins throughout the matrix of the bone. Overall, this work adds additional support to the concept that the mechanism of biological calcification has common evolutionary and fundamental bases throughout vertebrate species. PMID:20833721

  10. Characterization, production, and purification of leucocin H, a two-peptide bacteriocin from Leuconostoc MF215B.

    PubMed

    Blom, H; Katla, T; Holck, A; Sletten, K; Axelsson, L; Holo, H

    1999-07-01

    Leuconostoc MF215B was found to produce a two-peptide bacteriocin referred to as leucocin H. The two peptides were termed leucocin Halpha and leucocin Hbeta. When acting together, they inhibit, among others, Listeria monocytogenes, Bacillus cereus, and Clostridium perfringens. Production of leucocin H in growth medium takes place at temperatures down to 6 °C and at pH below 7. The highest activity of leucocin H in growth medium was demonstrated in the late exponential growth phase. The bacteriocin was purified by precipitation with ammonium sulfate, ion-exchange (SP Sepharose) and reverse phase chromatography. Upon purification, specific activity increased 10^5-fold, and the final specific activity was 2 × 10^7 BU/OD280. Amino acid composition analyses of leucocin Halpha and leucocin Hbeta indicated that both peptides consisted of around 40 amino acid residues. Their N-termini were blocked for Edman degradation, and the methionine residues of leucocin Hbeta did not respond to cyanogen bromide (CNBr) cleavage. Absorbance at 280 nm indicated the presence of tryptophan residues, and cleavage at tryptophan allowed partial sequencing by Edman degradation. From leucocin Halpha, the sequence of 20 amino acids was obtained; from leucocin Hbeta the sequence of 28 amino acid residues was obtained. No sequence homology to other known bacteriocins could be demonstrated. It also appeared that the two peptides themselves shared little or no sequence homology. The presence of soy oil did not affect the activity of leucocin H in agar.

  11. FastRNABindR: Fast and Accurate Prediction of Protein-RNA Interface Residues.

    PubMed

    El-Manzalawy, Yasser; Abbas, Mostafa; Malluhi, Qutaibah; Honavar, Vasant

    2016-01-01

    A wide range of biological processes, including regulation of gene expression, protein synthesis, and replication and assembly of many viruses, are mediated by RNA-protein interactions. However, experimental determination of the structures of protein-RNA complexes is expensive and technically challenging. Hence, a number of computational tools have been developed for predicting protein-RNA interfaces. Some of the state-of-the-art protein-RNA interface predictors rely on position-specific scoring matrix (PSSM)-based encoding of the protein sequences. The computational effort needed for generating PSSMs severely limits the practical utility of protein-RNA interface prediction servers. In this work, we experiment with two approaches, random sampling and sequence similarity reduction, for extracting a representative reference database of protein sequences from more than 50 million protein sequences in UniRef100. Our results suggest that random sampled databases produce better PSSM profiles (in terms of the number of hits used to generate the profile and the distance of the generated profile to the corresponding profile generated using the entire UniRef100 data as well as the accuracy of the machine learning classifier trained using these profiles). Based on our results, we developed FastRNABindR, an improved version of RNABindR for predicting protein-RNA interface residues using PSSM profiles generated using 1% of the UniRef100 sequences sampled uniformly at random. To the best of our knowledge, FastRNABindR is the only protein-RNA interface residue prediction online server that requires generation of PSSM profiles for query sequences and accepts hundreds of protein sequences per submission. Our approach for determining the optimal BLAST database for a protein-RNA interface residue classification task has the potential of substantially speeding up, and hence increasing the practical utility of, other amino acid sequence based predictors of protein-protein and protein-DNA interfaces.
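
    As an illustration of drawing a uniform random subset from a large sequence database in one pass, the following is a minimal sketch using reservoir sampling (Algorithm R). It is not the authors' procedure; the function names `read_fasta` and `reservoir_sample`, the FASTA input format, and the fixed seed are assumptions made for the example.

```python
import random
from typing import Iterable, Iterator, List, Tuple

Record = Tuple[str, str]  # (header, sequence)


def read_fasta(lines: Iterable[str]) -> Iterator[Record]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def reservoir_sample(records: Iterable[Record], k: int, seed: int = 0) -> List[Record]:
    """Return k records chosen uniformly at random without loading all records.

    Hypothetical helper: every record has the same probability of being kept,
    and memory use is proportional to k rather than to the database size.
    """
    rng = random.Random(seed)
    reservoir: List[Record] = []
    for i, rec in enumerate(records):
        if i < k:
            reservoir.append(rec)
        else:
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                reservoir[j] = rec
    return reservoir
```

    To approximate a 1% sample, `k` would be set to about one hundredth of the number of records; for a file on disk, `read_fasta(open(path))` can be passed directly as the record stream.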

  12. Complete amino acid sequence of ananain and a comparison with stem bromelain and other plant cysteine proteases.

    PubMed Central

    Lee, K L; Albee, K L; Bernasconi, R J; Edmunds, T

    1997-01-01

    The amino acid sequences of ananain (EC 3.4.22.31) and stem bromelain (EC 3.4.22.32), two cysteine proteases from pineapple stem, are similar yet ananain and stem bromelain possess distinct specificities towards synthetic peptide substrates and different reactivities towards the cysteine protease inhibitors E-64 and chicken egg white cystatin. We present here the complete amino acid sequence of ananain and compare it with the reported sequences of pineapple stem bromelain, papain and chymopapain from papaya and actinidin from kiwifruit. Ananain is comprised of 216 residues with a theoretical mass of 23464 Da. This primary structure includes a sequence insert between residues 170 and 174 not present in stem bromelain or papain and a hydrophobic series of amino acids adjacent to His-157. It is possible that these sequence differences contribute to the different substrate and inhibitor specificities exhibited by ananain and stem bromelain. PMID:9355753

  13. Probabilistic grammatical model for helix‐helix contact site classification

    PubMed Central

    2013-01-01

    Background: Hidden Markov Models power many state‐of‐the‐art tools in the field of protein bioinformatics. While excelling in their tasks, these methods of protein analysis do not convey directly information on medium‐ and long‐range residue‐residue interactions. This requires an expressive power of at least context‐free grammars. However, application of more powerful grammar formalisms to protein analysis has been surprisingly limited. Results: In this work, we present a probabilistic grammatical framework for problem‐specific protein languages and apply it to classification of transmembrane helix‐helix pairs configurations. The core of the model consists of a probabilistic context‐free grammar, automatically inferred by a genetic algorithm from only a generic set of expert‐based rules and positive training samples. The model was applied to produce sequence based descriptors of four classes of transmembrane helix‐helix contact site configurations. The highest performance of the classifiers reached AUCROC of 0.70. The analysis of grammar parse trees revealed the ability of representing structural features of helix‐helix contact sites. Conclusions: We demonstrated that our probabilistic context‐free framework for analysis of protein sequences outperforms the state of the art in the task of helix‐helix contact site classification. However, this is achieved without necessarily requiring modeling long range dependencies between interacting residues. A significant feature of our approach is that grammar rules and parse trees are human‐readable. Thus they could provide biologically meaningful information for molecular biologists. PMID:24350601

  14. Cloning and sequencing of the cDNA species for mammalian dimeric dihydrodiol dehydrogenases.

    PubMed Central

    Arimitsu, E; Aoki, S; Ishikura, S; Nakanishi, K; Matsuura, K; Hara, A

    1999-01-01

    Cynomolgus and Japanese monkey kidneys, dog and pig livers and rabbit lens contain dimeric dihydrodiol dehydrogenase (EC 1.3.1.20) associated with high carbonyl reductase activity. Here we have isolated cDNA species for the dimeric enzymes by reverse transcriptase-PCR from human intestine in addition to the above five animal tissues. The amino acid sequences deduced from the monkey, pig and dog cDNA species perfectly matched the partial sequences of peptides digested from the respective enzymes of these animal tissues, and active recombinant proteins were expressed in a bacterial system from the monkey and human cDNA species. Northern blot analysis revealed the existence of a single 1.3 kb mRNA species for the enzyme in these animal tissues. The human enzyme shared 94%, 85%, 84% and 82% amino acid identity with the enzymes of the two monkey strains (their sequences were identical), the dog, the pig and the rabbit respectively. The sequences of the primate enzymes consisted of 335 amino acid residues and lacked one amino acid compared with the other animal enzymes. In contrast with previous reports that other types of dihydrodiol dehydrogenase, carbonyl reductases and enzymes with either activity belong to the aldo-keto reductase family or the short-chain dehydrogenase/reductase family, dimeric dihydrodiol dehydrogenase showed no sequence similarity with the members of the two protein families. The dimeric enzyme aligned with low degrees of identity (14-25%) with several prokaryotic proteins, in which 47 residues are strictly or highly conserved. Thus dimeric dihydrodiol dehydrogenase has a primary structure distinct from the previously known mammalian enzymes and is suggested to constitute a novel protein family with the prokaryotic proteins. PMID:10477285
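
    Pairwise identity figures such as those quoted above are computed from aligned sequences. The following is a minimal sketch of one common convention (identical residues divided by the number of columns in which neither sequence has a gap); the abstract does not state which denominator the authors used, and the function name `percent_identity` is hypothetical.

```python
def percent_identity(a: str, b: str, gap: str = "-") -> float:
    """Percent identity between two rows of an alignment.

    Assumption: columns where either sequence has a gap are ignored.
    Other conventions divide by the full alignment length or by the
    length of the shorter sequence, which gives different numbers.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(a, b):
        if x == gap or y == gap:
            continue  # skip gapped columns
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared
```

    For two 335-residue sequences aligned without gaps, 94% identity under this convention corresponds to roughly 315 identical positions.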

  15. NS5A Sequence Heterogeneity and Mechanisms of Daclatasvir Resistance in Hepatitis C Virus Genotype 4 Infection.

    PubMed

    Zhou, Nannan; Hernandez, Dennis; Ueland, Joseph; Yang, Xiaoyan; Yu, Fei; Sims, Karen; Yin, Philip D; McPhee, Fiona

    2016-01-15

    Daclatasvir is an NS5A inhibitor approved for treatment of infection due to hepatitis C virus (HCV) genotypes (GTs) 1-4. To support daclatasvir use in HCV genotype 4 infection, we examined a diverse genotype 4-infected population for HCV genotype 4 subtype prevalence, NS5A polymorphisms at residues associated with daclatasvir resistance (positions 28, 30, 31, or 93), and their effects on daclatasvir activity in vitro and clinically. We performed phylogenetic analysis of genotype 4 NS5A sequences from 186 clinical trial patients and 43 sequences from the European HCV database, and susceptibility analyses of NS5A polymorphisms and patient-derived NS5A sequences by using genotype 4 NS5A hybrid genotype 2a replicons. The clinical trial patients represented 14 genotype 4 subtypes; most prevalent were genotype 4a (55%) and genotype 4d (27%). Daclatasvir 50% effective concentrations for 10 patient-derived NS5A sequences representing diverse phylogenetic clusters were ≤0.080 nM. Most baseline sequences had ≥1 NS5A polymorphism at residues associated with daclatasvir resistance; however, only 3 patients (1.6%) had polymorphisms conferring ≥1000-fold daclatasvir resistance in vitro. Among 46 patients enrolled in daclatasvir trials, all 20 with baseline resistance polymorphisms achieved a sustained virologic response. Circulating genotype 4 subtypes are genetically diverse. Polymorphisms conferring high-level daclatasvir resistance in vitro are uncommon before therapy, and clinical data suggest that genotype 4 subtype and baseline polymorphisms have minimal impact on responses to daclatasvir-containing regimens. © The Author 2015. Published by Oxford University Press for the Infectious Diseases Society of America.

  16. SODa: an Mn/Fe superoxide dismutase prediction and design server.

    PubMed

    Kwasigroch, Jean Marc; Wintjens, René; Gilis, Dimitri; Rooman, Marianne

    2008-06-02

    Superoxide dismutases (SODs) are ubiquitous metalloenzymes that play an important role in the defense of aerobic organisms against oxidative stress, by converting reactive oxygen species into nontoxic molecules. We focus here on the SOD family that uses Fe or Mn as cofactor. The SODa webtool http://babylone.ulb.ac.be/soda predicts if a target sequence corresponds to an Fe/Mn SOD. If so, it predicts the metal ion specificity (Fe, Mn or cambialistic) and the oligomerization mode (dimer or tetramer) of the target. In addition, SODa proposes a list of residue substitutions likely to improve the predicted preferences for the metal cofactor and oligomerization mode. The method is based on residue fingerprints, consisting of residues conserved in SOD sequences or typical of SOD subgroups, and of interaction fingerprints, containing residue pairs that are in contact in SOD structures. SODa is shown to outperform and to be more discriminative than traditional techniques based on pairwise sequence alignments. Moreover, the fact that it proposes selected mutations makes it a valuable tool for rational protein design.

  17. Photoaffinity Labeling of Ras Converting Enzyme using Peptide Substrates that Incorporate Benzoylphenylalanine (Bpa) Residues: Improved Labeling and Structural Implications

    PubMed Central

    Kyro, Kelly; Manandhar, Surya P.; Mullen, Daniel; Schmidt, Walter K.; Distefano, Mark D.

    2012-01-01

    Rce1p catalyzes the proteolytic trimming of C-terminal tripeptides from isoprenylated proteins containing CAAX-box sequences. Because Rce1p processing is a necessary component in the Ras pathway of oncogenic signal transduction, Rce1p holds promise as a potential target for therapeutic intervention. However, its mechanism of proteolysis and active site have yet to be defined. Here, we describe synthetic peptide analogues that mimic the natural lipidated Rce1p substrate and incorporate photolabile groups for photoaffinity-labeling applications. These photoactive peptides are designed to crosslink to residues in or near the Rce1p active site. By incorporating the photoactive group via p-benzoyl-L-phenylalanine (Bpa) residues directly into the peptide substrate sequence, the labeling efficiency was substantially increased relative to a previously-synthesized compound. Incorporation of biotin on the N-terminus of the peptides permitted photolabeled Rce1p to be isolated via streptavidin affinity capture. Our findings further suggest that residues outside the CAAX-box sequence are in contact with Rce1p, which has implications for future inhibitor design. PMID:22079863

  18. Biosynthesis and processing of the somatostatin family of peptide hormones.

    PubMed

    Andrews, P C; Dixon, J E

    1986-01-01

    Understanding of the biosynthesis of the somatostatin family of peptide hormones has greatly increased in recent years. Isolation and sequencing of the rat somatostatin gene indicates that it contains a single intron located between the codons for Gln(-57) and Glu(-56) of pre-prosomatostatin. The gene contains three repetitive sequences, one at the 5' end of the gene and two of them 3' to the coding portion. Two of the sequences consist of alternating purine-pyrimidine bases and have been shown to adopt Z-DNA structures in vitro. The cDNA for rat somatostatin codes for a 116-residue peptide structurally similar to the anglerfish and catfish precursors to the 14-residue somatostatin (SST-14). In addition to SST-14, the catfish and the anglerfish both contain an additional pancreatic somatostatin, each derived from a different gene. The catfish contains a 22-residue somatostatin, which is O-glycosylated at Thr-5. The second somatostatin gene from anglerfish encodes a prosomatostatin that is processed to a 28-residue peptide. The mature peptide contains a hydroxylated lysine at position 23.

  19. Practical analysis of specificity-determining residues in protein families.

    PubMed

    Chagoyen, Mónica; García-Martín, Juan A; Pazos, Florencio

    2016-03-01

    Determining the residues that are important for the molecular activity of a protein is a topic of broad interest in biomedicine and biotechnology. This knowledge can help in understanding the protein's molecular mechanism as well as in fine-tuning its natural function, eventually with biotechnological or therapeutic implications. Some of the protein residues are essential for the function common to all members of a family of proteins, while others explain the particular specificities of certain subfamilies (like binding on different substrates or cofactors and distinct binding affinities). Owing to the difficulty in experimentally determining them, a number of computational methods were developed to detect these functional residues, generally known as 'specificity-determining positions' (or SDPs), from a collection of homologous protein sequences. These methods are mature enough to be used routinely by molecular biologists in directing experiments aimed at getting insight into the functional specificity of a family of proteins and eventually modifying it. In this review, we summarize some of the recent discoveries achieved through SDP computational identification in a number of relevant protein families, as well as the main approaches and software tools available to perform this type of analysis. © The Author 2015. Published by Oxford University Press. For Permissions, please email: journals.permissions@oup.com.

  20. Predicting Protein-Protein Interactions by Combing Various Sequence-Derived.

    PubMed

    Zhao, Xiao-Wei; Ma, Zhi-Qiang; Yin, Ming-Hao

    2011-09-20

    Knowledge of protein-protein interactions (PPIs) plays an important role in constructing protein interaction networks and understanding the general machineries of biological systems. In this study, a new method is proposed to predict PPIs using a comprehensive set of 930 features based only on sequence information. These features measure, from different aspects, the interactions between residues a certain distance apart in the protein sequences. To achieve better performance, principal component analysis (PCA) is first employed to obtain an optimized feature subset. Then, the resulting 67-dimensional feature vectors are fed to a Support Vector Machine (SVM). Experimental results on Drosophila melanogaster and Helicobacter pylori datasets show that our method is very promising for predicting PPIs and may at least be a useful supplementary tool to existing methods.

  1. Existence of a True Phosphofructokinase in Bacillus sphaericus: Cloning and Sequencing of the pfk Gene

    PubMed Central

    Alice, Alejandro F.; Pérez-Martínez, Gaspar; Sánchez-Rivas, Carmen

    2002-01-01

    Some strains of Bacillus sphaericus are entomopathogenic to mosquito larvae, which transmit diseases, such as filariasis and malaria, affecting millions of people worldwide. This species is unable to use hexoses and pentoses as unique carbon sources, which was proposed to be due to the lack of glycolytic enzymes, such as 6-phosphofructokinase (PFK). In this study, PFK activity was detected and the pfk gene was cloned and sequenced. Furthermore, this gene was shown to be present in strains belonging to all the homology groups of this heterogeneous species, in which PFK activity was also detected. A careful sequence analysis revealed the conservation of different catalytic and regulatory residues, as well as the enzyme's phylogenetic affiliation with the family of allosteric ATP-PFK enzymes. PMID:12450869

  2. Toward rules relating zinc finger protein sequences and DNA binding site preferences.

    PubMed

    Desjarlais, J R; Berg, J M

    1992-08-15

    Zinc finger proteins of the Cys2-His2 type consist of tandem arrays of domains, where each domain appears to contact three adjacent base pairs of DNA through three key residues. We have designed and prepared a series of variants of the central zinc finger within the DNA binding domain of Sp1 by using information from an analysis of a large data base of zinc finger protein sequences. Through systematic variations at two of the three contact positions (underlined), relatively specific recognition of sequences of the form 5'-GGGGN(G or T)GGG-3' has been achieved. These results provide the basis for rules that may develop into a code that will allow the design of zinc finger proteins with preselected DNA site specificity.
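
    The site family 5'-GGGGN(G or T)GGG-3' quoted above can be written as a regular expression. The following is a minimal sketch for scanning one DNA strand for such sites; it is an illustration only, not a method from the paper, and the function name `find_sites` is hypothetical.

```python
import re

# 5'-GGGGN(G or T)GGG-3': N is any base, the sixth position is G or T.
# The lookahead lets overlapping occurrences be reported as well.
SITE = re.compile(r"(?=(GGGG[ACGT][GT]GGG))")


def find_sites(dna: str):
    """Return (0-based start, matched site) pairs on the given strand only."""
    dna = dna.upper()
    return [(m.start(), m.group(1)) for m in SITE.finditer(dna)]
```

    Scanning the reverse complement as well would be needed to cover both strands of a double-stranded target.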

  3. An extension of command shaping methods for controlling residual vibration using frequency sampling

    NASA Technical Reports Server (NTRS)

    Singer, Neil C.; Seering, Warren P.

    1992-01-01

    The authors present an extension to the impulse shaping technique for commanding machines to move with reduced residual vibration. The extension, called frequency sampling, is a method for generating constraints that are used to obtain shaping sequences which minimize residual vibration in systems such as robots whose resonant frequencies change during motion. The authors present a review of impulse shaping methods, a development of the proposed extension, and a comparison of results of tests conducted on a simple model of the space shuttle robot arm. Frequency shaping provides a method for minimizing the impulse sequence duration required to give the desired insensitivity.
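
    For orientation, the following is a minimal sketch of the classic two-impulse zero-vibration (ZV) shaper from the impulse shaping literature, together with the standard residual-vibration measure evaluated at an arbitrary frequency. It is not the frequency sampling method proposed in the report; the function names and the example frequency and damping values are assumptions.

```python
import math
from typing import List, Tuple

Impulse = Tuple[float, float]  # (time in seconds, amplitude)


def zv_shaper(freq_hz: float, zeta: float) -> List[Impulse]:
    """Two-impulse ZV shaper for a mode with natural frequency freq_hz
    and damping ratio zeta (0 <= zeta < 1). Amplitudes sum to 1."""
    wn = 2.0 * math.pi * freq_hz
    wd = wn * math.sqrt(1.0 - zeta ** 2)  # damped natural frequency
    k = math.exp(-zeta * math.pi / math.sqrt(1.0 - zeta ** 2))
    return [(0.0, 1.0 / (1.0 + k)), (math.pi / wd, k / (1.0 + k))]


def residual_vibration(shaper: List[Impulse], freq_hz: float, zeta: float) -> float:
    """Residual vibration amplitude after the last impulse, as a fraction
    of the response to a single unit impulse, for the given mode."""
    wn = 2.0 * math.pi * freq_hz
    wd = wn * math.sqrt(1.0 - zeta ** 2)
    t_last = shaper[-1][0]
    c = sum(a * math.exp(zeta * wn * t) * math.cos(wd * t) for t, a in shaper)
    s = sum(a * math.exp(zeta * wn * t) * math.sin(wd * t) for t, a in shaper)
    return math.exp(-zeta * wn * t_last) * math.hypot(c, s)
```

    Evaluating `residual_vibration` over a range of frequencies shows why a shaper designed for one frequency degrades when the resonance shifts during motion, which is the situation the report addresses.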

  4. Mutations in the Ada O6-alkylguanine-DNA alkyltransferase conferring sensitivity to inactivation by O6-benzylguanine and 2,4-diamino-6-benzyloxy-5-nitrosopyrimidine.

    PubMed

    Crone, T M; Kanugula, S; Pegg, A E

    1995-08-01

    Although the human O6-alkylguanine-DNA alkyltransferase (AGT) is very sensitive to inactivation by O6-benzylguanine (BG) or 2,4-diamino-6-benzyloxy-5-nitrosopyrimidine (5-nitroso-BP), the equivalent protein formed by the carboxyl terminal domain of the product of the Escherichia coli ada gene (Ada-C) is unaffected by these inhibitors. This difference is remarkable in view of the substantial similarity between these proteins (33% of the residues in the common sequence are identical) and is potentially very important since these inhibitors are under development as drugs to enhance the anti-tumor activity of alkylating agents. In order to understand the reason for the resistance of the Ada-C protein, we have made chimeras between Ada-C and AGT sequences and mutations in the Ada-C protein, expressed the altered proteins in an E. coli strain lacking endogenous alkyltransferase activity and tested the inactivation of the resulting proteins by BG or 5-nitroso-BP. Chimeric alkyltransferase proteins were made in which the residues on the amino side of the cysteine acceptor site came from Ada-C and the residues on the carboxyl side came from AGT and vice versa but these did not show sensitivity to BG suggesting that resistance is produced by residues in both segments of the protein. Analysis of the Ada-C mutant proteins revealed two sites for mutations that confer sensitivity to these inhibitors. One of these was tryptophan-336 and the other was residues lysine-314 and alanine-316. Thus, when the combined mutations of A316P/W336A were made in the Ada-C sequence, the protein was sensitive to inactivation by BG. This A316P/W336A mutant protein was even more sensitive to 5-nitroso-BP and the mutant proteins W336A, K314P/A316P and A316P could also be inhibited by this drug (in decreasing order of sensitivity) although the control Ada-C and a mutant R335S were not inhibited. These results provide strong support for the hypothesis that the resistance of the Ada-C alkyltransferase is due to a steric effect limiting access to the active site. Insertion of proline residues at positions 314 and 316 and removal of the bulky tryptophan residue at position 336 increases the space available at the active site and permits these inhibitors to be effective.

  5. A G-to-A mutation in IVS-3 of the human gamma fibrinogen gene causing afibrinogenemia due to abnormal RNA splicing.

    PubMed

    Margaglione, M; Santacroce, R; Colaizzo, D; Seripa, D; Vecchione, G; Lupone, M R; De Lucia, D; Fortina, P; Grandone, E; Perricone, C; Di Minno, G

    2000-10-01

    Congenital afibrinogenemia is a rare autosomal recessive disorder characterized by a hemorrhagic diathesis of variable severity. Although more than 100 families with this disorder have been described, genetic defects have been characterized in few cases. An investigation of a young propositus, offspring of a consanguineous marriage, with undetectable levels of functional and quantitative fibrinogen, was conducted. Sequence analysis of the fibrinogen genes showed a homozygous G-to-A mutation at the fifth nucleotide (nt 2395) of the third intervening sequence (IVS) of the gamma-chain gene. Her first-degree relatives, who had approximately half the normal fibrinogen values and showed concordance between functional and immunologic levels, were heterozygotes. The G-to-A change predicts the disappearance of a donor splice site. After transfection with a construct containing either the wild-type or the mutated sequence, cells with the mutant construct showed an aberrant messenger RNA (mRNA), consistent with skipping of exon 3, but not the expected mRNA. Sequencing of the abnormal mRNA showed the complete absence of exon 3. Skipping of exon 3 predicts the deletion of amino acid sequence from residue 16 to residue 75 and shifting of reading frame at amino acid 76 with a premature stop codon within exon 4 at position 77. Thus, the truncated gamma-chain gene product would not interact with other chains to form the mature fibrinogen molecule. The current findings show that mutations within highly conserved IVS regions of fibrinogen genes could affect the efficiency of normal splicing, giving rise to congenital afibrinogenemia.

  6. On the Split Personality of Penultimate Proline

    PubMed Central

    Glover, Matthew S.; Shi, Liuqing; Fuller, Daniel R.; Arnold, Randy J.; Radivojac, Predrag; Clemmer, David E.

    2014-01-01

    The influence of the position of the amino acid proline in polypeptide sequences is examined by a combination of ion mobility spectrometry-mass spectrometry (IMS-MS), amino acid substitutions, and molecular modeling. The results suggest that when proline exists as the second residue from the N-terminus (i.e., penultimate proline), two families of conformers are formed. We demonstrate the existence of these families by a study of a series of truncated and mutated peptides derived from the 11-residue peptide Ser1-Pro2-Glu3-Leu4-Pro5-Ser6-Pro7-Gln8-Ala9-Glu10-Lys11. We find that every peptide from this sequence with a penultimate proline residue has multiple conformations. Substitution of Ala for Pro residues indicates that multiple conformers arise from the cis-trans isomerization of Xaa1–Pro2 peptide bonds as Xaa–Ala peptide bonds are unlikely to adopt the cis isomer, and examination of spectra from a library of 58 peptides indicates that ~80% of sequences show this effect. A simple mechanism suggesting that the barrier between the cis- and trans-proline forms is lowered because of low steric impedance is proposed. This observation may have interesting biological implications as well, and we note that a number of biologically active peptides have penultimate proline residues. PMID:25503299

  7. Effect of P to A Mutation of the N-Terminal Residue Adjacent to the RGD Motif on Rhodostomin: Importance of Dynamics in Integrin Recognition

    PubMed Central

    Chen, Yi-Chun; Chang, Yao-Tsung; Chang, Yung-Sheng; Huang, Chun-Hao; Chuang, Woei-Jer

    2012-01-01

    Rhodostomin (Rho) is an RGD protein that specifically inhibits integrins. We found that Rho mutants with the P48A mutation inhibited integrin α5β1 4.4–11.5 times more actively. Structural analysis showed that they have a similar 3D conformation for the RGD loop. Docking analysis also showed no difference between their interactions with integrin α5β1. However, the backbone dynamics of RGD residues were different. The values of the R2 relaxation parameter for Rho residues R49 and D51 were 39% and 54% higher than those of the P48A mutant, which caused differences in S2, Rex, and τe. The S2 values of the P48A mutant residues R49, G50, and D51 were 29%, 14%, and 28% lower than those of Rho. The Rex values of Rho residues R49 and D51 were 0.91 s−1 and 1.42 s−1; however, no Rex was found for those of the P48A mutant. The τe values of Rho residues R49 and D51 were 9.5 and 5.1 times lower than those of the P48A mutant. Mutational study showed that integrin α5β1 prefers its ligands to contain (G/A)RGD but not PRGD sequences for binding. These results demonstrate that the N-terminal proline residue adjacent to the RGD motif affects its function and dynamics, which suggests that the dynamic properties of the RGD motif may be important in Rho's interaction with integrin α5β1. PMID:22238583

  8. Quantitative expression of protein heterogeneity: Response of amino acid side chains to their local environment.

    PubMed

    Bandyopadhyay, Debashree; Mehler, Ernest L

    2008-08-01

    A general method has been developed to characterize the hydrophobicity or hydrophilicity of the microenvironment (MENV), in which a given amino acid side chain is immersed, by calculating a quantitative property descriptor (QPD) based on the relative (to water) hydrophobicity of the MENV. Values of the QPD were calculated for a test set of 733 proteins to analyze the modulating effects on amino acid residue properties by the MENV in which they are imbedded. The QPD values and solvent accessibility were used to derive a partitioning of residues based on the MENV hydrophobicities. From this partitioning, a new hydrophobicity scale was developed, entirely in the context of protein structure, where amino acid residues are immersed in one or more "MENVpockets." Thus, the partitioning is based on the residues "sampling" a large number of "solvents" (MENVs) that represent a very large range of hydrophobicity values. It was found that the hydrophobicity of around 80% of amino acid side chains and their MENV are complementary to each other, but for about 20%, the MENV and their imbedded residue can be considered as mismatched. Many of these mismatches could be rationalized in terms of the structural stability of the protein and/or the involvement of the imbedded residue in function. The analysis also indicated a remarkable conservation of local environments around highly conserved active site residues that have similar functions across protein families, but where members have relatively low sequence homology. Thus, quantitative evaluation of this QPD is suggested, here, as a tool for structure-function prediction, analysis, and parameter development for the calculation of properties in proteins. (c) 2008 Wiley-Liss, Inc.

  9. Experimental assessment of the importance of amino acid positions identified by an entropy-based correlation analysis of multiple-sequence alignments.

    PubMed

    Dietrich, Susanne; Borst, Nadine; Schlee, Sandra; Schneider, Daniel; Janda, Jan-Oliver; Sterner, Reinhard; Merkl, Rainer

    2012-07-17

    The analysis of a multiple-sequence alignment (MSA) with correlation methods identifies pairs of residue positions whose occupation with amino acids changes in a concerted manner. It is plausible to assume that positions that are part of many such correlation pairs are important for protein function or stability. We have used the algorithm H2r to identify positions k in the MSAs of the enzymes anthranilate phosphoribosyl transferase (AnPRT) and indole-3-glycerol phosphate synthase (IGPS) that show a high conn(k) value, i.e., a large number of significant correlations in which k is involved. The importance of the identified residues was experimentally validated by performing mutagenesis studies with sAnPRT and sIGPS from the archaeon Sulfolobus solfataricus. For sAnPRT, five H2r mutant proteins were generated by replacing nonconserved residues with alanine or the prevalent residue of the MSA. As a control, five residues with conn(k) values of zero were chosen randomly and replaced with alanine. The catalytic activities and conformational stabilities of the H2r and control mutant proteins were analyzed by steady-state enzyme kinetics and thermal unfolding studies. Compared to wild-type sAnPRT, the catalytic efficiencies (k(cat)/K(M)) were largely unaltered. In contrast, the apparent thermal unfolding temperature (T(M)(app)) was lowered in most proteins. Remarkably, the strongest observed destabilization (ΔT(M)(app) = 14 °C) was caused by the V284A exchange, which pertains to the position with the highest correlation signal [conn(k) = 11]. For sIGPS, six H2r mutant and four control proteins with alanine exchanges were generated and characterized. The k(cat)/K(M) values of four H2r mutant proteins were reduced between 13- and 120-fold, and their T(M)(app) values were decreased by up to 5 °C. For the sIGPS control proteins, the observed activity and stability decreases were much less severe. 
    Our findings demonstrate that positions with high conn(k) values have an increased probability of being important for enzyme function or stability.
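
    The conn(k) score described above, the number of significant correlation pairs in which position k takes part, can be sketched as a simple count. This is a minimal illustration and not the H2r implementation: the function name and the example positions are hypothetical, and the step that decides which pairs are significant is assumed to have been done already.

```python
from collections import Counter


def conn_values(significant_pairs):
    """Return conn(k): how many distinct significant pairs involve position k.

    `significant_pairs` is an iterable of (i, j) alignment positions that some
    upstream correlation analysis (assumed, not shown) judged significant.
    """
    # Treat (i, j) and (j, i) as the same pair and ignore repeats.
    unique_pairs = {tuple(sorted(pair)) for pair in significant_pairs}
    conn = Counter()
    for i, j in unique_pairs:
        conn[i] += 1
        conn[j] += 1
    return conn


# Hypothetical pairs, invented purely for illustration.
pairs = [(12, 84), (12, 130), (84, 130), (12, 201), (130, 12)]
conn = conn_values(pairs)

# Rank positions by conn(k), highest first; ties broken by position number.
ranked = sorted(conn.items(), key=lambda item: (-item[1], item[0]))
```

    Positions absent from every pair get conn(k) = 0 (a Counter returns 0 for missing keys), which matches how the control positions are described in the abstract.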

  10. A new earthworm cellulase and its possible role in the innate immunity.

    PubMed

    Park, In Yong; Cha, Ju Roung; Ok, Suk-Mi; Shin, Chuog; Kim, Jin-Se; Kwak, Hee-Jin; Yu, Yun-Sang; Kim, Yu-Kyung; Medina, Brenda; Cho, Sung-Jin; Park, Soon Cheol

    2017-02-01

    A new endogenous cellulase (Ean-EG) from the earthworm Eisenia andrei and its expression pattern are described. Based on the deduced amino acid sequence, the open reading frame (ORF) of Ean-EG consists of 1368 bp, corresponding to a polypeptide of 456 amino acid residues that contains the conserved region specific to GHF9, including the amino acid residues essential for enzyme activity. In multiple alignments and phylogenetic analysis, the deduced amino acid sequence of Ean-EG showed the highest sequence similarity (about 79%) to that of an annelid (Pheretima hilgendorfi) and could be clustered together with other GHF9 cellulases, indicating that Ean-EG could be categorized as a member of GHF9, to which most animal cellulases belong. In situ hybridization of Ean-EG mRNA revealed that the most distinct expression was in epithelial cells, with positive hybridization signals in the epidermis, chloragogen tissue cells, coelomic cell aggregates, and even blood vessels, which suggests that, at least in the earthworm Eisenia andrei, cellulase function may not be limited to the digestive process but may extend to innate immunity. Copyright © 2016 Elsevier Ltd. All rights reserved.

  11. PuLSE: Quality control and quantification of peptide sequences explored by phage display libraries.

    PubMed

    Shave, Steven; Mann, Stefan; Koszela, Joanna; Kerr, Alastair; Auer, Manfred

    2018-01-01

    The design of highly diverse phage display libraries is based on the assumption that DNA bases are incorporated at similar rates within the randomized sequence. As library complexity increases and expected copy numbers of unique sequences decrease, the exploration of library space becomes sparser and the presence of truly random sequences becomes critical. We present the program PuLSE (Phage Library Sequence Evaluation) as a tool for assessing randomness and therefore diversity of phage display libraries. PuLSE runs on a collection of sequence reads in the fastq file format and generates tables profiling the library in terms of unique DNA sequence counts and positions, translated peptide sequences, and normalized 'expected' occurrences from base to residue codon frequencies. The output allows at-a-glance quantitative quality control of a phage library in terms of sequence coverage both at the DNA base and translated protein residue level, which has been missing from toolsets and literature. The open source program PuLSE is available in two formats, a C++ source code package for compilation and integration into existing bioinformatics pipelines and precompiled binaries for ease of use.
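
    To make the idea of 'expected' residue occurrences concrete, the sketch below derives expected amino acid frequencies for one randomized codon under the assumption named in the abstract, that all allowed bases are incorporated at equal rates. It is not PuLSE code: the NNK scheme (N = A/C/G/T, K = G/T) is chosen here only as a common example of a randomized codon, and the function name is hypothetical.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order per position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def expected_residue_freqs(pos1, pos2, pos3):
    """Expected residue frequencies for one randomized codon.

    Each argument lists the bases allowed at that codon position; every
    allowed base is assumed to be incorporated with equal probability.
    """
    counts = Counter()
    for codon in product(pos1, pos2, pos3):
        counts[CODON_TABLE["".join(codon)]] += 1
    total = sum(counts.values())
    return {aa: n / total for aa, n in counts.items()}


# NNK codon: any base at positions 1 and 2, G or T at position 3 (32 codons).
nnk = expected_residue_freqs("ACGT", "ACGT", "GT")
```

    Observed residue counts from sequencing reads could then be divided by these expected frequencies to flag over- or under-represented residues, which is the kind of comparison the abstract describes.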

  12. The amino acid sequence of Staphylococcus aureus penicillinase.

    PubMed Central

    Ambler, R P

    1975-01-01

    The amino acid sequence of the penicillinase (penicillin amido-beta-lactamhydrolase, EC 3.5.2.6) from Staphylococcus aureus strain PC1 was determined. The protein consists of a single polypeptide chain of 257 residues, and the sequence was determined by characterization of tryptic, chymotryptic, peptic and CNBr peptides, with some additional evidence from thermolysin and S. aureus proteinase peptides. A mistake in the preliminary report of the sequence is corrected; residues 113-116 are now thought to be -Lys-Lys-Val-Lys- rather than -Lys-Val-Lys-Lys-. Detailed evidence for the amino acid sequence has been deposited as Supplementary Publication SUP 50056 (91 pages) at the British Library (Lending Division), Boston Spa, Wetherby, West Yorkshire LS23 7BQ, U.K., from whom copies may be obtained on the terms given in Biochem. J. (1975) 145, 5. PMID:1218078

  13. Homologous kappa-neurotoxins exhibit residue-specific interactions with the alpha 3 subunit of the nicotinic acetylcholine receptor: a comparison of the structural requirements for kappa-bungarotoxin and kappa-flavitoxin binding.

    PubMed

    McLane, K E; Weaver, W R; Lei, S; Chiappinelli, V A; Conti-Tronconi, B M

    1993-07-13

    kappa-Flavitoxin (kappa-FTX), a snake neurotoxin that is a selective antagonist of certain neuronal nicotinic acetylcholine receptors (AChRs), has recently been isolated and characterized [Grant, G. A., Frazier, M. W., & Chiappinelli, V. A. (1988) Biochemistry 27, 1532-1537]. Like the related snake toxin kappa-bungarotoxin (kappa-BTX), kappa-FTX binds with high affinity to alpha 3 subtypes of neuronal AChRs, even though there are distinct sequence differences between the two toxins. To further characterize the sequence regions of the neuronal AChR alpha 3 subunit involved in formation of the binding site for this family of kappa-neurotoxins, we investigated kappa-FTX binding to overlapping synthetic peptides screening the alpha 3 subunit sequence. A sequence region forming a "prototope" for kappa-FTX was identified within residues alpha 3 (51-70), confirming the suggestions of previous studies on the binding of kappa-BTX to the alpha 3 subunit [McLane, K. E., Tang, F., & Conti-Tronconi, B. M. (1990) J. Biol. Chem. 265, 1537-1544] and alpha-bungarotoxin to the Torpedo AChR alpha subunit [Conti-Tronconi, B. M., Tang, F., Diethelm, B. M., Spencer, S. R., Reinhardt-Maelicke, S., & Maelicke, A. (1990) Biochemistry 29, 6221-6230] that this sequence region is involved in formation of a cholinergic site. Single residue substituted analogues, where each residue of the sequence alpha 3 (51-70) was sequentially replaced by a glycine, were used to identify the amino acid side chains involved in the interaction of this prototope with kappa-FTX.(ABSTRACT TRUNCATED AT 250 WORDS)

  14. Preparation and properties of pure, full-length IclR protein of Escherichia coli. Use of time-of-flight mass spectrometry to investigate the problems encountered.

    PubMed Central

    Donald, L. J.; Chernushevich, I. V.; Zhou, J.; Verentchikov, A.; Poppe-Schriemer, N.; Hosfield, D. J.; Westmore, J. B.; Ens, W.; Duckworth, H. W.; Standing, K. G.

    1996-01-01

    IclR protein, the repressor of the aceBAK operon of Escherichia coli, has been examined by time-of-flight mass spectrometry, with ionization by matrix assisted laser desorption or by electrospray. The purified protein was found to have a smaller mass than that predicted from the base sequence of the cloned iclR gene. Additional measurements were made on mixtures of peptides derived from IclR by treatment with trypsin and cyanogen bromide. They showed that the amino acid sequence is that predicted from the gene sequence, except that the protein has suffered truncation by removal of the N-terminal eight or, in some cases, nine amino acid residues. The peptide bond whose hydrolysis would remove eight residues is a typical target for the E. coli protease OmpT. We find that, by taking precautions to minimize OmpT proteolysis, or by eliminating it through mutation of the host strain, we can isolate full-length IclR protein (lacking only the N-terminal methionine residue). Full-length IclR is a much better DNA-binding protein than the truncated versions: it binds the aceBAK operator sequence 44-fold more tightly, presumably because of additional contacts that the N-terminal residues make with the DNA. Our experience thus demonstrates the advantages of using mass spectrometry to characterize newly purified proteins produced from cloned genes, especially where proteolysis or other covalent modification is a concern. This technique gives mass spectra from complex peptide mixtures that can be analyzed completely, without any fractionation of the mixtures, by reference to the amino acid sequence inferred from the base sequence of the cloned gene. PMID:8844850

  15. Negative Ion In-Source Decay Matrix-Assisted Laser Desorption/Ionization Mass Spectrometry for Sequencing Acidic Peptides

    NASA Astrophysics Data System (ADS)

    McMillen, Chelsea L.; Wright, Patience M.; Cassady, Carolyn J.

    2016-05-01

    Matrix-assisted laser desorption/ionization (MALDI) in-source decay was studied in the negative ion mode on deprotonated peptides to determine its usefulness for obtaining extensive sequence information for acidic peptides. Eight biological acidic peptides, ranging in size from 11 to 33 residues, were studied by negative ion mode ISD (nISD). The matrices 2,5-dihydroxybenzoic acid, 2-aminobenzoic acid, 2-aminobenzamide, 1,5-diaminonaphthalene, 5-amino-1-naphthol, 3-aminoquinoline, and 9-aminoacridine were used with each peptide. Optimal fragmentation was produced with 1,5-diaminonaphthalene (DAN), and extensive sequence informative fragmentation was observed for every peptide except hirudin(54-65). Cleavage at the N-Cα bond of the peptide backbone, producing c' and z' ions, was dominant for all peptides. Cleavage of the N-Cα bond N-terminal to proline residues was not observed. The formation of c and z ions is also found in electron transfer dissociation (ETD), electron capture dissociation (ECD), and positive ion mode ISD, which are considered to be radical-driven techniques. Oxidized insulin chain A, which has four highly acidic oxidized cysteine residues, had less extensive fragmentation. This peptide also exhibited the only charged localized fragmentation, with more pronounced product ion formation adjacent to the highly acidic residues. In addition, spectra were obtained by positive ion mode ISD for each protonated peptide; more sequence informative fragmentation was observed via nISD for all peptides. Three of the peptides studied had no product ion formation in ISD, but extensive sequence informative fragmentation was found in their nISD spectra. The results of this study indicate that nISD can be used to readily obtain sequence information for acidic peptides.

  16. Negative Ion In-Source Decay Matrix-Assisted Laser Desorption/Ionization Mass Spectrometry for Sequencing Acidic Peptides.

    PubMed

    McMillen, Chelsea L; Wright, Patience M; Cassady, Carolyn J

    2016-05-01

    Matrix-assisted laser desorption/ionization (MALDI) in-source decay was studied in the negative ion mode on deprotonated peptides to determine its usefulness for obtaining extensive sequence information for acidic peptides. Eight biological acidic peptides, ranging in size from 11 to 33 residues, were studied by negative ion mode ISD (nISD). The matrices 2,5-dihydroxybenzoic acid, 2-aminobenzoic acid, 2-aminobenzamide, 1,5-diaminonaphthalene, 5-amino-1-naphthol, 3-aminoquinoline, and 9-aminoacridine were used with each peptide. Optimal fragmentation was produced with 1,5-diaminonaphthalene (DAN), and extensive sequence informative fragmentation was observed for every peptide except hirudin(54-65). Cleavage at the N-Cα bond of the peptide backbone, producing c' and z' ions, was dominant for all peptides. Cleavage of the N-Cα bond N-terminal to proline residues was not observed. The formation of c and z ions is also found in electron transfer dissociation (ETD), electron capture dissociation (ECD), and positive ion mode ISD, which are considered to be radical-driven techniques. Oxidized insulin chain A, which has four highly acidic oxidized cysteine residues, had less extensive fragmentation. This peptide also exhibited the only charged localized fragmentation, with more pronounced product ion formation adjacent to the highly acidic residues. In addition, spectra were obtained by positive ion mode ISD for each protonated peptide; more sequence informative fragmentation was observed via nISD for all peptides. Three of the peptides studied had no product ion formation in ISD, but extensive sequence informative fragmentation was found in their nISD spectra. The results of this study indicate that nISD can be used to readily obtain sequence information for acidic peptides.

  17. Peptide Analysis Using Tandem Mass Spectrometry

    DTIC Science & Technology

    1989-06-01

    to give pyroglutamic acid during storage, eliminating ammonia. It is almost absent in the spectrum of a freshly-prepared sample and is not seen in...USING TANDEM MASS SPECTROMETRY INTRODUCTION S The objective of the project was to determine the complete amino acid sequence of the large polypeptide...Ubiquitin by use of fast atom bombardment (FAB) ionization and tandem mass spectrometry. The peptide containing 76 amino acid residues was available

  18. Molecular Cloning and Characteristic Features of a Novel Extracellular Tyrosinase from Aspergillus niger PA2.

    PubMed

    Agarwal, Pragati; Singh, Jyoti; Singh, R P

    2017-05-01

    Aspergillus niger PA2, a novel strain isolated from waste effluents of the food industry, is a potential extracellular tyrosinase producer. Enzyme activity and L-DOPA production were maximal when glucose and peptone were employed as the carbon and nitrogen sources, respectively, and were enhanced notably when the medium was supplemented with copper, indicating the significance of copper in tyrosinase activity. The tyrosinase-encoding gene from the fungus was cloned; amplification yielded a 1127-bp DNA fragment encoding a product of 374 amino acid residues, with a predicted mass of 42.3 kDa and an isoelectric point of 4.8. Primary sequence analysis showed that A. niger PA2 tyrosinase has approximately 99% identity with that of A. niger CBS 513.88, which was further confirmed by phylogenetic analysis. The inferred amino acid sequence of A. niger tyrosinase contained two putative copper-binding sites comprising six histidines, a characteristic feature of type-3 copper proteins, which were highly conserved in all tyrosinases throughout the Aspergillus species. When superimposed onto the tertiary structure of A. oryzae tyrosinase, the conserved residues from both organisms occupied the same spatial positions to provide a di-copper-binding peptide groove.

  19. Structural characterisation of a water-soluble polysaccharide from tissue-cultured Dendrobium huoshanense C.Z. Tang et S.J. Cheng.

    PubMed

    Si, Hua-Yang; Chen, Nai-Fu; Chen, Nai-Dong; Huang, Cheng; Li, Jun; Wang, Hui

    2018-02-01

    A water-soluble polysaccharide TC-DHPA4 with a molecular weight of 8.0 × 10^5 Da was isolated from tissue-cultured Dendrobium huoshanense by anion exchange and gel permeation chromatography. Monosaccharide analysis revealed that the homogeneous polysaccharide was made up of rhamnose, arabinose, mannose, glucose, galactose and glucuronic acid with a molar ratio of 1.28:1:1.67:4.71:10.43:1.42. The sugar residue sequence analysis based on the GC-MS files and NMR spectra indicated that the backbone of TC-DHPA4 consisted of the repeated units:→6)-β-Galp-(1→6)-β-Galp-(1→4)-β-GlcpA-(1→6)-β-Glcp-(1→6)-β-Glcp-(→. The sugar residue sequences β-Glcp-(1→)-α-Rhap-(1→3)-β-Galp-(1→, β-Glcp-(1→4)-α-Rhap-(1→3)-β-Galp-(1→, β-Galp-(1→6)-β-Manp-(1→3)-β-Galp-(1→, and α-l-Araf-(1→2)-β-Manp-(1→3)-β-Galp-(1→ were identified as the branches attached to the C-3 position of (1→6)-linked galactose in the backbone.

  20. Spatial Autocorrelation Approaches to Testing Residuals from Least Squares Regression

    PubMed Central

    Chen, Yanguang

    2016-01-01

    In geo-statistics, the Durbin-Watson test is frequently employed to detect the presence of residual serial correlation from least squares regression analyses. However, the Durbin-Watson statistic is only suitable for ordered time or spatial series. If the variables comprise cross-sectional data coming from spatial random sampling, the test will be ineffectual because the value of Durbin-Watson’s statistic depends on the sequence of data points. This paper develops two new statistics for testing serial correlation of residuals from least squares regression based on spatial samples. By analogy with the new form of Moran’s index, an autocorrelation coefficient is defined with a standardized residual vector and a normalized spatial weight matrix. Then by analogy with the Durbin-Watson statistic, two types of new serial correlation indices are constructed. As a case study, the two newly presented statistics are applied to a spatial sample of 29 regions of China. These results show that the new spatial autocorrelation models can be used to test the serial correlation of residuals from regression analysis. In practice, the new statistics can make up for the deficiencies of the Durbin-Watson test. PMID:26800271
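
    The order dependence of the Durbin-Watson statistic, and the order-free alternative built from a standardized residual vector and a normalized weight matrix, can be illustrated with a small sketch. The abstract gives no formulas, so the code assumes the textbook Durbin-Watson definition and a common matrix form of Moran's index, I = z'Wz, with W scaled so its entries sum to 1; it does not reproduce the paper's two new statistics. The residuals and the chain-shaped neighbour matrix are invented for illustration.

```python
from math import sqrt


def durbin_watson(residuals):
    """Textbook Durbin-Watson statistic; depends on the order of the data."""
    num = sum((b - a) ** 2 for a, b in zip(residuals, residuals[1:]))
    den = sum(e * e for e in residuals)
    return num / den


def morans_index(residuals, weights):
    """Moran's index as z'Wz (assumed form, see the paragraph above).

    z is the residual vector standardized with the population standard
    deviation; the weight matrix is rescaled so all entries sum to 1.
    """
    n = len(residuals)
    mean = sum(residuals) / n
    sd = sqrt(sum((e - mean) ** 2 for e in residuals) / n)
    z = [(e - mean) / sd for e in residuals]
    total = sum(sum(row) for row in weights)
    return sum(
        weights[i][j] / total * z[i] * z[j]
        for i in range(n)
        for j in range(n)
    )


# The same four residuals in two different orders (hypothetical data).
alternating = [1.0, -1.0, 1.0, -1.0]
clustered = [1.0, 1.0, -1.0, -1.0]

# Hypothetical contiguity matrix: four locations arranged in a chain.
chain = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

dw_alternating = durbin_watson(alternating)
dw_clustered = durbin_watson(clustered)
moran_clustered = morans_index(clustered, chain)
```

    Reordering the same residuals changes the Durbin-Watson value, whereas the Moran-type index depends only on which locations are neighbours according to the weight matrix, not on the order in which the observations are listed.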

  1. m6aViewer: software for the detection, analysis, and visualization of N6-methyladenosine peaks from m6A-seq/ME-RIP sequencing data.

    PubMed

    Antanaviciute, Agne; Baquero-Perez, Belinda; Watson, Christopher M; Harrison, Sally M; Lascelles, Carolina; Crinnion, Laura; Markham, Alexander F; Bonthron, David T; Whitehouse, Adrian; Carr, Ian M

    2017-10-01

    Recent methods for transcriptome-wide N6-methyladenosine (m6A) profiling have facilitated investigations into the RNA methylome and established m6A as a dynamic modification that has critical regulatory roles in gene expression and may play a role in human disease. However, bioinformatics resources available for the analysis of m6A sequencing data are still limited. Here, we describe m6aViewer, a cross-platform application for analysis and visualization of m6A peaks from sequencing data. m6aViewer implements a novel m6A peak-calling algorithm that identifies high-confidence methylated residues with more precision than previously described approaches. The application enables data analysis through a graphical user interface, and thus, in contrast to other currently available tools, does not require the user to be skilled in computer programming. m6aViewer and test data can be downloaded here: http://dna2.leeds.ac.uk/m6a. © 2017 Antanaviciute et al.; Published by Cold Spring Harbor Laboratory Press for the RNA Society.

  2. Molecular cloning and sequence analysis of stearoyl-CoA desaturase in milkfish, Chanos chanos.

    PubMed

    Hsieh, S L; Liao, W L; Kuo, C M

    2001-12-01

    Stearoyl-CoA desaturase (EC 1.14.99.5) is a key enzyme in the biosynthesis of polyunsaturated fatty acids and the maintenance of the homeoviscous fluidity of biological membranes. The stearoyl-CoA desaturase cDNA in milkfish (Chanos chanos) was cloned by RT-PCR and RACE, and it was compared with the stearoyl-CoA desaturase in cold-tolerant teleosts, common carp and grass carp. Nucleotide sequence analysis revealed that the cDNA clone has a 972-bp open reading frame encoding 323 amino acid residues. Alignments of the deduced amino acid sequence showed that the milkfish stearoyl-CoA desaturase shares 79% and 75% identity with common carp and grass carp, and 63%-64% with other vertebrates such as sheep, hamsters, rats, mice, and humans. Like common carp and grass carp, the deduced amino acid sequence in milkfish well conserves three histidine cluster motifs (one HXXXXH and two HXXHH) that are essential for catalysis of stearoyl-CoA desaturase activity. However, RT-PCR analysis showed that stearoyl-CoA desaturase expression in milkfish is detected in the tissues of liver, muscle, kidney, brain, and gill, and more expression sites were found in milkfish than in common carp and grass carp. Phylogenic relationships among the deduced stearoyl-CoA desaturase amino acid sequence in milkfish and those in other vertebrates showed that the milkfish stearoyl-CoA desaturase amino acid sequence is phylogenetically closer to those of common carp and grass carp than to other higher vertebrates.
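
    The relation between the reported ORF length and protein length can be checked with simple arithmetic. The sketch assumes that the 972-bp figure includes the stop codon, which the abstract does not state explicitly; the function name is hypothetical.

```python
def orf_residue_count(orf_bp, includes_stop=True):
    """Number of amino acid residues encoded by an ORF of `orf_bp` base pairs."""
    if orf_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    codons = orf_bp // 3
    # The stop codon occupies one codon but encodes no residue.
    return codons - 1 if includes_stop else codons


milkfish_residues = orf_residue_count(972)  # 324 codons, one of them the stop
```

    Under that assumption, 972 bp gives 324 codons and 323 residues, in agreement with the abstract.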

  3. Exploring the Origin of Differential Binding Affinities of Human Tubulin Isotypes αβII, αβIII and αβIV for DAMA-Colchicine Using Homology Modelling, Molecular Docking and Molecular Dynamics Simulations

    PubMed Central

    Panda, Dulal; Kunwar, Ambarish

    2016-01-01

    Tubulin isotypes are found to play an important role in regulating microtubule dynamics. The isotype composition is also thought to contribute in the development of drug resistance as tubulin isotypes show differential binding affinities for various anti-cancer agents. Tubulin isotypes αβII, αβIII and αβIV show differential binding affinity for colchicine. However, the origin of differential binding affinity is not well understood at the molecular level. Here, we investigate the origin of differential binding affinity of a colchicine analogue N-deacetyl-N-(2-mercaptoacetyl)-colchicine (DAMA-colchicine) for human αβII, αβIII and αβIV isotypes, employing sequence analysis, homology modeling, molecular docking, molecular dynamics simulation and MM-GBSA binding free energy calculations. The sequence analysis study shows that the residue compositions are different in the colchicine binding pocket of αβII and αβIII, whereas no such difference is present in αβIV tubulin isotypes. Further, the molecular docking and molecular dynamics simulations results show that residue differences present at the colchicine binding pocket weaken the bonding interactions and the correct binding of DAMA-colchicine at the interface of αβII and αβIII tubulin isotypes. Post molecular dynamics simulation analysis suggests that these residue variations affect the structure and dynamics of αβII and αβIII tubulin isotypes, which in turn affect the binding of DAMA-colchicine. Further, the binding free-energy calculation shows that αβIV tubulin isotype has the highest binding free-energy and αβIII has the lowest binding free-energy for DAMA-colchicine. The order of binding free-energy for DAMA-colchicine is αβIV ≃ αβII >> αβIII. Thus, our computational approaches provide an insight into the effect of residue variations on differential binding of αβII, αβIII and αβIV tubulin isotypes with DAMA-colchicine and may help to design new analogues with higher binding affinities for tubulin isotypes. 
    PMID:27227832

  4. Relationships between residue Voronoi volume and sequence conservation in proteins.

    PubMed

    Liu, Jen-Wei; Cheng, Chih-Wen; Lin, Yu-Feng; Chen, Shao-Yu; Hwang, Jenn-Kang; Yen, Shih-Chung

    2018-02-01

    Functional and biophysical constraints can cause different levels of sequence conservation in proteins. Previously, structural properties, e.g., relative solvent accessibility (RSA) and packing density of the weighted contact number (WCN), have been found to be related to protein sequence conservation (CS). The Voronoi volume has recently been recognized as a new structural property of the local protein structural environment reflecting CS. However, for surface residues, it is sensitive to water molecules surrounding the protein structure. Herein, we present a simple structural determinant termed the relative space of Voronoi volume (RSV); it uses the Voronoi volume and the van der Waals volume of particular residues to quantify the local structural environment. RSV (range, 0-1) is defined as (Voronoi volume − van der Waals volume)/Voronoi volume of the target residue. The concept of RSV describes the extent of available space for every protein residue. RSV and Voronoi profiles with and without water molecules (RSVw, RSV, VOw, and VO) were compared for 554 non-homologous proteins. RSV (without water) showed better Pearson's correlations with CS than did RSVw, VO, or VOw values. The mean correlation coefficient between RSV and CS was 0.51, which is comparable to the correlation between RSA and CS (0.49) and that between WCN and CS (0.56). RSV is a robust structural descriptor with and without water molecules and can quantitatively reflect evolutionary information in a single protein structure. Therefore, it may represent a practical structural determinant to study protein sequence, structure, and function relationships. Copyright © 2017 Elsevier B.V. All rights reserved.
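
    The RSV definition given in the abstract translates directly into code, as does the Pearson correlation used to compare a structural profile with conservation scores. This is a minimal sketch: the function names and the example volumes are hypothetical, and computing Voronoi and van der Waals volumes from a structure is assumed to be done elsewhere.

```python
from math import sqrt


def relative_space_of_voronoi(voronoi_volume, vdw_volume):
    """RSV = (Voronoi volume - van der Waals volume) / Voronoi volume."""
    if voronoi_volume <= 0:
        raise ValueError("Voronoi volume must be positive")
    return (voronoi_volume - vdw_volume) / voronoi_volume


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical volumes for one residue, in cubic angstroms.
rsv_example = relative_space_of_voronoi(200.0, 150.0)
```

    A per-residue RSV profile computed this way could be passed to `pearson` together with per-residue conservation scores to obtain the kind of correlation the abstract reports.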

  5. Increasing Prion Propensity by Hydrophobic Insertion

    PubMed Central

    Petri, Michelina; Flores, Noe; Rogge, Ryan A.; Cascarina, Sean M.; Ross, Eric D.

    2014-01-01

    Prion formation involves the conversion of proteins from a soluble form into an infectious amyloid form. Most yeast prion proteins contain glutamine/asparagine-rich regions that are responsible for prion aggregation. Prion formation by these domains is driven primarily by amino acid composition, not primary sequence, yet there is a surprising disconnect between the amino acids thought to have the highest aggregation propensity and those that are actually found in yeast prion domains. Specifically, a recent mutagenic screen suggested that both aromatic and non-aromatic hydrophobic residues strongly promote prion formation. However, while aromatic residues are common in yeast prion domains, non-aromatic hydrophobic residues are strongly under-represented. Here, we directly test the effects of hydrophobic and aromatic residues on prion formation. Remarkably, we found that insertion of as few as two hydrophobic residues resulted in a multiple orders-of-magnitude increase in prion formation, and significant acceleration of in vitro amyloid formation. Thus, insertion or deletion of hydrophobic residues provides a simple tool to control the prion activity of a protein. These data, combined with bioinformatics analysis, suggest a limit on the number of strongly prion-promoting residues tolerated in glutamine/asparagine-rich domains. This limit may explain the under-representation of non-aromatic hydrophobic residues in yeast prion domains. Prion activity requires not only that a protein be able to form prion fibers, but also that these fibers be cleaved to generate new independently-segregating aggregates to offset dilution by cell division. Recent studies suggest that aromatic residues, but not non-aromatic hydrophobic residues, support the fiber cleavage step. 
    Therefore, we propose that while both aromatic and non-aromatic hydrophobic residues promote prion formation, aromatic residues are favored in yeast prion domains because they serve a dual function, promoting both prion formation and chaperone-dependent prion propagation. PMID:24586661

  6. Thermal adaptation analyzed by comparison of protein sequences from mesophilic and extremely thermophilic Methanococcus species

    NASA Technical Reports Server (NTRS)

    Haney, P. J.; Badger, J. H.; Buldak, G. L.; Reich, C. I.; Woese, C. R.; Olsen, G. J.

    1999-01-01

    The genome sequence of the extremely thermophilic archaeon Methanococcus jannaschii provides a wealth of data on proteins from a thermophile. In this paper, sequences of 115 proteins from M. jannaschii are compared with their homologs from mesophilic Methanococcus species. Although the growth temperatures of the mesophiles are about 50 degrees C below that of M. jannaschii, their genomic G+C contents are nearly identical. The properties most correlated with the proteins of the thermophile include higher residue volume, higher residue hydrophobicity, more charged amino acids (especially Glu, Arg, and Lys), and fewer uncharged polar residues (Ser, Thr, Asn, and Gln). These are recurring themes, with all trends applying to 83-92% of the proteins for which complete sequences were available. Nearly all of the amino acid replacements most significantly correlated with the temperature change are the same relatively conservative changes observed in all proteins, but in the case of the mesophile/thermophile comparison there is a directional bias. We identify 26 specific pairs of amino acids with a statistically significant (P < 0.01) preferred direction of replacement.
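
    The composition trends and the directional replacement bias described above can be illustrated with a small sketch. It is not the authors' pipeline: the abstract does not say which statistical test was used, so an exact two-sided sign test is assumed here, and the function names and residue groupings in code are illustrative choices taken from the residue lists in the abstract.

```python
from math import comb

# Residue groups named in the abstract (one-letter codes).
CHARGED = set("ERK")            # Glu, Arg, Lys
UNCHARGED_POLAR = set("STNQ")   # Ser, Thr, Asn, Gln


def fraction(seq, residues):
    """Fraction of positions in seq whose residue belongs to `residues`."""
    seq = seq.upper()
    return sum(aa in residues for aa in seq) / len(seq)


def directional_counts(aligned_meso, aligned_thermo, a, b):
    """Count a->b and b->a replacements, read mesophile -> thermophile,
    over two equal-length aligned sequences."""
    forward = backward = 0
    for m, t in zip(aligned_meso, aligned_thermo):
        if m == a and t == b:
            forward += 1
        elif m == b and t == a:
            backward += 1
    return forward, backward


def two_sided_sign_test(forward, backward):
    """Exact two-sided binomial test of forward vs. backward counts
    under the null hypothesis of no preferred direction (p = 0.5).
    Assumed test; the abstract does not name the one actually used."""
    n = forward + backward
    if n == 0:
        return 1.0
    k = min(forward, backward)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)
```

    With this sketch, a pair of residues replaced ten times in one direction and never in the other gives a P value of about 0.002, below the 0.01 cutoff quoted in the abstract, whereas five replacements each way gives 1.0.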

  7. Joint Frequency-Domain Equalization and Despreading for Multi-Code DS-CDMA Using Cyclic Delay Transmit Diversity

    NASA Astrophysics Data System (ADS)

    Yamamoto, Tetsuya; Takeda, Kazuki; Adachi, Fumiyuki

    Frequency-domain equalization (FDE) based on the minimum mean square error (MMSE) criterion can provide a better bit error rate (BER) performance than rake combining. To further improve the BER performance, cyclic delay transmit diversity (CDTD) can be used. CDTD simultaneously transmits the same signal from different antennas after adding different cyclic delays to increase the number of equivalent propagation paths. Although a joint use of CDTD and MMSE-FDE for direct sequence code division multiple access (DS-CDMA) achieves larger frequency diversity gain, the BER performance improvement is limited by the residual inter-chip interference (ICI) after FDE. In this paper, we propose joint FDE and despreading for DS-CDMA using CDTD. Equalization and despreading are simultaneously performed in the frequency-domain to suppress the residual ICI after FDE. A theoretical conditional BER analysis is presented for the given channel condition. The BER analysis is confirmed by computer simulation.

  8. Asymmetric Preorganization of Inverted Pair Residues in the Sodium-Calcium Exchanger

    PubMed Central

    Giladi, Moshe; Almagor, Lior; van Dijk, Liat; Hiller, Reuben; Man, Petr; Forest, Eric; Khananshvili, Daniel

    2016-01-01

    In analogy with many other proteins, Na+/Ca2+ exchangers (NCX) adopt an inverted twofold symmetry of repeated structural elements, while exhibiting a functional asymmetry by stabilizing an outward-facing conformation. Here, structure-based mutant analyses of the Methanococcus jannaschii Na+/Ca2+ exchanger (NCX_Mj) were performed in conjunction with HDX-MS (hydrogen/deuterium exchange mass spectrometry) to identify the structure-dynamic determinants of functional asymmetry. HDX-MS identified hallmark differences in backbone dynamics at ion-coordinating residues of apo-NCX_Mj, whereas Na+ or Ca2+ binding to the respective sites induced relatively small, but specific, changes in backbone dynamics. Mutant analysis identified ion-coordinating residues affecting the catalytic capacity (kcat/Km), but not the stability of the outward-facing conformation. In contrast, distinct “noncatalytic” residues (adjacent to the ion-coordinating residues) control the stability of the outward-facing conformation, but not the catalytic capacity. The helix-breaking signature sequences (GTSLPE) on the α1 and α2 repeats (at the ion-binding core) differ in their folding/unfolding dynamics, while providing asymmetric contributions to transport activities. The present data strongly support the idea that asymmetric preorganization of the ligand-free ion-pocket predefines catalytic reorganization of ion-bound residues, where secondary interactions with adjacent residues couple the alternating access. These findings provide a structure-dynamic basis for ion-coupled alternating access in NCX and similar proteins. PMID:26876271

  9. Coordinate action of distinct sequence elements localizes checkpoint kinase Hsl1 to the septin collar at the bud neck in Saccharomyces cerevisiae

    PubMed Central

    Finnigan, Gregory C.; Sterling, Sarah M.; Duvalyan, Angela; Liao, Elizabeth N.; Sargsyan, Aspram; Garcia, Galo; Nogales, Eva; Thorner, Jeremy

    2016-01-01

    Passage through the eukaryotic cell cycle requires processes that are tightly regulated both spatially and temporally. Surveillance mechanisms (checkpoints) exert quality control and impose order on the timing and organization of downstream events by impeding cell cycle progression until the necessary components are available and undamaged and have acted in the proper sequence. In budding yeast, a checkpoint exists that does not allow timely execution of the G2/M transition unless and until a collar of septin filaments has properly assembled at the bud neck, which is the site where subsequent cytokinesis will occur. An essential component of this checkpoint is the large (1518-residue) protein kinase Hsl1, which localizes to the bud neck only if the septin collar has been correctly formed. Hsl1 reportedly interacts with particular septins; however, the precise molecular determinants in Hsl1 responsible for its recruitment to this cellular location during G2 have not been elucidated. We performed a comprehensive mutational dissection and accompanying image analysis to identify the sequence elements within Hsl1 responsible for its localization to the septins at the bud neck. Unexpectedly, we found that this targeting is multipartite. A segment of the central region of Hsl1 (residues 611–950), composed of two tandem, semiredundant but distinct septin-associating elements, is necessary and sufficient for binding to septin filaments both in vitro and in vivo. However, in addition to 611–950, efficient localization of Hsl1 to the septin collar in the cell obligatorily requires generalized targeting to the cytosolic face of the plasma membrane, a function normally provided by the C-terminal phosphatidylserine-binding KA1 domain (residues 1379–1518) in Hsl1 but that can be replaced by other, heterologous phosphatidylserine-binding sequences. PMID:27193302

  10. C-terminal amino acid residue loss for deprotonated peptide ions containing glutamic acid, aspartic acid, or serine residues at the C-terminus.

    PubMed

    Li, Zhong; Yalcin, Talat; Cassady, Carolyn J

    2006-07-01

    Deprotonated peptides containing C-terminal glutamic acid, aspartic acid, or serine residues were studied by sustained off-resonance irradiation collision-induced dissociation (SORI-CID) in a Fourier transform ion cyclotron resonance (FT-ICR) mass spectrometer with ion production by electrospray ionization (ESI). Additional studies were performed by post source decay (PSD) in a matrix-assisted laser desorption ionization/time-of-flight (MALDI/TOF) mass spectrometer. This work included both model peptides synthesized in our laboratory and bioactive peptides with more complex sequences. During SORI-CID and PSD, [M - H]- and [M - 2H]2- underwent an unusual cleavage corresponding to the elimination of the C-terminal residue. Two mechanisms are proposed to occur. They involve nucleophilic attack on the carbonyl carbon of the adjacent residue by either the carboxylate group of the C-terminus or the side chain carboxylate group of C-terminal glutamic acid and aspartic acid residues. To confirm the proposed mechanisms, AAAAAD was labelled by 18O specifically on the side chain of the aspartic acid residue. For peptides that contain multiple C-terminal glutamic acid residues, each of these residues can be sequentially eliminated from the deprotonated ions; a driving force may be the formation of a very stable pyroglutamic acid neutral. For peptides with multiple aspartic acid residues at the C-terminus, aspartic acid residue loss is not sequential. For peptides with multiple serine residues at the C-terminus, C-terminal residue loss is sequential; however, abundant loss of other neutral molecules also occurs. In addition, the presence of basic residues (arginine or lysine) in the sequence has no effect on C-terminal residue elimination in the negative ion mode.

  11. Protein consensus-based surface engineering (ProCoS): a computer-assisted method for directed protein evolution.

    PubMed

    Shivange, Amol V; Hoeffken, Hans Wolfgang; Haefner, Stefan; Schwaneberg, Ulrich

    2016-12-01

    Protein consensus-based surface engineering (ProCoS) is a simple and efficient method for directed protein evolution combining computational analysis and molecular biology tools to engineer protein surfaces. ProCoS is based on the hypothesis that conserved residues originated from a common ancestor and that these residues are crucial for the function of a protein, whereas highly variable regions (situated on the surface of a protein) can be targeted for surface engineering to maximize performance. ProCoS comprises four main steps: (i) identification of conserved and highly variable regions; (ii) protein sequence design by substituting residues in the highly variable regions, and gene synthesis; (iii) in vitro DNA recombination of synthetic genes; and (iv) screening for active variants. ProCoS is a simple method for surface mutagenesis in which multiple sequence alignment is used for selection of surface residues based on a structural model. To demonstrate the technique's utility for directed evolution, the surface of a phytase enzyme from Yersinia mollaretii (Ymphytase) was subjected to ProCoS. Screening just 1050 clones from ProCoS engineering-guided mutant libraries yielded an enzyme with 34 amino acid substitutions. The surface-engineered Ymphytase exhibited 3.8-fold higher pH stability (at pH 2.8 for 3 h) and retained 40% of the enzyme's specific activity (400 U/mg) compared with the wild-type Ymphytase. The pH stability might be attributed to a significantly increased (20 percentage points; from 9% to 29%) number of negatively charged amino acids on the surface of the engineered phytase.
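
    The first ProCoS step, separating conserved from highly variable alignment columns, can be sketched as follows. This is a hedged illustration only: the abstract does not state how conservation is scored, so per-column Shannon entropy is assumed here, and the function names, the 1.0-bit threshold, and the toy alignment are all hypothetical. The structural-model check that restricts candidates to surface residues is not covered.

```python
from collections import Counter
from math import log2


def column_entropy(column):
    """Shannon entropy (bits) of residue frequencies in one alignment
    column; gap characters ('-') are ignored."""
    residues = [aa for aa in column if aa != "-"]
    if not residues:
        return 0.0
    counts = Counter(residues)
    total = len(residues)
    return -sum((n / total) * log2(n / total) for n in counts.values())


def classify_columns(alignment, variable_threshold=1.0):
    """Label each column of an equal-length alignment as 'conserved'
    (entropy 0), 'variable' (entropy >= threshold), or 'intermediate'.
    The threshold is an illustrative assumption, not a ProCoS parameter."""
    labels = []
    for column in zip(*alignment):
        h = column_entropy(column)
        if h == 0.0:
            labels.append("conserved")
        elif h >= variable_threshold:
            labels.append("variable")
        else:
            labels.append("intermediate")
    return labels


# Hypothetical four-sequence alignment, for illustration only.
toy_alignment = ["MKDE", "MKEE", "MRAE", "MKSE"]
print(classify_columns(toy_alignment))
```

    In this toy case the third column, where every sequence carries a different residue, is the only one flagged as variable and would be the candidate for substitution in step (ii).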

  12. A Bioinformatics Approach to the Structure, Function, and Evolution of the Nucleoprotein of the Order Mononegavirales

    PubMed Central

    Cleveland, Sean B.; Davies, John; McClure, Marcella A.

    2011-01-01

    The goal of this bioinformatic study is to investigate sequence conservation in relation to evolutionary function/structure of the nucleoprotein of the order Mononegavirales. In the combined analysis of 63 representative nucleoprotein (N) sequences from four viral families (Bornaviridae, Filoviridae, Rhabdoviridae, and Paramyxoviridae) we predict the regions of protein disorder, intra-residue contact and co-evolving residues. Correlations between location and conservation of predicted regions illustrate a strong division between families while highlighting conservation within individual families. These results suggest that the conserved regions among the nucleoproteins, specifically within Rhabdoviridae and Paramyxoviridae, but also generally among all members of the order, reflect an evolutionary advantage in maintaining these sites for the viral nucleoprotein as part of the transcription/replication machinery. Results indicate conservation for disorder in the C-terminus region of the representative proteins that is important for interacting with the phosphoprotein and the large subunit polymerase during transcription and replication. Additionally, the C-terminus region of the protein preceding the disordered region is predicted to be important for interacting with the encapsidated genome. Portions of the N-terminus are responsible for N:N stability and interactions identified by the presence or lack of co-evolving intra-protein contact predictions. The validation of these prediction results by current structural information illustrates the benefits of the Disorder, Intra-residue contact and Compensatory mutation Correlator (DisICC) pipeline as a method for quickly characterizing proteins and providing the most likely residues and regions necessary to target for disruption in viruses that have little structural information available. PMID:21559282

  13. Re-Introduction of Transmembrane Serine Residues Reduce the Minimum Pore Diameter of Channelrhodopsin-2

    PubMed Central

    Richards, Ryan; Dempski, Robert E.

    2012-01-01

    Channelrhodopsin-2 (ChR2) is a microbial-type rhodopsin found in the green alga Chlamydomonas reinhardtii. Under physiological conditions, ChR2 is an inwardly rectifying cation channel that permeates a wide range of mono- and divalent cations. Although this protein shares a high sequence homology with other microbial-type rhodopsins, which are ion pumps, ChR2 is an ion channel. A sequence alignment of ChR2 with bacteriorhodopsin, a proton pump, reveals that ChR2 lacks specific motifs and residues, such as serine and threonine, known to contribute to non-covalent interactions within transmembrane domains. We hypothesized that reintroduction of the eight transmembrane serine residues present in bacteriorhodopsin, but not in ChR2, will restrict the conformational flexibility and reduce the pore diameter of ChR2. In this work, eight single serine mutations were created at homologous positions in ChR2. Additionally, an endogenous transmembrane serine was replaced with alanine. We measured kinetics, changes in reversal potential, and permeability ratios in different alkali metal solutions using two-electrode voltage clamp. Applying excluded volume theory, we calculated the minimum pore diameter of ChR2 constructs. An analysis of the results from our experiments shows that reintroducing serine residues into the transmembrane domain of ChR2 can restrict the minimum pore diameter through inter- and intrahelical hydrogen bonds, while the removal of a transmembrane serine results in a larger pore diameter. Therefore, multiple positions along the intracellular side of the transmembrane domains contribute to the cation permeability of ChR2. PMID:23185520

  14. The Effects of Trivalent Lanthanide Cationization on the Electron Transfer Dissociation of Acidic Fibrinopeptide B and its Analogs

    NASA Astrophysics Data System (ADS)

    Commodore, Juliette J.; Cassady, Carolyn J.

    2016-09-01

    Electrospray ionization (ESI) on mixtures of acidic fibrinopeptide B and two peptide analogs with trivalent lanthanide salts generates [M + Met + H]4+, [M + Met]3+, and [M + Met - H]2+, where M = peptide and Met = metal (except radioactive promethium). These ions undergo extensive and highly efficient electron transfer dissociation (ETD) to form metallated and non-metallated c- and z-ions. All metal adducted product ions contain at least two acidic sites, which suggests attachment of the lanthanide cation at the side chains of one or more acidic residues. The three peptides undergo similar fragmentation. ETD on [M + Met + H]4+ leads to cleavage at every residue; the presence of both a metal ion and an extra proton is very effective in promoting sequence-informative fragmentation. Backbone dissociation of [M + Met]3+ is also extensive, although cleavage does not always occur between adjacent glutamic acid residues. For [M + Met - H]2+, a more limited range of product ions form. All lanthanide metal peptide complexes display similar fragmentation except for europium (Eu). ETD on [M + Eu - H]2+ and [M + Eu]3+ yields a limited amount of peptide backbone cleavage; however, [M + Eu + H]4+ dissociates extensively with cleavage at every residue. With the exception of the results for Eu(III), metallated peptide ion formation by ESI, ETD fragmentation efficiencies, and product ion formation are unaffected by the identity of the lanthanide cation. Adduction with trivalent lanthanide metal ions is a promising tool for sequence analysis of acidic peptides by ETD.

  15. The role of plastic β-hairpin and weak hydrophobic core in the stability and unfolding of a full sequence design protein

    NASA Astrophysics Data System (ADS)

    Lei, Hongxing; Duan, Yong

    2004-12-01

    In this study, the thermal stability of a designed α/β protein FSD (full sequence design) was studied by explicit solvent simulations at three moderate temperatures, 273 K, 300 K, and 330 K. The average properties of the ten trajectories at each temperature were analyzed. The thermal unfolding, as judged by backbone root-mean-square deviation and percentage of native contacts, was displayed with increased sampling outside of the native basin as the temperature was raised. The positional fluctuation of the hairpin residues was significantly higher than that of the helix residues at all three temperatures. The hairpin segment displayed certain plasticity even at 273 K. Apart from the terminal residues, the highest fluctuation was shown in the turn residues 7-9. Secondary structure analysis manifested the structural heterogeneity of the hairpin segment. It was also revealed by the simulation that the hydrophobic core was vulnerable to thermal denaturation. Consistent with the experiment, the I7Y mutation in the double mutant FSD-EY (FSD with mutations Q1E and I7Y) dramatically increased the protein stability in the simulation, suggesting that the plasticity of the hairpin can be partially compensated by a stronger hydrophobic core. As for the unfolding pathway, the breathing of the hydrophobic core and the separation of the two secondary structure elements (α helix and β hairpin) was the initiation step of the unfolding. The loss of global contacts from the separation further destabilized the hairpin structure and also led to the unwinding of the helix.
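
    The per-residue positional fluctuation compared above for hairpin and helix residues is commonly computed as a root-mean-square fluctuation (RMSF) about the mean position. The sketch below shows that calculation under stated assumptions: one coordinate per residue (for example the C-alpha atom), frames already superimposed on a common reference, and a hypothetical function name. The abstract does not specify which atoms or fitting procedure the authors used.

```python
from math import sqrt


def rmsf(frames):
    """Per-residue root-mean-square fluctuation about the mean position.

    frames: list of frames, each a list of (x, y, z) tuples with one
    entry per residue. Frames are assumed to be already superimposed;
    no fitting is done here.
    """
    n_frames = len(frames)
    n_res = len(frames[0])
    result = []
    for i in range(n_res):
        # Mean position of residue i over all frames.
        mean = [sum(f[i][d] for f in frames) / n_frames for d in range(3)]
        # Mean squared displacement from that mean position.
        msd = sum(
            sum((f[i][d] - mean[d]) ** 2 for d in range(3)) for f in frames
        ) / n_frames
        result.append(sqrt(msd))
    return result
```

    Averaging such a profile over the ten trajectories at each temperature would give the kind of per-residue comparison reported in the abstract, with larger values expected for the turn residues than for the helix.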

  16. The role of plastic beta-hairpin and weak hydrophobic core in the stability and unfolding of a full sequence design protein.

    PubMed

    Lei, Hongxing; Duan, Yong

    2004-12-15

    In this study, the thermal stability of a designed alpha/beta protein FSD (full sequence design) was studied by explicit solvent simulations at three moderate temperatures, 273 K, 300 K, and 330 K. The average properties of the ten trajectories at each temperature were analyzed. The thermal unfolding, as judged by backbone root-mean-square deviation and percentage of native contacts, was displayed with increased sampling outside of the native basin as the temperature was raised. The positional fluctuation of the hairpin residues was significantly higher than that of the helix residues at all three temperatures. The hairpin segment displayed certain plasticity even at 273 K. Apart from the terminal residues, the highest fluctuation was shown in the turn residues 7-9. Secondary structure analysis manifested the structural heterogeneity of the hairpin segment. It was also revealed by the simulation that the hydrophobic core was vulnerable to thermal denaturation. Consistent with the experiment, the I7Y mutation in the double mutant FSD-EY (FSD with mutations Q1E and I7Y) dramatically increased the protein stability in the simulation, suggesting that the plasticity of the hairpin can be partially compensated by a stronger hydrophobic core. As for the unfolding pathway, the breathing of the hydrophobic core and the separation of the two secondary structure elements (alpha helix and beta hairpin) was the initiation step of the unfolding. The loss of global contacts from the separation further destabilized the hairpin structure and also led to the unwinding of the helix. (c) 2004 American Institute of Physics

  17. Soil Organic Matter Quality of an Oxisol Affected by Plant Residues and Crop Sequence under No-Tillage

    NASA Astrophysics Data System (ADS)

    Cora, Jose; Marcelo, Adolfo

    2013-04-01

    Plant residues are considered the primary resource for soil organic matter (SOM) formation, and the amounts and properties of plant litter are important controlling factors for SOM quality. We determined the amounts, quality and decomposition rate of plant residues and the effects of summer and winter crop sequences on soil organic C (TOC) content, both particulate organic C (POC) and mineral-associated organic C (MOC) pools and humic substances in a Brazilian Rhodic Eutrudox soil under a no-tillage system. The organic C analysis in specific pools used in this study was effective and should be adopted in tropical climates to evaluate the soil quality and the sustainability of various cropping systems. Continuous growth of soybean (Glycine max L. Merrill) in summer provided higher contents of soil POC, and continuous growth of maize (Zea mays L.) provided higher soil humic acid and MOC contents. Summer soybean-maize rotation provided higher plant diversity, which likely improved the soil microbial activity and the soil organic C consumption. The winter sunn hemp (Crotalaria juncea L.), pigeon pea (Cajanus cajan (L.) Millsp), oilseed radish (Raphanus sativus L.) and pearl millet (Pennisetum americanum (L.) Leeke) enhanced the soil MOC, a finding that is attributable to the higher N content of the crop residue. Sunn hemp and pigeon pea provided higher soil POC content. Sunn hemp showed better performance and positive effects on the SOM quality, making it a suitable winter crop choice for tropical conditions with a warm and dry winter.

  18. Basis for substrate recognition and distinction by matrix metalloproteinases

    PubMed Central

    Ratnikov, Boris I.; Cieplak, Piotr; Gramatikoff, Kosi; Pierce, James; Eroshkin, Alexey; Igarashi, Yoshinobu; Kazanov, Marat; Sun, Qing; Godzik, Adam; Osterman, Andrei; Stec, Boguslaw; Strongin, Alex; Smith, Jeffrey W.

    2014-01-01

    Genomic sequencing and structural genomics produced a vast amount of sequence and structural data, creating an opportunity for structure–function analysis in silico [Radivojac P, et al. (2013) Nat Methods 10(3):221–227]. Unfortunately, only a few large experimental datasets exist to serve as benchmarks for function-related predictions. Furthermore, currently there are no reliable means to predict the extent of functional similarity among proteins. Here, we quantify structure–function relationships among three phylogenetic branches of the matrix metalloproteinase (MMP) family by comparing their cleavage efficiencies toward an extended set of phage peptide substrates that were selected from ∼64 million peptide sequences (i.e., a large unbiased representation of substrate space). The observed second-order rate constants [k(obs)] across the substrate space provide a distance measure of functional similarity among the MMPs. These functional distances directly correlate with MMP phylogenetic distance. There is also a remarkable and near-perfect correlation between the MMP substrate preference and sequence identity of 50–57 discontinuous residues surrounding the catalytic groove. We conclude that these residues represent the specificity-determining positions (SDPs) that allowed for the expansion of MMP proteolytic function during evolution. A transmutation of only a few selected SDPs proximal to the bound substrate peptide, and contributing the most to selectivity among the MMPs, is sufficient to enact a global change in the substrate preference of one MMP to that of another, indicating the potential for the rational and focused redesign of cleavage specificity in MMPs. PMID:25246591

  19. Structural Insight into and Mutational Analysis of Family 11 Xylanases: Implications for Mechanisms of Higher pH Catalytic Adaptation.

    PubMed

    Bai, Wenqin; Zhou, Cheng; Zhao, Yueju; Wang, Qinhong; Ma, Yanhe

    2015-01-01

    To understand the molecular basis of higher pH catalytic adaptation of family 11 xylanases, we compared the structures of alkaline, neutral, and acidic active xylanases and analyzed mutants of xylanase Xyn11A-LC from alkalophilic Bacillus sp. SN5. It was revealed that alkaline active xylanases have increased charged residue content, an increased ratio of negatively to positively charged residues, and decreased Ser, Thr, and Tyr residue content relative to non-alkaline active counterparts. Between strands β6 and β7, alkaline xylanases substitute an α-helix for a coil or turn found in their non-alkaline counterparts. Compared with non-alkaline xylanases, alkaline active enzymes have an inserted stretch of seven amino acids rich in charged residues, which may be beneficial for xylanase function in alkaline conditions. Positively charged residues on the molecular surface and ionic bonds may play important roles in higher pH catalytic adaptation of family 11 xylanases. By structure comparison, sequence alignment and mutational analysis, six amino acids (Glu16, Trp18, Asn44, Leu46, Arg48, and Ser187, numbering based on Xyn11A-LC) adjacent to the acid/base catalyst were found to be responsible for xylanase function in higher pH conditions. Our results will contribute to understanding the molecular mechanisms of higher pH catalytic adaptation in family 11 xylanases and engineering xylanases to suit industrial applications.

  20. Structural Insight into and Mutational Analysis of Family 11 Xylanases: Implications for Mechanisms of Higher pH Catalytic Adaptation

    PubMed Central

    Bai, Wenqin; Zhou, Cheng; Zhao, Yueju; Wang, Qinhong; Ma, Yanhe

    2015-01-01

    To understand the molecular basis of higher pH catalytic adaptation of family 11 xylanases, we compared the structures of alkaline, neutral, and acidic active xylanases and analyzed mutants of xylanase Xyn11A-LC from alkalophilic Bacillus sp. SN5. It was revealed that alkaline active xylanases have increased charged residue content, an increased ratio of negatively to positively charged residues, and decreased Ser, Thr, and Tyr residue content relative to non-alkaline active counterparts. Between strands β6 and β7, alkaline xylanases substitute an α-helix for a coil or turn found in their non-alkaline counterparts. Compared with non-alkaline xylanases, alkaline active enzymes have an inserted stretch of seven amino acids rich in charged residues, which may be beneficial for xylanase function in alkaline conditions. Positively charged residues on the molecular surface and ionic bonds may play important roles in higher pH catalytic adaptation of family 11 xylanases. By structure comparison, sequence alignment and mutational analysis, six amino acids (Glu16, Trp18, Asn44, Leu46, Arg48, and Ser187, numbering based on Xyn11A-LC) adjacent to the acid/base catalyst were found to be responsible for xylanase function in higher pH conditions. Our results will contribute to understanding the molecular mechanisms of higher pH catalytic adaptation in family 11 xylanases and engineering xylanases to suit industrial applications. PMID:26161643

  1. The amino acid sequences of carboxypeptidases I and II from Aspergillus niger and their stability in the presence of divalent cations.

    PubMed

    Svendsen, I; Dal Degan, F

    1998-09-08

    The amino acid sequences of serine carboxypeptidases I (CPD-I) and II (CPD-II) from Aspergillus niger have been determined by conventional Edman degradation of the reduced and vinylpyridinated enzymes and of peptides thereof generated by cleavage with cyanogen bromide, iodobenzoic acid, glutamic acid cleaving enzyme, AspN-endoproteinase and EndoLysC proteinase. CPD-I consists of a single peptide chain of 471 amino acid residues, three disulfide bridges and nine N-glycosylated asparaginyl residues, while CPD-II consists of a single peptide chain of 481 amino acid residues, has three disulfide bridges, one free cysteinyl residue and nine glycosylated asparaginyl residues. The enzymes are closely related to carboxypeptidase S3 from Penicillium janthinellum. Both Ca2+ and Mg2+ stabilize CPD-I as well as CPD-II at basic pH values, Ca2+ being most effective, while the divalent ions have no effect on the activity of the two enzymes.

  2. Sequence-Based Prediction of RNA-Binding Residues in Proteins.

    PubMed

    Walia, Rasna R; El-Manzalawy, Yasser; Honavar, Vasant G; Dobbs, Drena

    2017-01-01

    Identifying individual residues in the interfaces of protein-RNA complexes is important for understanding the molecular determinants of protein-RNA recognition and has many potential applications. Recent technical advances have led to several high-throughput experimental methods for identifying partners in protein-RNA complexes, but determining RNA-binding residues in proteins is still expensive and time-consuming. This chapter focuses on available computational methods for identifying which amino acids in an RNA-binding protein participate directly in contacting RNA. Step-by-step protocols for using three different web-based servers to predict RNA-binding residues are described. In addition, currently available web servers and software tools for predicting RNA-binding sites, as well as databases that contain valuable information about known protein-RNA complexes, RNA-binding motifs in proteins, and protein-binding recognition sites in RNA are provided. We emphasize sequence-based methods that can reliably identify interfacial residues without the requirement for structural information regarding either the RNA-binding protein or its RNA partner.

  3. Sequence-Based Prediction of RNA-Binding Residues in Proteins

    PubMed Central

    Walia, Rasna R.; EL-Manzalawy, Yasser; Honavar, Vasant G.; Dobbs, Drena

    2017-01-01

    Identifying individual residues in the interfaces of protein–RNA complexes is important for understanding the molecular determinants of protein–RNA recognition and has many potential applications. Recent technical advances have led to several high-throughput experimental methods for identifying partners in protein–RNA complexes, but determining RNA-binding residues in proteins is still expensive and time-consuming. This chapter focuses on available computational methods for identifying which amino acids in an RNA-binding protein participate directly in contacting RNA. Step-by-step protocols for using three different web-based servers to predict RNA-binding residues are described. In addition, currently available web servers and software tools for predicting RNA-binding sites, as well as databases that contain valuable information about known protein–RNA complexes, RNA-binding motifs in proteins, and protein-binding recognition sites in RNA are provided. We emphasize sequence-based methods that can reliably identify interfacial residues without the requirement for structural information regarding either the RNA-binding protein or its RNA partner. PMID:27787829

  4. Identification of residues within the African swine fever virus DP71L protein required for dephosphorylation of translation initiation factor eIF2α and inhibiting activation of pro-apoptotic CHOP

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Barber, Claire; Netherton, Chris; Goatley, Lynnett

    The African swine fever virus DP71L protein recruits protein phosphatase 1 (PP1) to dephosphorylate the translation initiation factor 2α (eIF2α) and avoid shut-off of global protein synthesis and downstream activation of the pro-apoptotic factor CHOP. Residues V16 and F18A were critical for binding of DP71L to PP1. Mutation of this PP1 binding motif or deletion of residues between 52 and 66 reduced the ability of DP71L to cause dephosphorylation of eIF2α and inhibit CHOP induction. The residues LSAVL, between 57 and 61, were also required. PP1 was co-precipitated with wild type DP71L and the mutant lacking residues 52-66 or the LSAVL motif, but not with the PP1 binding motif mutant. The residues in the LSAVL motif play a critical role in DP71L function but do not interfere with binding to PP1. Instead we propose these residues are important for DP71L binding to eIF2α.
    Highlights:
    • The African swine fever virus DP71L protein recruits protein phosphatase 1 (PP1) to dephosphorylate translation initiation factor eIF2α.
    • The residues V16 and F18 of DP71L are required for binding to the α, β and γ isoforms of PP1 and for DP71L function.
    • The sequence LSAVL downstream from the PP1 binding site (residues 57–61) is also important for DP71L function.
    • DP71L mutants of the LSAVL sequence retain the ability to co-precipitate with PP1, showing that this sequence has a role distinct from PP1 binding.

  5. Protein Solvent-Accessibility Prediction by a Stacked Deep Bidirectional Recurrent Neural Network.

    PubMed

    Zhang, Buzhong; Li, Linqing; Lü, Qiang

    2018-05-25

    Residue solvent accessibility is closely related to the spatial arrangement and packing of residues. Predicting the solvent accessibility of a protein is an important step to understand its structure and function. In this work, we present a deep learning method to predict residue solvent accessibility, which is based on a stacked deep bidirectional recurrent neural network applied to sequence profiles. To capture more long-range sequence information, a merging operator was proposed when bidirectional information from hidden nodes was merged for outputs. Three types of merging operators were used in our improved model, with a long short-term memory network performing as a hidden computing node. The trained database was constructed from 7361 proteins extracted from the PISCES server using a cut-off of 25% sequence identity. Sequence-derived features including position-specific scoring matrix, physical properties, physicochemical characteristics, conservation score and protein coding were used to represent a residue. Using this method, predictive values of continuous relative solvent-accessible area were obtained, and then, these values were transformed into binary states with predefined thresholds. Our experimental results showed that our deep learning method improved prediction quality relative to current methods, with mean absolute error and Pearson's correlation coefficient values of 8.8% and 74.8%, respectively, on the CB502 dataset and 8.2% and 78%, respectively, on the Manesh215 dataset.
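
    The evaluation steps named in this abstract (thresholding continuous relative solvent accessibility into binary states, then scoring predictions by mean absolute error and Pearson's correlation coefficient) can be sketched as follows. This is a minimal illustration, not the authors' code: the 0.25 threshold and the sample values are assumptions, since the abstract only mentions "predefined thresholds".

```python
import math


def binarize_rsa(rsa_values, threshold=0.25):
    """Label residues as exposed (1) or buried (0).

    The 0.25 cut-off is an assumed example value, not one from the abstract.
    """
    return [1 if value >= threshold else 0 for value in rsa_values]


def mean_absolute_error(predicted, observed):
    """Average absolute difference between predicted and observed values."""
    return sum(abs(p - o) for p, o in zip(predicted, observed)) / len(predicted)


def pearson_correlation(xs, ys):
    """Pearson's correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / math.sqrt(var_x * var_y)


# Hypothetical relative solvent accessibility values for four residues.
predicted = [0.10, 0.40, 0.75, 0.05]
observed = [0.15, 0.35, 0.80, 0.00]

print(binarize_rsa(predicted))
print(mean_absolute_error(predicted, observed))
print(pearson_correlation(predicted, observed))
```

    Reporting both metrics is useful because mean absolute error measures how far predictions are from the observed values, while the correlation coefficient measures how well their ranking across residues is preserved.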

  6. Cloning of a cDNA encoding bovine mitochondrial NADP(+)-specific isocitrate dehydrogenase and structural comparison with its isoenzymes from different species.

    PubMed Central

    Huh, T L; Ryu, J H; Huh, J W; Sung, H C; Oh, I U; Song, B J; Veech, R L

    1993-01-01

    Mitochondrial NADP(+)-specific isocitrate dehydrogenase (IDP) was co-purified with the pyruvate dehydrogenase complex from bovine kidney mitochondria. The determination of its N-terminal 16-amino-acid sequence revealed that it is highly similar to the IDP from yeast. A cDNA clone (1.8 kb long) encoding this protein was isolated from a bovine kidney lambda gt11 cDNA library using a synthetic oligodeoxynucleotide. The deduced protein sequence of this cDNA clone rendered a precursor protein of 452 amino-acid residues (50,830 Da) and a mature protein of 413 amino-acid residues (46,519 Da). It is 100% identical to the internal tryptic peptide sequences of the autologous form from pig heart and 62% similar to that from yeast. However, it shares little similarity with the mitochondrial NAD(+)-specific isoenzyme from yeast. Structural analyses of the deduced proteins of IDP isoenzymes from different species indicated that similarity exists in certain regions, which may represent the common domains for the active sites or coenzyme-binding sites. In Northern-blot analysis, one species of mRNA (about 2.2 kb for both bovine and human) was hybridized with a 32P-labelled cDNA probe. Southern-blot analysis of genomic DNAs verified simple patterns of hybridization with this cDNA. These results strongly indicate that the mitochondrial IDP may be derived from a single gene family which does not appear to be closely related to that of the NAD(+)-specific isoenzyme. PMID:8318002

  7. Multi-loci diagnosis of acute lymphoblastic leukaemia with high-throughput sequencing and bioinformatics analysis.

    PubMed

    Ferret, Yann; Caillault, Aurélie; Sebda, Shéhérazade; Duez, Marc; Grardel, Nathalie; Duployez, Nicolas; Villenet, Céline; Figeac, Martin; Preudhomme, Claude; Salson, Mikaël; Giraud, Mathieu

    2016-05-01

    High-throughput sequencing (HTS) is considered a technical revolution that has improved our knowledge of lymphoid and autoimmune diseases, changing our approach to leukaemia both at diagnosis and during follow-up. As part of an immunoglobulin/T cell receptor-based minimal residual disease (MRD) assessment of acute lymphoblastic leukaemia patients, we assessed the performance and feasibility of the replacement of the first steps of the approach based on DNA isolation and Sanger sequencing, using a HTS protocol combined with bioinformatics analysis and visualization using the Vidjil software. We prospectively analysed the diagnostic and relapse samples of 34 paediatric patients, thus identifying 125 leukaemic clones with recombinations on multiple loci (TRG, TRD, IGH and IGK), including Dd2/Dd3 and Intron/KDE rearrangements. Sequencing failures were halved (14% vs. 34%, P = 0.0007), enabling more patients to be monitored. Furthermore, more markers per patient could be monitored, reducing the probability of false negative MRD results. The whole analysis, from sample receipt to clinical validation, was shorter than our current diagnostic protocol, with equal resources. V(D)J recombination was successfully assigned by the software, even for unusual recombinations. This study emphasizes the progress that HTS with adapted bioinformatics tools can bring to the diagnosis of leukaemia patients. © 2016 John Wiley & Sons Ltd.

  8. Analysis of Infiltration-Suction Response in Unsaturated Residual Soil Slope in Gelugor, Penang

    NASA Astrophysics Data System (ADS)

    Ashraf Mohamad Ismail, Mohd; Hasliza Hamzah, Nur; Min, Ng Soon; Hazreek Zainal Abidin, Mohd; Tajudin, Saiful Azhar Ahmad; Madun, Aziman

    2018-04-01

    Rainfall infiltration on a residual soil slope may impair slope stability by altering the pore-water pressure in the soil. A study has been carried out on an unsaturated residual soil slope in Gelugor, Penang to determine the changes in matric suction of residual soils at different depths due to rainwater infiltration. The sequence of this study includes the site investigation, field instrumentation, laboratory experiment and numerical modeling. Void ratio and porosity of soil were found to be decreasing with depth while the bulk density and dry density of soil increased due to lower porosity of soil at greater depth. Soil infiltration rate and matric suction of all depths decrease with the increase of volumetric water content as well as the degree of saturation. Numerical modeling was used to verify and predict the relationship between infiltration-suction response and degree of saturation. Numerical models can be used to integrate the rainfall scenarios into quantitative landslide hazard assessments. Thus, development plans and mitigation measures can be designed for estimated impacts from hazard assessments based on collected data.

  9. Identifying functionally informative evolutionary sequence profiles.

    PubMed

    Gil, Nelson; Fiser, Andras

    2018-04-15

    Multiple sequence alignments (MSAs) can provide essential input to many bioinformatics applications, including protein structure prediction and functional annotation. However, the optimal selection of sequences to obtain biologically informative MSAs for such purposes is poorly explored, and has traditionally been performed manually. We present Selection of Alignment by Maximal Mutual Information (SAMMI), an automated, sequence-based approach to objectively select an optimal MSA from a large set of alternatives sampled from a general sequence database search. The hypothesis of this approach is that the mutual information among MSA columns will be maximal for those MSAs that contain the most diverse set possible of the most structurally and functionally homogeneous protein sequences. SAMMI was tested to select MSAs for functional site residue prediction by analysis of conservation patterns on a set of 435 proteins obtained from protein-ligand (peptides, nucleic acids and small substrates) and protein-protein interaction databases. Availability and implementation: A freely accessible program, including source code, implementing SAMMI is available at https://github.com/nelsongil92/SAMMI.git. Contact: andras.fiser@einstein.yu.edu. Supplementary data are available at Bioinformatics online.
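
    The quantity at the centre of this hypothesis, mutual information between MSA columns, can be illustrated with a short sketch. This shows only the standard pairwise definition on a toy alignment; how SAMMI aggregates or corrects such values is not described in the abstract, and the function names and the example alignment are assumptions made for illustration.

```python
import math
from collections import Counter


def column(msa, index):
    """Return the residues found at one alignment position across all sequences."""
    return [sequence[index] for sequence in msa]


def mutual_information(col_a, col_b):
    """Mutual information (in bits) between two alignment columns.

    Uses observed frequencies directly; gaps are treated as ordinary symbols
    and no small-sample correction is applied.
    """
    n = len(col_a)
    count_a = Counter(col_a)
    count_b = Counter(col_b)
    count_ab = Counter(zip(col_a, col_b))
    mi = 0.0
    for (a, b), n_ab in count_ab.items():
        p_ab = n_ab / n
        p_a = count_a[a] / n
        p_b = count_b[b] / n
        mi += p_ab * math.log2(p_ab / (p_a * p_b))
    return mi


# Toy alignment: columns 0 and 1 co-vary perfectly, column 2 varies independently.
msa = ["AKL", "AKV", "GEL", "GEV"]

print(mutual_information(column(msa, 0), column(msa, 1)))
print(mutual_information(column(msa, 0), column(msa, 2)))
```

    In the toy alignment, the first two columns carry one bit of shared information because knowing one residue fixes the other, whereas the third column is independent of the first and contributes none.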

  10. Thermodynamic model for uranium release from Hanford Site tank residual waste.

    PubMed

    Cantrell, Kirk J; Deutsch, William J; Lindberg, Mike J

    2011-02-15

    A thermodynamic model of U solid-phase solubility and paragenesis was developed for Hanford Site tank residual waste that will remain in place after tank closure. The model was developed using a combination of waste composition data, waste leach test data, and thermodynamic modeling of the leach test data. The testing and analyses were conducted using actual Hanford Site tank residual waste. Positive identification of U phases by X-ray diffraction was generally not possible either because solids in the waste were amorphous or their concentrations were not detectable by XRD for both as-received and leached residual waste. Three leachant solutions were used in the studies: deionized water, CaCO3 saturated solution, and Ca(OH)2 saturated solution. Analysis of calculated saturation indices indicates that NaUO2PO4·xH2O and Na2U2O7(am) are present in the residual wastes initially. Leaching of the residual wastes with deionized water or CaCO3 saturated solution results in preferential dissolution of Na2U2O7(am) and formation of schoepite. Leaching of the residual wastes with Ca(OH)2 saturated solution appears to result in transformation of both NaUO2PO4·xH2O and Na2U2O7(am) to CaUO4. Upon the basis of these results, the paragenetic sequence of secondary phases expected to occur as leaching of residual waste progresses for two tank closure scenarios was identified.

  11. Mapping a nucleolar targeting sequence of an RNA binding nucleolar protein, Nop25

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Fujiwara, Takashi; Suzuki, Shunji; Kanno, Motoko

    2006-06-10

    Nop25 is a putative RNA binding nucleolar protein associated with rRNA transcription. The present study was undertaken to determine the mechanism of Nop25 localization in the nucleolus. Deletion experiments of Nop25 amino acid sequence showed Nop25 to contain a nuclear targeting sequence in the N-terminal and a nucleolar targeting sequence in the C-terminal. By expressing derivative peptides from the C-terminal as GFP-fusion proteins in the cells, a lysine and arginine residue-enriched peptide (KRKHPRRAQDSTKKPPSATRTSKTQRRRR) allowed a GFP-fusion protein to be transported and fully retained in the nucleolus. When the peptide was fused with cMyc epitope and expressed in the cells, a cMyc epitope was then detected in the nucleolus. Nop25 did not localize in the nucleolus by deletion of the peptide from Nop25. Furthermore, deletion of a subdomain (KRKHPRRAQ) in the peptide or amino acid substitution of lysine and arginine residues in the subdomain resulted in the loss of Nop25 nucleolar localization. These results suggest that the lysine and arginine residue-enriched peptide is the most prominent nucleolar targeting sequence of Nop25 and that the long stretch of basic residues might play an important role in the nucleolar localization of Nop25. Although Nop25 contained putative SUMOylation, phosphorylation and glycosylation sites, the amino acid substitution in these sites had no effect on the nucleolar localization, thus suggesting that these post-translational modifications did not contribute to the localization of Nop25 in the nucleolus. The treatment of the cells, which expressed a GFP-fusion protein with a nucleolar targeting sequence of Nop25, with RNase A resulted in a complete dislocation of the protein from the nucleolus. These data suggested that the nucleolar targeting sequence might therefore play an important role in the binding of Nop25 to RNA molecules and that the RNA binding of Nop25 might be essential for the nucleolar localization of Nop25.

  12. Regulation of the Production of Infectious Genotype 1a Hepatitis C Virus by NS5A Domain III

    PubMed Central

    Kim, Seungtaek; Welsch, Christoph; Yi, MinKyung; Lemon, Stanley M.

    2011-01-01

    Although hepatitis C virus (HCV) assembly remains incompletely understood, recent studies with the genotype 2a JFH-1 strain suggest that it is dependent upon the phosphorylation of Ser residues near the C terminus of NS5A, a multifunctional nonstructural protein. Since genotype 1 viruses account for most HCV disease yet differ substantially in sequence from that of JFH-1, we studied the role of NS5A in the production of the H77S virus. While less efficient than JFH-1, genotype 1a H77S RNA produces infectious virus when transfected into permissive Huh-7 cells. The exchange of complete NS5A sequences between these viruses was highly detrimental to replication, while exchanges of the C-terminal domain III sequence (46% amino acid sequence identity) were well tolerated, with little effect on RNA synthesis. Surprisingly, the placement of the H77S domain III sequence into JFH-1 resulted in increased virus yields; conversely, H77S yields were reduced by the introduction of domain III from JFH-1. These changes in infectious virus yield correlated well with changes in the abundance of NS5A in RNA-transfected cells but not with RNA replication or core protein expression levels. Alanine replacement mutagenesis of selected Ser and Thr residues in the C-terminal domain III sequence revealed no single residue to be essential for infectious H77S virus production. However, virus production was eliminated by Ala substitutions at multiple residues and could be restored by phosphomimetic Asp substitutions at these sites. Thus, despite low overall sequence homology, the production of infectious virus is regulated similarly in JFH-1 and H77S viruses by a conserved function associated with a C-terminal Ser/Thr cluster in domain III of NS5A. PMID:21525356

  13. Library analysis of SCHEMA-guided protein recombination

    PubMed Central

    Meyer, Michelle M.; Silberg, Jonathan J.; Voigt, Christopher A.; Endelman, Jeffrey B.; Mayo, Stephen L.; Wang, Zhen-Gang; Arnold, Frances H.

    2003-01-01

    The computational algorithm SCHEMA was developed to estimate the disruption caused when amino acid residues that interact in the three-dimensional structure of a protein are inherited from different parents upon recombination. To evaluate how well SCHEMA predicts disruption, we have shuffled the distantly-related β-lactamases PSE-4 and TEM-1 at 13 sites to create a library of 2^14 (16,384) chimeras and examined which ones retain lactamase function. Sequencing the genes from ampicillin-selected clones revealed that the percentage of functional clones decreased exponentially with increasing calculated disruption (E = the number of residue–residue contacts that are broken upon recombination). We also found that chimeras with low E have a higher probability of maintaining lactamase function than chimeras with the same effective level of mutation but chosen at random from the library. Thus, the simple distance metric used by SCHEMA to identify interactions and compute E allows one to predict which chimera sequences are most likely to retain their function. This approach can be used to evaluate crossover sites for recombination and to create highly mosaic, folded chimeras. PMID:12876318
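
    The disruption count E described in this abstract can be sketched as below. This is an illustrative reading rather than the published implementation: it assumes a contact is broken when its two residues come from different parents and the resulting amino-acid pair does not occur at those positions in any parent. The toy parent sequences, the contact list, and all function names are hypothetical; in practice the contacts would come from a distance cut-off applied to a 3D structure.

```python
def library_size(n_parents, n_crossovers):
    """Number of chimeras when each of (n_crossovers + 1) blocks picks one parent."""
    return n_parents ** (n_crossovers + 1)


def chimera_sequence(parents, block_of_position, parent_of_block):
    """Assemble a chimera by copying each position from the parent assigned to its block."""
    return "".join(
        parents[parent_of_block[block]][position]
        for position, block in enumerate(block_of_position)
    )


def schema_disruption(parents, contacts, block_of_position, parent_of_block):
    """Count contacts broken by recombination (E) under the assumptions stated above."""
    chimera = chimera_sequence(parents, block_of_position, parent_of_block)
    broken = 0
    for i, j in contacts:
        source_i = parent_of_block[block_of_position[i]]
        source_j = parent_of_block[block_of_position[j]]
        if source_i == source_j:
            continue  # both residues inherited together, so the contact is preserved
        pair = (chimera[i], chimera[j])
        if any((parent[i], parent[j]) == pair for parent in parents):
            continue  # the same residue pair already exists in a parent
        broken += 1
    return broken


# Hypothetical aligned parents, one crossover after position 2, and toy contacts.
parents = ["ACDEFG", "AKDQFR"]
block_of_position = [0, 0, 0, 1, 1, 1]
contacts = [(0, 4), (1, 3), (2, 5), (1, 2)]

print(chimera_sequence(parents, block_of_position, [0, 1]))
print(schema_disruption(parents, contacts, block_of_position, [0, 1]))
print(library_size(2, 13))
```

    With two parents and 13 crossover sites there are 14 blocks, which gives the 2^14 = 16,384 chimeras quoted in the abstract.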

  14. Systemic AA amyloidosis in the red fox (Vulpes vulpes).

    PubMed

    Rising, Anna; Cederlund, Ella; Palmberg, Carina; Uhlhorn, Henrik; Gaunitz, Stefan; Nordling, Kerstin; Ågren, Erik; Ihse, Elisabet; Westermark, Gunilla T; Tjernberg, Lars; Jörnvall, Hans; Johansson, Jan; Westermark, Per

    2017-11-01

    Amyloid A (AA) amyloidosis occurs spontaneously in many mammals and birds, but the prevalence varies considerably among different species, and even among subgroups of the same species. The Blue fox and the Gray fox seem to be resistant to the development of AA amyloidosis, while Island foxes have a high prevalence of the disease. Herein, we report on the identification of AA amyloidosis in the Red fox (Vulpes vulpes). Edman degradation and tandem MS analysis of proteolyzed amyloid protein revealed that the amyloid was partly composed of full-length SAA. Its amino acid sequence was determined and found to consist of 111 amino acid residues. Based on inter-species sequence comparisons we found four residue exchanges (Ser31, Lys63, Leu71, Lys72) between the Red and Blue fox SAAs. Lys63 seems unique to the Red fox SAA. We found no obvious explanation for how these exchanges might correlate with the reported differences in SAA amyloidogenicity. Furthermore, in contrast to fibrils from many other mammalian species, the isolated amyloid fibrils from Red fox did not seed AA amyloidosis in a mouse model. © 2017 The Protein Society.

  15. The cDNA-derived amino acid sequence of hemoglobin II from Lucina pectinata.

    PubMed

    Torres-Mercado, Elineth; Renta, Jessicca Y; Rodríguez, Yolanda; López-Garriga, Juan; Cadilla, Carmen L

    2003-11-01

    Hemoglobin II from the clam Lucina pectinata is an oxygen-reactive protein with a unique structural organization in the heme pocket involving residues Gln65 (E7), Tyr30 (B10), Phe44 (CD1), and Phe69 (E11). We employed the reverse transcriptase-polymerase chain reaction (RT-PCR) and methods to synthesize various cDNA(HbII). An initial 300-bp cDNA clone was amplified from total RNA by RT-PCR using degenerate oligonucleotides. Gene-specific primers derived from the HbII-partial cDNA sequence were used to obtain the 5' and 3' ends of the cDNA by RACE. The length of the HbII cDNA, estimated from overlapping clones, was approximately 2114 bases. Northern blot analysis revealed that the mRNA size of HbII agrees with the estimated size using cDNA data. The coding region of the full-length HbII cDNA codes for 151 amino acids. The calculated molecular weight of HbII, including the heme group and acetylated N-terminal residue, is 17,654.07 Da.

  16. Biochemistry of terminal deoxynucleotidyltransferase. Identification and unity of ribo- and deoxyribonucleoside triphosphate binding site in terminal deoxynucleotidyltransferase

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Pandey, V.N.; Modak, M.J.

    Terminal deoxynucleotidyltransferase is the only DNA polymerase that is strongly inhibited in the presence of ATP. We have labeled calf terminal deoxynucleotidyltransferase with [32P]ATP in order to identify its binding site in terminal deoxynucleotidyltransferase. The specificity of ATP cross-linking to terminal deoxynucleotidyltransferase is shown by the competitive inhibition of the overall cross-linking reaction by deoxynucleoside triphosphates, as well as the ATP analogs Ap4A and Ap5A. Tryptic peptide mapping of [32P]ATP-labeled enzyme revealed a peptide fraction that contained the majority of cross-linked ATP. The properties, chromatographic characteristics, amino acid composition, and sequence analysis of this peptide fraction were identical with those found associated with dTTP cross-linked terminal deoxynucleotidyltransferase peptide. The involvement of the same 2 cysteine residues in the crosslinking of both nucleotides further confirmed the unity of the ATP and dTTP binding domain that contains residues 224-237 in the primary amino acid sequence of calf terminal deoxynucleotidyltransferase.

  17. Cloning of the cDNA encoding Scg-SPRP, an unusual Ser-protease-related protein from vitellogenic female desert locusts (Schistocerca gregaria).

    PubMed

    Chiou, S J; Vanden Broeck, J; Janssen, I; Borovsky, D; Vandenbussche, F; Simonet, G; De Loof, A

    1998-10-01

    The cDNA coding for a Ser-protease-related protein (Scg-SPRP) was cloned from desert locust (Schistocerca gregaria) midgut. The derived amino acid sequence consists of 260 residues and shows strong sequence similarity to insect trypsin-like molecules. It is, however, likely that Scg-SPRP is not a proteolytically active enzyme and that it plays another physiologically relevant role, since two out of three residues which are indispensable for catalytic activity of Ser-proteases are replaced. Northern analysis revealed that the Scg-SPRP gene is expressed in midgut tissue and that this expression is strongly induced in adult female locusts. Moreover, the occurrence of the transcript (1.2 kb) fluctuates during the molting cycle and during the female reproductive cycle. Juvenile hormone (JH III) dependence of transcription was investigated by chemical allatectomy (precocene I) of adult females. This resulted in inhibition of vitellogenesis and in disappearance of the Scg-SPRP transcript. Expression of Scg-SPRP in precocene-treated locusts could be reinduced by additional treatment with JH III or with 20-OH-ecdysone.

  18. Escherichia coli ArgR mutants defective in cer/Xer recombination, but not in DNA binding.

    PubMed

    Sénéchal, Hélène; Delesques, Jérémy; Szatmari, George

    2010-04-01

    The Escherichia coli arginine repressor (ArgR) is an L-arginine-dependent DNA-binding protein that controls the expression of the arginine biosynthetic genes and is required as an accessory factor for Xer site-specific recombination at cer and related recombination sites in plasmids. We used the technique of pentapeptide scanning mutagenesis to isolate a series of ArgR mutants that were considerably reduced in cer recombination, but were still able to repress an argA::lacZ fusion. DNA sequence analysis showed that all of the mutants mapped to the same nucleotide, resulting in a five amino acid insertion between residues 149 and 150 of ArgR, corresponding to the end of the alpha6 helix. A truncated ArgR containing a stop codon at residue 150 displayed the same phenotype as the protein with the five amino acid insertion, and both mutants displayed sequence-specific DNA-binding activity that was L-arginine dependent. These results show that the C-terminus of ArgR is more important in cer/Xer site-specific recombination than in DNA binding.

  19. Major Protein of Resting Rhizomes of Calystegia sepium (Hedge Bindweed) Closely Resembles Plant RNases But Has No Enzymatic Activity

    PubMed Central

    Van Damme, Els J.M.; Hao, Qiang; Barre, Annick; Rougé, Pierre; Van Leuven, Fred; Peumans, Willy J.

    2000-01-01

    The most abundant protein of resting rhizomes of Calystegia sepium (L.) R.Br. (hedge bindweed) has been isolated and its corresponding cDNA cloned. The native protein consists of a single polypeptide of 212 amino acid residues and occurs as a mixture of glycosylated and unglycosylated isoforms. Both forms are derived from the same preproprotein containing a signal peptide and a C-terminal propeptide. Analysis of the deduced amino acid sequence indicated that the C. sepium protein shows high sequence identity and structural similarity with plant RNases. However, no RNase activity could be detected in highly purified preparations of the protein. This apparent lack of activity results most probably from the replacement of a conserved His residue, which is essential for the catalytic activity of plant RNases. Our findings not only demonstrate the occurrence of a catalytically inactive variant of an S-like RNase, but also provide further evidence that genes encoding storage proteins may have evolved from genes encoding enzymes or other biologically active proteins. PMID:10677436

  20. All-atom 3D structure prediction of transmembrane β-barrel proteins from sequences.

    PubMed

    Hayat, Sikander; Sander, Chris; Marks, Debora S; Elofsson, Arne

    2015-04-28

    Transmembrane β-barrels (TMBs) carry out major functions in substrate transport and protein biogenesis but experimental determination of their 3D structure is challenging. Encouraged by successful de novo 3D structure prediction of globular and α-helical membrane proteins from sequence alignments alone, we developed an approach to predict the 3D structure of TMBs. The approach combines the maximum-entropy evolutionary coupling method for predicting residue contacts (EVfold) with a machine-learning approach (boctopus2) for predicting β-strands in the barrel. In a blinded test for 19 TMB proteins of known structure that have a sufficient number of diverse homologous sequences available, this combined method (EVfold_bb) predicts hydrogen-bonded residue pairs between adjacent β-strands at an accuracy of ∼70%. This accuracy is sufficient for the generation of all-atom 3D models. In the transmembrane barrel region, the average 3D structure accuracy [template-modeling (TM) score] of top-ranked models is 0.54 (ranging from 0.36 to 0.85), with a higher (44%) number of residue pairs in correct strand-strand registration than in earlier methods (18%). Although the nonbarrel regions are predicted less accurately overall, the evolutionary couplings identify some highly constrained loop residues and, for FecA protein, the barrel including the structure of a plug domain can be accurately modeled (TM score = 0.68). Lower prediction accuracy tends to be associated with insufficient sequence information and we therefore expect increasing numbers of β-barrel families to become accessible to accurate 3D structure prediction as the number of available sequences increases.
